Discover real AI creators shaping the future. Track their latest blogs, X posts, YouTube videos, WeChat Official Account posts, and GitHub commits — all in one place.
RT staysaasy We had a great time joining the inimitable @swyx on his podcast to talk about AI, X, building teams and more. Check it out on his page or below! Original tweet: https://x.com/staysaasy/status/2043744108812902480
RT simon Everyone who thinks vision is solved should look at some enterprise documents Original tweet: https://x.com/disiok/status/2043740333394231755
RT Hunter Lovell and if this sounds like fun (it is), we're hiring! https://www.langchain.com/careers Original tweet: https://x.com/huntlovell/status/2043727652046213322
Fun fact - we have no one with dev rel as a title (and never have had). Everyone has always been just an engineer building things and then talking about why those things matter and are cool
"how can i help?" i analyzed 100 asks from portfolio companies identified in notes and emails to see what they asked
26% specific person intro
9% investor discovery
7% hiring/talent
9% pr/media
11% business/partnerships
14% scheduling/coordination
6% strategy/advice
6% product/technical
4% event related
4% admin
2% research
2% permission/approvals
then further analyzed what percent could be assisted with AI:
We’re open sourcing the first document OCR benchmark for the agentic era, ParseBench. Document parsing is the foundation of every AI agent that works with real-world files.

ParseBench is a benchmark that measures parsing quality specifically for agent knowledge work:
✅ It optimizes for semantic correctness (instead of exact similarity)
✅ It has the most comprehensive distribution of real-world enterprise documents

It contains ~2,000 human-verified enterprise document pages with 167,000+ test rules across five dimensions that matter most: tables, charts, content faithfulness, semantic formatting, and visual grounding.

We benchmarked 14 known document parsers on ParseBench, from frontier/OSS VLMs to specialized parsers to LlamaParse. Here are some of our findings:
💡 Increasing compute budget yields diminishing returns - Gemini/gpt-5-mini/haiku gain 3-5 points from minimal to high thinking, at 4x the cost.
💡 Charts are the most polarizing dimension for evaluation. Most specialized parsers score below 6%, while some VLM-based parsers do a bit better.
💡 VLMs are great at visual understanding but terrible at layout extraction. GPT-5-mini/haiku score below 10% on our visual grounding task; all specialized parsers do much better.
💡 No method crushes all 5 dimensions at once, but LlamaParse achieves the highest overall score at 84.9% and leads in 4 out of the 5 dimensions.

This is by far the deepest technical work that we’ve published as a company. I would encourage you to start with our blog and explore our links to Hugging Face and GitHub. All the details are in our full 35-page (!!) ArXiv whitepaper.

🌐 Blog: https://www.llamaindex.ai/blog/parsebench?utm_medium=socials&utm_source=xjl&utm_campaign=2026-apr-
📄 Paper: https://arxiv.org/abs/2604.08538?utm_medium=socials&utm_source=twitter&utm_campaign=2026-apr-
💻 Code: https://github.com/run-llama/ParseBench?utm_medium=socials&utm_source=twitter&utm_campaign=2026-apr-
📊 Dataset: http...
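The post's key design point is scoring "semantic correctness (instead of exact similarity)" via declarative test rules. The sketch below is a hypothetical illustration of that idea, not ParseBench's actual implementation: every name and the rule format (required snippets that must appear in normalized parser output) are assumptions for demonstration. It shows why a rule-based check tolerates case, whitespace, and reordering that a character-level diff would penalize.

```python
# Hypothetical rule-based semantic scorer (illustrative only; not
# ParseBench's real code or rule schema).

def normalize(text: str) -> str:
    # Collapse whitespace and case so surface-form differences don't
    # count against the parser, unlike an exact string diff.
    return " ".join(text.lower().split())

def score_parse(parsed_text: str, rules: list[str]) -> float:
    """Return the fraction of rules satisfied. A rule passes if its
    normalized snippet appears anywhere in the normalized output."""
    if not rules:
        return 0.0
    norm = normalize(parsed_text)
    passed = sum(1 for rule in rules if normalize(rule) in norm)
    return passed / len(rules)

# Two parsers emit the same table content in different surface forms:
rules = ["Q3 revenue 12.4M", "operating margin 18%"]
output_a = "Q3 Revenue   12.4M | Operating Margin 18%"
output_b = "operating margin 18%\nq3 revenue 12.4m"   # reordered, lowercase

print(score_parse(output_a, rules))  # 1.0 - both rules found
print(score_parse(output_b, rules))  # 1.0 - reordering doesn't hurt
```

An exact-similarity metric (e.g. edit distance) would score `output_b` poorly despite it preserving every fact, which is the gap the post says semantic scoring closes.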