document OCR + workflows @llama_index. cofounder/CEO. Careers: https://t.co/EUnMNmb4DZ Enterprise: https://t.co/Ht5jwxRU13
SF is back, but so is NYC. Big thanks to @tuanacelik and our partners (@GoogleDeepMind, @cerebral_valley) for running an awesome hackathon!
Hello from the @GoogleDeepMind Gemini hackathon in NYC, with @cerebral_valley and @temporal_xyz! Been a while since I’ve done one of these. The @llama_index workshop is about to start; come learn all about document agents with LlamaParse and LlamaAgents 🫶
deep learning in ~2013–2020 was the OG vibe coding
1. read some arxiv papers, have no idea what works and what doesn't
2. so you just try out a bunch of random shit, with some inkling of what might work ("i'm just gonna sprinkle in some... batch norm, half the LR, skip connections 👩‍🍳")
3. kick off like 20 runs across 500 gpus, wait a few hours/days
4. one of these things magically works, you justify it as some huge breakthrough, dress it up with some numbers, publish it to cvpr/neurips
Realized why I love agentic coding so much now. It makes software engineering feel like ML research:
- kick off a bunch of agents (experiments)
- monitor their trajectory (loss curves)
- kill some & double down on others
- async & orchestration
RT Tuana Hello from the @GoogleDeepMind Gemini hackathon in NYC, with @cerebral_valley and @temporal_xyz! Been a while since I’ve done one of these. The @llama_index workshop is about to start; come learn all about document agents with LlamaParse and LlamaAgents 🫶 Original tweet: https://x.com/tuanacelik/status/2027780520570921178
RT Simon Suo Realized why I love agentic coding so much now. It makes software engineering feel like ML research:
- kick off a bunch of agents (experiments)
- monitor their trajectory (loss curves)
- kill some & double down on others
- async & orchestration
Original tweet: https://x.com/disiok/status/2027666939234193687
what's the point of netflix when i have this app
Tonight, we reached an agreement with the Department of War to deploy our models in their classified network. In all of our interactions, the DoW displayed a deep respect for safety and a desire to partner to achieve the best possible outcome. AI safety and wide distribution of
RT Clelia Bertelli (🦙/acc) New blog post alert!🦙✨ At @llama_index we shipped LlamaAgents Builder to make it the fastest way to build a document agent with only natural language. Yesterday, we published a full walkthrough of how to use LlamaAgents Builder end-to-end with a real-world use case, covering prompt and context engineering best practices, workflow visualization tricks, and tips on how to best iterate with the builder over multiple turns. Read the blog: https://www.llamaindex.ai/blog/creating-a-deal-sourcing-agent-with-llamaagents-builder Original tweet: https://x.com/itsclelia/status/2027498003851055531
RT Tuana Since joining @llama_index, my focus has shifted from 'everything agents' to 'document agents': agents that can handle work over all manner of complex documents. So, I tried out the latest chart parsing capabilities of LlamaParse. Charts in PDFs are notoriously painful to work with. You can see the data (bars, axes, labels), but actually getting it into a format you can analyze is a different matter. I tried parsing a U.S. Treasury executive summary PDF, pulling a grouped bar chart showing Budget Deficit vs. Net Operating Cost for fiscal years 2020–2024, and turning it into a pandas DataFrame you can run analysis on (although really you can then do whatever, e.g. provide it to an agent for downstream tasks). Once parsed, the chart's underlying data comes back as a table in the items tree for that page. From there: grab the rows, construct a DataFrame, etc. In the example, I'm computing year-over-year changes in both metrics, measuring the gap between them across the five-year window, and, just to be sure, I reproduced a bar chart that mirrors the original PDF visualization. You can try it out here: https://colab.research.google.com/drive/1wr_b7JQIiBk998qiaUNxpFa6RvxwZlws?usp=sharing Original tweet: https://x.com/tuanacelik/status/2027488010640765042
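The post-parse analysis described above is only a few lines of pandas. Here's a minimal sketch; the row values are made-up illustrative numbers standing in for the parsed chart rows (the real values come back from the parsed items tree):

```python
import pandas as pd

# Hypothetical rows as a parsed grouped bar chart might yield them:
# (fiscal year, budget deficit, net operating cost) — values illustrative only.
rows = [
    ("2020", 3.1, 3.8),
    ("2021", 2.8, 3.1),
    ("2022", 1.4, 4.1),
    ("2023", 1.7, 2.0),
    ("2024", 1.8, 2.1),
]

df = pd.DataFrame(rows, columns=["fiscal_year", "budget_deficit", "net_operating_cost"])

# Year-over-year change in each metric
df["deficit_yoy"] = df["budget_deficit"].diff()
df["noc_yoy"] = df["net_operating_cost"].diff()

# Gap between the two metrics across the five-year window
df["gap"] = df["net_operating_cost"] - df["budget_deficit"]
print(df)
```

From here, plotting `df` with `df.plot.bar(x="fiscal_year")` would reproduce a chart mirroring the original PDF visualization.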
RT LlamaIndex 🦙 Turn your PDF charts into pandas DataFrames with specialized chart parsing in LlamaParse! This tutorial walks you through extracting structured data from charts and graphs in PDFs, then running data analysis with pandas - no manual data entry required.
📊 Enable specialized chart parsing to convert visual charts into structured table data
🐼 Extract table rows directly from parsed PDF pages and load them into DataFrames
📈 Perform year-over-year analysis, calculate gaps between metrics, and create visualizations
⚡ Use the items view to get per-page structured data including tables and figures
We demonstrate this using a 2024 Executive Summary PDF, extracting a fiscal year chart showing Budget Deficit vs Net Operating Cost data spanning 2020–2024, and reproducing the key financial insights. Check out the full tutorial: https://developers.llamaindex.ai/python/cloud/llamaparse/tutorials/parse_charts_pandas/?utm_source=socials&utm_medium=li_social Original tweet: https://x.com/llama_index/status/2027429029834531120
We’ve made an in-depth tutorial on building a PE deal-sourcing 🤝 document workflow by simply typing a natural language prompt. There’s often a set of documents coming inbound to help analysts understand which deals to invest in: teasers, CIMs, emails, presentations. With our LlamaAgents builder, you can describe what documents are coming in and the information you want to extract from each document. We perform high-quality document OCR to extract all the key details from these documents, and give you back an application where you can process these docs at scale. Thanks to @itsclelia for this tutorial. If you’re interested, come check it out: https://cloud.llamaindex.ai/?utm_campaign=parse&utm_medium=jl_socials If you’re processing these documents at scale, come talk to us: https://www.llamaindex.ai/contact?utm_campaign=parse&utm_medium=jl_socials
Build a private equity deal sourcing agent that automatically classifies investment opportunities and extracts key financial metrics using our LlamaAgents Builder. This step-by-step guide shows you how to create an agent that processes deal files like teasers and financial
RT Simon Suo should we rebrand to SF model harness company Original tweet: https://x.com/disiok/status/2027073086055764035
The Model Harness is Everything

We are already living in a world of incredible frontier models and incredible agent tools (Claude Code, OpenClaw). But the biggest barrier to getting value from AI is your own ability to context and workflow engineer the models. This is *especially* true the more horizontal the tool you’re using. If you’re using a very generic tool like ChatGPT or Claude Code, you need to put in a lot of work clearly articulating your requirements and specifications so that the agent can actually solve the task to your specifications. Today that looks like being extremely thoughtful about the tools you select, and writing English very precisely in a skills.md file to articulate these requirements to the agent.

Some of the work around defining the business workflow is inherently time consuming. Think about any document SOP: simply writing the English can take hours to refine, iterate, and optimize. This is where more vertically focused agents come in; they handle the burden of equipping the agents with relevant prompts to solve a given workflow, so that you can just go in and use the application directly.

Another approach is to offer specialized services that provide *context* to these agents. This is the space that we (@llama_index) are operating in. We are providing the infrastructure to parse the most complex documents into agent-ready context. For other companies it could be offering web data, sales data, documentation, or codebases as a service.

At a high level, any AI startup should provide context or workflows on top of these agents. We’re excited about building enduring tech even as the agent landscape evolves. If you’re specifically excited to unlock the billions of tokens of context stored within your documents, come talk to us! https://www.llamaindex.ai/contact
OmniDocBench is getting saturated

VLMs are getting increasingly better at document understanding, from OSS (DeepSeek-OCR2, GLM-OCR) to frontier (Gemini 3, Kimi 5.2, GPT-5.2). A popular benchmark to measure document understanding progress has been OmniDocBench. But we're quickly approaching the point where we need a new benchmark.

1. The latest models are pushing ~95% on OmniDocBench and are already overfitting the benchmark, while still having real gaps in document capabilities.
2. The evaluation metrics of OmniDocBench depend completely on exact match, not semantic correctness. The latter is much more important, especially in today's world where LLMs can reason over text tokens regardless of unimportant formatting differences. *(see example below; our own service LlamaParse is penalized even though our parsing is 100% semantically correct on scientific notation)*

There are so many real-world documents that haven't yet been solved by even the latest models. We'd welcome discussion on advancing document understanding benchmarks that are more diverse and properly score models on semantic correctness.

Blog: https://www.llamaindex.ai/blog/omnidocbench-is-saturated-what-s-next-for-ocr-benchmarks?utm_source=socials&utm_medium=li_social
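To make the exact-match vs. semantic-correctness distinction concrete, here's a toy sketch of the idea (not OmniDocBench's actual scoring code, and the normalization rules are my own): two renderings of the same scientific notation fail exact match but agree after a crude normalization pass:

```python
import re

def normalize(text: str) -> str:
    """Crude semantic normalization: unify scientific notation and collapse
    whitespace/case so formatting-only differences don't count as errors."""
    # "3.2 × 10^4" or "3.2 x 10^4" -> "3.2e4"
    text = re.sub(r"(\d+(?:\.\d+)?)\s*[x×]\s*10\^?(-?\d+)", r"\1e\2", text)
    return " ".join(text.split()).lower()

gold = "Rate constant: 3.2 × 10^4 s^-1"
pred = "rate constant: 3.2e4 s^-1"

assert gold != pred                        # exact match: scored as an error
assert normalize(gold) == normalize(pred)  # semantically identical
```

An exact-match metric counts `pred` as wrong even though an LLM consuming the text downstream would treat both forms identically.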
RT LlamaIndex 🦙 Document OCR benchmarks are hitting a ceiling - and that's a problem for real-world AI applications. Our latest analysis reveals why OmniDocBench, the go-to standard for document parsing evaluation, is becoming inadequate as models like GLM-OCR @Zai_org achieve 94.6% accuracy while still failing on complex real-world documents.
📊 Models are saturating OmniDocBench scores but still struggle with complex financial reports, legal filings, and domain-specific documents
🎯 Rigid exact-match evaluation penalizes semantically correct outputs that differ in formatting (HTML vs markdown, spacing, etc.)
⚡ AI agents need semantic correctness, not perfect formatting matches - current benchmarks miss this critical distinction
🔬 The benchmark's 1,355 pages can't capture the full complexity of production document processing needs
The document parsing challenge isn't solved just because benchmark scores look impressive. We need evaluation methods that reward semantic understanding over exact formatting, especially as AI agents become the primary consumers of parsed content. We're building parsing models focused on semantic correctness for complex visual documents. If you're scaling OCR workloads in production, LlamaParse handles the edge cases that benchmarks miss. Read our full analysis: https://www.llamaindex.ai/blog/omnidocbench-is-saturated-what-s-next-for-ocr-benchmarks?utm_source=socials&utm_medium=li_social Original tweet: https://x.com/llama_index/status/2026342120236396844
RT Clelia Bertelli (🦙/acc) Hey Rustacean friends!🦀 This Friday I'll be talking about how we migrated to Qdrant Edge for the storage layer of Semtools, @llama_index's Rust-powered CLI toolkit for local document intelligence🦙 The @qdrant_engine team and I will discuss lessons learnt when migrating to Qdrant Edge as a backend for on-disk vector storage in Rust, as well as the performance gains and common pitfalls to avoid👩💻 📍The live will start at 4PM CET and you can join us here: https://discord.gg/PcgpnXa2?event=1473577301470875820 If you're building AI systems in Rust, don't miss out!🦀 Original tweet: https://x.com/itsclelia/status/2026227327370469457
Gemini 3.1 Pro <> LlamaParse Use smaller VLMs for high-accuracy document understanding, use frontier reasoning models to synthesize insights over your documents. Check out this e2e expense analysis example over receipts here!
We built an AI agent that lets you vibe-code document extraction - high accuracy and citations over the most complex documents. Our latest release lets you upload documents as context. All you then have to do is describe what you want extracted in natural language.
💡 Our agent will then read the document with file tools to infer the right schema, validation rules, and other pre/postprocessing logic.
✅ It will give you back a workflow that can extract over thousands/millions of documents at scale. You can still of course review and edit every output before approving.
Stop handling paperwork manually; just upload files, describe your task, and let our agent handle the rest. Our vision for LlamaAgents is to provide the most advanced and easy-to-use way for you to orchestrate document work. Walkthrough: https://youtu.be/5Nk6KZhBDbQ Check it out: https://cloud.llamaindex.ai/ If you’re interested in reducing the operational burden of document extraction (invoices, claims, onboarding forms), come talk to us! https://www.llamaindex.ai/contact
RT Clelia Bertelli (🦙/acc) We’ve been cooking at @llama_index🍳 Almost a month ago, we launched LlamaAgents Builder, a natural language interface for building agentic document workflows📁 Today, we’re excited to announce file upload support: add your PDFs, and the agent will use them as context to build your use case🚀 The more representative your examples, the more accurate the generated application will be📈 🎥 Watch the full walkthrough: https://youtu.be/5Nk6KZhBDbQ 🦙 Get started with LlamaCloud: https://cloud.llamaindex.ai/signup Original tweet: https://x.com/itsclelia/status/2025989350216081758
RT LlamaIndex 🦙 🚀 LlamaAgents Builder just leveled up: File uploads are here! Our natural language interface for building agentic document workflows now supports file uploads. You can provide example documents as context, and the agent will use them as a starting point to design and tailor your workflow. The result? Applications that better match your real-world use case. The more representative your sample files, the more accurate your final app. 🎥 Watch the full walkthrough: https://youtu.be/5Nk6KZhBDbQ 🦙 Get started with LlamaCloud: https://cloud.llamaindex.ai/signup Original tweet: https://x.com/llama_index/status/2025978751172096324
it's always fun when twitter hype makes a dent in the real world tbh apple should've just bought them to capture the mindshare of personalized digital assistants. 🍎🦞
Bought a new Mac mini to properly tinker with claws over the weekend. The apple store person told me they are selling like hotcakes and everyone is confused :) I'm definitely a bit sus'd to run OpenClaw specifically - giving my private data/keys to 400K lines of vibe coded
The second highest category is backoffice automation, but imo it's underrated by the AI community. RPA is truly dead, and agentic workflows are taking its place. A lot of backoffice work depends on routine operations over unstructured documents (invoices, claims packets, loan files). The best interface to automate these operations is enabling users to create deterministic workflows at scale, instead of solving ad-hoc tasks through chat. We are starting to build an agentic layer within our own document processing product, LlamaCloud, that lets users "vibe-code" these workflows through natural language. Come check it out: https://cloud.llamaindex.ai/
Software engineering makes up ~50% of agentic tool calls on our API, but we see emerging use in other industries. As the frontier of risk and autonomy expands, post-deployment monitoring becomes essential. We encourage other model developers to extend this research.
Agentic extraction from SCOTUS opinions 🧑‍⚖️ Today SCOTUS struck down Trump’s tariffs in a 6-3 vote, and one of the most interesting points was Gorsuch’s concurrence, where he devotes a decent portion to calling out every justice by name - especially everyone in the dissent (Kavanaugh, Thomas, and by extension Alito). I was curious about this, so I used LlamaExtract on the official court opinion to generate a schema that not only gives me an overall summary, but extracts Gorsuch's particular disagreements as line items.
* We extract the specific disagreements with Kavanaugh and Thomas line by line
* For each key, you can also trace back via bounding boxes to the source document so you can read the text!
Brief: https://www.supremecourt.gov/opinions/25pdf/24-1287_4gcj.pdf Whether it’s court opinions, legal briefs, or any other complex doc, if you need to do document extraction at scale come check out LlamaCloud (particularly our “extract” feature): https://cloud.llamaindex.ai/
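For a sense of what "generating a schema" might look like, here's a hypothetical Pydantic sketch of the kind of schema such an extraction could target. All field names and the sample values are my own invention for illustration, not LlamaExtract's actual output format:

```python
from pydantic import BaseModel, Field

# Hypothetical schema for extracting line-item disagreements from a concurrence.
class Disagreement(BaseModel):
    justice: str = Field(description="Justice called out by name")
    point: str = Field(description="The specific point of disagreement")

class ConcurrenceSummary(BaseModel):
    author: str
    overall_summary: str
    disagreements: list[Disagreement]

# Illustrative instance showing the shape of a structured extraction result.
sample = ConcurrenceSummary(
    author="Gorsuch",
    overall_summary="Concurs in the judgment but writes separately.",
    disagreements=[
        Disagreement(justice="Kavanaugh", point="Reading of the statutory delegation."),
    ],
)
print(sample.model_dump_json(indent=2))
```

In a real pipeline each extracted field would additionally carry page and bounding-box references so a reviewer can jump straight to the cited passage.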
We built a vibe-coding tool to help you accurately extract information from any document type, simply by describing the workflow in natural language. In the latest walkthrough, @tuanacelik shows you how to split a package of resumes and extract candidate information from every single resume, simply by describing the task in English. Our agent builder is built on top of LlamaParse. It’s available today - come check it out! https://cloud.llamaindex.ai/
RT LlamaIndex 🦙 🚀 Big drop from @GoogleDeepMind: Gemini 3.1 Pro is here, and we built a hands-on demo powered by LlamaCloud to put it to work and turn your receipt photos into real financial insights! Using our Agent Workflows, the app:
📸 Parses receipt images with LlamaParse (Agentic tier)
🗂 Stores everything locally in an SQLite database
📊 Aggregates your spending monthly
🧠 Uses Gemini 3.1 Pro to analyze trends and generate actionable tips to improve your finances
Check out the demo below!👇
👩‍💻 GitHub repo: http://github.com/run-llama/receipts-analyzer
🦙 Get started with LlamaCloud: http://cloud.llamaindex.ai/signup
Original tweet: https://x.com/llama_index/status/2024892621911679102
Coding agents are fundamentally changing software engineering in terms of velocity, role, and org structure. We published a memo to our internal engineering team detailing our growing expectations in terms of role/scope.
🟠 Before, the tasks of prioritization, engineering planning, and implementation were divided between EMs, PMs, senior ICs, and junior ICs
🟢 Now, ICs are expected to handle *all* of product prioritization, product speccing, and implementation
This is due to a few trends 📈:
- Coding agents have brought implementation costs down to ~0. The role of engineers is writing prompts
- LLMs and sub-agents have reduced the PM work of synthesizing feedback down to ~0 too
The main job of any “engineer” is to be an e2e product owner: being able to translate requirements into specifications and delegate tasks to various subagents for implementation. Every engineer is told to offload as much as possible to their favorite tools, whether it’s Claude Code, Cursor, Devin, Codex, regular ChatGPT, and more. We celebrate and share learnings around burning tokens, as long as it helps drive additional productivity!
Increased thinking doesn't correlate with increased document understanding

Models are getting much better at reasoning, which translates well to math/coding/general intelligence tasks. We ran some experiments with GPT-5.2 on different thinking modes to see if it actually helps improve scores on OmniDocBench. Result: it doesn't, and in fact the higher the thinking, the more the output structure can deviate from the existing input. Shoutout to Boyang on the @llama_index team for this blog. Come check out our post! https://www.llamaindex.ai/blog/the-cost-of-overthinking-why-reasoning-models-fail-at-document-parsing
RT LlamaIndex 🦙 More reasoning doesn't always mean better results - especially for document parsing. We tested GPT-5.2 at four reasoning levels on complex documents and found that higher reasoning actually hurt performance while dramatically increasing costs and latency.
🧠 Reasoning models hallucinate content that isn't there, filling in "missing" table cells with inferred values
📊 They split single tables into multiple sections by overthinking structural boundaries
⚡ Processing time increased 5x with xHigh reasoning (241s vs 47s) while accuracy stayed flat at ~0.79
💰 Our LlamaParse Agentic outperformed all reasoning levels at 18x lower cost and 13x faster speed
You can't reason past what you can't see. Vision encoders lose pixel-level information before reasoning even starts, and no amount of thinking tokens can recover that lost detail. Our solution uses a pipeline approach - specialized OCR extracts text at native resolution, then LLMs structure what's already been accurately read. Each component plays to its strengths instead of forcing one model to handle everything. Read the full analysis: https://www.llamaindex.ai/blog/the-cost-of-overthinking-why-reasoning-models-fail-at-document-parsing?utm_source=socials&utm_medium=li_social Original tweet: https://x.com/llama_index/status/2024529937462706517
RT Tuana I filmed a walkthrough of LlamaAgent Builder, our new tool for building document agents by just describing what you want @llama_index I revisited my old demo: I took a resume book from NYU (resumes mixed with cover pages and curriculum pages) and just told the agent builder: split this into individual resumes, ignore the rest, extract graduation year, work experience, etc. The agent builder figured out it needed Split and Extract, configured both, built the workflow, and deployed it: API + UI, code in my GitHub. The whole point is you're describing the problem, not building the pipeline. The agent builder decides the architecture, but with the caveat that coding agents aren't perfect and the code is yours to edit and perfect! Give it a try and let me know what you build. We're running a contest for the most difficult document workflow you can throw at it. Full video here: https://youtu.be/0Zhf5z2Onjs?si=Em6uUZPxQz6YNlyY Original tweet: https://x.com/tuanacelik/status/2024500524851298764
We’re on a mission to parse the world’s hardest PDFs, and we’d love your help. There are so many document types that introduce a million edge cases for current VLMs / OCR: handwritten forms, badly scanned/rotated pages, charts, diagrams, and more. We are running a contest right now for you to try to extract the hardest PDFs you can find. Come sign up on our agent builder, describe what you want to extract through natural language, upload your document, and show the results. If our platform doesn’t work, even better; this is great feedback for us to improve our service. Either way, submit your project and we’d love to get your feedback! Check out LlamaCloud here: https://cloud.llamaindex.ai/
RT LlamaIndex 🦙 🏆 We're running a LlamaAgents contest right now. Throw your hardest documents at our agent builder, and tell us how it goes. Want help getting started? We have a new walkthrough for the LlamaAgent Builder by @tuanacelik
💬 Describe a document workflow in natural language, and it builds a full agent for you. In this video, the prompt was basically: "split a resume book into individual resumes, ignore cover pages and curriculum pages, extract resume work and education related fields..."
🛠️ From that, the agent builder reasons about which LlamaCloud tools to use, lands on LlamaSplit + LlamaExtract, configures both, iterates on the workflow structure, and gives you a deployable agent with an API and UI.
No dragging boxes around. No writing workflow code (unless you want to). Just describe the problem and let it figure out the architecture. You own the code, it pushes to your GitHub. Clone it, open in Cursor, customize whatever you need. https://www.youtube.com/watch?v=0Zhf5z2Onjs Original tweet: https://x.com/llama_index/status/2024176418767429826
Extracting large-scale structured information from complex PDFs is really hard, even with LLMs. 80% accuracy isn’t good enough; a lot of business applications require 98%+. You need specialized capabilities that can not only parse complex tables and charts, but also trace back to the source elements with confidence scores and citations. We’ve created a best-in-class document extraction service with LlamaExtract, and you can see for yourself in ~5 mins! Simply log on to LlamaCloud and click on our templates to take a look. Check out LlamaCloud here: https://cloud.llamaindex.ai/ If you’re looking to productize document extraction, come talk to us: https://www.llamaindex.ai/contact
A lot of documents are extremely dense and repetitive in information: stapled-together resumes, invoices, insurance claims, loan applications 📑 One of the main promises of AI is being able to automatically extract structured information from massive amounts of unstructured context. But in order to do this accurately over these massive documents, you need:
✅ Page attribution
✅ Bounding boxes linking back to the source text
✅ No hallucinations or dropped outputs, even if there are *hundreds* of structured fields
✅ Calibrated confidence scores to map to human review
This is exactly what we're building with our Extract feature in LlamaCloud. Our page-level extraction lets you extract massive amounts of dense information at scale from all these documents, with all the auditability and citations you need, letting humans review documents much faster than before. https://www.llamaindex.ai/blog/beyond-full-text-extraction-why-page-level-granularity-matters
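As a sketch of how calibrated confidence scores map to human review, consider thresholding extracted fields so reviewers only see the uncertain ones. The field names and response shape below are illustrative, not an actual extraction response schema:

```python
# Route extracted fields to human review based on a confidence threshold.
# Field names, values, and the dict shape are hypothetical examples.
REVIEW_THRESHOLD = 0.9

extraction = {
    "invoice_number": {"value": "INV-20391", "confidence": 0.99, "page": 1},
    "total_amount":   {"value": "4,310.00",  "confidence": 0.97, "page": 2},
    "po_reference":   {"value": "PO-77?1",   "confidence": 0.62, "page": 2},
}

auto_accepted = {k: v for k, v in extraction.items() if v["confidence"] >= REVIEW_THRESHOLD}
needs_review  = {k: v for k, v in extraction.items() if v["confidence"] <  REVIEW_THRESHOLD}

# A reviewer only inspects the low-confidence fields, jumping straight to the
# attributed page instead of re-reading the whole document.
for name, field in needs_review.items():
    print(f"review {name} on page {field['page']}: {field['value']!r}")
```

With page attribution and bounding boxes attached to each field, the review step becomes a targeted spot-check rather than a full re-read.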
"It's somewhere in the PDF" is not a citation. Page-level extraction in LlamaExtract gives you:
✓ Data mapped to specific pages
✓ Bounding boxes showing exact locations
✓ Audit-ready citations
Turn 200-page docs into skimmable, structured insights 👇 https://www.llamaindex.ai/blog/beyond-full-text-extraction-why-page-level-granularity-matters
RT Ankur Goyal We sent this note to our customers to let them know that Braintrust has raised a new round of funding, and thank them for their support. While the money is exciting, our focus hasn't changed: we're building Braintrust to help our customers ship quality AI products. In 2026, AI is moving to production but teams have never had less conviction about what will fail next. Our customers are building AI products that serve millions and simply need to work. If Braintrust makes their lives easier and their products better, I know we are doing our job. Thank you to @ICONIQCapital for leading our Series B, and to @a16z, @GreylockVC, @basecasevc, and @eladgil for doubling down. Thank you to the Braintrust team for all the incredible work you've done over the past year. And thank you to our customers, who have made this growth possible. Original tweet: https://x.com/ankrgyl/status/2023810273598128588
RT LlamaIndex 🦙 "It's somewhere in the PDF" is not a citation. Page-level extraction in LlamaExtract gives you: ✓ Data mapped to specific pages ✓ Bounding boxes showing exact locations ✓ Audit-ready citations Turn 200-page docs into skimmable, structured insights 👇 https://www.llamaindex.ai/blog/beyond-full-text-extraction-why-page-level-granularity-matters Original tweet: https://x.com/llama_index/status/2023804723875508302
RT Tuana Most document AI is either rigid extraction pipelines or "upload a PDF and chat with it." Both useful. But the documents we want agents to do "work" on are ever changing. For example, in research: you gather sources, draft, get comments, revise, do more research, then the source material changes and you start again. That doesn't fit into a single prompt-response cycle. So, the next step isn't making agents faster at responding to prompts. It's making them less dependent on prompts. AI agents that react to events:
· A trigger: new files, comments, updated source docs
· The continuous response: manage their own task queue instead of waiting for you to tell them what to do next
Your memo isn't a one-time deliverable, it's a living artifact that rebuilds when inputs change. This blog on long-horizon document agents is definitely a good read to understand how we're thinking about the future of document agents: https://www.llamaindex.ai/blog/long-horizon-document-agents Original tweet: https://x.com/tuanacelik/status/2023789925465027005
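The trigger/response loop described above can be sketched in a few lines. Everything here (the event names, the rebuild step, the class itself) is hypothetical, just to show the shape of an event-driven agent that owns its task queue instead of waiting for prompts:

```python
from collections import deque

# Minimal sketch of an event-driven document agent. It reacts to triggers
# (new files, updated sources, comments) by queueing its own tasks, and
# treats the memo as a living artifact that rebuilds when inputs change.
class MemoAgent:
    def __init__(self):
        self.tasks = deque()
        self.memo_stale = False

    def on_event(self, event: str, payload: str):
        if event in ("new_file", "source_updated"):
            self.tasks.append(("reparse", payload))
            self.memo_stale = True  # inputs changed; memo must be rebuilt
        elif event == "comment":
            self.tasks.append(("address_comment", payload))

    def run(self):
        # Drain the queue, then rebuild the memo if any input changed.
        done = []
        while self.tasks:
            done.append(self.tasks.popleft())
        if self.memo_stale:
            done.append(("rebuild_memo", None))
            self.memo_stale = False
        return done

agent = MemoAgent()
agent.on_event("new_file", "q3_filing.pdf")
agent.on_event("comment", "clarify revenue section")
result = agent.run()
print(result)
```

The key design point: no step above is a response to a user prompt; work is driven entirely by events, and the rebuild happens automatically whenever a source changes.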
Our document parsing is really good. You can see for yourself with our new feature💫: convert complex PDFs with tables, charts, and multi-column layouts to clean markdown/JSON representations through our new clickable templates!
1. We convert complex tables with gaps into clean markdown structures
2. We parse charts + line graphs into interpretable 2D tables
Check out LlamaCloud: https://cloud.llamaindex.ai/?utm_campaign=parse&utm_medium=jl_socials Come talk to us: https://www.llamaindex.ai/contact
RT LlamaIndex 🦙 What if an AI agent could review every invoice against your contracts — and flag what doesn't match? That's exactly what our Invoice Reconciler demo does. Here's how it works:
📄 Upload your contracts and invoices → LlamaParse converts them into clean, LLM-readable Markdown
📂 Everything gets indexed in LlamaCloud — searchable and ready for RAG
🔍 Define your reconciliation rules (unit price match, correct math, line item match, etc.)
🤖 A LlamaAgent workflow analyzes each invoice against your contracts and rules — then approves or rejects with confidence scores and detailed reasoning
You can even chat with your invoices and contracts directly — ask "what have we bought?" or "what contracts do we have in place?" and get cited answers instantly. The whole thing is powered by LlamaCloud: LlamaParse for document ingestion, LlamaCloud indexes for retrieval, and LlamaAgent Workflows for orchestration. 🎥 Watch the full walkthrough: https://www.youtube.com/watch?v=DHFAYWYIxuA Original tweet: https://x.com/llama_index/status/2023443263651451294
we are all jestermaxxing today i'm so sorry my brain cells are dying
Just in case Gen Z is trying to understand what happened today: Claude was mogging OpenAI for weeks. Then this gymcel dev ships Clawdbot which was the fastest growing OSS thing ever, absolute looksmax for the whole ecosystem. Anthropic tries to dairygoon him with legal. Dev
RT Simon Suo honestly an Anthropic fumble Original tweet: https://x.com/disiok/status/2023153271305908497
Peter Steinberger is joining OpenAI to drive the next generation of personal agents. He is a genius with a lot of amazing ideas about the future of very smart agents interacting with each other to do very useful things for people. We expect this will quickly become core to our
I parsed OpenAI’s tax filings for fun 📋 I used the “Extract” capability in LlamaCloud to automatically extract all relevant fields into structured JSON outputs. You can see their mission statement “ensure that artificial general intelligence benefits all of humanity” with the corresponding bounding boxes. (Inspired by @simonw’s post on this recently: https://simonwillison.net/2026/Feb/13/openai-mission-statement/) The extraction is powered by our core OCR engine, LlamaParse. LlamaParse is able to reconstruct this complex form PDF into markdown tables that capture each cell with ~100% accuracy. Check it out: https://cloud.llamaindex.ai/
RT ODSC (Open Data Science Conference) AI In this episode of the ODSC Ai X Podcast, we speak with @jerryjliu0, @llama_index, about the shift from early “RAG frameworks” to document AI workflows. 🎧Listen to the full episode here - https://hubs.li/Q042_7Wq0 Original tweet: https://x.com/_odsc/status/2022838519886926130
We have a native integration with @posthog! Parse, extract, and reason from your docs, and watch it show up in the analytics dashboard. Check out the example below 👇
🚀 The @posthog team has just rolled out LlamaIndex support for their LLM Analytics, and we built a demo to showcase what’s possible. Using LlamaIndex, LlamaParse, and OpenAI, our Agent Workflow compares product specifications and matches users with the most suitable option for
RT LlamaIndex 🦙 🚀 The @posthog team has just rolled out LlamaIndex support for their LLM Analytics, and we built a demo to showcase what’s possible. Using LlamaIndex, LlamaParse, and OpenAI, our Agent Workflow compares product specifications and matches users with the most suitable option for their use case 🛠️ 🦔 Thanks to PostHog’s observability integration, the demo automatically tracks OpenAI usage, including: •Token consumption •Cost breakdown •Latency metrics 🎥 Check out the video below to see it in action 👇 👩💻 GitHub: https://github.com/run-llama/product-specs-comparison 📚 Docs: https://posthog.com/docs/llm-analytics/installation/llamaindex 🦙 LlamaCloud: https://cloud.llamaindex.ai/signup Original tweet: https://x.com/llama_index/status/2022355660504207766
RT LlamaIndex 🦙 Next week at @DeveloperWeek, fuel up for hackathon success with free breakfast at the same bakery that fed the Super Bowl champions! 🥐 We're teaming up with @kilocode, @MiniMax_AI, and @withmartian to host breakfast for all @DeveloperWeek hackathon attendees on February 19th in San Jose. Look forward to: 🥞 Special edition cookies, breakfast burritos, and plenty of coffee to start your day right 🚶 Just a 5-minute walk from the convention center 🤝 Zero networking pressure, just good food and good company with fellow developers ⚡ Get energized before you dive into building something amazing Sign up for free breakfast: https://luma.com/devchampions Original tweet: https://x.com/llama_index/status/2022083315302547738
Extracting complex patent documents 💡📄 with natural language Patent documents are long and have varied sections: claims, prior art references, figures, and more. Our agent builder lets you express what you want to extract through natural language, and builds a workflow that lets you extract 1M+ patent documents accurately at scale! Check out the video below. Define the overall extraction in the chat builder, which deploys an agent app that can extract over any document and flag uncertain fields. If you have complex documents that you want to analyze, come check out our agent builder in LlamaCloud. You don’t even have to write a single line of code! https://cloud.llamaindex.ai/
This is AGI
I built a Claude Code notification system that uses Warcraft III Peon voice lines. It's probably the stupidest thing I've ever shipped. And according to everybody that has used it, it's also incredibly useful. (sound on)
Existing AI agents are largely short-horizon (e.g. chat) or constrained (e.g. agentic process automation). @sequoia predicted that 2026 is the year of long-horizon agents, and long-horizon agents ~= AGI We’re already seeing this with coding agents. But I’m specifically excited about how this translates to non-technical knowledge work over documents (e.g. legal, finance, insurance, back office) that humans typically perform for hours on end. I wrote a blog post that articulates the broader class of use cases this would unlock, and the ideal architecture/UX of an “agent inbox” that allows agents to run autonomously without being blocked on human input, unlike a chat interface. Some examples: 1. An agent that can continuously create a living FAQ from your Sharepoint, Slack, call transcripts 2. An agent that can e2e interface with lawyers to do contract redlining This fundamentally requires building agents that can monitor event triggers beyond human chat inputs, and can use core capabilities around parsing, extracting, creating, and editing documents of various formats. Check out the blog! https://www.llamaindex.ai/blog/long-horizon-document-agents If you’re interested in the building blocks around parsing, extraction, with more coming soon, come check out LlamaCloud: https://cloud.llamaindex.ai/
RT LlamaIndex 🦙 2026 is the year of long-horizon agents. @sequoia predicts that this year, agents will be able to tackle long-horizon tasks and work autonomously for hours to solve ambiguous tasks. We're excited about how this translates to knowledge work automation, particularly over documents. Let's take a look at "Long Horizon Document Agents" 🕰️ Agents are evolving to work autonomously over weeks, not just minutes, handling complex document tasks end-to-end. 🔄 These agents can continuously monitor events like document changes, comments, and deadlines - not just respond to chat prompts 📝 They maintain persistent task backlogs and can collaborate iteratively on living documents like FAQs, PRDs, and legal contracts 🎯 The interface shifts from chat boxes to "agent inboxes" that manage ongoing document tasks with clear status and context ⚡ This enables true automation of multi-step knowledge work - from due diligence memo updates to contract redline collaboration loops 2026 is shaping up to be the year agents evolve from "workflows" to "employees" - and we're building the document processing infrastructure to make this possible. Read @jerryjliu0's full blog on long horizon document agents: https://www.llamaindex.ai/blog/long-horizon-document-agents?utm_source=socials&utm_medium=li_social Original tweet: https://x.com/llama_index/status/2021992214038241477
We love parsing diagrams. Anthropic’s recent report on coding trends has a nice diagram on the evolution from single-agent to hierarchical multi-agent architectures With our latest VLM-enabled document parsing, we’re able to one-shot this diagram into a `mermaid` plaintext representation! Check out the results below. This capability lets you convert even the most complex diagrams within PDFs/Powerpoints into digestible graph representations that LLMs can understand. This lets you use AI to understand complex docs at scale; VLMs either can’t understand these diagrams out of the box, or you also end up burning unnecessary vision tokens. The report itself is an interesting overview of multi-agents, check it out: https://resources.anthropic.com/hubfs/2026%20Agentic%20Coding%20Trends%20Report.pdf?hsLang=en For diagram parsing, sign up to LlamaCloud: https://cloud.llamaindex.ai/ If you’re interested in chatting more about this, come talk to us: https://www.llamaindex.ai/contact
who needs Clay for outbound personalization when you have this
Coding agents are not great for open-source software, and it needs to adapt This is a really nice article from @LoganMarkewich that draws from our experience maintaining our own OSS library in the past 3 years. 1. Open-source used to be nice for human knowledge sharing. Now you really need to optimize it so that coding agents can pick it up. 2. Light abstractions (wrappers around LLMs, vector databases, etc.) are dead 3. There is a massive slopocalypse of ai-generated PRs, which breaks the fun community feeling of OSS. If you want to lean in and accept this, you need to document patterns clearly so coding agents can pick it up https://www.llamaindex.ai/blog/on-the-incoming-slopocalypse-and-the-death-of-open-source?utm_source=socials&utm_medium=li_social
The rise of coding agents is fundamentally changing open source - Our head of OSS @LoganMarkewich breaks down how LLM-powered coding agents are impacting core pillars of open source: 👥 Community interaction, which is getting complicated by low-quality, massive AI-generated PRs
RT LlamaIndex 🦙 The rise of coding agents is fundamentally changing open source - Our head of OSS @LoganMarkewich breaks down how LLM-powered coding agents are impacting core pillars of open source: 👥 Community interaction, which is getting complicated by low-quality, massive AI-generated PRs 💪 Personal skill development suffers when developers rely too heavily on AI assistance 🧠 Knowledge sharing is shifting as LLMs become the frontend for learning But open source isn't dead - it's evolving. We're shifting toward hackable reference implementations, community-driven knowledge sharing, and agent-friendly codebases that work with AI tools rather than against them. Read the full blog by Logan on how he views this evolution of open source projects: https://www.llamaindex.ai/blog/on-the-incoming-slopocalypse-and-the-death-of-open-source?utm_source=socials&utm_medium=li_social Original tweet: https://x.com/llama_index/status/2021631328236802318
Scaling Document Ingestion for AI Agents We're excited to partner with StackAI on this webinar. We'll show you how to build a modern agent stack that can automate knowledge work over millions of documents, across finance, legal, insurance use cases and more. Come check it out! https://lnkd.in/gzrWt33y
Are you trying to solve high-quality document ingestion for your product? Gain lessons from the field on how @stackai uses LlamaCloud to power high-accuracy document ingestion & retrieval across PDFs, images, spreadsheets & more — at enterprise scale. ➡️ Register now:
Parsing PDFs at scale with LLMs is cost prohibitive. Newer models (e.g. gemini 3) are good at reading pdfs, but you burn unnecessary vision tokens even when the page is text heavy. We’ve built a “cost-optimizer” into LlamaParse that dynamically routes pages to fast/cheap parsing depending on their complexity. Complex pages (e.g. those with tables/charts/diagrams) will still get routed to our VLM-enabled modes. This will let you save anywhere from 50-90% of parsing costs, at much higher accuracy than feeding page screenshots directly into VLMs. Check it out! https://cloud.llamaindex.ai/
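The routing idea can be sketched in a few lines. This assumes a per-page element classification already exists; LlamaParse's actual complexity detector is internal and not shown here:

```python
def route_page(page):
    """Route a parsed page to a cheap text pipeline or a pricier VLM pipeline."""
    COMPLEX = {"table", "chart", "diagram", "figure"}
    if COMPLEX & set(page["elements"]):
        return "vlm"   # complex layout: worth the vision tokens
    return "fast"      # text-heavy page: cheap OCR path

pages = [
    {"id": 1, "elements": ["text"]},
    {"id": 2, "elements": ["text", "table"]},
    {"id": 3, "elements": ["chart"]},
]
routes = {p["id"]: route_page(p) for p in pages}
print(routes)  # {1: 'fast', 2: 'vlm', 3: 'vlm'}
```

Since most pages in a typical document are text-heavy, routing only the complex minority to VLM modes is where the 50-90% cost saving comes from.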
RT LlamaIndex 🦙 Are you trying to solve high-quality document ingestion for your product? Gain lessons from the field on how @stackai uses LlamaCloud to power high-accuracy document ingestion & retrieval across PDFs, images, spreadsheets & more — at enterprise scale. ➡️ Register now: https://landing.llamaindex.ai/webinar-stackai-and-llamaindex-scaling-document-ingestion Original tweet: https://x.com/llama_index/status/2021259881823834552
RT ODSC (Open Data Science Conference) AI In this episode of the ODSC Ai X Podcast, we speak with @jerryjliu0, @llama_index, about the shift from early “RAG frameworks” to document AI workflows. 🎧Listen to the full episode here - https://hubs.li/Q042hjhv0 Original tweet: https://x.com/_odsc/status/2021026586779111700
We built LobsterX 🦞, an @openclaw specialized for document work on your computer. It uses high-accuracy document parsing, extraction, and classification through LlamaCloud, meaning it can comb through complicated PDFs (with scans, tables, diagrams) and extract 100% accurate context! It can run as a Telegram bot and is built on top of agentfs (@tursodatabase) as a file system. Big shoutout to @itsclelia. This is a fun project inspired by @openclaw’s success, and besides being a fun tool to use, it can be a great reference for building your own generalized coding agents! Readme: https://github.com/AstraBert/workflows-acp/blob/main/packages/lobsterx/README.md LlamaCloud: https://cloud.llamaindex.ai/signup
The tech world went crazy for @openclaw, so I decided to build a similar crustacean agent called LobsterX🦞, with a focus on document-processing tasks. I wrote a blog about it: http://clelia.dev/2026-02-09-the-anatomy-of-a-document-processing-agent, but, as a tl;dr: - LobsterX uses @llama_index cloud products to parse, extract
RT Clelia Bertelli (🦙/acc) The tech world went crazy for @openclaw, so I decided to build a similar crustacean agent called LobsterX🦞, with a focus on document-processing tasks. I wrote a blog about it: http://clelia.dev/2026-02-09-the-anatomy-of-a-document-processing-agent, but, as a tl;dr: - LobsterX uses @llama_index cloud products to parse, extract structured data and classify files - The agent only has access to a virtual filesystem (AgentFS by @tursodatabase) to avoid damaging your real one, and cannot execute arbitrary bash commands, preventing it from performing dangerous or security-critical operations🔒 - The agent is self-hostable and can be used both as a @Docker image and as a uv tool, and is available as a @telegram bot💬 📦 Install: 𝘶𝘷 𝘵𝘰𝘰𝘭 𝘪𝘯𝘴𝘵𝘢𝘭𝘭 𝘭𝘰𝘣𝘴𝘵𝘦𝘳𝘹 --𝘱𝘳𝘦𝘳𝘦𝘭𝘦𝘢𝘴𝘦=𝘢𝘭𝘭𝘰𝘸 📚 Read more: http://github.com/AstraBert/workflows-acp/blob/main/packages/lobsterx/README.md 📝 Read the blog: http://clelia.dev/2026-02-09-the-anatomy-of-a-document-processing-agent Original tweet: https://x.com/itsclelia/status/2020913910686355709
RT LlamaIndex 🦙 Everybody’s talking about @openclaw, so @itsclelia decided to build her own crustacean AI assistant for document workflows: LobsterX 🦞 LobsterX is built on top of our Agent Workflows and leverages LlamaCloud for document parsing, structured data extraction, and classification via powerful, modular tools🛠️ Designed with a safety-first mindset, the agent runs on AgentFS (by @tursodatabase) to protect your real filesystem and intentionally avoids full shell access to prevent security-critical or destructive operations 🔒 It’s fully self-hostable, can be run as a uv tool or @Docker container, and works out of the box as a @telegram bot 💬 📦 Install: 𝘶𝘷 𝘵𝘰𝘰𝘭 𝘪𝘯𝘴𝘵𝘢𝘭𝘭 𝘭𝘰𝘣𝘴𝘵𝘦𝘳𝘹 --𝘱𝘳𝘦𝘳𝘦𝘭𝘦𝘢𝘴𝘦=𝘢𝘭𝘭𝘰𝘸 📚 Read more: https://github.com/AstraBert/workflows-acp/blob/main/packages/lobsterx/README.md 🦙 Sign up to LlamaCloud: https://cloud.llamaindex.ai/signup Original tweet: https://x.com/llama_index/status/2020906615323623642
Yes he is on Singles Inferno but also we would unironically hire him (Samuel if you can’t find love you can always find AI)
In hindsight, a quant trader going on a Netflix dating show was definitely the top signal…
potrero hill in C tier is 100% rage bait
San Francisco Neighborhood Ranking S Tier: Pac Heights, Cole Valley, North Beach A Tier: Marina, Noe valley, Richmond, Hayes valley, Nob hills B Tier: Russian Hill, Haight Ashbury, Rincon Hill, Mission, Castro C Tier: Potrero Hill, Bernal Heights, Sunset D tier : SoMa,Dog patch
Parsing line charts is a hard task for VLMs VLMs are generally fine at coarse visual understanding, but they have a hard time reasoning about precise coordinates. Ask most VLMs, even those tuned for chart understanding, to parse a line chart into a table and they will struggle. We tested a few samples against Docling’s new granite-vision model, gemini 3 flash, gpt 5.2 pro, and a v0.1 of our own chart parsing (which is in beta and rapidly evolving). Most models fail, and some miss the entire chart. gpt 5.2 pro is closest but spends an absurd number of tokens reasoning through each point. Our own parsing is actually quite good, though of course, there are still some things we need to do to get to 100% accuracy. If you want to parse complex documents with diagrams/charts, come check out LlamaCloud! https://cloud.llamaindex.ai/
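For reference, the target output of chart parsing — a line chart flattened into a 2-D table — can be rendered like this. The hard part (recovering the points from pixels) is the VLM's job; this sketch only covers the serialization step, with made-up data:

```python
def series_to_markdown(x_label, series):
    """Render named (x, y) series as the 2-D markdown table a chart parser should emit."""
    names = list(series)
    xs = sorted({x for pts in series.values() for x, _ in pts})
    lookup = {name: dict(pts) for name, pts in series.items()}
    rows = [f"| {x_label} | " + " | ".join(names) + " |",
            "|" + "---|" * (len(names) + 1)]
    for x in xs:
        cells = [str(lookup[n].get(x, "")) for n in names]
        rows.append(f"| {x} | " + " | ".join(cells) + " |")
    return "\n".join(rows)

table = series_to_markdown("year", {"revenue": [(2023, 10), (2024, 14)]})
print(table)
```

Once a chart is in this form, a text-only LLM can reason over it without spending any vision tokens at all.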
Extracting structured outputs with LLMs is easy. But doing large-scale extraction with precise citations and bounding boxes back to the source documents is way harder. With our latest release in LlamaExtract, we extract citation bounding boxes along with every single key and value within a document. You can see this in the UI. Hover over any k:v pair and you’ll be able to see the corresponding highlights in the source doc. If you’re a human reviewing a million docs (resumes, IDs, invoices, claims, contracts), this will help you 5x your ability to verify values and make sure things are correct. Check out these new extraction upgrades in LlamaCloud: https://cloud.llamaindex.ai/
LlamaExtract citations just got an upgrade: we now show you exactly where extracted data comes from in your documents with new citation bounding boxes 🎯 This citations upgrade gives you visual proof of where each field originates: 📍 Precise bounding boxes highlight the exact
RT LlamaIndex 🦙 LlamaExtract citations just got an upgrade: we now show you exactly where extracted data comes from in your documents with new citation bounding boxes 🎯 This citations upgrade gives you visual proof of where each field originates: 📍 Precise bounding boxes highlight the exact location of extracted data in your source documents 🔍 Full citation transparency so you can verify and trust your extraction results 🚀 Also available through our API for seamless integration into your applications 📄 Perfect for compliance, auditing, and quality assurance workflows where traceability matters This makes LlamaExtract even more reliable for production document processing where you need to show your work and validate results. Try it out through the cloud UI or the API and see the difference visual citations make for your document extraction pipeline. Sign up to LlamaCloud to get started: https://cloud.llamaindex.ai?utm_source=socials&utm_medium=li_social Original tweet: https://x.com/llama_index/status/2019823118794330180
If you are around SF First Thursdays today, we are giving away gold ⚱️(like 1g of 24k gold for each qualified registrant*) Our office is right by 2nd and Mission Street overlooking the party. See the QR code on the right window! The main requirements are that you have use cases around PDF parsing, and you’re willing to see a demo of our agentic OCR technology within LlamaParse. First-come, first-served. We have 50g of gold reserves. 🙂 Outside of this, let us know if you ever want to come say hi!
this is an unhinged tweet lol
the sexual tension between anthropic and openai employees in SF must be insane
with opus 4.6, anthropic is expanding into general intelligence from its stronghold of coding with codex 5.3, openai is expanding into coding from its stronghold in general intelligence maybe this was always obvious, but there was a moment last year where i thought frontier labs would specialize/segment a bit more. but nope it's an arms race towards the same thing
We are running a contest on document understanding 🥇. 1️⃣ Find the *hardest* document you’ve seen - whether it’s a scanned form with barely legible handwriting, or a financial presentation with multiple line charts on the same page 2️⃣ Describe what you want to do over the document with natural language, in our agents builder. Deploy the workflow within LlamaCloud. 3️⃣ Share what you’re doing with us! We’re giving out $200 to the top “document agents” built over the next 3 weeks. Bonus points if the input document is super hard, the use case is very relevant, and/or the workflow is complex. Check it out here: https://cloud.llamaindex.ai/?utm_source=socials&utm_medium=li_social
Kicking off the Document Agent Olympics. Build a document agent - Win $200 🥇 Document agents turn messy PDFs, invoices, and filings into structured data you can actually use. Think: extracting financials from SEC filings, reconciling invoices against contracts, or processing a
with opus 4.5, anthropic is expanding into general intelligence from its stronghold of coding with codex 5.3, openai is expanding into coding from its stronghold in general intelligence maybe this was always obvious, but there was a moment last year where i thought frontier labs would specialize/segment a bit more. but nope it's an arms race towards the same thing
Introducing Claude Opus 4.6. Our smartest model got an upgrade. Opus 4.6 plans more carefully, sustains agentic tasks for longer, operates reliably in massive codebases, and catches its own mistakes. It’s also our first Opus-class model with 1M token context in beta.
RT LlamaIndex 🦙 Kicking off the Document Agent Olympics. Build a document agent - Win $200 🥇 Document agents turn messy PDFs, invoices, and filings into structured data you can actually use. Think: extracting financials from SEC filings, reconciling invoices against contracts, or processing a stack of resumes to surface top candidates. We're giving away three $200 prizes for the best agents built over the next 3 weeks. To enter: 💛 Deploy the agent to LlamaCloud 🤝 Make sure the agent repository is public 🚀 Explain what your document agent is solving 🔥 Bonus points for a good readme and demo video And most importantly, let us know what you think about our new LlamaAgents Builder! Ready to participate? Join the contest and start building your winning agent! Sign up to LlamaCloud to get started https://cloud.llamaindex.ai?utm_source=socials&utm_medium=li_social Original tweet: https://x.com/llama_index/status/2019457047415472237
2025 superbowl: kendrick 2026 superbowl: claude
I cannot comment on inter-company disagreements. What I can say is that Codex™ is now available, has 500,000 downloads, and is statistically likely to increase your builder productivity by a non-zero amount. Would you like me to help you get started?
First, the good part of the Anthropic ads: they are funny, and I laughed. But I wonder why Anthropic would go for something so clearly dishonest. Our most important principle for ads says that we won’t do exactly this; we would obviously never run ads in the way Anthropic
RT LlamaIndex 🦙 Ready to master production-grade multi-agent AI systems in one intensive day? 🚀 We're partnering with @AWS and leading AI companies for the AWS AI Builder Lab with @aicampai in San Francisco on Feb 13th - a hands-on competition where you'll build sophisticated agentic workflows, not basic chatbots. 🤖 Design and orchestrate powerful multi-agent workflows using cutting-edge tools 🏆 Compete in real-time challenges with live leaderboards and prizes for top performers 🔧 Get production-ready patterns you can implement immediately, plus enterprise tool trials Coordinate intelligent systems to tackle real-world challenges. Limited to 250 qualified developers. Register here: https://www.aicamp.ai/event/eventdetails/W2026021308 Original tweet: https://x.com/llama_index/status/2019139663462850628
Our parsing models are able to parse this massive diagram into mermaid format with 100% accuracy. Original doc on the left, parsed mermaid on the right. This is the “agentic plus” mode in LlamaParse, powered by state-of-the-art VLMs / agentic reasoning that can interpret complex relationships within any document page. If you have docs with a lot of flowcharts and diagrams, come check us out! https://cloud.llamaindex.ai/
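Once the nodes and edges of a diagram have been recovered, serializing them into mermaid plaintext is the easy part. A sketch with a hypothetical edge list — the recovery itself is what the VLM/agentic reasoning does, and is not shown:

```python
def to_mermaid(edges):
    """Serialize a directed edge list into a mermaid flowchart definition."""
    lines = ["graph TD"]
    for src, dst in edges:
        lines.append(f"    {src} --> {dst}")
    return "\n".join(lines)

# Illustrative nodes loosely inspired by a hierarchical multi-agent setup.
edges = [("Orchestrator", "Planner"), ("Orchestrator", "Coder"), ("Coder", "Reviewer")]
print(to_mermaid(edges))
```

The mermaid text is what makes the diagram "digestible": an LLM can traverse the graph structure directly instead of re-reading the image on every call.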
We’re excited to partner with @NTTData to help them scale out their document parsing/RAG pipelines for both their internal and client-facing use cases. This helps provide higher accuracy and allows internal teams to focus on building agents instead of data quality. Big thanks to Manuel for the kind words here. Check it out: https://www.youtube.com/watch?v=9bfkXXfSIkw
How does @NTTDATA scale enterprise AI? They use LlamaIndex to improve document parsing & power RAG applications for faster, more accurate AI systems. Watch the customer story 🎥: https://www.youtube.com/watch?v=9bfkXXfSIkw #AI #RAG #LlamaIndex #EnterpriseAI
RT Tuana MCP connects agents to live systems: databases, APIs, external services. It's designed for runtime tool access. But the moment you need to teach your agent how to approach a problem domain, you need something else. Skills aren't (just) about accessing data. They're about embedding knowledge into your agent's reasoning. When your agent needs to understand "here's the right sequence for debugging a data pipeline" or "this is how you validate and process complex documents," skills allow you to bake that knowledge into how the agent thinks. There's then also the whole matter of how they work fundamentally: MCP tools rely on an external connection and API calls. Skills are local. The issue isn't choosing between them. It's understanding that they kiiinda serve different purposes. MCP extends your agent's capabilities at runtime. Skills shape how your agent reasons about problems. I wrote all about this with @itsclelia in our latest blog: https://www.llamaindex.ai/blog/skills-vs-mcp-tools-for-agents-when-to-use-what Original tweet: https://x.com/tuanacelik/status/2019106807437038029
RT LlamaIndex 🦙 How does @NTTDATA scale enterprise AI? They use LlamaIndex to improve document parsing & power RAG applications for faster, more accurate AI systems. Watch the customer story 🎥: https://www.youtube.com/watch?v=9bfkXXfSIkw #AI #RAG #LlamaIndex #EnterpriseAI Original tweet: https://x.com/llama_index/status/2019094169290223807
no one's mentioned pie punks yet pie punks is extremely underrated
we spent a lot of time figuring out how to context engineer claude code with mcp tools and skills this comparison diagram should help you understand the tradeoffs between the two 👇
Are you choosing between MCP servers and skills for your agent? @tuanacelik and I wrote a blog post about the differences between the two and their PROs and CONs, informed by the lessons learnt while building our own coding agent (LlamaAgents Builder). The main takeaway? Skills
RT Clelia Bertelli (🦙/acc) Are you choosing between MCP servers and skills for your agent? @tuanacelik and I wrote a blog post about the differences between the two and their PROs and CONs, informed by the lessons learnt while building our own coding agent (LlamaAgents Builder). The main takeaway? Skills are easier to set up but tougher to keep up-to-date, whereas MCPs are more dev-facing, but offer a more deterministic toolset with less maintenance overhead. Check out the article 👉 https://www.llamaindex.ai/blog/skills-vs-mcp-tools-for-agents-when-to-use-what Original tweet: https://x.com/itsclelia/status/2018821269752611102
Should you use MCP or skills in your coding agent? We gained a lot of insight into this when we built our own coding agent (the LlamaAgents builder) within LlamaCloud. tl;dr Skills are way easier to set up, but unreliable/hard to maintain. Blog: https://www.llamaindex.ai/blog/skills-vs-mcp-tools-for-agents-when-to-use-what?utm_source=socials&utm_medium=li_social
Confused about whether to use Skills or MCP tools for your AI agents? We break down the key differences and when to use each approach. 🔧 MCP tools provides a more deterministic interface with fixed schemas - perfect for precise, predictable operations but require more technical
RT LlamaIndex 🦙 Confused about whether to use Skills or MCP tools for your AI agents? We break down the key differences and when to use each approach. 🔧 MCP tools provide a more deterministic interface with fixed schemas - perfect for precise, predictable operations, but they require more technical setup, and an overload of MCP tools can overwhelm an agent with choices 📝 Skills use natural language instructions in markdown files - minimal setup required but open to LLM interpretation variations ⚡ MCP involves network latency while Skills run locally, but MCP offers centralized updates that propagate automatically Read our full analysis with real examples: https://www.llamaindex.ai/blog/skills-vs-mcp-tools-for-agents-when-to-use-what?utm_source=socials&utm_medium=li_social Original tweet: https://x.com/llama_index/status/2018749615907213457
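The determinism contrast is easy to see side by side. A minimal, illustrative comparison — the tool dict only mimics the general shape of an MCP tool definition, and the skill is just markdown the LLM interprets; neither is taken from a real server or agent:

```python
# MCP-style tool: fixed schema, deterministic interface (illustrative shape only).
parse_tool = {
    "name": "parse_document",
    "inputSchema": {
        "type": "object",
        "properties": {"path": {"type": "string"}},
        "required": ["path"],
    },
}

# Skill: free-form markdown the agent interprets at run time.
parse_skill = """\
# Parsing documents
1. Run the parser on the file.
2. If the page has tables or charts, re-run in a VLM-enabled mode.
"""

def validate_call(tool, args):
    """The schema gives you a hard check that the markdown skill cannot."""
    required = tool["inputSchema"]["required"]
    return all(k in args for k in required)

print(validate_call(parse_tool, {"path": "report.pdf"}))  # True
print(validate_call(parse_tool, {}))                      # False
```

A malformed tool call can be rejected mechanically; a misread skill only surfaces later as wrong agent behavior, which is what makes skills harder to maintain.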
A week after PaddleOCR-VL-1.5 took the top spot on OmniDocBench, *another* 0.9B model dethrones it! GLM-OCR shows SOTA results on doc parsing benchmarks and it's apparently 50-100% faster https://huggingface.co/zai-org/GLM-OCR
Introducing GLM-OCR: SOTA performance, optimized for complex document understanding. With only 0.9B parameters, GLM-OCR delivers state-of-the-art results across major document understanding benchmarks, including formula recognition, table recognition, and information extraction.
Our default document parsing mode is now able to parse a complex research report with multiple embedded charts on a single page. This is the cheapest document OCR model out there that can turn complex visual documents into LLM-ready markdown. This is our agentic mode in LlamaParse. It starts at ~1c per page; if you’re looking to scale consumption we offer volume discounts. Come check it out! Sign up: https://cloud.llamaindex.ai/
RT LlamaIndex 🦙 AI in 60 Seconds ⏱️ @Experian's Head of AI Innovation shares how AI is shaping the future of fintech, from smarter decisions to better financial experiences. Watch now 👇 https://www.youtube.com/shorts/SHAAHecmNeE #AI #Fintech #LlamaIndex Original tweet: https://x.com/llama_index/status/2018374580834804216
I built a simple paralegal agent to detect, classify, and extract information from court filings🧑⚖️- complaints, motions, orders, and more! Wrote a prompt in English, and it encoded it into a deterministic, repeatable workflow that you can use to run on millions of court filings at scale. 1. A new docket entry arrives 2. Detect what type of filing it is (motion/order/complaint/etc) 3. Define a separate schema for each new filing 4. Synthesize results in a UI with human review. Can also plug it in as an API into your workflows This is available to everyone in our LlamaCloud agents builder. The prompt is exactly what you see in the screenshot; you can make it more/less verbose if you’d like. Come check it out! LlamaCloud by @llama_index: https://cloud.llamaindex.ai/
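Steps 1-3 of the workflow above amount to classify-then-route: detect the filing type, then pick the extraction schema for that type. A toy sketch where a keyword match stands in for the LLM-based detection, with hypothetical schemas:

```python
# Hypothetical per-filing-type extraction schemas (step 3).
SCHEMAS = {
    "motion":    ["movant", "relief_requested", "hearing_date"],
    "order":     ["judge", "ruling", "date_entered"],
    "complaint": ["plaintiff", "defendant", "causes_of_action"],
}

def classify(docket_text):
    """Naive keyword classifier standing in for the real LLM detection step."""
    text = docket_text.lower()
    for filing_type in SCHEMAS:
        if filing_type in text:
            return filing_type
    return "unknown"

def route(docket_text):
    """Detect the filing type, then return the schema to extract against."""
    filing_type = classify(docket_text)
    return filing_type, SCHEMAS.get(filing_type, [])

print(route("MOTION to dismiss filed by defendant"))
# ('motion', ['movant', 'relief_requested', 'hearing_date'])
```

The deterministic part is exactly this routing; the per-schema extraction and the human-review UI (step 4) sit on top of it in the deployed workflow.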
would be incredible if this single-handedly reverses claude code's dominance
@Yuchenj_UW I don’t let Claude Code on my codebase. It’s all codex. Would be too buggy with Opus.