document OCR + workflows @llama_index. cofounder/CEO Careers: https://t.co/EUnMNmb4DZ Enterprise: https://t.co/Ht5jwxRU13
Parsing complex tables in PDFs is extremely challenging. Existing metrics for measuring table accuracy, like TEDS (tree edit distance similarity), overweight exact table structure and underweight semantic correctness.

🚫 Overweight: if the rows within a table are out of order - even if the semantic meaning is still consistent - TEDS heavily penalizes the output, even though a downstream AI agent would have no problem interpreting the values.
🚫 Overweight: if the HTML is semantically equivalent but output with different tags (th vs. td), TEDS will penalize it.
🚫 Underweight: if the header is dropped or transposed, TEDS only mildly penalizes the output, even though the entire semantic meaning of the table is destroyed.

We recently released ParseBench, a comprehensive enterprise document benchmark with a heavy focus on *semantic correctness* for tables. We define a new metric, TableRecordMatch, which treats a table as a bag of records, where each record is a dictionary of key-value pairs: keys are the headers and values are the cell values. We combine it with the GriTS metric (more robust than TEDS) to produce the final GTRM score.

It’s worth giving our full paper a read if you haven’t already. Also come check out our website hub!
Website: http://parsebench.ai/
Blog: https://www.llamaindex.ai/blog/parsebench?utm_medium=socials&utm_source=xjl&utm_campaign=2026-apr-
Paper: https://arxiv.org/abs/2604.08538?utm_medium=socials&utm_source=twitter&utm_campaign=2026-apr-
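To make the bag-of-records idea concrete, here's a toy sketch (my own illustration, not the official ParseBench implementation; the real TableRecordMatch/GTRM details are in the paper). It scores a predicted table against a gold table by greedily matching header-keyed records, so row order doesn't matter, but a dropped or transposed header tanks the score:

```python
def table_to_records(header, rows):
    """Convert a table into a bag of records keyed by its column headers."""
    return [dict(zip(header, row)) for row in rows]

def record_match_score(pred, gold):
    """Toy bag-of-records score: greedily match each gold record to the
    most-overlapping unused predicted record (overlap = Jaccard over
    key/value pairs), then normalize by the larger table size so extra
    or missing rows are penalized."""
    pred_sets = [set(r.items()) for r in pred]
    gold_sets = [set(r.items()) for r in gold]
    matched, used = 0.0, set()
    for g in gold_sets:
        best, best_i = 0.0, None
        for i, p in enumerate(pred_sets):
            if i in used:
                continue
            overlap = len(g & p) / max(len(g | p), 1)
            if overlap > best:
                best, best_i = overlap, i
        if best_i is not None:
            used.add(best_i)
            matched += best
    return matched / max(len(gold_sets), len(pred_sets), 1)
```

Shuffling rows keeps a perfect score, while transposing the header labels drops it to zero - exactly the two cases that TEDS gets backwards.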
RT LlamaIndex 🦙 Let's talk parsing tables. Two days ago we launched ParseBench, the first document OCR benchmark built for AI agents. This deep dive breaks down TableRecordMatch (GTRM), our metric for evaluating complex tables the way your pipeline actually consumes them: as records keyed by column headers. https://www.llamaindex.ai/blog/parsebench?utm_medium=socials&utm_source=twitter&utm_campaign=2026-- Original tweet: https://x.com/llama_index/status/2044420652224975203
This is awesome
we couldn’t afford the sphere so we got the next best thing. costs less, hits harder
Document OCR benchmarks are still an open problem. Existing document OCR benchmarks are either too narrowly focused on a specific document type (e.g. FinTabNet, ChartQA), or on documents that aren’t reflective of real-world tasks (e.g. OmniDocBench, OlmOCR-bench over academic papers).

ParseBench is a step towards solving this problem.
* It tries to comprehensively cover real-world document distributions within the enterprise.
* It contains comprehensive evaluations across 5 different dimensions (tables, charts, content faithfulness, formatting, grounding).
* It uses metrics that optimize for agent semantic understanding rather than structural similarity.

We released this yesterday, and there’s a TON of content:
1. Whitepaper
2. HF dataset
3. Github repo
4. Blog
5. Video

And today, we’re excited to feature http://parsebench.ai, our home page for ParseBench 💫 come check it out!

Take a look at some of our other materials if you’re interested:
Blog: https://www.llamaindex.ai/blog/parsebench?utm_medium=socials&utm_source=xjl&utm_campaign=2026-apr-
Paper: https://arxiv.org/abs/2604.08538?utm_medium=socials&utm_source=twitter&utm_campaign=2026-apr-
ParseBench is the most comprehensive OCR benchmark for real-world enterprise documents: financial filings, contracts, insurance documents, and more. We evaluate across 5 dimensions that are present throughout these documents:
1. Tables: including merged cells, hierarchical headers, cross-page tables
2. Charts: data point extraction
3. Content faithfulness: omitted/hallucinated text, reading order
4. Semantic formatting: strikethrough, superscripts, bold/italics
5. Visual grounding: element localization, classification

If you're dealing with paperwork-heavy use cases in finance, insurance, legal, and more, come check it out. There's a *lot* of content in there, and we'll be doing deep dives into each of the dimensions.

Find full details in our blog and ArXiv paper!
Blog: https://www.llamaindex.ai/blog/parsebench?utm_medium=socials&utm_source=xjl&utm_campaign=2026-apr-
ArXiv: https://arxiv.org/abs/2604.08538?utm_medium=socials&utm_source=twitter&utm_campaign=2026-apr-
This is why we released liteparse :) Free, open-source, designed for agents. Natively supports OCR / screenshotting for deeper visual understanding in a document when needed.
@kepano I just tried it this morning on the 245-page Mythos PDF and it failed badly - the outputs were all mangled. Converting PDFs is really hard; I think it probably has to be a Skill, not a program, for a SOTA LLM to work properly.
RT Javed Alam OCR benchmark Original tweet: https://x.com/jalam1001/status/2043826836656807941
RT Ankur Goyal great to see more open evals for an important problem Original tweet: https://x.com/ankrgyl/status/2043758595721048227
RT simon Everyone who thinks vision is solved should look at some enterprise documents Original tweet: https://x.com/disiok/status/2043740333394231755
We’re open sourcing the first document OCR benchmark for the agentic era, ParseBench. Document parsing is the foundation of every AI agent that works with real-world files. ParseBench is a benchmark that measures parsing quality specifically for agent knowledge work:
✅ It optimizes for semantic correctness (instead of exact similarity)
✅ It has the most comprehensive distribution of real-world enterprise documents

It contains ~2,000 human-verified enterprise document pages with 167,000+ test rules across the five dimensions that matter most: tables, charts, content faithfulness, semantic formatting, and visual grounding.

We benchmarked 14 well-known document parsers on ParseBench, from frontier/OSS VLMs to specialized parsers to LlamaParse. Here are some of our findings:
💡 Increasing compute budget yields diminishing returns - Gemini/gpt-5-mini/haiku gain 3-5 points from minimal to high thinking, at 4x the cost.
💡 Charts are the most polarizing dimension for evaluation. Most specialized parsers score below 6%, while some VLM-based parsers do a bit better.
💡 VLMs are great at visual understanding but terrible at layout extraction. GPT-5-mini/haiku score below 10% on our visual grounding task; all specialized parsers do much better.
💡 No method crushes all 5 dimensions at once, but LlamaParse achieves the highest overall score at 84.9%, and is the leader in 4 out of the 5 dimensions.

This is by far the deepest technical work that we’ve published as a company. I would encourage you to start with our blog and explore our links, from Hugging Face to GitHub. All the details are in our full 35-page (!!) ArXiv whitepaper.
🌐 Blog: https://www.llamaindex.ai/blog/parsebench?utm_medium=socials&utm_source=xjl&utm_campaign=2026-apr-
📄 Paper: https://arxiv.org/abs/2604.08538?utm_medium=socials&utm_source=twitter&utm_campaign=2026-apr-
💻 Code: https://github.com/run-llama/ParseBench?utm_medium=socials&utm_source=twitter&utm_campaign=2026-apr-
📊 Dataset: http...
We pit LlamaParse against frontier models (Opus 4.6, Gemini 3.1 Pro, GPT-5.4) in a live OCR arena. ICYMI: the full workshop is on YouTube!

Frontier VLMs are getting quite good at visual understanding, but they suffer from a long list of issues on document understanding tasks:
🚫 If the table is dense, they will drop values
🚫 Imprecise chart transcription
🚫 Hallucinations on dense text, even though the values are plainly present in the source document
🚫 They will refuse to extract content from certain pages due to content filters
🚫 They are oftentimes way too expensive

We did this live webinar a few weeks ago, and the recording is on YouTube. We show comparisons between the parsed results with LlamaParse - which orchestrates text- and vision-based models - vs. one-shotting it into a frontier model. In it, George also gives a comprehensive overview of why document understanding is a hard problem in the first place. Come check it out! https://www.youtube.com/watch?v=rlqPlIoaH9I
LiteParse is the best document parsing library for coding agents. It's free, fast, integrates directly with the LLM's native visual understanding capabilities, and comes with support for 50+ formats and text bounding boxes.

We hit 4K+ Github stars in 3 weeks 📈 and we're continuing to ship improvements and use cases on top. There are so many applications for this.

We're getting started with an initial workshop hosted by @LoganMarkewich on April 28th 9am PT. Check it out: https://landing.llamaindex.ai/liteparse
Repo: https://github.com/run-llama/liteparse
RT LlamaIndex 🦙 LiteParse hit 4K+ GitHub stars in 3 weeks. ~500 pages in 2 seconds. No GPU. No API keys. 50+ file formats. Now @LoganMarkewich, our Head of Open Source, will show you how to build with it. Live workshop — April 28, 9 AM PST: Build a Financial Due Diligence Agent with LiteParse. Raw financial PDFs → structured agent-ready data. We'll build it live. Register → https://landing.llamaindex.ai/liteparse Original tweet: https://x.com/llama_index/status/2042633839156342843
Completely agreed. Everyone in SF knows how good these models are at coding / work automation. When it comes to writing, there’s still an insane amount of model-generated AI slop, because there’s no easy, guided way for users to prompt and train models to write and communicate in a way that doesn’t evoke the “AI ick”. Part of this is stylistic (em-dashes), but part of it is also the tooling around these models to help shape the user intent and communicate in a believable manner.
Judging by my tl there is a growing gap in understanding of AI capability. The first issue I think is around recency and tier of use. I think a lot of people tried the free tier of ChatGPT somewhere last year and allowed it to inform their views on AI a little too much. This is
Ramp is setting the gold standard for AI usage for any company that’s not OpenAI/Anthropic If you’re not tokenmaxxing you’re falling behind
Our company mission today is to give AI agents the highest-quality document context. The native open-source libs that agents have access to (e.g. PyPDF) do naive text extraction, but this is incomplete for most advanced knowledge work. AI agents need the following from documents:
✅ Clean, linearized markdown from multimodal complex documents (charts, tables, scans)
✅ Rich layout / bounding boxes - agents should be able to trace every generated answer/decision back to the source!
✅ Proper image segmentation. Don't just give the agent access to the full page; also give it access to image segments so it can generate more targeted citations.
✅ The ability to define custom schemas so agents can extract from documents in a structured format.

We're working on this with LlamaParse (our VLM-powered service) and LiteParse (our free OSS parser). Check out the blog here! https://www.llamaindex.ai/blog/beyond-raw-text-how-llamaparse-and-liteparse-give-agents-real-document-understanding?utm_medium=socials&utm_source=twitter&utm_campaign=2026-apr-
Automating Loan Income Verification with AI agents 💸☑️

Loan processors spend a huge portion of their time verifying the borrower’s documents (tax forms, pay stubs, bank deposits) and cross-checking them for consistency.

This tutorial shows you how to encode that logic into an agentic workflow that combines state-of-the-art document OCR/extraction with agentic reasoning over structured outputs to surface discrepancies between documents. All decisions are flagged to a human for approval. For these sensitive applications where 100% accuracy is critical, the goal is not to replace the human but to help them make decisions more efficiently and accurately.

The full tutorial is here: https://github.com/jerryjliu/llamaparse_use_cases/blob/main/loan_processing/tutorial.md. Feel free to follow it as a human or point your coding agent towards it.

If you’re interested, come check out LlamaParse! https://cloud.llamaindex.ai/?utm_source=xjl&utm_medium=social
RT LlamaIndex 🦙 Agents like @openclaw are incredibly powerful, as long as the information they receive is clean and structured🦞 When it comes to PDFs and other unstructured documents, most agents struggle. The tools they rely on often return only raw text, losing critical context like layout, tables, and images❌ That’s why we created LlamaParse and LiteParse Agent Skills, designed to give agents access to a deeper layer of document understanding, enabling more reliable knowledge extraction and automation across complex documents📝 📚Learn more about the problem, and how the skills solve it: https://www.llamaindex.ai/blog/beyond-raw-text-how-llamaparse-and-liteparse-give-agents-real-document-understanding?utm_medium=socials&utm_source=twitter&utm_campaign=2026-apr- 🦙 Get started with LlamaParse: https://cloud.llamaindex.ai/signup?utm_medium=li_socials&utm_source=twitter&utm_campaign=2026-apr- Original tweet: https://x.com/llama_index/status/2042256316954194127
Trying to DIY your own document parser by screenshotting pages into a frontier VLM (Opus, 5.4, Gemini) breaks down when you try to scale it up into production workflows. Here are two edge cases we've observed:
1️⃣ Repetition and whitespace errors: the LLM starts outputting repeated characters like spaces/newlines/tabs and won't stop.
2️⃣ Recitation issues: the model's safety filters block your prompt to extract all text out of every page of a document, thinking it's a copyright violation.

This is a great blog by George (our head of eng), come check it out! https://www.llamaindex.ai/blog/engineering-insights-failure-modes-that-break-vlm-powered-ocr-in-production?utm_medium=socials&utm_source=twitter&utm_campaign=2026-apr-

If you don't want to deal with these issues, come check out LlamaParse: https://cloud.llamaindex.ai/?utm_source=xjl&utm_medium=social
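For the first failure mode, a cheap guard in your own pipeline is to scan the model's output for degenerate character runs and abort/retry when one appears. A minimal sketch (illustrative only; the threshold is arbitrary, and this is not LlamaParse's actual mitigation):

```python
def has_repetition_loop(text: str, max_run: int = 200) -> bool:
    """Detect a degenerate repetition loop: any single character
    (space, newline, tab, ...) repeated more than `max_run` times in a row."""
    run_char, run_len = "", 0
    for ch in text:
        if ch == run_char:
            run_len += 1
            if run_len > max_run:
                return True
        else:
            run_char, run_len = ch, 1
    return False
```

A production guard would also watch for repeated n-grams (whole lines or table rows echoed forever), not just single characters, and ideally run on the stream so you can cut the request off before burning the full output budget.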
If you're an AI/agent builder, it's so important that you don't overbuild and overcommit on a specific toolset and infrastructure. Frontier labs are shipping not just the models, but the harnesses and surrounding tooling, such that your existing stack might be obsolete next week.
* e.g. if you had a super complex RAG stack, you may need to rip it out in favor of agents + sandboxes
* e.g. if you spent a lot of time building the sandbox and serving layer, you may not need it anymore if you can just bootstrap the product with Claude Managed Agents

The tradeoff depends entirely on how good these proprietary agent wrappers are out of the box. Back when the OpenAI Agent SDK came out, most people did not switch away from frameworks, because the frameworks were simply more powerful. Nowadays tools like the Claude Agent SDK + managed agent services are getting way better.
Introducing Claude Managed Agents: everything you need to build and deploy agents at scale. It pairs an agent harness tuned for performance with production infrastructure, so you can go from prototype to launch in days. Now in public beta on the Claude Platform.
RT LlamaIndex 🦙 Common Failure Modes Break VLM-Powered OCR in Production. 🔁 Repetition Loops — model spirals into infinite whitespace, exhausts resources, cascades latency across your system 🛑 Recitation Errors — safety filters hard-stop legitimate extractions as "copyright violations" Same pipeline. Completely different root causes. Completely different fixes. Our engineering leadership broke down what went wrong and how we solved both 👇 https://www.llamaindex.ai/blog/engineering-insights-failure-modes-that-break-vlm-powered-ocr-in-production?utm_medium=socials&utm_source=twitter&utm_campaign=2026-apr- Original tweet: https://x.com/llama_index/status/2041923086719631780
RT Erick 🚨 This is LITERALLY GOLD for lawyers, analysts, researchers, and agent builders. @jerryjliu0 just dropped /research-docs: the skill that turns Claude into a professional researcher. Give it a folder of dense documents and it returns a complete research report with:
- Exact word-level citations.
- Bounding boxes that show you where on the original page each data point lives.
- A beautiful, 100% auditable HTML report.
All with LiteParse (LlamaIndex's local, ultra-fast, model-free parser). REPO 👇 Original tweet: https://x.com/ErickSky/status/2041691680076681669
This is a great tutorial (credits @itsclelia + @lancedb) on how to build a practical retrieval pipeline that integrates directly with your agent harness.
1. Ingest a massive pile of docs with liteparse.
2. Store data in a vector db (despite my memes to the contrary, you will need some database for larger-scale retrieval).
3. Pair with image screenshotting tools that allow the agent to "dive deeper" into the data.

When you pair this with the Claude Agent SDK / Claude Code, the agent will do an initial retrieval pass to pull the relevant doc, and then use screenshotting/VLM-enabled capabilities to do deeper analysis.

Blog: https://www.lancedb.com/blog/smart-parsing-meets-sharp-retrieval-combining-liteparse-and-lancedb
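The store-then-retrieve step (2) can be sketched with a toy in-memory table. In the actual tutorial that role is played by LanceDB and a real embedding model; the hand-written 2-d vectors below are stand-ins for illustration:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

class ToyStore:
    """In-memory stand-in for the vector DB step: each row keeps the chunk
    text, its embedding, and the page screenshot path so the agent can
    'dive deeper' visually after the initial retrieval pass."""
    def __init__(self):
        self.rows = []

    def add(self, text, vector, screenshot):
        self.rows.append({"text": text, "vector": vector, "screenshot": screenshot})

    def search(self, query_vector, k=3):
        ranked = sorted(self.rows,
                        key=lambda r: cosine(r["vector"], query_vector),
                        reverse=True)
        return ranked[:k]
```

The key design point is storing the screenshot path alongside text and vector in the same row, so one retrieval hit hands the agent both the text chunk and the image it came from.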
RT Clelia Bertelli (🦙/acc) How can you improve your agentic search pipeline? I just wrote a blog post with @tech_optimist from @lancedb to answer exactly that. TLDR: - Parse files and take page-level screenshots with LiteParse, the parser we just open sourced at @llama_index - Chunk and embed text, and store everything (text, image bytes, vector data) in a local LanceDB instance - Expose text and image retrieval tools to a Claude agent, and let it reason on both data types With our eval dataset, the agent got near-perfect scores on most complex QA tasks, showing how a strong parsing foundation and multimodal retrieval can really improve your search🚀 Read the full breakdown here: https://www.lancedb.com/blog/smart-parsing-meets-sharp-retrieval-combining-liteparse-and-lancedb Original tweet: https://x.com/itsclelia/status/2041638826091946450
I built a Claude Code skill that lets it generate a deep research report over any collection of complex docs (PDFs, Word, PPTX)... and generate word-level citations and bounding boxes that point directly back to the source! 📝

Check out “/research-docs”.
1. It parses out text and bounding boxes from every doc with liteparse, in seconds.
2. It then generates a full HTML report of the outputs that lets you see word-level citations on each page.

Raw Claude obviously has deep research capabilities, but it lacks an audit trail back to the source. This skill gives you a research report that can be audited by others.

Check it out: https://github.com/jerryjliu/liteparse_samples
LiteParse: https://github.com/run-llama/liteparse
RT LlamaIndex 🦙 Open call to fintech leaders in NYC 🏦 May 13, in-person workshop with @jerryjliu0 on turning complex financial docs into LLM-ready data using agentic OCR. Build real pipelines. Hear from a Top 5 PE firm's production agent. Make sure to bring your laptops→ https://luma.com/updli8i6 Original tweet: https://x.com/llama_index/status/2041546697499910279
Hosted filesystems for agents are the new RAG
Introducing The File System Company of San Francisco http://filesystem.company
I just listened to the podcast episode last night "19B end of Feb, wonder where they are right now" absolutely wild
Breaking: Anthropic is now at $30B ARR. Up from $19B in February. That's $11B ARR added in one month. WAT.
Let Claude Code automate your business operations.

I started a tutorial series providing examples of real-world, document-heavy tasks that can be automated with agentic workflows: starting with KYC (know your customer) and loan processing, and expanding to other examples. These tutorials are designed as much for coding agents as for humans.

A lot of these operations tasks require extremely high accuracy, which requires both
1️⃣ Extremely high-quality extraction with confidence scores and citations
2️⃣ An agentic workflow with nonzero determinism: extract from a prescribed set of documents (identification, statements), and conform to a specified output format.

If you’re also trying to build real-world, business-critical, document-heavy operations workflows, feel free to point your coding agent at this repo! It uses LlamaParse for high-accuracy document parsing/extraction, complete with citations and confidence scores, to give you guarantees on capabilities.

Check out the KYC tutorial: https://github.com/jerryjliu/llamaparse_use_cases/blob/main/kyc/tutorial.md
Tutorial: Automating KYC with AI agents 🪪🕵

I’m creating a new tutorial series on automating practical document workflows with agents. Every financial institution needs to perform KYC (know your customer) to verify a customer’s identity, and this involves manually sifting through IDs, bank statements, etc. and doing the cross-checking by hand.

This is a great first use case for agentic document workflows:
1. Extract identification information from the user-supplied ID (license, passport)
2. Extract fields from utility bills/bank statements and then use LLMs to cross-validate the extracted fields against the extracted ID fields

It obviously doesn’t cover the full e2e process and uses publicly available online data, but it should be a good reference guide to get started. To make this work well, you do need high-quality document extraction with confidence scores and citations!

Check out the tutorial: https://github.com/jerryjliu/llamaparse_use_cases/blob/main/kyc/tutorial.md
If you’re interested come check out LlamaParse: https://cloud.llamaindex.ai/?utm_source=xjl&utm_medium=social
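The cross-validation step (2) boils down to fuzzy comparison of fields extracted from each document, with low-similarity fields flagged for human review. A rough sketch (the field names and the 0.85 threshold are made up for illustration; the tutorial uses LlamaParse extraction with confidence scores upstream of this logic):

```python
from difflib import SequenceMatcher

def similarity(a: str, b: str) -> float:
    """Normalized fuzzy similarity, case- and whitespace-insensitive."""
    norm = lambda s: " ".join(s.lower().split())
    return SequenceMatcher(None, norm(a), norm(b)).ratio()

def cross_validate(id_fields: dict, statement_fields: dict, threshold: float = 0.85):
    """Compare every field present in both documents; anything below the
    threshold is flagged for human review rather than auto-rejected."""
    flags = []
    for key in id_fields.keys() & statement_fields.keys():
        score = similarity(id_fields[key], statement_fields[key])
        if score < threshold:
            flags.append({"field": key,
                          "id": id_fields[key],
                          "statement": statement_fields[key],
                          "score": round(score, 2)})
    return flags
```

Fuzzy matching absorbs harmless formatting drift ("Jane A. Doe" vs "JANE A DOE") while still surfacing real discrepancies like a mismatched address, which is exactly what you want when the final decision stays with a human.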
RT Clelia Bertelli (🦙/acc) At @llama_index, we're committed to building the most capable document agents. That starts with powerful document processing building blocks like LlamaParse and LlamaExtract, but great agents also need the right access controls, as they should only see the documents they’re authorized to use. That’s why we teamed up with @auth0 (by @okta) to build a real-world demo of a secure document processing and retrieval pipeline, powered by fine-grained authentication so only trusted actors can access specific content. 📚 Learn how it works in the blog post: https://auth0.com/blog/securing-ai-documents-llamaindex-auth0/ 🦙 Get started with LlamaParse: http://cloud.llamaindex.ai/signup Original tweet: https://x.com/itsclelia/status/2041210431143109016
RT Kartik Talamadupula I will be speaking on this panel this afternoon at 12:15 pm PST on technical debt in agentic AI systems and the reality of deployments, especially at enterprise scale. The conference is free to attend, you can register below: https://datasciencedojo.com/agentic-ai-conference/?utm_term=kartik_speaker_highlight&utm_campaign=39611138-2026%20Agentic%20AI%20Conference&utm_content=fodai_agentic_ai&utm_medium=social&utm_source=linkedin&hss_channel=lcp-3740012#register See you at the conference! Original tweet: https://x.com/kr_t/status/2041191960690790726
This is a cool article that shows how to *actually* make filesystems + grep replace a naive RAG implementation. ̶F̶i̶l̶e̶s̶y̶s̶t̶e̶m̶s̶ ̶+̶ ̶g̶r̶e̶p̶ ̶i̶s̶ ̶a̶l̶l̶ ̶y̶o̶u̶ ̶n̶e̶e̶d̶ ̶ Database + virtual filesystem abstraction + grep is all you need
RT Logan Markewich A super useful tool to explore LiteParse outputs! Original tweet: https://x.com/LoganMarkewich/status/2040126119488454929
We hosted our first-ever @llama_index in-office meetup yesterday to pregame First Thursdays 🦙🔥

There are more involved events we’ll be putting on with panels/talks/workshops where you learn stuff, but this one is pure vibes and fun. We had alcohol and also many non-alcoholic beverage options 🍻🧃

If you couldn’t make it (we had hundreds of registrants but didn’t admit too many initially to be conservative about space), we’re going to be throwing this every month! Next time we will double the number of pizzas. I will be putting on my EDM playlist 🪩
LiteParse Samples 📄🧑🏫

LiteParse is our free and fast document parser that can parse text with bounding boxes from any document. I’ve created a new repository with demos of how you can make use of its outputs!
✅ Comparison against PyPDF and PyMuPDF
✅ Visual Citations: search for any keyword and see joined bounding boxes pop up in the source images

It should be a default tool to help AI agents OCR any document type (supports PDF, Word, PowerPoint + dozens of other formats) before reaching for heavier-weight, VLM-based document parsing solutions (like our own LlamaParse).

LiteParse Samples repo: https://github.com/jerryjliu/liteparse_samples
LiteParse: https://github.com/run-llama/liteparse
RT LlamaIndex 🦙 We took a brief break from parsing PDFs this First Thursday and welcomed the AI community to "Series B Lane" in San Francisco 🦙 New office. A-parse-rol Spritzes. LlamaIsland Iced Teas. 100+ builders. Then everyone walked one block to catch Reggie Watts at SF's street fest. More of this coming soon 🎥⬇️ Original tweet: https://x.com/llama_index/status/2040100304138518960
Access control is one of the top priorities across every enterprise organization when securing AI agents. We're excited to collaborate with @auth0 on this blog post. We're building the infrastructure enabling agents to automate document-heavy work (invoices, contracts, claims, market research, and much much more), and @okta / @auth0 is providing the infrastructure to make sure they operate under the right guardrails and permissions. https://auth0.com/blog/securing-ai-documents-llamaindex-auth0/
One thing that keeps coming up when teams add AI to their stack: auth gets way more complicated than the standard "who is logged in". You start asking questions like
⚪️ whose agent did this?
⚪️ what docs can my agent go read?
⚪️ who do i blame when things go wrong? @itsclelia
RT Shiven Ramji Fine-grained authorization for RAG is one of the most underestimated problems in production AI. If your agent can retrieve documents, it needs to enforce who's allowed to see them, not just at the role level. With @auth0 FGA and LlamaIndex Workflows, authorization is structural: baked into the retrieval step, not bolted on at the API layer. Great collaboration with @jerryjliu0 and the @llama_index team showing exactly how this works in production → https://auth0.com/blog/securing-ai-documents-llamaindex-auth0/ Original tweet: https://x.com/thinkshiv/status/2039836920243486790
This is exactly what I've been doing with Claude Code. The biggest bottleneck in my ability to use these agents is ensuring they preserve relevant context between sessions. Having the agent output files in .md and .html is not only a nicer way to view outputs than in the terminal, but also a good way to preserve context for future sessions.

also been using Obsidian to view locally generated .md files

the only slight hiccup is that the native harnesses aren't amazing at handling non-plaintext files (.pdf, .pptx, and more); the open-source skills use libraries that aren't optimized for generating readable text from complex-layout docs. we built liteparse for this purpose, to replace pypdf/pymupdf (https://github.com/run-llama/liteparse) - i use it as part of my local claude code harness
LLM Knowledge Bases Something I'm finding very useful recently: using LLMs to build personal knowledge bases for various topics of research interest. In this way, a large fraction of my recent token throughput is going less into manipulating code, and more into manipulating
RT simon come burn tokens Original tweet: https://x.com/disiok/status/2039786133656395947
One of the benefits of working at @llama_index is that you get access to effectively unlimited* tokens. This includes the entire company, not just engineering! AI is a force multiplier on productivity. The more tokens you burn, the more we celebrate. The success of any knowledge worker in this new era is predicated on our ability to use agents to delegate, automate, and complement our own reasoning to produce high-quality outputs at a much faster rate.

Note: I’ve seen comments in related threads along the lines of “burning more tokens for the sake of it is stupid”, “it’s optimizing for the wrong thing”, etc. I’m not here to argue this strawman. Everyone using AI - from engineering to marketing to GTM - has seen massive productivity gains. The fact is that if you’re not burning tokens, you’re getting left behind.

If you want to come work at a company that not only makes cool AI stuff, but is organizationally AI-native, come check out our careers page. https://www.llamaindex.ai/careers

*(we are not a frontier lab so there are T&Cs to this, but no one internally has reached this limit yet 🙂)
One of our company goals is to automate manual data entry from documents ✍️📑 Our Extract feature in LlamaParse does exactly that, and today we are launching Extract v2 🚀 Define a schema in natural language, and our agentic extraction will fill out the schema from the document with both exact-match citations and semantic inference. The v2 change includes: ✅ Simplified tiers that range from lower cost to higher accuracy ✅ Pre-saved extract configurations so that you can load/share existing configs and iterate on your schema ✅ Configurable parsing, so you can use our best-in-class doc OCR for the most complex tables/charts before extracting into structured output.
After the release of Parse v2, Extract is also getting an upgrade — 𝗶𝗻𝘁𝗿𝗼𝗱𝘂𝗰𝗶𝗻𝗴 𝗘𝘅𝘁𝗿𝗮𝗰𝘁 𝘃𝟮! 🎉 We've been reworking the experience from the ground up to make document extraction more powerful and easier to use than ever. Here's what's new: ✦
LiteParse is our open-source document parser that provides high-quality spatial text parsing with bounding boxes. It can parse hundreds of pages of table-heavy documents in seconds - and give you bounding boxes over all the text elements! 🎁 This means that any agent automation you build over your document text will have an audit trail back to the exact source document. Check out the screenshots below as an example. Given any keyword search term, you can render the bounding boxes in the source document. Repo: https://github.com/run-llama/liteparse Bounding boxes: https://developers.llamaindex.ai/liteparse/guides/visual-citations/
I haven't hosted a pregame since college. Come through tomorrow!
We're excited to sponsor FutureLaw Week 2026 by @StanfordLaw + @CodeXStanford 🚀🧑⚖️ This includes the following: ✅ Agentic AI & Law Bootcamp ✅ LLM x Law Hackathon ✅ UN AI for Good ✅ FutureLaw Conference AI agents are helping to automate operation-heavy work in a lot of industries. Many of these industries, especially legal, depend heavily on parsing and extracting from massive piles of unstructured paperwork. We're building the document infrastructure for AI agents. We hope this week inspires lawyers, technologists, policymakers, and more on how to build agentic workflows that can intelligently parse, extract, edit, and generate legal documents end-to-end with extremely high accuracy and low human review effort. Come check out the week's events! https://conferences.law.stanford.edu/futurelaw2026/
We are excited to be named to the 2026 Enterprise Tech 30 for the 3rd year in a row! 🙌 Big thanks to @Wing_VC and @EricNewcomer for the recognition and congrats to other companies in this list. We're actively hiring. We're also nonstop shipping 🚀
RT LlamaIndex 🦙 Lawyers <3 documents We're proud to sponsor @StanfordLaw and @CodeXStanford's FutureLaw Week 2026! 🏛️⚖️ AI x Law bootcamps, hackathons, the UN AI For Good Law Track & the FutureLaw Conference — all exploring the future of legal AI. Join us alongside friends from @DLA_Piper, @normativeai, @filevine, @harvey, @LexisNexis & the global legal tech community. April 11–16 👉 https://conferences.law.stanford.edu/futurelaw/ll Original tweet: https://x.com/llama_index/status/2039342780993220629
i'm so cooked
Claude Code has a regex that detects "wtf", "ffs", "piece of shit", "fuck you", "this sucks" etc. It doesn't change behavior...it just silently logs is_negative: true to analytics. Anthropic is tracking how often you rage at your AI. Do with this information what you will.
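The described mechanism is trivial to picture. A toy recreation in Python; the actual pattern, field name, and analytics pipeline inside Claude Code are not public, so every detail below is an assumption:

```python
import re

# Hypothetical frustration detector: match any of the reported phrases.
NEGATIVE_RE = re.compile(
    r"\b(?:wtf|ffs|piece of shit|fuck you|this sucks)\b",
    re.IGNORECASE,
)

def analytics_event(prompt: str) -> dict:
    """Tag a prompt for analytics without changing agent behavior."""
    return {"is_negative": bool(NEGATIVE_RE.search(prompt))}

analytics_event("wtf is wrong with this diff")  # {'is_negative': True}
analytics_event("looks good, ship it")          # {'is_negative': False}
```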
RT simon recursive improvement is here Original tweet: https://x.com/disiok/status/2039030753980518446
We @neosigmaai @RitvikKapila are building the future of self-improving AI systems! By closing the feedback loop between production data and system improvements, we help teams capture failures, convert them into structured evaluation signals, and use them to drive continuous
In PDF parsing, it’s super important to output the text in the correct layout (e.g. output tables as tables). This helps eliminate hallucinations from downstream LLM calls, but ALSO makes it easier to audit with human eyeballs. With LiteParse we’ve spent a lot of time making sure that the text/table layouts look correct. Take a look at the screenshots below! A lot of other free/open-source tools (PyPDF, PyMuPDF) output the text in a collapsed list and break table semantics. Check out the repo: https://github.com/run-llama/liteparse If you want higher-accuracy, VLM-enabled parsing, come check out LlamaParse: https://cloud.llamaindex.ai/?utm_source=xjl&utm_medium=social
RT LlamaIndex 🦙 LlamaIndex is proud to be named to the 2026 Enterprise Tech 30, #3 in the Early Stage category. The ET30 is an annual list by @Wing_VC and Eric Newcomer, voted on by 90+ leading investors and corporate development leaders. It recognizes the private companies with the most potential to shape the future of enterprise technology. Thank you to Wing Venture Capital and Eric Newcomer, and congratulations to all the companies honored this year. Original tweet: https://x.com/llama_index/status/2039009948903133490
We're hosting a casual pregame for First Thursdays this week! 🍾 (First Thursdays is a monthly street party in SF on 2nd Street between Market and Howard. Our office is at the center of this street). We'll have drinks 🍹, pizza 🍕, and vibes 🦙. Come swing by 🔥 Starting 4pm: https://luma.com/mkh44c7w
We’ve moved to the '#AI Waterfront' and it’s time to celebrate. Swing by on April 2nd to see our new office on 2nd street, meet our team, grab a bite or drink, and make new friends. Note: Space is limited, so please RSVP early. https://luma.com/mkh44c7w
Create a free, local, fast retrieval stack for Claude Code (or your favorite AI agent) `litesearch` is a project by @itsclelia that lets you parse, index, and search over any collection of documents within your computer. It's fully CLI-native, so it's easy to plug into any agent tool. Come check it out! Repo: https://github.com/AstraBert/litesearch The core parsing is powered by liteparse, our fast/free doc parser: https://github.com/run-llama/liteparse
RT LlamaIndex 🦙 We’ve moved to the '#AI Waterfront' and it’s time to celebrate. Swing by on April 2nd to see our new office on 2nd street, meet our team, grab a bite or drink, and make new friends. Note: Space is limited, so please RSVP early. https://luma.com/mkh44c7w Original tweet: https://x.com/llama_index/status/2038692801618342146
RT LlamaIndex 🦙 Our OSS engineer @itsclelia recently built 𝗹𝗶𝘁𝗲𝘀𝗲𝗮𝗿𝗰𝗵, a fully local document ingestion and retrieval CLI/TUI application powered by LiteParse ⚡ litesearch demonstrates how developers can assemble a high-performance, local-first retrieval pipeline using open tools from across the ecosystem: • Parsing: LiteParse, the fast and accurate document parser we recently open sourced • Chunking: @ChonkieAI • Embeddings: A local @nomic_ai model via @huggingface transformers.js • Vector storage: A local @qdrant_engine edge shard (custom-built in Rust and compiled as a native add-on) • Retrieval: Query stored files with optional path-based filtering and configurable relevance thresholds • Runtime: @bunjavascript for speed and versatility 💻 Check out the repository and try it yourself: https://github.com/AstraBert/litesearch 📚 LiteParse docs: https://developers.llamaindex.ai/liteparse?utm_source=twitter&utm_medium=li_social Original tweet: https://x.com/llama_index/status/2038647600623288398
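The retrieval step described above (relevance threshold, optional path-based filtering) reduces to a few lines once embeddings exist. Here's a minimal sketch in Python with toy vectors; litesearch's actual stack embeds with a local Nomic model and stores vectors in a Qdrant edge shard, so none of this is its real code:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def retrieve(query_vec, index, threshold=0.5, path_prefix=None):
    """index: list of (file_path, embedding, chunk_text) tuples.
    Returns chunks above the relevance threshold, best first,
    optionally restricted to a path prefix."""
    hits = [
        (cosine(query_vec, vec), path, text)
        for path, vec, text in index
        if path_prefix is None or path.startswith(path_prefix)
    ]
    return sorted((h for h in hits if h[0] >= threshold), reverse=True)

index = [
    ("docs/parsing.md", [1.0, 0.0], "LiteParse preserves spatial layout."),
    ("notes/todo.md",   [0.0, 1.0], "Buy llama stickers."),
]
results = retrieve([0.9, 0.1], index, threshold=0.5, path_prefix="docs/")
```

The threshold keeps low-relevance chunks out of the agent's context; the prefix filter mirrors litesearch's path-based filtering.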
RT PaddlePaddle ✨ From #1 OCR to a Global Ecosystem: PaddleOCR OCEAN Alliance is Here! After PaddleOCR became the world’s most-starred OCR open-source project, we’re taking the next step: building not just the best OCR technology, but the strongest global OCR & document AI ecosystem. 🌊 OCEAN stands for: Open Source · Community · Ecosystem · Application · Network Three partner tracks: 🔹 Technical Co-Build Partners — advancing the core tech and derivative projects 🔹 Ecosystem Platform Partners — bringing PaddleOCR to developers through major platforms 🔹 Application Benchmark Partners — turning OCR into real industry impact 🤝Featured Ecosystem Platform Partners: @huggingface @dify_ai @llama_index @pathway_com @Haystack_AI @zilliz_universe @milvusio @infiniflowai @CherryStudioHQ 🌍 Join us: Open to global partners contributing through code, platform integration, or real-world applications to help build long-term ecosystem value together. 📪 Apply via: [email protected] ⌚️ First-batch DDL: April 31, 2026 🔗 Full announcement: https://mp.weixin.qq.com/s/L_x7rI_4cxvQN_FI181IfQ Global leadership was the milestone. Ecosystem co-building is the journey ahead. 🚀 #ERNIE #PaddlePaddle #PaddleOCR #OCR #DocumentAI #OpenSource #AIEcosystem #RAG #DeveloperTools Original tweet: https://x.com/PaddlePaddle/status/2038619378003116116
RT Clelia Bertelli (🦙/acc) Hey there 👋 , I built 𝗹𝗶𝘁𝗲𝘀𝗲𝗮𝗿𝗰𝗵, a fully local document ingestion and retrieval CLI and TUI app, powered by LiteParse⚡ - Parse your unstructured documents with LiteParse, the lightning fast parser that we just open sourced at @llama_index - Chunk with @ChonkieAI - Embed with a local @nomic_ai model through @huggingface transformers.js - Store embeddings in a local @qdrant_engine edge shard (custom-built in Rust and compiled as a native add-on🦀) - Retrieve from stored files with (optional) path-based filtering and a relevance threshold The app runs on @bunjavascript, so make sure you have it installed🥞 Find all the code on GitHub: http://github.com/AstraBert/litesearch Original tweet: https://x.com/itsclelia/status/2038342859641037300
Last week we launched LiteParse - a free and fast document parser that provides more accurate AI-ready text than other free/fast parser libraries. It’s a great tool you can plug into assistant agents like Claude Code/OpenClaw and get good results, especially when paired with its screenshotting capabilities. But I do want to note that it doesn’t use any models under the hood (no VLMs/LLMs/even OCR models natively), and it’s not a replacement for VLM-based OCR solutions. It is fast because it is heuristic-based! I attached a comparison table below. ✅ It is really good at text extraction and even table extraction, *specifically for LLM understanding*. It will lay the text out in a manner that’s easy for humans/AI to understand. ✅ It is great for assistant coding agents because the agent harness can use its text parsing to do a “fast” step, and then its screenshot capabilities to “dive deep” into a specific page 🚫 It is not great over scanned pages/visuals/anything requiring OCR. We do have OOB integrations with EasyOCR and PaddleOCR 🚫 It doesn’t do layout detection and segmentation - it won’t draw bounding boxes over different elements on the page (though it does have word-level bounding boxes!) Tl;dr it’s great for plugging into an AI assistant tool. If you’re trying to OCR a bunch of docs in batch, check out LlamaParse :) LiteParse: https://github.com/run-llama/liteparse LlamaParse: https://cloud.llamaindex.ai/?utm_source=xjl&utm_medium=social
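The "fast step, then dive deep" pattern reads naturally as a routing heuristic over per-page text. A sketch under assumed inputs (you already have each page's extracted text; the character threshold and escalation rule are made up for illustration, not part of LiteParse):

```python
def needs_deep_dive(page_text: str, min_chars: int = 200) -> bool:
    """Pages with little extractable text are likely scans, charts,
    or other visuals -- candidates for the screenshot + VLM path."""
    return len("".join(page_text.split())) < min_chars

def plan_passes(pages: list[str]) -> dict[str, list[int]]:
    """Split pages into a cheap text pass and an expensive visual pass."""
    plan = {"fast_text": [], "screenshot_deep_dive": []}
    for i, text in enumerate(pages):
        key = "screenshot_deep_dive" if needs_deep_dive(text) else "fast_text"
        plan[key].append(i)
    return plan
```

An agent harness would feed the fast_text pages straight to the model and render screenshots only for the flagged pages.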
Introducing LiteParse - the best model-free document parsing tool for AI agents 💫 ✅ It’s completely open-source and free. ✅ No GPU required, will process ~500 pages in 2 seconds on commodity hardware ✅ More accurate than PyPDF, PyMuPDF, Markitdown. Also way more readable - see
Turn your PDFs into a podcast 📄🔈 We built a simple tool that integrates Gemini 3.1 voice with LlamaParse document parsing. Transcribe, read, and interact with all of your documents through voice! Repo: https://github.com/run-llama/voice-document-assistant Sign up to LlamaParse: https://cloud.llamaindex.ai/?utm_source=xjl&utm_medium=social
RT LlamaIndex 🦙 Transform your document processing with intelligent table extraction that goes beyond basic OCR. Tables in PDFs aren't just text - they're structured data trapped in visual formats. Our new deep dive explains how modern OCR for tables reconstructs spatial relationships, preserves header hierarchies, and ensures data integrity across complex documents. 📊 Why table extraction is fundamentally harder than standard text OCR - spatial relationships matter more than character recognition 🔧 The three core phases: detection, structure recognition, and data extraction with validation 💼 Real-world applications across financial services, healthcare, and logistics - from invoice processing to lab results ⚡ How LlamaParse handles multi-line rows, merged cells, and borderless tables while maintaining logical consistency We show a complete invoice processing example where complex line-item tables get converted to clean JSON with preserved relationships and validated totals - ready for immediate ERP integration. Read the complete guide: https://www.llamaindex.ai/blog/ocr-for-tables?utm_source=socials&utm_medium=li_social Original tweet: https://x.com/llama_index/status/2037561042440687708
RT Clelia Bertelli (🦙/acc) The @GoogleDeepMind team really cooked with Gemini 3.1 in the Live API: it's fast and the output quality is great🔥 That's why at @llama_index we decided to test it out with our bread and butter: document processing📄 The voice agent we built: - Takes voice commands from terminal - Calls tools to explore available files and parse them, powered by LiteParse, our fully-local parser - Live-updates you on its task🔊 Take a look at the demo👇 Repo: http://github.com/run-llama/voice-document-assistant Original tweet: https://x.com/itsclelia/status/2037307156178129270
RT Martin Høst Normark Re @dwr Can recommend LiteParse using CLI https://x.com/jerryjliu0/status/2034665976428724267?s=46&t=2lAx6ujSVQsUlQlh1Yz8iA Original tweet: https://x.com/MartinHN/status/2037296913964282041
RT Rachel E Add this one line of code to start parsing PDFs for free in Claude! #liteparse Original tweet: https://x.com/3z_score/status/2037269862415044887
RT LlamaIndex 🦙 🚀 The @GoogleDeepMind team just added Gemini 3.1 to the Live API, so we built a small demo showing how Gemini voice agents can plug directly into the document processing ecosystem powered by LlamaIndex. 🔥 In this example, we integrate LiteParse to enable fast, fully-local document parsing. With our TUI-based voice assistant, you can literally talk to your terminal: - Speak commands - Trigger live document parsing via tool calls - Hear the agent read back results in real time 🔊 The assistant can extract content from single files or entire folders, leveraging the lightning-fast local parsing that LiteParse provides ⚡ Take a look at the demo👇 👩💻 GitHub repo http://github.com/run-llama/voice-document-assistant 📚 LiteParse docs https://developers.llamaindex.ai/liteparse?utm_medium=li_socials&utm_source=twitter Original tweet: https://x.com/llama_index/status/2037243292707152002
Our latest LiteParse release gives your AI agent access to text bounding boxes within any PDF 📐 LiteParse is our fast/free open-source document parser that can extract text from any document. Besides the extracted text, we also expose bounding boxes for every text block. This means that if you're building an AI agent over PDFs, it can now trace back to the precise line in the document and highlight it to the user, creating an audit trail for any decision made over this unstructured data. We've created a new guide that shows you how to get bounding boxes (and use it to highlight text on the page similar to the example below). Come check it out! https://developers.llamaindex.ai/liteparse/guides/visual-citations/?utm_medium=socials&utm_source=xjl&utm_campaign=2026-mar-liteparse LiteParse: https://github.com/run-llama/liteparse
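Turning those boxes into an audit trail is a small amount of glue. A minimal sketch, assuming word-level output shaped like {"text", "page", "bbox"}; the field names here are placeholders, so check the visual-citations guide for LiteParse's actual schema:

```python
def find_highlights(words: list[dict], keyword: str) -> list[tuple]:
    """Return (page, bbox) for every word matching the keyword,
    i.e. the rectangles to draw on the source-page screenshot."""
    k = keyword.lower()
    return [(w["page"], w["bbox"]) for w in words if k in w["text"].lower()]

words = [
    {"text": "Revenue", "page": 1, "bbox": (72, 144, 130, 156)},
    {"text": "revenue,", "page": 3, "bbox": (72, 500, 134, 512)},
    {"text": "Expenses", "page": 1, "bbox": (72, 160, 138, 172)},
]
highlights = find_highlights(words, "revenue")  # boxes on pages 1 and 3
```

Each hit points back to an exact rectangle on an exact page, which is what makes the citation auditable.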
RT LlamaIndex 🦙 Bounding boxes are key for citations, and we just shipped a new guide showing how to use LiteParse for visual citations! LiteParse is our fast and open-source document parser. Using both bounding box extraction and page screenshots, anyone (including agents) can learn how to associate text with an element on the page. https://developers.llamaindex.ai/liteparse/guides/visual-citations/?utm_medium=socials&utm_source=twitter&utm_campaign=2026-mar-liteparse Original tweet: https://x.com/llama_index/status/2037198006483841481
Parse text from any PDF in seconds and give it to Claude Code 📑🤖 LiteParse is our open-source, model-free document parser that lets you digitalize text from any document in seconds. This is especially useful for coding agents, which are great at reading plaintext files but terrible at reading traditional document formats (PDF, Office docs). We have a one-line installable skill that lets you plug LiteParse into Claude Code and 40+ other agents. Repo is here: https://github.com/run-llama/liteparse
Improving Table Parsing for Word (.docx) Documents 📄🧩 Parsing Word/docx files is hard, even though counterintuitively the internal XML format is easier to understand than a PDF file. The XML captures the full semantic structure of text and tables, but the issue is that the page on which text and tables end up depends entirely on the renderer. We built a big improvement within LlamaParse that allows us to link the source Word XML tables/table elements with the final rendered markdown output, including their page positions! ✅ This allows us to directly use the source Word XML formatting (bold/italic/strikethrough/subscript) ✅ It also allows us to directly interpret tables with merged cells We can do this while still giving back page positions in the rendered output. Huge thanks to our team for making this change. Come check out LlamaParse if you want to parse your Word docs with high accuracy! Blog: https://www.llamaindex.ai/blog/improving-table-parsing-for-word-docx-documents?utm_source=xjl&utm_medium=social LlamaParse: https://cloud.llamaindex.ai/?utm_source=xjl&utm_medium=social
RT LlamaIndex 🦙 Word docs are one of the most common file formats people process in LlamaParse, and they've always been surprisingly frustrating to parse well. Here's the counterintuitive part: .docx actually has better structural information than most document formats. We just haven't been able to fully use it. Until now. A .docx file is a ZIP archive of XML files. That XML knows everything: cell boundaries, merged cells, column and row spans, nested tables, formatting tags, hyperlinks. A PDF of the same table has none of that. It's just text positioned at coordinates and line intersections that a parser has to reverse-engineer into structure. The hard part with Word XML isn't extracting the table content. It's knowing which page it's on. Word is a flow format — there are no page boundaries in the XML. Pagination depends on the renderer, fonts, margins, line-height. The same .docx renders differently in Word, LibreOffice, and Google Docs. We built a technique to resolve this, mapping Word XML table elements to their correct page positions in the rendered output. We now get the original document structure AND know exactly where each table appears. The quality improvement is most significant for: · Tables with rich cell formatting (bold, italic, strikethrough, superscript, lists inside cells) · Merged cells and column/row spans · Nested tables (tables inside table cells) If you're processing Word docs with table-heavy content, try it out. 📖 Full writeup: https://www.llamaindex.ai/blog/improving-table-parsing-for-word-docx-documents?utm_source=socials&utm_medium=li_social Original tweet: https://x.com/llama_index/status/2036836522536902801
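To make "the XML knows everything" concrete, here is a standard-library sketch that reads a WordprocessingML table, including a gridSpan merged cell. It only illustrates what the .docx markup exposes; it is not LlamaParse's implementation, and real tables also carry vMerge, nested tables, and formatting runs:

```python
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by all w: elements in a .docx
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"

SAMPLE = """
<w:tbl xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
  <w:tr>
    <w:tc>
      <w:tcPr><w:gridSpan w:val="2"/></w:tcPr>
      <w:p><w:r><w:t>Merged header</w:t></w:r></w:p>
    </w:tc>
  </w:tr>
  <w:tr>
    <w:tc><w:p><w:r><w:t>A</w:t></w:r></w:p></w:tc>
    <w:tc><w:p><w:r><w:t>B</w:t></w:r></w:p></w:tc>
  </w:tr>
</w:tbl>
"""

def read_table(xml_text: str) -> list[list[tuple[str, int]]]:
    """Return rows of (cell_text, colspan) from a w:tbl fragment."""
    rows = []
    for tr in ET.fromstring(xml_text).findall(f"{W}tr"):
        row = []
        for tc in tr.findall(f"{W}tc"):
            # Explicit colspan lives in w:tcPr/w:gridSpan; default is 1.
            span = tc.find(f"{W}tcPr/{W}gridSpan")
            colspan = int(span.get(f"{W}val")) if span is not None else 1
            text = "".join(t.text or "" for t in tc.iter(f"{W}t"))
            row.append((text, colspan))
        rows.append(row)
    return rows
```

Note there is nothing positional here at all, which is exactly the pagination problem the post describes: the XML gives structure, and only the renderer decides pages.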
RT PaddlePaddle 🎉 Congrats to LlamaIndex on the release! Looking forward to seeing more developers explore #LiteParse and #PaddleOCR.⛽️ Original tweet: https://x.com/PaddlePaddle/status/2036633303202349352
There aren’t that many fast, free, non-VLM document parsers out there: there’s PyPDF, PyMuPDF, Markitdown, and OpenDataLoader. Last week, we launched LiteParse ⚡️📄: a fast, free, and non-VLM-based document parser that provides the highest-quality context to AI agents compared to other tools out there. ✅ It extracts document text into an interpretable spatial representation ✅ It has native screenshotting capabilities to let agents (e.g. Claude Code) do a “fast and light” text parsing step, and then a “deep-dive” into specific page content by feeding the screenshot back into their own context. ✅ It supports out-of-the-box integrations with other tools like PaddleOCR (@PaddlePaddle) Check out our benchmark below against the other tools. We use LLM QA performance to measure how well LLMs can semantically understand the parsed text, and we also measure latency. If you try it out, we’d love to get your feedback. The repo is here: https://github.com/run-llama/liteparse If you’re looking for a more accurate, VLM-native solution that can parse any document at scale, LlamaParse might be better for you! https://cloud.llamaindex.ai/?utm_source=xjl&utm_medium=social
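The scoring side of an LLM-QA parse benchmark reduces to a simple ratio. A toy sketch; exact-match grading here is a stand-in assumption, since the post doesn't show the real grader or question set:

```python
def qa_accuracy(predictions: dict[str, str], gold: dict[str, str]) -> float:
    """Fraction of questions an LLM answered correctly when given only
    a parser's text output as context -- a proxy for how semantically
    intact the parsed text is."""
    correct = sum(
        predictions.get(qid, "").strip().lower() == ans.strip().lower()
        for qid, ans in gold.items()
    )
    return correct / len(gold)

gold = {"q1": "42", "q2": "Acme Corp"}
scores = {
    "parser_a": qa_accuracy({"q1": "42", "q2": "Acme Corp"}, gold),  # 1.0
    "parser_b": qa_accuracy({"q1": "42", "q2": "unknown"}, gold),    # 0.5
}
```

Ranking parsers by the QA accuracy their output enables (plus latency) is the comparison the benchmark table reports.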
Use Agents + OCR to Automate Compliance Reporting 📑🔐 This is an awesome project by @zubeensyed that is able to ingest incident reports and match them against Article 33/34 notification requirements/risk thresholds. It uses LlamaParse for document classification and extraction, and our agent builder for stitching things into an e2e workflow. Check out the blog! https://medium.com/@zubeen/structuring-gdpr-breach-reports-with-agentic-ai-workflows-on-llamacloud-96b713666541 LlamaParse: https://cloud.llamaindex.ai/?utm_source=xjl&utm_medium=social
RT LlamaIndex 🦙 Congratulations to @zubeensyed, one of our LlamAgent contest winners, for building an agentic AI workflow that automates GDPR breach report structuring! The agent takes incident reports and maps them to a standardized GDPR breach schema, aligning with Article 33 notification requirements and Article 34 risk thresholds. It classifies documents, extracts the relevant fields, and surfaces them in a review UI where a human can approve or reject the output. Read about the solution here: https://medium.com/@zubeen/structuring-gdpr-breach-reports-with-agentic-ai-workflows-on-llamacloud-96b713666541 Watch the full walkthrough of how this is built: https://www.youtube.com/watch?v=2TRn0QTrvsU Original tweet: https://x.com/llama_index/status/2036519434253377910
RT LlamaIndex 🦙 We’ve published a new blog with @googledevs on how to build a smart financial assistant using LlamaParse, our state-of-the-art agentic document parser, together with Gemini 3. In the guide, we show how to: 📝 Parse a financial PDF with LlamaParse 📊 Use VLM-enabled agentic OCR to accurately extract text and tables ⚡ Combine the parsed content with Gemini 3 to generate a clear, human-friendly financial report 📚 Read the blog: https://developers.googleblog.com/build-a-smart-financial-assistant-with-llamaparse-and-gemini-31 👩💻 Explore the repo: https://github.com/run-llama/llamaparse-gemini-demo 🦙 Get started with LlamaCloud: https://cloud.llamaindex.ai/signup Original tweet: https://x.com/llama_index/status/2036209384850858063
RT Jared.W Spent the weekend parsing 100MB+ PDFs with this — surprisingly stable. Feels like a strong default if you’re trying to turn raw docs into usable knowledge for OpenClaw / Claude Code. Shoutout to @jerryjliu0 for building this. Next problem: how to plug both knowledge and decision logic cleanly into the ReAct loop > I have been thinking about it over the past few days, and drew a graph to guide me 👇 Original tweet: https://x.com/JaredOfAI/status/2036179353525166451
Let your AI agent read any PDF on the internet in seconds 🌐⚡️ ``` curl -sL https://example.com/report.pdf | lit parse - ``` LiteParse is our fast and free document parser designed to seamlessly plug into 40+ different agents. Includes both text parsing and screenshotting capabilities; it doesn't use any VLM under the hood, so you can run it on any device. Our latest update allows liteparse to support URL parsing and buffers/streams 🙌 Repo: https://github.com/run-llama/liteparse Guide: https://developers.llamaindex.ai/liteparse/guides/parsing-urls/
We're excited to collaborate with @googledevs on building an agentic workflow over complex financial documents - using LlamaParse and Gemini 3.1 Pro Brokerage statements have complex layouts, dense tables, and oftentimes visual elements like charts. Our multi-step agentic workflow does the following: 1. Ingest PDF into LlamaParse 2. Extract text and tables 3. Generate human-readable summary using Gemini Shoutout to @Vish_ow and @itsclelia 🙌 Check it out: https://developers.googleblog.com/build-a-smart-financial-assistant-with-llamaparse-and-gemini-31/?linkId=61022574
Improve document parsing accuracy by 15% for financial PDFs. Use LlamaParse and Gemini 3.1 Pro to extract high-quality data from unstructured brokerage statements and complex tables. 📈 Precise reasoning 📂 Structured PDF data ⚡️ Event-driven scaling Dive into the code on
RT LlamaIndex 🦙 If you've ever worked in or around legal, you know that discovery is where document parsing really gets stress-tested. Low-resolution scans. Black and white images. Handwritten annotations. Charts buried in reports. Files that are technically PDFs but practically unreadable. And hundreds of thousands of them. Traditional OCR tools struggle with degraded scans, and anything visual (photographs, slide decks, tables) falls through the cracks entirely. That means your search index is noisy, your recall suffers, and relevant documents go unfound. This blog by @tuanacelik walks through how to set up LlamaParse for a legal discovery use-case: handling difficult scans with vision models, surfacing image and chart content, and using custom parsing instructions to guide output for predictable document patterns. The quality of everything downstream depends on what happened at ingestion. Worth getting right. Read the full blog here: https://www.llamaindex.ai/blog/parsing-the-unreadable-how-llamaparse-handles-legal-discovery-documents Original tweet: https://x.com/llama_index/status/2036079833000915272
We’ve created an agents skill that gives all of your agents the power to understand the most complex PDFs - with dense tables, unlabeled charts, messy handwriting and more. Our LlamaParse agents skill can be installed in one-line thanks to @vercel’s skills utility. LlamaParse orchestrates VLMs to deliver best-in-class accuracy over 40+ document types. The skills file allows agents to invoke it when needed to translate complex PDFs into agent-readable plaintext markdown. `npx skills add run-llama/llamaparse-agent-skills --skill llamaparse` If you prefer something fast, free, and local, you can similarly install liteparse as a skill. Come check it out: https://developers.llamaindex.ai/python/cloud/llamaparse/agent-skill?utm_source=xjl&utm_medium=social Sign up to LlamaParse: https://cloud.llamaindex.ai/?utm_source=xjl&utm_medium=social
RT LlamaIndex 🦙 LlamaParse now has an official Agent Skill you can use across 40+ agents. With built-in instructions for parsing complex documents, including different formats, tables, charts, and images, your agents gain access to deeper document understanding, not just raw text extraction. 👇 Watch the demo 📖 Read the docs: https://developers.llamaindex.ai/python/cloud/llamaparse/agent-skill?utm_source=socials&utm_medium=li_social 🚀 Get started with LlamaCloud: http://cloud.llamaindex.ai/signup?utm_source=socials&utm_medium=li_social Original tweet: https://x.com/llama_index/status/2035069934372626812
RT Shubham Saboo Just added this free document parser to my OpenClaw Agent team. One line. That's it. LiteParse plugs right into any AI agent. 86 pages in 3.3 seconds on my Macmini. No GPU and no API key needed. 100% free, local and Opensource. Original tweet: https://x.com/Saboo_Shubham_/status/2035051817080643726
RT LlamaIndex 🦙 Our new open-source LiteParse comes with ready-to-use agent skills that work seamlessly with coding agents. `npx skills add run-llama/llamaparse-agent-skills --skill liteparse` ..and your agents can immediately start processing documents locally as part of their reasoning process. Here's Claude Code with liteparse enabled 💪 Documentation for LiteParse agent skills: https://developers.llamaindex.ai/liteparse/guides/agent-skill/?utm_source=socials&utm_medium=li_social Original tweet: https://x.com/llama_index/status/2035024635738431986
LiteParse is our free, blazing-fast document parser that you can plug into 46+ different agents - with one command 🔥 From Claude Code to OpenClaw to Cursor to Warp. Use liteparse to solve a task directly or read docs as context to write code. All you have to do is `npx skills add run-llama/llamaparse-agent-skills --skill liteparse` (we borrowed Vercel’s format) Check out our video below on using liteparse with Claude Code! Repo: https://github.com/run-llama/liteparse Blog: https://www.llamaindex.ai/blog/liteparse-local-document-parsing-for-ai-agents?utm_source=xjl&utm_medium=social
RT Logan Markewich Re @thomaslutz_ @jerryjliu0 Didn't have time to include it in the blog post, but if curious I ran it against the benchmark in the blog -- in general tools that output markdown seem to have more edge cases Original tweet: https://x.com/LoganMarkewich/status/2034770659767710138
RT Sebas Shipping 🛥️ Original tweet: https://x.com/__sebasgar__/status/2034749575521521829
RT Clelia Bertelli (🦙/acc) Today at @llama_index we are launching LiteParse, a lightweight and open-source CLI and TS library for document parsing🚀 We all know that document parsing pipelines have become complicated fast: different tools for different formats, many libraries, failures in structural detection. LiteParse enters the scene as a local-first parsing solution to tame most document formats, from PDFs to images to Office docs, and has three core strengths: - preserves spatial layout (columns, tables, alignment) - built-in OCR, or bring your own server - can capture screenshots for multimodal LLMs Huge shout-out to @LoganMarkewich for driving this! ⬇️ Install it now: npm install -g @llamaindex/liteparse 📚 Read the blog: https://www.llamaindex.ai/blog/liteparse-local-document-parsing-for-ai-agents?utm_medium=cb_socials&utm_source=twitter&utm_campaign=2026-mar-liteparse-launch 📽️ Watch the full walkthrough on YT: https://youtu.be/_gcqMGUWN-E ⭐ Star the repo: http://github.com/run-llama/liteparse Original tweet: https://x.com/itsclelia/status/2034677366581297590
RT Tuana We just open-sourced LiteParse 🎉 A lightweight, local document parser in the shape of an easy-to-use CLI. No API calls, no external service, no cloud dependency. Just fast text extraction from common file formats, right from your terminal. It's built for developers who want parsing that stays on their own infrastructure and gets out of their way. Clean PDFs, DOCX, HTML: run it, get your text, move on. The output is designed to be fed straight into agents so they can read parsed text and reason over screenshots without any extra wrangling. When you hit more complex territory like scanned docs, dense tables, or multi-column layouts, that's where LlamaParse picks up. Same philosophy, more horsepower for the hard stuff. 📖 Announcement post: https://www.llamaindex.ai/blog/liteparse-local-document-parsing-for-ai-agents?utm_medium=tc_socials&utm_source=twitter&utm_campaign=2026-mar-liteparse-launch 🔗 GitHub: https://github.com/run-llama/liteparse 🎬 Walkthrough: https://youtu.be/_gcqMGUWN-E Original tweet: https://x.com/tuanacelik/status/2034676802619416817
RT Wey Gu 古思为 The LlamaIndex team has shipped something new and good: LiteParse, a model-free, lightweight IDP document parsing library that also comes as a CLI, and the name is cool ("lit"). It supports all kinds of document formats: Office, PDF, images, 50+ in total. And, in a very un-LlamaIndex move for once, this CLI is written in JS rather than Python or Go, haha 👍 It also supports a slightly heavier mode with remote OCR over HTTP. A great fit for agent skills 👍 Original tweet: https://x.com/wey_gu/status/2034670577345278276
Introducing LiteParse - the best model-free document parsing tool for AI agents 💫 ✅ It’s completely open-source and free. ✅ No GPU required; it will process ~500 pages in 2 seconds on commodity hardware ✅ More accurate than PyPDF, PyMuPDF, Markdown. Also way more readable - see below for how we parse tables!! ✅ Supports 50+ file formats, from PDFs to Office docs to images ✅ Designed to plug and play with Claude Code, OpenClaw, and any other AI agent with a one-line skills install. Supports native screenshotting capabilities. We spent years building up LlamaParse by orchestrating state-of-the-art VLMs over the most complex documents. Along the way we realized that you could get quite far on most docs through fast and cheap text parsing. Take a look at the video below. For really complex tables within PDFs, we output them in a spatial grid that’s both AI- and human-interpretable. Any other free/light parser like PyPDF will destroy the representation of this table and output a sequential list. This is not a replacement for a VLM-based OCR tool (it requires 0 GPUs and doesn’t use models), but it is shocking how well it parses most documents. Huge shoutout to @LoganMarkewich and @itsclelia for all the work here. Come check it out: https://www.llamaindex.ai/blog/liteparse-local-document-parsing-for-ai-agents?utm_source=xjl&utm_medium=social Repo: https://github.com/run-llama/liteparse
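The spatial-grid output described above can be illustrated with a short sketch. This is purely illustrative (plain Python, not LiteParse's actual implementation): cells that carry row/column positions are laid out as an aligned text grid, preserving the structure that a sequential text dump would destroy.

```python
# Illustrative only: lay out (row, col, text) cells as an aligned text grid,
# the way a spatial-grid table rendering keeps columns visually aligned.
def render_grid(cells):
    rows = max(r for r, _, _ in cells) + 1
    cols = max(c for _, c, _ in cells) + 1
    grid = [["" for _ in range(cols)] for _ in range(rows)]
    for r, c, text in cells:
        grid[r][c] = text
    # Pad each column to its widest cell so values line up under headers.
    widths = [max(len(grid[r][c]) for r in range(rows)) for c in range(cols)]
    return "\n".join(
        "  ".join(grid[r][c].ljust(widths[c]) for c in range(cols)).rstrip()
        for r in range(rows)
    )

cells = [(0, 0, "Region"), (0, 1, "Q1"), (0, 2, "Q2"),
         (1, 0, "EMEA"), (1, 1, "1.2M"), (1, 2, "1.5M"),
         (2, 0, "APAC"), (2, 1, "0.9M"), (2, 2, "1.1M")]
print(render_grid(cells))
```

A flat extractor would emit "Region Q1 Q2 EMEA 1.2M ..." as one run of text; the grid keeps each value under its header, which is what makes the output both AI- and human-interpretable.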
RT LlamaIndex 🦙 We've spent years building LlamaParse into the most accurate document parser for production AI. Along the way, we learned a lot about what fast, lightweight parsing actually looks like under the hood. Today, we're open-sourcing a lightweight core of that tech as LiteParse 🦙 It's a CLI + TS-native library for layout-aware text parsing from PDFs, Office docs, and images. Local, zero Python dependencies, and built specifically for agents and LLM pipelines. Think of it as our way of giving the community a solid starting point for document parsing: npm i -g @llamaindex/liteparse lit parse anything.pdf - preserves spatial layout (columns, tables, alignment) - built-in local OCR, or bring your own server - screenshots for multimodal LLMs - handles PDFs, office docs, images Blog: https://www.llamaindex.ai/blog/liteparse-local-document-parsing-for-ai-agents?utm_medium=li_socials&utm_source=twitter&utm_campaign=2026-mar-liteparse-launch Repo: https://github.com/run-llama/liteparse Original tweet: https://x.com/llama_index/status/2034661997644808638
One of the biggest requirements for document OCR is visual grounding, and frontier models (gemini, opus, gpt-5.4) suck at it by default. In other words they don't have a great sense of the positions of things on a page. We've made massive strides in making sure our models are able to segment and detect every granular element in the most complex docs. This allows you to build AI agents that can surface extremely precise citations in the source documents: ✅ newspapers ✅ infographics ✅ handwritten notes ✅ product catalogs ✅ research presentations and much more Come check it out in LlamaParse! https://cloud.llamaindex.ai/?utm_source=xjl&utm_medium=social
RT LlamaIndex 🦙 Context engineering is the new prompt engineering — and if you're building AI agents, you need to understand the difference and why parsing your data correctly sits at the heart of it Andrej Karpathy put it well: context engineering is "the delicate art and science of filling the context window with just the right information for the next step." It's not just about the instructions you give an LLM. It's about what you put IN front of it. That context can come from a lot of places: — System prompts — Chat history & long-term memory — Knowledge base retrieval — Tool definitions & responses — Structured outputs One of the most underrated levers? Structured information. This is exactly what LlamaParse + LlamaExtract are built for. Parse your complex documents properly → extract structured, relevant fields → pass clean, dense context to your agent. Better parsing = better context = better agents. It really is that simple. Take a look back at the piece by @tuanacelik and @LoganMarkewich for the full breakdown: what context engineering is, what makes up context, and the key techniques to consider — from memory blocks to workflow engineering. Read it here 👇 https://www.llamaindex.ai/blog/context-engineering-what-it-is-and-techniques-to-consider?utm_source=socials&utm_medium=li_social Original tweet: https://x.com/llama_index/status/2034347384973762694
We've massively improved our document layout capabilities in LlamaParse 📄📐 This means that our document OCR engine lets you get insanely detailed bounding boxes over really complex multimodal documents, like the research poster shown below. A core requirement for any agentic document workflow is enabling users to trace back to the source. Now your AI agents can reason over complex line charts and tables deeply embedded within specific pages, free of hallucinations, but also surface the source segment to the user. Come check out LlamaParse: https://cloud.llamaindex.ai/?utm_source=xjl&utm_medium=social If you are building document OCR in production, come talk to us: https://www.llamaindex.ai/contact?utm_source=xjl&utm_medium=social
RT LlamaIndex 🦙 LlamaParse Agentic Plus mode now delivers precise visual grounding with bounding boxes for the most challenging document elements. Our latest update brings major improvements to how we handle complex visual content: 📐 Complex LaTeX formulas - accurately parse mathematical expressions with precise positioning ✍️ Handwriting recognition - extract handwritten text with location coordinates 📊 Complex layouts - navigate multi-column documents and intricate formatting 📈 Infographics and charts - identify and extract data visualizations with spatial context This means you can now build applications that not only extract text from documents but also understand exactly where that content appears on the page - perfect for creating more intelligent document analysis workflows. Try LlamaParse Agentic Plus mode and see how visual grounding transforms your document parsing capabilities: https://cloud.llamaindex.ai?utm_source=socials&utm_medium=li_social Original tweet: https://x.com/llama_index/status/2034300076441633276
We hosted an executive dinner at NVIDIA GTC with @Modular. 600+ people signed up, and we had to turn away more than 500. We had a packed house 🔥 This is one of the only dinners where we discuss the impact of AI agents across the entire stack - from infrastructure/systems to context and agentic engineering. Some of the topics we discussed: - Is the future composed of general models or specialized models? - Are agents a systems problem or model problem? - How are non-technical users adopting agent tooling like Claude Cowork? - Where do humans spend the most time on repetitive document work? Massive thank you to the Modular team for partnering with @llama_index on this, and to every single person who showed up. To the 500+ on the waitlist - next time we're getting a bigger venue.
RT LlamaIndex 🦙 One of the hardest problems with document parsing is trust. How do you know the output actually corresponds to what's in the source? LlamaParse has visual grounding with bounding box citations for outputs, and it addresses exactly this. Two ways to use it: 1️⃣ In the UI: hover over any element in the markdown output and it highlights the exact region it came from in the original document. Great for spot-checking complex tables, multi-column layouts, or figures where parsing can be tricky. 2️⃣ In the JSON output: every parsed element carries bounding box coordinates, i.e. the precise location of that element within the source file. That means you can build applications that don't just surface an answer, but can point back to exactly where in a document it came from. For due diligence, where auditability matters, this is a step up from "trust the output." You can verify it, cite it, and build on it. Sign up to LlamaParse to get started: https://cloud.llamaindex.ai?utm_source=socials&utm_medium=li_social Original tweet: https://x.com/llama_index/status/2033937482577023103
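The second point above, bounding boxes carried in the JSON output, is the part applications build on. A minimal sketch of what consuming that looks like; note the field names (`text`, `page`, `bbox`) are assumptions for illustration, not LlamaParse's actual JSON schema:

```python
# Illustrative sketch: turn parsed elements that carry bounding boxes into
# human-readable citations pointing back to the source document.
# The dict schema here is hypothetical, not the real LlamaParse output format.
def cite(element):
    x0, y0, x1, y1 = element["bbox"]
    return f'"{element["text"]}" (page {element["page"]}, bbox [{x0}, {y0}, {x1}, {y1}])'

elements = [
    {"text": "Net revenue: $4.2M", "page": 3, "bbox": [72, 140, 412, 158]},
    {"text": "Table 2: Regional breakdown", "page": 4, "bbox": [72, 90, 380, 104]},
]
for el in elements:
    print(cite(el))
```

With coordinates attached to every element, an agent's answer can link each claim to the exact region of the source page instead of asking the user to trust the output.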
RT LlamaIndex 🦙 Agentic AI transforms document extraction from simple text transcription into intelligent reasoning, dramatically reducing manual review queues and maintenance overhead. Traditional OCR hits a wall when documents deviate from templates - vendor format changes, skewed scans, or handwritten annotations break the pipeline. Agentic document extraction solves this by understanding context, not just converting pixels to text. 🧠 Plan-act-verify loops that identify document structure before extracting data, then validate results against context 📍 Visual grounding with bounding boxes links extracted text to precise page locations, solving spatial assignment errors 📋 Dynamic table processing infers header-row relationships instead of relying on brittle pixel coordinate templates LlamaParse processes any document type without training phases or template maintenance. When your vendor changes invoice formats or you encounter new document types, the system adapts automatically instead of breaking. Read the full breakdown of agentic AI and implementation best practices: https://www.llamaindex.ai/blog/agentic-document-extraction?utm_source=socials&utm_medium=li_social Original tweet: https://x.com/llama_index/status/2033575287909417283
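The "dynamic table processing" bullet above, inferring header-row relationships rather than relying on pixel-coordinate templates, comes down to representing a table as records keyed by its headers. A minimal sketch of that representation (illustrative only, not LlamaParse's internals):

```python
# Minimal illustration: once the header row is identified, each data row
# becomes a record keyed by the headers, which is the form agents consume.
def rows_to_records(rows):
    header, *data = rows
    return [dict(zip(header, row)) for row in data]

rows = [
    ["Invoice", "Vendor", "Amount"],
    ["INV-001", "Acme", "120.00"],
    ["INV-002", "Globex", "310.50"],
]
records = rows_to_records(rows)
```

Because records are keyed by header names rather than cell positions, a vendor reordering or adding columns changes the keys, not the extraction logic, which is why this representation survives format drift that breaks template-based pipelines.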
Zhipu AI released the GLM-OCR technical report yesterday. A model that tops OmniDocBench V1.5 with a score of 94.62 - with only 0.9B params! I give them credit where credit is due: we are genuinely excited about any research that pushes the frontier of document parsing at sub-1B scale. Between GLM-OCR, dots.ocr, PaddleOCR, and DeepSeek, small doc parsing models are getting really good, really quickly 📈
🚨 Want to parse complex PDFs with SOTA accuracy, 100% locally? 📄🔍 At just 0.9B parameters, you can drop GLM-OCR straight into LM Studio and run it on almost any machine! 🥔 🧠 0.9B total parameters 💾 Runs on < 1.5GB VRAM (or ~1GB quantized!) 💸 Zero API costs 🔒 Total data