Jerry Liu
Bio
document OCR + workflows @llama_index. cofounder/CEO. Careers: https://t.co/EUnMNmb4DZ Enterprise: https://t.co/Ht5jwxRU13
Platform
Post history
Automate ETL over Financial Data 📊 Most real-world financials are not “database-shaped”, and it takes a ton of human effort to manipulate/copy an Excel sheet into structured formats for analysis. We recently launched LlamaSheets - a specialized AI agent that automatically structures your Excel spreadsheet into a 2D format for analysis. There are so many use cases for Excel, and accounting is a huge subcategory here. Check it out: https://www.llamaindex.ai/blog/announcing-llamasheets-turn-messy-spreadsheets-into-ai-ready-data-beta
We wrote a tutorial on extracting massive structured tables from documents 📃 Using naive LLM structured output for document extraction fails if the number of output tokens is large - the LLM will end up dropping or hallucinating results. A lot of documents are basically rendering massive tables in PDF form, like this Blue Shield document showing network coverage across 380+ CA hospitals. We created a new mode in LlamaExtract that lets you extract every single row from this document with 100% accuracy. This lets you: * ETL it into a structured database * Do structured queries over it. * And more! Check out our blog post: https://www.llamaindex.ai/blog/extracting-repeating-entities-from-documents?utm_source=socials&utm_medium=li_social
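To see why naive one-shot extraction drops rows, here's a minimal stdlib sketch of the batching idea (my own illustration, not the actual LlamaExtract internals; `extract_batch` stands in for a structured-output LLM call): keep each call far below the output-token budget, then merge the partial results.

```python
# Sketch of the batching idea behind row-level extraction (illustrative,
# not the actual LlamaExtract implementation): instead of asking a model
# to emit hundreds of rows in one response, process the table in small
# batches so each call stays well under its output budget, then merge.

def extract_rows(table_rows, extract_batch, batch_size=25):
    """Run `extract_batch` (an LLM call in practice) over small slices."""
    results = []
    for start in range(0, len(table_rows), batch_size):
        results.extend(extract_batch(table_rows[start:start + batch_size]))
    return results

# Stand-in for a structured-output LLM call: map raw rows to dicts.
fake_llm = lambda rows: [{"hospital": r} for r in rows]
extracted = extract_rows([f"Hospital {i}" for i in range(380)], fake_llm)
print(len(extracted))  # 380 - no dropped or hallucinated rows
```

The merge step is trivial here because each batch is independent; the hard part in practice is making each LLM call faithful within its batch, which is what a dedicated extraction mode handles.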
RT LlamaIndex 🦙 POV: You're building an agent and it keeps giving weird answers because your PDF parsing is broken 🫠 This is a great walkthrough by @mesudarshan showing exactly how to use LlamaParse to fix this—from basic setup through advanced configs. The video walks through: · Why most PDF parsers fail on complex layouts (tables, charts, multi-column text) · Using the LlamaCloud playground to experiment · Real demo: parsing "Attention Is All You Need" paper with different settings · Cost-effective vs agentic vs agentic plus modes—when to use each · Preset configs for invoices, scientific papers, forms (they tune the parsing prompts for you) · Advanced options: OCR, language selection, choosing your LLM (Sonnet, GPT-4, etc.) · Saving custom configs so you don't have to re-tune for similar docs https://youtu.be/mUHPPBbumIs?si=sWBDIoR9376wYP7R Original tweet: https://x.com/llama_index/status/1994452235754029426
We launched a new API today to let you parse any Excel sheet into a structured table. Take a look at this example on corn production costs 🌽: 1️⃣ The table is located at the center of the sheet with headers, footnotes, and a hierarchical column layout 2️⃣ We get back a structured table with summarization, along with parsed row/column representations This lets you directly run text-to-pandas/SQL over this data if you’re building an AI agent, or do ETL yourself over it. Check out our blog and come take a look! Blog: https://www.llamaindex.ai/blog/announcing-llamasheets-turn-messy-spreadsheets-into-ai-ready-data-beta Try it out: https://cloud.llamaindex.ai/
Introducing LlamaSheets 🦙 - a specialized AI agent that can convert complex spreadsheets into normalized, structured data. Excel files are arbitrarily complicated - they contain semi-structured numerical data, complex formatting, and visual hierarchies. A lot of Excel files “look like tables” to the human eye but are impossible to parse as actual tables if you tried to read the raw cell values. Parsing Excel files requires a completely different stack from parsing any other document format. We’ve been hard at work over the past few months doing applied research on understanding Excel files. The result: we’ve created a powerful algorithm to identify, segment, and output structured tables - including preserving multi-level row and column hierarchies. The output is a structured dataframe that you can directly run queries over, OR feed as a tool to an upstream AI agent. Any AI agent will have a much easier time understanding these structured values vs. trying to write code (e.g. openpyxl) to manipulate the raw Excel values - it raises accuracy and reduces cost. This was a teamwide effort and huge shoutout to everyone who contributed to this. Check out our video, blog, docs, and signup below! Video: https://www.youtube.com/watch?v=eOp6_vbA5Kc Blog: https://www.llamaindex.ai/blog/announcing-llamasheets-turn-messy-spreadsheets-into-ai-ready-data-beta?utm_source=socials&utm_medium=li_social Docs: https://developers.llamaindex.ai/python/cloud/llamasheets/getting_started/ Sign up: https://cloud.llamaindex.ai/ Happy Thanksgiving week 🦃
Link: https://x.com/llama_index/status/1993362324070318286
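As a toy illustration of the "identify and segment" step (the real LlamaSheets algorithm is far more involved, and this sketch is my own): a first approximation is to find the bounding rectangle of non-empty cells, which is often where the actual table lives inside a sheet padded with titles, footnotes, and whitespace.

```python
# Toy sketch of table-region detection in a spreadsheet grid
# (illustrative only - not the LlamaSheets algorithm, which also
# handles hierarchies, multiple tables, and formatting cues).

def table_bounds(grid):
    """Return (top, left, bottom, right) of non-empty cells, or None."""
    filled = [(r, c) for r, row in enumerate(grid)
              for c, cell in enumerate(row) if cell not in (None, "")]
    if not filled:
        return None
    rows = [r for r, _ in filled]
    cols = [c for _, c in filled]
    return min(rows), min(cols), max(rows), max(cols)

sheet = [
    ["", "", "", ""],
    ["", "Region", "Cost", ""],
    ["", "North", 120, ""],
    ["", "South", 95, ""],
    ["", "", "", ""],
]
print(table_bounds(sheet))  # (1, 1, 3, 2)
```

Real sheets break this heuristic constantly (merged headers, multiple tables per sheet, footnotes inside the rectangle), which is exactly why a specialized agent is needed.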
We've created a specialized agent that lets you extract every single row from super complex embedded tables ✂️, with super high accuracy. Simply define a simple schema with natural language containing the elements you want to extract, and upload your doc(s). If you try this by just prompting an LLM, you are going to run into hallucinations and/or dropped output - the output token space is super large and the raw model has a high probability of failing. Docs: https://developers.llamaindex.ai/python/cloud/llamaextract/features/options/?utm_source=socials&utm_medium=li_social Sign up: https://cloud.llamaindex.ai/
Link: https://x.com/llama_index/status/1991930005425926229
We’re looking for strong AI engineers to help us build specialized agents for document understanding. It’s an extremely technical role, and a blend of hard ML tech with agent engineering techniques: ✅ You should know how to train VLMs for solving OCR + extraction tasks ✅ You should know first-principles context/workflow engineering for building agents ✅ You should know how to do applied research but aggressively prioritize in a fast-moving startup ✅ Ideally you’ve also shipped e2e product Simon and I both have extensive ML backgrounds, and we strongly value good technical talent here. In-person in SF. Apply here: https://www.llamaindex.ai/careers/multimodal-ai-engineer-document-understanding
It's pretty clear that AI agents can inhale documents and perform knowledge work. It's also pretty clear humans need to be able to observe this process to prevent things from going off the rails. One of the biggest benefits of orchestrating document workflows through code (vs. a no-code builder like traditional RPA tools) is native integration with OpenTelemetry and all the wonderful AI observability tools out there. This is exactly what LlamaAgents provides. This is super important because documents are inherently big containers of unstructured data. Workflows transform this mess of tokens into structured outputs over time. Trying to manually inspect everything is a nightmare. These integrations allow you to dump the full LLM traces into any supported observability tool out there, giving both non-technical and technical users extremely granular insights into what's going on, and allowing steerability in the case of unexpected outputs. Check out our awesome blog by @itsclelia here! https://www.llamaindex.ai/blog/observability-in-agentic-document-workflows LlamaAgents: https://developers.llamaindex.ai/python/llamaagents/overview/
Link: https://x.com/llama_index/status/1991183958164553959
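To make the observability idea concrete, here's a minimal stdlib stand-in for span-based tracing (a real deployment would use the OpenTelemetry SDK and an exporter; the step names and attributes below are made up): wrap each workflow step in a span that records its name, duration, and step-specific attributes, so a human can inspect what each stage did to the document.

```python
import time
from contextlib import contextmanager

# Stdlib stand-in for OpenTelemetry-style spans (illustrative only).
SPANS = []

@contextmanager
def span(name):
    """Record a span with name, duration, and any attributes set inside."""
    record = {"name": name}
    start = time.perf_counter()
    try:
        yield record
    finally:
        record["duration_s"] = time.perf_counter() - start
        SPANS.append(record)

# Hypothetical document-workflow steps, each emitting a span.
with span("parse") as s:
    s["output_chars"] = len("# Parsed markdown would go here")
with span("extract") as s:
    s["fields_extracted"] = 12

print([s["name"] for s in SPANS])  # ['parse', 'extract']
```

The point is the shape of the data, not the implementation: once every step emits structured spans, any observability backend can render the full trace of how a messy document became structured output.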
We’ve built one of the most advanced ways to help you automate knowledge work over your documents. A lot of document work depends on encoding custom processes: for instance, enforcing custom validation checks, doing web search, or integrating with external systems. LlamaAgents is a full product suite that lets you build and deploy an agentic document extraction workflow, orchestrated purely through code. 🚫 It is not a drag-and-drop builder ✅ It directly integrates with the LlamaCloud suite: document parsing, extraction, classification, indexing ✅ It lets you orchestrate workflows through code, meaning it’s infinitely customizable ✅ It gives you the app deployment layer out of the box - and you can even customize the app layer! Come check it out: https://www.llamaindex.ai/blog/llamaagents-build-serve-and-deploy-document-agents?utm_source=socials&utm_medium=li_social Docs: https://developers.llamaindex.ai/python/llamaagents/overview/
Link: https://x.com/llama_index/status/1990828159835791697
I did some weekend reading into the recently released OlmOCR2 model by @jakepoznanski et al. 📄🔎 A cool insight here is that you can scalably do RL on your document parsing model (specifically RLVR) in an automated fashion - without needing humans for feedback or to create the reward function for each document. The authors use Sonnet to generate an HTML scaffold, which is then used to generate tailored unit tests for each document. The unit tests themselves are deterministic, but the generation is done by an LLM. The parsed outputs are scored against the unit tests, which creates feedback signals for the model. This is nice because: - Manually generating parsed ground-truth for documents is a pain in the ass - Manually generating tests for documents is also a pain in the ass - So scaling up data and signals for document understanding has been painful Blog + paper: https://allenai.org/blog/olmocr-2 From our side, we’re constantly benchmarking all the latest OCR models in our LlamaParse pipeline. If you’re interested in parsing complex docs, come check it out here! https://cloud.llamaindex.ai/
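The reward mechanism described above can be sketched in a few lines (my own stdlib illustration of the RLVR idea, not the OlmOCR2 code; in the paper the per-document checks are LLM-generated, here they're hand-written stand-ins): deterministic checks over the parsed text produce a scalar score with no human labels involved.

```python
# Sketch of a verifiable reward for document parsing: score parsed
# output against deterministic per-document unit tests. A check that
# raises (e.g. a missing substring) simply counts as a failure.

def reward(parsed_text, checks):
    """Fraction of deterministic checks the parsed output passes."""
    passed = 0
    for check in checks:
        try:
            passed += bool(check(parsed_text))
        except Exception:
            pass  # crashing check = failed check
    return passed / len(checks)

# Hand-written stand-ins for LLM-generated unit tests on one document:
checks = [
    lambda t: "Attention Is All You Need" in t,                 # title kept
    lambda t: t.index("Abstract") < t.index("1 Introduction"),  # order kept
    lambda t: "softmax" in t,                                   # body kept
]

good = "Attention Is All You Need\nAbstract ...\n1 Introduction ... softmax ..."
bad = "1 Introduction ...\nAbstract ..."
print(reward(good, checks))  # 1.0
print(reward(bad, checks))   # 0.0
```

Because the checks are deterministic, the same reward can be recomputed millions of times during training, which is what makes the approach scale.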
This past week the entire @llama_index team got together in Mexico City 🇲🇽 We reaffirmed our focus: we’re all in on building AI agents to solve document processing and workflows 📑🤖 We are not building simple LLM framework abstractions. We are building incredibly deep OCR technology across multiple price points to unlock content from any unstructured document. Most importantly, we ate a ton of tacos 🌮🌮🌮🌮 We have a ton of releases slated from now until EOY, stay tuned! Also we’re hiring: https://www.llamaindex.ai/careers Until next time CDMX :)
We’ve created a specialized agent tuned for row-level table extraction 🧩🤖 A lot of document workflows involve converting a complex table in a .pdf/.docx file into an Excel spreadsheet. Oftentimes this work is done manually. We created a mode in LlamaExtract that only deals with table extraction. Simply define the schema you want for each row, and the agent will extract out all rows from every table corresponding to this schema! It can not only populate the row context, but also the global table context. Check it out in LlamaCloud: https://cloud.llamaindex.ai/
RT LlamaIndex 🦙 Chart OCR just got a major upgrade with our new experimental "agentic chart parsing" feature in LlamaParse 📈🧪 Most LLMs struggle with converting charts to precise numerical data, so we've created an experimental system that follows contours in line charts and extracts values. Automate chart analysis without spending hours manually correcting extracted values. Try it now in LlamaParse: https://cloud.llamaindex.ai/?utm_source=socials&utm_medium=li_social
One of the biggest use cases for agentic document automation is insurance underwriting ✍️ Underwriting depends on processing *massive* volumes of unstructured documents: medical reports, scanned forms, and way more. It's also historically been a massively manual process. We're super excited to feature this case study with Pathwork AI - Pathwork is hyperfocused on building underwriting agents for life insurance. They're able to use LlamaCloud as a core module to process the massive volume of docs, from medical documentation to carrier guidelines. Check it out: https://www.llamaindex.ai/customers/pathwork-automates-information-extraction-from-medical-records-and-underwriting-guidelines-with?utm_source=socials&utm_medium=li_social LlamaCloud: https://cloud.llamaindex.ai/
Link: https://x.com/llama_index/status/1988290671279829204
RT LlamaIndex 🦙 See how @pathwork scaled their life insurance document processing from 5,000 to 40,000 pages per week using LlamaParse. 📄 Process complex medical records, lab results, and decades-old scanned PDFs with 8x improved throughput 🤖 Automatically extract and index carrier underwriting guidelines to keep risk rules current ⚡ Replace fragile, manual pipelines with robust automation that handles everything from digital forms to 1970s faded scans 🎯 Free up engineering time from maintenance to focus on building new product features @pathwork's Case Underwriter, Knowledge Assistant, and Pre-App Manager products all rely on transforming unstructured insurance documentation into structured data for faster decision-making. By integrating LlamaParse, they eliminated bottlenecks that were directly limiting customer growth and built future-proof infrastructure that automatically improves over time. Read the full case study: https://www.llamaindex.ai/customers/pathwork-automates-information-extraction-from-medical-records-and-underwriting-guidelines-with?utm_source=socials&utm_medium=li_social
We've gotten super, super deep into the wonderful world of document OCR through the history of @llama_index - and we'd love to share it with you! 🌟 1. There are a lot of benefits to "traditional" methods of reading the PDF binary for fast, cheap parsing. 2. You can use LLMs in the loop for general reading-order reconstruction. 3. VLMs are obviously useful, and we've benchmarked every frontier model out there to give high-quality results over the most complex pages within our pipeline. State-of-the-art document parsing is super important for building agentic automation over any set of docs, and we've invested in it for the past 2 years. Register here: https://landing.llamaindex.ai/beynd-ocr-how-ai-agents-parse-complex-docs?utm_source=socials&utm_medium=li_social
Link: https://x.com/llama_index/status/1986810928713855235
We're partnering with @browserbase, @braintrust, @modal for an awesome afterparty at re:Invent - come join us!
Link: https://x.com/llama_index/status/1988003781448266141
RT LlamaIndex 🦙 There are Vegas parties and there is Late Shift 🎉 Join us for an exclusive re:Invent afterparty that brings together the best minds in AI and tech for a night you won't forget. 🍸 Cocktails and disco balls at Diner Ross Steakhouse in The LINQ 🤖 Connect with the teams behind @browserbase, @braintrust, @modal_labs, and LlamaIndex 🌙 Late-night tech conversations when the conference sessions end 🎟️ Limited spots with approval-required registration We're teaming up with our friends at @browserbase, @usebraintrust, and @modal_labs to host the most fun you'll have all conference. After your evening sessions, meet us for cocktails, networking, and the kind of tech chatter that makes re:Invent legendary. RSVP now - spots are limited: https://luma.com/lateshift
I’m very interested in seeing how many bits and pieces of finance work we can fully automate with agents. I built a multi-step agentic workflow to automate SEC document understanding👇 Given an SEC filing (10K, 10Q, 8K), use our agent classify module to determine what type it is, and route it to the right schema for document extraction (powered by LlamaExtract) Powered by LlamaCloud and LlamaAgents - it’s a full code-based orchestration layer over LLM capabilities. Simple Repo + file: https://github.com/jerryjliu/classify_extract_sec/blob/main/src/extraction_review_tmp5_classify_sec/process_file.py LlamaAgents: https://developers.llamaindex.ai/python/llamaagents/overview/ LlamaCloud: https://cloud.llamaindex.ai/
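The classify-then-route pattern in this workflow can be sketched with stdlib code (my own illustration - in the real repo, LlamaClassify and LlamaExtract do this work, and the filing-type keywords and schema fields below are invented for the example): pick a schema based on the filing type, then run extraction against that schema.

```python
# Sketch of classify -> route -> extract for SEC filings (illustrative;
# not the LlamaCloud API). Field names are hypothetical.

SCHEMAS = {
    "10-K": ["fiscal_year", "total_revenue", "risk_factors"],
    "10-Q": ["quarter", "total_revenue"],
    "8-K":  ["event_date", "event_description"],
}

def classify(text):
    """Stand-in for an agentic classifier: naive keyword routing."""
    for filing_type in SCHEMAS:
        if filing_type in text:
            return filing_type
    return "unknown"

def process_filing(text):
    filing_type = classify(text)
    schema = SCHEMAS.get(filing_type)
    if schema is None:
        return {"type": filing_type, "fields": {}}
    # A real pipeline would call an extraction agent with `schema` here.
    return {"type": filing_type, "fields": {f: None for f in schema}}

print(process_filing("FORM 10-Q for the quarterly period ...")["type"])  # 10-Q
```

The design benefit is that the routing table is data, not control flow: adding a new filing type means adding one schema entry, not rewriting the workflow.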
Build an agentic finance workflow over your inbox 📤 We’ve created a template that shows you how to automatically classify and process invoices/expense attachments as emails come in, with super high accuracy. Uses state-of-the-art OCR available in LlamaParse, wrapped in a LlamaAgents workflow. Shoutout @itsclelia for this example! Repo: https://github.com/AstraBert/financial-team-agent LlamaCloud: https://cloud.llamaindex.ai/login?redirect=%2F%3Futm_source%3Dtwitter%26utm_medium%3Dli_social
Link: https://x.com/llama_index/status/1986847428272857356
RT LlamaIndex 🦙 Trigger your agent workflows directly from your inbox, using our LlamaAgents and @resend webhooks📧 In this demo, we built a system that: 👉 Receives emails with documents attached 👉 Classifies the attachments as either invoices or expenses using LlamaClassify 👉 Extracts the relevant information through LlamaExtract 👉 Writes an email reply and sends it back to the user All of this is packaged as an agent workflow and deployed to the cloud through our LlamaAgents!🚀 🦙 Get started with all our LlamaCloud services now: https://cloud.llamaindex.ai?utm_source=twitter&utm_medium=li_social 📚 Learn more about our agent workflows: https://developers.llamaindex.ai/python/llamaagents/overview?utm_source=twitter&utm_medium=li_social ⭐ Star the repo on GitHub: http://github.com/AstraBert/financial-team-agent
For the first time in human history, you can: 1️⃣ Take a bucket of docs/PDFs 🪣📑 2️⃣ Make sense of it 3️⃣ Extract insights / search over it with super high accuracy with effectively 0 humans involved. This is a neat joint stack we copublished with @MongoDB, check it out! https://youtube.com/watch?v=5mEPkPtoNyY
Link: https://x.com/llama_index/status/1986117341911130163
Build an AI agent to automate your finance team’s entire invoice/expense workflow! 🧾 @TuanaCelik has built a fantastic example that shows you how to construct an agentic workflow that can triage incoming emails + attachments, detect whether it’s an invoice or expense, and process it accordingly. It uses our core agentic classification / extraction capabilities under the hood in LlamaCloud, and is backed by @llama_index workflows. Check it out: https://github.com/run-llama/workflows-py/blob/main/examples/document_agents/finance_triage_agent.ipynb
Link: https://x.com/llama_index/status/1986476949687140503
RT LlamaIndex 🦙 We probably shouldn't tell you how to build your own document parsing agents, but we will 😮. AI agents are transforming how we handle messy, real-world documents that break traditional OCR systems. Join our live webinar on December 4th at 9 AM PST where the LlamaParse team reveals industry secrets for parsing complex documents: 📋 Blueprint for building next-generation document parsing workflows using agents instead of OCR alone 🔧 Practical strategies for handling handwriting, rotated scans, nested tables, and visually dense layouts 🤖 Latest LlamaCloud capabilities showing how vision language models automate extraction from previously unparseable PDFs, forms, and images ⚡ When to apply each component in your parsing pipeline and why it matters We'll show you how to move beyond simple text extraction to actually automate understanding of documents with multi-column layouts, embedded charts, skewed scans, and tables within tables. Register now: https://landing.llamaindex.ai/beynd-ocr-how-ai-agents-parse-complex-docs?utm_source=socials&utm_medium=li_social
Our new bounding box approach in LlamaParse gives you accurate bounding boxes while preserving correct reading order of the text through agentic reconstruction. The issue with traditional parsing methods is that the quality of the output is directly dependent on the layout detector - if the predicted boxes are wrong or in the wrong sequence, then your output is garbled. Here we use LLMs to reconstruct the entire semantic flow of the text, but still allow bounding box processing in parallel for additional metadata! Now available in LlamaParse: https://cloud.llamaindex.ai/
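To show why reading order breaks with raw layout boxes, here's a toy stdlib example (my own illustration of the failure mode, not LlamaParse's method - LlamaParse uses LLM-based reconstruction rather than this geometric heuristic): on a two-column page, sorting boxes by y-coordinate alone interleaves the columns; grouping by column first restores the flow.

```python
# Boxes are (x, y, text), with y increasing down the page.

def naive_order(boxes):
    """Sort purely top-to-bottom - interleaves multi-column text."""
    return [t for _, _, t in sorted(boxes, key=lambda b: b[1])]

def column_order(boxes, page_width=600):
    """Group into left/right columns, then read each top-to-bottom."""
    left = sorted((b for b in boxes if b[0] < page_width / 2),
                  key=lambda b: b[1])
    right = sorted((b for b in boxes if b[0] >= page_width / 2),
                   key=lambda b: b[1])
    return [t for _, _, t in left + right]

boxes = [(50, 10, "A1"), (350, 15, "B1"), (50, 120, "A2"), (350, 130, "B2")]
print(naive_order(boxes))   # ['A1', 'B1', 'A2', 'B2'] - interleaved
print(column_order(boxes))  # ['A1', 'A2', 'B1', 'B2'] - correct flow
```

Even the column heuristic falls apart on irregular layouts (spanning headers, sidebars, nested tables), which is the case for reconstructing semantic flow with an LLM instead of relying on box geometry alone.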
RT LlamaIndex 🦙 Here's a common scenario: Your finance team gets emails all day with invoices from partners and expense reports from employees. Each one needs different handling. Invoices need acknowledgment and payment scheduling. Expenses need budget validation before approval etc. In this example we build an agent that automatically triages incoming emails with attachments, extracts the right information, and takes appropriate action. Our approach uses three of our tools working together: 1️⃣ LlamaClassify handles the first decision point. It looks at each attachment and determines: is this an invoice that needs to be paid out to a partner, or an expense that needs reimbursement? It also provides reasoning for the decision. 2️⃣ LlamaExtract does the heavy lifting on data extraction. We create two specialized agents with different schemas for invoices vs expenses. 3️⃣ Agent Workflows orchestrates the entire process. It connects classification to extraction to business logic: in this case, checking expenses against a budget threshold and generating appropriate email responses via LLM. Classify incoming documents → extract relevant data → apply business rules → take action. Need to add a new document type? Add a classification rule and an extraction schema. Need different business logic? Modify the workflow steps. The components stay the same. Check out the full example: https://github.com/run-llama/workflows-py/blob/main/examples/document_agents/finance_triage_agent.ipynb
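The triage logic in steps 1-3 above can be sketched with stdlib code (in the real example, classification and extraction are LLM-powered; the keyword classifier and budget threshold here are invented stand-ins):

```python
# Sketch of classify -> business rule -> action for finance triage
# (illustrative; not the LlamaClassify/LlamaExtract implementation).

BUDGET_THRESHOLD = 500.0  # hypothetical expense auto-approval limit

def classify_attachment(text):
    """Stand-in for LLM classification: invoice vs. expense."""
    return "invoice" if "invoice" in text.lower() else "expense"

def triage(text, amount):
    kind = classify_attachment(text)
    if kind == "invoice":
        return "acknowledge and schedule payment"
    if amount > BUDGET_THRESHOLD:
        return "flag for manual budget review"
    return "approve reimbursement"

print(triage("Invoice #1042 from Acme", 1200.0))
print(triage("Team dinner expense report", 180.0))
```

As the original post notes, the components stay fixed while the rules change: adding a document type touches the classifier and schemas, not the workflow skeleton.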
grep AND semantic search is all you need The fact that coding agents can access CLI commands makes them way better at search than standard retrieval. With grep/read/cat operations you can dynamically load different chunks of data and traverse complex directories. Obviously the even better answer is to just combine the two: combine grep with semantic search. If you want to DIY this, check out `semtools`! We've built a simple, lightweight, index-free engine that lets you run semantic search over any directory as a CLI command. Easily give it to your favorite coding agent e.g. Claude Code / Cursor to run. https://github.com/run-llama/semtools
Cursor: Semantic search improves our agent's accuracy across all frontier models, especially in large codebases where grep alone falls short. Learn more about our results and how we trained an embedding model for retrieving code.
Link: https://x.com/cursor_ai/status/1986124270548709620
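The hybrid idea can be sketched in a few lines of stdlib Python (my own illustration - semtools uses a real embedding model; the lexical-overlap score below is a crude stand-in for embedding similarity): exact-match hits rank first, fuzzy hits fill in lines that grep alone would miss.

```python
# Toy "grep AND semantic search" (illustrative; not the semtools
# implementation). similarity() is a bag-of-words stand-in for
# embedding cosine similarity.

def similarity(query, line):
    q, l = set(query.lower().split()), set(line.lower().split())
    return len(q & l) / len(q) if q else 0.0

def hybrid_search(query, lines, threshold=0.5):
    # grep-style pass: exact substring matches come first
    exact = [l for l in lines if query.lower() in l.lower()]
    # "semantic" pass: fuzzy matches grep would miss, best first
    fuzzy = [l for l in lines
             if l not in exact and similarity(query, l) >= threshold]
    return exact + sorted(fuzzy, key=lambda l: -similarity(query, l))

docs = [
    "error handling in the parser",
    "handling of parse errors and retries",
    "unrelated release notes",
]
print(hybrid_search("error handling", docs))  # exact hit first, fuzzy hit second
```

The second document is exactly the kind of result grep misses (the words are reordered and inflected) and an embedding-based search catches.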
semtools is the easiest way to let your Claude Code / Cursor become an analyst over 1k+ PDF docs. It just adds two CLI commands: `parse`, `search`. Install it to ~/.zshrc and add it to your CLAUDE.md. Any coding agent can still choose to use grep, but now they get access to semantic search. Check it out: https://github.com/run-llama/semtools Blog: https://www.llamaindex.ai/blog/semtools-are-coding-agents-all-you-need
Link: https://x.com/LoganMarkewich/status/1986231594072613333
RT Logan Markewich Cursor put out a blog today stating that semantic search beats grep. Semantic search doesn't have to be complicated, and that's exactly why I built SemTools -- to provide agents with a "fuzzy semantic grep search" Semtools https://github.com/run-llama/semtools Blog https://cursor.com/blog/semsearch
RT LlamaIndex 🦙 Last week, we teamed up with @MongoDB to break down one of the most persistent challenges in production AI systems: turning messy, real-world documents into reliable insights. Enterprise documents don't come in neat, uniform packages. Invoices, SEC filings, reports—they all have irregular layouts, embedded tables, images, and context that traditional text extraction just can't handle. In this session, we walked through a complete document processing workflow that works at scale: LlamaParse acts as an agentic parsing tool that understands document structure—not just text extraction. It handles complex layouts, preserves table formatting, and extracts images with context. It outputs clean markdown that LLMs can work with. The architecture is: S3 → LlamaParse → MongoDB Atlas → LLM. The recording is up now: https://www.youtube.com/watch?v=5mEPkPtoNyY
RT LlamaIndex 🦙 MavenBio transformed complex scientific visuals in biopharma documents into searchable, analyzable intelligence using LlamaParse. Before LlamaParse, MavenBio's AI platform could process text-heavy documents but missed critical insights locked in charts, figures, and conference posters that drive real biopharma decisions. 🔬 Visual content parsing: Conference posters, regulatory filings, and scientific publications with complex diagrams now become fully searchable 📊 10x-20x faster workflows: Users can run comparative trial assessments and opportunity prioritization with unprecedented speed and depth 🎯 Enhanced accuracy: Visual context integration improved the precision of structured analyses across their platform ⚡ Engineering focus: Team reallocated resources from building parsing infrastructure to core product innovation "LlamaParse bridges the gap between static visual data and structured language," says @bernardffaucher, Founding Senior Backend Engineer at MavenBio. The webhook-based asynchronous processing scaled their throughput while maintaining low latency across their always-on ingestion pipeline. Read the full case study: https://www.llamaindex.ai/customers/maven-bio-turns-the-unstructured-world-of-complex-scientific-visuals-into-intelligence-with?utm_source=socials&utm_medium=li_social
Haiku 4.5 is better than GPT-5 at document OCR over tables 📋 Better reasoning doesn’t correlate with better visual understanding 💡. I fed the NYC MTA timetable as screenshots into both GPT-5 and Haiku 4.5. - (Left) GPT-5 ignores the spaces between table values - (Right) Haiku almost perfectly reconstructs the table, including spaces in between. The extra columns don’t materially impact the correctness of the results. Haiku is shaping up to be a great lightweight contender for document parsing. You can play with it and other models within LlamaCloud! LlamaCloud: https://cloud.llamaindex.ai/
RT LlamaIndex 🦙 Augment your LlamaIndex agent workflows with memory and persistent states: Check out @itsclelia's talk at @qdrant_engine Vector Space Day to learn how to build context-rich AI systems leveraging vector search and workflow engineering. Take a look at the YT video: https://youtu.be/CDyFukgpayY Learn more about LlamaIndex agent workflows: https://developers.llamaindex.ai/python/llamaagents/overview?utm_source=twitter&utm_medium=li_social