independent ai consultant, a16z scout, creator of instructor prev. @stitchfix @meta
who are some accounts i should follow on context engineering, agents, and ai coding
How do we get this funded? anson ⁂: should have an F1-esque competition where research labs sponsor teams of laymen to do tasks at expert level only using their models eg. "make the best japanese cheesecake" and then judged by culinary experts Link: https://x.com/ansonyuu/status/1988749001206960128
Who’s doing economic research at cursor? Societal impacts at cursor would be such a dream job. Michael Truell: After adopting Cursor, businesses merge ~40% more PRs each week. New economics research from the University of Chicago. Link: https://x.com/mntruell/status/1988755128401424399
It’s read every book on communication. Tamay Besiroglu: "I’ve got you, Ron — that’s totally normal, especially with everything you’ve got going on lately." Who actually wants their model to write like this? Surprised OpenAI highlighted this in the GPT-5.1 announcement. Very annoying IMO. Link: https://x.com/tamaybes/status/1988715705722892371
Literally feels inhumane that one team could not use AI. Anthropic: New Anthropic research: Project Fetch. We asked two teams of Anthropic researchers to program a robot dog. Neither team had any robotics expertise—but we let only one team use Claude. How did they do? Link: https://x.com/AnthropicAI/status/1988706380480385470
im horrified that gpt-4 is still available via the api, i just accidentally called it and spent $1,000 LOL
asian guy living in williamsburg building ai companions: asians are so normie
Cleobug101: post you and your celeb lookalike :-) Link: https://x.com/cleobug101/status/1986122237255029026
I wish there was a provider that could just say I want to spend $3,000 to get better skin, and they'll just tell me what to do.
Bought a red snapper and a deba. Fish chef Jason era starts today. After two years of yakitori chicken butcher Jason.
It makes sense because the latter is leaner fish to fattier fish. So ending with beef and then a pizza IS COMPLETELY GASTRONOMICALLY ACCURATE. lauren: my fav thing about omakase is getting a slice of pizza after Link: https://x.com/wowbestie/status/1986990812521992520
Who wants to sponsor our ai dev rel in New York? jason liu: Dev rel in ai dinner would hit. Link: https://x.com/jxnlco/status/1987249461576208452
This is how coding interviews feel. ✮⋆˙𝔢𝔪𝔦˙ ⋆ ✮: he will be spared today Link: https://x.com/emiikyu/status/1987099833287778451
RT Logan Kilpatrick We just shipped some nice Gemini API updates for developers using Structured Outputs. The API now supports:
- $ref for recursive schemas
- anyOf union types
- min + max numerical constraints
- null types
- property ordering adherence
And much more!
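For context, a response schema exercising those newly supported features might look like the sketch below. This is plain JSON Schema expressed as a Python dict; the `Comment`/`score`/`replies` names are invented for illustration, and in a real call you would hand a schema like this to the Gemini API's structured output configuration.

```python
# Hypothetical schema showing $ref recursion, an anyOf union (including a
# null branch), and minimum/maximum numeric constraints. All field names
# here are made up for illustration.
comment_schema = {
    "$defs": {
        "Comment": {
            "type": "object",
            "properties": {
                "author": {"type": "string"},
                # anyOf union: a score may be a bounded number or null
                "score": {
                    "anyOf": [
                        {"type": "number", "minimum": 0, "maximum": 100},
                        {"type": "null"},
                    ]
                },
                # $ref back to Comment makes the schema recursive,
                # so replies can nest arbitrarily deep
                "replies": {
                    "type": "array",
                    "items": {"$ref": "#/$defs/Comment"},
                },
            },
            "required": ["author", "replies"],
        }
    },
    # the root of the schema is itself a Comment
    "$ref": "#/$defs/Comment",
}
```

Before these updates, recursive structures like threaded comments had to be flattened or depth-limited by hand; a self-referencing `$ref` removes that workaround.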

Technical session with Anton from ChromaDB on text chunking fundamentals, evaluation methods, and practical tips for improving retrieval performance

Insights from Eli Badgio, CTO of Extend, on mapping document workflows, building task-specific evaluations, and implementing partial automation with human-in-the-loop approaches for 95%+ extraction accuracy.

A deep dive into why multi-agent systems might not be the optimal approach for coding contexts, exploring context engineering, challenges of context passing between agents, and how single agents with proper context management can outperform multi-agent setups.

A deep dive into generative benchmarking - creating custom evaluation sets from your own data to better assess embedding model performance.

Practical approaches to enhancing retrieval quality through fine-tuning, re-ranking, and understanding trade-offs in RAG systems

How Glean achieves 20% search performance improvements through customer-specific embedding models, unified data architecture, and smart feedback loops that most enterprise AI companies are missing.

Guest lecture with John Berryman on traditional search techniques, their application in RAG systems, and how lexical search complements semantic search

AI monitoring, production testing, and data analysis frameworks for identifying issues in AI systems and implementing structured monitoring.

Deep insights from the teams behind Devin, Amp, Cline, and Augment on building effective coding agents. Learn why simple approaches are winning over complex architectures in autonomous coding systems.

Guest lecture with Anton Troynikov from ChromaDB on organizing data for retrieval systems, query routing strategies, and optimizing vector search performance

Common RAG anti-patterns across different industries and practical advice for improving AI systems through better data handling, retrieval, and evaluation practices.

Why leading coding agent companies are abandoning embedding-based RAG in favor of direct, agentic approaches to code exploration that mirror how senior engineers actually work.

A conversation with Adit, CEO of Reducto, covering challenges of document ingestion, parsing tables and forms, hybrid CV + VLM pipelines, and optimizing representations for reliable AI systems.

How AI is changing search requirements and the technical challenges of building a semantic search engine designed for AI applications rather than human users.

A letter to the readers of the AI Coding Accelerator.

How to successfully apply LLMs in specialized industries by building domain‑expert review loops, augmenting prompts with expert knowledge, and earning customer trust.

CTO of Sourcegraph explores how the evolution of AI models has fundamentally changed agent architecture, requiring a complete rethinking of context management, tool design, and model selection

Comprehensive guide to building, improving, and scaling RAG systems. From fundamentals to advanced enterprise implementations with real-world examples and proven strategies.

Insights from Colin Flaherty on building autonomous coding agents and how agentic approaches reshape retrieval-augmented generation systems.

A novel approach to RAG systems that leverages the browser as a data layer, connecting agents to sensitive data without traditional APIs.
RT jason liu the hidden complexity of document parsing that's killing your rag pipeline

i just spent an hour with adit from reducto, and it's clear why the best ai teams in finance, legal, and healthcare are working with them on document parsing. here's what most teams miss:

vision language models fail in surprising ways
even the best vlms hallucinate table values, drop rows, and misread form fields. the problem isn't reasoning, it's getting clean inputs. those "minor" parsing errors compound catastrophically downstream.

don't believe me? try parsing a checkbox in healthcare data where "checked" vs "unchecked" determines patient vaccination status. vlms get this wrong at alarming rates.

hybrid approaches outperform pure vlm pipelines
the most effective approach isn't just throwing everything at gpt-4v:
• traditional cv for clean structured content (better bounding boxes, confidence scores)
• vlms for handwriting, charts, and complex visual elements
• multi-pass correction with specialized models for error detection

representation matters more than you think
for simple structures, markdown works. for complex tables with merged cells? html preserves the structure that models need to reason correctly.

the most surprising insight: embedding models and llms have fundamentally different limitations. your table might be perfectly parsed for the llm to reason over, but embedding models can't match user queries to that dense html structure.

solution? create separate representations optimized for each:
• html/structured format for the llm to reason with
• natural language summaries of the same content for embedding/retrieval

most teams are still treating document parsing as a solved problem when it's actually where most rag pipelines break. the difference between 80% and 99% accuracy here determines whether your product is usable in production.

and yes, they're tackling excel files next. that alone might be worth the price of admission.

http://improvingrag.com
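The "separate representations" point in the thread can be sketched in a few lines: render the same parsed table twice, once as HTML for the LLM to reason over and once as natural-language sentences for the embedding/retrieval side. The table data, function names, and summary phrasing below are all invented for illustration, not Reducto's actual pipeline.

```python
def table_to_html(headers, rows):
    """Structured representation: preserves row/column layout for the LLM."""
    head = "".join(f"<th>{h}</th>" for h in headers)
    body = "".join(
        "<tr>" + "".join(f"<td>{cell}</td>" for cell in row) + "</tr>"
        for row in rows
    )
    return f"<table><tr>{head}</tr>{body}</table>"


def table_to_summary(headers, rows):
    """Natural-language representation: plain sentences are far easier for an
    embedding model to match against user queries than dense HTML markup."""
    sentences = []
    for row in rows:
        pairs = ", ".join(f"{h} is {cell}" for h, cell in zip(headers, row))
        sentences.append(f"Row where {pairs}.")
    return " ".join(sentences)


# Toy example echoing the checkbox case from the thread
headers = ["patient", "vaccinated"]
rows = [["A. Smith", "checked"], ["B. Jones", "unchecked"]]

html_for_llm = table_to_html(headers, rows)            # store with the document
summary_for_embedding = table_to_summary(headers, rows)  # embed this, not the html
```

At retrieval time the summary is what gets embedded and matched; once a chunk is retrieved, the paired HTML version is what goes into the LLM's context.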