Jason Liu
Bio
independent ai consultant, a16z scout, creator of instructor prev. @stitchfix @meta
Platforms
Content history
RT Aman Sanger Hard at work on composer-2!
Cursor: We've raised $2.3B in Series D funding from Accel, Andreessen Horowitz, Coatue, Thrive, Nvidia, and Google. We're also happy to share that Cursor has grown to over $1B in annualized revenue and now produces more code than any other agent in the world. This funding will allow Link: https://x.com/cursor_ai/status/1988971258449682608
Surprised OpenAI has not come up with a standardized OTel format for genetic inference and evals and allowed other platforms to follow it.
Is there no animal on object arena?
Simon Willison: Since this question shows up so often that it qualifies as an FAQ, here's my definite answer to "What happens if AI labs train for pelicans riding bicycles?" https://simonwillison.net/2025/Nov/13/training-for-pelicans-riding-bicycles/ Link: https://x.com/simonw/status/1989001665526264169
And Google
Cursor: We've raised $2.3B in Series D funding from Accel, Andreessen Horowitz, Coatue, Thrive, Nvidia, and Google. We're also happy to share that Cursor has grown to over $1B in annualized revenue and now produces more code than any other agent in the world. This funding will allow Link: https://x.com/cursor_ai/status/1988971258449682608
who are some accounts i should follow on context engineering, agents, and ai coding
How do we get this funded
anson ⁂: should have an F1-esque competition where research labs sponsor teams of layman to do tasks at expert level only using their models eg. "make the best japanese cheesecake" and then judged by culinary experts Link: https://x.com/ansonyuu/status/1988749001206960128
Who’s doing economic research at cursor. Societal impacts at cursor would be such a dream job
Michael Truell: After adopting Cursor, businesses merge ~40% more PRs each week. New economics research from the University of Chicago. Link: https://x.com/mntruell/status/1988755128401424399
It’s read every book on communication.
Tamay Besiroglu: "I’ve got you, Ron — that’s totally normal, especially with everything you’ve got going on lately." Who actually wants their model to write like this? Surprised OpenAI highlighted this in the GPT-5.1 announcement. Very annoying IMO. Link: https://x.com/tamaybes/status/1988715705722892371
Literally feels inhumane that one team could not use AI.
Anthropic: New Anthropic research: Project Fetch. We asked two teams of Anthropic researchers to program a robot dog. Neither team had any robotics expertise—but we let only one team use Claude. How did they do? Link: https://x.com/AnthropicAI/status/1988706380480385470
im horrified that gpt-4 is still available via the api, i just accidentally called it and spent 1000$ LOL
asian guy living in williamsburg building ai companions: asians are so normie
Cleobug101: post you and your celeb lookalike :-) Link: https://x.com/cleobug101/status/1986122237255029026
I wish there was a provider that could just say I want to spend $3,000 to get better skin, and they'll just tell me what to do.
Bought a red snapper and a deba. Fish chef Jason era starts today, after two years of yakitori chicken butcher Jason.
It makes sense cause the ladder is leaner fish to fattier fish. So ending with beef and then a pizza IS COMPLETELY GASTRONOMICALLY ACCURATE
lauren: my fav thing about omakase is getting a slice of pizza after Link: https://x.com/wowbestie/status/1986990812521992520
Who wants to sponsor our ai dev rel in New York
jason liu: Dev rel in ai dinner would hit. Link: https://x.com/jxnlco/status/1987249461576208452
This is how coding interviews feel
✮⋆˙𝔢𝔪𝔦˙ ⋆ ✮: he will be spared today Link: https://x.com/emiikyu/status/1987099833287778451
RT Logan Kilpatrick We just shipped some nice Gemini API updates for developers using Structured Outputs. The API now supports:
- $ref for recursive schemas
- anyOf union types
- min + max numerical constraints
- null types
- property ordering adherence
And much more!
Why Glean Builds Custom Embedding Models for Every Customer

How Glean achieves 20% search performance improvements through customer-specific embedding models, unified data architecture, and smart feedback loops that most enterprise AI companies are missing.
Lexical Search in RAG Applications

Guest lecture with John Berryman on traditional search techniques, their application in RAG systems, and how lexical search complements semantic search.
Why Your AI Is Failing in Production (Ben & Sidhant)

AI monitoring, production testing, and data analysis frameworks for identifying issues in AI systems and implementing structured monitoring.
Rethinking RAG Architecture for the Age of Agents

CTO of Sourcegraph explores how the evolution of AI models has fundamentally changed agent architecture, requiring a complete rethinking of context management, tool design, and model selection.
Data Organization and Query Routing for RAG Systems

Guest lecture with Anton Troynikov from ChromaDB on organizing data for retrieval systems, query routing strategies, and optimizing vector search performance.
The RAG Mistakes That Are Killing Your AI (Skylar Payne)

Common RAG anti-patterns across different industries and practical advice for improving AI systems through better data handling, retrieval, and evaluation practices.
Why I Stopped Using RAG for Coding Agents (And You Should Too)

Why leading coding agent companies are abandoning embedding-based RAG in favor of direct, agentic approaches to code exploration that mirror how senior engineers actually work.
Why Most Document Parsing Sucks (Adit, Reducto)

A conversation with Adit, CEO of Reducto, covering challenges of document ingestion, parsing tables and forms, hybrid CV + VLM pipelines, and optimizing representations for reliable AI systems.
How Extend Achieves 95%+ Document Automation (Lessons from Eli Badgio)

Insights from Eli Badgio, CTO of Extend, on mapping document workflows, building task-specific evaluations, and implementing partial automation with human-in-the-loop approaches for 95%+ extraction accuracy.
Text Chunking Strategies for RAG Applications

Technical session with Anton from ChromaDB on text chunking fundamentals, evaluation methods, and practical tips for improving retrieval performance.
How OpenBB Ditched APIs and Put RAG in the Browser (Michael Struwig)

A novel approach to RAG systems that leverages the browser as a data layer, connecting agents to sensitive data without traditional APIs.
Why Grep Beat Embeddings in Our SWE-Bench Agent (Lessons from Augment)

Insights from Colin Flaherty on building autonomous coding agents and how agentic approaches reshape retrieval-augmented generation systems.
RAG Master Series: Complete Guide to Retrieval-Augmented Generation

Comprehensive guide to building, improving, and scaling RAG systems. From fundamentals to advanced enterprise implementations with real-world examples and proven strategies.
Coding Agents Speaker Series: Lessons from Industry Leaders

Deep insights from the teams behind Devin, Amp, Cline, and Augment on building effective coding agents. Learn why simple approaches are winning over complex architectures in autonomous coding systems.
Do Your Engineers Know How to Leverage AI?

A letter to the readers of the AI Coding Accelerator.
Domain Experts: The Lever for Vertical AI

How to successfully apply LLMs in specialized industries by building domain‑expert review loops, augmenting prompts with expert knowledge, and earning customer trust.
Why Google Search Sucks for AI (Will Bryk, Exa)

How AI is changing search requirements and the technical challenges of building a semantic search engine designed for AI applications rather than human users.
Why Cognition does not use multi-agent systems

A deep dive into why multi-agent systems might not be the optimal approach for coding contexts, exploring context engineering, challenges of context passing between agents, and how single agents with proper context management can outperform multi-agent setups.
Stop Trusting MTEB Rankings (Kelly Hong, Chroma)

A deep dive into generative benchmarking - creating custom evaluation sets from your own data to better assess embedding model performance.
The 12% RAG Performance Boost You're Missing (Ayush, LanceDB)

Practical approaches to enhancing retrieval quality through fine-tuning, re-ranking, and understanding trade-offs in RAG systems.
RT jason liu the hidden complexity of document parsing that's killing your rag pipeline

i just spent an hour with adit from reducto, and it's clear why the best ai teams in finance, legal, and healthcare are working with them on document parsing. here's what most teams miss:

vision language models fail in surprising ways

even the best vlms hallucinate table values, drop rows, and misread form fields. the problem isn't reasoning, it's getting clean inputs. those "minor" parsing errors compound catastrophically downstream.

don't believe me? try parsing a checkbox in healthcare data where "checked" vs "unchecked" determines patient vaccination status. vlms get this wrong at alarming rates.

hybrid approaches outperform pure vlm pipelines

the most effective approach isn't just throwing everything at gpt-4v:
• traditional cv for clean structured content (better bounding boxes, confidence scores)
• vlms for handwriting, charts, and complex visual elements
• multi-pass correction with specialized models for error detection

representation matters more than you think

for simple structures, markdown works. for complex tables with merged cells? html preserves the structure that models need to reason correctly.

the most surprising insight: embedding models and llms have fundamentally different limitations. your table might be perfectly parsed for the llm to reason over, but embedding models can't match user queries to that dense html structure.

solution? create separate representations optimized for each:
• html/structured format for the llm to reason with
• natural language summaries of the same content for embedding/retrieval

most teams are still treating document parsing as a solved problem when it's actually where most rag pipelines break. the difference between 80% and 99% accuracy here determines whether your product is usable in production.

and yes, they're tackling excel files next. that alone might be worth the price of admission.

http://improvingrag.com
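
The "separate representations" idea in the thread above can be sketched roughly as follows. This is a minimal illustration and not Reducto's implementation: `TableChunk`, `build_chunk`, and `retrieve` are hypothetical names, and a simple token-overlap score stands in for a real embedding model. Each parsed table is stored twice, once as a natural-language summary used only for matching queries and once as HTML that is handed to the LLM after retrieval.

```python
from dataclasses import dataclass
from html import escape


@dataclass
class TableChunk:
    html: str     # structured form, passed to the LLM for reasoning
    summary: str  # natural-language form, used only for retrieval


def table_to_html(header, rows):
    """Render the table as HTML so row/column structure is preserved."""
    head = "".join(f"<th>{escape(str(h))}</th>" for h in header)
    body = "".join(
        "<tr>" + "".join(f"<td>{escape(str(c))}</td>" for c in row) + "</tr>"
        for row in rows
    )
    return f"<table><tr>{head}</tr>{body}</table>"


def table_to_summary(caption, header, rows):
    """Flatten the same table into plain sentences for the retriever."""
    sentences = [f"{caption}."]
    for row in rows:
        sentences.append(", ".join(f"{h} is {c}" for h, c in zip(header, row)) + ".")
    return " ".join(sentences)


def build_chunk(caption, header, rows):
    return TableChunk(
        html=table_to_html(header, rows),
        summary=table_to_summary(caption, header, rows),
    )


def _tokens(text):
    return set(text.lower().replace(",", " ").replace(".", " ").split())


def retrieve(query, chunks):
    """Match the query against summaries, but return the HTML form.

    Token overlap is an assumption standing in for embedding similarity.
    """
    query_tokens = _tokens(query)
    best = max(chunks, key=lambda c: len(query_tokens & _tokens(c.summary)))
    return best.html


chunks = [
    build_chunk("Patient vaccination status", ["patient", "vaccinated"],
                [["A", "yes"], ["B", "no"]]),
    build_chunk("Quarterly revenue", ["quarter", "revenue"],
                [["Q1", "10"], ["Q2", "12"]]),
]
context_for_llm = retrieve("which patient has vaccinated status", chunks)
```

In a real pipeline the summary would be embedded and indexed in a vector store, while the HTML would be kept alongside it as the payload returned to the model.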