@LangChain Always hiring: https://t.co/D5Ut3loFO7
LangChain🤝GEPA shout out to @bryonkuchML for contributing a PR to the GEPA repo to make it work for LangChain! You can now optimize your LangChain chains Docs: https://gepa-ai.github.io/gepa/tutorials/langchain_adapter_pair_sum_product_walkthrough/
agent builder!
Start creating agents using everyday language with LangSmith Fleet. Learn how to build no-code agents for real work. Take our free LangChain Academy course today: https://academy.langchain.com/courses/quickstart-agent-builder
RT LangChain ⏸️ Don’t pay for resources that aren’t doing anything. LangSmith Sandboxes pause automatically when idle.
open models are having a moment!
The latest finding in the LangSmith Signal: Open Models are having a moment. 1 in 3 AI teams ran an open-weights model in April 2026, up from 1 in 5 nine months ago. The overall number of teams using open weights grew 3x. We’re seeing newer users choose open models at a
llm spend is starting to get really high... a key part of our LLM gateway is spend visibility and spend control sign up for the private beta to try it out today! https://www.langchain.com/langsmith-llm-gateway-waitlist
LangSmith LLM Gateway lets you enforce spend limits and redacts PII before requests reach the model. Not after the fact.
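The "before, not after" idea can be illustrated with a small pre-request check. This is only a sketch and not the LangSmith LLM Gateway API: `SpendGuard`, `redact_pii`, `prepare_request`, the email-only regex, and the per-request cost estimate are all hypothetical names and assumptions made for this example.

```python
import re
from dataclasses import dataclass

# Hypothetical pattern: only catches simple email addresses, for illustration.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def redact_pii(text: str) -> str:
    """Mask email addresses before the prompt leaves the process."""
    return EMAIL_RE.sub("[REDACTED_EMAIL]", text)


@dataclass
class SpendGuard:
    """Tracks spend and blocks requests that would exceed the limit."""
    limit_usd: float
    spent_usd: float = 0.0

    def check(self, estimated_cost_usd: float) -> None:
        # Reject the request up front instead of discovering the overrun later.
        if self.spent_usd + estimated_cost_usd > self.limit_usd:
            raise RuntimeError("spend limit exceeded; request blocked")
        self.spent_usd += estimated_cost_usd


def prepare_request(guard: SpendGuard, prompt: str, estimated_cost_usd: float) -> str:
    """Run both checks first and return the prompt that is safe to send."""
    guard.check(estimated_cost_usd)
    return redact_pii(prompt)
```

Both checks run before any model call is made, so a blocked request costs nothing and an unredacted prompt never leaves the caller.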
Different models need different prompts, and sometimes different tools. “Harness profiles” are how we do that in deepagents.
Deep Agents v0.6 makes harness profiles a first-class abstraction. Now, you can get production-grade performance from models like @Kimi_Moonshot, @Alibaba_Qwen, and @DeepSeek_ai at 20x+ lower cost than closed frontier APIs. More on tuning: https://www.langchain.com/blog/tuning-deep-agents-different-models
RT Matt MacInnis Tonight, I'm having dinner with a bunch of high-performing @Rippling team members (something we do every year). So I asked Rippling AI to give me a dossier on everyone I'm seated with, and the quality of this output makes me giddy...
RT @ghumare64: Almost everyone is building agent harness systems the wrong way. The default move: pick LangChain or LangGraph or the OpenA…
RT Caspar Broekhuizen Governance should lead to more building, not less Agent development is so much more fun when people don't need to ration tokens because one runaway agent might blow up the bill Understand spend, cap the scary cases, and teams can experiment much more freely
Your agents can burn through $10k overnight before you notice. LangSmith LLM Gateway stops that. The platform where you already observe, evaluate, and deploy your agents now has a governance layer.
RT LangChain New in Deep Agents v0.6: ContextHubBackend A versioned home for the files that power agent behavior, backed by LangSmith Context Hub, enabling context improvements from one run to the next.
As agent harnesses become more standardized, we’re going to see a lot more “managed agent services”
Managed Deep Agents lets you create a managed Deep Agent without standing up a custom agent server. Our runtime supports: ✅ Durable threads ✅ Streaming runs ✅ Checkpointing ✅ Human-in-the-loop workflows
RT LangChain Evals shape agent behavior. Every eval is a vector that shifts the behavior of your agentic system. More evals ≠ better agents. Instead, build targeted evals that reflect desired behaviors in production. Tools like LangSmith Engine help you create targeted evals from your tracing data to build better agents.
join us for a behind the curtains look at LangSmith Engine (our agent that helps make your agent better)
One of the best ways to learn what LangSmith Engine is capable of is to talk to the team that built it. Join @bentannyhill for a live session on June 11th and see how your team can automate the agent development lifecycle. https://events.langchain.com/webinar/how-to-shorten-the-path-with-langsmith-engine/
a hot (cold at this point?) take that led us to build this: every agent in the future will need a sandbox to connect to. writing/executing code is not just for coding agents! it is useful for all sorts of tasks
A TLDR on LangSmith Sandboxes: ✅ Hardware-virtualized microVM ✅ Kernel-isolated from your services + other sandboxes. ✅ Same SDK and API key as the rest of LangSmith ✅ Any framework or custom code ✅ Now GA https://www.langchain.com/blog/langsmith-sandboxes-generally-available
RT LangChain Awesome night at #BostonTechWeek with @blitzyai
Cost controls are one of the main benefits of langsmith LLM gateway We’re seeing in a bunch of use cases adoption is getting to a point where cost really starts to matter
Managed deep agents is the easiest way to build and deploy long horizon agents Private preview, dm me if you want access
Managed Deep Agents is built for agents that need to work over long time horizons, use tools, preserve context, and produce artifacts. A few examples of what teams are building: ✅ Support + triage agents ✅ Research agents ✅ Coding agents ✅ Data analysis agents ✅ Internal ops agents
RT Viv Model-Harness-Task fit! it’s clear that RL post-training produces a model-harness fit via tool shapes and prompting as models are trained with the harness in the loop. Mentioned this in a previous LangChain blog, Cursor also has good content on this But there’s probably less talk on the importance + experimentation of Harness-Task fit. Practically this includes choices like domain specific prompting (ex: verification coding tasks) or omission of confusing context that doesn’t apply to the current task Claude Code’s harness has TONS of instructions because they’re forced to serve a very general persona of user who could ask for…anything basically. But there’s a large benefit of using a laser focused set of context and tools relevant to the narrow task at hand without all the other junk This is the Harness-Task fit Every component of a harness exists to elicit some behavior from the model. If these components are tuned to the task, then the model benefits. If they’re a mix of noise and good content, the model may be fine but it may get confused This is why the best vertical AI teams in the world build very bespoke harnesses and evals for their agents Task-Harness fit helps you rock at the exact thing your customers care about and is why builders can outperform natively post-trained harnesses
Question for harness heads: how is it possible that another harness helps a model more than the one it was RL’d against on a top-priority capability? Not a dunk, I just find this really really surprising !!
RT LangChain Agent Lake = Agents + high-scale data processing Watch our full conversation with Geng Sng, Co-Founder & CTO of @cogent_security ⏯️ YouTube: https://www.youtube.com/watch?v=D6XWu54oG4g&feature=youtu.be 🎧 Apple: https://podcasts.apple.com/nz/podcast/how-cogent-builds-ai-agents-that-have-to-be-right-every/id1891551672?i=1000769089112 🎧 Spotify: https://open.spotify.com/episode/4605K9ojyFmq1Dn4QjWeZ0?si=4d972381c1354446&nd=1&dlsi=9e2891b4ca3c467a
RT jpop 🦞 completing the loop is paramount most unrecognized step in the development lifecycle is iteration on the process itself
everyone is talking about self-optimizing loops in software & agents. but what does that actually mean? in my mind, it's a system that observes its own outputs, evaluates them, and uses that signal to improve itself in the future. the reason why it has become so popular now, is
Activity on repository
hwchase17 forked hwchase17/harbor from harbor-framework/harbor
RT Sam Crowder a few months back, it became clear to us that a large part of technical work would be driven by agents in the future. coding agents were becoming ubiquitous and highly capable. since we build a platform for technical users, we needed to update our beliefs and strategy accordingly! LangSmith Engine automates the improvement of agents by looking through recent traces and finding problems according to a taxonomy of common agent issues that we have defined. we launched the product at our annual conference last week and the reception so far has been very exciting. and we're just getting started 📈
RT Viv We’re hiring for Labs! 🧪 If you’re interested in working with us to push forward Continual Learning, pls DM me with a blurb + link to the best Applied Research you’ve done (or even better shipped!) you’d be a good fit if you have some previous research background and are excited to build real experiments on: - understanding massive amounts of Agent Trace data - building + updating Environments over time - Harness Eng + Post-training over long time horizons
code interpreter is a light weight code execution environment lets you do: - RLMs - programmatic tool calling - more! without having to spin up a full sandbox we'll be writing a lot more about the use cases here, but check it out!
RT Palash Shah you can condense long horizon evals with agents into smaller subsets that still let you test intended behavior. i'm currently evaluating an agent that runs for 30+ minutes, and analyzes thousands of traces at a time. here's my process: if you're evaluating whether X impacts Y agent output, oftentimes a lot of the information that exists in X isn't relevant to the decision of Y. i extract the reasoning out of trace, and then figure out what is the cause of a specific behavior. then, i know what situation i need to re-create when setting up my eval. and as a result, i can create a much smaller/simpler version of the long horizon eval that i can quickly use to figure out what i need to change in my prompts to get the behavior i want.
RT LangChain .@huntlovell from the @LangChain_OSS team with a great explanation on interpreters.
RT Sydney Runkle deepagents v0.6 ships w/ support for code interpreters! these are the perfect happy medium between pure tool execution and heavyweight sandboxes they give your agent an environment where it can... → keep intermediate state out of model context → call tools programmatically (and fan out in parallel) → filter + batch over large datasets before anything hits the model
RT Viv brain dump of how/why we use Evals to measure agents before & after shipping to prod 1. Good Evals simulate what our real users will do and encounter. They’re not really random benchmark tasks, they reflect our priors on likely user behaviors to make sure the agent passes those cases in the product before we just ship it 2. With that said, the best Evals often aren’t made from scratch, they’re discovered from real world Traces. I’ll literally never ship a perfect agent first try, we need user feedback and failures from Traces to make Evals and then make sure these errors don’t happen again 3. At a basic level, Evals give us some measurable, apples to apples way of comparing performance. Ex: is my agent good today and also in 1 month when I go to try a new model? 4. Evals ~= Environments, we need some place to run Eval Tasks, which is defined by the environment setup. This should mirror prod as much as possible. The more that Eval drifts from prod, the higher my Sim2Real gap and the less I can trust the numbers 5. Evals are our regression tests. Sometimes a prompt change might fix something today while breaking something I changed last week. Evals help us catch that 6. Evals are our training data. They map out what we hope the agent should be & do. We literally fit the agent to evals in hopes of making more Evals pass. Good evals —> good agent. In some rough way: Agent = fit(model, evals) 7. It’s ok if some Evals fail today if it means this Eval is simply too hard for today’s models, but it’s something to strive towards for the next gen of agents. But you should still do agent engineering today to make them pass if you can, it’s a goal to engineer towards 8. The best Eval is an Eval that actually exists. I think still today Evals are daunting because of the blank canvas problem. “Where do I even start??” But I find small Unit Test style evals are a great place to start to feel like I’m building momentum. I can get a real number + Unit T...
RT LangChain ICYMI: LangSmith Sandboxes are GA ✅ Agents get a real filesystem, shell, and package manager. Isolated from your infra. ✅ Works with Deep Agents, Open SWE, or your own code. ✅ Auth with the same API key you already have. ✅ No new runtime to build or manage.
RT Sydney Runkle working on some hex dashboards -- this is a bummer! if you use langchain/deepagents as your agent harness, you can swap models with fallback middleware when your first choice provider is down https://docs.langchain.com/oss/python/langchain/middleware/built-in#model-fallback
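The fallback idea can be sketched in plain Python. This is not the LangChain middleware linked above (see the docs for its actual interface); `call_with_fallback`, `primary`, and `backup` are hypothetical stand-ins that only illustrate trying providers in order.

```python
from typing import Callable, Sequence


def call_with_fallback(models: Sequence[Callable[[str], str]], prompt: str) -> str:
    """Try each model callable in order; return the first successful reply."""
    errors = []
    for model in models:
        try:
            return model(prompt)
        except Exception as exc:  # a provider outage surfaces as an exception here
            errors.append(exc)
    raise RuntimeError(f"all {len(models)} models failed: {errors}")


# Hypothetical stand-ins for real provider clients.
def primary(prompt: str) -> str:
    raise ConnectionError("primary provider is down")


def backup(prompt: str) -> str:
    return f"backup answered: {prompt}"
```

Calling `call_with_fallback([primary, backup], "hello")` skips the failing provider and returns the backup's reply. If every provider fails, the collected errors are raised together.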
RT Sydney Runkle as agents run for longer and with more context, there's a lot of tricky things you have to consider for deployment! super excited to dive into deploying Deep Agents for this Boston tech week meetup!
.@SydneyRunkle from the @LangChain_OSS team is hosting an in-person meet-up in Boston on 5/27 with Blitzy. In this meetup, Sydney and Dillon Jones from Blitzy will walk through what it takes to deploy long-running agents and the runtime capabilities that make it possible Save
RT Viv fun read on real tradeoffs & design decisions we debated when designing Engine for the data scale that customers produce one common thread is that we’re pretty strong supporters of just giving the agent autonomy AFTER we give it the right tooling, most of which revolved around how to interact with LangSmith agents are exceptional at selectively pulling in appropriate context as needed or diving deeper when needed with good prompting and largely our job is to help them self-facilitate the process of routing useful information into the context window
RT Viv Personally incredibly exciting at LangChain Labs to lead a focused, applied research effort into Continual Learning. A big prior is in treating the long horizon agent optimization problem as one that requires understanding signals from Agent Traces, Environments, and Production Systems at data scales that will far exceed what humans have produced before - Traces are a projection of how Agents operate in the world we drop them into. They’ll produce more data than humans have in our history and there’s a large efficient data mining effort that needs to happen to make this data usable for agent improvement - The optimization recipe for real Continual Learning on agents will look different for different companies, agents, and situations. It will include things like Eval/Environment Generation, Harness Engineering, Post-Training, large-scale contextual indexing, search & retrieval - We’re very focused on practical applications of Continual Learning in ways that improve real world agents doing real world tasks in domains like Law, Banking, Coding, etc. This makes it even more exciting to be partnering with our partners who are also doing this in the real world today @PrimeIntellect @NVIDIAAI @harvey @FireworksAI_HQ @baseten if these research directions resonate and you’re building towards this future - myself and @jakebroekhuizen and I’m sure @hwchase17 would love to hear what you’re doing 🚀
bullish on LangChain Labs. imo initiatives like this are important because continual learning for agents is fundamentally an infrastructure problem...agents need systems that can collect trajectories, extract learning signal from behavior, optimize prompts/policies and close the
RT Brace I've been thinking a lot about the two different groups of evals you need in general agents/agents which handle broad tasks: 1. Benchmark evals - this is a suite of up to 100 eval cases which test the happy paths of your agent, and its most common use cases. This isn't that comprehensive, but covers enough to where you can use it to quickly judge how well your agent handles tasks 2. Test coverage evals - this is a much more detailed suite (maybe up to 500, or more individual cases) that covers every single task you want your agent to be able to handle. It doesn't just include single tests for tasks, but multiple tests per use case, all with slightly different user prompting/trajectories There needs to be two suites for a few reasons: - general agents have so many use cases, to accurately test them, and have confidence it performs well on everything you want to support, you need many evals for each workflow - the comprehensive eval suite will become too expensive to run on any sort of recurring basis (let alone ci) think $1000's per run, esp if you're supporting multiple models. so you need a smaller suite (the benchmark eval) to quickly gauge whether or not your agent works on code changes - in general agents, agents can perform the same tasks, but via very different paths. the final result is all the user cares about, but the intermediate steps can look very different. if your eval suite doesn't cover multiple paths to reach the same result, you can't be confident your agent will actually work well in all real world scenarios your users put your agent into there's a lot more nuance here, so maybe i'll write a longer blog post on it, and how we're thinking about maintaining/building eval suites this large...
RT Brace Engine is one of the most complex production agent systems I've seen built. Definitely give this a read to see how we built it
RT Sydney Runkle deepagents v0.6 is about performance the first level at which we can control that is the model layer: how can you squeeze perf out of a model? tweaking prompts, tool names, and tool descriptions in accordance with the provider’s prompting guide can lead to substantial perf gains. we observed 10-20 jumps on subsets of the tau2 bench with just these tweaks alone. deepagents ships with a builtin set of profiles so that you can match the harness to the model with no extra effort @masondrxy and @Vtrivedy10 wrote a great blog on this! https://www.langchain.com/blog/tuning-deep-agents-different-models
RT Sydney Runkle two years ago we started building agents to automate work. turns out these are really useful, so there’s a LOT of runs and long traces that are hard to reason about now, use an ambient agent (engine) to digest and analyze that data and improve your agent over time
great deep dive into how we built LangSmith Engine lots of fun learnings and tips and tricks
RT Palash Shah this has probably been the most interesting project i've ever worked on. we created the best agent at finding problems with your agent. and we integrated it with the best observability platform to spin the cycle of agent development faster, and faster over time. there's a ton of new learnings that myself, and the team have had in the past couple of months. particularly around evals and benchmarks (evaluating an agent that runs for 30+ min is very hard) more to come soon!
RT MongoDB LangGraph.js Long-Term Memory Store is now generally available. This integration brings long-term memory across sessions, across users, and across time, with Atlas as the unified backend for checkpointing and semantic recall, powered by Voyage AI embeddings. 👇 https://mongodb.social/6016BBfGOp
RT LangChain Most teams building agents are tracing their agent and reviewing the outputs, but the flow from error identification to merged fix is manual and slow. Engine is built to close that gap. https://www.langchain.com/blog/introducing-langsmith-engine
RT Anish Agarwal Excited for this conversation with @hwchase17 , Co-Founder & CEO of @LangChain , on June 2 at the @traversal_ai office in NYC. We’ll be talking about what it actually takes to operate AI agents in production: evals, observability, model selection, architecture decisions, and lessons from deploying AI into mission-critical infrastructure systems. Harrison has been a great partner and one of the clearest thinkers in the space on how production AI systems get built and improved over time. Plus drinks and a live crêpe chef. RSVP: https://luma.com/9k5ttsg2
RT Caspar Broekhuizen basically a mechanic for your agents
calling it now. LangSmith Engine going to be our fastest growing product yet. https://www.langchain.com/blog/introducing-langsmith-engine
RT Caspar Broekhuizen This release is absolutely stacked. I want to call out a sleeper feature: Interpreters Imagine your cloud agent gets a 10k-row CSV of support tickets. Without a code runtime, the model is stuck reasoning over chunks of raw rows in context With an interpreter, it can write code to do this analysis in one turn: parse the CSV, group tickets by 𝚎𝚛𝚛𝚘𝚛_𝚌𝚘𝚍𝚎, count each group, sort by frequency, sample 3 ticket bodies from each group, and finally return a small table Interpreters handle a lot of sandbox-shaped jobs with a lighter-weight runtime that lives right inside the agent loop, no bash required
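The kind of code an agent might write for the CSV job described above can be sketched with the standard library alone. The `error_code` column comes from the post; the `body` column name, the function name `summarize_tickets`, and the seeded sampling are assumptions made for this example.

```python
import csv
import io
import random
from collections import defaultdict


def summarize_tickets(csv_text: str, samples_per_group: int = 3, seed: int = 0):
    """Group tickets by error_code and return (code, count, sampled bodies) rows."""
    groups = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        groups[row["error_code"]].append(row["body"])

    rng = random.Random(seed)  # seeded so the sample is reproducible
    table = []
    for code, bodies in groups.items():
        k = min(samples_per_group, len(bodies))
        table.append((code, len(bodies), rng.sample(bodies, k)))

    # Most frequent error codes first.
    table.sort(key=lambda r: r[1], reverse=True)
    return table
```

Only the small summary table needs to go back into model context. The raw rows stay inside the runtime.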
RT LangChain In case you're not following the @LangChain_OSS team, here's who to follow!
@masondrxy and @Vtrivedy10 are doing an awesome push on harness profiles @huntlovell doing cool stuff w code interpreter / REPL @bromann leading the charge on streaming our OSS team is stacked!
RT Hunter Lovell a lot of stuff to be excited about in this release! We're centering on two things to make deepagents great: - how do we make the most performant harness - how do we make it a joy to work with and build on See this writeup on the latest for how we're doing both. Try it out and let us know how we're doing! (dms open!)
RT Palash Shah when evaluating long running agents, all of your evals don't need to be end to end. i'm working on a proper blog about this, but in our evals for our agents that run for 30-60 minutes, we have two sets of evals. the first is end to end, provide inputs and llm as a judge over outputs. the second are incremental. in the end to end flow, there's probably 4-5 incremental steps and/or decisions that dictate how the agent performs. we write both sets of evals! one is for confidence in our overall system, and the other is confidence in the reproducibility of our agent behavior.
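A toy sketch of the two eval sets, under loud assumptions: `toy_agent`, `route`, and `draft_reply` are hypothetical, and an exact-match check stands in for the LLM-as-a-judge scoring mentioned in the post.

```python
def route(ticket: str) -> str:
    """Intermediate decision: pick a queue for the ticket."""
    return "billing" if "invoice" in ticket.lower() else "general"


def draft_reply(queue: str) -> str:
    """Final step: produce the output from the routing decision."""
    return f"Forwarded to the {queue} team."


def toy_agent(ticket: str) -> str:
    """Hypothetical two-step agent: route, then reply."""
    return draft_reply(route(ticket))


def eval_end_to_end(cases):
    """Score only the final output (exact match stands in for a judge)."""
    return sum(toy_agent(inp) == want for inp, want in cases) / len(cases)


def eval_incremental(cases):
    """Score one intermediate decision in isolation."""
    return sum(route(inp) == want for inp, want in cases) / len(cases)
```

The incremental eval runs without executing the rest of the agent, which is what makes it cheap enough to iterate on when the full run takes 30-60 minutes.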
RT Saurabh LLM apps fail in ways normal logs can’t explain. Same input, different outputs. Subtle drift. Silent hallucinations. That’s the problem space LangSmith by @LangChain was built for: observability + evaluation for LLM apps and agents.
lots of good things in 0.6 release of deepagents! great write up by sydney
RT LangChain OSS ICYMI: we shipped Deep Agents v0.6 last week, our biggest release yet!
RT Sydney Runkle announcing deepagents v0.6, our biggest release yet! it’s all about performance: at the model layer w harness profiles, agent layer w code interpreter, and at scale w streaming and delta channels context hub backend ties it all together, helping your agents improve over time!
RT LangChain ICYMI: SmithDB is our purpose-built data layer for agent observability + eval workloads. Supporting increasingly complex query patterns at low latency, over large traces, with self-hosting + multi-cloud requirements needs a fundamentally new architecture. That’s why we built SmithDB.
RT Nebius Nebius and @LangChain have partnered to integrate Nebius Token Factory with LangChain's Deep Agents. The integration, combined with LangChain's existing Tavily integration, gives teams building on LangChain a direct path to run agent workloads on production-grade AI infrastructure with open-source models, dedicated endpoints, real-time search, and full control over cost and data. Read the blog to learn more: https://nebius.com/blog/posts/nebius-and-langchain-partner-to-power-production-grade-ai-agents-on-open-models
RT Dev Anon This is one of the first real continual learning systems for agents in production. Not just monitoring. Actually getting better over time.
RT Viv LangSmith Engine is how we’re spinning the always-on, self-improvement loop for every agent - Tracing is on for every single agent - Purpose built infra with SmithDB to handle data at agent scale (more data than humans have ever produced will be produced by agents) - Ambient agentic intelligence applied to every Trace to find errors, product insights, or anything you want to look for (you can customize Engine to your needs) - PRs and Evals generated from this massive data with human gating/acceptance the data our agents produce is a gold mine of information to make agents and systems better over time the goal is that this flow shows users the first sparks of truly always on Continual Learning for their agents across their entire company
RT Viv build v1 of agent ship it (dogfooding counts) ⭐️ collect tracing data ⭐️ ⭐️⭐️ point agentic compute at data ⭐️⭐️ understand failures at scale generate evals edit agent to pass evals 🔁
RT Mason Daugherty Harness profiles! https://docs.langchain.com/oss/python/deepagents/profiles
RT Craig Certo Re @irl_danB I like what LangChain recently released in their deepagents harness, which is an adapter to modify the syntax of primitive file system commands depending on the model. Claude likes “Bash”, Gemini likes “execute”, etc. The harness should adapt to the model, and then as long as the model sticks to one syntax it will be effective in any harness
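A minimal sketch of the adapter idea, assuming a simple lookup by model family. This is not the deepagents profiles API: `HarnessProfile`, `PROFILES`, `profile_for`, and the default tool name are hypothetical. Only the "Bash" and "execute" names come from the post.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HarnessProfile:
    """Per-model overrides for how the shell tool is presented."""
    shell_tool_name: str


# Tool names taken from the post; the lookup keys and default are assumptions.
PROFILES = {
    "claude": HarnessProfile(shell_tool_name="Bash"),
    "gemini": HarnessProfile(shell_tool_name="execute"),
}
DEFAULT_PROFILE = HarnessProfile(shell_tool_name="shell")


def profile_for(model_name: str) -> HarnessProfile:
    """Pick a profile by matching a family prefix in the model name."""
    lowered = model_name.lower()
    for family, profile in PROFILES.items():
        if lowered.startswith(family):
            return profile
    return DEFAULT_PROFILE
```

The same underlying tool is registered under whichever name the chosen model handles best, so the harness adapts while the tool implementation stays fixed.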
RT Palash Shah 1/ you are probably overcomplicating the environment that your agent has to work with when running evals
RT Assaf Elovic just got back from my second langchain interrupt conf and honestly it's probably the best AI engineering conference right now. what makes it different is that it's deeply practical. less "AI will change the world" and more engineers sharing what actually breaks in production, what scales, and what they've learned building real agents at scale. also feels like one of the few conferences where the community itself is the product. people are genuinely curious, technical, and hungry to learn from each other. was awesome seeing so many teams building on top of @tavilyai throughout the event as well. and the announcements were insane too. huge kudos to @LangChain - already looking forward to next year!
RT Palash Shah it’s kind of awesome that continual learning has now come to the agent & harness level. i remember when online learning was the big craze in traditional ml a couple of years ago, that transitioned into continual learning for model training, and now for harnesses.
RT Palash Shah turns out that building evals is super super challenging even now. i thought a lot of it was table stakes but turns out it has only become harder since agents are now more complex than ever! going to start tweeting more about how i design evals, especially to create autonomous improvement loops!
RT Ankush Gola This quote from our friends at @cogent_security says a lot: “At Cogent, our background agents can produce a huge volume of traces all at once. We need live observability into those systems, and SmithDB has been able to deliver that experience: seeing traces in seconds instead of minutes, which is what we experienced when testing other providers.” When building SmithDB, we focused heavily not only on query performance but also ingestion performance. We spent a lot of time meticulously optimizing our ingestion path (batching, compressing, indexing, uploading, etc) to achieve sub-second median write latency. This is the time it takes for data to be durably stored and become available to queries from the time the data is received. Waiting tens of seconds or even minutes for your traces to show up is completely unacceptable.
We built SmithDB: the database purpose built for agent observability workloads that now powers many parts of LangSmith. Agent observability presents a challenging data problem. Agent traces can contain tens of thousands of intermediate spans and large, unbounded payloads. These
RT Git Maxd Wow! What a week! Lots of new things to learn - new ideas to form The complete Agent Development Life Cycle 🎯 ❤️ SF but can’t wait to get home and back to building! Thanks for the warm welcome and fun After Parties!- Till next time! @hwchase17 @amadaecheverria @torres_andres87 @PetralliLucas @lgesuelli_p 🫡
ICYMI: 1️⃣ LangSmith Engine 2️⃣ SmithDB 3️⃣ Managed Deep Agents 4️⃣ LangSmith Sandboxes: Now Generally Available 5️⃣ Context Hub 6️⃣ LangSmith LLM Gateway 7️⃣ Sandboxes, Prebuilt agents, + free model usage in LangSmith Fleet 8️⃣ Deep Agents 0.6 9️⃣ LangChain Labs https://www.langchain.com/blog/interrupt-2026-overview
RT Palash Shah it almost never makes sense to use real APIs for your evals. with how good coding agents have become, i will pretty much always opt to create a fake mock server for my agent to hit. the workflow is usually - fetch tasks for each eval - mock the endpoints that i need my agents to hit - create fastapi server for this - pre-fetch the real data i need - have the stub server return that pre-fetched data from start -> fully fledged mock server in under 30 minutes
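The stub-server workflow above can be sketched as follows. The post uses FastAPI; this sketch swaps in the standard library's `http.server` so it runs without extra dependencies. The `/tickets/1` path, the `PREFETCHED` payload, and the helper name `start_stub_server` are hypothetical.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import urlopen

# Hypothetical pre-fetched responses, keyed by the path the agent will call.
PREFETCHED = {
    "/tickets/1": {"id": 1, "status": "open"},
}


class StubHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Serve the recorded payload, or a 404 for paths that were not mocked.
        payload = PREFETCHED.get(self.path)
        found = payload is not None
        body = json.dumps(payload if found else {"error": "not found"}).encode()
        self.send_response(200 if found else 404)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # keep eval output quiet
        pass


def start_stub_server() -> HTTPServer:
    """Start the stub on a free local port in a background thread."""
    server = HTTPServer(("127.0.0.1", 0), StubHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

An eval would point the agent's base URL at `http://127.0.0.1:<port>`, where the port is `server.server_address[1]`, and call `server.shutdown()` when the run finishes.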
RT Sydney Runkle Re @LangChain’s Interrupt 2026 conference was a blast!! such a pleasure to with @VictorMoreira16 about deep agents! ICYMI: we just dropped v0.6, which is focused on performance at the model, harness, and context layers!
“Dependabot for LLM agent failures”
RT Saurabh Re @hwchase17 Started on this and finding it awesome; also LangSmith engine sparked an idea. The "Dependabot like for LLM agent failures". LangSmith Engine gives you the smoke detector. The natural next layer is a sprinkler system; an auto-remediation with a human approval gate. A four-stage pipeline comes to mind: Classify → Patch → Eval → Shadow Trying it and will share trace results. This is a real gap in the LLMOps ecosystem; glad to see it being closed. 🔥 Will keep updated on the progress @LangChain_OSS
RT Julia Schottenstein If your head of applied ai is not 21, ngmi. @BraceSproul and Caroline dropping knowledge at @LangChain Interrupt
RT LangChain Model. Harness. Context. The 3 main components of agents. As you build more agents, context increasingly lives in AGENTS.md, skills, policies, examples, + generated research files. Context needs its own home. That’s why we built LangSmith Context Hub.
RT LangChain Introducing LangSmith LLM Gateway: The runtime governance layer for your agents. 💸 Enforce cost limits 🔒 Detect PII ✅ Act on violations …All without leaving LangSmith. Now in Private Beta https://www.langchain.com/blog/introducing-llm-gateway
RT Brace DID SOMEONE SAY FREE TOKENS IN FLEET???? Yes it's true. Fleet now has a built in model powered by @FireworksAI_HQ that's free for all Developer & Plus plan users. Try it out today (or before @hwchase17 finds out I'm giving away free unlimited tokens...)
LangSmith Fleet now has a free model powered by @FireworksAI_HQ for Developer and Plus plans. It’s now easier than ever to get started. Try it today.
RT Niko There's a ton of unexplored space in building enterprise-grade agent harnesses that continuously improve over time. Congrats to @hwchase17 and @LangChain on the launch of LangChain Labs and excited to keep pushing the frontier with you all!
RT Brace I was going to say LangChain Labs is the most exciting thing we’ve announced recently, but with so many launches it’s hard to justify… but still incredibly thrilled to see this get started!! Check out the blog for more info
RT Prime Intellect We are excited to be partnering with @LangChain for deploying self-improving agents. Continual learning in your production environment unlocks compounding capability gains for model-product optimization. Your data. Your advantage.