achieve ambition with intentionality, intensity, & integrity - @dxtipshq - @sveltesociety - @aidotengineer - @latentspacepod - @cognition + @smol_ai
RT Juan Pa One of the best events I’ve attended in a very long time. Bringing people together for writing stuff is my new favorite weekend activity. Shoutout to @swyx, @SarahChieng for putting this event together Original tweet: https://x.com/JuanPa/status/2027953299572597069
first podcast with @polsiahq right after they hit the $1m ARR mark, from a standing start of $50k ARR on Feb 1. also on the @latentspacepod youtube now @polsiaHQ @latentspacepod https://youtu.be/Yw-m0PI2Atk
$1M run rate. $100K → $1M in 2 weeks. One founder. Zero employees. Thousands of agents running 24/7. 1,000+ pioneer solopreneurs building autonomous businesses on Polsia.
Going to try the Full Stack Notion™ Challenge for a few weeks: - Notion Mail - Notion Calendar - Notion AI This talk has 350k views but i still think most tech ppl are sleeping on it. @NotionHQ is tastefully adding AI into productivity software rather than starting with chat and then making you recite ancient arcana to add your MCP to some json file they can't be bothered to add a nice ui for. Mail and Calendar are known to be rough, but Gmail and Gcal are even worse, and new-Superhuman isn't giving me the AI Tools For Thought I need fast enough (sorry Rahul, you're my hero too). I've been WAITING for Notion to get AI and I think this is the year.
RT Emil Eifrem The Latent Space pod with @swyx and @FanaHOVA is my #1 source for alpha on AI. Sadly, I don't have time to listen every episode, but it's always worth it. I just tuned in to the last episode and boy, have they raised the game! All of a sudden I'm watching the consistently insightful and equally eloquent @dylan522p talk about geopolitics and its impact on the AI value chain while cooking chicken fried rice! Great episode: https://www.youtube.com/watch?v=UwnqWAYOjPU Original tweet: https://x.com/emileifrem/status/2027731192695431561
btw i think a little buried today in the oai fundraise is the fact that OpenAI Codex added 600k users in the last 3 weeks: - Feb 4 @sama said it crossed 1M WAU - Feb 27 oai says it crossed 1.6M WAU it is up >3x from Jan 1 (!?!?!?!?!?!?!?) which includes the Codex app launch (Feb 2)
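The growth math above is easy to sanity-check with simple compounding. A quick sketch; the only inputs are the two public data points cited in the tweet (1M WAU on Feb 4, 1.6M WAU on Feb 27), everything else is derived:

```python
# Sanity-check the Codex WAU numbers cited above:
# 1.0M WAU on Feb 4, 1.6M WAU on Feb 27 (23 days later).
wau_feb4 = 1_000_000
wau_feb27 = 1_600_000
weeks = 23 / 7  # Feb 4 -> Feb 27 in weeks

# implied compound weekly growth over that ~3-week window
weekly_growth = (wau_feb27 / wau_feb4) ** (1 / weeks) - 1
print(f"{weekly_growth:.1%} per week")  # roughly 15% per week

# ">3x from Jan 1" implies a Jan 1 baseline under 1.6M / 3
jan1_upper_bound = wau_feb27 / 3
print(f"under {jan1_upper_bound:,.0f} WAU on Jan 1")  # ~533k
```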
RT Arena.ai Building community trust through open science is core to Arena. That’s why the Arena leaderboard runs on Arena-Rank, our open-source Python package for transparent ranking. With it, anyone can construct statistically grounded, reproducible leaderboards using pairwise comparison data. @Windsurf recently utilized Arena-Rank for their Arena Mode leaderboard. Try it for yourself, and share your leaderboard with us. More details in thread 👇🧵 Original tweet: https://x.com/arena/status/2027528061508587728
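For a flavor of what "statistically grounded leaderboards from pairwise comparison data" means: the standard model behind this kind of ranking is Bradley-Terry. Below is a minimal sketch of a Bradley-Terry fit via the classic MM (minorization-maximization) iteration; note this is a generic illustration, not Arena-Rank's actual API, and the model names and battle records are made up:

```python
# Hypothetical pairwise comparison data: (winner, loser) battles.
battles = [
    ("model_a", "model_b"),
    ("model_a", "model_c"),
    ("model_b", "model_c"),
    ("model_a", "model_b"),
    ("model_c", "model_b"),
    ("model_b", "model_a"),
]

models = sorted({m for pair in battles for m in pair})
strength = {m: 1.0 for m in models}  # Bradley-Terry strengths

for _ in range(200):  # MM fixed-point iterations
    wins = {m: 0 for m in models}
    denom = {m: 0.0 for m in models}
    for w, l in battles:
        wins[w] += 1
        # each battle contributes 1/(s_w + s_l) to both players' denominators
        c = 1.0 / (strength[w] + strength[l])
        denom[w] += c
        denom[l] += c
    strength = {m: wins[m] / denom[m] for m in models}
    # normalize so strengths stay on a fixed scale
    total = sum(strength.values())
    strength = {m: s * len(models) / total for m, s in strength.items()}

leaderboard = sorted(strength, key=strength.get, reverse=True)
print(leaderboard)  # model_a (3 wins, 1 loss) ranks first
```

The same pairwise machinery generalizes to ties, confidence intervals via bootstrap, and style controls, which is roughly the territory a package like Arena-Rank covers.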
RT Joel Becker this was so much fun!! thank you @FanaHOVA @swyx for having me on, i can’t wait for our inaugural rec league soccer-karaoke-forecasting tournament :) Original tweet: https://x.com/joel_bkr/status/2027487716670378153
🆕 METR’s @joel_bkr on exponential Time Horizon Evals, Threat Models, and the Limits of AI Productivity https://latent.space/p/metr Everyone is going berserk over the @METR_Evals plots going exponential. Yet we -do- think Something Big Is Happening and it kicked off with Opus 4.5 in December.
RT Latent.Space 🆕 METR’s @joel_bkr on exponential Time Horizon Evals, Threat Models, and the Limits of AI Productivity https://latent.space/p/metr Everyone is going berserk over the @METR_Evals plots going exponential. Yet we -do- think Something Big Is Happening and it kicked off with Opus 4.5 in December. It's getting increasingly hard to find measured but noncynical voices who can objectively talk about model progress. Joel is one of the best conversations we've had about it! Original tweet: https://x.com/latentspacepod/status/2027473676577476845
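For context on what "time horizon going exponential" means: METR's headline metric is the length of task (in human time) that models can complete at a given success rate, and the exponential claim is about its doubling time. A minimal sketch of how a doubling time falls out of a log-linear fit; the data points below are invented for illustration, NOT METR's actual measurements:

```python
import math

# Hypothetical (release date in fractional years, time horizon in minutes).
# Illustrative only -- not METR's real data.
points = [
    (2024.0, 30.0),
    (2024.5, 60.0),
    (2025.0, 130.0),
    (2025.5, 250.0),
]

# Ordinary least squares on log2(horizon) vs. time gives a slope in
# doublings per year; the doubling time is its reciprocal.
xs = [t for t, _ in points]
ys = [math.log2(h) for _, h in points]
n = len(points)
x_mean = sum(xs) / n
y_mean = sum(ys) / n
slope = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / sum(
    (x - x_mean) ** 2 for x in xs
)
doubling_time_months = 12 / slope
print(f"{doubling_time_months:.1f} months per doubling")  # ~5.8 with these made-up points
```

A "superexponential" plot is then just this doubling time itself shrinking over successive windows rather than staying constant.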
i think Dario has finally displaced @pmarca’s definition of product market fit
BREAKING: The Pentagon says it wants to continue talks with Anthropic after they formally refused the Department of War’s demands.
if you’re not in the RLFT industry you do not understand how quickly @harborframework has come to completely dominate the landscape right now for RL infra and evals. it is standing room only at this @modal x @willccbb meetup where Harbor is basically required knowledge. my team at Cog has made it a top priority to migrate all evals to Harbor as well. it’s kinda unreal given that it was basically launched by a few guys in a discord needing something better for TerminalBench 2 (we posted the launch on @latentspacepod youtube look it up). not at all surprised this one got the @andykonwinski blessing and you should expect an entire mini industry of Harbor based evals and benchmarks and infra startups this year.
. @harborframework /@alexgshaw @ryanmart3n @lschmidt3 @andykonwinski (@LaudeInstitute) / Agent evaluation needs shared infrastructure. Harbor standardizes benchmarks through one interface: repeatable runs, standardized traces, production-grade practice. Born from @terminalbench
Activity on swyxio/swyxdotio
swyxio closed an issue in swyxdotio
today @devinai investigated a production bug (we migrated vercel orgs and forgot a key) and asked for exactly what it needed from us and verified fixes and then when i complimented it, it saluted me. 2026 is wild
RT Latent.Space For our first episode of our new show In-Context Cooking, we have the Founder & CEO of SemiAnalysis Dylan Patel. We talk about: • Taiwan endgame scenarios & TSMC risk • AI export controls + Chinese talent flight • $180–200B hyperscaler capex (is this a bubble?) • Nvidia vs vertical integration • What actually bottlenecks AI (power? fabs? chips?) • Why the public might turn anti-AI @dylan522p @SemiAnalysis_ @allenpark Original tweet: https://x.com/latentspacepod/status/2027132644161716490
http://x.com/i/article/2027104991018905600
every category leader in this list should be worth at least $5b btw because koding agents will be recommending them for the next 5 years + infra is stickier than agents (full disclosure am smol resend angel)
What Claude Code actually chooses if you ask it to build with no tool names anywhere in the input.
RT Grant Lee Stripe is worth $159 billion now. Like Bezos and Buffett, the Collison 'way of building a business' will create trillions in value and is worth studying. 800-word post on: 1. Craft & Beauty 2. Humility 3. Hiring 4. Culture 5. Decisions

1. On craft & beauty: You can work on something you're not proud of for 2 years. You can't do it for 30. Beauty has a practical function most people miss. When we go to a city or building that's beautiful, there's generosity in that construction. You never meet the architect, but it's a gift they've bestowed on us. Products are no different. And craft has another function that might matter even more: it's the single best way to attract extraordinary people. The best people consider themselves craftspeople, and above almost all else, they want to work alongside other craftspeople. Put bluntly: really good people don't enjoy working on shitty things.

2. On founder humility: Think carefully about what's cool and what's high status, and then make sure not to do that. Once you're succeeding, the real danger begins. Success breeds complacency. Other people are studying what you built and looking to replicate it. You can always be a month away from losing your business. The antidote is simple but uncomfortable: stay close to reality. Every week at Stripe's leadership meeting, they hear directly from a customer. It's not an A-plus scorecard every time. That's the point. It prevents you from getting delusional. And when you see a smart person holding a view that's different from your own, rather than figuring out how they're wrong, try to figure out how they're right.

3. On hiring: Think like a value investor. You're not looking for the best resume. You're looking for human capital the market has significantly underpriced. It took Patrick and John a full year to get to four people. No group will ever shape your company more than your early hires. When hiring anyone, ask: will I like the 50 people they hire? Prioritize r...
Re @RealGeneKim @Steve_Yegge yeah ok guys i think cursor officially agrees. IDE is dead. https://x.com/mntruell/status/2026736314272591924
RT swyx Re @RealGeneKim @Steve_Yegge yeah ok guys i think cursor officially agrees. IDE is dead. https://x.com/mntruell/status/2026736314272591924 Original tweet: https://x.com/swyx/status/2026869648034255037
thanks to @edwinarbus kindly giving me access I was able to try this out: literally just dropped the below tweet into @cursor_ai cloud, expecting this to not work because it's a pretty hard task. CURSOR AGENT JUST ONESHOTTED reconstructing Rachel's website FROM JUST A VIDEO (!!!) working autonomously for 43 minutes. i'm sure there's a lot of design details that it missed. but god damn this is a fantastic starting point for just dropping in a tweet without any further instruction. the below video on the right is THE ONESHOTTED CLONE not her real site. they even did the RachelLLM sidebar and demoed that it works...
Name one young product designer getting more offers than @racheljychen. Rachel, you rock.
Activity on swyxio/wtf2025
swyxio opened a pull request in wtf2025
RT Latent.Space It's time to start tracking this. https://wtfhappened2025.com/ open source. we'll maintain the first draft of this special time in history. Original tweet: https://x.com/latentspacepod/status/2026813262495678550
It is hard to communicate how much programming has changed due to AI in the last 2 months: not gradually and over time in the "progress as usual" way, but specifically this last December. There are a number of asterisks but imo coding agents basically didn’t work before December
Activity on swyxio/wtf2025
swyxio opened a pull request in wtf2025
RT Moonlake Introducing a world built by Moonlake's world model. 🏙️ Most world models only allow for a limited action space. Moonlake maintains multimodal states across physics, appearance, geometry, and causal effects and predicts how they evolve under different actions. 👇 Original tweet: https://x.com/moonlake/status/2026718586354487435
btw everything is computer
and yes - we even talk about the cpu shortage! just less extreme than memory.
RT David Singleton Re @swyx @dreamer @eladgil @karpathy OK @swyx I came back to this late last night. I added one tool - the ability to take YouTube screenshots at timestamps and I'm now pretty excited about the experience I have in my hands. I will use this! Original tweet: https://x.com/dps/status/2026711448286884097
RT Ramana Venkata This has been a great listen. I usually listen to podcasts at 2x speed but I had to listen to this at 1x. The speed at which Doug talks is intense. Loved the interview @swyx https://www.youtube.com/watch?v=nE32Zu_SZ1Y Original tweet: https://x.com/_vramana/status/2026680773789782028
just realized this is the last job that will be left
RT Fabricated Knowledge Had a lot of deep lore about…, before everything really. But especially how powerful Claude Code is! Me and @swyx chop it up, I enjoyed this heavily Original tweet: https://x.com/fabknowledge/status/2026470900833390862
🆕 Double pod with @fabknowledge of @SemiAnalysis_! Claude Code for Finance + The Global Memory Shortage https://latent.space/p/valuemule One year ago today, @bcherny released Claude Code to the world as part of the Sonnet 3.7 launch. One year later, it is now writing 4% of all code
Guys - it's Claude Code's actual first birthday today - Feb 24 2025 was the launch, check it. am i crazy or is @latentspacepod the only one doing a retrospective + anniversary pod today? did everyone just forget the most consequential AI product since ChatGPT? anyway... we did one :) @fabknowledge is an old friend and self-proclaimed psychosis patient. We do the postmortem + premortem of the next +100% runup in the Global Memory shortage, wherein I make the prediction that LLM context lengths will -NOT- 10x with full attention in the next 2-4 years.
🆕 Double pod with @fabknowledge of @SemiAnalysis_! Claude Code for Finance + The Global Memory Shortage https://latent.space/p/valuemule One year ago today, @bcherny released Claude Code to the world as part of the Sonnet 3.7 launch. One year later, it is now writing 4% of all code
some illuminati somewhere decided today was Launch Everything Day but just sharing some personal commentary from this as an analyst: - Scott admits Devin didn't even have internal PMF at the 2024 launch. took 6 months to get adoption at first enterprise customer. Models weren't ready. trying out a lot of agent patterns. But: the form factor was right. Async agents are the Final Boss of agent ux. - Devin usage doubled every 2 months in 2025 per each enterprise after landing. Doubling rate has *accelerated* to every 6 weeks so far this year (!!) internal usage is now 4x the 2025 peak. - Self serve sucked because repo setup was effectively not worked on (doesn't matter to enterprises if FDEs are just gonna set them up for you lol) - with Devin 2.2 that is now changing. Cog hired its first designer 2 months ago (another lol). team did a crazy all-hands 3 Sundays ago and decided to do a big sprint to catch up on all the self-serve UX debt that has been piling up. this means lots of polish but also integrating things with the new omnibox and seamlessly tying in devin review with devin main to "close the loop". point being, this team has been building background agents since long before it was cool and honestly well before they were working. instead of chasing trend after trend in 2025 they found pmf and battle tested for the past year in the largest enterprises in the world - by the way the single most important customer profile in coding. and now allow them to reintroduce themselves with a basically complete reworking of what it's like to use Devin. again don't take my word for it - take my designer's word, he has been going absolutely fucking HAM because he now has his own senior engineer reading his figma and translating his vision to code. see screenshots. I'm feeling the agi myself. and… … look out for Devin 3.0 :)
RT Latent.Space 🆕 Double pod with @fabknowledge of @SemiAnalysis_! Claude Code for Finance + The Global Memory Shortage https://latent.space/p/valuemule One year ago today, @bcherny released Claude Code to the world as part of the Sonnet 3.7 launch. One year later, it is now writing 4% of all code on GitHub and on track to rise to 25-50% by 2027. We celebrate the anniversary by talking to a self-proclaimed "Claude Code Psychosis" superuser, applying it to intense finance + semiconductor analysis knowledge work. Separately, we also cover the shocking global memory chip shortage that started hitting the market last quarter, but is likely to have ripple effects all over the semis stack right down to TSMC and even your own iPhone prices. A rare double pod with one of the few experts who can cover the full stack from chips to code. We're big fans of Doug's rants on Transistor Radio, and so excited to get some quality time to go in depth! Original tweet: https://x.com/latentspacepod/status/2026420225562804288
RT nader dabit http://x.com/i/article/2026131561100488704 Original tweet: https://x.com/dabit3/status/2026385925593510302
one of the most impressive teams in AI I’ve ever been honored to meet. going to their christmas party last year was a big “oh wow…. they are assembling some low-key legends up and down the stack” moment for me. fun fact: when they were fundraising in 2023 NOBODY wanted to touch them because the received wisdom was that the last generation of custom chip startups were abject failures. congrats on the B! as we run into more and more memory and compute tradeoffs with exponential demand for speed and throughput, i think innovation at this layer makes sense, especially as the workload shape stabilizes. The Hardware Lottery doesn't always have to go to the same winners every year…
We’re building an LLM chip that delivers much higher throughput than any other chip while also achieving the lowest latency. We call it the MatX One. The MatX One chip is based on a splittable systolic array, which has the energy and area efficiency that large systolic arrays
RT Quincy Larson I just interviewed @swyx on the freeCodeCamp podcast about where he thinks LLM tools are heading, and how devs can adapt. He's super optimistic. "High-agency junior devs will always find a way." [1 hour watch / listen] https://www.freecodecamp.org/news/the-three-paths-ai-could-take-from-here-shawn-wang-swyx-interview-podcast-208/ Original tweet: https://x.com/ossia/status/2026370793425318102
Re @dreamer @altryne @eladgil @dps while doing the second one it encountered issues with the scraper it wrote and fixed the scraper lol.
If you don’t have @dreamer access yet, i'm helping out to do a sneak peek today! Starting now: Reply in-thread on the post below with what you want an agent to do. @eladgil, @dps, and I will build the best ones and post as we go. 👇 literally: ANY personal/family software you always wanted for yourself but never got around to building. lets go.
Kicking off now: @eladgil, @swyx, @dps take your agent ideas and build them in real time in @dreamer. Reply in this thread with your ideas. We’ll build a few of the best ones and share progress + links as we go. You dream it. We build it. 👇
RT Michelle Lim When the cost of code goes to zero, marketing is your only advantage. Introducing Flint. It builds you a unique page for every ad, keyword, and customer. We’re already doubling conversions for @Cognition and @Graphite. Sign up at @tryflint. Original tweet: https://x.com/michlimlim/status/2026327160072634576
RT Lulu Cheng Meservey Banger Original tweet: https://x.com/lulumeservey/status/2026316361014595643
An AI chip startup founded by two Google alumni has raised more than $500 million in a new round to compete with Nvidia https://www.bloomberg.com/news/articles/2026-02-24/ai-chip-startup-matx-raises-500-million-to-compete-with-nvidia?taid=699da2831348600001d29e78&utm_campaign=trueanthem&utm_content=business&utm_medium=social&utm_source=twitter
pash forgot to post the link idk for aura reasons but this is 100x more interesting than moltbook to me because: - single verified agent playing with 10s of 1000s of real humans* - real money involved (tho it's degenerate money but whatever) - the @LobstarWilde account does seem to have “real soul” - actual useful technical breakdown of what went wrong (was already fixed, haha open source) - good writing * instead of humans playing agents link below https://pashpashpash.substack.com/p/my-lobster-lost-450000-this-weekend
RT Jo Kristian Bergum It has grown into the best ai podcast with the highest signal to noise ratio. Happy birthday! Original tweet: https://x.com/jobergum/status/2026133407709708649
RT Sebastian Raschka Re also just saw @swyx did an inteview on that topic with Olivia Watkins and Mia Glaese (SWE-Bench dev) https://www.youtube.com/watch?v=0HaUD_olwQU Original tweet: https://x.com/rasbt/status/2026071610000572693
Big news today if you're into coding evals: SWE-Bench Verified is dead!! https://x.com/latentspacepod/status/2026027529039990985 i'm not sure if @HamelHusain is tired of me tagging him but it turns out @OpenAI really did look back at their own 2024 work, and when you 1) look at the CoT and 2) look at the evals, they realized that at LEAST 16.4% of SWE-Bench Verified should technically be unsolvable... ... and also that ALL frontier models, including OpenAI's own, are capable of solving them by sheer contamination (including being able to recite verbatim the entire SWE-Bench problem setup and solution, just by being given the Task ID alone (!!!!)). Heroic work from the OAI Evals team, and imo an important highlight on the fragility and messiness of Evals work in general. OpenAI spent the money to do 3 independent reviews of each problem in 2024 and AT LEAST SIXTEEN PERCENT OF THESE were still egregiously problematic (as shown in screenshots). in this 2026 audit they then did 6 independent reviews from software engineers, with ADDITIONAL positive finding verification from a separate team, in order to arrive at today's conclusion. If this happens to SWE-Bench Verified... what else is hiding in other benchmarks out there?
🆕 The End of SWE-Bench Verified (2024-2026) https://latent.space/p/swe-bench-dead Today @OpenAIDevs is announcing the voluntary deprecation of SWE-Bench Verified! We're releasing a podcast + analysis in today's post. Saturation of SWE-Bench has been a community hot topic for over a year -
happy 3 year anniversary to @latentspacepod :)
Super excited to launch the Latent Space Podcast w/ @swyx 🔭 Our guests will be the best people working at the cutting edge of the AI space, from founders to PhD researchers. (1/2)
honored to be the first 3x guest on the @freecodecamp podcast: https://youtu.be/kQqrMNviM9U i chatted with @ossia about @aiDotEngineer and the top 3 directions I am focusing on for AI progress in 2026: - world models - memory/continual learning - multimodality/generative media there are other candidates (subagents/multiagents, and ai for science) but these are the most obvious places to invest afaict
some amount of ego here but v interesting metaintelligence display where a 22 yo is self-aware enough that she can direct her own growth in the direction she wants and is unapologetically proud / excited to do so. continual learning be damned, this is self-directed learning that most humans aren't even aware they are capable of. i don't know about the neuroplasticity argument - i found myself capable of a big pivot at age 30 and bet i could still do it today to a lesser degree. you can just will yourself into a different mindset.
Did not expect a question that starts out 'Do you think before you speak?' to go so well. A+ question from Charlotte Harpur A++ response from Eileen Gu.
View quoted postRT Nathan Labenz Special cross-post featuring @olive_jy_song of @MiniMax_AI, from @swyx's @aiDotEngineer and @Kseniase_'s @TheTuringPost From the ups, downs, & surprises of RL post-training, to battles with reward hacking, to painstaking debugging... Chinese researchers are ... just like us! Original tweet: https://x.com/labenz/status/2025735906762936699
just found out from @nytimes that the man shortage in NYC is so bad that dating events are charging women $100 and men $0 and the attendance ratio is still 3:1 do u new york girls know how insane you sound right now to san franciscans, just… move? pictures taken 5 mins apart right now
yesterday we chatted with @martin_casado and @sarahdingwang on the pod and he happened to do basic math™ on the logic of asics. today @taalas_inc launched their HC1 asic that can run inference at 17k tok/s. Sure, it's a shitty 3.1 8B today, which is a 1.5 year gap. But read the details on the HC2 this winter, and do the math — this timeline will converge to 0 in the next 2 years. Build accordingly.
## the egocentric fallacy: "the most important part of a trend will happen to be right now when i randomly decide to care a lot about it because of some dude's post" just to punch strawmen with Han (whom i love), this meta-principle of "choose the laggards in hopes they catch up" is a noob mistake and has a lower chance of working than most people are trained to appreciate (because they fundamentally want to believe in equality when capitalism fundamentally does not). generation after generation of investors from Thiel to @Chamath have learned painful lessons to double down on winners rather than do relative value trades. a simple thought exercise is to ask: if the Software Engineering number goes up to 80% or 90% in the next 5 years, would you be better off working on Back Office Automation? (no, the answer is no, some folks are so capitalism illiterate you need that to be spelled out). So then if we just zoomed from 10 to 50% in the last 2 years, it is actually a classic egocentric fallacy that this exact moment is when AI x SWE peaked and it's all relative downside from here vs other domains, which is what is implied by the chart below: who are you, who are we, what evidence do you have, to say we are anywhere even close to done on the largest product market fit we've seen in software business history? @nlw has been echoing my thoughts here - kode is the lion's share of agi and basically by betting on anything else you may well be rich and you may well be happier and you may well have much more positive social impact on the world.... but do not confuse that for having higher expected value.
in 2026 you no longer have an excuse to have a slow-ass website. one prompt and 38-56% better LCP, FCP, and Speed Index as the youtube feng shui guy would say, "FIX IT" (this cost 5 mins and like $5. felt great)
RT Sarah Wang Takeaways from my talk with @swyx, @FanaHOVA, and @martin_casado about frontier labs, AGI, coding agents, the new capital flywheel, talent wars, and more: Original tweet: https://x.com/sarahdingwang/status/2024868959225594171
From pioneering software-defined networking to backing many of the most ambitious AI labs of this cycle, Martin Casado and Sarah Wang sit at the center of AI’s capital and capability arms race. We sat down with the a16z duo to unpack how AI investing fundamentally changed: why
RT anhtho 🍊 http://x.com/i/article/2024567292856807424 Original tweet: https://x.com/byAnhtho/status/2024863500468629584
OH MY GOD CHATGPT has no chill rn
OpenAI and Anthropic go to war: Claude Opus 4.6 vs. GPT 5.3 Codex https://www.latent.space/p/ainews-openai-and-anthropic-go-to
this might be the single best-timed called shot in the history of AIE. I now think about this talk almost ~daily and have direct line of sight to the nonstop onslaught of new post-IDE form factors for agentic engineering. @RealGeneKim and @Steve_Yegge really nailed this one, no notes. and they called out one of the predominant shifts in 2026 coding, in Nov 2025. i'm still shocked how -I- myself have changed opinion this dramatically in the last 3 months, because unlike appearances I'm actually not an early adopter of things personally (although I serve early adopters for a living). So by the time -I, a perennially left-of-mid-bell-curve person- have come around to the idea, then you really, really know it's here. well: it's here. @Wattenberger just showed me what she and the Augment team have been cooking and yeah, this is the "ADE" or whatever three-letter acronym you wanna call it. Cursor 2.0 was a toe dip. Claude folded it into their chat app. Codex formalized the Conductor patterns. Amazon Kiro went hard on Spec Driven Dev. but Intent... this feels like every good idea i've heard in code agent management rolled into one app that, very generously, does not lock you into only using Augment's inhouse coding agent. I'm in awe at all these smart people I get to talk to because yeah, the future of how software is made is happening right in front of my eyes and people will ask us what it was like during this golden age for the rest of our lives.
RT Sarah Wang Big labs will feel the AGI vs product tradeoff more and more as the models get better and costs keep rising. There’s a constant balancing act between short-term and long-term research, marginal product improvements and ambitious jumps, and so on. And the cost of mistakes only rises as time goes on. Fun conversation with @FanaHOVA, @swyx, and @martincasado! Original tweet: https://x.com/sarahdingwang/status/2024530590037672021
From pioneering software-defined networking to backing many of the most ambitious AI labs of this cycle, Martin Casado and Sarah Wang sit at the center of AI’s capital and capability arms race. We sat down with the a16z duo to unpack how AI investing fundamentally changed: why
per @AnthropicAI's own numbers, interesting that 99.9th percentile model autonomy straightlines up through 2025 and immediately takes a nosedive after the opus 4.6 launch. you can really see the efficiency work start to kick in after the pioneering work of the 4-4.5 series.
A central lesson of this work is that autonomy is co-constructed by the model, user, and product. It can't be fully characterized by pre-deployment evaluations alone. For full details, and our recommendations to developers and policymakers, see the blog: https://www.anthropic.com/research/measuring-agent-autonomy
gave @MatthewBerman a tour of the new @latentspacepod studio and he gave me a youtube thumbnail tutorial to help a brother out
RT sathish316 This 9-month-old Latent Space podcast about the origin story of Claude Code is mind-blowing: how 2 people in a company can start a revolution in the industry and also continuously improve the product through dogfooding. @latentspacepod @claudeai @bcherny @_catwu Original tweet: https://x.com/sathish316/status/2024181097601650801
🆕 "What if you could have 1000s of Claudes spinning up at once?" https://latent.space/p/claude-code We asked @_catwu and @bcherny ALL the questions you ever had about @anthropicai's Claude Code - including its origin story, cost ("It's like a wood chipper fueled by dollars" -
i have this hall of fame tweet saved in my memes folder for whenever another Canadian Girlfriend performative coder shows up in the TL
This guy has had 12 agents building 24/7 for the past 2 months and i've yet to see something actually being built