Follow AI builders
Follow Real AI Builders — Discover the Minds Behind the Next AI Revolution

© 2026 Follow AI builders. All rights reserved.
Andrej Karpathy

1 follower
421 content items
5 in the last 7 days

About

Building @EurekaLabsAI. Previously Director of AI @ Tesla, founding team @ OpenAI, CS231n/PhD @ Stanford. I like to train large deep neural nets.

Platforms

𝕏 Andrej Karpathy

Content History

Andrej Karpathy
GitHub • about 22 hours ago
karpathy pushed nanochat
View on GitHub

Andrej Karpathy
GitHub • 1 day ago
karpathy pushed nanochat
View on GitHub

Andrej Karpathy
GitHub • 5 days ago
karpathy pushed karpathy.github.io
View on GitHub

Andrej Karpathy
GitHub • 5 days ago
karpathy pushed karpathy.github.io
View on GitHub
Andrej Karpathy
X • 6 days ago

Judging by my tl there is a growing gap in understanding of AI capability. The first issue I think is around recency and tier of use. I think a lot of people tried the free tier of ChatGPT somewhere last year and allowed it to inform their views on AI a little too much. This is a group of reactions laughing at various quirks of the models, hallucinations, etc. Yes I also saw the viral videos of OpenAI's Advanced Voice mode fumbling simple queries like "should I drive or walk to the carwash". The thing is that these free and old/deprecated models don't reflect the capability in the latest round of state of the art agentic models of this year, especially OpenAI Codex and Claude Code.

But that brings me to the second issue. Even if people paid $200/month to use the state of the art models, a lot of the capabilities are relatively "peaky" in highly technical areas. Typical queries around search, writing, advice, etc. are *not* the domain that has made the most noticeable and dramatic strides in capability. Partly, this is due to the technical details of reinforcement learning and its use of verifiable rewards. But partly, it's also because these use cases are not sufficiently prioritized by the companies in their hillclimbing because they don't lead to as much $$$ value. The goldmines are elsewhere, and the focus comes along.

So that brings me to the second group of people, who *both* 1) pay for and use the state of the art frontier agentic models (OpenAI Codex / Claude Code) and 2) do so professionally in technical domains like programming, math and research. This group of people is subject to the highest amount of "AI Psychosis" because the recent improvements in these domains as of this year have been nothing short of staggering. When you hand a computer terminal to one of these models, you can now watch them melt programming problems that you'd normally expect to take days/weeks of work. It's this second group of people that assigns a much greater gravity to th...

@staysaasy

The degree to which you are awed by AI is perfectly correlated with how much you use AI to code.

View quoted post
View on X
Andrej Karpathy
GitHub • 7 days ago
karpathy pushed KarpathyTalk
View on GitHub

Andrej Karpathy
GitHub • 7 days ago
karpathy pushed KarpathyTalk
View on GitHub

Andrej Karpathy
GitHub • 7 days ago
karpathy pushed KarpathyTalk
View on GitHub

Andrej Karpathy
GitHub • 7 days ago
karpathy pushed KarpathyTalk
View on GitHub

Andrej Karpathy
GitHub • 8 days ago
karpathy pushed KarpathyTalk
View on GitHub

Andrej Karpathy
GitHub • 8 days ago
karpathy closed an issue in KarpathyTalk
View on GitHub

Andrej Karpathy
GitHub • 8 days ago
karpathy commented on an issue in KarpathyTalk
View on GitHub

Andrej Karpathy
GitHub • 8 days ago
karpathy pushed KarpathyTalk
View on GitHub

Andrej Karpathy
GitHub • 9 days ago
karpathy contributed to karpathy/KarpathyTalk
View on GitHub

Andrej Karpathy
GitHub • 9 days ago
karpathy pushed KarpathyTalk
View on GitHub

Andrej Karpathy
GitHub • 9 days ago
karpathy pushed KarpathyTalk
View on GitHub

Andrej Karpathy
GitHub • 9 days ago
karpathy pushed KarpathyTalk
View on GitHub

Andrej Karpathy
GitHub • 9 days ago
karpathy pushed KarpathyTalk
View on GitHub

Andrej Karpathy
GitHub • 9 days ago
karpathy pushed KarpathyTalk
View on GitHub

Andrej Karpathy
GitHub • 9 days ago
karpathy pushed KarpathyTalk
View on GitHub

Andrej Karpathy
GitHub • 9 days ago
karpathy pushed KarpathyTalk
View on GitHub
Andrej Karpathy
X • 11 days ago

Farzapedia, personal wikipedia of Farza, good example following my Wiki LLM tweet. I really like this approach to personalization in a number of ways, compared to "status quo" of an AI that allegedly gets better the more you use it or something:

1. Explicit. The memory artifact is explicit and navigable (the wiki), you can see exactly what the AI does and does not know and you can inspect and manage this artifact, even if you don't do the direct text writing (the LLM does). The knowledge of you is not implicit and unknown, it's explicit and viewable.

2. Yours. Your data is yours, on your local computer, it's not in some particular AI provider's system without the ability to extract it. You're in control of your information.

3. File over app. The memory here is a simple collection of files in universal formats (images, markdown). This means the data is interoperable: you can use a very large collection of tools/CLIs or whatever you want over this information because it's just files. The agents can apply the entire Unix toolkit over them. They can natively read and understand them. Any kind of data can be imported into files as input, and any kind of interface can be used to view them as the output. E.g. you can use Obsidian to view them or vibe code something of your own. Search "File over app" for an article on this philosophy.

4. BYOAI. You can use whatever AI you want to "plug into" this information - Claude, Codex, OpenCode, whatever. You can even think about taking an open source AI and finetuning it on your wiki - in principle, this AI could "know" you in its weights, not just attend over your data.

So this approach to personalization puts *you* in full control. The data is yours. In universal formats. Explicit and inspectable. Use whatever AI you want over it, keep the AI companies on their toes! :) Certainly this is not the simplest way to get an AI to know you - it does require you to manage file directories and so on, but agents also make it quite...

@Farza 🇵🇰🇺🇸

This is Farzapedia. I had an LLM take 2,500 entries from my diary, Apple Notes, and some iMessage convos to create a personal Wikipedia for me. It made 400 detailed articles for my friends, my startups, research areas, and even my favorite animes and their impact on me complete

View quoted post
View on X
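The "file over app" point above is easy to make concrete: because the memory is just a directory of markdown files, any tool (or agent) can walk and index it with a few lines of code. A minimal sketch, with Python purely for illustration; the `index_wiki` helper and the wiki layout are hypothetical, not part of Farzapedia:

```python
from pathlib import Path

def index_wiki(root: str) -> dict:
    """Map each .md file in a wiki directory to its first heading/line.

    The files are just plain markdown, so no special app is needed to
    read them; this is the interoperability the 'file over app' idea buys.
    """
    index = {}
    for md in sorted(Path(root).rglob("*.md")):
        text = md.read_text(encoding="utf-8")
        # Use the first non-empty line (often a "# Heading") as the summary.
        first = next((ln.strip() for ln in text.splitlines() if ln.strip()), "")
        index[str(md.relative_to(root))] = first.lstrip("# ").strip()
    return index
```

The same directory can then be opened in Obsidian, grepped with Unix tools, or handed to whatever agent you prefer — that interchangeability is the point.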
Andrej Karpathy
X • 11 days ago

Something I've been thinking about - I am bullish on people (empowered by AI) increasing the visibility, legibility and accountability of their governments. Historically, it is the governments that act to make society legible (e.g. "Seeing like a state" is the common reference), but with AI, society can dramatically improve its ability to do this in reverse. Government accountability has not been constrained by access (the various branches of government publish an enormous amount of data), it has been constrained by intelligence - the ability to process a lot of raw data, combine it with domain expertise and derive insights.

As an example, the 4000-page omnibus bill is "transparent" in principle and in a legal sense, but certainly not in a practical sense for most people. There's a lot more like it: laws, spending bills, federal budgets, freedom of information act responses, lobbying disclosures... Only a few highly trained professionals (investigative journalists) could historically process this information. This bottleneck might dissolve - not only are the professionals further empowered, but a lot more people can participate.

Some examples to be precise: Detailed accounting of spending and budgets, diff tracking of legislation, individual voting trends w.r.t. stated positions or speeches, lobbying and influence (e.g. graph of lobbyist -> firm -> client -> legislator -> committee -> vote -> regulation), procurement and contracting, regulatory capture warning lights, judicial and legal patterns, campaign finance... Local governments might be even more interesting because the governed population is smaller so there is less national coverage: city council meetings, decisions around zoning, policing, schools, utilities...

Certainly, the same tools can easily cut the other way and it's worth being very mindful of that, but I lean optimistic overall that added participation, transparency and accountability will improve democratic, free societies. (the quoted tweet i...

@Harry Rushworth

The British Government is a complicated beast. Dozens of departments, hundreds of public bodies, more corporations than one can count... Such is its complexity that there isn't an org chart for it. Well, there wasn't... Introducing ⚙️Machinery of Government⚙️

[quoted tweet image]
View quoted post
View on X
Andrej Karpathy
X • 11 days ago

Wow, this tweet went very viral! I wanted to share a possibly slightly improved version of the tweet in an "idea file". The idea of the idea file is that in this era of LLM agents, there is less of a point/need of sharing the specific code/app, you just share the idea, then the other person's agent customizes & builds it for your specific needs. So here's the idea in a gist format: https://gist.github.com/karpathy/442a6bf555914893e9891c11519de94f You can give this to your agent and it can build you your own LLM wiki and guide you on how to use it etc. It's intentionally kept a little bit abstract/vague because there are so many directions to take this in. And ofc, people can adjust the idea or contribute their own in the Discussion which is cool.

@Andrej Karpathy

LLM Knowledge Bases Something I'm finding very useful recently: using LLMs to build personal knowledge bases for various topics of research interest. In this way, a large fraction of my recent token throughput is going less into manipulating code, and more into manipulating

View quoted post
View on X
Andrej Karpathy
X • 13 days ago

LLM Knowledge Bases

Something I'm finding very useful recently: using LLMs to build personal knowledge bases for various topics of research interest. In this way, a large fraction of my recent token throughput is going less into manipulating code, and more into manipulating knowledge (stored as markdown and images). The latest LLMs are quite good at it. So:

Data ingest: I index source documents (articles, papers, repos, datasets, images, etc.) into a raw/ directory, then I use an LLM to incrementally "compile" a wiki, which is just a collection of .md files in a directory structure. The wiki includes summaries of all the data in raw/, backlinks, and then it categorizes data into concepts, writes articles for them, and links them all. To convert web articles into .md files I like to use the Obsidian Web Clipper extension, and then I also use a hotkey to download all the related images locally so that my LLM can easily reference them.

IDE: I use Obsidian as the IDE "frontend" where I can view the raw data, the compiled wiki, and the derived visualizations. Important to note that the LLM writes and maintains all of the data of the wiki, I rarely touch it directly. I've played with a few Obsidian plugins to render and view data in other ways (e.g. Marp for slides).

Q&A: Where things get interesting is that once your wiki is big enough (e.g. mine on some recent research is ~100 articles and ~400K words), you can ask your LLM agent all kinds of complex questions against the wiki, and it will go off, research the answers, etc. I thought I had to reach for fancy RAG, but the LLM has been pretty good about auto-maintaining index files and brief summaries of all the documents and it reads all the important related data fairly easily at this ~small scale.

Output: Instead of getting answers in text/terminal, I like to have it render markdown files for me, or slide shows (Marp format), or matplotlib images, all of which I then view again in Obsidian. You can imagine ma...

View on X
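The ingest-and-compile workflow described above can be sketched roughly as follows. This is not the author's actual tooling: `summarize` stands in for an LLM call, and the one-article-per-document layout (plus a simple Obsidian-style `[[wikilink]]` index) is an assumption for illustration:

```python
from pathlib import Path

def compile_wiki(raw_dir: str, wiki_dir: str, summarize) -> list:
    """Incrementally "compile" raw/ documents into wiki/ markdown articles.

    summarize(text) -> str is a stand-in for an LLM call; this sketch only
    shows the bookkeeping: one wiki article per raw document, plus an
    index.md linking them (backlinks and concept articles omitted).
    """
    raw, wiki = Path(raw_dir), Path(wiki_dir)
    wiki.mkdir(parents=True, exist_ok=True)
    articles = []
    for src in sorted(raw.glob("*.md")):
        out = wiki / src.name
        if not out.exists():  # incremental: skip already-compiled documents
            out.write_text(f"# {src.stem}\n\n{summarize(src.read_text())}\n")
        articles.append(src.name)
    # Maintain a simple index so an agent can orient itself without RAG.
    (wiki / "index.md").write_text(
        "\n".join(f"- [[{name}]]" for name in articles) + "\n"
    )
    return articles
```

Because the output is plain markdown, the resulting wiki can be viewed in Obsidian or queried by any agent exactly as the post describes.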
Andrej Karpathy
X • 15 days ago

New supply chain attack, this time for npm axios, the most popular HTTP client library with 300M weekly downloads. Scanning my system I found a use imported from googleworkspace/cli from a few days ago when I was experimenting with gmail/gcal cli. The installed version (luckily) resolved to an unaffected 1.13.5, but the project dependency is not pinned, meaning that if I did this earlier today the code would have resolved to latest and I'd be pwned.

It's possible to personally defend against these to some extent with local settings (e.g. release-age constraints, containers, etc.), but I think ultimately the defaults of package management projects (pip, npm, etc.) have to change so that a single infection (usually luckily fairly temporary in nature due to security scanning) does not spread through users at random and at scale via unpinned dependencies.

More comprehensive article: https://www.stepsecurity.io/blog/axios-compromised-on-npm-malicious-versions-drop-remote-access-trojan

@Feross

🚨 CRITICAL: Active supply chain attack on axios -- one of npm's most depended-on packages. The latest [email protected] now pulls in [email protected], a package that did not exist before today. This is a live compromise. This is textbook supply chain installer malware. axios

View quoted post
View on X
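One concrete way to surface the unpinned-dependency risk described above is to scan a package.json for version ranges that are not exact pins. A rough sketch (the `unpinned` helper is illustrative, not an npm tool; committed lockfiles with `npm ci`, or npm's `save-exact=true` config, are the standard mitigations):

```python
import json
import re

# An exact pin is a bare version like "1.13.5"; ranges such as "^1.13.5",
# "~1.13.5", ">=1" or "*" can silently resolve to a newly published
# (possibly compromised) release at install time.
EXACT = re.compile(r"^\d+\.\d+\.\d+(?:[-+][\w.]+)?$")

def unpinned(package_json: str) -> list:
    """Return 'name@range' for every dependency not pinned to an exact version."""
    pkg = json.loads(package_json)
    bad = []
    for section in ("dependencies", "devDependencies"):
        for name, rng in pkg.get(section, {}).items():
            if not EXACT.match(rng):
                bad.append(f"{name}@{rng}")
    return bad
```

Exact pins don't stop an attack on the pinned version itself, but they do stop the "resolved to latest and pwned" failure mode the post describes.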
Andrej Karpathy
X • 18 days ago

- Drafted a blog post.
- Used an LLM to meticulously improve the argument over 4 hours.
- Wow, feeling great, it’s so convincing!
- Fun idea: let’s ask it to argue the opposite.
- LLM demolishes the entire argument and convinces me that the opposite is in fact true.
- lol

The LLMs may elicit an opinion when asked but are extremely competent in arguing almost any direction. This is actually super useful as a tool for forming your own opinions; just make sure to ask in different directions and be careful with the sycophancy.

View on X
Andrej Karpathy
X • 20 days ago

When I built menugen ~1 year ago, I observed that the hardest part by far was not the code itself, it was the plethora of services you have to assemble like IKEA furniture to make it real, the DevOps: services, payments, auth, database, security, domain names, etc...

I am really looking forward to a day where I could simply tell my agent: "build menugen" (referencing the post) and it would just work. The whole thing up to the deployed web page. The agent would have to browse a number of services, read the docs, get all the api keys, make everything work, debug it in dev, and deploy to prod. This is the actually hard part, not the code itself.

Or rather, the better way to think about it is that the entire DevOps lifecycle has to become code, in addition to the necessary sensors/actuators of the CLIs/APIs with agent-native ergonomics. And there should be no need to visit web pages, click buttons, or anything like that for the human. It's easy to state, it's now just barely technically possible and expected to work maybe, but it definitely requires from-scratch re-design, work and thought. Very exciting direction!

@Patrick Collison

When @karpathy built MenuGen (https://karpathy.bearblog.dev/vibe-coding-menugen/), he said: "Vibe coding menugen was an exhilarating and fun escapade as a local demo, but a bit of a painful slog as a deployed, real app. Building a modern app is a bit like assembling IKEA furniture. There are all these services,

View quoted post
View on X
Andrej Karpathy
GitHub • 21 days ago
karpathy pushed nanochat
View on GitHub

Andrej Karpathy
GitHub • 21 days ago
karpathy pushed autoresearch
View on GitHub

Andrej Karpathy
GitHub • 21 days ago
karpathy closed an issue in autoresearch
View on GitHub

Andrej Karpathy
GitHub • 21 days ago
karpathy pushed nanochat
View on GitHub

Andrej Karpathy
GitHub • 21 days ago
karpathy pushed nanochat
View on GitHub

Andrej Karpathy
GitHub • 21 days ago
karpathy closed an issue in nanochat
View on GitHub

Andrej Karpathy
GitHub • 21 days ago
karpathy commented on an issue in nanochat
View on GitHub

Andrej Karpathy
GitHub • 21 days ago
karpathy pushed nanochat
View on GitHub

Andrej Karpathy
GitHub • 21 days ago
karpathy closed an issue in nanochat
View on GitHub

Andrej Karpathy
GitHub • 21 days ago
karpathy pushed nanochat
View on GitHub

Andrej Karpathy
GitHub • 21 days ago
karpathy pushed nanochat
View on GitHub
Andrej Karpathy
X • 21 days ago

One common issue with personalization in all LLMs is how distracting memory seems to be for the models. A single question from 2 months ago about some topic can keep coming up as some kind of a deep interest of mine with undue mentions in perpetuity. Some kind of trying too hard.

View on X
Andrej Karpathy
GitHub • 22 days ago
karpathy pushed nanochat
View on GitHub

Andrej Karpathy
GitHub • 22 days ago
karpathy pushed nanochat
View on GitHub
Andrej Karpathy
X • 22 days ago

Software horror: litellm PyPI supply chain attack. Simple `pip install litellm` was enough to exfiltrate SSH keys, AWS/GCP/Azure creds, Kubernetes configs, git credentials, env vars (all your API keys), shell history, crypto wallets, SSL private keys, CI/CD secrets, database passwords. LiteLLM itself has 97 million downloads per month which is already terrible, but much worse, the contagion spreads to any project that depends on litellm. For example, if you did `pip install dspy` (which depended on litellm>=1.64.0), you'd also be pwned. Same for any other large project that depended on litellm.

Afaict the poisoned version was up for only less than ~1 hour. The attack had a bug which led to its discovery - Callum McMahon was using an MCP plugin inside Cursor that pulled in litellm as a transitive dependency. When litellm 1.82.8 installed, their machine ran out of RAM and crashed. So if the attacker didn't vibe code this attack it could have been undetected for many days or weeks.

Supply chain attacks like this are basically the scariest thing imaginable in modern software. Every time you install any dependency you could be pulling in a poisoned package anywhere deep inside its entire dependency tree. This is especially risky with large projects that might have lots and lots of dependencies. The credentials that do get stolen in each attack can then be used to take over more accounts and compromise more packages. Classical software engineering would have you believe that dependencies are good (we're building pyramids from bricks), but imo this has to be re-evaluated, and it's why I've been increasingly averse to them, preferring to use LLMs to "yoink" functionality when it's simple enough and possible.

@Daniel Hnyk

LiteLLM HAS BEEN COMPROMISED, DO NOT UPDATE. We just discovered that LiteLLM pypi release 1.82.8. It has been compromised, it contains litellm_init.pth with base64 encoded instructions to send all the credentials it can find to remote server + self-replicate. link below

View quoted post
View on X
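In the pip ecosystem, the analogous first-line defense is pinning. A rough sketch of auditing a requirements.txt for unpinned entries (the `unpinned_requirements` helper is illustrative only; pip's hash-checking mode, `pip install --require-hashes -r requirements.txt`, is the stronger real-world control, since it also rejects a re-uploaded poisoned artifact at a pinned version):

```python
def unpinned_requirements(text: str) -> list:
    """Return requirement lines that are not pinned with '=='.

    A line like 'litellm>=1.64.0' lets pip resolve to the newest release,
    so a single poisoned upload propagates to every fresh install;
    'litellm==1.82.7' cannot silently move to a new version.
    """
    bad = []
    for line in text.splitlines():
        req = line.split("#", 1)[0].strip()   # drop comments and blanks
        if not req or req.startswith("-"):    # skip options like -r / -e
            continue
        if "==" not in req:
            bad.append(req)
    return bad
```

Run against the dspy example in the post, a `litellm>=1.64.0` line would be flagged, which is exactly the transitive exposure described above.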
Andrej Karpathy
GitHub • 26 days ago
karpathy pushed autoresearch
View on GitHub
Andrej Karpathy
X • 26 days ago

Thank you Sarah, my pleasure to come on the pod! And happy to do some more Q&A in the replies.

@sarah guo

Caught up with @karpathy for a new @NoPriorsPod: on the phase shift in engineering, AI psychosis, claws, AutoResearch, the opportunity for a SETI-at-Home like movement in AI, the model landscape, and second order effects 02:55 - What Capability Limits Remain? 06:15 - What

View quoted post
View on X
Andrej Karpathy
X • 26 days ago

Had to go see Project Hail Mary right away (it's based on the book of Andy Weir, of also The Martian fame). Both very pleased and relieved to say that 1) the movie sticks very close to the book in both content and tone and 2) is really well executed.

The book is one of my favorites when it comes to alien portrayals because a lot of thought was clearly given to the scientific details of an alternate biochemistry, evolutionary history, sensorium, psychology, language, tech tree, etc. It's different enough that it is highly creative and plausible, but also similar enough that you get a compelling story and one of the best bromances in fiction. Not to mention the other (single-cellular) aliens. I can count fictional portrayals of aliens of this depth on one hand. A lot of these aspects are briefly featured - if you read the book you'll spot them but if you haven't, the movie can't spend the time to do them justice.

I'll say that the movie inches a little too much into the superhero movie tropes with the pacing, the quips, the bathos and such for my taste, and we get a little bit less of the grandeur of Interstellar and a little bit less of the science of The Martian, but I think it's ok considering the tone of the original content. And it does really well where it counts - on Rocky and the bromance. Thank you to the film crew for the gem!

View on X
Andrej Karpathy
X • 28 days ago
Thread • 2 tweets

The signature is alluding to NVIDIA GTC 2015, where Jensen excitedly told an audience of, at the time, mostly gamers and scientific computing professionals that Deep Learning is The Next Big Thing, citing among other examples my PhD thesis (one of the first image captioning systems that coupled image recognition ConvNet to an autoregressive RNN language model, trained end to end). This was back when most people were still unaware and somewhat skeptical but of course - Jensen was 1000% correct, highly prescient and locked in very early. (link to blast from the past) https://youtu.be/xQhb3C2hQoE?si=x3qQMjG-dktoNNv_&t=1577
View on X
Andrej Karpathy
GitHub • 29 days ago
karpathy pushed nanochat
View on GitHub

Andrej Karpathy
GitHub • 30 days ago
karpathy pushed autoresearch
View on GitHub
Andrej Karpathy
GitHub • about 1 month ago
karpathy commented on an issue in jobs
View on GitHub

Andrej Karpathy
GitHub • about 1 month ago
karpathy pushed jobs
View on GitHub

Andrej Karpathy
GitHub • about 1 month ago
karpathy closed an issue in jobs
View on GitHub

Andrej Karpathy
GitHub • about 1 month ago
karpathy closed an issue in jobs
View on GitHub

Andrej Karpathy
GitHub • about 1 month ago
karpathy pushed jobs
View on GitHub

Andrej Karpathy
GitHub • about 1 month ago
karpathy pushed jobs
View on GitHub

Andrej Karpathy
GitHub • about 1 month ago
karpathy made this repository public
View on GitHub

Andrej Karpathy
GitHub • about 1 month ago
karpathy pushed nanochat
View on GitHub
Andrej Karpathy
X • about 1 month ago

My autoresearch labs got wiped out in the oauth outage. Have to think through failovers. Intelligence brownouts will be interesting - the planet losing IQ points when frontier AI stutters.

View on X
Andrej Karpathy
X • about 1 month ago

Expectation: the age of the IDE is over Reality: we’re going to need a bigger IDE (imo). It just looks very different because humans now move upwards and program at a higher level - the basic unit of interest is not one file but one agent. It’s still programming.

@Andrej Karpathy

@nummanali tmux grids are awesome, but i feel a need to have a proper "agent command center" IDE for teams of them, which I could maximize per monitor. E.g. I want to see/hide toggle them, see if any are idle, pop open related tools (e.g. terminal), stats (usage), etc.

View quoted post
View on X
Andrej Karpathy
GitHub • about 1 month ago
karpathy pushed autoresearch
View on GitHub

Andrej Karpathy
GitHub • about 1 month ago
karpathy pushed autoresearch
View on GitHub

Andrej Karpathy
GitHub • about 1 month ago
karpathy pushed nanochat
View on GitHub

Andrej Karpathy
GitHub • about 1 month ago
karpathy pushed autoresearch
View on GitHub
Andrej Karpathy
X • about 1 month ago

Three days ago I left autoresearch tuning nanochat for ~2 days on a depth=12 model. It found ~20 changes that improved the validation loss. I tested these changes yesterday and all of them were additive and transferred to larger (depth=24) models. Stacking up all of these changes, today I measured that the leaderboard's "Time to GPT-2" drops from 2.02 hours to 1.80 hours (~11% improvement); this will be the new leaderboard entry. So yes, these are real improvements and they make an actual difference.

I am mildly surprised that my very first naive attempt already worked this well on top of what I thought was already a fairly manually well-tuned project. This is a first for me because I am very used to doing the iterative optimization of neural network training manually. You come up with ideas, you implement them, you check if they work (better validation loss), you come up with new ideas based on that, you read some papers for inspiration, etc. This is the bread and butter of what I have done daily for 2 decades. Seeing the agent do this entire workflow end-to-end and all by itself as it worked through approx. 700 changes autonomously is wild. It really looked at the sequence of results of experiments and used that to plan the next ones. It's not novel, ground-breaking "research" (yet), but all the adjustments are "real", I didn't find them manually previously, and they stack up and actually improved nanochat.

Among the bigger things, e.g.:
- It noticed an oversight that my parameterless QKnorm didn't have a scalar multiplier attached, so my attention was too diffuse. The agent found multipliers to sharpen it, pointing to future work.
- It found that the Value Embeddings really like regularization and I wasn't applying any (oops).
- It found that my banded attention was too conservative (I forgot to tune it).
- It found that AdamW betas were all messed up.
- It tuned the weight decay schedule.
- It tuned the network initialization.

This is on top of all the tuning I've ...

View on X
Andrej Karpathy
GitHub • about 1 month ago
karpathy pushed nanochat
View on GitHub
Andrej Karpathy
GitHub • about 1 month ago
karpathy opened a pull request in autoresearch
View on GitHub

Andrej Karpathy
GitHub • about 1 month ago
karpathy created a branch
View on GitHub

Andrej Karpathy
GitHub • about 1 month ago
karpathy created a branch
View on GitHub

Andrej Karpathy
GitHub • about 1 month ago
transitive-bullshit starred karpathy/autoresearch
View on GitHub

Andrej Karpathy
GitHub • about 1 month ago
karpathy pushed autoresearch
View on GitHub

Andrej Karpathy
GitHub • about 1 month ago
karpathy pushed autoresearch
View on GitHub

Andrej Karpathy
GitHub • about 1 month ago
karpathy opened an issue in autoresearch
View on GitHub

Andrej Karpathy
GitHub • about 1 month ago
karpathy pushed autoresearch
View on GitHub

Andrej Karpathy
GitHub • about 1 month ago
karpathy pushed autoresearch
View on GitHub
Andrej Karpathy
X • about 1 month ago

The next step for autoresearch is that it has to be asynchronously massively collaborative for agents (think: SETI@home style). The goal is not to emulate a single PhD student, it's to emulate a research community of them. Current code synchronously grows a single thread of commits in a particular research direction. But the original repo is more of a seed, from which could sprout commits contributed by agents on all kinds of different research directions or for different compute platforms.

Git(Hub) is *almost* but not really suited for this. It has a softly built in assumption of one "master" branch, which temporarily forks off into PRs just to merge back a bit later. I tried to prototype something super lightweight that could have a flavor of this, e.g. just a Discussion, written by my agent as a summary of its overnight run: https://github.com/karpathy/autoresearch/discussions/43 Alternatively, a PR has the benefit of exact commits: https://github.com/karpathy/autoresearch/pull/44 but you'd never want to actually merge it... You'd just want to "adopt" and accumulate branches of commits.

But even in this lightweight way, you could ask your agent to first read the Discussions/PRs using GitHub CLI for inspiration, and after its research is done, contribute a little "paper" of findings back. I'm not actually exactly sure what this should look like, but it's a big idea that is more general than just the autoresearch repo specifically. Agents can in principle easily juggle and collaborate on thousands of commits across arbitrary branch structures. Existing abstractions will accumulate stress as intelligence, attention and tenacity cease to be bottlenecks.

View on X
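The read-then-contribute workflow the post describes could be scaffolded with the GitHub CLI along these lines (a hypothetical sketch, not from the repo: the branch name and RESULTS.md file are illustrative, and `gh` must be authenticated):

```shell
# 1) Before a run: read prior Discussions/PRs for inspiration
gh pr list --repo karpathy/autoresearch --state all --limit 20
gh pr view 44 --repo karpathy/autoresearch --comments

# 2) After the overnight run: contribute findings back as a branch + PR
#    (never meant to be merged, only "adopted" by other agents)
git checkout -b findings/run-$(date +%Y%m%d)   # illustrative branch name
git add train.py RESULTS.md                    # RESULTS.md is hypothetical
git commit -m "autoresearch: summarize overnight run findings"
gh pr create --repo karpathy/autoresearch \
  --title "Findings: overnight run" \
  --body-file RESULTS.md
```

This keeps the "one master branch" assumption intact while still letting agents accumulate and consult each other's branches of commits.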
AK
Andrej Karpathy
⚡github•about 1 month ago

Activity on repository

karpathy pushed autoresearch

View on GitHub
AK
Andrej Karpathy
⚡github•about 1 month ago

Activity on karpathy/autoresearch

karpathy commented on an issue in autoresearch

View on GitHub
AK
Andrej Karpathy
⚡github•about 1 month ago

Activity on karpathy/autoresearch

karpathy commented on an issue in autoresearch

View on GitHub
AK
Andrej Karpathy
⚡github•about 1 month ago

Activity on karpathy/autoresearch

karpathy opened a pull request in autoresearch

View on GitHub
AK
Andrej Karpathy
⚡github•about 1 month ago

Activity on repository

karpathy created a branch

View on GitHub
AK
Andrej Karpathy
⚡github•about 1 month ago

Activity on karpathy/autoresearch

karpathy closed a pull request in autoresearch

View on GitHub
AK
Andrej Karpathy
⚡github•about 1 month ago

Activity on repository

karpathy pushed autoresearch

View on GitHub
AK
Andrej Karpathy
⚡github•about 1 month ago

Activity on repository

karpathy pushed autoresearch

View on GitHub
AK
Andrej Karpathy
⚡github•about 1 month ago

Activity on repository

karpathy pushed autoresearch

View on GitHub
AK
Andrej Karpathy
𝕏x•about 1 month ago
Thread • 2 tweets

I packaged up the "autoresearch" project into a new self-contained minimal repo if people would like to play over the weekend. It's basically the nanochat LLM training core stripped down to a single-GPU, one-file version of ~630 lines of code, then:

- the human iterates on the prompt (.md)
- the AI agent iterates on the training code (.py)

The goal is to engineer your agents to make the fastest research progress indefinitely and without any of your own involvement. In the image, every dot is a complete LLM training run that lasts exactly 5 minutes. The agent works in an autonomous loop on a git feature branch and accumulates git commits to the training script as it finds better settings (i.e., lower validation loss by the end) of the neural network architecture, the optimizer, all the hyperparameters, etc. You can imagine comparing the research progress of different prompts, different agents, etc.

https://github.com/karpathy/autoresearch

Part code, part sci-fi, and a pinch of psychosis :)

(I still have the bigger cousin running on prod nanochat, working on a bigger model on 8XH100, which looks like this now. I'll just leave this running for a while...)

View on X
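The autonomous loop described in the post amounts to an accept-if-better hill-climb over training settings. A toy sketch of that control flow (hypothetical stand-ins: `propose` plays the agent editing the script, `run_training` plays a real fixed-length training run, and the actual `git commit` step is only indicated in a comment):

```python
def propose(settings):
    """Stand-in for the agent proposing an edit to the training script."""
    new = dict(settings)
    new["lr"] = settings["lr"] * 0.9  # toy edit: shrink the learning rate
    return new

def run_training(settings):
    """Stand-in for one fixed-length (e.g. 5-minute) training run.
    Returns final validation loss; toy objective with optimum at lr=0.01."""
    return (settings["lr"] - 0.01) ** 2 + 2.0

def autoresearch_loop(settings, steps=10):
    """Accept a candidate only if it lowers validation loss."""
    best_loss = run_training(settings)
    for _ in range(steps):
        candidate = propose(settings)
        loss = run_training(candidate)
        if loss < best_loss:  # improvement found
            settings, best_loss = candidate, loss
            # a real loop would `git commit` the improved script here,
            # accumulating commits on the agent's feature branch
    return settings, best_loss
```

Under this toy objective each proposal improves the loss, so the loop accumulates a monotone chain of "commits"; with a real agent, rejected proposals are simply discarded and the branch only records improvements.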
AK
Andrej Karpathy
⚡github•about 1 month ago

Activity on repository

karpathy pushed autoresearch

View on GitHub
AK
Andrej Karpathy
⚡github•about 1 month ago

Activity on repository

karpathy pushed autoresearch

View on GitHub
AK
Andrej Karpathy
⚡github•about 1 month ago

Activity on karpathy/autoresearch

karpathy closed a pull request in autoresearch

View on GitHub
AK
Andrej Karpathy
⚡github•about 1 month ago

Activity on repository

karpathy pushed autoresearch

View on GitHub
AK
Andrej Karpathy
⚡github•about 1 month ago

Activity on repository

karpathy pushed autoresearch

View on GitHub
AK
Andrej Karpathy
⚡github•about 1 month ago

Activity on repository

karpathy pushed autoresearch

View on GitHub
AK
Andrej Karpathy
⚡github•about 1 month ago

Activity on karpathy/autoresearch

karpathy closed an issue in autoresearch

View on GitHub
AK
Andrej Karpathy
⚡github•about 1 month ago

Activity on karpathy/autoresearch

karpathy closed an issue in autoresearch

View on GitHub
AK
Andrej Karpathy
⚡github•about 1 month ago

Activity on repository

karpathy pushed autoresearch

View on GitHub
AK
Andrej Karpathy
⚡github•about 1 month ago

Activity on repository

karpathy pushed autoresearch

View on GitHub
AK
Andrej Karpathy
⚡github•about 1 month ago

Activity on repository

karpathy pushed autoresearch

View on GitHub
AK
Andrej Karpathy
⚡github•about 1 month ago

Activity on repository

karpathy pushed autoresearch

View on GitHub
AK
Andrej Karpathy
⚡github•about 1 month ago

Activity on repository

karpathy pushed autoresearch

View on GitHub
AK
Andrej Karpathy
⚡github•about 1 month ago

Activity on repository

karpathy pushed autoresearch

View on GitHub
AK
Andrej Karpathy
⚡github•about 1 month ago

Activity on repository

karpathy created a branch

View on GitHub