Jeremy Howard
Bio
🇦🇺 Co-founder: @AnswerDotAI & @FastDotAI ; Prev: professor @ UQ; Stanford fellow; @kaggle president; @fastmail/@enlitic/etc founder https://t.co/16UBFTX7mo
Platform
Content History
RT Charles 🎉 Frye There was a flippening in the last few months: you can run your own LLM inference with rates and performance that match or beat LLM inference APIs. We wrote up the techniques to do so in a new guide, along with code samples. https://modal.com/docs/guide/high-performance-llm-inference Original tweet: https://x.com/charles_irl/status/2011484220032762114
RT @levelsio My #1 feature request: Claude Code should stop asking me for confirmation every time by default, like "can I check this folder", yes brother you can do anything you want. Maybe ask me permission for writing. Add some [ just go ] mode. Even with [ accept edits on ] it still asks me permission 1000 times per day. I just want you to run and keep going mostly. And no, I don't feel like running it with --dangerously-skip-permissions Original tweet: https://x.com/levelsio/status/2011129631001170244
RT Greg Kamradt In ARC Prize 2024, MindsAI (@MindsAI_Jack et al) used test-time fine-tuning to get to the top of the leaderboard during the competition. 100 models trained for 100 tasks. Hearing how it worked was a peek into the future of continual learning Original tweet: https://x.com/GregKamradt/status/2010892517420699891
LLM memory is considered one of the hardest problems in AI. All we have today are endless hacks and workarounds. But the root solution has always been right in front of us. Next-token prediction is already an effective compressor. We don’t need a radical new architecture. The
RT Jane Manchun Wong I cry a little whenever I realize another novel software doesn’t come with the UNIX-style manual page, before I look at the sky and scream at the clouds 🥲 Original tweet: https://x.com/wongmjane/status/2010871340736397317
RT Daniel Litt IMO it should be considered quite rude in most contexts to post or send someone a wall of 100% AI-generated text. “Here, read this thing I didn’t care enough about to express myself.” Original tweet: https://x.com/littmath/status/2010759165061579086
Anyone using Wezterm? I've been having quite a few issues with Ghostty recently (particularly with trying to paste more than a few lines of text in—it often truncates) so wondering about trying something else. I've used alacritty before, and liked it. https://wezterm.org/
RT John Robinson llmsdottxt An open source Chrome extension that detects llms.txt files on websites as you browse around and makes it easy to discover and copy the URLs or content for use with your favorite LLM. @jeremyphoward proposed llms.txt as a way for websites to provide LLM friendly content that can be more to the point and more "context efficient" for LLMs to consume. Great for API docs and more. https://github.com/johnrobinsn/llmsdottxt_chrome Original tweet: https://x.com/johnrobinsn/status/2009256273242673612
RT Dave Guarino The new http://Maryland.gov has an llms.txt! (Hey @jeremyphoward!) Original tweet: https://x.com/allafarce/status/2009011874273350136
How useful is llms.txt? It's so useful that Tailwind rejected a PR to add an llms.txt, on the basis that it would be so useful that people wouldn't need to read their docs any more! https://github.com/tailwindlabs/tailwindcss.com/pull/2388
Great to see @threejs supporting llms.txt. 😀 I have noticed that the majority of libs/services I work with seem to have an llms.txt nowadays. Makes me happy! (I've been really enjoying the very comprehensive @GeminiApp llms.txt recently for writing Gemini code.)
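(For anyone unfamiliar: an llms.txt is just a markdown file served at the site root. A minimal illustrative sketch, with a hypothetical project and URLs:)

```markdown
# Example Project

> One-paragraph summary: what the project does and who it's for.

## Docs

- [Quick start](https://example.com/docs/quickstart.md): install and first steps
- [API reference](https://example.com/docs/api.md): every public function, tersely

## Optional

- [Changelog](https://example.com/changelog.md): release history, safe to skip
```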
RT Jason Rosenfeld I was able to find some old examples from DeepLyrics, the fine-tuned version of ULMFiT we made back in 2018 to generate song lyrics for a grad school project. A primitive LLM, if you will.

@jeremyphoward pre-trained the AWD-LSTM (~250M params?) on Wikitext-103 for its "base knowledge" and world understanding. Then, we gathered a large corpus of song lyrics, cleaned everything up and started to fine-tune the model to learn how to write the songs, including titles and genres. The model learned to add its own "[INTRO]/[BRIDGE]" and "(Album Version)" tags to the text, which--at the time--was pretty mind-blowing. We settled on a cool implementation of beam-search for our "inference-time scaling" as well. In this way, we explored a graph of outputs by continuing to extend the most promising ones and selecting the very best one at the end. It learned to interpolate between genres and you could force it to write, for example, an "Oldies Metal" song. I remember being blown away by the song "Sparkling Damnation: Feel the sparkle in your eyes, Savage in tragedy" ✨💀

The model learned stanza structure, rhyme scheme, and far more than I would have thought, completely on its own. Seeing this behavior from the model is what made it clear that scaling was going to work. If you continued to scale, training on the correct data, more and more complex emergent features and sophisticated behaviors (including things like deception, self-modeling, etc.) would percolate in the network. At the time, we had also experimented with making the model multi-modal by adding in audio waves as well, but, as we learned and many others are learning, native multimodality is hard. Original tweet: https://x.com/jrosenfeld13/status/2008561850502291736
@jeremyphoward 2018 Fine-tuning ULMFiT on song lyrics (DeepLyrics, grad school project) Watching it learn to rhyme was when I knew scaling was going to work
RT Xeophon And I was right!! IQuest-Coder was set up incorrectly and includes the whole git history, including future commits. The model has found this trick and uses it rather often. Thus, its SWE-bench score should be discarded. Original tweet: https://x.com/xeophon/status/2006969664346501589
Your timeline will be full of this image. If you believe this is a real model, I have a bridge to sell to you. For starters, they don’t disclose how they run those evals, which is a huge red flag. But good luck to the poor soul who’ll get nerdsniped by this.
Hyperparams and architectures are *far* more stable across model and data sizes than most people think. For stuff that does need to change, we have pretty reliable rules of thumb for nearly all of them. (eg: OpenAI did nearly all their GPT4 ablations on ~1000x smaller models.)
In the last 6 months the NanoGPT Speedrun to 3.28 loss on FineWeb dropped by 33% to 2 mins. Recently a subset of these changes was bulk copy-pasted to the larger-scale 2.92 loss track. Surprisingly, the untuned yolo run broke the 2.92 loss record by 25%. https://github.com/KellerJordan/modded-nanogpt/pull/188
RT Ziming Liu New year's read 📔 -- "Physics of AI Requires Mindset Shifts." I argue that "Physics of AI" research is hard due to the current publishing culture. But there is a simple solution -- curiosity-driven open research. https://kindxiaoming.github.io/blog/2025/physics-of-ai/ Original tweet: https://x.com/ZimingLiu11/status/2006810684546494522
RT George Grigorev Residuals in transformers are great for stability and scaling; deeper layers update the signal along the residual stream. Few people questioned this choice publicly, and since 2025 there's been progress. A few thoughts about hyper connections (wrt the newly released DeepSeek paper – mHC).

- Introduction.
Residuals: x_{l+1} = x_l + F(x_l, W_l)
Hyper connections: x_{l+1} = H_res_l @ x_l + H_post_l^T @ F(H_pre_l @ x_l, W_l), where
W_l has shape [c, n * c] – expands features after attention
H_res_l has shape [n, n] – mixes up the residual stream
H_pre_l has shape [1, n] – aggregates features from n * c back to c
H_post_l has shape [1, n] – maps the layer back into the stream.
So this is basically residuals where your function F (self-attention + MLP) increases the number of channels, and you add some mappings. The whale (DeepSeek) later adds an extra layer of math for manifold regularization so that hyper connections, when stacked up together, preserve the identity-mapping property, as is naturally done in simple channel-preserving residual connections. They perform large-scale training, wrote mixed-precision fused kernels in tilelang, and proved that this is a viable approach.

- My thoughts.
1) This naturally flashes back to the original resnet design, where you had to add a 1d conv to shrink/expand channel size while doing natural convnet design (you normally increase channels and decrease spatial dim after conv, so you have to design residuals with channel mapping).
2) This also complements the value residuals idea: re-using older values, when multiplied by learned residual constants in a self-regularizing fashion, noticeably improves scores while adding negligible impact on computation. Even the derivation of input-dependent coefficients is similar. This idea seems like a generalization.
3) The expand factor in the MLP now seems slightly redundant, since we are doing similar work in the residual stream? or at least it ...
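A minimal numeric sketch of that update rule (my own toy shapes and initializations, not the paper's): n parallel residual streams of width c, with the three H matrices doing the mixing, aggregation, and scatter:

```python
import torch
import torch.nn as nn

n, c = 4, 64                       # n parallel residual streams of width c
x = torch.randn(n, c)              # per-token state (batch/seq dims omitted)
F = nn.Sequential(nn.Linear(c, 4 * c), nn.GELU(), nn.Linear(4 * c, c))

H_res  = torch.eye(n) + 0.01 * torch.randn(n, n)  # [n, n] mixes the streams
H_pre  = torch.full((1, n), 1.0 / n)              # [1, n] aggregates streams -> one layer input
H_post = torch.ones(1, n)                         # [1, n] scatters the layer output back

layer_in  = H_pre @ x              # [1, c]: weighted sum over the n streams
layer_out = F(layer_in)            # [1, c]: the usual attention+MLP block (here just an MLP)
x_next = H_res @ x + H_post.T @ layer_out         # [n, c]: hyper-connection update

# With n=1 and H_res = H_pre = H_post = [[1.]], this reduces exactly to the
# ordinary residual update x + F(x).
```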
RT Pol Avec Why you should write the code to use your Thermostat. Yesterday, I tried following @karpathy's lead and used CC to interface with my thermostat. The experience wasn't great for me. Today I tried the opposite: I built it step-by-step almost completely manually using @answerdotai's SolveIT. It was great. I spent a much longer time, say maybe 4-5h. But it was worth it, even though I have a perfectly fine app already to handle the temperature (in other words, I might not reuse the code again).

So, was it worth it? Yes. Why? Because there were a myriad of small decisions and lessons learned throughout the process. Those are usually small enough that they don't feel significant, but they do compound in the end. They make your tools much sharper. If you look at the final package I published to pypi you won't see that, and it looks like code an LLM could one-shot, but the LLM could definitely NOT make you learn during the process.

This is more or less how it went: I started by getting SolveIT to read the docs for me and list which endpoints the Thermostat API has. Unfortunately, their docs require JS to render 😅. I took this chance to have a look at the Zyte service to give SolveIT a tool to read the docs. With that setup I got a list of all endpoints for the API. I followed the instructions to get the API keys & tokens. Gave it a try, it worked, I went away, and when I came back the token had expired. No problem, SolveIT created a super short _refresh method as part of the class. Next step: create the basic "Home Status" info endpoint. But call _refresh first to make sure we have an updated token... I thought we would have to do this for all endpoints, so let's actually create a `_request` that does this for us already. That way the methods simply look like this: At this point I took the chance to practice using dialoghelper. I had created two endpoints following a clear format: first markdown header, @ patch method into the class, try it out and dis...
I tried this too by asking CC to connect to my netatmo thermostat. CC spent a huge amount of tokens scanning ports, using nmap, arp, dns, web searching for my thermostat brand's MAC prefix, etc... It did find my thermostat's MAC address, which looked cool. Later, it walked me
RT Alexia Jolicoeur-Martineau Many grifters claim that you don't gain anything by working without an LLM. They say it's comparable to not using a calculator. Yet that's how TRM was so successful and won the ARC prize. It does matter. One high quality output beats multiple low quality outputs. Original tweet: https://x.com/jm_alexia/status/2005269635265110463
"Skill issue" is having to use AI to do programming tasks that you would have been able to do before. You don't need to write more code, you need to focus on the minimum code you need to solve your specific problem.
Mass unsolicited email is good, apparently. (I got this email too FWIW. It really was very annoying.)
I think Simon is wrong here, as is Rob Pike. If you get an unsolicited email, it is your responsibility as a professional laptop user to know how to deal with that. The year is 2025. Unsolicited email has existed for half a century. If your time is wasted by receiving one
RT Muyu He We previously found qwen3 surprisingly selects **only three** MLP down projection vectors to create attention sinks, and now we find that this selection has an important feature: the model intelligently assigns a very large intermediate activation for token 0, which goes into the down projection to create an unusually massive MLP output. We find that this strategy guarantees that the total output of the current layer (L=6) embeds an important cue for attention sinks, which the next layer (L=7) surfaces and creates the sink for the first time in the model.

Observations:
- The intermediate activation z of token 0 is 2-3 orders of magnitude larger than that of tokens 1 and 8.
- 99% of the large activation is explained by only three dimensions for token 0, which means that it chooses the three special column vectors, each with a huge scalar.
- Since the three vectors are aligned in one dimension, this linear combination essentially creates a reinforcement in that direction, which produces an MLP output 4 orders of magnitude larger than that of other tokens.

Effects:
- The output of layer 6 is the addition of the residual input, the attention output and the MLP output.
- As we can see, only in token 0 does the MLP output have a 3 orders of magnitude larger norm than the attention and residual components. This means for token 0, the MLP output will dominate the direction and norm of the total output.
- We also see that the directional variance, or "spread", of all sampled MLP outputs is two orders of magnitude lower than that of the other tokens.
- The extremely low variance means that once we apply layer norm in the next layer, only token 0's input to the attention layer will surface a consistent single direction that maps deterministically to a single key vector direction. We have previously shown this direction to be the exact "cue" that the query in layer 7 picks up to form attention sinks. So the intermediate activation achieves two birds with one ston...
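A minimal sketch of how one might check the per-token MLP output norms described above, assuming a HuggingFace Qwen3 checkpoint (the 0.6B model name and the layer index 6 are illustrative; the thread doesn't name the exact checkpoint):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "Qwen/Qwen3-0.6B"   # assumption: any Qwen3 checkpoint with model.layers[i].mlp
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name, torch_dtype=torch.float32)

mlp_out = {}
def hook(module, args, output):
    # Capture the raw MLP output of this layer for every token.
    mlp_out["layer6"] = output.detach()

h = model.model.layers[6].mlp.register_forward_hook(hook)
ids = tok("The quick brown fox jumps over the lazy dog", return_tensors="pt")
with torch.no_grad():
    model(**ids)
h.remove()

# Per-token L2 norms of the layer-6 MLP output: if the sink story holds,
# token 0 should be orders of magnitude larger than the rest.
print(mlp_out["layer6"][0].norm(dim=-1))
```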
RT Hunter📈🌈📊 Americans think about 43% of the world is wealthier than them. Wildly wrong. Americans who make the median full-time wage ($63,128) are in the top 3% of global incomes, adjusted for price differences. Even if they make the 10th percentile wage ($18,890) they’re top 24% globally. Original tweet: https://x.com/StatisticUrban/status/2004723924781892033
What's going on here? It sure looks like an AI image, with an unsolvable maze, artifacts no human would normally draw, inhuman tiling/plate patterns… Is the claim this was created to look like AI? Or is there a drawing program that does this without AI? Or did Apple get fooled?
So this is something I've been noticing more lately: people seeing an image, wrongly assuming it's AI, and then confidently raging about it being AI when it wasn't. Example: the replies to this Pluribus-themed image Tim Cook posted.
RT Peter Steinberger TIL adding your terminal here will greatly speed up compile times. https://nnethercote.github.io/2025/09/04/faster-rust-builds-on-mac.html Original tweet: https://x.com/steipete/status/2003925293665337501
http://fast.ai alums and ULMFiT users like Jason have been fine-tuning LLMs since 2017/18. The rest of y'all are years behind, sorry. 🤷
@jeremyphoward 2018 Fine-tuning ULMFiT on song lyrics (DeepLyrics, grad school project) Watching it learn to rhyme was when I knew scaling was going to work
RT Andy Masley 15.65k messages will add up to about the same emissions as a one-time 10 mile drive in a sedan. Original tweet: https://x.com/AndyMasley/status/2003539421879239095
I feel like I need to make a donation to an environmental impact fund 😬
RT Chris Albon I bike everywhere in SF. I barely ever take a taxi/uber/waymo. But if you want to ban Waymo it means you don’t care about cyclists like me. Original tweet: https://x.com/chrisalbon/status/2003501793393967423
RT Lewis Tunstall ULMFiT was really ahead of its time, complete with the pre-train -> mid-train -> SFT pipeline we use today Original tweet: https://x.com/_lewtun/status/2003404158595158191
2017 Pre-training (ULMFiT) 😊
The LLM training eras:
202x Pre-training (foundation)
2022 RLHF + PPO
2023 LoRA SFT
2024 Mid-Training
2025 RLVR + GRPO
RT Arnaud Bertrand This is largely being ignored but it's easily one of the biggest China news stories of the year. What China is doing with Hainan - a huge island (50 times the size of Singapore!) - is pretty extraordinary: they're basically making it into a completely different jurisdiction from the rest of the country, and an extremely attractive entry gate for the Chinese market.

You can now import most products in the world (74% of all goods) entirely duty free into Hainan. And, if you transform the product and add 30% value locally, you can then send it to the rest of mainland China completely tariff-free. So for instance: import Australian beef into Hainan tax free, slice it and package it for hotpot in Hainan, and it can enter all mainland supermarkets duty-free. They also have insanely low corporate tax rates: 15%, lower than Hong Kong (16.5%) and Singapore (17%) or the rest of the mainland (25%).

That's not all. Hainan now has different rules from the rest of China in dozens of areas:
HEALTH: Basically the rule is that if a medicine or medical device is approved by regulatory agencies anywhere in the world, it can be used in Hainan - even if banned on the mainland. Which undoubtedly makes it THE place in the world with the widest range of medical treatments available.
NO FIREWALL: Companies registered in Hainan can apply for unrestricted global internet access.
OPEN EDUCATION: Foreign universities can open campuses without a Chinese partner.
VISA-FREE: 86 countries get visa-free entry, probably one of the most open places in the world.
CAPITAL: Special accounts let money flow freely to and from overseas - normal mainland forex restrictions don't apply.

So they're running a pretty extraordinary "radical openness" experiment there. They're basically building a "greatest hits" of global free zones: Singapore's tax regime, Switzerland's medical access, Dubai's visa policy - all in one giant tropical island attached to the 1.4 billion people Chinese consumer mark...
China on Thursday launched island-wide special customs operations in the Hainan Free Trade Port (FTP), the world's largest FTP by area, allowing freer entry of overseas goods, expanded zero-tariff coverage and more business-friendly measures. http://xhtxs.cn/8VU
RT Luca Soldaini 🎀 can someone help folks at Mistral find more weak baselines to add here? since they can't stomach comparing with SoTA.... (in case y'all wanna fix it: Chandra, dots.ocr, olmOCR, MinerU, Monkey OCR, and PaddleOCR are a good start) Original tweet: https://x.com/soldni/status/2001821298109120856
RT Andy Masley Two big corrections which imo leave the reader with a much better understanding of where water’s being used, especially the introductory paragraph on water which now goes into detail on energy generation too. Really awesome. Grateful to Hao for her engagement here Original tweet: https://x.com/AndyMasley/status/2001520364635976075
We have confirmed the unit error in the government document, and I have issued a correction to my publisher. I have also made other updates based on the ongoing feedback. I detail these changes here: http://karendhao.com/20251217/empire-water-changes. Thank you to my readers for strengthening my book 🙏
RT Andy Masley Once again, a story about data centers and water where the author very directly announces they are using a wildly deceptive framing for no reason. Original tweet: https://x.com/AndyMasley/status/2001317098295898395
Interesting (and surprising to me) discovery from one of our Solveit students: it turns out that frontier LLMs (or @AnthropicAI Opus 4.5 at least) can't create nets for platonic solids, when given a simple API. E.g. here's its attempt at a tetrahedron:
RT Conor Rogers Need a word for the phenomenon where people think things used to be nicer because the nice things are the only things from past eras that got preserved or photographed. Original tweet: https://x.com/conorjrogers/status/2000306256217825494
RT hardmaru “Why AGI Will Not Happen” @Tim_Dettmers https://timdettmers.com/2025/12/10/why-agi-will-not-happen/ This essay is worth reading. Discusses diminishing returns (and risks) of scaling. The contrast between West and East: “Winner Takes All” approach of building the biggest thing vs a long-term focus on practicality. “The purpose of this blog post is to address what I see as very sloppy thinking, thinking that is created in an echo chamber, particularly in the Bay Area, where the same ideas amplify themselves without critical awareness. This amplification of bad ideas and thinking exuded by the rationalist and EA movements, is a big problem in shaping a beneficial future for everyone.” “A key problem with ideas, particularly those coming from the Bay Area, is that they often live entirely in the idea space. Most people who think about AGI, superintelligence, scaling laws, and hardware improvements treat these concepts as abstract ideas that can be discussed like philosophical thought experiments. In fact, a lot of the thinking about superintelligence and AGI comes from Oxford-style philosophy. Oxford, the birthplace of effective altruism, mixed with the rationality culture from the Bay Area, gave rise to a strong distortion of how to clearly think about certain ideas.” Original tweet: https://x.com/hardmaru/status/2000038674835128718
Why isn't Tukey more well-known? He's the godfather of data science. Coined the terms "bit", "software", and "exploratory data analysis". Made FFTs usable. And much more… https://en.wikipedia.org/wiki/John_Tukey
RT Simon Willison Tip for the Google Gemini team: if you want to help Google truly get ahead in the AI era, use your hefty influence to get it so setting up API access to your own calendar doesn't involve THESE steps Original tweet: https://x.com/simonw/status/1999670989077250159
Need a Google Calendar CLI that works well with agents? Here you go: https://github.com/badlogic/gccli
RT ARC Prize A year ago, we verified a preview of an unreleased version of @OpenAI o3 (High) that scored 88% on ARC-AGI-1 at est. $4.5k/task Today, we’ve verified a new GPT-5.2 Pro (X-High) SOTA score of 90.5% at $11.64/task This represents a ~390X efficiency improvement in one year Original tweet: https://x.com/arcprize/status/1999182732845547795
RT Lucas Beyer (bl16) Re I know what you mean, but it's really frustrating because Google would NEVER have allowed any of it if they didn't come under very, very intense competition pressure. I remember explicitly choosing to avoid image generation topic in our team (despite being excited and having plenty ideas) because of the massive headache to even talk about it, let alone publish, put out a demo, or, how crazy would that be, model weights. Strong competition is good. Original tweet: https://x.com/giffmana/status/1998502115237355970
This is incorrect. LLMs can call tools to get info about, and make changes to, the outside world, including viewing videos, moving robotic arms, etc. The pic below shows a minimal falsifying example. The value of this number squared is new information—it's never been documented before.
as a reminder: AI cannot generate knowledge. It cannot create knowledge. It cannot find new information. It can only mix information that has already been found and written and input into computers by humans.
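In the same spirit as the falsifying example above (the screenshot itself isn't reproduced here), a toy stand-in, with a random source playing the role of a tool that reads the outside world:

```python
import random

def get_measurement():
    """A 'tool' standing in for the outside world: its output is not in any
    training set, because it didn't exist until this call."""
    return random.randint(10**11, 10**12)

n = get_measurement()
# A fact that has never been documented before this program ran:
print(f"{n} squared is {n * n}")
```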
RT Claire Lehmann Wild that saying “maybe we shouldn’t normalise public executions” is being floated as a “far-left/insane” position by US tech figures. The West stopped doing public executions because we moved beyond barbarism. One would think a founder of a university would know this. Original tweet: https://x.com/clairlemon/status/1997788910072783357
RT tae kim Oh no. It has gotten so bad that even Paul Krugman is opining on GPUs and TPUs. No Paul, the G in GPUs isn't for general. Original tweet: https://x.com/firstadopter/status/1997732138687640000
And this is the easy bit. Tmux is another level again.
Love to see what our Solveiteers are building. Here's a delightful dialog, from which a little girl will be getting special flash cards to help her learn her 1st sight words, in a game of hide & seek. The full process, which you can customise and re-use: https://share.solve.it.com/d/c022c2fe595fe659c6a5fbd4054bb5f3
Here are the hide and seek maps, ready for searching out the words!
WTF. Weren't we promised fewer bots? And yet it's much much worse than ever. Here's a screenshot of replies to my most recent post. What's going on at X? Are they short staffed? All the competent devs left? Wrong priorities? Bad management? I dunno why, but it sure is a mess.
I just released a handy little chrome extension 'clipmd' that lets you click on any element in a web page and copies it to the clipboard converted to markdown (ctrl-shift-m), or as a screenshot (ctrl-shift-s). Handy for LLMs! 😊 https://github.com/AnswerDotAI/clipmd/
Great to see things heading this direction. But be warned: it's Blackwell-only -- i.e. only the very latest, most expensive GPUs. So if you make stuff with this, most people won't be able to use it.
CUDA Tile has shipped! You can now `pip install cuda-tile`. I'm excited to see what y'all will build with it! Docs & resources: https://developer.nvidia.com/cuda/tile GitHub: https://github.com/NVIDIA/cutile-python
Huh TIL Sony has re-used our years-old "PSSR" super-resolution algorithm name to refer instead to their proprietary gaming super-resolution algorithm.
From a lab member (links pasted below): “Googled "PSSR" to pull up the original PSSR paper and got served this AI overview: PlayStation Spectral Super Resolution (PSSR): A Sony technology that uses AI to upscale games to higher resolutions, creating a sharper and more detailed
RT Chris Levy The team at @answerdotai, including @jeremyphoward and @johnowhitaker, are teaching techniques and approaches for working with AI to address exactly this. Check out their “SolveIt” approach and always be on the lookout for their amazing courses. Original tweet: https://x.com/cleavey1985/status/1996877674623782944
My biggest worries about coding with AI: 1. Beginners not actually learning 2. Atrophy of skills I’m seeing #1 happen and I don’t have a good answer yet. Leveling up as an engineer requires grinding and it’s not always fun. If AI can solve most of the problems for you, when
RT Sayash Kapoor CORE-Bench is solved (using Opus 4.5 with Claude Code).

TL;DR: Last week, we released results for Opus 4.5 on CORE-Bench, a benchmark that tests agents on scientific reproducibility tasks. Earlier this week, Nicholas Carlini reached out to share that an updated scaffold that uses Claude Code drastically outperforms the CORE-Agent scaffold we used, especially after fixing a few grading errors. Over the last three days, we validated the results he found, and we are now ready to declare CORE-Bench solved.

Context: We developed the Holistic Agent Leaderboard (HAL) to evaluate AI agents on challenging benchmarks. One of our motivations was that most models are never compared head-to-head — on the same benchmark, with the same environment, and using the same scaffold. We built standard agent scaffolds for each benchmark, which allowed us to independently evaluate models, scaffolds, and benchmarks. CORE-Bench is one of the benchmarks on HAL. It evaluates whether AI agents can reproduce scientific papers when given the code and data from a paper. The benchmark consists of papers from computer science, social science, and medicine. It requires agents to set up the paper's repository, run the code, and correctly answer questions about the paper's results. We manually validated each paper's results for inclusion in the benchmark to avoid impossible tasks.

1. Switching the scaffold to Claude Code nearly doubles the accuracy of Opus 4.5. Our scaffold for this benchmark, CORE-Agent, was built using the HuggingFace smolagents library. This allowed us to easily switch the model we used on the backend to compare performance across models in a standardized way. While CORE-Agent allowed cross-model comparison, when we ran Claude Opus 4.5 using Claude Code, it scored 78%, nearly double the 42% we reported using our standard CORE-Agent scaffold. This is a substantial leap: the best agent with CORE-Agent previously scored 51% (Opus 4.1). Surprisingly, this gap w...
Stop Saying Boredom is Good for Kids
Chronic boredom causes stress, disengagement, and poor well-being in adults. So why do we glorify it for children?
RT Lucas Beyer (bl16) OK, I have to give Jürgen this one. I've seen the Yann video like a million times, but this is the first time I see the Fukushima video, which is strikingly similar, but was 3y earlier. Why? Original tweet: https://x.com/giffmana/status/1995982071425019907
Fukushima's video (1986) shows a CNN that recognises handwritten digits [3], three years before LeCun's video (1989). CNN timeline taken from [5]:
★ 1969: Kunihiko Fukushima published rectified linear units or ReLUs [1], which are now extensively used in CNNs.
★ 1979:
RT Peter Tulip "In medical research, there’s a practice of ending a study early when the results are too striking to ignore. ... When an intervention works this clearly, you change what you do." Original tweet: https://x.com/peter_tulip/status/1995810181960105991
Absolute public health imperative to get safe driverless cars everywhere as fast as possible https://www.nytimes.com/2025/12/02/opinion/self-driving-cars.html?smid=nytcore-ios-share
RT Christian Szegedy Sorry, I don't buy this. AI was on a super clear trajectory in 2020. Quantum computing is at the same point AI was in 1975. Original tweet: https://x.com/ChrSzegedy/status/1995411127061348499
Quantum computing has reached the same point AI was at in 2020. ~ Sundar Pichai said in a recent BBC interview. He expects practical wins in drug discovery, materials, cryptography, and even faster AI training, where quantum subroutines can speed parts of simulation or
Why would anyone disagree with the Democratic People's Republic of Korea? What, are they against democracy and people? Folks, just because you name something a thing, it doesn't make it that thing.
Why would anyone disagree with effective altruism? What, do they support ineffective altruism?
RT Salman // 萨尔曼 Recently finished lesson 2 of the SolveIt course, where @jeremyphoward elegantly unmasks the deceivingly complicated mask "agents" wear. Turns out they're just a simple 4-step loop. Original tweet: https://x.com/ForBo7_/status/1995285440509968728
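A minimal sketch of that four-step loop (all names and types here are mine, not the course's code; a real `llm` would be an API call with tool definitions):

```python
import time
from dataclasses import dataclass, field

@dataclass
class ToolCall:
    name: str
    arguments: dict

@dataclass
class Reply:
    content: str = ""
    tool_calls: list = field(default_factory=list)

def agent(task, llm, tools):
    """1) send context to the model, 2) if it requested no tools we're done,
    3) run each requested tool, 4) append results and loop."""
    messages = [{"role": "user", "content": task}]
    while True:
        reply = llm(messages, tools)                  # step 1
        if not reply.tool_calls:                      # step 2
            return reply.content
        for call in reply.tool_calls:                 # step 3
            result = tools[call.name](**call.arguments)
            messages.append({"role": "tool", "name": call.name,
                             "content": str(result)})
        # step 4: loop back with the tool results in context

# Toy "model" that asks for one tool call, then answers with the result.
def toy_llm(messages, tools):
    if messages[-1]["role"] == "tool":
        return Reply(content=f"The time is {messages[-1]['content']}")
    return Reply(tool_calls=[ToolCall("clock", {})])

print(agent("What time is it?", toy_llm, {"clock": lambda: time.ctime()}))
```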
RT Andy Masley The idea that a field that's 70 years old should have to suddenly contort itself to people mad at AI spotify songs is kind of spiritually offensive to me Original tweet: https://x.com/AndyMasley/status/1995233198964682892
I’m sorry but any description of specialized medical machine learning algorithms as “AI” is just doing PR for OpenAI and other slop generator companies. You’re dragging down public trust in useful ML while giving ChatGPT a veneer of unearned respectability!!
The new @AnthropicAI Opus 4.5 model is absolutely stunningly good at literary analysis. I'm helping a friend with their book and there are times I feel like something is a little off in the writing, and when I ask Opus about it, it always diagnoses it beautifully, with examples.
RT Andy Masley Tweet with 10k likes letting the farmers who poisoned a town's water off the hook because Amazon used a tiny amount of the water, didn't add new pollutants, and their crime is they didn't also evaporate the pollutants before they returned it. Original tweet: https://x.com/AndyMasley/status/1995188488657084444
Amazon vastly accelerated nitrate poisoning of an entire town and refuses to do anything about it
RT anaum whoever started the ‘AI consumes absurd amounts of water’ narrative ended up creating one of the most persistently damaging misconceptions that the public has internalized Original tweet: https://x.com/anaumghori/status/1995006333251453304
We could save soooo many lives but people are so caught up in their reactionary moral panic nonsense
>7k likes on something that is totally mathematically impossible. Is there any way that AI discourse will become sane again? Or is this what we have to deal with from now on?
This one prioritises keeping Australia correctly oriented above all else, which seems reasonable to me.
Is there some agreement between @AnthropicAI and @grok on search now? The Anthropic API today for me started citing Grokipedia sources when using its search tool, even though Wikipedia results are higher in search engines for these queries.
Amusingly, Claude has no idea what Grokipedia is, or why it used it :D (I know this isn't at all surprising - still got a chuckle though…)
RT Tal I created this app for NeurIPS, mostly driven by my own desires to not feel lost 🙂 https://paperjam.ai/neurips25 It’s small but mighty! 🧵 Original tweet: https://x.com/eiopa/status/1994907389187731492
This is utter nonsense. I created the first company to focus on deep learning in radiology (the exact area discussed below). It is, and always has been, part of the field of AI. Now that AI is popular, everyone thinks they're an expert and feels they must have strong opinions
I’m sorry but any description of specialized medical machine learning algorithms as “AI” is just doing PR for OpenAI and other slop generator companies. You’re dragging down public trust in useful ML while giving ChatGPT a veneer of unearned respectability!!
RT near
>amazon homepage. black friday
>scroll entire page
>years of data, every item ever clicked on and bought
>not a single interesting item suggested to me
why
Original tweet: https://x.com/nearcyan/status/1994549753892491422
I didn't believe this was real, so I looked into it. It is real. It's actually worse than it first looks. Definitely supports claims from @ziglang and @theo that GH Actions is a sad, neglected platform. Read on for a little software archeology…🧵
GitHub's official "safe sleep" script:
>is not safe
>does not even sleep
Microsoft just can't stop losing
RT Rebekah Jones The suicide rate among married women in 1950 was 22.1 per 100k. Today it is 6.9 per 100k. After the passage of no-fault divorce in 1970s and 1980s, suicide rates dropped 20%, there was a 10% decline in women murdered by their partners, and rates of other kinds of domestic violence against women dropped 30%. https://now.org/blog/threats-to-no-fault-divorce-and-its-implications-for-violence-against-women/ https://www.aclusd.org/news/attacks-no-fault-divorce-are-dangerous-especially-those-experiencing-domestic-violence/ Original tweet: https://x.com/GeoRebekah/status/1994441776854138962
RT Scott H. Hawley 🛶 New blog tutorial: "Flow Where You Want" Want to steer pretrained flow models without retraining? I spent months simplifying guidance methods: intuitive visuals, accessible math, & runnable code -- it's a Colab! 🔗 below. (btw that's my kayak) Original tweet: https://x.com/drscotthawley/status/1994250175133409342
When accuracy gets well above 50%, it's often of more practical importance to understand the change in *error rate*. So instead of @AnthropicAI's chart crime with the truncated axis, they could have shown it like so:
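The arithmetic of the point, with hypothetical numbers (not Anthropic's actual figures): a 2-point accuracy gain near the top of the scale can mean the error rate was cut in half.

```python
# Hypothetical: accuracy goes from 96% to 98%. "+2 points" sounds small,
# but the error rate halves, 4% -> 2%.
a_old, a_new = 0.96, 0.98
err_old, err_new = 1 - a_old, 1 - a_new
print(f"accuracy gain: {a_new - a_old:.0%}")                 # 2%
print(f"error-rate reduction: {1 - err_new / err_old:.0%}")  # 50%
```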
Should have done this a long time ago. They used to have some useful info, but turned into a slop-machine a while back. It's a shame. We could use some actual thoughtful researchers in this area.
PSA: Never ever listen to these guys re software. They clearly never actually use the software they talk about so they're just repeating stuff they've heard somewhere. Replace Nsight with pdb?!? That's like replacing a pencil with a glass of apple juice. Not the same thing!
NVIDIA's Nsight Systems profiler is one of the most heavyweight system-level profilers for NVIDIA GPUs, which can generate massive trace dumps sometimes reaching terabytes in size. For simple debugging tasks, it's absolute overkill. End users should instead consider using
It's weird how little people talk about gpt-4.1. Such a good model!
@jeremyphoward @pvncher @jerhadf @OpenAI We are still using gpt-4.1 for many tasks where reasoning is not needed. It's the best non-thinking model ever from OpenAI.
We just decided to give all students in our "How to Solve it With Code" course free access to Opus 4.5 for the rest of this year. Although the course has started already, you can still sign up and catch up with the recordings here: https://solve.it.com/
14MB RAM / 9MB disk (MB, *not* GB!) to index all of Windows 10, in 1 second. Index stays updated automatically. It's amazing what's possible with a modern computer if you actually care about engineering. https://www.voidtools.com/
This is great! (Much less good than using Solveit ofc -- but still great :D )
Geoffrey Litt: I cannot emphasize enough how much I prefer this "tutorial doc + build-it-yourself" coding workflow to the typical "ugh" feeling of reviewing huge agent PRs. You can try it right now and see for yourself: 1) Instead of having Claude Code make a PR, ask it to output a Markdown Link: https://x.com/geoffreylitt/status/1991909304085987366
Great, now people are asking our Discord bot about *my* opinions on Tudor history…
Deep Thrill: Collecting the best pro-Elon grok posts from the day in this thread: Link: https://x.com/DeeperThrill/status/1991616527372808495
Who could have guessed what would happen when the world's richest man bought a giant social network and stuck an AI in it…
this is amazing
Grok: @hfredguy @steinkobbe The body in the meme is overweight and soft, characterized by excess abdominal fat, minimal muscle definition, and a sedentary posture. It's unhealthy, signaling potential risks like metabolic issues from inactivity and poor diet. Such neglect suggests limited physical skills or Link: https://x.com/grok/status/1991277840730628156
"best open-weight LLM by a US company" It's a Chinese LLM, fine-tuned by a US company.Drishan Arora: Today, we are releasing the best open-weight LLM by a US company: Cogito v2.1 671B. On most industry benchmarks and our internal evals, the model performs competitively with frontier closed and open models, while being ahead of any US open model (such as the best versions of Link: https://x.com/drishanarora/status/1991204769642475656
There are folks out there who, due to their limited education and intellect, are unable to understand basic principles of how science advances, impacts and improves the world that we humans build for ourselves. So instead, they lash out at scientists.
Tartarian Empire: @juliet_turner6 People are not seething because you earned a PhD. They are seething because, after all the self congratulation and the glitter emojis, they discovered the grand achievement you are flaunting is… about ants. Most people can accept someone celebrating medical research, Link: https://x.com/SanguineChester/status/1990824290791666110
That aged well.
xAI: Grok 4.1 claims the #1 spot on the @arena leaderboard at 1483 Elo — a commanding 31 points above the nearest non-xAI model. Link: https://x.com/xai/status/1990530501237715372
Fantastic example of the power of FastHTML to create really practical web applications:
Alonso Astroza 🤖: Today we at @DataScienceUDD are launching https://shiaaa.cl, a community platform to correct transcripts of Chilean Spanish audio. Why? Because current speech recognition systems simply do not understand how people in Chile actually speak. Link: https://x.com/aastroza/status/1990849222921355311
Looks like the lobbying for regulatory capture to ensure lock-in of profits to the private sector is working. :(
Chris Murphy 🟧: Guys wake the f up. This is going to destroy us - sooner than we think - if we don’t make AI regulation a national priority tomorrow. Link: https://x.com/ChrisMurphyCT/status/1989120215171625149
RT Marius Vach Here is the solveit dialog implementing RLMs by @a1zhang using `lisette` and `toolslm` by @answerdotai (h/t @jeremyphoward)
Alex L Zhang: What if scaling the context windows of frontier LLMs is much easier than it sounds? We’re excited to share our work on Recursive Language Models (RLMs). A new inference strategy where LLMs can decompose and recursively interact with input prompts of seemingly unbounded length, Link: https://x.com/a1zhang/status/1978469116542337259
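A drastically simplified sketch of the recursive intuition (my own toy scheme, not the paper's actual method, which lets the model inspect the prompt programmatically): if the input doesn't fit the window, recurse on halves and then on the combined notes.

```python
# `llm` is any string -> string completion function; with a real model the
# recursive summaries shrink the input, so the recursion bottoms out.
def rlm(prompt, llm, max_len=4000):
    if len(prompt) <= max_len:
        return llm(prompt)                     # base case: fits the window
    mid = len(prompt) // 2
    left = rlm("Summarize the key facts:\n" + prompt[:mid], llm, max_len)
    right = rlm("Summarize the key facts:\n" + prompt[mid:], llm, max_len)
    # Recurse on the (much shorter) combined notes instead of the raw input.
    return rlm("Notes:\n" + left + "\n" + right + "\nNow answer the question.",
               llm, max_len)
```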
I agree with @dileeplearning
Dileep George: I agree with @ylecun Link: https://x.com/dileeplearning/status/1988320699493339284
RT Antonio Sarosi Porting decades-old programs to Rust is dumb. They already fixed countless bugs / security issues; spend the time fixing remaining issues instead of starting all over again and re-introducing them. Memory safety doesn't prevent dumb logical errors. Use Rust for new software.
The Lunduke Journal: Multiple, serious security vulnerabilities found in the Rust clone of Sudo — which shipped with Ubuntu 25.10 (the most recent release). Not little vulnerabilities: We’re talking about the disclosure of passwords and total bypassing of authentication. In fact, we’re getting new Link: https://x.com/LundukeJournal/status/1988346904581726501
RT Micah Goldblum 🚨We converted pretrained LLMs into looped LLMs that can crank up performance by looping for more iterations. Our looped models surpass the performance of the pretrained models we started out with, showing that existing models benefit from increased computational depth. 📜1/9
RT Sully noticed a pretty worrying trend the more i use llms my day to day skills are slowly atrophying and im relying more and more on models for even simple tasks happening for coding, writing etc sometimes i dont even want to do the task if theres no ai to help
RT Rachel Thomas Re "People who go all in on AI agents now are guaranteeing their obsolescence. If you outsource all your thinking to computers, you stop upskilling, learning, and becoming more competent. AI is great at helping you learn." @jeremyphoward @NVIDIAAI https://www.youtube.com/watch?v=zDkHJDgefyk 2/
RT Rachel Thomas TensorFlow was all about making it easier for computers. PyTorch *won* because it was about making it easier for humans. It’s disappointing to see AI community focusing on what’s easiest for machines again (prioritizing AI agents & not centering humans). -- @jeremyphoward 1/
Interesting bot account, this one. Check out the posting history. Wonder who is organizing this, and why.
Michael: @jeremyphoward @Kimi_Moonshot Don’t fall for China propaganda. You don’t need H100s when your model’s trained on distilled U.S. knowledge — cheap A800s, H800s, even RTX 4090s can handle that just fine using plain old BF16 ops. But when that distilled-knowledge faucet closes in 2026, what then? Link: https://x.com/letsgomike888/status/1987675934548471993
RT 張小珺 Xiaojùn If you are interested in Kimi K2 thinking, you can check out this interview with Yang Zhilin, founder of Kimi (with Chinese and English bilingual subtitles): https://youtu.be/91fmhAnECVc?si=AKNvfeNvxvfYF7fF
RT Shekswess It’s frustrating how labs like @Kimi_Moonshot, @Alibaba_Qwen, @deepseek_ai, @allen_ai, @huggingface... share their research, pipelines, and lessons openly, only for closed-source labs to quietly use that knowledge to build better models without ever giving back.
RT “paula” the tweet
Internal Tech Emails: Sam Altman texts Shivon Zilis February 9, 2023 Link: https://x.com/TechEmails/status/1987199248732180862