Creator @datasetteproj, co-creator Django. PSF board. Hangs out with @natbat. He/Him. Mastodon: https://t.co/t0MrmnJW0K Bsky: https://t.co/OnWIyhX4CH
I am increasingly worried about AI in the video game space in general. [...] I'm not sure that the CEOs and the people making the decisions at these sorts of companies understand the difference between actual content and slop. [...] It's exactly the same cryolab, it's exactly the same robot factory place on all of these different planets. It's like there's so much to explore and nothing to find. [...] And what was in this contraband chest was a bunch of harvested organs. And I'm like, oh, wow. If this was an actual game that people cared about the making of, this would be something interesting - an interesting bit of environmental storytelling. [...] But it's not, because it's just a cold, heartless, procedurally generated slop. [...] Like, the point of having a giant open world to explore isn't the size of the world or the amount of stuff in it. It's that all of that stuff, however much there is, was made by someone for a reason. — Felix Nolan, TikTok about AI and proced...
It's ChatGPT's third birthday today. It's fun looking back at Sam Altman's low-key announcement thread from November 30th 2022: today we launched ChatGPT. try talking with it here: chat.openai.com language interfaces are going to be a big deal, i think. talk to the computer (voice or text) and get what you want, for increasingly complex definitions of "want"! this is an early demo of what's possible (still a lot of limitations--it's very much a research release). [...] We later learned from Forbes in February 2023 that OpenAI nearly didn't release it at all: Despite its viral success, ChatGPT did not impress employees inside OpenAI. “None of us were that enamored by it,” Brockman told Forbes. “None of us were like, ‘This is really useful.’” This past fall, Altman and company decided to shelve the chatbot to concentrate on domain-focused alternatives instead. But in November, after those alternatives failed to catch on internally—and as tools like Stable Diffusion caused the...
RT Raiza Martin Happy 3rd birthday, ChatGPT. I was at Google then, leading two products: AI Test Kitchen and NotebookLM (there was no Bard/Gemini yet). We had launched AITK much earlier than ChatGPT, but it was a heavily constrained experience in comparison. It was a cool demo of LaMDA but that was all: a demo. On the other hand, NotebookLM was in its infancy. We’d just done an internal launch (dogfood) and still very much felt like a 20% project. It was difficult to land the concept broadly with people because, well, you needed the concept of LLMs to land first - and it was just so very early. The only people that really “got it” in those early days were students (and that gave me enough conviction to keep going). When ChatGPT arrived, I felt a sense of getting beat - their simple, open ended demo was just way more viral than AITK, and we’d spent so much energy getting it out the door. I knew I’d missed the mark. In that moment I felt dejected but honestly looking back, it was the best thing that could’ve happened. ChatGPT led to such a fast and broad awareness of AI that it made it easier to convey the value of applications built on top of LLMs. It also gave me this competitive drive - the game can’t possibly be over, right? There’s hundreds, if not thousands more of apps to be built, and it lit this fire in me to just keep exploring, tinkering, building. 3 years later here we are - and while the world largely still only has a handful of (consumer) AI apps, I do think that’s about to meaningfully change. Happy birthday, ChatGPT, thanks for changing the world. Original tweet: https://x.com/raizamrtn/status/1995183434541715956
today we launched ChatGPT. try talking with it here: http://chat.openai.com
The most annoying problem is that the [GitHub] frontend barely works without JavaScript, so we cannot open issues, pull requests, source code or CI logs in Dillo itself, despite them being mostly plain HTML, which I don't think is acceptable. In the past, it used to gracefully degrade without enforcing JavaScript, but now it doesn't. — Rodrigo Arias Mallo, Migrating Dillo from GitHub Tags: browsers, progressive-enhancement, github
Context plumbing Matt Webb coins the term context plumbing to describe the kind of engineering needed to feed agents the right context at the right time: Context appears at disparate sources, by user activity or changes in the user’s environment: what they’re working on changes, emails appear, documents are edited, it’s no longer sunny outside, the available tools have been updated. This context is not always where the AI runs (and the AI runs as close as possible to the point of user intent). So the job of making an agent run really well is to move the context to where it needs to be. [...] So I’ve been thinking of AI system technical architecture as plumbing the sources and sinks of context. Tags: definitions, matt-webb, ai, generative-ai, llms, ai-agents, context-engineering
Large language models (LLMs) can be useful tools, but they are not good at creating entirely new Wikipedia articles. Large language models should not be used to generate new Wikipedia articles from scratch. — Wikipedia content guideline, promoted to a guideline on 24th November 2025 Tags: ai-ethics, slop, generative-ai, wikipedia, ai, llms
Out of curiosity I decided to try and run the numbers on how much Netflix you can watch for the energy cost of a ChatGPT prompt. As far as I can tell it's between 5.1 and 10.2 seconds, depending on which end of the 2019 IEA Netflix energy usage estimate you use
In June 2025 Sam Altman claimed about ChatGPT that "the average query uses about 0.34 watt-hours". In March 2020 George Kamiya of the International Energy Agency estimated that "streaming a Netflix video in 2019 typically consumed 0.12-0.24kWh of electricity per hour" - that's 240 watt-hours per hour at the higher end. Assuming that higher end, a ChatGPT prompt by Sam Altman's estimate uses: 0.34 Wh / (240 Wh / 3600 seconds) = 5.1 seconds of Netflix Or double that, 10.2 seconds, if you take the lower end of the Netflix estimate instead. I'm always interested in anything that can help contextualize a number like "0.34 watt-hours" - I think this comparison to Netflix is a neat way of doing that. This is evidently not the whole story with regards to AI energy usage - training costs, data center buildout costs and the ongoing fierce competition between the providers all add up to a very significant carbon footprint for the AI industry as a whole. (I got some help from ChatGPT to di...
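The arithmetic above can be sketched in a few lines of Python. This is just the back-of-envelope calculation from the post, using the two figures it quotes (0.34 Wh per prompt, 0.12-0.24 kWh per streaming hour):

```python
PROMPT_WH = 0.34  # watt-hours per ChatGPT prompt (Altman, June 2025)

def netflix_seconds(prompt_wh: float, streaming_kwh_per_hour: float) -> float:
    """Seconds of Netflix streaming that use the same energy as one prompt."""
    wh_per_second = streaming_kwh_per_hour * 1000 / 3600  # kWh/hour -> Wh/second
    return prompt_wh / wh_per_second

print(round(netflix_seconds(PROMPT_WH, 0.24), 1))  # high-end estimate -> 5.1
print(round(netflix_seconds(PROMPT_WH, 0.12), 1))  # low-end estimate -> 10.2
```

Note both inputs are rough public estimates, not measurements, so the output inherits all of their uncertainty.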
I've been having a bunch of fun recently vibe coding a custom thread viewer against Bluesky, which has a CORS-enabled, authentication-free JSON API that's really fun to hit from plain HTML and JavaScript written by various LLM tools https://simonwillison.net/2025/Nov/28/bluesky-thread-viewer/
Bluesky Thread Viewer thread by @simonwillison.net I've been having a lot of fun hacking on my Bluesky Thread Viewer JavaScript tool with Claude Code recently. Here it renders a thread (complete with demo video) talking about the latest improvements to the tool itself. I've been mostly vibe-coding this thing since April, now spanning 15 commits with contributions from ChatGPT, Claude, Claude Code for Web and Claude Code on my laptop. Each of those commits links to the transcript that created the changes in the commit. Bluesky is a lot of fun to build tools like this against because the API supports CORS (so you can talk to it from an HTML+JavaScript page hosted anywhere) and doesn't require authentication. Tags: projects, tools, ai, generative-ai, llms, cors, bluesky, vibe-coding, coding-agents, claude-code
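To illustrate why the Bluesky API is so pleasant to build against, here is a minimal sketch of constructing a request to the public AT Protocol thread endpoint. The endpoint name and parameters (`app.bsky.feed.getPostThread` on `public.api.bsky.app`, with `uri` and `depth`) reflect Bluesky's public API as I understand it; treat this as an assumption about the setup, not the thread viewer's actual code:

```python
from urllib.parse import urlencode

# Bluesky's public read API: CORS-enabled and no authentication required,
# so a static HTML+JS page hosted anywhere can call it directly.
BASE = "https://public.api.bsky.app/xrpc/app.bsky.feed.getPostThread"

def thread_url(at_uri: str, depth: int = 10) -> str:
    """Build a getPostThread request URL for a post's at:// URI."""
    return BASE + "?" + urlencode({"uri": at_uri, "depth": depth})

# In the browser this would just be fetch(thread_url(...)) - no API key,
# no OAuth dance, which is what makes vibe-coded tools like this so easy.
```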
To evaluate the model’s capability in processing long-context inputs, we construct a video “Needle-in-a-Haystack” evaluation on Qwen3-VL-235B-A22B-Instruct. In this task, a semantically salient “needle” frame—containing critical visual evidence—is inserted at varying temporal positions within a long video. The model is then tasked with accurately locating the target frame from the long video and answering the corresponding question. [...] As shown in Figure 3, the model achieves a perfect 100% accuracy on videos up to 30 minutes in duration—corresponding to a context length of 256K tokens. Remarkably, even when extrapolating to sequences of up to 1M tokens (approximately 2 hours of video) via YaRN-based positional extension, the model retains a high accuracy of 99.5%. — Qwen3-VL Technical Report, 5.12.3: Needle-in-a-Haystack Tags: vision-llms, evals, generative-ai, ai-in-china, ai, qwen, llms
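The two figures in that quote are roughly consistent with each other, assuming token usage scales linearly with video duration:

```python
# 30 minutes of video ~ 256K tokens implies ~8,533 tokens per minute,
# so 1M tokens corresponds to roughly two hours of video, matching
# the report's "approximately 2 hours" extrapolation figure.
tokens_per_minute = 256_000 / 30
hours_at_1m_tokens = 1_000_000 / tokens_per_minute / 60
print(round(hours_at_1m_tokens, 2))  # -> 1.95 hours
```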
Re @Jhaddix @tristanbob There's an extra section in that post about the terminology confusion https://simonwillison.net/2025/Jun/16/the-lethal-trifecta/#this-is-an-example-of-the-prompt-injection-class-of-attacks
deepseek-ai/DeepSeek-Math-V2 New on Hugging Face, a specialist mathematical reasoning LLM from DeepSeek. This is their entry in the space previously dominated by proprietary models from OpenAI and Google DeepMind, both of which achieved gold medal scores on the International Mathematical Olympiad earlier this year. We now have an open weights (Apache 2 licensed) 685B, 689GB model that can achieve the same. From the accompanying paper: DeepSeekMath-V2 demonstrates strong performance on competition mathematics. With scaled test-time compute, it achieved gold-medal scores in high-school competitions including IMO 2025 and CMO 2024, and a near-perfect score on the undergraduate Putnam 2024 competition. Tags: mathematics, ai, llms, llm-reasoning, deepseek, llm-release, ai-in-china