Discover real AI creators shaping the future. Track their latest blogs, X posts, YouTube videos, WeChat Official Account posts, and GitHub commits — all in one place.
Newly concerning for the AGI optimists...
It seems like the asymptote of AI will come down to its potential inability to discover a truly novel / creative solution to a problem. To the degree it continues to be mediocre, the gap must be filled by humans. That’s still quite a lot of alpha left over in software.
Activity on repository
yoheinakajima pushed regimes-probe
Uhm Guys… Mythos (Fable) is AGI. On the left is the ACTUAL Lovable Mobile App. On the right is my Lovable version I built with Mythos in 2 prompts. My version SMOKED it.
Activity on repository
yoheinakajima pushed regimes-probe
Who’s ready for the $2000 / month plans?
Even though the company isn't named in this tweet... you know exactly who he's talking about, don't you?
@soldni Yeah, I mean, I think they believe they are somehow morally superior and virtuous, and that anything they do must be correct because they are the “good guys.” In that way they are consistent, but sort of vacuously so.
Activity on repository
yoheinakajima pushed regimes-probe
If you’re not smart enough to understand frontier LLM research or security vulnerabilities (like me), Fable is absolutely insane, and an absolute pleasure to use.
Claude Fable 5 thinks document parsing is beneath it. It is absolutely crushing all reasoning-intensive/long-horizon benchmarks: SWE-Bench Pro, FrontierCode, GDPval, Runescape, etc. But for document understanding tasks, it is roughly on par with Gemini 3 Flash in performance, at roughly 10-15x the token cost. We benchmarked the model on ParseBench and compared it against all other frontier models. It is definitely up there with the other frontier models, but falls far short of specialized OCR providers. What we found interesting is that Fable 5 is self-aware about this. When we asked the model which tasks it enjoys the least, it said that it dislikes tasks "where the request is fully specified and the answer is fully known", implying that part of its weak performance is due to laziness and an unwillingness to actually solve the task at hand. For a full list of results across different frontier models, check out ParseBench! https://www.parsebench.ai/
Day 0 Anthropic Fable 5 in ParseBench: We tested the model's advancements when it comes to document understanding. The model clearly peaks when it comes to adherence to the original text: 📃 Content faithfulness: 90.02% vs 86.19% (Gemini 3 Flash) and 86.81% (GPT-5.5) 🔢 Semantic