Follow AI builders

AI builder activity aggregator | Follow the creators, not the influencers

© 2026 Follow AI builders All Rights Reserved.
SR

Sebastian Raschka

0 followers
864 posts
39 posts in the last 7 days

Bio

ML/AI research engineer. Ex stats professor. Author of "Build a Large Language Model From Scratch" (https://t.co/O8LAAMRzzW) & reasoning (https://t.co/5TueQKx2Fk)

Platforms

𝕏 Sebastian Raschka

Content history

SR
Sebastian Raschka
⚡github•about 17 hours ago

Activity on repository

rasbt pushed reasoning-from-scratch


View on GitHub
SR
Sebastian Raschka
⚡github•about 20 hours ago

Activity on repository

rasbt deleted


View on GitHub
SR
Sebastian Raschka
⚡github•about 20 hours ago

Activity on repository

rasbt pushed reasoning-from-scratch


View on GitHub
SR
Sebastian Raschka
⚡github•about 20 hours ago

Activity on rasbt/reasoning-from-scratch

rasbt opened a pull request in reasoning-from-scratch


View on GitHub
SR
Sebastian Raschka
⚡github•about 20 hours ago

Activity on repository

rasbt created a branch


View on GitHub
SR
Sebastian Raschka
⚡github•about 21 hours ago

Activity on repository

rasbt pushed reasoning-from-scratch


View on GitHub
SR
Sebastian Raschka
⚡github•about 21 hours ago

Activity on repository

rasbt deleted


View on GitHub
SR
Sebastian Raschka
⚡github•about 21 hours ago

Activity on repository

rasbt pushed reasoning-from-scratch


View on GitHub
SR
Sebastian Raschka
⚡github•about 21 hours ago

Activity on rasbt/reasoning-from-scratch

rasbt opened a pull request in reasoning-from-scratch


View on GitHub
SR
Sebastian Raschka
⚡github•about 21 hours ago

Activity on repository

rasbt created a branch


View on GitHub
SR
Sebastian Raschka
⚡github•1 day ago

Activity on repository

rasbt deleted


View on GitHub
SR
Sebastian Raschka
⚡github•1 day ago

Activity on repository

rasbt pushed reasoning-from-scratch


View on GitHub
SR
Sebastian Raschka
⚡github•1 day ago

Activity on repository

rasbt deleted


View on GitHub
SR
Sebastian Raschka
⚡github•1 day ago

Activity on repository

rasbt pushed LLMs-from-scratch


View on GitHub
SR
Sebastian Raschka
⚡github•1 day ago

Activity on rasbt/LLMs-from-scratch

rasbt opened a pull request in LLMs-from-scratch


View on GitHub
SR
Sebastian Raschka
⚡github•1 day ago

Activity on repository

rasbt created a branch


View on GitHub
SR
Sebastian Raschka
⚡github•1 day ago

Activity on rasbt/reasoning-from-scratch

rasbt opened a pull request in reasoning-from-scratch


View on GitHub
SR
Sebastian Raschka
⚡github•1 day ago

Activity on repository

rasbt created a branch


View on GitHub
SR
Sebastian Raschka
𝕏x•1 day ago

Claude distillation has been a big topic this week while I am (coincidentally) writing Chapter 8 on model distillation. In that context, I shared some utilities to generate distillation data from all sorts of open-weight models via OpenRouter and Ollama: https://github.com/rasbt/reasoning-from-scratch/blob/main/ch08/02_generate_distillation_data/README.md

View on X
SR
Sebastian Raschka
⚡github•1 day ago

Activity on repository

rasbt pushed reasoning-from-scratch


View on GitHub
SR
Sebastian Raschka
⚡github•1 day ago

Activity on repository

rasbt pushed reasoning-from-scratch


View on GitHub
SR
Sebastian Raschka
⚡github•2 days ago

Activity on repository

rasbt deleted


View on GitHub
SR
Sebastian Raschka
⚡github•2 days ago

Activity on repository

rasbt pushed reasoning-from-scratch


View on GitHub
SR
Sebastian Raschka
⚡github•2 days ago

Activity on repository

rasbt pushed reasoning-from-scratch


View on GitHub
SR
Sebastian Raschka
⚡github•2 days ago

Activity on rasbt/reasoning-from-scratch

rasbt opened a pull request in reasoning-from-scratch


View on GitHub
SR
Sebastian Raschka
⚡github•2 days ago

Activity on repository

rasbt created a branch


View on GitHub
SR
Sebastian Raschka
𝕏x•4 days ago

http://x.com/i/article/2026503653893222402

View on X
SR
Sebastian Raschka
𝕏x•5 days ago

Memorization & distillation. Two sides of the same scaling coin.

@Ivan Fioravanti ᯅ

We extract nearly all (95.8%) of Harry Potter and the Sorcerer's Stone from Claude Sonnet 🤷🏻‍♂️

View quoted post
View on X
SR
Sebastian Raschka
𝕏x•6 days ago

Am currently putting together an article, and yeah, the SWE-Bench Verified numbers are definitely a bit sus across all models -- the benchmark suggests they are more similar than they really are. So, I went down a rabbit hole looking into SWE-Bench Verified issues... And it looks like OpenAI already did really nice work there in their "Why SWE-bench Verified no longer measures frontier coding capabilities" analysis: https://openai.com/index/why-we-no-longer-evaluate-swe-bench-verified/

The gist is:

1. After auditing 27.6% of frequently failed tasks, at least 59.4% had flawed tests that reject correct solutions.
2. Since SWE-Bench draws from widely used open-source repos, frontier models sometimes reproduced the exact "gold patch" or problem details, which suggests data leakage. (Probably a "duh" given that the dataset has been out since 2023.)

Long story short, SWE-Bench Pro seems to be a bit of an improvement (for now).

View on X
SR
Sebastian Raschka
⚡github•6 days ago

Activity on repository

rasbt deleted


View on GitHub
SR
Sebastian Raschka
⚡github•6 days ago

Activity on repository

rasbt pushed reasoning-from-scratch


View on GitHub
SR
Sebastian Raschka
⚡github•6 days ago

Activity on rasbt/reasoning-from-scratch

rasbt opened a pull request in reasoning-from-scratch


View on GitHub
SR
Sebastian Raschka
⚡github•6 days ago

Activity on repository

rasbt created a branch


View on GitHub
SR
Sebastian Raschka
⚡github•7 days ago

Activity on repository

rasbt deleted


View on GitHub
SR
Sebastian Raschka
⚡github•7 days ago

Activity on repository

rasbt pushed reasoning-from-scratch


View on GitHub
SR
Sebastian Raschka
⚡github•7 days ago

Activity on repository

rasbt pushed reasoning-from-scratch


View on GitHub
SR
Sebastian Raschka
⚡github•7 days ago

Activity on repository

rasbt pushed reasoning-from-scratch


View on GitHub
SR
Sebastian Raschka
⚡github•7 days ago

Activity on rasbt/reasoning-from-scratch

rasbt opened a pull request in reasoning-from-scratch


View on GitHub
SR
Sebastian Raschka
⚡github•7 days ago

Activity on repository

rasbt created a branch


View on GitHub
SR
Sebastian Raschka
⚡github•8 days ago

Activity on repository

rasbt pushed reasoning-from-scratch


View on GitHub
SR
Sebastian Raschka
⚡github•8 days ago

Activity on repository

rasbt deleted


View on GitHub
SR
Sebastian Raschka
⚡github•8 days ago

Activity on rasbt/reasoning-from-scratch

rasbt opened a pull request in reasoning-from-scratch


View on GitHub
SR
Sebastian Raschka
⚡github•8 days ago

Activity on repository

rasbt created a branch


View on GitHub
SR
Sebastian Raschka
𝕏x•8 days ago
Retweeted from @paulopacitti

RT paulopacitti 🌐: Some things are quite fundamental, but so wholesome to remember. The dot product just checks how aligned vectors are and, in the context of LLMs, how similar tokens are to each other. From @rasbt's amazing book.

Original tweet: https://x.com/paulopacitti/status/2025214154236137645
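To make the dot-product intuition concrete, here is a minimal sketch in plain Python. The three-dimensional "embeddings" are made-up values for illustration only; real token embeddings are learned and have hundreds or thousands of dimensions.

```python
import math

def dot(u, v):
    """Sum of element-wise products of two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return sum(a * b for a, b in zip(u, v))

def cosine_similarity(u, v):
    """Dot product divided by both vector lengths, so the result is in [-1, 1]."""
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))

# Hypothetical toy embeddings (values invented for this example)
embeddings = {
    "cat": [1.0, 0.5, 0.0],
    "dog": [0.9, 0.6, 0.1],
    "car": [0.0, 0.1, 1.0],
}

for word in ("dog", "car"):
    score = dot(embeddings["cat"], embeddings[word])
    print(f"cat . {word} = {score:.2f}")
```

Vectors pointing in a similar direction ("cat" and "dog" here) get a larger dot product than vectors pointing in different directions ("cat" and "car"). Attention scores are built from dot products of this kind, between query and key vectors.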

View on X
SR
Sebastian Raschka
𝕏x•9 days ago

February is one of those months...

- Moonshot AI's Kimi K2.5 (Feb 2)
- z. AI GLM 5 (Feb 12)
- MiniMax M2.5 (Feb 12)
- ByteDance Seed-2.0 (Feb 13)
- Nanbeige 4.1 3B (Feb 13)
- Qwen 3.5 (Feb 15)
- Cohere's Tiny Aya (Feb 17)

(+Hopefully DeepSeek V4 soon)

Anything I forgot?

View on X
SR
Sebastian Raschka
𝕏x•10 days ago

Tiny Aya reimplementation From Scratch!

Have been reading through the technical reports of the recent wave of open-weight LLM releases (more on that soon). Tiny Aya (2 days ago) was a bit under the radar. Looks like a nice, small 3.35B model with the strongest multilingual support of that size class. Great for on-device translation tasks.

Just did a from-scratch implementation here: https://github.com/rasbt/LLMs-from-scratch/blob/main/ch05/15_tiny-aya/standalone-tiny-aya-plus-kv-cache.ipynb

Architecture-wise, Tiny Aya is a classic decoder-style transformer with a few noteworthy modifications (besides the obvious ones like SwiGLU and Grouped Query Attention):

1. Parallel transformer blocks. A parallel transformer block computes attention and MLP from the same normalized input, then adds both to the residual in one step. I assume this is to reduce serial dependencies inside a layer to improve computational throughput.
2. Sliding window attention. Specifically, it uses a 3:1 local:global ratio similar to Arcee Trinity and Olmo 3. The window size is also 4096. Also, similar to Arcee, the sliding window layers use RoPE whereas the full attention layers use NoPE.
3. LayerNorm. Most architectures moved to RMSNorm as it's computationally a bit cheaper and performs well. Tiny Aya is keeping it more classic with a modified version of LayerNorm (the implementation here is like standard LayerNorm but without the shift, i.e., bias, parameter).
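The three modifications can be sketched in a few lines of dependency-free Python. This is an illustrative toy under stated assumptions, not the code from the linked notebook: `toy_attn` and `toy_mlp` are hypothetical stand-ins for the learned sublayers, the position of the global layer inside each group of four is assumed to be last, the window is taken to include the current token, and a tiny window replaces the 4096 mentioned in the post.

```python
import math

def layer_norm_no_bias(x, weight, eps=1e-5):
    """LayerNorm with a learned scale but no shift (bias) parameter."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    inv_std = 1.0 / math.sqrt(var + eps)
    return [w * (v - mean) * inv_std for v, w in zip(x, weight)]

def parallel_block(x, attn, mlp, weight):
    """Attention and MLP read the same normalized input; both outputs
    are added to the residual stream in a single step."""
    h = layer_norm_no_bias(x, weight)
    return [xi + ai + mi for xi, ai, mi in zip(x, attn(h), mlp(h))]

def layer_types(num_layers, local_per_global=3):
    """3:1 schedule: three sliding-window layers, then one global layer
    (placing the global layer last in each group is an assumption)."""
    period = local_per_global + 1
    return ["global" if (i + 1) % period == 0 else "local"
            for i in range(num_layers)]

def attention_mask(seq_len, window=None):
    """mask[q][k] is True if query position q may attend to key position k.
    With a window, each query sees only its `window` most recent tokens."""
    return [[k <= q and (window is None or q - k < window)
             for k in range(seq_len)]
            for q in range(seq_len)]

# Hypothetical stand-ins for the learned attention and MLP sublayers
def toy_attn(h):
    return [0.5 * v for v in h]

def toy_mlp(h):
    return [max(0.0, v) for v in h]

x = [1.0, 2.0, 3.0, 4.0]
ones = [1.0] * len(x)
out = parallel_block(x, toy_attn, toy_mlp, ones)

print(layer_types(8))
print(attention_mask(4, window=2)[3])
```

In a sequential block the MLP has to wait for the attention output. Here both sublayers depend only on `h`, which is the reduced serial dependency the post refers to. Following the post, the `"local"` layers would apply RoPE and the `"global"` layers would use no positional encoding.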

View on X
SR
Sebastian Raschka
⚡github•10 days ago

Activity on repository

rasbt pushed LLMs-from-scratch


View on GitHub
SR
Sebastian Raschka
⚡github•10 days ago

Activity on repository

rasbt pushed LLMs-from-scratch


View on GitHub
SR
Sebastian Raschka
⚡github•10 days ago

Activity on repository

rasbt deleted


View on GitHub
SR
Sebastian Raschka
⚡github•10 days ago

Activity on repository

rasbt pushed LLMs-from-scratch


View on GitHub
SR
Sebastian Raschka
⚡github•10 days ago

Activity on rasbt/LLMs-from-scratch

rasbt opened a pull request in LLMs-from-scratch


View on GitHub
SR
Sebastian Raschka
⚡github•10 days ago

Activity on repository

rasbt created a branch


View on GitHub
SR
Sebastian Raschka
⚡github•10 days ago

Activity on repository

rasbt deleted


View on GitHub
SR
Sebastian Raschka
⚡github•10 days ago

Activity on repository

rasbt pushed LLMs-from-scratch


View on GitHub
SR
Sebastian Raschka
⚡github•10 days ago

Activity on rasbt/LLMs-from-scratch

rasbt opened a pull request in LLMs-from-scratch


View on GitHub
SR
Sebastian Raschka
⚡github•10 days ago

Activity on repository

rasbt created a branch


View on GitHub
SR
Sebastian Raschka
⚡github•10 days ago

Activity on repository

rasbt deleted


View on GitHub
SR
Sebastian Raschka
⚡github•10 days ago

Activity on repository

rasbt pushed LLMs-from-scratch


View on GitHub
SR
Sebastian Raschka
⚡github•10 days ago

Activity on repository

rasbt pushed LLMs-from-scratch


View on GitHub
SR
Sebastian Raschka
⚡github•10 days ago

Activity on rasbt/LLMs-from-scratch

rasbt opened a pull request in LLMs-from-scratch


View on GitHub
SR
Sebastian Raschka
⚡github•10 days ago

Activity on repository

rasbt created a branch


View on GitHub
SR
Sebastian Raschka
⚡github•11 days ago

Activity on repository

rasbt deleted


View on GitHub
SR
Sebastian Raschka
⚡github•11 days ago

Activity on repository

rasbt pushed reasoning-from-scratch


View on GitHub
SR
Sebastian Raschka
⚡github•11 days ago

Activity on rasbt/reasoning-from-scratch

rasbt opened a pull request in reasoning-from-scratch


View on GitHub
SR
Sebastian Raschka
⚡github•11 days ago

Activity on repository

rasbt created a branch


View on GitHub
SR
Sebastian Raschka
⚡github•11 days ago

Activity on repository

rasbt deleted


View on GitHub
SR
Sebastian Raschka
⚡github•11 days ago

Activity on repository

rasbt pushed reasoning-from-scratch


View on GitHub
SR
Sebastian Raschka
⚡github•11 days ago

Activity on rasbt/reasoning-from-scratch

rasbt opened a pull request in reasoning-from-scratch


View on GitHub
SR
Sebastian Raschka
⚡github•11 days ago

Activity on repository

rasbt created a branch


View on GitHub
SR
Sebastian Raschka
⚡github•11 days ago

Activity on repository

rasbt deleted


View on GitHub
SR
Sebastian Raschka
⚡github•11 days ago

Activity on repository

rasbt pushed LLMs-from-scratch


View on GitHub
SR
Sebastian Raschka
⚡github•11 days ago

Activity on rasbt/LLMs-from-scratch

rasbt opened a pull request in LLMs-from-scratch


View on GitHub
SR
Sebastian Raschka
⚡github•11 days ago

Activity on repository

rasbt created a branch


View on GitHub
SR
Sebastian Raschka
⚡github•12 days ago

Activity on repository

rasbt pushed reasoning-from-scratch


View on GitHub
SR
Sebastian Raschka
⚡github•12 days ago

Activity on repository

rasbt pushed reasoning-from-scratch


View on GitHub
SR
Sebastian Raschka
⚡github•12 days ago

Activity on rasbt/reasoning-from-scratch

rasbt opened a pull request in reasoning-from-scratch


View on GitHub
SR
Sebastian Raschka
⚡github•12 days ago

Activity on repository

rasbt created a branch


View on GitHub
SR
Sebastian Raschka
⚡github•12 days ago

Activity on repository

rasbt deleted


View on GitHub
SR
Sebastian Raschka
⚡github•12 days ago

Activity on repository

rasbt pushed reasoning-from-scratch


View on GitHub
SR
Sebastian Raschka
⚡github•13 days ago

Activity on repository

rasbt pushed reasoning-from-scratch


View on GitHub
SR
Sebastian Raschka
⚡github•13 days ago

Activity on repository

rasbt pushed reasoning-from-scratch


View on GitHub
SR
Sebastian Raschka
⚡github•13 days ago

Activity on rasbt/reasoning-from-scratch

rasbt commented on an issue in reasoning-from-scratch


View on GitHub
SR
Sebastian Raschka
⚡github•13 days ago

Activity on rasbt/reasoning-from-scratch

rasbt opened a pull request in reasoning-from-scratch


View on GitHub
SR
Sebastian Raschka
⚡github•13 days ago

Activity on repository

rasbt pushed reasoning-from-scratch


View on GitHub
SR
Sebastian Raschka
⚡github•13 days ago

Activity on repository

rasbt created a branch


View on GitHub
SR
Sebastian Raschka
⚡github•14 days ago

Activity on repository

rasbt deleted


View on GitHub
SR
Sebastian Raschka
⚡github•14 days ago

Activity on repository

rasbt pushed reasoning-from-scratch


View on GitHub
SR
Sebastian Raschka
⚡github•14 days ago

Activity on repository

rasbt pushed reasoning-from-scratch


View on GitHub
SR
Sebastian Raschka
⚡github•14 days ago

Activity on rasbt/reasoning-from-scratch

rasbt opened a pull request in reasoning-from-scratch


View on GitHub
SR
Sebastian Raschka
⚡github•14 days ago

Activity on repository

rasbt created a branch


View on GitHub
SR
Sebastian Raschka
⚡github•14 days ago

Activity on repository

rasbt created a branch


View on GitHub
SR
Sebastian Raschka
𝕏x•14 days ago

Finished Ch07 on Improving GRPO for Reinforcement Learning! Building on the GRPO from scratch intro, this adds (and analyzes) more bells and whistles! (Clipped policy ratios, KL term, format rewards, and a couple of improvements.) https://github.com/rasbt/reasoning-from-scratch/blob/main/ch07/01_main-chapter-code/ch07_main.ipynb
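For readers unfamiliar with the terms, the following is a rough sketch of how group-relative advantages, a clipped policy ratio, and a KL term can fit together. It is not the chapter's code. It assumes one log-probability per sampled answer (practical implementations often work per token), and the default values for `clip_eps` and `kl_coef` are placeholders chosen for illustration.

```python
import math
import statistics

def group_advantages(rewards, eps=1e-4):
    """Standardize rewards within one group of answers sampled for the same prompt."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]

def clipped_term(logp_new, logp_old, advantage, clip_eps=0.2):
    """PPO-style clipped surrogate for one sampled answer."""
    ratio = math.exp(logp_new - logp_old)
    clipped = min(max(ratio, 1.0 - clip_eps), 1.0 + clip_eps)
    return min(ratio * advantage, clipped * advantage)

def kl_estimate(logp_new, logp_ref):
    """Non-negative estimator r - log(r) - 1 with r = pi_ref / pi_new,
    which is zero when the two policies agree on this sample."""
    log_r = logp_ref - logp_new
    return math.exp(log_r) - log_r - 1.0

def grpo_loss(logps_new, logps_old, logps_ref, rewards,
              clip_eps=0.2, kl_coef=0.04):
    """Negative mean of (clipped surrogate - kl_coef * KL estimate) over the group."""
    advantages = group_advantages(rewards)
    terms = [
        clipped_term(new, old, adv, clip_eps) - kl_coef * kl_estimate(new, ref)
        for new, old, ref, adv in zip(logps_new, logps_old, logps_ref, advantages)
    ]
    return -sum(terms) / len(terms)

# Toy group of four sampled answers: two rewarded, two not (made-up numbers)
rewards = [1.0, 0.0, 1.0, 0.0]
logps = [-2.0, -2.5, -1.8, -3.0]
print(grpo_loss(logps, logps, logps, rewards))
```

The clipping limits how far a single update can move the policy away from the one that generated the samples, and the KL term penalizes drift from a fixed reference model. A format reward would simply be added to `rewards` before the advantages are computed.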

View on X
SR
Sebastian Raschka
⚡github•15 days ago

Activity on repository

rasbt deleted


View on GitHub
SR
Sebastian Raschka
⚡github•15 days ago

Activity on repository

rasbt pushed reasoning-from-scratch


View on GitHub
SR
Sebastian Raschka
⚡github•15 days ago

Activity on repository

rasbt pushed reasoning-from-scratch


View on GitHub
SR
Sebastian Raschka
⚡github•15 days ago

Activity on repository

rasbt pushed reasoning-from-scratch


View on GitHub
SR
Sebastian Raschka
⚡github•15 days ago

Activity on repository

rasbt pushed reasoning-from-scratch


View on GitHub
SR
Sebastian Raschka
⚡github•15 days ago

Activity on rasbt/reasoning-from-scratch

rasbt opened a pull request in reasoning-from-scratch


View on GitHub
SR
Sebastian Raschka
⚡github•15 days ago

Activity on repository

rasbt created a branch


View on GitHub
SR
Sebastian Raschka
⚡github•15 days ago

Activity on rasbt/reasoning-from-scratch

rasbt labeled an issue in reasoning-from-scratch


View on GitHub