Hamel Husain
Bio
Evals evals evals https://t.co/Zrmp6LRd9c About Me: https://t.co/P6WyeKkyTa
Platforms
Content History
Love it. The Simon honeypot
Simon Willison: Since this question shows up so often that it qualifies as an FAQ, here's my definite answer to "What happens if AI labs train for pelicans riding bicycles?" https://simonwillison.net/2025/Nov/13/training-for-pelicans-riding-bicycles/ Link: https://x.com/simonw/status/1989001665526264169
RT Simon Willison AI automated replies on here are annoying, but the ones that ask follow-up questions are next level rude because, if taken at face value, they act as time vampires
RT Vik Paruchuri The Datalab API can now extract redlines and comments into clean markdown! This is great for analyzing legal documents with LLMs.
RT Hamel Husain Lots of things make sense when you realize how similar these activities are (when done well)
This eval talk features some of my favorite people all in one go. It discusses evals from many perspectives: - How to look at data - Human/Computer interface design - Metrics - Tools - etc @eugeneyan, @sh_reya, @BEBischof, @hwchase17, etc 🔥 https://www.youtube.com/watch?si=P9EmuJXw0kzLsdIu&v=SnbGD677_u0
> New AI coding / “OS for AI” / etc announced claiming “We are different” > Opens page > Says “Join waitlist” > Close it, move on, forget about it I can’t remember any great software behind a waitlist in recent times
RT Omar Khattab This is an extremely exciting initiative on Saturday. I'm especially excited about the fact that they're creating evals for their task! Bummer it's only in SF. Folks who are there should give this a shot!! I was asked to share the event, but I wanted to find a moment to write some thoughts: I get that the caricature of the tools below is just a caricature, but one must add that building a good system and building a hackathon system call for opposite tradeoffs. A good system is maintainable and portable into the future, even as the underlying technology shifts. It revolves around separation of concerns. A good tool thus prevents you from premature hand-engineering, even though the bitter lesson tells you that hand-fitting *will* in fact help you in the short term. In other words, if my goal is to build a throwaway artifact in a few hours, I will shamelessly consider low-level tricks and duct tape. The only way they can fail me is if I'm not a very good prompt engineer for some reason or if there are just too many settings and models to handle at once by hand. All that said, I'm super excited about the resources that this will produce. We need better public evals and my understanding is that this could produce one.
Hamel Husain: This is going to be a 🌶️ event. Battle Royale of all the tribes - @DSPyOSS - Just optimize bro - @LangChainAI - A framework is all you need - @fastdotai - everything is a notebook (solveit) - Python vs. Typescript - etc. This is in two weeks! Links in reply Link: https://x.com/HamelHusain/status/1983950540431355999
RT Greg Kamradt Hamel and Bryan are the type of group that if you do good work (regardless of your track record or lack of) you will get rewarded with more cool work and opportunities Great chance for a data analyst to jump on a quick project and over deliver
Hamel Husain: . @BEBischof and I are looking for a strong Data Analyst who will have a super fun, special role in this Hackathon to be "THE human baseline" Group DM us if interested. Link: https://x.com/HamelHusain/status/1986994437311091016
. @BEBischof and I are looking for a strong Data Analyst who will have a super fun, special role in this Hackathon to be "THE human baseline" Group DM us if interested.
Hamel Husain: 📢New development, we have prizes for this event (which is almost full) - $10k, $5k, $1k in OpenAI credits - NVIDIA mystery GPUs - Swag TLDR; this is an IRL Kaggle competition re: build agents to answer questions over structured & unstructured data https://luma.com/7vg7u3mf?utm_source=hh Link: https://x.com/HamelHusain/status/1986896087307985292
RT Shreya Shankar This is important information
» teej: O'Reilly Wine Pairings Book: Evals for AI Engineers by @sh_reya and @HamelHusain Wine: Balgownie 2023 Gold Label Shiraz Link: https://x.com/teej_m/status/1986929594843500574
RT » teej O'Reilly Wine Pairings Book: Evals for AI Engineers by @sh_reya and @HamelHusain Wine: Balgownie 2023 Gold Label Shiraz
Hamel Husain: 👀 Animals have been assigned. Scheduled to print fall 2026! We have iterated on this with over 3k students (and continue to do so). We give our students access to the full draft as part of our evals course (link in bio). Link: https://x.com/HamelHusain/status/1986918458286862480
📢New development, we have prizes for this event (which is almost full) - $10k, $5k, $1k in OpenAI credits - NVIDIA mystery GPUs - Swag TLDR; this is an IRL Kaggle competition re: build agents to answer questions over structured & unstructured data https://luma.com/7vg7u3mf?utm_source=hh
RT Shreya Shankar Finally got some time to code in this busy semester & am building DocScraper for the DocETL stack. People struggle to discover documents to analyze. Surprisingly AI is pretty bad at writing (i) code to scrape *high-quality* docs, and (ii) custom UI to visualize the docs
Come heckle me in ~10 minutes here
abhishek: join here: https://www.youtube.com/watch?v=1VBeE5BfhPM Link: https://x.com/abhi1thakur/status/1986734266689224997
I guess we gonna have Sydney Sweeney memes for the next 48 hours
RT abhishek today, i will be talking with hamel husain about evaluations. this is going to be so good. dont miss it!
Activity on repository
hamelsmu made this repository public
RT Chris Albon Your goal isn’t to code. Your goal is to build. If EDM music makes you build faster, listen to EDM. If AI makes you build faster, use AI. Being a great coder + AI will make you build 1000x faster than not being able to code + AI.