Max Nadeau (@maxnadeau_)'s Twitter Profile
Max Nadeau

@maxnadeau_

Advancing AI honesty, control, safety at @open_phil. Prev Harvard AISST (haist.ai), Harvard '23.

ID: 935718892546220034

Link: http://maxnadeau.com | Joined: 29-11-2017 03:55:43

152 Tweets

529 Followers

389 Following

Max Nadeau (@maxnadeau_):

Fascinating... seems like all the fine-tuning/prompting about itself (you are an LLM, you were trained with CAI, etc) alters Claude's accuracy/forthrightness when talking about GPT-4. I wonder what else it gets weird about.

Cameron Jones (@camrobjones):

We made some updates to our "Does GPT-4 pass the Turing test" paper, based on new data. The biggest one is that one prompt, "Dragon", achieves a pass rate of 49.7% after 855 games. (brief 🧵)

Max Nadeau (@maxnadeau_):

This was a _secret_ non-disparagement agreement that employees weren't told would be sprung on them when they joined OpenAI. That strikes me as pretty immoral.

Max Nadeau (@maxnadeau_):

Did we learn nothing from covid? Was all the death and suffering not enough motivation? Preparing for the next pandemic now is a moral imperative.

Max Nadeau (@maxnadeau_):

No ARC human baseline exists! arcprize.org/arc claims that "most humans can solve on average 85% of ARC-AGI tasks," but the study behind that number used the train set, and per arcprize.org/guide: "The public training set is significantly easier than the...public evaluation and private evaluation set"

Max Nadeau (@maxnadeau_):

Anyone want to measure the ARC-AGI human baseline? The ARC prize folks want to hear from you! I think this would be really cool; we'd have a better sense of where AIs fall in the distribution of human performance (tho I don't doubt that current models do quite poorly)

Max Nadeau (@maxnadeau_):

Fascinating! Curriculum learning demolishes regular fine-tuning for teaching a small model to multiply. This complicates the formula for model capabilities: not just model size, data quality, and data quantity, but also the order in which data is presented, i.e. the teaching strategy.
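
A minimal sketch of the idea: train on stages of synthetic multiplication data ordered easy-to-hard, rather than shuffling all difficulties together. The stage boundaries and the `fine_tune` hook are hypothetical placeholders, not the actual recipe from the work being discussed:

```python
# Curriculum-ordered fine-tuning data for multiplication (sketch).
import random

def make_examples(digits_a, digits_b, n=1000):
    """Generate n multiplication problems with operands of the given digit counts."""
    examples = []
    for _ in range(n):
        a = random.randrange(10 ** (digits_a - 1), 10 ** digits_a)
        b = random.randrange(10 ** (digits_b - 1), 10 ** digits_b)
        examples.append((f"{a}*{b}=", str(a * b)))
    return examples

# Curriculum: stages ordered from easiest to hardest.
curriculum = [
    make_examples(1, 1),  # stage 1: one-digit x one-digit
    make_examples(2, 1),  # stage 2: two-digit x one-digit
    make_examples(2, 2),  # stage 3: two-digit x two-digit
    make_examples(3, 2),  # stage 4: three-digit x two-digit
]

for stage, data in enumerate(curriculum, 1):
    # `fine_tune(model, data)` would go here: one pass per stage, easiest
    # first, instead of one pass over all difficulties shuffled together.
    print(f"stage {stage}: {len(data)} examples, e.g. {data[0]}")
```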

Max Nadeau (@maxnadeau_):

Come for the SOTA, stay for the discussion of why GPT-4o still trips up. It can't even answer questions about the grids! But given this, I think further scaling/mundane improvements to multimodal abilities will continue to improve perf.

Max Nadeau (@maxnadeau_):

François Chollet, Dwarkesh Patel: IIUC, the difference between the scaling optimists and you has been about how much effort it would take to scaffold LLMs for ARC (see screenshot below). ~8 person-days of effort on scaffolding vs. a $100mm investment in the LLM makes it look like the LLM is the hard part.

Jack Cole (@mindsai_jack):

Today, I ran our solution alone against the public test set. It scored 54%. The 60% score was an ensemble of our approach and the previous symbolic approaches. At the time, our model was scoring at best 26% on the ARC-AGI private test set (vs. 37% today).

Max Nadeau (@maxnadeau_):

Very important topic (how to measure safety) and a very impactful place to do it (government body with privileged access, working with great researchers like Geoffrey). Apply!

Andrej Karpathy (@karpathy):

What a case study of systemic risk with the CrowdStrike outage... that a few bits in the wrong place can brick ~1 billion computers, and all the 2nd, 3rd order effects of it. What other single points of instantaneous failure exist in the technosphere, and how do we design against them?

David Bau (@davidbau):

Time to study #llama3 405b, but gosh it's big! Please retweet: if you have a great experiment but not enough GPU, here is an opportunity to apply for shared #NDIF research resources. Deadline July 30: ndif.us/405b.html. You'll help NDIF test; we'll help you run 405b.

Max Nadeau (@maxnadeau_):

Fascinating distinction between inputs into LLMs that are "data" and inputs that are "instructions". To ponder, for those who consider advexes an analogy for future misalignment triggers: which of the two categories is the better analogy?

Max Nadeau (@maxnadeau_):

Very worth reading! Most interesting part IMO is how Sam writes about interfacing with govts / external experts / wider world to reduce risks from other companies and to provide oversight on Anthropic.