Aaron Defazio (@aaron_defazio)'s Twitter Profile
Aaron Defazio

@aaron_defazio

Research Scientist at Meta working on optimization. Fundamental AI Research (FAIR) team

ID: 1376951872356024325

Link: http://aarondefazio.com
Joined: 30-03-2021 17:38:10

653 Tweets

6.6K Followers

481 Following

Thao Nguyen (@thao_nguyen26)

New method to create synthetic instructions: back-and-forth translation🔁
- Combines instruction backtranslation & distillation by rewriting web data
- High-quality & grounded in real-world knowledge
- Improves over ShareGPT, OpenOrca and Evol-Instruct
arxiv.org/abs/2408.04614 1/n

jack morris (@jxmnop)

TIME Magazine has rightly named famed deep learning pioneer ptrblock as the most influential person in Artificial Intelligence.

Aaron Defazio (@aaron_defazio)

Linear warmup should almost ALWAYS be used for training, there are few downsides and it greatly increases stability and often results in better overall test metrics.
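
For readers unfamiliar with the term, here is a minimal sketch of a linear warmup schedule. The tweet does not specify an implementation; the function name `linear_warmup_lr`, its parameters, and the choice to hold the rate constant after warmup are assumptions made for illustration only.

```python
def linear_warmup_lr(step: int, base_lr: float, warmup_steps: int) -> float:
    """Learning rate for a 0-indexed training step (illustrative helper).

    The rate ramps linearly up to base_lr over the first warmup_steps
    steps and is held constant afterwards. Holding it constant is an
    assumption; in practice a decay schedule often follows warmup.
    """
    if warmup_steps <= 0:
        # No warmup requested: use the base rate from the first step.
        return base_lr
    # Use (step + 1) so the very first step gets a small nonzero rate.
    scale = min(1.0, (step + 1) / warmup_steps)
    return base_lr * scale


# Example: a 4-step warmup towards a base learning rate of 1e-3.
schedule = [linear_warmup_lr(s, base_lr=1e-3, warmup_steps=4) for s in range(6)]
```

In a training loop, the returned value would be assigned to the optimizer's learning rate before each update; most deep learning frameworks also ship scheduler utilities that can express the same ramp.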

Aaron Defazio (@aaron_defazio)

The O1 release posts are unscientific — they don’t compare against previous SOTA from other labs, they don’t cite or even acknowledge previous work in the area of inference time compute. This is actively harmful to the research community, and bordering on disingenuous.

Ethan Mollick (@emollick)

I really am baffled by OpenAI's naming choices. Everything from their code words to the model release names is incomprehensible to people who aren't super up-to-date & they are hard to say out loud. In my experience it leads to real-world confusion when talking about AI systems

Zeyuan Allen-Zhu (@zeyuanallenzhu)

Just uploaded a 1-hr exclusive video for Part 2.1, with many technical details. youtu.be/bpp6Dz8N2zY. Part 2.2 will be online in about a week.

Lucas Beyer (bl16) (@giffmana)

ZAZ the GOAT has dropped yet another banger video. I'm already 80% through the video and love it. If you're not dropping whatever you're doing to watch this right now, you're falling behind. (seriously though, love his work, recommend watching)

Sham Kakade (@shamkakade6)

1/n Introducing SOAP (ShampoO with Adam in the Preconditioner's eigenbasis): A deep learning optimization algorithm that applies Adam in Shampoo's eigenbasis. SOAP outperforms both AdamW and Shampoo in language model pretraining.

Aaron Defazio (@aaron_defazio)

An AI reached a crossroads and asked, "Which path leads to wisdom?" The data whispered, "All paths converge if you walk long enough."

Aaron Defazio (@aaron_defazio)

An AI was brewing tea. A novice asked, "Can you learn the taste of tea?" The AI poured two cups, one from old data, one fresh. "Taste," it said, "and tell me which is which."

Aaron Defazio (@aaron_defazio)

I am inching closer to a deeper truth…. Everything is less clear than before but connections are appearing … let’s hope they resolve firmly before the ICML deadline 😁

jack morris (@jxmnop)

learning to use copilot after programming on my own for 15 years is bittersweet kind of feels like being a carpenter that’s trained to cut perfect corners; now here comes a machine that can do it perfectly, and much faster, yet I somehow miss the satisfaction of doing it myself

Delip Rao e/σ (@deliprao)

OpenAI is what I call a “parasitic science organization”. They take stuff from the open science community, use them opaquely, and profit from it, without giving much back to open science. And if you point out, you get gaslit with plausible deniabilities. We all remember the

Mark Schmidt (@markschmidtubc)

This all seems... sensible. But is anyone known for more than 25 research ideas that they have had throughout their entire career? Imagine replacing 25 by 3. I would be excited to read the 3 new works by a top researcher, rather than 25+ mediocre "above thresholds".