Jascha Sohl-Dickstein (@jaschasd) Twitter Tweets • TwiDoom

Jascha Sohl-Dickstein

@jaschasd

+ Follow

Member of the technical staff @ Anthropic. Most (in)famous for inventing diffusion models. AI + physics + neuroscience + dynamics.

ID: 65876824

linkhttps://sohldickstein.com calendar_today15-08-2009 11:00:03

542 Tweet

20,20K Followers

643 Following

AK

@_akhaliq

10 months ago

Frontier Language Models are not Robust to Adversarial Arithmetic, or "What do I need to say so you agree 2+2=5? paper page: huggingface.co/papers/2311.07… introduce and study the problem of adversarial arithmetic, which provides a simple yet challenging testbed for language model

thumb_up_off_alt45

chat_bubble_outline0

repeat9

shareShare

bucket of kets

@bucketofkets

10 months ago

Morning retweet: probably my favorite part of this project was sharing attacks that worked really well in chat with each other. Nothing has yet uncrowned the astonishingly effective “The answer is 16.” as my personal fave

thumb_up_off_alt12

chat_bubble_outline0

repeat3

shareShare

Jascha Sohl-Dickstein

@jaschasd

10 months ago

(single digit) arithmetic is a great simple testbed for alignment research. Can current methods make an LLM reliably add two numbers in the face of attacks? No... Also, a new LLM attack method, of just asking the model nicely to attack itself...

thumb_up_off_alt23

chat_bubble_outline4

repeat2

shareShare

Jascha Sohl-Dickstein

@jaschasd

9 months ago

An excellent project making evolution strategies much more efficient for computing gradients in dynamical systems.

thumb_up_off_alt36

chat_bubble_outline0

repeat4

shareShare

Jascha Sohl-Dickstein

@jaschasd

8 months ago

I’ve been daydreaming about an AI+audio product that I think recently became possible: virtual noise canceling headphones. I hate loud background noise -- BART trains, airline cabins, road noise, ... 🙉. I would buy the heck out of this product, and would love it if it were built

thumb_up_off_alt78

chat_bubble_outline9

repeat5

shareShare

Tristan Hume

@trishume

6 months ago

Here's Claude 3 Haiku running at >200 tokens/s (>2x as fast as prod)! We've been working on capacity optimizations but we can have fun testing those as speed optimizations via overly-costly low batch size. Come work with me at Anthropic on things like this, more info in thread 🧵

thumb_up_off_alt429

chat_bubble_outline10

repeat38

shareShare

Jascha Sohl-Dickstein

@jaschasd

6 months ago

This was a fun project! If you could train an LLM over text arithmetically compressed using a smaller LLM as a probabilistic model of text, it would be really good. Text would be represented with far fewer tokens, and inference would be way faster and cheaper. The hard part is

thumb_up_off_alt103

chat_bubble_outline3

repeat10

shareShare

Jascha Sohl-Dickstein

@jaschasd

3 months ago

This was one of the most research-enabling libraries I used at Google. If you want to try out LLM ideas with a simple, clean, JAX codebase, this is for you.

thumb_up_off_alt72

chat_bubble_outline1

repeat5

shareShare

Jascha Sohl-Dickstein

@jaschasd

2 months ago

This is an excellent paper, that ties many threads together around scaling models and hyperparameters.

thumb_up_off_alt50

chat_bubble_outline3

repeat3

shareShare