Chunting Zhou (@violet_zct)'s Twitter Profile
Chunting Zhou

@violet_zct

Research Scientist at FAIR. PhD @CMU. she/her.

ID: 3284146452

Link: https://violet-zct.github.io/ · Joined: 19-07-2015 09:41:45

145 Tweets

3.3K Followers

284 Following

AK (@_akhaliq)'s Twitter Profile Photo

Meta announces Megalodon

Efficient LLM Pretraining and Inference with Unlimited Context Length

The quadratic complexity and weak length extrapolation of Transformers limit their ability to scale to long sequences, and while sub-quadratic solutions like linear attention and …
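
For context on the complexity gap the tweet alludes to: full softmax attention materializes an n × n score matrix, while kernelized "linear attention" reorders the matrix products to avoid it. A minimal PyTorch sketch of the two, illustrating the general idea rather than Megalodon's actual mechanism:

```python
import torch
import torch.nn.functional as F

def softmax_attention(q, k, v):
    # Full attention: materializes an (n, n) score matrix, so time and
    # memory grow quadratically with sequence length n.
    scores = (q @ k.transpose(-2, -1)) / q.shape[-1] ** 0.5
    return torch.softmax(scores, dim=-1) @ v

def linear_attention(q, k, v):
    # Kernel trick (as in Katharopoulos et al., 2020): with the feature map
    # phi(x) = elu(x) + 1, attention becomes phi(q) @ (phi(k)^T v), which
    # costs O(n * d^2) -- linear in n, with no (n, n) matrix.
    q, k = F.elu(q) + 1, F.elu(k) + 1
    kv = k.transpose(-2, -1) @ v                            # (d, d) summary
    z = q @ k.sum(dim=-2, keepdim=True).transpose(-2, -1)   # (n, 1) normalizer
    return (q @ kv) / z

n, d = 4096, 64
q, k, v = torch.randn(3, n, d).unbind(0)
out = linear_attention(q, k, v)  # never allocates a 4096 x 4096 score matrix
```
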
Chunting Zhou (@violet_zct)'s Twitter Profile Photo

🚀 Excited to introduce Chameleon, our work from last year on mixed-modality, early-fusion foundation models! 🦎 Capable of understanding and generating text and images in any sequence. Check out our paper to learn more about its SOTA performance and versatile capabilities!

Daniel Levy (@daniellevy__) 's Twitter Profile Photo

Beyond excited to be starting this company with Ilya and DG! I can't imagine working on anything else at this point in human history. If you feel the same and want to work in a small, cracked, high-trust team that will produce miracles, please reach out.

Xiang Lisa Li (@xianglisali2)'s Twitter Profile Photo

arxiv.org/abs/2407.08351
LM performance on existing benchmarks is highly correlated. How do we build novel benchmarks that reveal previously unknown trends?
We propose AutoBencher: it casts benchmark creation as an optimization problem with a novelty term in the objective.
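
As a rough illustration of "optimization with a novelty term": a hypothetical objective that rewards candidate benchmarks whose model rankings are poorly predicted by existing benchmarks and that current models find hard. Names and weights here are illustrative, not the paper's formulation (which also considers properties such as salience; see arxiv.org/abs/2407.08351):

```python
import numpy as np

def rank_correlation(a, b):
    # Spearman rank correlation of two per-model score vectors (no tie handling).
    ra = np.argsort(np.argsort(a))
    rb = np.argsort(np.argsort(b))
    return float(np.corrcoef(ra, rb)[0, 1])

def benchmark_objective(candidate, existing, w_novelty=1.0, w_difficulty=1.0):
    """candidate: per-model accuracies on the proposed benchmark.
    existing: list of per-model accuracy vectors on known benchmarks."""
    # Novelty: rankings on the candidate should NOT be explained by
    # rankings on any existing benchmark (low max correlation = novel).
    novelty = 1.0 - max(rank_correlation(candidate, e) for e in existing)
    # Difficulty: prefer benchmarks that current models find hard.
    difficulty = 1.0 - float(np.mean(candidate))
    return w_novelty * novelty + w_difficulty * difficulty

models_on_candidate = np.array([0.42, 0.55, 0.38, 0.61])
models_on_known = [np.array([0.81, 0.90, 0.76, 0.88]),
                   np.array([0.60, 0.72, 0.58, 0.70])]
print(benchmark_objective(models_on_candidate, models_on_known))
```
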
Chunting Zhou (@violet_zct)'s Twitter Profile Photo

Great work from Horace He and the team! FlexAttention is really easy to use, with a highly expressive user interface, and it shows strong performance profiles compared to FlashAttention!
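
FlexAttention ships in PyTorch (2.5+) as torch.nn.attention.flex_attention; the expressiveness comes from passing a small score_mod function instead of hand-writing a kernel. A minimal sketch of a causal mask, assuming PyTorch ≥ 2.5 (eager mode works for experimentation; wrapping the call in torch.compile is what gets the fused-kernel performance compared against FlashAttention):

```python
import torch
from torch.nn.attention.flex_attention import flex_attention

def causal(score, b, h, q_idx, kv_idx):
    # Keep the score where the query may attend (q_idx >= kv_idx), else mask.
    return torch.where(q_idx >= kv_idx, score, float("-inf"))

B, H, S, D = 2, 8, 256, 64
q, k, v = (torch.randn(B, H, S, D) for _ in range(3))
out = flex_attention(q, k, v, score_mod=causal)  # (B, H, S, D)

# compiled = torch.compile(flex_attention)  # fused kernels for real workloads
```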

Tanishq Mathew Abraham, Ph.D. (@iscienceluvr)'s Twitter Profile Photo

Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model

abs: arxiv.org/abs/2408.11039

New paper from Meta that introduces Transfusion, a recipe for training a model that can seamlessly generate discrete and continuous modalities. The authors pretrain a …
AK (@_akhaliq)'s Twitter Profile Photo

Transfusion

Predict the Next Token and Diffuse Images with One Multi-Modal Model

discuss: huggingface.co/papers/2408.11…

We introduce Transfusion, a recipe for training a multi-modal model over discrete and continuous data. Transfusion combines the language modeling loss function …
Aran Komatsuzaki (@arankomatsuzaki)'s Twitter Profile Photo

Meta presents Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model

- Can generate images and text on a par with similar scale diffusion models and language models
- Compresses each image to just 16 patches

arxiv.org/abs/2408.11039
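
The tweets above describe the recipe at a high level: one transformer is trained over mixed-modality sequences, with next-token cross-entropy applied at discrete text positions and a diffusion (noise-prediction) loss at continuous image-patch positions. A minimal sketch of such a combined objective; the shapes and the balancing weight are illustrative, not the paper's values:

```python
import torch
import torch.nn.functional as F

def transfusion_style_loss(text_logits, text_targets, eps_pred, eps_true, lam=1.0):
    """Combined objective over one mixed-modality sequence.
    text_logits:  (n_text, vocab) next-token predictions at text positions
    text_targets: (n_text,) gold token ids
    eps_pred/eps_true: (n_img, d) predicted vs. actual diffusion noise at
                       continuous image-patch positions
    lam: balancing coefficient between the two terms (illustrative value)."""
    lm_loss = F.cross_entropy(text_logits, text_targets)  # discrete tokens
    ddpm_loss = F.mse_loss(eps_pred, eps_true)            # continuous patches
    return lm_loss + lam * ddpm_loss

vocab, n_text, n_img, d = 32000, 128, 16, 1024  # e.g. 16 patches per image
loss = transfusion_style_loss(
    torch.randn(n_text, vocab),
    torch.randint(vocab, (n_text,)),
    torch.randn(n_img, d),
    torch.randn(n_img, d),
)
```
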
Jim Fan (@drjimfan)'s Twitter Profile Photo

Transformer-land and diffusion-land have been separate for too long. There have been many attempts to unify them, but they lose simplicity and elegance. Time for a transfusion 🩸 to revitalize the merge!

Horace He (@chhillee)'s Twitter Profile Photo

Jokes aside, it's fun to see innovation beyond the standard causal/autoregressive next-token generation in text. Transfusion is another cool work in this vein (that already used FlexAttention :P) x.com/violet_zct/sta…