Lili Yu (@liliyu_lili)'s Twitter Profile
Lili Yu

@liliyu_lili

AI Research Scientist @ Meta AI (FAIR)

ID: 2994229563

Joined: 23-01-2015 15:07:11

58 Tweets

1.1K Followers

198 Following

Ahmad Al-Dahle (@ahmad_al_dahle)'s Twitter Profile Photo

It’s here! Meet Llama 3, our latest generation of models that is setting a new standard for state-of-the art performance and efficiency for openly available LLMs.

Key highlights

  • 8B and 70B parameter openly available pre-trained and fine-tuned models.
  • Trained on more…
Curtis G. Northcutt (@cgnorthcutt)'s Twitter Profile Photo

Goodbye Hallucinations! Today, Cleanlab launches the Trustworthy Language Model (TLM 1.0), addressing the biggest problem in Generative AI: reliability. technologyreview.com/2024/04/25/109…

Armen Aghajanyan (@armenagha)'s Twitter Profile Photo

I’m excited to announce our latest paper, introducing a family of early-fusion token-in token-out (gpt4o…) models capable of interleaved text and image understanding and generation. arxiv.org/abs/2405.09818

Srini Iyer (@sriniiyer88)'s Twitter Profile Photo

Excited to release our work from last year showcasing a stable training recipe for fully token-based multi-modal early-fusion auto-regressive models! arxiv.org/abs/2405.09818 Huge shout-out to Armen Aghajanyan, Ramakanth, Luke Zettlemoyer, Gargi Ghosh, and other co-authors. (1/n)
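
To make the "fully token-based early-fusion" idea concrete, here is a minimal sketch of the pattern the tweet describes: one decoder-only transformer trained with next-token prediction over a single interleaved stream of text tokens and discrete image-codebook tokens. The model, vocabulary sizes, and tokenizer below are illustrative assumptions, not the paper's code.

    import torch
    import torch.nn as nn

    # Hypothetical vocabulary layout: text ids first, then discrete image-codebook ids.
    TEXT_VOCAB = 65_536
    IMAGE_VOCAB = 8_192   # e.g. codes from a learned VQ-style image tokenizer
    VOCAB = TEXT_VOCAB + IMAGE_VOCAB

    class EarlyFusionLM(nn.Module):
        """One decoder-only transformer over a single mixed-modal token stream."""
        def __init__(self, d_model=512, n_heads=8, n_layers=8):
            super().__init__()
            self.embed = nn.Embedding(VOCAB, d_model)
            layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
            self.blocks = nn.TransformerEncoder(layer, n_layers)
            self.lm_head = nn.Linear(d_model, VOCAB)

        def forward(self, tokens):  # tokens: (batch, seq_len)
            mask = nn.Transformer.generate_square_subsequent_mask(tokens.size(1))
            h = self.blocks(self.embed(tokens), mask=mask, is_causal=True)
            return self.lm_head(h)  # next-token logits over BOTH modalities

    # Interleaved stream: text ids, then image-code ids offset into the shared vocab.
    text_ids = torch.randint(0, TEXT_VOCAB, (1, 12))
    image_ids = torch.randint(0, IMAGE_VOCAB, (1, 16)) + TEXT_VOCAB
    stream = torch.cat([text_ids, image_ids], dim=1)
    logits = EarlyFusionLM()(stream)  # train with plain next-token cross-entropy

Because both modalities live in one vocabulary and one loss, the same model can both understand and generate either modality at any point in the sequence.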

Lili Yu (@liliyu_lili)'s Twitter Profile Photo

🚀 Excited to introduce Chameleon, our latest breakthrough in mixed-modal early-fusion foundation models! 🦎✨ Capable of understanding and generating text and images in any sequence. Check out our paper to learn more about its SOTA performance and versatile capabilities! 🌟

Lili Yu (@liliyu_lili)'s Twitter Profile Photo

Interleaved text-image generation with consistency is a unique capability brought by our early-fusion, end-to-end trained model.

Lili Yu (@liliyu_lili)'s Twitter Profile Photo

Such a fun coincidence, picking the same name.

Before scaling up, we called it CM3leon (pronounced as chameleon, with a twist on the older CM3 paper) last year, in the paper "Scaling Autoregressive Multi-Modal Models: Pretraining and Instruction Tuning" (arxiv.org/abs/2309.02591).
Tanishq Mathew Abraham, Ph.D. (@iscienceluvr)'s Twitter Profile Photo

Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model

abs: arxiv.org/abs/2408.11039

New paper from Meta that introduces Transfusion, a recipe for training a model that can seamlessly generate discrete and continuous modalities. The authors pretrain a…
AK (@_akhaliq)'s Twitter Profile Photo

Transfusion

Predict the Next Token and Diffuse Images with One Multi-Modal Model

discuss: huggingface.co/papers/2408.11…

We introduce Transfusion, a recipe for training a multi-modal model over discrete and continuous data. Transfusion combines the language modeling loss function…
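
The truncated sentence above refers to Transfusion's combined objective: discrete text tokens get the usual language-modeling cross-entropy, while continuous image patches get a diffusion-style noise-prediction loss, all inside one transformer. A rough sketch under assumed shapes and names (not the paper's implementation):

    import torch
    import torch.nn.functional as F

    def transfusion_style_loss(text_logits, text_targets, pred_noise, true_noise, lam=1.0):
        """Schematic combined objective: next-token cross-entropy on discrete text
        tokens plus a diffusion (noise-prediction) MSE on continuous image patches.
        All names, shapes, and the weight `lam` here are assumptions."""
        lm_loss = F.cross_entropy(text_logits, text_targets)  # (n_text, vocab) vs (n_text,)
        diffusion_loss = F.mse_loss(pred_noise, true_noise)   # (n_patches, patch_dim)
        return lm_loss + lam * diffusion_loss

    # Toy shapes only: 12 text positions over a 50k vocab, 16 latent patches of dim 64.
    loss = transfusion_style_loss(
        torch.randn(12, 50_000, requires_grad=True),
        torch.randint(0, 50_000, (12,)),
        torch.randn(16, 64, requires_grad=True),
        torch.randn(16, 64),
    )
    loss.backward()

Both terms backpropagate through the same transformer, which is what lets a single model serve as both a language model and an image diffusion model.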
Aran Komatsuzaki (@arankomatsuzaki)'s Twitter Profile Photo

Meta presents Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model

- Can generate images and text on a par with similar scale diffusion models and language models
- Compresses each image to just 16 patches

arxiv.org/abs/2408.11039
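
The "16 patches" claim is about compression: the image is first encoded into a low-resolution latent grid, and latent cells are then folded into a small number of patch vectors the transformer attends over. A back-of-the-envelope calculation with assumed (not paper-confirmed) sizes shows how 16 can fall out:

    # Back-of-the-envelope patch count; every size here is an illustrative assumption.
    image_hw = 256                              # 256x256 input image
    vae_downsample = 8                          # VAE encoder: 256 -> 32 latent grid
    latent_hw = image_hw // vae_downsample      # 32
    patch_size = 8                              # fold an 8x8 latent block into one vector
    n_patches = (latent_hw // patch_size) ** 2  # (32 // 8) ** 2 = 16
    print(n_patches)                            # 16 transformer positions per image
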
Lili Yu (@liliyu_lili)'s Twitter Profile Photo

🚀 Excited to share our latest work: Transfusion! A new multi-modal generative training recipe combining language modeling and image diffusion in a single transformer! Huge shout-out to Chunting Zhou, Omer Levy, Michi Yasunaga, Arun Babu, Kushal Tirumala, and other collaborators.

AI at Meta (@aiatmeta)'s Twitter Profile Photo

New research paper from Meta FAIR – Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model.

Chunting Zhou, Lili Yu and team introduce this recipe for training a multi-modal model over discrete and continuous data. Transfusion combines next token…