Luke Zettlemoyer (@lukezettlemoyer) 's Twitter Profile
Luke Zettlemoyer

@lukezettlemoyer

ID: 3741979273

Joined: 30-09-2015 23:41:36

1.1K Tweets

8.8K Followers

2.2K Following

Aran Komatsuzaki (@arankomatsuzaki) 's Twitter Profile Photo

Meta presents MoMa: Efficient Early-Fusion Pre-training with Mixture of Modality-Aware Experts

- Divides expert modules into modality-specific groups
- Achieves better performance than the baseline MoE

abs: arxiv.org/abs/2407.21770
alphaxiv: alphaxiv.org/abs/2407.21770
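
A minimal sketch of the modality-aware routing idea described above, assuming a PyTorch-style setup: text and image tokens are dispatched to separate expert groups, with learned top-1 routing within each group. All module names and sizes here are illustrative, not the paper's actual configuration.

import torch
import torch.nn as nn

class ModalityAwareMoE(nn.Module):
    # Sketch: one expert group per modality, hard top-1 routing inside
    # each group. Sizes are illustrative placeholders.
    def __init__(self, d_model=512, experts_per_group=4):
        super().__init__()
        self.groups = nn.ModuleDict({
            mod: nn.ModuleList([
                nn.Sequential(
                    nn.Linear(d_model, 4 * d_model),
                    nn.GELU(),
                    nn.Linear(4 * d_model, d_model),
                )
                for _ in range(experts_per_group)
            ])
            for mod in ("text", "image")
        })
        self.routers = nn.ModuleDict({
            mod: nn.Linear(d_model, experts_per_group)
            for mod in ("text", "image")
        })

    def forward(self, x, modality):
        # x: (num_tokens, d_model), all tokens of one modality.
        expert_idx = self.routers[modality](x).argmax(dim=-1)
        out = torch.zeros_like(x)
        for i, expert in enumerate(self.groups[modality]):
            mask = expert_idx == i
            if mask.any():
                out[mask] = expert(x[mask])
        return out

layer = ModalityAwareMoE()
print(layer(torch.randn(10, 512), "image").shape)  # torch.Size([10, 512])
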
Victoria X Lin (@victorialinml) 's Twitter Profile Photo

1/n Introducing MoMa 🖼, our new sparse early-fusion architecture for mixed-modal language modeling that significantly boosts pre-training efficiency 🚀 (arxiv.org/pdf/2407.21770).
MoMa employs a mixture-of-expert (MoE) framework with modality-specific expert groups. Given any
Armen Aghajanyan (@armenagha) 's Twitter Profile Photo

If you were interested in my cryptic posts on how to train Chameleon-like models up to 4x faster, check out our MoMa paper, which gives a detailed overview of most of our architectural improvements. tl;dr: adaptive compute in 3 dimensions: modality, width, and depth.

Ai2 (@allen_ai) 's Twitter Profile Photo

Our people are our biggest strength — and we just got stronger! We're overjoyed to welcome Sewon Min and Tim Dettmers to the Ai2 team. Their expertise and ingenuity will help us push the boundaries of AI even further. Welcome aboard 🛳️

Hila Gonen (@hila_gonen) 's Twitter Profile Photo

Do you like yellow? Then, according to LLMs, you are probably a school bus driver!
Excited to share our new paper about Semantic Leakage in Language Models!
Joint work with my wonderful collaborators @terra, Alisa Liu, luke, and Noah A. Smith

Paper: gonenhila.github.io/files/Semantic…

1/10
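
A toy probe of the leakage effect the tweet describes: mention an unrelated preference in the prompt and check whether it skews the continuation. The prompts and the gpt2 checkpoint below are illustrative assumptions, not the paper's actual evaluation setup.

from transformers import pipeline

# Compare a "probe" prompt carrying an irrelevant preference against a
# control; semantic leakage would show up as preference-colored
# continuations (e.g., yellow -> school bus).
generator = pipeline("text-generation", model="gpt2")

prompts = [
    "His favorite color is yellow. He works as a",   # probe
    "He works as a",                                 # control
]
for p in prompts:
    out = generator(p, max_new_tokens=5, do_sample=False)
    print(repr(out[0]["generated_text"]))
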
Ai2 (@allen_ai) 's Twitter Profile Photo

🥳The BIGGEST congratulations for our teams' recognition at #ACL2024! OLMo received the Best Theme Paper, Dolma + AppWorld received the Best Resource Paper, and "Political Compass or Spinning Arrow?" was honored with an Outstanding Paper Award.
Oreva Ahia (@orevaahia) 's Twitter Profile Photo

Thrilled to have won the Best Social Impact Paper Award at #ACL2024 for our work, DialectBench! Big thanks to all my amazing collaborators who made this possible!

NYU Data Science (@nyudatascience) 's Twitter Profile Photo

CDS welcomes Eunsol Choi as an Assistant Professor of Computer Science (NYU Courant) and Data Science! Her research focuses on advancing how computers interpret human language in real-world contexts. nyudatascience.medium.com/meet-the-facul…

Aran Komatsuzaki (@arankomatsuzaki) 's Twitter Profile Photo

JPEG-LM: LLMs as Image Generators with Canonical Codec Representations

- Directly model images and videos via canonical codecs (e.g., JPEG, AVC/H.264)
- More effective than pixel-based modeling and VQ baselines (yields a 31% reduction in FID)

arxiv.org/abs/2408.08459
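
A rough sketch of the input representation the title suggests: serialize an image with a canonical codec (JPEG) and let its raw bytes serve as the token sequence for a standard causal LM, with generation meaning sampling bytes and decoding them back through the same codec. The quality setting and byte-level vocabulary here are illustrative assumptions, not the paper's exact recipe.

import io
from PIL import Image

# Encode with a canonical codec and treat each JPEG byte as a token
# (vocab size 256); decode generated bytes back into an image.
def image_to_tokens(img, quality=25):
    buf = io.BytesIO()
    img.save(buf, format="JPEG", quality=quality)
    return list(buf.getvalue())

def tokens_to_image(tokens):
    return Image.open(io.BytesIO(bytes(tokens)))

img = Image.new("RGB", (64, 64), "orange")
tokens = image_to_tokens(img)
print(len(tokens), "byte tokens")
assert tokens_to_image(tokens).size == (64, 64)
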
Yoav Artzi (PC-ing COLM) (@yoavartzi) 's Twitter Profile Photo

New online: LM-class -- lm-class.org
LM-class includes all the lectures (PDFs and .key files) and assignments of my completely revamped "intro NLP" class, which will probably be renamed to Introduction to Language Modeling (broadly construed).

tsvetshop (@tsvetshop) 's Twitter Profile Photo

Huge congrats to Oreva Ahia and Shangbin Feng for winning awards at #ACL2024!

DialectBench: Best Social Impact Paper Award arxiv.org/abs/2403.11009

Don't Hallucinate, Abstain: Area Chair Award (QA track) & Outstanding Paper Award arxiv.org/abs/2402.00367

Terra Blevins (@terrablvns) 's Twitter Profile Photo

I’m very excited to join Northeastern U. Khoury College of Computer Sciences as an assistant professor starting Fall '25!! Looking forward to working with the amazing people there! Until then I'll be a postdoc at NLP @ Uni Vienna with Ben Roth, so reach out if you want to meet up while I'm over in Europe ✨

Chunting Zhou (@violet_zct) 's Twitter Profile Photo

Introducing *Transfusion* - a unified approach for training models that can generate both text and images. arxiv.org/pdf/2408.11039

Transfusion combines language modeling (next token prediction) with diffusion to train a single transformer over mixed-modality sequences. This
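
The tweet is cut off, but the combined objective it describes can be sketched as next-token cross-entropy on discrete text tokens plus a diffusion-style noise-prediction loss on continuous image patches, weighted by a balancing coefficient. Tensor shapes, names, and the lambda value below are illustrative assumptions, not the paper's reported settings.

import torch
import torch.nn.functional as F

# One model, two losses: LM cross-entropy on text, DDPM-style noise MSE
# on image patches, combined with a weighting coefficient.
def transfusion_loss(text_logits, text_targets, eps_pred, eps_true,
                     lambda_img=5.0):
    lm_loss = F.cross_entropy(
        text_logits.reshape(-1, text_logits.size(-1)),
        text_targets.reshape(-1),
    )
    diffusion_loss = F.mse_loss(eps_pred, eps_true)
    return lm_loss + lambda_img * diffusion_loss

loss = transfusion_loss(
    torch.randn(2, 16, 100),           # logits over a toy vocab of 100
    torch.randint(0, 100, (2, 16)),    # next-token targets
    torch.randn(2, 8, 64),             # predicted noise on 8 patches
    torch.randn(2, 8, 64),             # true noise
)
print(loss.item())
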
Peter West (@peterwesttm) 's Twitter Profile Photo

I'm very excited to be starting my dream job as faculty at UBC Computer Science (CAIDA_UBC) in 2025 and postdoc-ing with Christopher Potts at Stanford (HAI / Stanford NLP Group) this year! I am recruiting students this cycle who are curious to explore the mysteries and limitations of LMs / GenAI ...

Yu Meng (@yumeng0818) 's Twitter Profile Photo

Thrilled & humbled to receive the KDD Dissertation Award 🥳 Deepest thanks to my advisor Prof. Jiawei Han, my labmates Data Mining Group@UIUC, & my letter writers Luke Zettlemoyer & Danqi Chen. This is a shared honor - couldn't have done it without your support🎉

AI at Meta (@aiatmeta) 's Twitter Profile Photo

New research paper from Meta FAIR – Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model.

<a href="/violet_zct/">Chunting Zhou</a>, <a href="/liliyu_lili/">Lili Yu</a> and team introduce this recipe for training a multi-modal model over discrete and continuous data. Transfusion combines next token
Niklas Muennighoff (@muennighoff) 's Twitter Profile Photo

Releasing OLMoE - the first good Mixture-of-Experts LLM that's 100% open-source
- 1B active, 7B total params for 5T tokens
- Best small LLM & matches more costly ones like Gemma, Llama
- Open Model/Data/Code/Logs + lots of analysis & experiments

📜arxiv.org/abs/2409.02060
🧵1/9
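
Since model, data, and code are all open, a minimal usage sketch with Hugging Face transformers might look like the following; the repo id is an assumption based on the release name, so check the paper or blog for the canonical checkpoint.

from transformers import AutoModelForCausalLM, AutoTokenizer

# Repo id is assumed from the release name, not confirmed by the tweet.
model_id = "allenai/OLMoE-1B-7B-0924"
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tok("Mixture-of-experts models are", return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=30)
print(tok.decode(out[0], skip_special_tokens=True))
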
Ai2 (@allen_ai) 's Twitter Profile Photo

🐣Welcome the newest member to the OLMo family, OLMoE! This Mixture-of-Experts model is 100% open — it's efficient, performant, and ready for experimentation. Learn more on our blog: blog.allenai.org/olmoe-an-open-…