iislucas (Lucas Dixon) (@iislucas) 's Twitter Profile
iislucas (Lucas Dixon)

@iislucas

machines learn, graphs reason, identity is a non-identity, incompetence over conspiracy, evil by association is evil, expression is never free, stay curious

ID: 101337016

Link: http://pair.withgoogle.com | Joined: 02-01-2010 23:08:40

241 Tweets

395 Followers

202 Following

Dan Friedman (@danfriedman0) 's Twitter Profile Photo

Learning Transformer Programs We designed a modified Transformer that can be trained to solve a task and then automatically converted into a discrete, human-readable program. With Alex Wettig and Danqi Chen. Paper: arxiv.org/abs/2306.01128 Code: github.com/princeton-nlp/… [1/12]

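A toy, hypothetical illustration (my own sketch, not output from the paper or its repo) of what a discrete, human-readable program extracted from an attention head can look like: attention becomes a boolean select() predicate plus a hard aggregate over the selected positions, composed into plain Python.

# Illustrative RASP-style "transformer program" (toy example, not from the paper).
def select(keys, queries, predicate):
    # For each query position, mark which key positions it attends to.
    return [[predicate(k, q) for k in keys] for q in queries]

def aggregate_count(attention):
    # Hard attention collapsed to a count of selected positions per query.
    return [sum(row) for row in attention]

def histogram(tokens):
    # Toy task: for each token, how many times does it occur in the sequence?
    attn = select(tokens, tokens, lambda k, q: k == q)
    return aggregate_count(attn)

print(histogram(list("aabcaa")))  # -> [4, 4, 1, 1, 4, 4]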
Tolga Bolukbasi (@tolgab0) 's Twitter Profile Photo

Understanding the data-model interaction has become a very important topic for alignment and understanding LLM behavior. Happy to announce that we are organizing a workshop on attribution and accepting submissions.

Adam Pearce (@adamrpearce) 's Twitter Profile Photo

Do Machine Learning Models Memorize or Generalize? pair.withgoogle.com/explorables/gr… An interactive introduction to grokking and mechanistic interpretability w/ Asma Ghandeharioun, nada hussein, Nithum, Martin Wattenberg and iislucas (Lucas Dixon)

Google AI (@googleai) 's Twitter Profile Photo

Recent innovation has given rise to #ML models w/ impressive capabilities, but there’s much to learn about how we attribute model behavior to training data, algorithms, architecture, & more! Have papers or ideas on this? Submit to ATTRIB @ #NeurIPS2023 → attrib-workshop.cc

iislucas (Lucas Dixon) (@iislucas) 's Twitter Profile Photo

PAIR is looking for a Research Scientist interested in making hard ML problems (like understanding language) much smaller... In Paris, and working closely with fun interactive explorable visualizations too. See: goo.gle/3PcvPEs

Peter Hase (@peterbhase) 's Twitter Profile Photo

Happy to share that this paper was accepted with a Spotlight at #NeurIPS2023! We updated the arXiv with results showing the disconnect between knowledge localization and editing success across different neuron ablations, editing methods, editing metrics, models, and datasets.⬇️

Jeff Dean (@🏡) (@jeffdean) 's Twitter Profile Photo

I’m very excited to share our work on Gemini today! Gemini is a family of multimodal models that demonstrate really strong capabilities across the image, audio, video, and text domains. Our most-capable model, Gemini Ultra, advances the state of the art in 30 of 32 benchmarks,

Dan Friedman (@danfriedman0) 's Twitter Profile Photo

We often interpret neural nets by studying simplified representations (e.g. low-dim visualization). But how faithful are these simplifications to the original model? In our new preprint, we found some surprising "interpretability illusions"... 1/6

Asma Ghandeharioun (@ghandeharioun) 's Twitter Profile Photo

🧵Can we “ask” an LLM to “translate” its own hidden representations into natural language? We propose 🩺Patchscopes, a new framework for decoding specific information from a representation by “patching” it into a separate inference pass, independently of its original context. 1/9

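A minimal sketch of the patching idea under my own assumptions (GPT-2 via Hugging Face transformers, hand-picked layer and position, an arbitrary target prompt; not the authors' implementation): read a hidden state from a source forward pass, inject it into a separate target pass with a forward hook, and see what the model decodes.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")          # any causal LM would do
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

# Source pass: collect the hidden representation we want to interpret.
src = tok("The Eiffel Tower is located in", return_tensors="pt")
with torch.no_grad():
    out = model(**src, output_hidden_states=True)
src_layer, src_pos = 6, -1                           # assumed layer/position
h_src = out.hidden_states[src_layer][0, src_pos]     # shape: (hidden_dim,)

# Target pass: a separate prompt; a forward hook overwrites the hidden state
# at an assumed layer/position with h_src before the remaining layers run.
tgt = tok("Syria: country. Apple: fruit. x:", return_tensors="pt")
tgt_layer, tgt_pos = 6, -1

def patch(module, inputs, output):
    hidden = output[0] if isinstance(output, tuple) else output
    hidden[0, tgt_pos] = h_src                       # inject the source representation
    return output

handle = model.transformer.h[tgt_layer].register_forward_hook(patch)
with torch.no_grad():
    logits = model(**tgt).logits
handle.remove()

# The next-token prediction now reflects the patched representation.
print(tok.decode([int(logits[0, -1].argmax())]))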
Geoffrey Cideron (@cdrgeo) 's Twitter Profile Photo

Happy to introduce our paper MusicRL, the first music generation system finetuned with human preferences. Paper link: arxiv.org/abs/2402.04229

Adam Roberts (@ada_rob) 's Twitter Profile Photo

I love music most when it’s live, in the moment, and expressing something personal. This is why I’m psyched about the new “DJ mode” we developed for MusicFX: aitestkitchen.withgoogle.com/tools/music-fx… It’s an infinite AI jam that you control 🎛️. Try mixing your unique 🌀 of instruments, genres,

Google AI (@googleai) 's Twitter Profile Photo

Being able to interpret an #ML model’s hidden representations is key to understanding its behavior. Today we introduce Patchscopes, an approach that trains #LLMs to provide natural language explanations of their own hidden representations. Learn more → goo.gle/4aS5epd

Armand Joulin (@armandjoulin) 's Twitter Profile Photo

Gemma 2 27B is now the best open model while being 2.5x smaller than alternatives! This validates the work done by the team and Gemini. This is just the beginning 💙♊️

nada hussein (@nadamused_) 's Twitter Profile Photo

Can Large Language Models Explain Their Internal Mechanisms? pair.withgoogle.com/explorables/pa… An interactive intro to Patchscopes, an inspection framework for explaining the hidden representations of LLMs, with LLMs w/ Asma Ghandeharioun Ryan Mullins Emily Reif Jimbo Wilson Nithum iislucas (Lucas Dixon)

Google AI (@googleai) 's Twitter Profile Photo

Can large language models (LLMs) explain their internal mechanisms? Check out the latest AI Explorable on Patchscopes, an inspection framework that uses LLMs to explain the hidden representations of LLMs. Learn more → goo.gle/patchscopes

Google DeepMind (@googledeepmind) 's Twitter Profile Photo

We’re welcoming a new 2 billion parameter model to the Gemma 2 family. 🛠️ It offers best-in-class performance for its size and can run efficiently on a wide range of hardware. Developers can get started with 2B today → dpmd.ai/4d0MKEH

Asma Ghandeharioun (@ghandeharioun) 's Twitter Profile Photo

🧵Responses to adversarial queries can still remain latent in a safety-tuned model. Why are they revealed sometimes, but not others? And what are the mechanics of this latent misalignment? Does it matter *who* the user is? (1/n)
