iislucas (Lucas Dixon) (@iislucas) 's Twitter Profile
iislucas (Lucas Dixon)

@iislucas

machines learn, graphs reason, identity is a non-identity, incompetence over conspiracy, evil by association is evil, expression is never free, stay curious

ID: 101337016

Link: http://pair.withgoogle.com | Joined: 02-01-2010 23:08:40

241 Tweets

395 Followers

202 Following

Dan Friedman (@danfriedman0) 's Twitter Profile Photo

Learning Transformer Programs We designed a modified Transformer that can be trained to solve a task and then automatically converted into a discrete, human-readable program. With Alex Wettig and Danqi Chen. Paper: arxiv.org/abs/2306.01128 Code: github.com/princeton-nlp/… [1/12]

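A toy, hypothetical illustration (my own sketch, not output from the paper or its repo) of what a discrete, human-readable program extracted from an attention head can look like: attention becomes a boolean select() predicate plus a hard aggregate over the selected positions, composed into plain Python.

# Illustrative RASP-style "transformer program" (toy example, not from the paper).
def select(keys, queries, predicate):
    # For each query position, mark which key positions it attends to.
    return [[predicate(k, q) for k in keys] for q in queries]

def aggregate_count(attention):
    # Hard attention collapsed to a count of selected positions per query.
    return [sum(row) for row in attention]

def histogram(tokens):
    # Toy task: for each token, how many times does it occur in the sequence?
    attn = select(tokens, tokens, lambda k, q: k == q)
    return aggregate_count(attn)

print(histogram(list("aabcaa")))  # -> [4, 4, 1, 1, 4, 4]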
Tolga Bolukbasi (@tolgab0) 's Twitter Profile Photo

Understanding the data-model interaction has become a very important topic for alignment and understanding LLM behavior. Happy to announce that we are organizing a workshop on attribution and accepting submissions.

Adam Pearce (@adamrpearce) 's Twitter Profile Photo

Do Machine Learning Models Memorize or Generalize? pair.withgoogle.com/explorables/gr… An interactive introduction to grokking and mechanistic interpretability w/ Asma Ghandeharioun, nada hussein, Nithum, Martin Wattenberg and iislucas (Lucas Dixon)

Google AI (@googleai) 's Twitter Profile Photo

Recent innovation has given rise to #ML models w/ impressive capabilities, but there’s much to learn about how we attribute model behavior to training data, algorithms, architecture, & more! Have papers or ideas on this? Submit to ATTRIB @ #NeurIPS2023 → attrib-workshop.cc

iislucas (Lucas Dixon) (@iislucas) 's Twitter Profile Photo

PAIR is looking for a Research Scientist interested in making hard ML problems (like understanding language) much smaller... In Paris, and working closely with fun interactive explorable visualizations too. See: goo.gle/3PcvPEs

Peter Hase (@peterbhase) 's Twitter Profile Photo

Happy to share that this paper was accepted with a Spotlight at #NeurIPS2023! We updated the arXiv with results showing the disconnect between knowledge localization and editing success across different neuron ablations, editing methods, editing metrics, models, and datasets.⬇️

Jeff Dean (@🏡) (@jeffdean) 's Twitter Profile Photo

I’m very excited to share our work on Gemini today! Gemini is a family of multimodal models that demonstrate really strong capabilities across the image, audio, video, and text domains. Our most-capable model, Gemini Ultra, advances the state of the art in 30 of 32 benchmarks,

Dan Friedman (@danfriedman0) 's Twitter Profile Photo

We often interpret neural nets by studying simplified representations (e.g. low-dim visualization). But how faithful are these simplifications to the original model? In our new preprint, we found some surprising "interpretability illusions"... 1/6

Asma Ghandeharioun (@ghandeharioun) 's Twitter Profile Photo

🧵Can we “ask” an LLM to “translate” its own hidden representations into natural language? We propose 🩺Patchscopes, a new framework for decoding specific information from a representation by “patching” it into a separate inference pass, independently of its original context. 1/9

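A minimal sketch of the patching idea under my own assumptions (GPT-2 via Hugging Face transformers, hand-picked layer and position, an arbitrary target prompt; not the authors' implementation): read a hidden state from a source forward pass, inject it into a separate target pass with a forward hook, and see what the model decodes.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")          # any causal LM would do
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

# Source pass: collect the hidden representation we want to interpret.
src = tok("The Eiffel Tower is located in", return_tensors="pt")
with torch.no_grad():
    out = model(**src, output_hidden_states=True)
src_layer, src_pos = 6, -1                           # assumed layer/position
h_src = out.hidden_states[src_layer][0, src_pos]     # shape: (hidden_dim,)

# Target pass: a separate prompt; a forward hook overwrites the hidden state
# at an assumed layer/position with h_src before the remaining layers run.
tgt = tok("Syria: country. Apple: fruit. x:", return_tensors="pt")
tgt_layer, tgt_pos = 6, -1

def patch(module, inputs, output):
    hidden = output[0] if isinstance(output, tuple) else output
    hidden[0, tgt_pos] = h_src                       # inject the source representation
    return output

handle = model.transformer.h[tgt_layer].register_forward_hook(patch)
with torch.no_grad():
    logits = model(**tgt).logits
handle.remove()

# The next-token prediction now reflects the patched representation.
print(tok.decode([int(logits[0, -1].argmax())]))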
Geoffrey Cideron (@cdrgeo) 's Twitter Profile Photo

Happy to introduce our paper MusicRL, the first music generation system finetuned with human preferences. Paper link: arxiv.org/abs/2402.04229

Adam Roberts (@ada_rob) 's Twitter Profile Photo

I love music most when it’s live, in the moment, and expressing something personal. This is why I’m psyched about the new “DJ mode” we developed for MusicFX: aitestkitchen.withgoogle.com/tools/music-fx… It’s an infinite AI jam that you control 🎛️. Try mixing your unique 🌀 of instruments, genres,

Google AI (@googleai) 's Twitter Profile Photo

Being able to interpret an #ML model’s hidden representations is key to understanding its behavior. Today we introduce Patchscopes, an approach that trains #LLMs to provide natural language explanations of their own hidden representations. Learn more → goo.gle/4aS5epd

Armand Joulin (@armandjoulin) 's Twitter Profile Photo

Gemma 2 27B is now the best open model while being 2.5x smaller than alternatives! This validates the work done by the team and Gemini. This is just the beginning 💙♊️

nada hussein (@nadamused_) 's Twitter Profile Photo

Can Large Language Models Explain Their Internal Mechanisms? pair.withgoogle.com/explorables/pa… An interactive intro to Patchscopes, an inspection framework for explaining the hidden representations of LLMs, with LLMs w/ Asma Ghandeharioun Ryan Mullins Emily Reif Jimbo Wilson Nithum iislucas (Lucas Dixon)

Google AI (@googleai) 's Twitter Profile Photo

Can large language models (LLMs) explain their internal mechanisms? Check out the latest AI Explorable on Patchscopes, an inspection framework that uses LLMs to explain the hidden representations of LLMs. Learn more → goo.gle/patchscopes

Google DeepMind (@googledeepmind) 's Twitter Profile Photo

We’re welcoming a new 2 billion parameter model to the Gemma 2 family. 🛠️ It offers best-in-class performance for its size and can run efficiently on a wide range of hardware. Developers can get started with 2B today → dpmd.ai/4d0MKEH

Asma Ghandeharioun (@ghandeharioun) 's Twitter Profile Photo

🧵Responses to adversarial queries can still remain latent in a safety-tuned model. Why are they revealed sometimes, but not others? And what are the mechanics of this latent misalignment? Does it matter *who* the user is? (1/n)
