Xiaochuang Han (@xiaochuanghan) 's Twitter Profile
Xiaochuang Han

@xiaochuanghan

PhD student at the University of Washington

ID: 4916685123

Link: http://xhan77.github.io · Joined: 16-02-2016 00:47:56

93 Tweets

567 Followers

730 Following

Shangbin Feng (@shangbinfeng) 's Twitter Profile Photo

🚨 Detecting social media bots has always been an arms race: we design better detectors with advanced ML tools, while more evasive bots emerge adversarially.

What do LLMs bring to the arms race between bot detectors and operators?

A thread 🧵#ACL2024
Sachin Kumar (@shocheen) 's Twitter Profile Photo

Check out our paper on model noncompliance. We outline a taxonomy of requests that LLMs should not comply with, going beyond just unsafe queries. Based on this taxonomy, we create CoCoNot, a resource for training and evaluating models’ noncompliance. More details in this thread👇🏾

Leo Liu (@zeyuliu10) 's Twitter Profile Photo

Knowledge updates for code LLMs: Code LLMs generate calls to libraries like numpy and pandas. What if these libraries change? Can we update LLMs with modified function definitions?

Test this with our new benchmark, CodeUpdateArena. Findings: updating LLMs’ API knowledge is hard!
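
To make the setting concrete, here is a hypothetical illustration of the kind of API change involved (a toy example of my own; the function, its change, and the format are not taken from CodeUpdateArena): a library function gains a new parameter, and code generated from stale knowledge misses it.

```python
# Hypothetical illustration of an API update (not from CodeUpdateArena itself).

# Old definition a code LLM might have memorized during pretraining:
def normalize(x, scale=1.0):
    return [v * scale / max(x) for v in x]

# Updated definition after a (hypothetical) library release: a new `eps`
# parameter changes the signature and the numerical behaviour.
def normalize(x, scale=1.0, eps=1e-8):
    denom = max(x) + eps
    return [v * scale / denom for v in x]

# An updated model should emit calls consistent with the new definition,
# e.g. using the `eps` keyword, rather than the stale signature.
print(normalize([1.0, 2.0, 4.0], eps=1e-6))
```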
Weijia Shi (@weijiashi2) 's Twitter Profile Photo

Can 𝐦𝐚𝐜𝐡𝐢𝐧𝐞 𝐮𝐧𝐥𝐞𝐚𝐫𝐧𝐢𝐧𝐠 make language models forget their training data?

We show yes, but at the cost of privacy and utility. Current unlearning scales poorly with the size of the data to be forgotten and can’t handle sequential unlearning requests.

🔗:
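
For readers unfamiliar with the setup, below is a minimal sketch of one common unlearning baseline: gradient ascent on the forget set with a retain-set regularizer. This is an illustration of the general technique under toy assumptions, not necessarily one of the methods the paper evaluates.

```python
# Toy gradient-ascent unlearning sketch (illustration only).
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyLM(nn.Module):
    """Stand-in language model: embedding + linear head over a toy vocabulary."""
    def __init__(self, vocab=100, dim=32):
        super().__init__()
        self.emb = nn.Embedding(vocab, dim)
        self.head = nn.Linear(dim, vocab)

    def forward(self, ids):
        return self.head(self.emb(ids))  # (batch, seq, vocab) logits

def next_token_loss(model, ids):
    """Standard next-token cross-entropy."""
    logits = model(ids[:, :-1])
    return F.cross_entropy(logits.flatten(0, 1), ids[:, 1:].flatten())

model = ToyLM()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
forget_ids = torch.randint(0, 100, (4, 16))  # sequences the model should forget
retain_ids = torch.randint(0, 100, (4, 16))  # sequences whose behaviour we keep

# Gradient ascent on the forget set (note the minus sign), plus ordinary
# training on the retain set to limit the utility cost the thread mentions.
opt.zero_grad()
loss = -next_token_loss(model, forget_ids) + next_token_loss(model, retain_ids)
loss.backward()
opt.step()
```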
Sachin Kumar (@shocheen) 's Twitter Profile Photo

You think your model just fell out of a coconot tree 🥥? It should not always comply in the context of all it has seen in the request. Check out our paper on contextual noncompliance.

Shangbin Feng (@shangbinfeng) 's Twitter Profile Photo

What can we do when certain values, cultures, and communities are underrepresented in LLM alignment?

Introducing Modular Pluralism, where a general LLM interacts with a pool of specialized community LMs in various modes to advance pluralistic alignment.

A thread 🧵
Alisa Liu (@alisawuffles) 's Twitter Profile Photo

What do BPE tokenizers reveal about their training data?🧐

We develop an attack🗡️ that uncovers the training data mixtures📊 of commercial LLM tokenizers (incl. GPT-4o), using their ordered merge lists!

Co-1⃣st Jonathan Hayase
arxiv.org/abs/2407.16607 🧵⬇️
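
To see why merge lists leak information about the training mixture, here is a toy, from-scratch BPE learner (a sketch of the underlying intuition, not the paper's actual inference attack): because merges are chosen greedily by pair frequency, the order of early merges shifts with the corpus mixture.

```python
# Toy BPE merge learning, written from scratch for illustration only.
from collections import Counter

def learn_merges(corpus, num_merges=5):
    """Greedy BPE: repeatedly merge the most frequent adjacent symbol pair."""
    words = [list(w) + ["</w>"] for w in corpus]
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for w in words:
            pairs.update(zip(w, w[1:]))
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        # Apply the chosen merge everywhere before learning the next one.
        new_words = []
        for w in words:
            out, i = [], 0
            while i < len(w):
                if i + 1 < len(w) and (w[i], w[i + 1]) == best:
                    out.append(w[i] + w[i + 1])
                    i += 2
                else:
                    out.append(w[i])
                    i += 1
            new_words.append(out)
        words = new_words
    return merges

# Two toy "mixtures": the learned merge orders differ, and that ordering is the
# signal the attack reads off a commercial tokenizer's ordered merge list.
english_heavy = ["the", "the", "then", "there", "import"]
code_heavy = ["import", "import", "print", "printf", "the"]
print(learn_merges(english_heavy))
print(learn_merges(code_heavy))
```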
Shangbin Feng (@shangbinfeng) 's Twitter Profile Photo

Instruction tuning with synthetic graph data leads to graph LLMs, but:

Are LLMs learning generalizable graph reasoning skills or merely memorizing patterns in the synthetic training data? 🤔 (for example, patterns like how you describe a graph in natural language)

A thread 🧵
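
As a concrete (and entirely hypothetical) illustration of the kind of surface pattern in question: the same toy graph verbalized with two different templates. A model with generalizable graph reasoning should answer the same question identically for both; a model that memorized one synthetic template may not.

```python
# Hypothetical example of template sensitivity (not the paper's data).
edges = [(0, 1), (1, 2), (2, 3)]

# Two ways of describing the same graph in natural language.
template_a = "The graph has edges: " + ", ".join(f"{u}-{v}" for u, v in edges)
template_b = " ".join(f"Node {u} is connected to node {v}." for u, v in edges)

question = "Is there a path from node 0 to node 3?"
print(template_a + "\n" + question)
print(template_b + "\n" + question)  # a robust reasoner answers both the same way
```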
AK (@_akhaliq) 's Twitter Profile Photo

JPEG-LM

LLMs as Image Generators with Canonical Codec Representations

discuss: huggingface.co/papers/2408.08…

Recent work in image and video generation has been adopting the autoregressive LLM architecture due to its generality and potentially easy integration into multi-modal
tsvetshop (@tsvetshop) 's Twitter Profile Photo

Huge congrats to Oreva Ahia and Shangbin Feng for winning awards at #ACL2024!

DialectBench: Best Social Impact Paper Award arxiv.org/abs/2403.11009

Don't Hallucinate, Abstain: Area Chair Award (QA track) & Outstanding Paper Award arxiv.org/abs/2402.00367

Chunting Zhou (@violet_zct) 's Twitter Profile Photo

Introducing *Transfusion* - a unified approach for training models that can generate both text and images. arxiv.org/pdf/2408.11039

Transfusion combines language modeling (next token prediction) with diffusion to train a single transformer over mixed-modality sequences. This
Lili Yu (@liliyu_lili) 's Twitter Profile Photo

🚀 Excited to share our latest work: Transfusion! A new multi-modal generative training recipe combining language modeling and image diffusion in a single transformer! Huge shoutout to Chunting Zhou, Omer Levy, Michi Yasunaga, Arun Babu, Kushal Tirumala, and other collaborators.
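
A heavily simplified sketch of what such a combined objective could look like: one transformer sees a mixed sequence, text positions get a next-token cross-entropy loss, and image positions get a diffusion-style noise-prediction loss. All shapes, the noising schedule, and the attention pattern below are placeholder assumptions; this is a toy reconstruction from the abstract, not the Transfusion implementation.

```python
# Toy sketch of a combined text + image-diffusion objective in one transformer.
import torch
import torch.nn as nn
import torch.nn.functional as F

VOCAB, DIM, PATCH_DIM = 1000, 256, 64

class TinyMixedModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.tok_emb = nn.Embedding(VOCAB, DIM)          # discrete text tokens
        self.img_in = nn.Linear(PATCH_DIM, DIM)          # continuous image patches
        layer = nn.TransformerEncoderLayer(DIM, nhead=4, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, num_layers=2)
        self.lm_head = nn.Linear(DIM, VOCAB)             # next-token prediction
        self.eps_head = nn.Linear(DIM, PATCH_DIM)        # noise prediction

    def forward(self, text_ids, noisy_patches):
        # One sequence, two modalities: text embeddings then image patch embeddings.
        h = torch.cat([self.tok_emb(text_ids), self.img_in(noisy_patches)], dim=1)
        h = self.backbone(h)
        T = text_ids.size(1)
        return self.lm_head(h[:, :T]), self.eps_head(h[:, T:])

model = TinyMixedModel()
text_ids = torch.randint(0, VOCAB, (2, 16))              # toy text segment
patches = torch.randn(2, 8, PATCH_DIM)                   # toy image latents
noise = torch.randn_like(patches)
t = torch.rand(2, 1, 1)                                  # placeholder timestep
noisy = (1 - t) * patches + t * noise                    # simplistic noising

logits, eps_pred = model(text_ids, noisy)
lm_loss = F.cross_entropy(logits[:, :-1].reshape(-1, VOCAB),
                          text_ids[:, 1:].reshape(-1))   # language modeling loss
diffusion_loss = F.mse_loss(eps_pred, noise)             # denoising loss
loss = lm_loss + diffusion_loss                          # single combined objective
loss.backward()
```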

Pang Wei Koh (@pangweikoh) 's Twitter Profile Photo

Check out JPEG-LM, a fun idea led by Xiaochuang Han -- we generate images simply by training an LM on raw JPEG bytes and show that it outperforms much more complicated VQ models, especially on rare inputs.

Marjan Ghazvininejad (@gh_marjan) 's Twitter Profile Photo

Can we train an LM on raw JPEG bytes and generate images with that? Yes we can. Check out JPEG-LM (arxiv.org/abs/2408.08459), a cool work led by @XiaochuangHan, to learn more.
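
A minimal sketch of the idea as described in these tweets: treat the raw bytes of a JPEG file as the token stream of an ordinary autoregressive LM. This is a toy illustration under stated assumptions (the file path, the GRU stand-in, and all hyperparameters are placeholders), not the JPEG-LM training code.

```python
# Toy byte-level "JPEG as a language" sketch (illustration only).
from pathlib import Path
import torch
import torch.nn as nn
import torch.nn.functional as F

def jpeg_to_tokens(path):
    """Read a JPEG file and return its raw bytes as a 1D LongTensor in [0, 255]."""
    return torch.tensor(list(Path(path).read_bytes()), dtype=torch.long)

class ByteLM(nn.Module):
    """256-way next-byte predictor; a GRU stands in for a large transformer."""
    def __init__(self, vocab=256, dim=128):
        super().__init__()
        self.emb = nn.Embedding(vocab, dim)
        self.rnn = nn.GRU(dim, dim, batch_first=True)
        self.head = nn.Linear(dim, vocab)

    def forward(self, ids):
        h, _ = self.rnn(self.emb(ids))
        return self.head(h)

# Stand-in for, e.g., jpeg_to_tokens("cat.jpg") batched into fixed-length chunks.
tokens = torch.randint(0, 256, (1, 512))
model = ByteLM()
logits = model(tokens[:, :-1])
loss = F.cross_entropy(logits.reshape(-1, 256), tokens[:, 1:].reshape(-1))
loss.backward()
# At generation time, sampling bytes autoregressively and writing them out as a
# .jpg file yields an image, which is the appeal of using a canonical codec.
```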