Andy Zhou (@andyz245) Twitter Tweets • TwiDoom

Haohan Wang

7 months ago

New Preprint. 🔒 In an era where AI's influence is surging, government compliance isn't optional—it's imperative. Our research introduces GUARD, a novel system ensuring LLMs and VLMs are tested for adherence to these critical standards. #AICompliance #GovernmentGuidelines

Likes: 3 · Replies: 1 · Reposts: 1

Dan Hendrycks

@danhendrycks

7 months ago

Can hazardous knowledge be unlearned from LLMs without harming other capabilities? We’re releasing the Weapons of Mass Destruction Proxy (WMDP), a dataset about weaponization, and we create a way to unlearn this knowledge. 📝arxiv.org/abs/2403.03218 🔗wmdp.ai

Likes: 244 · Replies: 13 · Reposts: 64

Haohan Wang

@haohanwang

6 months ago

🚀 Thanks for hosting! Excited to share our latest work on jailbreaking LLMs: 1️⃣ Compliance testing with jailbreak 🧐 arxiv.org/abs/2402.03299 2️⃣ systematic approach to defense 💪 arxiv.org/abs/2401.17263 with Haibo, Andy Zhou, Lapis Labs, and Bo Li; Trustworthy ML Initiative (TrustML)

Likes: 17 · Replies: 0 · Reposts: 5

LlamaIndex 🦙

@llama_index

5 months ago

Language Agent Tree Search 🤖🌲 As LLMs get faster, better, cheaper, developers will be able to compose agentic systems that are able to plan out an entire tree of possible futures, instead of just sequentially planning the next state (e.g. in ReAct). This is crucial for higher

Likes: 356 · Replies: 6 · Reposts: 92
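The contrast drawn in the post above, between planning only the next state and planning over a tree of possible futures, can be illustrated with a toy sketch. This is not the Language Agent Tree Search algorithm itself, which the post does not spell out. The tree, the values, and the names `greedy_plan`, `tree_plan`, `TREE`, and `VALUE` are all hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical toy environment: each state maps to its possible next states.
TREE: Dict[str, List[str]] = {"root": ["a", "b"], "a": ["a1"], "b": ["b1"]}
# Hypothetical value estimates for each state (higher is better).
VALUE: Dict[str, float] = {"root": 0.0, "a": 0.6, "b": 0.4, "a1": 0.1, "b1": 0.9}


def greedy_plan(state: str,
                expand: Callable[[str], List[str]],
                value: Callable[[str], float],
                depth: int) -> List[str]:
    """Sequential planning: always step to the best-looking next state."""
    path = [state]
    for _ in range(depth):
        children = expand(state)
        if not children:
            break
        state = max(children, key=value)
        path.append(state)
    return path


def tree_plan(state: str,
              expand: Callable[[str], List[str]],
              value: Callable[[str], float],
              depth: int) -> Tuple[float, List[str]]:
    """Exhaustive depth-limited search: return the best final value and its path."""
    children = expand(state) if depth > 0 else []
    if not children:
        return value(state), [state]
    best_value, best_path = max(
        (tree_plan(child, expand, value, depth - 1) for child in children),
        key=lambda result: result[0],
    )
    return best_value, [state] + best_path


def expand(state: str) -> List[str]:
    return TREE.get(state, [])


def value(state: str) -> float:
    return VALUE[state]


greedy_path = greedy_plan("root", expand, value, depth=2)
best_value, best_path = tree_plan("root", expand, value, depth=2)
print(greedy_path)  # ['root', 'a', 'a1']
print(best_path)    # ['root', 'b', 'b1']
```

In this toy tree the greedy planner commits to `a` because it looks best one step ahead, while the tree search finds the higher-valued leaf under `b`. Exhaustive expansion is only workable for tiny trees; a practical system would need some way to prioritize which branches to explore.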

Andy Zhou

@andyz245

5 months ago

Pleased to announce that Language Agent Tree Search was accepted to #ICML2024! We propose a general search algorithm for LM agents that effectively navigates the prompt space for agent tasks. Check it out here: arxiv.org/abs/2310.04406. Great LangChain implementation here

Likes: 6 · Replies: 0 · Reposts: 2

John Yang

@jyangballin

5 months ago

The SWE-agent preprint has finally landed! Check it out at swe-agent.com/paper.pdf

Likes: 323 · Replies: 6 · Reposts: 67

Andy Zhou

@andyz245

3 months ago

I got an email bringing this paper to my attention that mentioned something was concerning, and the methodology is almost exactly the same as our ICML 2024 work (released Oct 2023), Language Agent Tree Search (arxiv.org/abs/2310.04406), but did not cite us...

Likes: 100 · Replies: 4 · Reposts: 11

Revanth Gangi Reddy

@gangi_official

3 months ago

Introducing FIRST: Faster Improved Listwise Reranking with Single Token Decoding arxiv.org/pdf/2406.15657 Listwise LLM reranking typically outputs the ranking order as a generation sequence. Instead, we use output logits of the first generated identifier to obtain the ranking.

Likes: 46 · Replies: 2 · Reposts: 10
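The idea in the FIRST post above, reading the ranking off the logits of the first generated identifier rather than decoding a full ordering, can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation. The function name `rank_by_first_token_logits`, the identifier labels, and the logit values are hypothetical, and the sketch assumes the model's logits for each candidate identifier token at the first decoding step have already been extracted.

```python
from typing import Dict, List


def rank_by_first_token_logits(identifier_logits: Dict[str, float]) -> List[str]:
    """Order candidate identifiers by their logit at the first decoding step.

    identifier_logits maps each passage identifier (e.g. "A", "B") to the
    logit the model assigns to that identifier as the first output token.
    A higher logit is treated as a higher rank.
    """
    return sorted(identifier_logits, key=lambda ident: identifier_logits[ident], reverse=True)


# Hypothetical logits for three candidate passages labelled A, B, C.
example_logits = {"A": 1.2, "B": 3.4, "C": -0.5}
ranking = rank_by_first_token_logits(example_logits)
print(ranking)  # ['B', 'A', 'C']
```

Because only one decoding step is needed, the whole ranking comes from a single forward pass over the prompt instead of generating the full sequence of identifiers.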

Virtue AI

@virtueai_co

2 months ago

We present AIR 2024, a unified AI Risk Taxonomy for AI regulation and company policy-guided risk assessment and compliance, jointly with Stanford University's HELM. 📜Blog: virtueai.com/2024/07/27/dec…

Likes: 9 · Replies: 1 · Reposts: 4

Andy Zhou

@andyz245

2 months ago

Excited to release this work on AI policy! We map out company/government policies into a taxonomy with 314 potentially harmful categories arxiv.org/abs/2406.17864

Likes: 6 · Replies: 0 · Reposts: 0

Andrew Curran

@andrewcurran_

2 months ago

Leading the Superalignment team is like taking the Defense Against the Dark Arts position.

Likes: 879 · Replies: 26 · Reposts: 78

alphaXiv

@askalphaxiv

a month ago

Excited to feature Tamper-Resistant Safeguards for Open-Weight LLMs from Lapis Labs! Introducing the first safeguards for LLMs that resist fine-tuning attacks, showing the power of tamper-resistance to make open-weight LLMs safer. Rishub Tamirisa is here to answer your questions!

Likes: 9 · Replies: 1 · Reposts: 4

Chi Wang

@chi_wang_

a month ago

Join us for the invited talk on Language Agent Tree Search at the AutoGen community meetup (do not miss it!). Time: Monday, August 26, 9am PT. Event link: discord.com/events/1153072… Abstract: ⬇️

Likes: 19 · Replies: 2 · Reposts: 5