Andy Zhou
@andyz245
undergrad student at UIUC | intern @virtueai_co | @lapisrocks
ID: 765646771498340352
http://www.andyzhou.ai
16-08-2016 20:29:50
673 Tweets
416 Followers
406 Following
🚀 Thanks for hosting! Excited to share our latest work on jailbreaking LLMs: 1️⃣ Compliance testing with jailbreaks 🧐 arxiv.org/abs/2402.03299 2️⃣ A systematic approach to defense 💪 arxiv.org/abs/2401.17263 with Haibo, Andy Zhou, Lapis Labs, and Bo Li; Trustworthy ML Initiative (TrustML)
We present AIR 2024, a unified AI Risk Taxonomy for AI regulation and company policy-guided risk assessment and compliance, jointly with Stanford University's HELM. 📜Blog: virtueai.com/2024/07/27/dec…
Excited to feature Tamper-Resistant Safeguards for Open-Weight LLMs from Lapis Labs! Introducing the first safeguards for LLMs that resist fine-tuning attacks, showing the power of tamper-resistance to make open-weight LLMs safer. Rishub Tamirisa is here to answer your questions!