Hadas Orgad
@orgadhadas
PhD student (Natural Language Processing) @ Technion, Israel. Interested in AI interpretability, robustness and safety
ID: 1121835405454786561
https://orgadhadas.github.io/ 26-04-2019 17:56:18
98 Tweets
215 Followers
107 Following
Attending #WACV2024? Make sure you visit our poster on debiasing, moderating and erasing concepts in text2img models! @ 8PM today. Rohit Gandikota will be there.
If you're looking for our recent paper on model editing on the ACL anthology and cannot find it, it's because it has been taken down without cause or due process. The paper is still available on arXiv, feel free to read it there. arxiv.org/abs/2310.11958 aclanthology.org/2023.findings-…
Our paper Diffusion Lens got accepted to #ACL2024 main conference! 🌴⭐️ Visualize LLMs' computation process with our live demo >> huggingface.co/spaces/tokeron… For a quick TL;DR check out Michael Toker's thread or project website - tokeron.github.io/DiffusionLensW…
Interested in text-to-image models? Come say hi today at poster session 2 and hear about the diffusion lens - our new method of interpreting text encoders of t2i models ✨📸 #acl2024nlp #ACL2024 w/ Michael Toker Mor Ventura Hadas Orgad Yonatan Belinkov
Had a blast discussing how mechanistic interpretability can improve AI safety at #NEMI2024 🤖 Thanks to David Bau, Max Tegmark, Koyena Pal @ VLDB 2024, Kenneth Li, Eric J. Michaud & Jannik Brinkmann for bringing us together! And to my amazing co-panelists Hadas Orgad & David Krueger!
#NEMI2024 was really fun and informative. It was great having so many smart people passionate about mech interp in one room. Thanks Koyena Pal @ VLDB 2024, David Bau, Max Tegmark, Kenneth Li, Eric J. Michaud & Jannik Brinkmann for organizing this!