Yossi Gandelsman (@ygandelsman)'s Twitter Profile
Yossi Gandelsman

@ygandelsman

PhD student at Berkeley AI

ID: 1275430079754092550

Link: http://yossi.gandelsman.com

Joined: 23-06-2020 14:06:56

114 Tweets

737 Followers

428 Following

Mechanistic interpretability is not only a good way to understand what is going on in a model, but it is also a tool for discovering "model bugs" and exploiting them!

Our new paper shows that understanding CLIP neurons enables automatic generation of semantic adversarial images: