Rylan Schaeffer (@rylanschaeffer) 's Twitter Profile
Rylan Schaeffer

@rylanschaeffer

AI/ML arborist (or arsonist, depending on your viewpoint)

ID: 387918067

Link: http://rylanschaeffer.github.io · Joined: 09-10-2011 21:35:36

1.1K Tweets

4.4K Followers

1.1K Following

Luke Bailey (@lukebailey181) 's Twitter Profile Photo

Excited to have been a part of this work! I was surprised that transferable jailbreak attacks were so difficult to find. Our results show that single images can jailbreak multiple models. This means black-box transfer **may** be possible with future more complex attacks.

Ethan Perez (@ethanjperez) 's Twitter Profile Photo

Gradient-based adversarial image attacks/jailbreaks don't seem to transfer across vision-language models, unless the models are *really* similar. This is good (and IMO surprising) news for the robustness of VLMs! Check out our new paper on when these attacks do/don't transfer:

Aparna Dhinakaran (@aparnadhinak) 's Twitter Profile Photo

Ⓜ️📉 Model Collapse seemed to be all over Twitter last week, including a sky-is-falling post from Alexandr Wang. As I finally got around to reading the papers and landed on Rylan Schaeffer's paper, I realized that the Model Collapse paper had a huge hole in it. TLDR: It assumes all the