Rylan Schaeffer (@rylanschaeffer) 's Twitter Profile
Rylan Schaeffer

@rylanschaeffer

AI/ML arborist (or arsonist, depending on your viewpoint)

ID: 387918067

Link: http://rylanschaeffer.github.io · Joined: 09-10-2011 21:35:36

1.1K Tweets

4.4K Followers

1.1K Following

Luke Bailey (@lukebailey181) 's Twitter Profile Photo

Excited to have been a part of this work! I was surprised that transferable jailbreak attacks were so difficult to find. Our results show that single images can jailbreak multiple models. This means black-box transfer **may** be possible with future more complex attacks.

Ethan Perez (@ethanjperez) 's Twitter Profile Photo

Gradient-based adversarial image attacks/jailbreaks don't seem to transfer across vision-language models, unless the models are *really* similar. This is good (and IMO surprising) news for the robustness of VLMs! Check out our new paper on when these attacks do/don't transfer:

Aparna Dhinakaran (@aparnadhinak) 's Twitter Profile Photo

Ⓜ️📉 Model Collapse seemed to be all over Twitter last week, including a sky-is-falling post from Alexandr Wang. As I finally got around to reading the papers and landed on Rylan Schaeffer's paper, I realized that the Model Collapse paper had a huge hole in it. TLDR: It assumes all the