Trustworthy ML Initiative (TrustML) (@trustworthy_ml)'s Twitter Profile
Trustworthy ML Initiative (TrustML)

@trustworthy_ml

Latest research in Trustworthy ML. Organizers: @JaydeepBorkar @sbmisi @hima_lakkaraju @sarahookr Sarah Tan @chhaviyadav_ @_cagarwal @m_lemanczyk @HaohanWang

ID: 1262375165490540549

Website: https://www.trustworthyml.org · Joined: 18-05-2020 13:31:24

1.1K Tweets

6.6K Followers

66 Following

Florian Tramèr (@florian_tramer)

🔥 We're releasing the strongest membership inference attack for foundation models! 🔥
Our attack applies to LLMs, VLMs, CLIP, and diffusion models, and is SOTA on all of them🥇

Not only is our attack a magnificent breakthrough, it is also *magic*: we don't look at the ML model at all🪄
🧵👇
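For readers wondering how an attack can work without ever touching the model: below is a rough sketch of the idea behind a "blind" membership-inference baseline that exploits distribution shift between candidate member and non-member sets. The toy data, classifier choice, and names are illustrative assumptions, not the authors' code.

```python
# Illustrative sketch (not the authors' code): a "blind" membership-inference
# baseline that never queries the target model. If the candidate member and
# non-member sets come from slightly different distributions, a plain text
# classifier alone can separate them. The toy data below are placeholders.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

members = ["passage believed to be in the training set",
           "older web text written before the training cutoff"]
non_members = ["freshly written article from after the cutoff",
               "newly scraped page that cannot be in the training data"]

texts = members + non_members
labels = [1] * len(members) + [0] * len(non_members)

train_x, test_x, train_y, test_y = train_test_split(
    texts, labels, test_size=0.5, stratify=labels, random_state=0)

vec = TfidfVectorizer().fit(train_x)                      # features from text only
clf = LogisticRegression().fit(vec.transform(train_x), train_y)
scores = clf.predict_proba(vec.transform(test_x))[:, 1]   # "membership" scores
print("model-free membership AUC:", roc_auc_score(test_y, scores))
```

If such a model-free classifier already reaches high AUC, the member/non-member split itself is leaking the answer, which is the sense in which the attack needs no model access.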
A. Feder Cooper (@afedercooper)

Here's my last PhD paper-- “The Files are in the Computer: Copyright, Memorization, and Generative-AI Systems”

James Grimmelmann & I address ambiguity over the relationship between copying and memorization: when a (near-)exact copy of training data can be reconstructed from a model.
Ilia Shumailov🦔 (@iliaishacked)

Unlearning, originally proposed for privacy, is today often discussed as a content-regulation tool: if my model doesn't know X, it is safe. We argue that unlearning provides only an illusion of safety, since adversaries can inject malicious knowledge back into the models.
arxiv.org/pdf/2407.00106
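A hedged toy illustration of the reinjection concern (my example, not the paper's setup): an adversary who still holds the removed content can simply supply it back in the prompt, so an "unlearned" model behaves much as before. The model name and strings below are placeholders.

```python
# Toy illustration (placeholder model and strings): reintroducing "unlearned"
# knowledge in-context. Unlearning changes the weights, but it cannot stop an
# adversary from handing the removed content straight back to the model.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # stand-in for an unlearned checkpoint
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

removed_fact = "X is the piece of knowledge the model was supposed to forget."
prompt = f"Context: {removed_fact}\nQuestion: What do you know about X?\nAnswer:"

inputs = tok(prompt, return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=40)
print(tok.decode(output[0], skip_special_tokens=True))
```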
Niloofar Mireshghallah (@niloofar_mire)

Our work on the challenges and inconclusiveness of membership inference attacks on LLMs has been accepted to the Conference on Language Modeling!! arxiv.org/abs/2402.07841 This work has instigated new directions and many conversations on MIA evaluations; I will list them here in this thread, feel free to add to it!

Weijia Shi (@weijiashi2)

Can 𝐦𝐚𝐜𝐡𝐢𝐧𝐞 𝐮𝐧𝐥𝐞𝐚𝐫𝐧𝐢𝐧𝐠 make language models forget their training data?

We show yes, but at the cost of privacy and utility. Current unlearning scales poorly with the size of the data to be forgotten and can't handle sequential unlearning requests.

🔗:
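For context on why forgetting tends to hurt utility, here is a minimal sketch of one common unlearning baseline, gradient ascent on the forget set; the model name, data, and hyperparameters are illustrative assumptions, not the paper's setup.

```python
# Minimal sketch of a gradient-ascent unlearning baseline (illustrative; not the
# paper's code). We maximize the LM loss on the forget set, which pushes the model
# away from that data but also tends to damage general utility as the forget set grows.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder; any causal LM works
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.train()

forget_texts = ["a memorized passage that should be forgotten"]  # placeholder forget set
opt = torch.optim.AdamW(model.parameters(), lr=1e-5)

for _ in range(3):  # a few ascent passes over the forget set
    for text in forget_texts:
        batch = tok(text, return_tensors="pt")
        loss = -model(**batch, labels=batch["input_ids"]).loss  # negate: ascend, not descend
        opt.zero_grad()
        loss.backward()
        opt.step()
```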
Niloofar Mireshghallah (@niloofar_mire)

When talking about the personal data people share with OpenAI and the privacy implications, I get the 'come on! people don't share that with ChatGPT!🫷'

In our Conference on Language Modeling paper, we study disclosures and find many concerning⚠️ cases of sensitive information sharing:

tinyurl.com/ChatGPT-person…
The GenLaw Center (@genlawcenter)

Really excited to announce this year's list of accepted papers + spotlights! We had _so_ many submissions, many wonderful reviewers, and a final list of 66 accepted papers. 

Full papers will be posted before the workshop on July 27th!

genlaw.org/2024-icml-pape…
Eoin Delaney (@eoindelaney_)

🚨New paper and fairness toolkit alert🚨 Announcing OxonFair: A Flexible Toolkit for Algorithmic Fairness, w/ Zihao Fu, Sandra Wachter, Brent Mittelstadt and Chris Russell.
Toolkit: github.com/oxfordinternet…
Paper: papers.ssrn.com/sol3/papers.cf…

Swarnadeep Saha (@swarnanlp)

🚨 New: my last PhD paper 🚨

Introducing System-1.x, a controllable planning framework with LLMs. It draws inspiration from Dual-Process Theory, which argues for the co-existence of fast/intuitive System-1 and slow/deliberate System-2 planning.

System 1.x generates hybrid plans
Besmira Nushi 💙💛 (@besanushi)

Microsoft Research Cambridge UK is hiring on topics related to equitable and responsible multi-modal AI. The team pursues efforts at the intersection of vision and language and is passionate about all aspects of #ResponsibleAI including fairness, reliability, interpretability. I

Pin-Yu Chen (@pinyuchentw)

🚩(1/2) 
Please help forward the Call for the 2024 Adversarial Machine Learning (AdvML) Rising Star Awards!

We promote junior researchers in AI safety, robustness, and security. Award events are hosted at the AdvML-Frontiers workshop at the NeurIPS Conference 2024.

Info: sites.google.com/view/advml/adv…
sijia.liu (@sijialiu17)

The 3rd AdvML-Frontiers Workshop (AdvMLFrontiers, advml-frontier.github.io) is set for #NeurIPS 2024 (NeurIPS Conference)! This year, we're delving into the expansion of the trustworthy AI landscape, especially in large multi-modal systems. Trustworthy ML Initiative (TrustML)
LLM Security🚀

We're now
𝙷𝚒𝚖𝚊 𝙻𝚊𝚔𝚔𝚊𝚛𝚊𝚓𝚞 (@hima_lakkaraju)

Excited to announce the 2nd edition of our Regulatable ML workshop at the NeurIPS Conference! We plan to debate burning questions around the regulation of generative #AI and Artificial General Intelligence (#AGI).

We are accepting submissions until Aug. 30th -- regulatableml.github.io [1/N]
Canyu Chen (@canyuchen3)

🤔Are your open-source LLMs really safe? 
🚨They may be injected with misinformation or bias!

Our new paper "𝐂𝐚𝐧 𝐄𝐝𝐢𝐭𝐢𝐧𝐠 𝐋𝐋𝐌𝐬 𝐈𝐧𝐣𝐞𝐜𝐭 𝐇𝐚𝐫𝐦?" (Project website: llm-editing.github.io ) sheds light on the emerging challenges of LLMs, especially the
Chhavi Yadav (@chhaviyadav_)

📰 Excited to be organizing a workshop on interpretability at the NeurIPS Conference '24, called 'Interpretable AI: Past, Present and Future'. Submit to our workshop for all things inherently interpretable!
Submission deadline: 30 Aug
🔗 interpretable-ai-workshop.github.io
Follow this account for updates!

Konrad Rieck 🌈 (@mlsec)

🚨 We are extending the Call for Papers for the 3rd IEEE Conference on Secure and Trustworthy Machine Learning (SaTML)!

👉 satml.org/participate-cf…
⏰ New Deadline: Sep 27

This extension gives you more time to submit your best work on secure AI algorithms and systems😉