Arvind Narayanan (@random_walker)'s Twitter Profile
Arvind Narayanan

@random_walker

Princeton CS prof. Director @PrincetonCITP. I write about the societal impact of AI, tech ethics, & social media platforms.
BOOK: AI Snake Oil. Views mine.

ID:10834752

https://www.cs.princeton.edu/~arvindn/
Joined 04-12-2007 11:14:14

12.0K Tweets

119.1K Followers

413 Following

Michael Lones (@michael_lones)

Great to have been involved in this initiative led by Sayash Kapoor and Arvind Narayanan to (hopefully!) improve the use of machine learning in science. Further thoughts in my Substack post: fetchdecodeexecute.substack.com/p/reforms-a-gu…

Peter Henderson (@PeterHndrsn)

To my mind, unconstrained military use of AI is one of the riskiest uses and is underemphasized in policymaking. Military use must be a central part of AI Safety discussions. Glad to see a couple of new pieces emphasizing this point.

ft.com/content/da03f8…

foreignaffairs.com/united-states/…

Ethan Zuckerman (@EthanZ)

With my brilliant friends at Knight First Amendment Institute, I filed suit against Meta today, asking a federal court to find that CDA section 230 gives users rights to control what they see on social media via third party tools. See our complaint at knightcolumbia.org/cases/zuckerma…

Jessica Hullman (@JessicaHullman)

Lots of practical advice to help researchers doing ML-based science avoid unintentional irreproducibility and overgeneralization in this new paper led by Sayash Kapoor

rishi (@RishiBommasani)

REFORMS is an exceptional work by an ensemble cast spanning institutions and disciplines! Check it out!

The approach also directly inspired our work on open foundation models, where we worked towards consensus across folks from different institutions:
crfm.stanford.edu/open-fms/

Sayash Kapoor (@sayashk)

Excited to share that our paper introducing the REFORMS checklist is now out in Science Advances!

In it, we:
- review common errors in ML for science
- create a checklist of 32 items applicable across disciplines
- provide in-depth guidelines for each item

science.org/doi/10.1126/sc…

Musa al-Gharbi (@Musa_alGharbi)

MTurk is basically junk responses. People often lie about their background characteristics. And they often choose the same answer for most questions, regardless of content, such that you can ask people opposing questions and get completely incoherent results (even after screening…

Scott Condron (@_ScottCondron)

- Agents are costly and that should be jointly optimized with task accuracy
- Simple baselines like retrying, retrying with different temps, retrying with better models outperform complex Agents on the Pareto frontier of cost/accuracy
- reproducibility & benchmarks continue to be…
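
For readers unfamiliar with the term, a cost/accuracy Pareto frontier can be sketched roughly as follows. This is only an illustration, not the method used in the work being discussed: the `EvalResult` type, the `pareto_frontier` helper, and every name and number in `results` are invented for the example.

```python
from typing import NamedTuple


class EvalResult(NamedTuple):
    name: str        # hypothetical agent or baseline label
    cost: float      # dollars per task (lower is better)
    accuracy: float  # fraction of tasks solved (higher is better)


def pareto_frontier(results: list[EvalResult]) -> list[EvalResult]:
    """Return the results that no other result dominates.

    A result is dominated when another one is at least as cheap and at
    least as accurate, and strictly better on one of the two axes.
    Exact duplicates are collapsed to a single entry.
    """
    # Cheapest first; among equal costs, most accurate first.
    ordered = sorted(results, key=lambda r: (r.cost, -r.accuracy))
    frontier = []
    best_accuracy = float("-inf")
    for r in ordered:
        # Keep a result only if it beats every cheaper-or-equal option.
        if r.accuracy > best_accuracy:
            frontier.append(r)
            best_accuracy = r.accuracy
    return frontier


# Invented numbers, purely for illustration.
results = [
    EvalResult("complex_agent", 2.50, 0.80),
    EvalResult("retry_baseline", 0.40, 0.82),
    EvalResult("single_call", 0.05, 0.60),
    EvalResult("retry_better_model", 1.20, 0.90),
]

for r in pareto_frontier(results):
    print(f"{r.name}: ${r.cost:.2f}, accuracy {r.accuracy:.2f}")
```

With these made-up numbers, `complex_agent` falls off the frontier because `retry_better_model` is both cheaper and more accurate.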

Veniamin Veselovsky (@VminVsky)

put your mouth where your money is

amazing how adding cost on an axis can directionally shift interpretations on many ai agent approaches!

Arvind Narayanan (@random_walker)

We fully recognize that there are downsides to reporting dollar costs in evals, given that costs change quickly. The difference in our perspective comes down to model evaluation vs downstream evaluation (our focus is the latter). aisnakeoil.com/p/ai-leaderboa…
