Roger Grosse (@rogergrosse) 's Twitter Profile
Roger Grosse

@rogergrosse

ID: 3301643341

Joined: 30-07-2015 14:30:37

996 Tweets

10.1K Followers

782 Following

davidad 🎇 (@davidad) 's Twitter Profile Photo

It is generally underappreciated that Marvin Minsky in 1984 sketched a taxonomy of AI loss-of-control risks which included specification gaming, goal misgeneralization, and treacherous turns prompted by recursive self-improvement.

Logan Graham (@logangraham) 's Twitter Profile Photo

Evals are critical for measuring AI capabilities + safety. If you're building evals, I'd like you to apply for our support. Here are some we wish existed. anthropic.com/news/a-new-ini…

Roger Grosse (@rogergrosse) 's Twitter Profile Photo

I made this diagram with AI in mind, but I wonder if human institutional decay is basically a matter of following this curve in the opposite direction. Institutions fail once everyone is basically "just GPTing it."

CIFAR (@cifar_news) 's Twitter Profile Photo

With a powerful technology like AI, training in ethics and safety is vital for emerging AI researchers and developers. #DLRL2024 yesterday featured an engaging panel on AI safety with Canada CIFAR AI Chairs at the Vector Institute, Sheila McIlraith and Roger Grosse.

Roger Grosse (@rogergrosse) 's Twitter Profile Photo

One of the joys of teaching is seeing your students' projects turn into interesting papers. Here's some very nice work by david glukhov and collaborators on the challenges of ensuring harmlessness of LLMs that can be queried obliquely and repeatedly.

Nathan Ng (@learn_ng) 's Twitter Profile Photo

The minimum description length principle is an attractive Bayesian alternative for quantifying uncertainty, but how can we get it to work efficiently and accurately at scale? Excited to share our ICML work on measuring stochastic complexity with Boltzmann influence functions!

Richard Ngo (@richardmcngo) 's Twitter Profile Photo

Some thoughts on open-source AI: 1. We should have a strong prior favoring open source. It’s been a huge success driving tech progress over many decades. We forget how counterintuitive it was originally, and shouldn’t take it for granted.

Roger Grosse (@rogergrosse) 's Twitter Profile Photo

Back in 2010, during my PhD, I explored some ideas for learning twist functions for SMC. (The twists were linear random feature models since this was pre-DL-era.) I didn't try to publish since I couldn't think of a compelling use case. Sometimes you just have to wait.
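
The tweet only gestures at the idea, so here is a rough, hypothetical sketch (not Grosse's 2010 setup; the toy model, names, and parameters are all assumed for illustration) of sequential Monte Carlo with a learned twist that is linear in random Fourier features, in the spirit of "linear random feature models": each intermediate target is reshaped by a twist psi_t(x_t), whose contribution is divided back out at the next step.

import numpy as np

rng = np.random.default_rng(0)

def random_fourier_features(x, W, b):
    # phi(x): fixed random Fourier features of a scalar state x, shape (N,) -> (N, d).
    return np.cos(np.outer(x, W) + b)

def log_twist(x, theta, W, b):
    # log psi(x) = phi(x) @ theta: a twist that is linear in the random features.
    return random_fourier_features(x, W, b) @ theta

def twisted_smc(ys, thetas, W, b, n_particles=500, sigma_x=1.0, sigma_y=0.5):
    # SMC for a toy 1-D Gaussian random-walk model, where each intermediate
    # target is reweighted by a learned twist psi_t(x_t) that approximates the
    # value of future observations. (In full twisted SMC the twist would also
    # shape the proposal; here it only enters the weights, for brevity.)
    T = len(ys)
    x = rng.normal(0.0, 1.0, n_particles)        # draw particles from the prior p(x_0)
    prev_log_psi = np.zeros(n_particles)         # psi before the first step is 1
    log_Z = 0.0
    for t in range(T):
        if t > 0:
            x = x + rng.normal(0.0, sigma_x, n_particles)          # transition f(x_t | x_{t-1})
        log_g = -0.5 * ((ys[t] - x) / sigma_y) ** 2 - np.log(sigma_y * np.sqrt(2.0 * np.pi))
        # Twist every intermediate target except the last one (psi_T := 1).
        log_psi = log_twist(x, thetas[t], W, b) if t < T - 1 else np.zeros(n_particles)
        log_w = log_g + log_psi - prev_log_psi                     # incremental importance weight
        m = log_w.max()
        log_Z += m + np.log(np.mean(np.exp(log_w - m)))            # running log-marginal-likelihood estimate
        w = np.exp(log_w - m)
        idx = rng.choice(n_particles, size=n_particles, p=w / w.sum())
        x, prev_log_psi = x[idx], log_psi[idx]                     # multinomial resampling
    return log_Z

# Toy usage with untrained (zero) twist weights, which reduces to a bootstrap filter.
d = 50
W, b = rng.normal(size=d), rng.uniform(0, 2 * np.pi, d)
ys = rng.normal(size=10)
thetas = [np.zeros(d) for _ in range(len(ys))]
print(twisted_smc(ys, thetas, W, b))

In practice the twist weights would be fit rather than left at zero, for instance by regressing backward in time on estimates of the future log-likelihood, which is one common way to learn twist functions.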

Andrew Critch (h/acc) (@andrewcritchphd) 's Twitter Profile Photo

Zuckerberg's message here is really important. I prefer to live in a world where small businesses and solo researchers have transparency into AI model weights. It parallelizes and democratizes AI safety, security, and ethics research. I've been eagerly awaiting Llama 3.1, and I'm

Chris J. Maddison (@cjmaddison) 's Twitter Profile Photo

What if the next medical breakthrough is hidden in plain text? Causal estimates drive progress, but data is limited & RCTs are slow. Introducing NATURAL: a pipeline for causal estimation from text data in hours, not years.
Paper: tinyurl.com/ppr29
Site: tinyurl.com/web98

Roger Grosse (@rogergrosse) 's Twitter Profile Photo

In 2021, I wrote up a timeline for how things might progress if we were 30 years away from artificial superintelligence. (This was a thought experiment rather than a forecast -- it felt aggressive at the time.) This timeline had an AI win an IMO gold medal in 2035.

Schwartz Reisman Institute (@torontosri) 's Twitter Profile Photo

The field of AI safety emphasizes AI should “do no harm.” But lethal autonomous systems used in warfare are already causing harm. How should we think about purposely harmful AI? SRI Grad Fellow Michael Zhang writes about a panel exploring this topic: uoft.me/aJh

Zachary Nado (@zacharynado) 's Twitter Profile Photo

"Non-diagonal preconditioning has dethroned Nesterov Adam" 🧴👑 shampoo wins, finally the community can know what we have for years! this benchmark has been 3+ years in the making (we first talked about it Google in 2021), I'm beyond psyched that it's finally yielded results!

Stuart Ritchie 🇺🇦 (@stuartjritchie) 's Twitter Profile Photo

Look, here’s the thing about free speech: YES, it’s not “absolute”. Even the most hardcore free speech advocates agree that there are exceptions. Extreme case: telling e.g. Russia about UK military secrets is “just” a speech act, but it is (and should be) illegal in UK law.

Amanda Askell (@amandaaskell) 's Twitter Profile Photo

Joining a company you think is bad in order to be a force for good from the inside is the career equivalent of "I can change him".

Roger Grosse (@rogergrosse) 's Twitter Profile Photo

A strange but plausible future is one where 2028-era AIs can autonomously spit out NeurIPS papers and this results in no discernible speedup in scientific progress.

Schwartz Reisman Institute (@torontosri) 's Twitter Profile Photo

What is "safe" AI? Why is it difficult to achieve? Can LLMs be hacked? Are the existential risks of advanced AI exaggerated—or justified? Join us next week on Sept. 10 to hear from AI experts Karina Vold,Roger Grosse,Sedef Akinli Kocak, and Sheila McIlraith. 🔗 uoft.me/aLB

What is "safe" AI? Why is it difficult to achieve? Can LLMs be hacked? Are the existential risks of advanced AI exaggerated—or justified?  

Join us next week on Sept. 10 to hear from AI experts <a href="/karinavold/">Karina Vold</a>,<a href="/RogerGrosse/">Roger Grosse</a>,<a href="/sedak99/">Sedef Akinli Kocak</a>, and <a href="/SheilaMcIlraith/">Sheila McIlraith</a>.

đź”— uoft.me/aLB