Roger Grosse (@rogergrosse) 's Twitter Profile
Roger Grosse

@rogergrosse

ID: 3301643341

Joined: 30-07-2015 14:30:37

996 Tweets

10.1K Followers

782 Following

davidad 🎇 (@davidad) 's Twitter Profile Photo

It is generally underappreciated that Marvin Minsky in 1984 sketched a taxonomy of AI loss-of-control risks which included specification gaming, goal misgeneralization, and treacherous turns prompted by recursive self-improvement.

Logan Graham (@logangraham) 's Twitter Profile Photo

Evals are critical for measuring AI capabilities + safety. If you're building evals, I'd like you to apply for our support. Here are some we wish existed. anthropic.com/news/a-new-ini…

Roger Grosse (@rogergrosse) 's Twitter Profile Photo

I made this diagram with AI in mind, but I wonder if human institutional decay is basically a matter of following this curve in the opposite direction. Institutions fail once everyone is basically "just GPTing it."

CIFAR (@cifar_news) 's Twitter Profile Photo

With a powerful technology like AI, training in ethics and safety is vital for emerging AI researchers and developers. #DLRL2024 yesterday featured an engaging panel on AI safety with Canada CIFAR AI Chairs at the Vector Institute, Sheila McIlraith and Roger Grosse.

Roger Grosse (@rogergrosse) 's Twitter Profile Photo

One of the joys of teaching is seeing your students' projects turn into interesting papers. Here's some very nice work by david glukhov and collaborators on the challenges of ensuring harmlessness of LLMs that can be queried obliquely and repeatedly.

Nathan Ng (@learn_ng) 's Twitter Profile Photo

The minimum description length principle is an attractive Bayesian alternative for quantifying uncertainty, but how can we get it to work efficiently and accurately at scale? Excited to share our ICML work on measuring stochastic complexity with Boltzmann influence functions!

Richard Ngo (@richardmcngo) 's Twitter Profile Photo

Some thoughts on open-source AI: 1. We should have a strong prior favoring open source. It’s been a huge success driving tech progress over many decades. We forget how counterintuitive it was originally, and shouldn’t take it for granted.

Roger Grosse (@rogergrosse) 's Twitter Profile Photo

Back in 2010, during my PhD, I explored some ideas for learning twist functions for SMC. (The twists were linear random feature models since this was pre-DL-era.) I didn't try to publish since I couldn't think of a compelling use case. Sometimes you just have to wait.
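
The tweet only gestures at the idea, so here is a rough, hypothetical sketch (not Grosse's 2010 setup; the toy model, names, and parameters are all assumed for illustration) of sequential Monte Carlo with a learned twist that is linear in random Fourier features, in the spirit of "linear random feature models": each intermediate target is reshaped by a twist psi_t(x_t), whose contribution is divided back out at the next step.

import numpy as np

rng = np.random.default_rng(0)

def random_fourier_features(x, W, b):
    # phi(x): fixed random Fourier features of a scalar state x, shape (N,) -> (N, d).
    return np.cos(np.outer(x, W) + b)

def log_twist(x, theta, W, b):
    # log psi(x) = phi(x) @ theta: a twist that is linear in the random features.
    return random_fourier_features(x, W, b) @ theta

def twisted_smc(ys, thetas, W, b, n_particles=500, sigma_x=1.0, sigma_y=0.5):
    # SMC for a toy 1-D Gaussian random-walk model, where each intermediate
    # target is reweighted by a learned twist psi_t(x_t) that approximates the
    # value of future observations. (In full twisted SMC the twist would also
    # shape the proposal; here it only enters the weights, for brevity.)
    T = len(ys)
    x = rng.normal(0.0, 1.0, n_particles)        # draw particles from the prior p(x_0)
    prev_log_psi = np.zeros(n_particles)         # psi before the first step is 1
    log_Z = 0.0
    for t in range(T):
        if t > 0:
            x = x + rng.normal(0.0, sigma_x, n_particles)          # transition f(x_t | x_{t-1})
        log_g = -0.5 * ((ys[t] - x) / sigma_y) ** 2 - np.log(sigma_y * np.sqrt(2.0 * np.pi))
        # Twist every intermediate target except the last one (psi_T := 1).
        log_psi = log_twist(x, thetas[t], W, b) if t < T - 1 else np.zeros(n_particles)
        log_w = log_g + log_psi - prev_log_psi                     # incremental importance weight
        m = log_w.max()
        log_Z += m + np.log(np.mean(np.exp(log_w - m)))            # running log-marginal-likelihood estimate
        w = np.exp(log_w - m)
        idx = rng.choice(n_particles, size=n_particles, p=w / w.sum())
        x, prev_log_psi = x[idx], log_psi[idx]                     # multinomial resampling
    return log_Z

# Toy usage with untrained (zero) twist weights, which reduces to a bootstrap filter.
d = 50
W, b = rng.normal(size=d), rng.uniform(0, 2 * np.pi, d)
ys = rng.normal(size=10)
thetas = [np.zeros(d) for _ in range(len(ys))]
print(twisted_smc(ys, thetas, W, b))

In practice the twist weights would be fit rather than left at zero, for instance by regressing backward in time on estimates of the future log-likelihood, which is one common way to learn twist functions.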

Andrew Critch (h/acc) (@andrewcritchphd) 's Twitter Profile Photo

Zuckerberg's message here is really important. I prefer to live in a world where small businesses and solo researchers have transparency into AI model weights. It parallelizes and democratizes AI safety, security, and ethics research. I've been eagerly awaiting Llama 3.1, and I'm

Chris J. Maddison (@cjmaddison) 's Twitter Profile Photo

What if the next medical breakthrough is hidden in plain text? Causal estimates drive progress, but data is limited & RCTs are slow. Introducing NATURAL: a pipeline for causal estimation from text data in hours, not years.
Paper: tinyurl.com/ppr29
Site: tinyurl.com/web98

Roger Grosse (@rogergrosse) 's Twitter Profile Photo

In 2021, I wrote up a timeline for how things might progress if we were 30 years away from artificial superintelligence. (This was a thought experiment rather than a forecast -- it felt aggressive at the time.) This timeline had an AI win an IMO gold medal in 2035.

Schwartz Reisman Institute (@torontosri) 's Twitter Profile Photo

The field of AI safety emphasizes AI should “do no harm.” But lethal autonomous systems used in warfare are already causing harm. How should we think about purposely harmful AI? SRI Grad Fellow Michael Zhang writes about a panel exploring this topic: uoft.me/aJh

Zachary Nado (@zacharynado) 's Twitter Profile Photo

"Non-diagonal preconditioning has dethroned Nesterov Adam" 🧴👑 shampoo wins, finally the community can know what we have for years! this benchmark has been 3+ years in the making (we first talked about it Google in 2021), I'm beyond psyched that it's finally yielded results!

Stuart Ritchie 🇺🇦 (@stuartjritchie) 's Twitter Profile Photo

Look, here’s the thing about free speech: YES, it’s not “absolute”. Even the most hardcore free speech advocates agree that there are exceptions. Extreme case: telling e.g. Russia about UK military secrets is “just” a speech act, but it is (and should be) illegal in UK law.

Amanda Askell (@amandaaskell) 's Twitter Profile Photo

Joining a company you think is bad in order to be a force for good from the inside is the career equivalent of "I can change him".

Roger Grosse (@rogergrosse) 's Twitter Profile Photo

A strange but plausible future is one where 2028-era AIs can autonomously spit out NeurIPS papers and this results in no discernible speedup in scientific progress.

Schwartz Reisman Institute (@torontosri) 's Twitter Profile Photo

What is "safe" AI? Why is it difficult to achieve? Can LLMs be hacked? Are the existential risks of advanced AI exaggerated—or justified? Join us next week on Sept. 10 to hear from AI experts Karina Vold,Roger Grosse,Sedef Akinli Kocak, and Sheila McIlraith. 🔗 uoft.me/aLB

What is "safe" AI? Why is it difficult to achieve? Can LLMs be hacked? Are the existential risks of advanced AI exaggerated—or justified?  

Join us next week on Sept. 10 to hear from AI experts <a href="/karinavold/">Karina Vold</a>,<a href="/RogerGrosse/">Roger Grosse</a>,<a href="/sedak99/">Sedef Akinli Kocak</a>, and <a href="/SheilaMcIlraith/">Sheila McIlraith</a>.

đź”— uoft.me/aLB