Owain Evans (@owainevans_uk)'s Twitter Profile
Owain Evans

@owainevans_uk

Independent AI Safety research group in Berkeley + Affiliate at UC Berkeley. Past: Oxford Uni, TruthfulQA, Reversal Curse. Prefer email to DM.

ID: 1247872005912891392

Link: https://owainevans.github.io/
Joined: 08-04-2020 13:01:26

4.4K Tweets

7.7K Followers

266 Following

Jacob Pfau (@jacob_pfau)

Situational awareness benchmarking shows increasing performance with newer LLMs, but not on this one: ANTI-IMITATION tasks challenge LLMs that naively imitate the training distribution. To succeed, an LLM must use details of the LLM itself and its particular non-human capabilities.

Dima Krasheninnikov (@dmkrash)

1/ Excited to finally tweet about our paper “Implicit meta-learning may lead LLMs to trust more reliable sources”, to appear at ICML 2024. Our results suggest that during training, LLMs better internalize text that appears useful for predicting other text (e.g. seems reliable).

Owain Evans (@owainevans_uk)

"The Annunciation", Oleksandr Murashko. 1909, National Art Museum of Ukraine Saw this in Bratislava (Slovakia) on loan from Ukraine. Not much info online about this painting.
Joe Carlsmith (@jkcarlsmith)

I so enjoyed talking with Dwarkesh Patel about my essay series “Otherness and control in the age of AGI.” He engaged so deeply, asked such great questions, and aimed so directly at the core of the issues at stake. It’s a pleasure to be a part of conversations like this.

Michaël Trazzi (@michaeltrazzi)

I've interviewed Owain Evans about his work on AI situational awareness and out-of-context reasoning in LLMs. Owain has been publishing some of the most surprising and important Alignment papers in the past year, and I'm proud to be his first longform podcast ever.