Samuel Albanie (@samuelalbanie) Twitter Tweets • TwiDoom

Samuel Albanie

7 months ago

Today I'll give my final lecture on data structures & algorithms at Engineering Dept, Cambridge University 😢

But, for those keen to study:
- re-recorded videos
- slides
- and code

are all available online: buff.ly/3UVX36V

(the fun Red-Black Tree vis. is based on work by Luca 🇪🇺 🇮🇹)

Likes: 72 · Replies: 1 · Reposts: 10

Samuel Albanie

@samuelalbanie

7 months ago

Thought-provoking talk from Yuval Noah Harari last night

Briefly:
- Humanity faces *several* big challenges ahead
- If we can't avoid global military conflict, we will struggle to tackle them

Cambridge University · Centre for the Study of Existential Risk · King's College, Cambridge

Likes: 8 · Replies: 1 · Reposts: 2

Samuel Albanie

@samuelalbanie

7 months ago

Beartype has long been one of my favourite open-source libraries

Because:
- it's a great library
- thanks to maintainer Cecil Curry (leycec), every GitHub issue thread is a work of literature

Some classics:
buff.ly/4bWCpJG
buff.ly/3TgFSvA
buff.ly/3wA4N4j

Likes: 16 · Replies: 1 · Reposts: 1

Samuel Albanie

@samuelalbanie

7 months ago

A small personal update:
- Excited to join Google DeepMind 🚀
- Grateful for the wonderful humans I've had the pleasure of working with on my journey so far at Engineering Dept and Visual Geometry Group (VGG) ❤️

Likes: 237 · Replies: 21 · Reposts: 5

Jonathan Roberts

@jrobertsai

4 months ago

Introducing SciFIBench, a scientific figure interpretation benchmark for LMMs!

github.com/jonathan-rober…

- We evaluate 30 LMM, VLM and human baselines
- GPT-4o is much better than GPT-4V
- The mean human narrowly outperforms GPT-4o & Gemini-Pro 1.5

(1/5)

Likes: 15 · Replies: 6 · Reposts: 7

Tim Franzmeyer

@frtimlive

3 months ago

📢 Introducing HelloFresh: A Dynamic LLM Benchmark of Real-World Human Editorial Actions on X Community Notes and Wikipedia Edits. Can you beat GPT4 and GeminiPro at classifying X Community Notes and Wikipedia edits? Try our demo – shown in the video below – and see what

Likes: 34 · Replies: 1 · Reposts: 14

Zac Kenton

@zackenton1

2 months ago

Eventually, humans will need to supervise superhuman AI - but how? Can we study it now? We don't have superhuman AI, but we do have LLMs. We study protocols where a weaker LLM uses stronger ones to find better answers than it knows itself. Does this work? It’s complicated: 🧵👇

Likes: 235 · Replies: 4 · Reposts: 63

Samuel Albanie

@samuelalbanie

2 months ago

Great work from the team!

Likes: 13 · Replies: 0 · Reposts: 0

Samuel Albanie

@samuelalbanie

2 months ago

I recommend this essay on neural network interpretability by Lewis Smith lesswrong.com/posts/tojtPCCR…

Likes: 177 · Replies: 1 · Reposts: 20

Samuel Albanie

@samuelalbanie

a month ago

Interesting write-up from Alex Irpan about his switch to AI safety alexirpan.com/2024/08/06/swi…

Likes: 5 · Replies: 0 · Reposts: 0

Samuel Albanie

@samuelalbanie

a month ago

Enjoyed this paper on LMs, world models and agent models by Zhiting Hu and Tianmin Shu

TLDR: for reasoning tasks, it’s a useful abstraction to treat LMs as simulators (“backends”) that simulate agent models and world models

arxiv.org/abs/2312.05230

Likes: 92 · Replies: 2 · Reposts: 21

Samuel Albanie

@samuelalbanie

a month ago

Can multimodal models understand complex graphs? Not yet...

Likes: 8 · Replies: 0 · Reposts: 1

Samuel Albanie

@samuelalbanie

a month ago

Found this work by Bradley Brown and others interesting.

Test-time compute looks set to become increasingly important.

Likes: 50 · Replies: 1 · Reposts: 4

Samuel Albanie

@samuelalbanie

25 days ago

Life achievement: a research paper featured on Computerphile!

Led by Vishaal Udandarao, Ameya P, Adhiraj Ghosh, Yash Sharma, with philip ୨୧, Matthias Bethge

(from a few months ago)

youtube.com/watch?v=dDUC-L…

Likes: 29 · Replies: 1 · Reposts: 3

Samuel Albanie

@samuelalbanie

20 days ago

Great work led by Karsten Roth and Vishaal Udandarao

With Sebastian Dziadzio, Ameya P, mehdi cherti, Oriol Vinyals, Olivier Hénaff, Matthias Bethge, Zeynep Akata

Likes: 32 · Replies: 0 · Reposts: 4

Samuel Albanie

@samuelalbanie

20 days ago

Enjoyed this work on AI debate by Charlie George, J. Dan and Andreas Stuhlmüller lesswrong.com/posts/DgKyDTKe…

Likes: 4 · Replies: 0 · Reposts: 1

Samuel Albanie

@samuelalbanie

17 days ago

Useful perspective on AI Safety

From Sam Bowman

Recommend.

sleepinyourhat.github.io/checklist/

Likes: 5 · Replies: 0 · Reposts: 0

Usman Anwar

@usmananwar391

13 days ago

Our agenda paper on alignment and safety of LLMs just got published at TMLR: openreview.net/forum?id=oVTkO… 🥳 The revised version is also now on arxiv arxiv.org/abs/2404.09932.

Likes: 78 · Replies: 4 · Reposts: 22