Rob Miles (✈️ Berkeley)
@robertskmiles
Explaining AI Alignment to anyone who'll stand still for long enough, on YouTube and Discord.
Music, movies, microcode, and high-speed pizza delivery
15-04-2010 04:10:41
11.2K Tweets
16.1K Followers
777 Following
Their results are bizarre and inhuman. Neel Nanda trained a tiny transformer to do modular addition, then spent weeks figuring out what it was doing - one of the only times in history someone has understood how a transformer works.
This is the algorithm it created. To *add two numbers*!