Sebastian Raschka
@rasbt
AI & ML researcher. Author of the "Build a Large Language Model From Scratch" book (mng.bz/n1O4). LLM research engineer @LightningAI.
ID: 865622395
https://sebastianraschka.com/books/ 07-10-2012 02:06:16
16K Tweets
285K Followers
907 Following
Just shared a new article on "Building a GPT-Style LLM Classifier From Scratch" (magazine.sebastianraschka.com/p/building-a-g…) along with insights from a few extra experiments that I found super interesting: 1) Do we need to train all layers? 2) Why finetune the last token rather than the first token?