Sebastian Raschka (@rasbt) Twitter Tweets • TwiDoom

Sebastian Raschka

@rasbt

AI & ML researcher. Author of the "Build a Large Language Model From Scratch" book (mng.bz/n1O4). LLM research engineer @LightningAI.

ID: 865622395

Link: https://sebastianraschka.com/books/

Joined: 07-10-2012 02:06:16

16.16K Tweets

285.285K Followers

907 Following

Sebastian Raschka

@rasbt

2 months ago

I just created a notebook using Llama 3.1 70B to generate a synthetic dataset for preference finetuning / aligning LLMs: github.com/rasbt/LLMs-fro… (The updated Llama license now allows using Llama 3.1 to improve other models, yay!). You can probably guess what comes next... 😊

Likes: 1.1K

Replies: 18

Reposts: 269

Sebastian Raschka

@rasbt

2 months ago

If you have worked with more complex PyTorch models, such as when implementing LLMs, you've probably encountered PyTorch buffers. I recorded a short 13-min hands-on coding video to explain what PyTorch buffers are, when we need them, and how we use them: youtube.com/watch?v=PetlIo…

Likes: 738

Replies: 8

Reposts: 113

Nathan Lambert

@natolambert

2 months ago

Here's my full interview with Sebastian Raschka, one of the great AI educators today. We cover many details from DPO training failure modes, ChatGPT vs Claude, Llama 3.1, moderating Arxiv, avoiding hype, writing, getting started in AI, and other topics. Chapters: 00:00:00 Introduction &

Likes: 94

Replies: 2

Reposts: 15

Sebastian Raschka

@rasbt

2 months ago

If you want to try out the freshly released Gemma 2 2B model in Python... Just added Gemma 2 2B to LitGPT: github.com/Lightning-AI/l…

Likes: 379

Replies: 1

Reposts: 56

Yam Peleg

@yampeleg

2 months ago

Damnnnn this tutorial is of high quality!! Check out the code repository: - Standalone Jupyter notebooks. - Only what you need. - Highly explanatory. - 0 over engineering. - All from scratch. - Self contained. I wish everyone wrote all their code like this.

Likes: 267

Replies: 0

Reposts: 43

Sebastian Raschka

@rasbt

a month ago

I don’t post video tutorials (that) often, but hey, I just saw that I got 30k subs on YouTube! If you’re looking to learn something new this weekend, I recently made a video on how LLMs work, breaking down the development stages step by step: youtube.com/watch?v=kPGTx4…

Likes: 1.1K

Replies: 16

Reposts: 260

Nathan Lambert

@natolambert

a month ago

Direct Alignment Algorithms (DAA) is a great way to describe DPO et al. algorithms. Endorsed. Bringing back the petition for preference fine-tuning (PrefFT) instead of RLHF.

Likes: 112

Replies: 8

Reposts: 27

Sebastian Raschka

@rasbt

a month ago

While there are 100s of LLM papers published each month proposing new techniques, the best way to see what truly works in practice is by looking at the pre-training and post-training pipelines of the latest state-of-the-art models. Here's what I found: magazine.sebastianraschka.com/p/new-llm-pre-…

Likes: 823

Replies: 15

Reposts: 156

Sebastian Raschka

@rasbt

a month ago

LitServe is awesome! I also use it to serve LLMs (e.g., based on LitGPT). Short example from one of my recent talks:

Likes: 149

Replies: 2

Reposts: 28

Sebastian Raschka

@rasbt

21 days ago

If you’d like to spend a few hours this weekend to dive into Large Language Models (LLMs) and understand how they work, I've prepared a 3-hour coding workshop presentation on implementing, training, and using LLMs: youtube.com/watch?v=quh7z1…

Likes: 2.2K

Replies: 30

Reposts: 451

Tim Dettmers

@tim_dettmers

17 days ago

MoEs are back!

Likes: 130

Replies: 2

Reposts: 10