Mayank Bhaskar (@cataluna84)'s Twitter Profile
Mayank Bhaskar

@cataluna84

Independent ML Consultant 🧑🏽‍💻 | @twimlai & @CohereForAI contributor | #programmer ⌨ | #engineer 🛠 | #machinelearning 🧮 | #datavisualization 📊 | #sports ⚽

ID: 1481857459

Link: http://www.linkedin.com/in/cataluna84 · Joined: 04-06-2013 09:38:31

2.2K Tweets

710 Followers

3.3K Following

Phillip Isola (@phillip_isola)'s Twitter Profile Photo

Our computer vision textbook is released!

Foundations of Computer Vision
with Antonio Torralba and Bill Freeman
mitpress.mit.edu/9780262048972/…

It’s been in the works for >10 years. Covers everything from linear filters and camera optics to diffusion models and radiance fields.

1/4
Sebastian Raschka (@rasbt)'s Twitter Profile Photo

Just added a multi-head attention implementation for Einstein summation enthusiasts to my collection: github.com/rasbt/LLMs-fro…
Why? Because it's fun 😊!
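
For readers who want the flavor without opening the repo, here is a minimal, hypothetical sketch of einsum-style multi-head attention in PyTorch; the class and the einsum subscripts are illustrative, not necessarily the linked repo's implementation.

```python
import torch
import torch.nn as nn

class EinsumMHA(nn.Module):
    """Multi-head self-attention with the head bookkeeping done via einsum."""

    def __init__(self, d_model: int, num_heads: int):
        super().__init__()
        assert d_model % num_heads == 0
        self.h, self.d_head = num_heads, d_model // num_heads
        self.qkv = nn.Linear(d_model, 3 * d_model, bias=False)
        self.out = nn.Linear(d_model, d_model, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, n, _ = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        # (batch, seq, heads, d_head); einsum spares us explicit transposes.
        q, k, v = (t.reshape(b, n, self.h, self.d_head) for t in (q, k, v))
        # Scores: contract over d_head, keep (batch, heads, query, key).
        scores = torch.einsum("bqhd,bkhd->bhqk", q, k) / self.d_head**0.5
        attn = scores.softmax(dim=-1)
        # Weighted sum of values: contract over the key axis.
        ctx = torch.einsum("bhqk,bkhd->bqhd", attn, v)
        return self.out(ctx.reshape(b, n, self.h * self.d_head))

x = torch.randn(2, 16, 64)
print(EinsumMHA(64, 8)(x).shape)  # torch.Size([2, 16, 64])
```
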
Dawn Song (@dawnsongtweets)'s Twitter Profile Photo

Large Language Model Agents is the next frontier. Really excited to announce our Berkeley course on LLM Agents, also available for anyone to join as a MOOC, starting Sep 9 (Mon) 3pm PT! 📢
Sign up & join us: llmagents-learning.org
Alfredo Canziani (@alfcnz)'s Twitter Profile Photo

Dropping a new blog on «Visual prerequisites for learning deep learning». Nothing new. Just my recommendations, explicitly listed for former and future students’ benefit.
atcold.github.io/2024/09/04/pre…
Shreyansh Singh (@shreyansh_26)'s Twitter Profile Photo

FlexAttention is a game-changer! 🔥
I was revisiting the Longformer paper, and I decided to implement its attention patterns using FlexAttention.

Tested it out and it’s faster than F.sdpa/xFormers with masking and even FlashAttention-2 (full/causal) for both the forward and
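
For intuition, here is a hedged sketch of the sliding-window piece of that pattern with FlexAttention. It assumes PyTorch ≥ 2.5 and a CUDA device; the window size is made up, and Longformer's global-attention tokens are omitted for brevity, so this is not the linked implementation.

```python
import torch
from torch.nn.attention.flex_attention import create_block_mask, flex_attention

WINDOW = 256  # illustrative: each query sees keys within +/- 256 positions

def sliding_window(b, h, q_idx, kv_idx):
    # Return True where attention is allowed; FlexAttention can skip
    # (query, key) blocks that are masked out entirely.
    return (q_idx - kv_idx).abs() <= WINDOW

B, H, S, D = 1, 8, 4096, 64
q, k, v = (torch.randn(B, H, S, D, device="cuda", dtype=torch.float16)
           for _ in range(3))

# Block-granularity mask metadata, not a dense S x S tensor.
block_mask = create_block_mask(sliding_window, B=None, H=None, Q_LEN=S, KV_LEN=S)

# torch.compile fuses the masking logic into the attention kernel.
flex_attention = torch.compile(flex_attention)
out = flex_attention(q, k, v, block_mask=block_mask)
```
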
Jason Ramapuram (@jramapuram)'s Twitter Profile Photo

Enjoy attention? Want to make it ~18% faster? Try out Sigmoid Attention. We replace the traditional softmax in attention with a sigmoid and a constant (not learned) scalar bias based on the sequence length.

Paper: arxiv.org/abs/2409.04431
Code: github.com/apple/ml-sigmo…

This was
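
A minimal sketch of the idea as stated in the tweet: the "constant (not learned) scalar bias based on the sequence length" is taken here to be -log(n), as in the paper, and the actual repo ships fused kernels rather than this naive version.

```python
import math
import torch

def sigmoid_attention(q, k, v):
    # q, k, v: (batch, heads, seq, d_head)
    n, d = q.shape[-2], q.shape[-1]
    scores = q @ k.transpose(-2, -1) / math.sqrt(d)
    # An elementwise sigmoid replaces the row-wise softmax; the constant
    # -log(n) bias keeps each row's total attention mass roughly O(1).
    attn = torch.sigmoid(scores - math.log(n))
    return attn @ v

q = k = v = torch.randn(2, 4, 128, 32)
print(sigmoid_attention(q, k, v).shape)  # torch.Size([2, 4, 128, 32])
```
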
Mayank Bhaskar (@cataluna84)'s Twitter Profile Photo

Apple is so far ahead in the game, deploying #AppleIntelligence at scale to millions of devices and using the services of Google AI and OpenAI to better their product 📱

Moreover, have you looked at their SoC improvements? Their processors manufactured on the second

Thomas Capelle (@capetorch)'s Twitter Profile Photo

In case you are wondering what this is about or are just getting started, I put together a simpler Google Colab to get a grasp on the problems. It is also powered by Mistral AI Large.

📓 colab.research.google.com/github/wandb/a…

You sometimes manage to solve the cheeseburger corollary with a single-shot (+
Graham Neubig (@gneubig)'s Twitter Profile Photo

We started the Fall 2024 version of CMU CS11-711 Advanced NLP 🎓 Follow along to learn about the latest in NLP, LLMs, Agents, etc.

* Materials: phontron.com/class/anlp-fal…
* Videos: youtube.com/playlist?list=…

Rohan Pandey (e/acc) (@khoomeik)'s Twitter Profile Photo

results still coming in, but o1-preview is scoring 10% on an internal codegen benchmark where gpt-4o gets 38% 😬

maybe the CoT & few-shot examples in our prompts are reducing performance, as suggested by the API docs?
Jeremy Howard (@jeremyphoward)'s Twitter Profile Photo

This is amaaaazing! This must be the best way to get started with GPU programming now. Literally, as you type each char, it creates and runs the GPU kernel in real time and reports the results. 🤯 I didn't expect to see a FastHTML/gpu.cpp cross-over… let alone something so cool.

Horace He (@chhillee)'s Twitter Profile Photo

Now that some fused attention implementations (like CuDNN, which is nearly as fast as FA3) support arbitrary attention masks, one might wonder if that could replace FlexAttention.

As it turns out.... no. Not only does it require quadratic memory, it's 4x slower.

(1/4)
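
A back-of-the-envelope sketch of the memory point (sequence length and block size below are illustrative, not the thread's benchmark settings):

```python
# A dense boolean mask costs O(S^2) per (batch, head) pair, while
# FlexAttention's BlockMask stores only block-granularity metadata.
S = 32_768            # sequence length
BLOCK = 128           # FlexAttention's default block size

dense_entries = S * S                 # one bool per (query, key) pair
block_entries = (S // BLOCK) ** 2     # one entry per (query, key) block

print(f"dense mask: {dense_entries / 2**30:.1f} GiB of bools")  # 1.0 GiB
print(f"block mask: {block_entries:,} entries")                 # 65,536
```
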
Csaba Szepesvari (@csabaszepesvari)'s Twitter Profile Photo

Signal boosting; please repost! We need more nominations! There are so many deserving people, **please be generous and send a nomination**! It should not take much time (a short nomination is preferred to none). We are hoping the prize will motivate more people to take the math+AI path!

LangChain (@langchainai)'s Twitter Profile Photo

✨Build a Knowledge Graph-based Agent With Llama 3.1, NVIDIA NIM, and LangChain

Great blog by Tomaz Bratanic!

Use Llama 3.1's native function-calling capabilities to retrieve structured data from a knowledge graph to power your RAG applications

medium.com/neo4j/build-a-…
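
A hedged sketch of the pattern the post describes. The blog serves Llama 3.1 through NVIDIA NIM; here Ollama stands in so the example is self-contained, and the tool body and graph schema are invented for illustration rather than taken from the blog.

```python
from langchain_core.tools import tool
from langchain_ollama import ChatOllama
from langchain_community.graphs import Neo4jGraph

# Reads NEO4J_URI / NEO4J_USERNAME / NEO4J_PASSWORD from the environment.
graph = Neo4jGraph()

@tool
def movie_cast(title: str) -> str:
    """Return the actors for a movie stored in the knowledge graph."""
    rows = graph.query(
        "MATCH (m:Movie {title: $title})<-[:ACTED_IN]-(a:Person) "
        "RETURN m.title AS title, collect(a.name) AS actors",
        {"title": title},
    )
    return str(rows)

# Llama 3.1 decides when to call the tool via native function calling.
llm = ChatOllama(model="llama3.1").bind_tools([movie_cast])
msg = llm.invoke("Who acted in Casino?")
print(msg.tool_calls)  # structured tool call for the agent loop to execute
```
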
Yuchen Jin (@yuchenj_uw)'s Twitter Profile Photo

GPU tradeoff series: GPT-2 (124M) training on a single 4090 vs. H100 🧑‍🍳

GPU perf and price:
- 4090: 330 fp16 TFLOPs, $1,749
- H100: 989 fp16 TFLOPs, $25,000
> H100 is 14.3X more expensive

Training speed with llm.c:
- H100: 481K tokens/s
- 4090: 153K tokens/s
> H100 is 3.1X
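
Reproducing the tweet's ratios, plus the tokens-per-dollar figure they imply:

```python
h100_price, rtx4090_price = 25_000, 1_749     # USD
h100_tps, rtx4090_tps = 481_000, 153_000      # llm.c training tokens/s

print(f"price ratio: {h100_price / rtx4090_price:.1f}x")  # 14.3x
print(f"speed ratio: {h100_tps / rtx4090_tps:.1f}x")      # 3.1x

# Hardware cost-efficiency: the 4090 delivers ~4.5x more tokens/s per dollar.
print(f"4090: {rtx4090_tps / rtx4090_price:.1f} tok/s per $")  # 87.5
print(f"H100: {h100_tps / h100_price:.1f} tok/s per $")        # 19.2
```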

Mayank Bhaskar (@cataluna84)'s Twitter Profile Photo

This might be the most accessible way to learn Metal + GPU programming if you have an M-series Mac: all 14 GPU puzzles (in increasing order of difficulty) in Metal, with MLX custom kernels to do it all from Python. #MetalProgramming #MLX
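
A hedged sketch of what one of those puzzles can look like with an MLX custom Metal kernel driven from Python; the kernel body and launch configuration are illustrative, not the puzzles' actual code.

```python
import mlx.core as mx

# Metal body only: MLX generates the kernel signature and injects
# attributes like thread_position_in_grid when they are referenced.
source = """
    uint i = thread_position_in_grid.x;
    out[i] = a[i] + 10.0f;  // puzzle-style task: add 10 to each element
"""

kernel = mx.fast.metal_kernel(
    name="add_ten",
    input_names=["a"],
    output_names=["out"],
    source=source,
)

a = mx.arange(8, dtype=mx.float32)
(out,) = kernel(
    inputs=[a],
    grid=(a.size, 1, 1),        # one GPU thread per element
    threadgroup=(8, 1, 1),
    output_shapes=[a.shape],
    output_dtypes=[a.dtype],
)
print(out)  # array([10, 11, ..., 17], dtype=float32)
```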