Mayank Bhaskar (@cataluna84)'s Twitter Profile
Mayank Bhaskar

@cataluna84

Independent ML Consultant 🧑🏽‍💻 | @twimlai & @CohereForAI contributor | #programmer ⌨ | #engineer 🛠 | #machinelearning 🧮 | #datavisualization 📊 | #sports ⚽

ID: 1481857459

Link: http://www.linkedin.com/in/cataluna84 · Joined: 04-06-2013 09:38:31

2.2K Tweets

710 Followers

3.3K Following

Phillip Isola (@phillip_isola)'s Twitter Profile Photo

Our computer vision textbook is released!

Foundations of Computer Vision
with Antonio Torralba and Bill Freeman
mitpress.mit.edu/9780262048972/…

It’s been in the works for >10 years. Covers everything from linear filters and camera optics to diffusion models and radiance fields.

1/4
Sebastian Raschka (@rasbt)'s Twitter Profile Photo

Just added a multi-head attention implementation for Einstein summation enthusiasts to my collection: github.com/rasbt/LLMs-fro…
Why? Because it's fun 😊!
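
For readers who want the flavor without opening the repo, here is a minimal, hypothetical sketch of einsum-style multi-head attention in PyTorch; the class and the einsum subscripts are illustrative, not necessarily the linked repo's implementation.

```python
import torch
import torch.nn as nn

class EinsumMHA(nn.Module):
    """Multi-head self-attention with the head bookkeeping done via einsum."""

    def __init__(self, d_model: int, num_heads: int):
        super().__init__()
        assert d_model % num_heads == 0
        self.h, self.d_head = num_heads, d_model // num_heads
        self.qkv = nn.Linear(d_model, 3 * d_model, bias=False)
        self.out = nn.Linear(d_model, d_model, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, n, _ = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        # (batch, seq, heads, d_head); einsum spares us explicit transposes.
        q, k, v = (t.reshape(b, n, self.h, self.d_head) for t in (q, k, v))
        # Scores: contract over d_head, keep (batch, heads, query, key).
        scores = torch.einsum("bqhd,bkhd->bhqk", q, k) / self.d_head**0.5
        attn = scores.softmax(dim=-1)
        # Weighted sum of values: contract over the key axis.
        ctx = torch.einsum("bhqk,bkhd->bqhd", attn, v)
        return self.out(ctx.reshape(b, n, self.h * self.d_head))

x = torch.randn(2, 16, 64)
print(EinsumMHA(64, 8)(x).shape)  # torch.Size([2, 16, 64])
```
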
Dawn Song (@dawnsongtweets)'s Twitter Profile Photo

Large Language Model Agents is the next frontier. Really excited to announce our Berkeley course on LLM Agents, also available for anyone to join as a MOOC, starting Sep 9 (Mon) 3pm PT! 📢
Sign up & join us: llmagents-learning.org
Alfredo Canziani (@alfcnz)'s Twitter Profile Photo

Dropping a new blog on «Visual prerequisites for learning deep learning». Nothing new. Just my recommendations, explicitly listed for former and future students’ benefit.
atcold.github.io/2024/09/04/pre…
Shreyansh Singh (@shreyansh_26)'s Twitter Profile Photo

FlexAttention is a game-changer! 🔥
I was revisiting the Longformer paper, and I decided to implement its attention patterns using FlexAttention.

Tested it out and it’s faster than F.sdpa/xFormers with masking and even FlashAttention-2 (full/causal) for both the forward and
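
For intuition, here is a hedged sketch of the sliding-window piece of that pattern with FlexAttention. It assumes PyTorch ≥ 2.5 and a CUDA device; the window size is made up, and Longformer's global-attention tokens are omitted for brevity, so this is not the linked implementation.

```python
import torch
from torch.nn.attention.flex_attention import create_block_mask, flex_attention

WINDOW = 256  # illustrative: each query sees keys within +/- 256 positions

def sliding_window(b, h, q_idx, kv_idx):
    # Return True where attention is allowed; FlexAttention can skip
    # (query, key) blocks that are masked out entirely.
    return (q_idx - kv_idx).abs() <= WINDOW

B, H, S, D = 1, 8, 4096, 64
q, k, v = (torch.randn(B, H, S, D, device="cuda", dtype=torch.float16)
           for _ in range(3))

# Block-granularity mask metadata, not a dense S x S tensor.
block_mask = create_block_mask(sliding_window, B=None, H=None, Q_LEN=S, KV_LEN=S)

# torch.compile fuses the masking logic into the attention kernel.
flex_attention = torch.compile(flex_attention)
out = flex_attention(q, k, v, block_mask=block_mask)
```
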
Jason Ramapuram (@jramapuram)'s Twitter Profile Photo

Enjoy attention? Want to make it ~18% faster? Try out Sigmoid Attention. We replace the traditional softmax in attention with a sigmoid and a constant (not learned) scalar bias based on the sequence length.

Paper: arxiv.org/abs/2409.04431
Code: github.com/apple/ml-sigmo…

This was
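
A minimal sketch of the idea as stated in the tweet: the "constant (not learned) scalar bias based on the sequence length" is taken here to be -log(n), as in the paper, and the actual repo ships fused kernels rather than this naive version.

```python
import math
import torch

def sigmoid_attention(q, k, v):
    # q, k, v: (batch, heads, seq, d_head)
    n, d = q.shape[-2], q.shape[-1]
    scores = q @ k.transpose(-2, -1) / math.sqrt(d)
    # An elementwise sigmoid replaces the row-wise softmax; the constant
    # -log(n) bias keeps each row's total attention mass roughly O(1).
    attn = torch.sigmoid(scores - math.log(n))
    return attn @ v

q = k = v = torch.randn(2, 4, 128, 32)
print(sigmoid_attention(q, k, v).shape)  # torch.Size([2, 4, 128, 32])
```
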
Mayank Bhaskar (@cataluna84)'s Twitter Profile Photo

Apple is so far ahead in the game, deploying #AppleIntelligence at scale to millions of devices and using the services of Google AI and OpenAI to better their product 📱

Moreover, have you looked at their SoC improvements? Their processors manufactured on the second

Thomas Capelle (@capetorch)'s Twitter Profile Photo

In case you are wondering what this is about or are just getting started, I put together a simpler Google Colab to get a grasp on the problems. It is also powered by Mistral AI Large.

📓 colab.research.google.com/github/wandb/a…

You sometimes manage to solve the cheeseburger corollary with a single-shot (+
Graham Neubig (@gneubig)'s Twitter Profile Photo

We started the Fall 2024 version of CMU CS11-711 Advanced NLP 🎓 Follow along to learn about the latest in NLP, LLMs, Agents, etc.

* Materials: phontron.com/class/anlp-fal…
* Videos: youtube.com/playlist?list=…

Rohan Pandey (e/acc) (@khoomeik)'s Twitter Profile Photo

results still coming in, but o1-preview is scoring 10% on an internal codegen benchmark where gpt-4o gets 38% 😬

maybe the CoT & few-shot examples in our prompts are reducing performance, as suggested by the API docs?
Jeremy Howard (@jeremyphoward)'s Twitter Profile Photo

This is amaaaazing! This must be the best way to get started with GPU programming now. Literally, as you type each char, it creates and runs the GPU kernel in real time and reports the results. 🤯 I didn't expect to see a FastHTML/gpu.cpp cross-over… let alone something so cool.

Horace He (@chhillee)'s Twitter Profile Photo

Now that some fused attention implementations (like CuDNN, which is nearly as fast as FA3) support arbitrary attention masks, one might wonder if that could replace FlexAttention.

As it turns out.... no. Not only does it require quadratic memory, it's 4x slower.

(1/4)
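
A back-of-the-envelope sketch of the memory point (sequence length and block size below are illustrative, not the thread's benchmark settings):

```python
# A dense boolean mask costs O(S^2) per (batch, head) pair, while
# FlexAttention's BlockMask stores only block-granularity metadata.
S = 32_768            # sequence length
BLOCK = 128           # FlexAttention's default block size

dense_entries = S * S                 # one bool per (query, key) pair
block_entries = (S // BLOCK) ** 2     # one entry per (query, key) block

print(f"dense mask: {dense_entries / 2**30:.1f} GiB of bools")  # 1.0 GiB
print(f"block mask: {block_entries:,} entries")                 # 65,536
```
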
Csaba Szepesvari (@csabaszepesvari)'s Twitter Profile Photo

Signal boosting; please repost! We need more nominations! There are so many deserving people, **please be generous and send a nomination**! It should not take much time (a short nomination is preferred to none). We are hoping the prize will motivate more people to take the math+AI path!

LangChain (@langchainai)'s Twitter Profile Photo

✨Build a Knowledge Graph-based Agent With Llama 3.1, NVIDIA NIM, and LangChain

Great blog by Tomaz Bratanic!

Use Llama 3.1's native function-calling capabilities to retrieve structured data from a knowledge graph to power your RAG applications

medium.com/neo4j/build-a-…
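
A hedged sketch of the pattern the post describes. The blog serves Llama 3.1 through NVIDIA NIM; here Ollama stands in so the example is self-contained, and the tool body and graph schema are invented for illustration rather than taken from the blog.

```python
from langchain_core.tools import tool
from langchain_ollama import ChatOllama
from langchain_community.graphs import Neo4jGraph

# Reads NEO4J_URI / NEO4J_USERNAME / NEO4J_PASSWORD from the environment.
graph = Neo4jGraph()

@tool
def movie_cast(title: str) -> str:
    """Return the actors for a movie stored in the knowledge graph."""
    rows = graph.query(
        "MATCH (m:Movie {title: $title})<-[:ACTED_IN]-(a:Person) "
        "RETURN m.title AS title, collect(a.name) AS actors",
        {"title": title},
    )
    return str(rows)

# Llama 3.1 decides when to call the tool via native function calling.
llm = ChatOllama(model="llama3.1").bind_tools([movie_cast])
msg = llm.invoke("Who acted in Casino?")
print(msg.tool_calls)  # structured tool call for the agent loop to execute
```
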
Yuchen Jin (@yuchenj_uw)'s Twitter Profile Photo

GPU tradeoff series: GPT-2 (124M) training on a single 4090 vs. H100 🧑‍🍳

GPU perf and price:
- 4090: 330 fp16 TFLOPs, $1,749
- H100: 989 fp16 TFLOPs, $25,000
> H100 is 14.3X more expensive

Training speed with llm.c:
- H100: 481K tokens/s
- 4090: 153K tokens/s
> H100 is 3.1X
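
Reproducing the tweet's ratios, plus the tokens-per-dollar figure they imply:

```python
h100_price, rtx4090_price = 25_000, 1_749     # USD
h100_tps, rtx4090_tps = 481_000, 153_000      # llm.c training tokens/s

print(f"price ratio: {h100_price / rtx4090_price:.1f}x")  # 14.3x
print(f"speed ratio: {h100_tps / rtx4090_tps:.1f}x")      # 3.1x

# Hardware cost-efficiency: the 4090 delivers ~4.5x more tokens/s per dollar.
print(f"4090: {rtx4090_tps / rtx4090_price:.1f} tok/s per $")  # 87.5
print(f"H100: {h100_tps / h100_price:.1f} tok/s per $")        # 19.2
```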

Mayank Bhaskar (@cataluna84)'s Twitter Profile Photo

This might be the most accessible way to learn Metal + GPU programming if you have an M-series Mac: all 14 GPU puzzles (in increasing order of difficulty) in Metal, with MLX custom kernels to do it all from Python. #MetalProgramming #MLX
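
A hedged sketch of what one of those puzzles can look like with an MLX custom Metal kernel driven from Python; the kernel body and launch configuration are illustrative, not the puzzles' actual code.

```python
import mlx.core as mx

# Metal body only: MLX generates the kernel signature and injects
# attributes like thread_position_in_grid when they are referenced.
source = """
    uint i = thread_position_in_grid.x;
    out[i] = a[i] + 10.0f;  // puzzle-style task: add 10 to each element
"""

kernel = mx.fast.metal_kernel(
    name="add_ten",
    input_names=["a"],
    output_names=["out"],
    source=source,
)

a = mx.arange(8, dtype=mx.float32)
(out,) = kernel(
    inputs=[a],
    grid=(a.size, 1, 1),        # one GPU thread per element
    threadgroup=(8, 1, 1),
    output_shapes=[a.shape],
    output_dtypes=[a.dtype],
)
print(out)  # array([10, 11, ..., 17], dtype=float32)
```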