Jim Fan (@DrJimFan)'s Twitter Profile
Jim Fan

@DrJimFan

@NVIDIA Sr. Research Manager & Lead of Embodied AI (GEAR Lab). Creating foundation models for Humanoid Robots & Gaming. @Stanford Ph.D. @OpenAI's first intern.

ID:1007413134

Link: https://jimfan.me · Joined: 12-12-2012 22:11:27

3.5K Tweets

229.2K Followers

2.9K Following

Jim Fan (@DrJimFan):

Llama-3 is closing the gap with GPT-4, but multimodal models gotta catch up. Vision capabilities of open models like LLaVA are far, far behind GPT-4V. Video models are even worse. They hallucinate all the time and fail to give detailed descriptions of complex scenes and actions.

Jim Fan (@DrJimFan):

AI winter? No. Even if GPT-5 plateaus. Robotics hasn’t even started to scale yet.

Embodied intelligence in the physical world will be a powerhouse for economic value. Friendly reminder to everyone that LLM is not all of AI. It is just one piece of a bigger puzzle.

Jim Fan (@DrJimFan):

Prediction: GPT-5 will be announced before Llama-3-400B releases. External movement defines OpenAI’s PR schedule 🤣

Ajay Mandlekar (@AjayMandlekar):

Data is the key driving force behind success in robot learning. Our upcoming RSS 2024 workshop "Data Generation for Robotics" will feature exciting speakers, timely debates, and more! Submit by May 20th.

sites.google.com/view/data-gene…

Jim Fan (@DrJimFan):

It took my brain a while to parse what's going on in this video. We are so obsessed with 'human-level' robotics that we forget it is just an artificial ceiling. Why don't we make a new species superhuman from day one? Boston Dynamics has once again reinvented itself. Gradually,

Elon Musk (@elonmusk):

Jim Fan Two sources of data scale infinitely: synthetic data, which has an "is it true?" problem, and real-world video, which does not.

Jim Fan (@DrJimFan):

Tesla FSD v13 will likely be grokking language tokens. What excites me the most about Grok-1.5V is the potential to solve edge cases in self-driving. Using language for 'chain of thought' will help the car break down a complex scenario, reason with rules and counterfactuals, and

Jim Fan (@DrJimFan):

The moat of software AI agents is not the thin wrapper layer (Devin, SWE-Agent), but the underlying LLM. Instead of benchmarking the wrapper, I think SWE-Bench is excellent for evaluating the coding LLMs themselves:

Hold the agent layer fixed and vary only the LLM backend. Provide all

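The protocol above (hold the agent scaffold fixed, vary only the LLM backend) can be sketched in a few lines. This is an illustrative toy, not a real SWE-Bench harness; the names (`run_agent`, `evaluate`, the stub backends, the task IDs) are all invented for the sketch:

```python
# Toy sketch of "fix the scaffold, swap the backend" evaluation.
# All names here are hypothetical; SWE-Bench's real harness differs.

TASKS = ["issue-101", "issue-202", "issue-303"]

def run_agent(task: str, llm) -> bool:
    """Fixed agent scaffold: same prompting/tooling logic for every backend.

    Here the scaffold just asks the backend for a patch and checks it.
    """
    return llm(task) == f"patch for {task}"

def evaluate(llm) -> float:
    """Resolve rate over the benchmark with the scaffold held constant."""
    return sum(run_agent(t, llm) for t in TASKS) / len(TASKS)

# Two stub "LLM backends" of different capability.
strong_llm = lambda task: f"patch for {task}"
weak_llm = lambda task: "I don't know"

scores = {"strong": evaluate(strong_llm), "weak": evaluate(weak_llm)}
print(scores)
```

Because the scaffold never changes between runs, any score difference reflects the backend LLM rather than the wrapper, which is the point of the argument.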
Jim Fan (@DrJimFan):

Math talking to bare metal in the purest way. Andrej Karpathy makes AI education not only accessible, but also elegant. I'm reading through the code like a work of art.

Jim Fan (@DrJimFan):

The legendary class created by Fei-Fei Li & Andrej Karpathy that introduced deep learning to a generation of students. Proud to be a TA alumnus for CS231n! I used to write the Google Cloud tutorial on how to set up GPU instances and run experiments ;)

Jim Fan (@DrJimFan):

Better manual design of the command-line tools for GPT-4 is all you need to get 12.3% on SWE-Bench. There is no magic, no model breakthrough, no justification for the extreme hype.

When GPT-5 comes, instruction following, tool use, and long context will surely be far better. None

Jim Fan (@DrJimFan):

This sakura video has a complexity of no more than 262 characters, implemented as shader code that *generates* pixels. A text2video model that achieves maximal possible compression will be able to recover this program approximately in its weights, synthesized through denoising and

Jim Fan (@DrJimFan):

Novelty is so overrated. It's an example of misaligned objective: if reviewers look for novelty, you shape your research and efforts towards that, while devaluing things that actually matter.

I used to review CVPR papers, but stopped wasting my time on so many mind-numbing papers
