Rowan Zellers (@rown) 's Twitter Profile
Rowan Zellers

@rown

Researcher at @OpenAI studying multimodal - vision & language & sound.
website: rowanzellers.com
(he/him)

ID: 17108894

Link: https://rowanzellers.com · Joined: 02-11-2008 02:12:46

474 Tweets

9.9K Followers

925 Following

Lucas Beyer (bl16) (@giffmana) 's Twitter Profile Photo

We release PaliGemma. I'll keep it short, still on vacation:

- sota open base VLM designed to transfer quickly, easily, and strongly to a wide range of tasks
- Also does detection and segmentation
- We provide lots of examples
- Meaty tech report later!

ai.google.dev/gemma/docs/pal…
Rowan Zellers (@rown) 's Twitter Profile Photo

Congrats to the GDM Astra team! Super impressive video, audio, and long-context capabilities - can't wait to play with it 😀 (in the meantime, love the demo, though can't help but feel some secondhand awkwardness on behalf of the coworkers 🤣)

Prafulla Dhariwal (@prafdhar) 's Twitter Profile Photo

GPT-4o (o for “omni”) is the first model to come out of the omni team, OpenAI’s first natively fully multimodal model. This launch was a huge org-wide effort, but I’d like to give a shout out to a few of my awesome team members who made this magical model even possible!

Greg Brockman (@gdb) 's Twitter Profile Photo

GPT-4o was a whole team effort. Huge credit especially to Prafulla Dhariwal for the conviction to build an omni model, and for seeing it through in collaboration with many teams at OpenAI over the past 18 months. Also, he's excellent at trivia:

bogo (@giertler) 's Twitter Profile Photo

the first time gpt-4o spoke back to me in real-time, it became clear that we built something completely new – and that what we are building is the future of human-computer interaction. come build this real-time future with us. openai.com/careers/real-t…

Rowan Zellers (@rown) 's Twitter Profile Photo

I'll be in Seattle for #CVPR2024 (Tues June 18/Wednesday June 19) If you'll be there and interested in realtime video+audio, then we should meet! DM me if so 😀

Alexander Kirillov (@kirillov_a_n) 's Twitter Profile Photo

I'm going to #CVPR2024 next week. If you are in Seattle and interested in broad multimodal intelligence or real-time multimodality, we should talk. DM me here or send an email.

raulpuri.eth (@therealrpuri) 's Twitter Profile Photo

Multimodal OpenAI is out here #CVPR2024 this week. We have one gptv/gpt4o talk tmrw (Tuesday) @9am. DMs are open. Come find us to chat about AGI, multimodal, vibes, or hiring. Deets about the talk and who's here in thread 👇.

Alessandro Suglia (@ale_suglia) 's Twitter Profile Photo

Excited to announce that the Embodied AI team at Alana has developed AlanaVLM, a new foundation model for egocentric video understanding. It's the best open-weight model and rivals GPT-4V on spatial reasoning questions in OpenEQA. Thread below! arxiv.org/abs/2406.13807 #AI #NLProc

Tsarathustra (@tsarnick) 's Twitter Profile Photo

Romain Huet, Head of Developer Experience at OpenAI, shows a live demo of how GPT-4o can interact with the world through a webcam, understanding what it sees and reads

Rowan Zellers (@rown) 's Twitter Profile Photo

congrats kyutai on the Moshi demo moshi.chat - the latency is impressive! Curious to read the tech report! Capabilities wise, the current model feels a bit like the early days of LMs 😅. So it'll be useful to have this model to build on + make benchmarks for.

David Dohan (@dmdohan) 's Twitter Profile Photo

🍓 is ripe and ready to think, fast and slow: check out OpenAI o1, trained to reason before answering. I joined OpenAI to push the boundaries of science & reasoning with AI. Happy to share this result of the team's amazing collaboration, which does just that. Try it on your hardest problems.

Mark Chen (@markchen90) 's Twitter Profile Photo

As a coach for the US IOI team, I’ve been motivated for a long time to create models which can perform at the level of the most elite competitors in the world. Check out our research blog post - with enough samples, we achieve gold medal performance on this year’s IOI and ~14/15