Rowan Zellers (@rown) 's Twitter Profile
Rowan Zellers

@rown

Researcher at @OpenAI studying multimodal - vision & language & sound.
website: rowanzellers.com
(he/him)

ID: 17108894

Link: https://rowanzellers.com · Joined: 02-11-2008 02:12:46

474 Tweets

9.9K Followers

925 Following

Lucas Beyer (bl16) (@giffmana) 's Twitter Profile Photo

We release PaliGemma. I'll keep it short, still on vacation:

- sota open base VLM designed to transfer quickly, easily, and strongly to a wide range of tasks
- Also does detection and segmentation
- We provide lots of examples
- Meaty tech report later!

ai.google.dev/gemma/docs/pal…
Rowan Zellers (@rown) 's Twitter Profile Photo

Congrats to the GDM Astra team! Super impressive video, audio, and long-context capabilities - can't wait to play with it 😀 (in the meantime, love the demo, though can't help but feel some secondhand awkwardness on behalf of the coworkers 🤣)

Prafulla Dhariwal (@prafdhar) 's Twitter Profile Photo

GPT-4o (o for “omni”) is the first model to come out of the omni team, OpenAI’s first natively fully multimodal model. This launch was a huge org-wide effort, but I’d like to give a shout out to a few of my awesome team members who made this magical model even possible!

Greg Brockman (@gdb) 's Twitter Profile Photo

GPT-4o was a whole team effort. Huge credit especially to Prafulla Dhariwal for the conviction to build an omni model, and for seeing it through in collaboration with many teams at OpenAI over the past 18 months. Also, he's excellent at trivia:

bogo (@giertler) 's Twitter Profile Photo

the first time gpt-4o spoke back to me in real-time, it became clear that we built something completely new – and that what we are building is the future of human-computer interaction. come build this real-time future with us. openai.com/careers/real-t…

Rowan Zellers (@rown) 's Twitter Profile Photo

I'll be in Seattle for #CVPR2024 (Tues June 18/Wednesday June 19) If you'll be there and interested in realtime video+audio, then we should meet! DM me if so 😀

Alexander Kirillov (@kirillov_a_n) 's Twitter Profile Photo

I'm going to #CVPR2024 next week. If you are in Seattle and interested in broad multimodal intelligence or real-time multimodality, we should talk. DM me here or send an email.

raulpuri.eth (@therealrpuri) 's Twitter Profile Photo

Multimodal OpenAI is out here #CVPR2024 this week. We have one gptv/gpt4o talk tmrw (Tuesday) @9am. DMs are open. Come find us to chat about AGI, multimodal, vibes, or hiring. Deets about the talk and who's here in thread 👇.

Alessandro Suglia (@ale_suglia) 's Twitter Profile Photo

Excited to announce that the Embodied AI team at Alana has developed AlanaVLM, a new foundation model for egocentric video understanding. It's the best open-weight model and rivals GPT-4V on spatial reasoning questions in OpenEQA. Thread below! arxiv.org/abs/2406.13807 #AI #NLProc

Tsarathustra (@tsarnick) 's Twitter Profile Photo

Romain Huet, Head of Developer Experience at OpenAI, shows a live demo of how GPT-4o can interact with the world through a webcam, understanding what it sees and reads

Rowan Zellers (@rown) 's Twitter Profile Photo

congrats kyutai on the Moshi demo moshi.chat - the latency is impressive! Curious to read the tech report! Capabilities wise, the current model feels a bit like the early days of LMs 😅. So it'll be useful to have this model to build on + make benchmarks for.

David Dohan (@dmdohan) 's Twitter Profile Photo

🍓 is ripe and ready to think, fast and slow: check out OpenAI o1, trained to reason before answering. I joined OpenAI to push the boundaries of science & reasoning with AI. Happy to share this result of the team's amazing collaboration, which does just that. Try it on your hardest problems.

Mark Chen (@markchen90) 's Twitter Profile Photo

As a coach for the US IOI team, I’ve been motivated for a long time to create models which can perform at the level of the most elite competitors in the world. Check out our research blog post - with enough samples, we achieve gold medal performance on this year’s IOI and ~14/15