Xavier Giró🎗 (@docxavi) Twitter Tweets • TwiDoom

Joelle Pineau

2 months ago

We dropped another awesome open model: SAM 2. This one comes with the data and an easy-to-use demo. It extends the original Segment Anything Model to work on video. Enjoy!

Likes: 109 · Replies: 4 · Retweets: 17

Robin Rombach

🔥 I am so damn excited to announce the launch of Black Forest Labs. We set ourselves on a mission to advance state-of-the-art, high-quality generative deep learning models for images and video, and make them available to the broadest audience possible. Today, we release FLUX.1

Likes: 1.1K · Replies: 83 · Retweets: 161

Geleta

@geletavc

2 months ago

It was a beautiful venue to showcase our research on multimodal deep steganography — Xavier Giró🎗 Jordi Pons Cristina Puntí Cristian Canton

Likes: 8 · Replies: 1 · Retweets: 1

Sander Dieleman

@sedielem

2 months ago

I gave a 1-hour talk about generative modelling at the EEML 2024 summer school last month. It's mostly an intuitive look at how and why diffusion models actually work -- not unlike the content of my recent blog posts. All summer school talks will be freely available online!🙏

Likes: 373 · Replies: 5 · Retweets: 40

Sander Dieleman

@sedielem

a month ago

The interpretation of diffusion as autoregression in the frequency domain seems to be stirring up a lot of thought! (I may or may not have a new blog post in the works 🧐)

Likes: 495 · Replies: 10 · Retweets: 31

Jia-Bin Huang

@jbhuang0604

a month ago

How I Understand Transformers. Transformer architectures power most (if not all) of the incredible generative AI applications. But how do they work? In this short (17m) video, you and I will go through the basic ideas behind transformers. While making this video, I had

Likes: 301 · Replies: 1 · Retweets: 59

Google DeepMind

@googledeepmind

a month ago

Meet our AI-powered robot that’s ready to play table tennis. 🤖🏓 It’s the first agent to achieve amateur human level performance in this sport. Here’s how it works. 🧵

Likes: 3.3K · Replies: 138 · Retweets: 841

Jason Baldridge

@jasonbaldridge

a month ago

Excited to share our paper about Imagen 3, Google's latest and most capable text-to-image model! arxiv.org/abs/2408.07009 It's been a big team effort. You can try it out on ImageFX, our experimental AI tech surface, now: aitestkitchen.withgoogle.com/tools/image-fx

Likes: 251 · Replies: 13 · Retweets: 56

ELLISBarcelona

@ellisbarcelona

a month ago

🗣️ This September, don't miss the first seminar from Prof. Nicu Sebe, director of #ELLIS' program on Multimodal Learning: "Cross-modal understanding and generation of multimodal content". Register here 👉 bit.ly/3A2h0Ap

Likes: 13 · Replies: 0 · Retweets: 4

Robert Luxemburg

@robertluxemburg

a month ago

A random walk(*) through #flux latent space

Likes: 491 · Replies: 10 · Retweets: 59

European Conference on Computer Vision #ECCV2024

@eccvconf

a month ago

The #ECCV2024 Preliminary Program is now available. The poster/oral session times can be found at the link below. Note, this schedule is subject to changes. docs.google.com/spreadsheets/d…

Likes: 57 · Replies: 2 · Retweets: 7

Chunting Zhou

@violet_zct

a month ago

Introducing *Transfusion* - a unified approach for training models that can generate both text and images. arxiv.org/pdf/2408.11039 Transfusion combines language modeling (next token prediction) with diffusion to train a single transformer over mixed-modality sequences. This

Likes: 983 · Replies: 23 · Retweets: 208

Richard Socher

@richardsocher

a month ago

We are entering the Age of AI. History does not repeat itself but it rhymes and this era combines aspects of the renaissance, enlightenment and the industrial revolution. We have collectively never had so much access to knowledge. AI is making it more digestible with amazing

Likes: 216 · Replies: 13 · Retweets: 46

Sander Dieleman

@sedielem

a month ago

Think you understand classifier-free diffusion guidance? Think again! These two papers beg to differ😁 arxiv.org/abs/2406.02507 arxiv.org/abs/2408.09000 Both full of really great insights that question prevailing assumptions. cc Jaakko Lehtinen Arwen Bradley Preetum Nakkiran

Likes: 275 · Replies: 2 · Retweets: 48

AK

@_akhaliq

a month ago

Meta presents Sapiens: Foundation for Human Vision Models. Discuss: huggingface.co/papers/2408.12… We present Sapiens, a family of models for four fundamental human-centric vision tasks - 2D pose estimation, body-part segmentation, depth estimation, and surface normal prediction. Our

Likes: 1.1K · Replies: 22 · Retweets: 327

AK

@_akhaliq

a month ago

Scalable Autoregressive Image Generation with Mamba. Discuss: huggingface.co/papers/2408.12… Model: huggingface.co/hp-l33/aim We introduce AiM, an autoregressive (AR) image generative model based on Mamba architecture. AiM employs Mamba, a novel state-space model characterized by its

Likes: 224 · Replies: 2 · Retweets: 46

CVC_UAB

@cvc_uab

23 days ago

📢 The Annual Catalan Meeting on Computer Vision (ACMCV) is approaching, taking place on September 17th. This meeting seeks to connect the Computer Vision community of Catalonia, allowing the attendees to strengthen links. ✏️ Registration is already open: acmcv.cat

Likes: 10 · Replies: 0 · Retweets: 8

AK

@_akhaliq

23 days ago

Google presents Diffusion Models Are Real-Time Game Engines. Discuss: huggingface.co/papers/2408.14… We present GameNGen, the first game engine powered entirely by a neural model that enables real-time interaction with a complex environment over long trajectories at high quality.

Likes: 2.2K · Replies: 62 · Retweets: 475

Xavier Giró🎗

@docxavi

21 days ago

Happy to see customer obsession by Prime Video España in practice: a new list of movies dubbed into Catalan. It was really inconvenient having to check Desdelsofà.cat or Goita què fan ara every time I wanted to enjoy a movie or TV show in Catalan.

Likes: 21 · Replies: 1 · Retweets: 7
