Axel Darmouni (@adarmouni)'s Twitter Profile

Axel Darmouni

@adarmouni

Engineer @CentraleSupelec P22 | Data Scientist

ID: 1194253035339472897

Joined: 12-11-2019 13:59:13

642 Tweets

446 Followers

648 Following

Axel Darmouni (@adarmouni):

Is there a common practice on few-shot example complexity for text-to-SQL? Is it better to give an LM progressive examples (easy to hard), all hard, or all easy?
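
For concreteness, a hypothetical sketch of what the three orderings could look like as prompt builders; the schema, examples, and difficulty labels are invented for illustration, not taken from any established practice.

```python
# Hypothetical sketch of the three few-shot orderings for text-to-SQL.
# The schema, examples, and difficulty labels are invented placeholders.

EXAMPLES = [
    # (difficulty, question, SQL)
    (1, "How many users are there?",
     "SELECT COUNT(*) FROM users;"),
    (2, "List names of users who signed up in 2023.",
     "SELECT name FROM users WHERE signup_year = 2023;"),
    (3, "Which 3 countries have the most users?",
     "SELECT country, COUNT(*) AS n FROM users GROUP BY country "
     "ORDER BY n DESC LIMIT 3;"),
]

def build_prompt(question: str, ordering: str = "progressive") -> str:
    """Assemble a few-shot text-to-SQL prompt under one of three orderings."""
    max_d = max(d for d, _, _ in EXAMPLES)
    min_d = min(d for d, _, _ in EXAMPLES)
    if ordering == "progressive":      # easy to hard
        shots = sorted(EXAMPLES, key=lambda e: e[0])
    elif ordering == "full_hard":      # hardest examples only
        shots = [e for e in EXAMPLES if e[0] == max_d]
    elif ordering == "full_easy":      # easiest examples only
        shots = [e for e in EXAMPLES if e[0] == min_d]
    else:
        raise ValueError(f"unknown ordering: {ordering}")
    parts = [f"-- Q: {q}\n{sql}" for _, q, sql in shots]
    parts.append(f"-- Q: {question}")
    return "\n\n".join(parts)

print(build_prompt("Which users placed no orders?", "progressive"))
```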

Axel Darmouni (@adarmouni):

Distilling a strong diffusion model into a much faster one with higher performance is achievable

📖 Read of the day, day 128: « SwiftBrush v2: Make your one-step diffusion model better than its teacher », by Trung Dao et al from VinAI Research

arxiv.org/pdf/2408.14176

The…
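
As a rough illustration of the general idea only (SwiftBrush v2's actual objective builds on image-free, score-distillation-style losses, which this does not implement), a one-step student can be trained to regress a multi-step teacher; everything below is a toy stand-in.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Minimal sketch of one-step diffusion distillation. Generic teacher-student
# regression only, NOT SwiftBrush v2's actual objective. Toy modules.

class ToyDenoiser(nn.Module):
    """Stand-in for a text-conditioned denoising network."""
    def __init__(self):
        super().__init__()
        self.net = nn.Conv2d(3, 3, kernel_size=3, padding=1)

    def forward(self, x, t, cond):
        return self.net(x)                      # ignores t/cond for brevity

@torch.no_grad()
def teacher_sample(teacher, noise, cond, steps=50):
    """Slow path: run the teacher for many denoising steps."""
    x = noise
    for t in reversed(range(steps)):
        x = teacher(x, t, cond)
    return x

def distill_step(student, teacher, opt, noise, cond):
    """Fast path: the one-step student regresses the teacher's output."""
    target = teacher_sample(teacher, noise, cond)   # expensive target
    pred = student(noise, 0, cond)                  # single forward pass
    loss = F.mse_loss(pred, target)
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

student, teacher = ToyDenoiser(), ToyDenoiser()
opt = torch.optim.Adam(student.parameters(), lr=1e-4)
distill_step(student, teacher, opt, torch.randn(2, 3, 32, 32), cond=None)
```
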
Axel Darmouni (@adarmouni):

Yes, diffusion models can generate video games

📖 Read of the day, day 129: « Diffusion Models Are Real-Time Game Engines », by Dani Valevski, Yaniv Leviathan, moab.arar, and Shlomi Fruchter from Google Research

To watch the mind-blowing DOOM reproduction: gamengen.github.io
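
Conceptually, the engine is a next-frame diffusion predictor conditioned on recent frames and player actions; below is a hypothetical sketch of that interface only (modules, shapes, and the crude sampling loop are invented, not the paper's architecture).

```python
import torch
import torch.nn as nn

# Hypothetical sketch of "diffusion model as game engine": denoise the
# next frame conditioned on recent frames + player actions. All modules
# and shapes are invented placeholders, not the paper's architecture.

class NextFrameDenoiser(nn.Module):
    def __init__(self, n_actions=8, ctx_frames=4, channels=3):
        super().__init__()
        in_ch = channels * (ctx_frames + 1)       # noisy frame + context
        self.action_emb = nn.Embedding(n_actions, 16)
        self.action_proj = nn.Linear(16, channels)
        self.net = nn.Conv2d(in_ch, channels, kernel_size=3, padding=1)

    def forward(self, noisy_frame, context, actions):
        # context: (B, ctx_frames, C, H, W); actions: (B, ctx_frames) ints
        x = torch.cat([noisy_frame, context.flatten(1, 2)], dim=1)
        a = self.action_proj(self.action_emb(actions).mean(dim=1))
        return self.net(x) + a.view(-1, 3, 1, 1)  # crude action injection

def play_step(model, context, actions, denoise_steps=4):
    """Sample the next frame from noise, conditioned on history + actions."""
    b, _, c, h, w = context.shape
    frame = torch.randn(b, c, h, w)
    for _ in range(denoise_steps):                # crude denoising loop
        frame = model(frame, context, actions)
    return frame

model = NextFrameDenoiser()
ctx = torch.randn(1, 4, 3, 64, 64)               # last 4 frames
frame = play_step(model, ctx, torch.zeros(1, 4, dtype=torch.long))
```
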
Axel Darmouni (@adarmouni):

Reducing the parameter count of LLMs while improving performance is possible

📖 Read of the day, day 131: « LLM Pruning and Distillation in Practice: The Minitron Approach », by Sreenivas, Saurav Muralidharan et al from NVIDIA

arxiv.org/pdf/2408.11796

The authors apply a strategy to turn a model…
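
In broad strokes (a simplified sketch, not the paper's exact recipe, which prunes depth, width, and attention heads using calibrated importance estimates): rank structural components by an importance score, drop the least important, then distill from the original model to recover quality. The scoring rule and loss below are generic stand-ins.

```python
import torch
import torch.nn.functional as F

# Simplified sketch of prune-then-distill; generic stand-ins, not the
# exact Minitron recipe.

def importance_scores(activations):
    """Generic stand-in: score hidden units by mean absolute activation."""
    return activations.abs().mean(dim=0)

def prune_linear(layer, keep_idx):
    """Keep only the selected output units of a linear layer."""
    pruned = torch.nn.Linear(layer.in_features, len(keep_idx))
    pruned.weight.data = layer.weight.data[keep_idx].clone()
    pruned.bias.data = layer.bias.data[keep_idx].clone()
    return pruned

def distill_loss(student_logits, teacher_logits, temperature=2.0):
    """Soft-label KL loss so the pruned student recovers teacher quality."""
    t = temperature
    return F.kl_div(
        F.log_softmax(student_logits / t, dim=-1),
        F.softmax(teacher_logits / t, dim=-1),
        reduction="batchmean",
    ) * (t * t)

# Example: shrink a 512-unit layer to its 256 most active units.
layer = torch.nn.Linear(128, 512)
acts = torch.randn(1000, 512)                    # recorded activations
keep = importance_scores(acts).topk(256).indices
small = prune_linear(layer, keep)
```
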
Axel Darmouni (@adarmouni):

Fast, good, open-source speech-to-speech is possible

📖 Read of the day, day 132: « Mini-Omni: Language Models Can Hear, Talk While Thinking in Streaming », by Xie Zhifei and Wu from Tsinghua University

arxiv.org/pdf/2408.16725

The authors present a framework to make a Large…
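
The core trick, in a loose, hypothetical sketch (modules and vocab sizes are invented, not Mini-Omni's actual architecture): let one decoder emit text and audio tokens in parallel at each step, so speech can stream out while the textual reasoning is still unfolding.

```python
import torch
import torch.nn as nn

# Hypothetical sketch of parallel text+audio decoding for a speech LM.
# Modules and vocab sizes are invented placeholders, not Mini-Omni's.

class ParallelDecoder(nn.Module):
    def __init__(self, d_model=512, text_vocab=32000, audio_vocab=1024):
        super().__init__()
        self.backbone = nn.GRU(d_model, d_model, batch_first=True)
        self.text_head = nn.Linear(d_model, text_vocab)    # "thinking"
        self.audio_head = nn.Linear(d_model, audio_vocab)  # "talking"

    def step(self, x, state=None):
        h, state = self.backbone(x, state)
        # One forward pass yields BOTH a text logit vector and an audio
        # logit vector, so audio tokens can be streamed out immediately.
        return self.text_head(h), self.audio_head(h), state

dec = ParallelDecoder()
x = torch.randn(1, 1, 512)                       # one decoding step
text_logits, audio_logits, state = dec.step(x)
```
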
Matt Shumer (@mattshumer_):

I'm excited to announce Reflection 70B, the world’s top open-source model.

Trained using Reflection-Tuning, a technique developed to enable LLMs to fix their own mistakes.

405B coming next week - we expect it to be the best model in the world.

Built w/ Glaive AI.

Read on ⬇️:
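
The announced technique bakes self-correction in at training time; as a loose illustration of the pattern only, an inference-time harness might look like this (llm() and the prompts are placeholders, not the actual Reflection-Tuning format).

```python
# Hypothetical sketch of a generate-critique-revise loop in the spirit of
# "let the LLM fix its own mistakes". llm() is a placeholder for any
# chat-completion call; prompts and loop bounds are invented.

def llm(prompt: str) -> str:
    raise NotImplementedError("plug in your model API here")

def answer_with_reflection(question: str, max_rounds: int = 2) -> str:
    draft = llm(f"Answer step by step:\n{question}")
    for _ in range(max_rounds):
        critique = llm(
            f"Question:\n{question}\n\nDraft answer:\n{draft}\n\n"
            "List any mistakes in the draft. Reply NONE if it is correct."
        )
        if critique.strip() == "NONE":
            break
        draft = llm(
            f"Question:\n{question}\n\nDraft:\n{draft}\n\n"
            f"Known mistakes:\n{critique}\n\nWrite a corrected answer."
        )
    return draft
```
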
Axel Darmouni (@adarmouni):

A search algorithm specific to code generation that does boost performance

📖 Read of the day, day 133: « Planning in Natural Language Improves LLM Search For Code Generation », by Evan Wang et al from Scale AI

arxiv.org/pdf/2409.03733

Researchers at Scale AI made a method to…
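
The gist, as a hedged sketch (llm() and run_tests() are placeholders, and this compresses the paper's observation-combination machinery into a single planning prompt): search over diverse natural-language plans first, then turn each plan into code, instead of sampling many near-identical programs directly.

```python
# Hedged sketch of plan-first search for code generation. llm() and
# run_tests() are placeholders; the actual method searches over
# combinations of "observations" before writing any code.

def llm(prompt: str) -> str:
    raise NotImplementedError("plug in your model API here")

def run_tests(code: str) -> bool:
    raise NotImplementedError("execute candidate against unit tests")

def plan_search(problem: str, n_plans: int = 8) -> str | None:
    plans = [
        llm(f"Problem:\n{problem}\n\nSketch solution idea #{i} "
            "in plain English, distinct from obvious approaches.")
        for i in range(n_plans)
    ]
    for plan in plans:                  # diversity lives in the plans
        code = llm(f"Problem:\n{problem}\n\nPlan:\n{plan}\n\n"
                   "Implement this plan in Python.")
        if run_tests(code):
            return code
    return None
```
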
Axel Darmouni (@adarmouni):

Can’t wait for the benchmark results from Pixtral. Wanna know how it compares to the current SotA in small-sized VLMs. I’m thinking that if Mistral released it after so long, it should be pretty good, which is why I’m pretty hyped :)

shawn swyx wang (@swyx):

🎉 Congrats to OpenAI for releasing o1:

- Economics: tylercowen asked o1 basically to write a college essay
- Genetics: @catbrownstein asked o1 to help her reason through "n of 1" cases (medical cases that nobody has ever seen)
- Physics: @mariokrenn6240 used o1 to draft and…

Andrej Karpathy (@karpathy):

o1-mini keeps refusing to try to solve the Riemann Hypothesis on my behalf. Model laziness continues to be a major issue, sad ;p

Haider. (@slow_developer):

🚨 BREAKING

First live preliminary LiveBench results for 'Reasoning' show that OpenAI o1-mini massively outperforms Claude 3.5 Sonnet

Claude 3.5 Opus soon?
Matt Clifford (@matthewclifford):

This morning I had my first visceral “🤯” moment with AI for ~2 years

🧵 on o1 and cryptic crosswords:

My test for new models is a set of cryptic crossword clues that aren’t online (my granny wrote them). Every model so far has been completely useless at them… but o1 gets them…

Axel Darmouni (@adarmouni):

If o1-mini is in fact a distilled version of o1 or o1-preview, this is huge. Feeling like there’s a lot to be explored with LLM distillation, especially considering we’ve got really strong large models atm
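
One simple form that exploration can take, as a hedged sketch (placeholder functions; this is generic practice, not anything OpenAI has confirmed doing): sequence-level distillation, i.e. fine-tuning a small student directly on a strong teacher's sampled outputs.

```python
# Hedged sketch of sequence-level distillation: fine-tune a small model
# on outputs sampled from a strong teacher. All names are placeholders;
# this is generic practice, not OpenAI's actual training setup.

def teacher_generate(prompt: str) -> str:
    raise NotImplementedError("call the strong teacher model here")

def build_distillation_set(prompts):
    """Teacher answers become supervised targets for the student."""
    return [{"prompt": p, "completion": teacher_generate(p)} for p in prompts]

# The resulting (prompt, completion) pairs are then used for ordinary
# supervised fine-tuning of the smaller student model.
```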