Samuele Cornell (@samuelecornell) 's Twitter Profile
Samuele Cornell

@samuelecornell

Post-doc @ CMU LTI. Audio and speech researcher.

ID: 1359140127889588226

Link: https://github.com/popcornell · Joined: 09-02-2021 14:01:05

167 Tweets

858 Followers

516 Following

Efthymios Tzinis (@etzinis) 's Twitter Profile Photo

My Ph.D. dissertation on "Unsupervised sound separation" is online ideals.illinois.edu/items/127515 😎 If you are interested in self-supervised, multi-modal, efficient and federated learning aspects for sound separation, feel free to take a look 👀

Samuele Cornell (@samuelecornell) 's Twitter Profile Photo

The CHiME Challenge is accepting task proposals. Submitting one is easy (organizing the task, less so 😅), and any idea concerning speech processing is welcome (even audiovisual, for example). Proposals may be merged, so it is a great opportunity for collaboration!

Jonathan Le Roux (@jonathanleroux) 's Twitter Profile Photo

#SANE2023 is in just over 1 month, Thu 10/26 at NYU in Brooklyn! Talk details are now up for Kyunghyun Cho, Yuan Gong, Anna Huang, Wenwu Wang, and (New!) Gaël Richard. There's still time to register, but we're nearing capacity. Poster deadline: 9/30 Please RT🤗 saneworkshop.org/sane2023/

Hervé "pyannote" Bredin (@hbredin) 's Twitter Profile Photo

What if #pyannote could do separation too? 🔀 I am hiring a postdoc to help me with that! 👩‍🎓 Apply here: emploi.cnrs.fr/Offres/CDD/UMR… … or please RT 🙏 #diarization #pyannote

Jonathan Le Roux (@jonathanleroux) 's Twitter Profile Photo

Applications for 2024 MERL Speech & Audio team internships are now open. Come work with us on sound event/anomaly detection, multimodal representation learning, multimodal scene interaction, and audio source separation, towards publication in top venues. merl.com/internship/ope…

Samuele Cornell (@samuelecornell) 's Twitter Profile Photo

This is great work from my colleague Zhong-Qiu Wang. IMO it is the first work that does unsupervised **fully neural** (no beamforming required) speech separation in a multi-channel setting without resorting to mixtures of mixtures. Performance is close to supervised PIT.
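For readers unfamiliar with the supervised baseline mentioned above: permutation-invariant training (PIT) scores the model's outputs against the reference sources under every possible source ordering and keeps only the best assignment, since a separator has no way to know which output slot should hold which speaker. A minimal sketch of that objective (NumPy, MSE variant; the function name is illustrative, not from the paper):

```python
import itertools
import numpy as np

def pit_mse_loss(est, ref):
    """Permutation-invariant MSE, a sketch of the PIT objective.

    est, ref: arrays of shape (n_src, time).
    Returns the minimum MSE over all assignments of
    estimated sources to reference sources.
    """
    n_src = est.shape[0]
    best = np.inf
    for perm in itertools.permutations(range(n_src)):
        # Score this estimate-to-reference assignment
        mse = np.mean((est[list(perm)] - ref) ** 2)
        best = min(best, mse)
    return best
```

Because the loss is taken over the best permutation, swapping the order of the estimated sources leaves it unchanged; real systems typically use SI-SDR instead of MSE and a more efficient assignment search, but the permutation-minimum idea is the same.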

Shinji Watanabe (@shinjiw_at_cmu) 's Twitter Profile Photo

Hi all, we want to announce a new Interspeech 2024 satellite workshop called SynData4GenAI (Synthetic Data's Transformative Role in Foundational Speech Models). Please check the details! syndata4genai.org

Samuele Cornell (@samuelecornell) 's Twitter Profile Photo

Is it only my impression, or has the quality of Interspeech reviews (on average, over the last three years) been much worse than at other conferences? Maybe I am unlucky, but every year I get crazy stuff that seldom happens elsewhere. Is the reviewer pool so different from ICASSP's?

Samuele Cornell (@samuelecornell) 's Twitter Profile Photo

This is very cool work on speech separation in real-world settings (AMI meetings). It is robust because it is weakly supervised (diarization labels only) and can thus be trained on real-world data, unlike most neural methods, which have to rely on synthetic data.

Shinji Watanabe (@shinjiw_at_cmu) 's Twitter Profile Photo

Hi all, this is the third call for papers for the SynData4GenAI workshop. Good news! While submissions were originally due on June 18th, we'll extend the deadline to June 24th. Please submit your papers at syndata4genai.org. We look forward to your submissions!

Samuele Cornell (@samuelecornell) 's Twitter Profile Photo

If you are interested in generalizable speech enhancement that can tackle "speech-in-the-wild" data, handle different sampling rates, and restore audio from different distortions, check this out. We have a new challenge at NeurIPS 2024. Website: urgent-challenge.github.io/urgent2024/tim…

William Chen (@chenwanch1) 's Twitter Profile Photo

I'm excited to announce WAVLab | @CarnegieMellon's XEUS - an SSL speech encoder that covers 4,000+ languages! XEUS is trained on over 1 million hours of speech. It outperforms both MMS 1B and w2v-BERT v2 2.0 on many tasks. We're releasing the code, checkpoints, and our 4,000+ language data! 🧵

Neil Zeghidour (@neilzegh) 's Twitter Profile Photo

We are releasing a detailed paper, model weights (model and codec), and streaming inference for Moshi! Beyond the model itself, we believe our findings will be useful to audio language model research. "Inner Monologue" for the win!