Mark Mazumder (@markmaz)'s Twitter Profile
Mark Mazumder

@markmaz

Machine learning PhD student at Harvard University

ID: 12649062

Website: https://markmaz.com | Joined: 24-01-2008 17:44:32

14 Tweets

26 Followers

15 Following

coqui (@coqui_ai)

Coqui + Harvard University + Google paper accepted to @INTERSPEECH2021 🐸🚀 🔥 Few-Shot Keyword Spotting in Any Language 🔥 Check it out on arXiv 😎 arxiv.org/abs/2104.01454 Spear-headed by the brilliant 💫 Mark Mazumder 💫 from the Harvard University Edge Computing lab 1/

The Gradient (@gradientpub)

Over the last year, @commons_ml set out to expand open source speech recognition resources Find out about The People’s Speech, a massive English-language dataset, and the 50-language, 6000-hour Multilingual Spoken Words Corpus 🗣️ 👇👇👇 thegradientpub.substack.com/p/new-datasets…

MLCommons (@mlcommons)

MLCommons releases Multilingual Spoken Words Corpus, permissively licensed keyword spotting dataset in 50 languages. The rich audio #speech dataset helps advance development of apps such as voice interfaces for a broad global audience. bit.ly/3yxqYF5 #AI #ML #NeurIPS2021

David Kanter (@thekanter)

@commons_ml A huge thanks to all our partners and community supporters, you can read more about this project at mlcommons.org/en/news/spoken… It takes a team :)

Edge Impulse (@edgeimpulse)

The Multilingual Spoken Words Corpus is a large and growing audio dataset of spoken words in 50 different languages. It contains more than 340K words and 23M one-second audio samples, adding up to over 6K hours of speech. mlcommons.org/en/news/spoken…

Harvard SEAS (@hseas)

A new project aims to build a dataset with 1,000 words in 1,000 different languages to bring voice technology to hundreds of millions of speakers around the world buff.ly/3pZjJlz

Mark Mazumder (@markmaz)

Our new dataset, The Multilingual Spoken Words Corpus (NeurIPS 2021 Datasets & Benchmarks Track) is now available! Paper: openreview.net/forum?id=c20ji… Data: mlcommons.org/words Colab Tutorial: colab.research.google.com/github/harvard… Video: youtube.com/watch?v=eGPCwn…

Andrew Ng (@andrewyng)

I’m happy to see MLCommons release both a multilingual speech dataset and a large 30,000 hour diverse English dataset. We need more open datasets like these to help researchers build speech systems that can serve many individuals around the world. 🗺️

Davis Blalock (@davisblalock)

"DataPerf: Benchmarks for Data-Centric AI Development" What if instead of holding the data constant and benchmarking different models, we held the model constant and benchmarked different data pipelines? [1/7]

MLCommons (@mlcommons)

The future of #ML is data-centric! That’s why we built #DataPerf, the leaderboard for data. It is the 1st platform and community for data-centric competitions. Together we will break through data limitations and unlock better ML for the world mlcommons.org/en/news/datape…