Emma Strubell (@strubell)'s Twitter Profile
Emma Strubell

@strubell

assistant professor @LTIatCMU & visiting scientist @allen_ai. natural language processing and efficient ML. she/her/dad (2 dogs). 🏳️‍🌈. hiking and food. BLM.

ID:71544226

Link: http://strubell.github.io
Joined: 04-09-2009 14:13:53

1.0K Tweets

4.0K Followers

930 Following

Sang Choe (@sangkeun_choe)

High-quality data is a key to successful pretrain/finetuning in the GPT era, but manual data curation is expensive💸 We tackle data quality challenges involving large models and datasets with ScAlable Meta leArning (SAMA) #NeurIPS2023💫

Arxiv: arxiv.org/abs/2310.05674
🧵 (1/n)

Luca Soldaini 🎀 (@soldni)

Just released v0.9.0 of the Dolma toolkit 🍇 Lots of goodies (dataset tokenization support, new taggers, data analysis, etc), but the one I'm most proud of is that we now have....

✨ proper documentation 💫

check it out at github.com/allenai/dolma/…, or `pip install dolma` 😊

Language Technologies Institute | @CarnegieMellon (@LTIatCMU)

The LTI is hosting an information session for applicants to the MLT and PhD programs on Nov 8, 2023, 12-1 PM ET. If you would like to attend, please RSVP and send us your questions through this form: forms.gle/fGjDiTAKypwD2f…

Sanket Vaibhav Mehta (SVM) (@sanketvmehta)

Our paper (w/ Darshan Patil, Sarath Chandar & Emma Strubell) “An Empirical Investigation of the Role of Pre-training in Lifelong Learning” is now officially published in JMLR (will be presented at Journal-to-Conference Track)!
Paper 👉 jmlr.org/papers/v24/22-…
🧵👇 (1/n)

AllenNLP (@ai2_allennlp)

The deadline for Spring 2024 Research Internships at AllenNLP is July 15th, in two weeks. If you think 2024 is a great time to do NLP research with top mentors, apply at boards.greenhouse.io/thealleninstit…!

Jesse Dodge (@JesseDodge)

Today Google announced PaLM 2. In their 91 page paper they repeatedly say the training data is key ('we find that the data mixture is a critical component of the final model') while providing almost no information about how it was constructed, how it was sourced, or its contents.

Allen Institute for AI (@allen_ai)

Today we're thrilled to announce our new undertaking to collaboratively build the best open language model in the world: AI2 OLMo.

Uniquely open, 70B parameters, coming early 2024 – join us!

blog.allenai.org/announcing-ai2…

Cohere For AI (@CohereForAI)

Our cross-institutional collaboration, Efficient Methods for Natural Language Processing, has been accepted for publication at TACL! 🎉

You can find the pre-print at: arxiv.org/abs/2209.00099

Roy Schwartz (@royschwartzNLP)

AI models are becoming dangerously powerful. How can we effectively regulate them?

We propose a simple regulation to address the spread of misinformation⚠️: any AI-generated photorealistic image must have a visible watermark 🔖
tinyurl.com/zsf7zc3h

👇
(1/n)

@timnitGebru@dair-community.social on Mastodon (@timnitGebru)

“It’s not okay to install these by default,” says David Gray Widder...who became one of the department’s most vocal voices against Mites. “I don’t want to live in a world where one’s employer installing networked sensors...”
technologyreview.com/2023/04/03/107…

by Eileen Guo & Tate Ryan-Mosley

Emma Strubell (@strubell)

We, members of the research community, have the power to shape what is or is not considered good science.

Check out our blog post for discussion and recommendations on what to do about the rise of closed models like GPT-4:

Leon Derczynski ✍🏻🌲☕ (@LeonDerczynski)

ChatGPT not best at many language tasks. It's outranked by other systems on many NLP benchmarks in current evaluation. For 77.5% of tasks examined, other systems are better than ChatGPT.

opensamizdat.com/posts/chatgpt_…
