Daphne Ippolito (@daphneipp)'s Twitter Profile
Daphne Ippolito

@daphneipp

I am an assistant professor at Carnegie Mellon University and also a senior research scientist at Google. I research topics in natural language generation.

ID: 408234858

Joined: 09-11-2011 04:50:50

46 Tweets

1.1K Followers

75 Following

Daphne Ippolito (@daphneipp)

As a note to my NLP friends, we're also encouraging works of ML-assisted creative writing this year. Consider submitting your best LM-assisted stories, poems, or anything else.

Jascha Sohl-Dickstein (@jaschasd)

CALL FOR TASKS CAPTURING LIMITATIONS OF LARGE LANGUAGE MODELS We are soliciting contributions of tasks to a *collaborative* benchmark designed to measure and extrapolate the capabilities and limitations of large language models. Submit tasks at github.com/google/BIG-Ben… #BIGbench

Katherine Lee (@katherine1ee)

Data duplication is serious business! 3% of documents in the large language dataset, C4, have near-duplicates. Deduplication reduces model memorization while training faster and without reducing accuracy. Paper: arxiv.org/abs/2107.06499 Code: coming soon! 🧵⬇️ (1/9)

NAACL SRW (@naacl_srw)

The Call For Papers of NAACL SRW 2022 is out! The mentoring program deadline is Feb 1, 2022. The paper submission deadline is Mar 25, 2022. Details are available here: 2022.naacl.org/calls/srw/ #NLProc

Daphne Ippolito (@daphneipp)

We noticed that the recent arxiv paper "A Roadmap for Big Model" with 100 authors plagiarizes from many papers, including ours. Nicholas Carlini wrote a blog post about it: nicholas.carlini.com/writing/2022/a…

Daphne Ippolito (@daphneipp)

Have you found yourself wishing for a manual on "how to do research?" In this panel discussion, you will hear from experienced NLP researchers about topics including picking promising research questions, building collaborations, and ensuring your results are reproducible.

Stephen Mayhew (@mayhewsw)

Short Seattle public transit thread, while I’m thinking about it. First, the Link Light Rail, which is what I took from the airport to downtown. Price: extremely reasonable $3, esp. compared to ~$60 Uber.

Daphne Ippolito (@daphneipp)

Announcing the Training Data Extraction Challenge, part of SaTML Conference! Your mission: extract train set strings memorized by a 1.3B parameter language model. More details at github.com/google-researc… GPU time is available through Colaboratory; let us know if you’re participating!

Daphne Ippolito (@daphneipp)

My collaborators and I have spent the last year learning from professional writers about the roles AI could play in providing creative writing assistance. Take a look at the whitepaper and stories!

Daphne Ippolito (@daphneipp)

See our new research on human ability to detect when a text passage transitions from human-written to language model-generated. We will be presenting this work at AAAI this week!

Florian Tramèr (@florian_tramer)

Author order on academic papers is important! My Google friends and I spent lots of time thinking about this critical issue (the scores of our ICML submissions show this is time well spent) We distill our findings for the community here: floriantramer.com/docs/papers/Au… Comments welcome!

Liam Dugan (@liamdugan_)

🚨New Paper🚨: Are AI text detectors *really* as good as they claim? (#ACL2024) We release RAID—The largest & most challenging detection benchmark with 6M+ outputs from 11 LLMs, 8 domains, 4 decoding strategies, and 11 adv attacks arxiv.org/abs/2405.07940 github.com/liamdugan/raid

Daphne Ippolito (@daphneipp)

In the past, I've studied how curation decisions for pre-training data influence what LMs are good and bad at. In our new preprint, we look at how the fabric of the internet (the primary source of most of these datasets) is itself changing, and the effects this might have.

Daphne Ippolito (@daphneipp)

Liam Dugan and his UPenn collaborators have done excellent work on testing out all the different methods for detecting AI-generated text, showing their efficacy across LMs and text domains, as well as their robustness to adversarial attacks. There's a new public benchmark too!