Daphne Ippolito (@daphneipp) Twitter Tweets • TwiDoom

Daphne Ippolito

@daphneipp

I am an assistant professor at Carnegie Mellon University and also a senior research scientist at Google. I research topics in natural language generation.

ID: 408234858

Joined: 09-11-2011 04:50:50

46 Tweets

1.1K Followers

75 Following

Daphne Ippolito

@daphneipp

4 years ago

As a note to my NLP friends, we're also encouraging works of ML-assisted creative writing this year. Consider submitting your best LM-assisted stories, poems, or anything else.

Likes: 24 · Replies: 0 · Retweets: 10

Jascha Sohl-Dickstein

CALL FOR TASKS CAPTURING LIMITATIONS OF LARGE LANGUAGE MODELS
We are soliciting contributions of tasks to a *collaborative* benchmark designed to measure and extrapolate the capabilities and limitations of large language models. Submit tasks at github.com/google/BIG-Ben… #BIGbench

Likes: 279 · Replies: 14 · Retweets: 73

Katherine Lee

@katherine1ee

3 years ago

Data duplication is serious business! 3% of documents in the large language dataset, C4, have near-duplicates. Deduplication reduces model memorization while training faster and without reducing accuracy. Paper: arxiv.org/abs/2107.06499 Code: coming soon! 🧵⬇️ (1/9)

Likes: 259 · Replies: 6 · Retweets: 55

NAACL SRW

@naacl_srw

3 years ago

The Call For Papers of NAACL SRW 2022 is out! The mentoring program deadline is Feb 1, 2022. The paper submission deadline is Mar 25, 2022. Details are available here: 2022.naacl.org/calls/srw/ #NLProc

Likes: 15 · Replies: 0 · Retweets: 12

Daphne Ippolito

@daphneipp

2 years ago

We noticed that the recent arxiv paper "A Roadmap for Big Model" with 100 authors plagiarizes from many papers, including ours. Nicholas Carlini wrote a blog post about it: nicholas.carlini.com/writing/2022/a…

Likes: 1.1K · Replies: 23 · Retweets: 158

Daphne Ippolito

@daphneipp

2 years ago

Have you found yourself wishing for a manual on "how to do research?" In this panel discussion, you will hear from experienced NLP researchers about topics including picking promising research questions, building collaborations, and ensuring your results are reproducible.

Likes: 2 · Replies: 0 · Retweets: 0

Stephen Mayhew

@mayhewsw

2 years ago

Short Seattle public transit thread, while I’m thinking about it. First, the Link Light Rail, which is what I took from the airport to downtown. Price: extremely reasonable $3, esp. compared to ~$60 Uber.

Likes: 8 · Replies: 3 · Retweets: 1

Daphne Ippolito

@daphneipp

2 years ago

Announcing the Training Data Extraction Challenge, part of SaTML Conference! Your mission: extract train set strings memorized by a 1.3B parameter language model. More details at github.com/google-researc… GPU time is available through Colaboratory; let us know if you’re participating!

Likes: 933 · Replies: 7 · Retweets: 183

Daphne Ippolito

@daphneipp

2 years ago

My collaborators and I have spent the last year learning from professional writers about the roles AI could play in providing creative writing assistance. Take a look at the whitepaper and stories!

Likes: 42 · Replies: 3 · Retweets: 6

Daphne Ippolito

@daphneipp

2 years ago

This excellent article talks about some of my older work on the detection of language model-generated text!

Likes: 24 · Replies: 1 · Retweets: 0

Daphne Ippolito

@daphneipp

2 years ago

See our new research on human ability to detect when a text passage transitions from human-written to language model-generated. We will be presenting this work at AAAI this week!

Likes: 18 · Replies: 0 · Retweets: 6

Florian Tramèr

@florian_tramer

a year ago

Author order on academic papers is important! My Google friends and I spent lots of time thinking about this critical issue (the scores of our ICML submissions show this is time well spent) We distill our findings for the community here: floriantramer.com/docs/papers/Au… Comments welcome!

Likes: 401 · Replies: 10 · Retweets: 61

Liam Dugan

@liamdugan_

3 months ago

🚨New Paper🚨: Are AI text detectors *really* as good as they claim? (#ACL2024) We release RAID—The largest & most challenging detection benchmark with 6M+ outputs from 11 LLMs, 8 domains, 4 decoding strategies, and 11 adv attacks arxiv.org/abs/2405.07940 github.com/liamdugan/raid

Likes: 43 · Replies: 3 · Retweets: 12

Daphne Ippolito

@daphneipp

2 months ago

In the past, I've studied how curation decisions for pre-training data influence what LMs are good and bad at. In our new preprint, we look at how the fabric of the internet (the primary source of most of these datasets), is itself changing, and the effects this might have.

Likes: 39 · Replies: 0 · Retweets: 8

Daphne Ippolito

@daphneipp

a month ago

Liam Dugan and his UPenn collaborators have done excellent work on testing out all the different methods for detecting AI-generated text, showing their efficacy across LMs and text domains, as well as their robustness to adversarial attacks. There's a new public benchmark too!

Likes: 6 · Replies: 0 · Retweets: 2
