Percy Liang (@percyliang)'s Twitter Profile
Percy Liang

@percyliang

Associate Professor in computer science @Stanford @StanfordHAI @StanfordCRFM @StanfordAILab @stanfordnlp | cofounder @togethercompute | Pianist

ID:86481377

Link: https://cs.stanford.edu/~pliang/
Joined: 31-10-2009 07:26:37

821 Tweets

50.6K Followers

408 Following

Percy Liang (@percyliang)

HELM MMLU v1.3.0 is out. We've added GPT-4o, Gemini 1.5 Flash, and Palmyra-X v3 - all 3 made it into the top 10.
crfm.stanford.edu/helm/mmlu/v1.3…
Click on the numbers to drill down into the predictions.

Yann Dubois (@yanndubs)

GPT4-o from OpenAI tops AlpacaEval

Actually, the top 3 models are preferred by GPT-4 Preview than itself. By now I've seen many times models that prefer better models than themselves, and that suggests to me that some form of self-improvement (in the narrow sense) is possible!

Sang Michael Xie (@sangmichaelxie)

The ME-FoMo #ICLR2024 workshop is tomorrow, Sat May 11 starting at 8:50AM in Vienna!

Room: Strauss 2
Schedule: sites.google.com/view/me-fomo20…

Excited for our amazing speakers: @srush_nlp @HannaHajishirzi @JacobSteinhardt @amirgloberson @tydsh !!
Imbue (@imbue_ai)

🎙️ Generally Intelligent Episode 35: Percy Liang

We sat down with Percy Liang, associate professor of computer science and statistics at Stanford University, to discuss:
- how to evaluate language models robustly
- balancing plurality and consensus with AI
- the role of academia vs.

Weights & Biases (@weights_biases)

🎙️ Join Percy Liang, Together AI co-founder and Stanford University Professor, as he discusses the challenges of evaluating LLMs.
Listen 🎧 or watch 🎥 now: lnk.to/GDkZS7O1

Percy Liang (@percyliang)

HELM is now fully multimodal! In addition to language models and text-to-image models (HEIM), we now evaluate vision-language models (made possible by MMMU, VQAv2, VizWiz - thanks to the authors!). As usual, the full predictions and prompts are available on the HELM website:

rishi (@RishiBommasani)

Transparency for foundation models is an outstanding challenge.

To make progress the White House and G7 have recommended that foundation model developers prepare *transparency reports*.

We recently put out a paper that articulates what this should mean and its policy impact🧵

Percy Liang (@percyliang)

HELM is now multimodal! In addition to evaluating language models and text-to-image models, we now have vision-language models.
