Joey Gonzalez (@profjoeyg)'s Twitter Profile
Joey Gonzalez

@profjoeyg

Professor @UCBerkeley, co-director of @LMSysorg, and co-founder @RunLLM

ID: 323533772

Link: https://www.eecs.berkeley.edu/~jegonzal

Joined: 25-06-2011 00:24:02

484 Tweets

2.2K Followers

293 Following

Joschka Braun (@joschkabraun)

I benchmarked Anthropic's new tool use beta API on the Berkeley function calling benchmark. Haiku beats GPT-4 Turbo in half of the scenarios. Results in 🧵 A huge thanks to Shishir Patil, Fanjia Yan, Tianjun Zhang, Joey Gonzalez & rest for providing this benchmark publicly.

lmsys.org (@lmsysorg)

Congrats NVIDIA on the exciting 340B model release! The model was tested under the codename "june-chatbot" and is now coming out of stealth with impressive performance, surpassing Llama-3-70b across hard benchmarks like Arena-Hard-Auto. The new best open model? Come play with

Simon Willison (@simonw)

Here's the animated LMSYS arena tool I built for the talk using Claude 3.5 Sonnet and Artifacts - inspired by Peter Gostev's visualizations Peter's: linkedin.com/posts/peter-go… My tool: tools.simonwillison.net/arena-animated

Vikram Sreekanti (@vsreekanti)

.RunLLM's now on the Modin Project documentation site — thanks to the Modin team for their help! Let us know what you think: modin.readthedocs.io/en/stable/

Matei Zaharia (@matei_zaharia)

Great post by Celebal Technologies on implementing RAFT on top of Databricks fine-tuning to out-perform RAG: celebaltech.com/blogs/enhancin… Incidentally the RAFT paper will be presented at Conference on Language Modeling so check it out if you're there.

Lianmin Zheng (@lm_zheng)

Torch.compile is awesome! No more headaches from hacking tricky low-level CUDA code for new attention variants. I can't wait to see them catch up with H100 and FP8 support as well.

vLLM (@vllm_project)

🙏 Thank you NVIDIA for sponsoring vLLM development. The DGX H200 machine is marvelous! We plan to use the machine for benchmarking and performance enhancement 🏎️.

Joey Gonzalez (@profjoeyg)

For the past year, Vikram Sreekanti and I have been writing about whatever we found interesting in our generative AI themed blog - Generating Conversations. However, when we look at which posts did the best, it is clear that our readers are most interested in deeper discussions on

Vikram Sreekanti (@vsreekanti)

We've been thinking a lot about how to price RunLLM recently — pricing is always hard, but AI adds some interesting wrinkles to the dynamic. We haven't completely figured it out yet, but we thought we'd share our thoughts about where things are headed: open.substack.com/pub/frontierai…

Joey Gonzalez (@profjoeyg)

What is the right pricing model for AI? Should it be a monthly fee or a flat rate per token? Do you pay extra for more knowledge? Three years ago, I was focused on server-less computing for AI and how to allocate inference engines. At the time, consumption based pricing was

Liana (@lianapatel_)

Want to answer NL questions over your data? Introducing Table Augmented Generation (TAG)! Joint work w/ the amazing Matei Zaharia Carlos Guestrin Joey Gonzalez Asim Biswal Sid Jha Amog Kamsetty Shu Liu 📚 Paper: arxiv.org/abs/2408.14717 🛠️ Code: github.com/tag-research/t… 🧵

Joey Gonzalez (@profjoeyg)

Back in 2023, my students working on Gorilla project made the case for connecting LLMs to SaaS APIs. Today, everyone knows that models should be interacting with APIs. What people don't realize is that LLMs need to know more than how to call the API. They need to learn the

Joey Gonzalez (@profjoeyg)

I am excited to announce the launch of the Video Arena. Our goal is to study video generation models and ultimately how humans prompt them. You can help us by watching entertaining generative AI videos constructed using the same prompt.