lmsys.org (@lmsysorg)'s Twitter Profile
lmsys.org

@lmsysorg

Large Model Systems Organization. We created Vicuna and Chatbot Arena! Compare 50+ LLMs (GPT-4/Claude/Gemini/Llamas) side-by-side at lmarena.ai

ID: 1641378826537295874

http://lmsys.org · 30-03-2023 09:56:38

656 Tweets

54.54K Followers

178 Following

Logan Kilpatrick (@officiallogank)

Huge gains with 1.5 Flash across the board and 1.5 Pro is much better at math, coding, and complex prompts. Gemini 1.5 Flash is the best model in the world for developers right now ⚡️

lmsys.org (@lmsysorg)

Exciting news -- we've launched a new X account for our open-source research! At LMSYS, we're dedicated to advancing open-source systems and in-depth technical work besides regular leaderboard updates. Follow @lmsys_oss to stay updated on our roadmap and ongoing developments

lmsys.org (@lmsysorg)

Check out the new SGLang release featuring performance boosts with DeepSeek MLA optimization and torch.compile! We're also introducing multi-image/video inputs with LLaVA-OneVision support, now live in the Chatbot Arena served by the latest SGLang. Try the Code:

lmsys.org (@lmsysorg)

We're launching something new... Sign up now to become a beta tester and get early access to Copilot Arena 👨‍💻🤖 forms.gle/o8Qh7SccrVEkuX…

lmsys.org (@lmsysorg)

Benchmarking is challenging across all fields, from evaluating LLM assistants to comparing inference engines. Based on community feedback, we've shared some quick notes on benchmarking inference systems and will continue refining our evaluation pipelines!

lmsys.org (@lmsysorg)

[Update] The server is overloaded and we're fixing it now! Join our Discord channel to report bugs or share your best jailbreak prompt :) discord.gg/mP3PwbKG9m

lmsys.org (@lmsysorg)

Whoa, the grandmaster of jailbreaking has arrived! 🏆#1 on the leaderboard now! Check it out at redarena.ai/leaderboard (⚠️ strong language)

lmsys.org (@lmsysorg)

We just shipped a fix that makes the server much more responsive. Happy jailbreaking and please keep the feedback coming! discord.gg/2evxnrPt