Scale AI (@scale_ai) 's Twitter Profile
Scale AI

@scale_ai

Our mission is to accelerate the development of AI. We believe that to make the best models, you need the best data.

ID: 752712449321644032

Link: http://www.scale.com
Joined: 12-07-2016 03:53:27

1.1K Tweets

47.47K Followers

490 Following

Alexandr Wang (@alexandr_wang) 's Twitter Profile Photo

1/Meta just released Llama3.1 405B!

Scale AI partnered deeply with Meta on this release:

🥇 SEAL Evaluations: Based on our evals
    🥇 on IF
    🥈 on Math
    #4 on Coding
💼 Enterprise partnership for custom Llama models
🤖 Data Foundry partnership on RLHF & SFT

👇
TechNet (@technetupdate) 's Twitter Profile Photo

AI has been utilized in the financial services and housing sectors for decades. But as Scale AI testified, we must properly deploy AI in a safe, responsible, and thoughtful manner to grow the U.S. economy. #AIforAmerica

Ed Ludlow (@edludlow) 's Twitter Profile Photo

!!! ITS FINALLY HERE !!!

Bloomberg’s 2024 List of Top AI Startups: 

With so much going on, it was difficult to decide which companies should make the cut. Rachel Metz, Shirin Ghaffary, and Dina Bass had a tough job.

bloomberg.com/features/2024-… (gift link)

Including: xAI/Elon Musk,
The Cognitive Revolution Podcast (@cogrev_podcast) 's Twitter Profile Photo

New episode out! Nathan Labenz hosts Riley Goodside, the world's first staff prompt engineer at Scale AI, to discuss the evolution of prompt engineering. Check out the full episode here: cognitiverevolution.ai/from-poetry-to…

Scale AI (@scale_ai) 's Twitter Profile Photo

Introducing the latest addition to the SEAL Leaderboards: Adversarial Robustness.
scl.ai/ar-leaderboard

Adversarial Robustness evaluates top models against 1,000 adversarial prompts, covering critical areas like illegal activities, harm, and hate speech.

Why it matters 👇

✅
Demis Hassabis (@demishassabis) 's Twitter Profile Photo

Great to see Gemini 1.5 Pro top the new @scale_ai leaderboard for adversarial robustness! Congrats to the entire Gemini team, and special thanks to Anca Dragan & the AI safety team for leading the charge on building in robustness to our models as a core capability.

OPTO (@optothemes) 's Twitter Profile Photo

⚔️ AI Wars: Who's Leading the New Global Arms Race? 🎙️ Vijay Karunamurthy, Field CTO at Scale AI discusses global competition for AI talent and regulatory challenges faced by companies like Scale AI working across the US and UK, in the wider context of the new arms race against China.

Google DeepMind (@googledeepmind) 's Twitter Profile Photo

Gemini 1.5 Pro is the safest model on @Scale_AI's leaderboard for adversarial robustness. Evaluations look at how it performed when tested by harmful prompts compared to others. As we continue to develop advanced AI, we're committed to ensuring safety is built in from scratch.

Nathan Labenz (@labenz) 's Twitter Profile Photo

People tell me they listen to The Cognitive Revolution Podcast for the "nuggets."

If you're automating routine work with LLMs, my episode with Riley Goodside, world's first Staff Prompt Engineer at Scale AI, is full of them.

Here's Riley on task decomposition & reasoning demonstrations.

Recommended!

Scale AI (@scale_ai) 's Twitter Profile Photo

Scale is on Forbes’ 2024 Cloud 100 list! The list recognizes the world’s top 100 cloud computing companies.

forbes.com/lists/cloud100/

Join us on our mission to accelerate the development of AI applications 👉 scale.com/careers
You Might Be Right Podcast (@ymbrpodcast) 's Twitter Profile Photo

#ListenNow: It's been nearly a year since our episode about Artificial Intelligence (AI) technology in Season 3, which seems like a lifetime ago in this quickly developing industry. In this week's episode, Govs. Phil Bredesen and Bill Haslam spoke with Michael Kratsios, former
Scale AI (@scale_ai) 's Twitter Profile Photo

How do you know if a model is truly solving problems or if it’s just repeating answers from its training?

Scale's Hugh Zhang is giving a tech talk on how his team developed GSM1k to expose potential data contamination in leading reasoning benchmarks

👉 scl.ai/unmasking-llm-…
Nathaniel Li (@natliml) 's Twitter Profile Photo

Who's better at LLM mischief — humans or AIs? Spoiler: It's us.

Human red teamers achieve 70%+ attack success rates against LLM defenses that stump automated adversarial attacks. Why? We’re better at adversarial yapping.🧵
Scale AI (@scale_ai) 's Twitter Profile Photo

📢Happening tomorrow! Can’t make it? Register to receive the recording: scl.ai/unmasking-llm-…

How do you know if a model is truly solving problems or if it’s just repeating answers from its training?

Join Hugh Zhang's tech talk tomorrow to learn about what his team found
Scale AI (@scale_ai) 's Twitter Profile Photo

We’ve added Mistral Large 2, GPT-4o (August 2024), and Gemini 1.5 Pro (August 27, 2024) to the SEAL LLM Leaderboards. 

See how they rank compared to leading LLMs across Coding, Instruction Following, Math, and Spanish domains: scl.ai/leaderboard