Stanislas Polu (@spolu) Twitter Tweets • TwiCopy

2 months ago

Clear negative correlation between accuracy and reasoning gap. This goes directly against the hypothesis that larger models are more contaminated.

Best news for largest language models in a long time!

WTF is going on with Mistral Large 5 shots without CoT?

14 likes

Stanislas Polu

2 months ago

Next week I'll run a model on all the conversations of the week to estimate (usefulness, time saved or lost in minutes) so that I can compute # of humans saved / week by Dust users :)

8 likes

Stanislas Polu

2 months ago

Semantic search is powerful but bad at quantitative questions (by construction).

To circumvent that, we built Table Queries📓

Any structured data in your company (Google Sheets, Notion DBs, CSVs...) gets turned into JIT in-memory SQLite DBs that models can query using SQL👨‍🏫
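A minimal sketch of the idea, not Dust's actual implementation: CSV text is loaded into a throwaway in-memory SQLite database, and the SQL a model would write is run against it. The helper names `load_csv_into_memory_db` and `run_model_query`, the `deals` table, and the sample data are all hypothetical.

```python
import csv
import io
import sqlite3


def load_csv_into_memory_db(csv_text, table_name):
    """Build a just-in-time in-memory SQLite DB from CSV text (hypothetical helper)."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]
    conn = sqlite3.connect(":memory:")
    # NUMERIC affinity stores number-like text as numbers and leaves other text as-is.
    columns = ", ".join(f'"{name}" NUMERIC' for name in header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute(f'CREATE TABLE "{table_name}" ({columns})')
    conn.executemany(f'INSERT INTO "{table_name}" VALUES ({placeholders})', data)
    return conn


def run_model_query(conn, sql):
    """Run the SQL a model produced and return all rows (hypothetical helper)."""
    return conn.execute(sql).fetchall()


# Made-up sample data standing in for a sheet or CSV export.
SAMPLE_CSV = "region,deals\nEMEA,120\nAPAC,80\nEMEA,40\n"

conn = load_csv_into_memory_db(SAMPLE_CSV, "deals")
# A quantitative question semantic search handles poorly: total deals per region.
totals = run_model_query(
    conn,
    "SELECT region, SUM(deals) FROM deals GROUP BY region ORDER BY region",
)
print(totals)
```

In practice the SQL string would come from the model rather than being hard-coded, and a real system would want to restrict it to read-only statements.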

16 likes

Stanislas Polu

2 months ago

People ask me (often with a weird look):

'But dude, why did you leave OpenAI ??'

This is why 👇

47 likes

2 retweets

Stanislas Polu

2 months ago

We made two hard bets with Dust:

- A horizontal platform with access to all the SaaS relied on by our users (Notion, GitHub, Slack, Drive, Intercom, ...)
- Not one Assistant, but many Assistants specialized in specific tasks.
- Capability to do semantic retrieval but also…

46 likes

2 retweets

Stanislas Polu

2 months ago

Anybody tried to make models play chess against one another in standard algebraic notation?

We know models are quite good at it. But who wins?

Mistral-Large vs Claude 2 vs Gemini 1.5 vs GPT-4

27 likes