Tengyu Ma (@tengyuma)'s Twitter Profile
Tengyu Ma

@tengyuma

Assistant professor at Stanford; co-founder of Voyage AI (https://t.co/wpIITHLgF0).

Working on ML, DL, RL, LLMs, and their theory.

ID: 314395154

Link: http://ai.stanford.edu/~tengyuma · Joined: 10-06-2011 05:40:55

411 Tweets

26.1K Followers

515 Following

Tengyu Ma (@tengyuma):

A 100h/week job (e.g., CEO of a startup) apparently easily becomes a 24/7 job, where the job description of the remaining 68h is 'forget about what happened in those 100h, just enjoy life, family, and sleep'. And it's much much harder to succeed in those 68h than in those 100h.

Zhiyuan Li (@zhiyuanli_):

Why does Chain of Thought (CoT) work?

Our #ICLR 2024 paper proves that CoT enables more *iterative* compute to solve *inherently* serial problems. On the other hand, a constant-depth transformer that outputs answers right away can only solve problems that admit fast parallel algorithms.

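A toy sketch of the serial-vs-parallel distinction (an illustration, not the paper's construction): composing a given sequence of permutations of {0,...,4} (the word problem over S_5) is NC^1-complete, hence believed to lie outside TC^0, the class that contains constant-depth transformers. A chain of thought handles it by simply emitting the running state, one easy step per generated token.

```python
# Toy illustration (not the paper's construction): composing a sequence
# of permutations is a canonically serial computation -- the obvious
# algorithm needs one step per permutation, and constant-depth (TC^0)
# circuits are believed unable to shortcut it. A chain of thought just
# writes down the running state, one easy step per token.

def compose_with_cot(perms: list[list[int]], start: int) -> list[int]:
    """Apply each permutation in turn, recording every intermediate
    state -- the 'chain of thought' for this serial computation."""
    trace = [start]
    x = start
    for p in perms:
        x = p[x]            # step t depends on the output of step t-1
        trace.append(x)
    return trace

swap01 = [1, 0, 2, 3, 4]    # two sample elements of S_5
cycle = [1, 2, 3, 4, 0]
print(compose_with_cot([swap01, cycle, cycle, swap01], start=0))
# [0, 1, 2, 3, 3] -- each entry is one CoT step
```
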
Voyage AI (@Voyage_AI_):

🆕 📢 voyage-large-2-instruct embedding model tops the MTEB leaderboard 🥇! huggingface.co/spaces/mteb/le…

— embedding dimension = 1024 (4x smaller than any other non-Voyage model in top-5)
— 16K context length (2x of OpenAI v3 large)

blog: blog.voyageai.com/2024/05/05/voy…

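For context, a minimal sketch of embedding text with this model through the official voyageai Python client; the client and parameter names follow Voyage's public docs at the time of writing, so verify them against the current docs.

```python
# Minimal sketch, assuming the `voyageai` Python client
# (pip install voyageai) and a VOYAGE_API_KEY in the environment.
import voyageai

vo = voyageai.Client()  # picks up VOYAGE_API_KEY automatically

result = vo.embed(
    ["Sophia is a second-order optimizer for LLM pretraining."],
    model="voyage-large-2-instruct",
    input_type="document",   # use "query" when embedding search queries
)

vec = result.embeddings[0]
print(len(vec))  # 1024, the embedding dimension quoted in the tweet
```
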
Armen Aghajanyan (@ArmenAgha):

Correction: I did the math wrong (not considering log/log scales). Sophia is ~1.6x more efficient than Adam (thanks to Tengyu Ma for pointing this out).

Jason Hu (@onjas_buidl):

Very cool project that not enough people talk about!

Having tried this in production: Voyage AI embeddings provide a few percentage points of final performance improvement, despite being a small segment of the entire ML pipeline!

Bullish! 🫡

Armen Aghajanyan (@ArmenAgha):

Final Update: One more order of magnitude of testing Sophia. We're talking model sizes in the B's, tokens in the T's. Sophia once again wins out. For me at least, this is clear evidence that Sophia may be a replacement for Adam even in large-scale runs.

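For reference, a minimal sketch of the Sophia-G update rule as described in the Sophia paper (arXiv:2305.14342); the authors' repo has the official PyTorch implementation, and the hyperparameter names below are paraphrased from the paper rather than copied from that code.

```python
# Hedged sketch of one Sophia-G step, paraphrased from the paper:
# an EMA of gradients, an infrequently refreshed diagonal Hessian
# estimate, and a per-coordinate clipped, preconditioned update.
import numpy as np

def sophia_step(theta, m, h, grad, hess_est=None,
                lr=1e-4, beta1=0.965, beta2=0.99, gamma=0.01, eps=1e-12):
    """One Sophia-G step. `hess_est` is a fresh diagonal Hessian
    estimate (e.g. Gauss-Newton-Bartlett: B * g_hat**2 with g_hat the
    gradient on model-sampled labels), supplied only every k steps;
    otherwise the stale `h` is reused."""
    m = beta1 * m + (1 - beta1) * grad            # EMA of gradients
    if hess_est is not None:                      # infrequent Hessian refresh
        h = beta2 * h + (1 - beta2) * hess_est
    # Per-coordinate preconditioning with clipping: the clip caps the
    # step wherever the Hessian estimate is small or unreliable.
    update = np.clip(m / np.maximum(gamma * h, eps), -1.0, 1.0)
    theta = theta - lr * update
    return theta, m, h
```
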
meng shao (@shao__meng):

Anthropic does not offer its own embedding model. One embeddings provider that has a wide variety of options and capabilities encompassing all four of the above considerations is Voyage AI.
Voyage AI makes state-of-the-art embedding models and offers customized models for

LangChain (@LangChainAI):

⛵ Voyage AI Embedding Integration Package ↗️

Use the same custom embeddings that power Chat LangChain via the new langchain-voyageai package! Recommended by Anthropic as their preferred embedding provider, Voyage AI builds custom embedding models for your company or

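A minimal sketch of the langchain-voyageai integration named above; the class and method names follow the package's public interface at the time of writing (the standard LangChain Embeddings API), so check the current docs before relying on them.

```python
# Minimal sketch, assuming `pip install langchain-voyageai` and a
# VOYAGE_API_KEY in the environment.
from langchain_voyageai import VoyageAIEmbeddings

embeddings = VoyageAIEmbeddings(model="voyage-2")  # model name is required

# Standard LangChain Embeddings interface: documents vs. queries.
doc_vectors = embeddings.embed_documents(
    ["LangChain integrates Voyage AI embeddings."]
)
query_vector = embeddings.embed_query("Which embeddings power Chat LangChain?")
print(len(doc_vectors[0]), len(query_vector))
```
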
Tengyu Ma (@tengyuma):

In increasing difficulty,

1. train artificial neural nets
2. train one’s own biological neural net
3. train others’ neural nets

Level 1.5: train others' neural nets when others are also willing to train their own; that's why profs can mentor even though they may fail at 2.
