D. Sivakumar (@dsivakumar) 's Twitter Profile
D. Sivakumar

@dsivakumar

Co-Founder: Tonita.co
(Commerce Search with NLP Magic)

Earlier:
Member, @southpkcommons;
NLP, ML, Algorithms at Google Research;
CS Theory at IBM Almaden

ID: 22533917

Joined: 02-03-2009 21:07:11

3.3K Tweets

3.3K Followers

871 Following

D. Sivakumar (@dsivakumar) 's Twitter Profile Photo

played a bit with the models on Groq Inc, three observations: 1. Groq's inference speeds are mind-blowing! 2. hallucination, even for the 70B Llama 3.1 model, is very real (e.g., "what is flash attention?") 3. the linguistic abilities of even the xB models are stunningly good

D. Sivakumar (@dsivakumar) 's Twitter Profile Photo

Love watching Steph and Lebron, of course, and enjoy watching Anthony Edwards; but has there been a better basketball player to watch for the sheer quality of play in this century than KD? 🏀 🇺🇸

Vivek Raghunathan (@vivek7ue) 's Twitter Profile Photo

As we see companies like Reddit block search engines from indexing them, time to revisit crawl neutrality for the web ... fastcompany.com/90759792/with-…

D. Sivakumar (@dsivakumar) 's Twitter Profile Photo

Very nice introspection / insight. Also critical is how we can often get lucky / creative in Step 3 (either a lot of noisy data and contrastive learning works well enough; or a powerful oracle in the form of a foundation model).

Han (@hanchunglee) 's Twitter Profile Photo

for vertex ai, it would be a 10 step process to deploy a model, set up an api gateway, ensure there’s proper traffic and routing, before you can actually use it easily. google is not on gcp.

D. Sivakumar (@dsivakumar) 's Twitter Profile Photo

New marathon tapering strategy from Sifan Hassan -- run a 5K and a 10K at Olympic medal pace in the last week before the marathon; then go on to win a Gold and set an Olympic record. What an athlete.

D. Sivakumar (@dsivakumar) 's Twitter Profile Photo

Samanth Subramanian's article / essay / portrait on his Substack today is at once fresh, unexpected, beautiful, and ever so gently poignant. Do not miss! samanth.substack.com/p/multi-storie…

D. Sivakumar (@dsivakumar) 's Twitter Profile Photo

one of these days ChatGPT is going to learn not to create lines with just spaces when writing Python programs... until then I'll learn to live with this:

Dmytro Dzhulgakov (@dzhulgakov) 's Twitter Profile Photo

Congrats to Cerebras on the impressive results! How do SRAM-only ASICs like it stack up against GPUs? Spoiler: GPUs still rock for throughput, custom models, large models and prompts (common "prod" things). SRAM ASICs shine for pure generation. Long 🧵 x.com/CerebrasSystem…

Cosmin Negruseri (@cosminnegruseri) 's Twitter Profile Photo

We're living in the golden age of coding/software engineering. In chess there was a similar period where humans + computers was a better combination than humans or computers separately. It lasted around 5y? So enjoy it while it lasts.

D. Sivakumar (@dsivakumar) 's Twitter Profile Photo

Wordle 1,170 3/6* Skill 99, Luck 39 - which is weird, since I must be very lucky to get a picture like this! ⬛⬛⬛⬛⬛ ⬛⬛⬛⬛⬛ 🟩🟩🟩🟩🟩