Gokul Swamy (@g_k_swamy) Twitter Tweets • TwiDoom

Gokul Swamy

@g_k_swamy

phd candidate @CMU_Robotics. ms @berkeley_ai. summers @GoogleAI, @msftresearch, @aurora_inno, @nvidia, @spacex. no model is an island.

ID: 1077849302326697985

Link: https://gokul.dev/

Joined: 26-12-2018 08:51:13

465 Tweets

2.2K Followers

1.1K Following

cs-sop.org

@cs_sop_org

2 years ago

As PhD application deadlines approach, we are super excited to announce cs-sop.org, created by Zhaofeng Wu Alexis Ross @ZejiangS💻 cs-sop.org is a platform with statements of purpose generously shared by previous applicants to CS PhD programs 🧵(1/n)

668 likes, 8 replies, 215 reposts

Aurora

@aurora_inno

2 months ago

Our co-founder and Chief Scientist Drew Bagnell shares the next part of his AI blog series focused on AI alignment and our approach to ensuring the safety of the Aurora Driver. Read the blog here: bit.ly/3WBFGIT

28 likes, 0 replies, 7 reposts

Gokul Swamy

@g_k_swamy

2 months ago

Danke Schön, Vienna! #ICML2024

67 likes, 1 reply, 1 repost

Gokul Swamy

@g_k_swamy

2 months ago

bumpin’ that

5 likes, 2 replies, 0 reposts

RL Beyond Rewards Workshop

@rlbrew_2024

2 months ago

It is officially less than a week before the workshop begins⌛️ The workshop schedule is posted here: rlbrew-workshop.github.io/schedule.html A complete list of accepted papers can be found here: rlbrew-workshop.github.io/papers.html

7 likes, 0 replies, 4 reposts

Gokul Swamy

@g_k_swamy

a month ago

Come enjoy some brews after RLBReW :)!

7 likes, 0 replies, 0 reposts

Gokul Swamy

@g_k_swamy

a month ago

I'll be at #RLC2024, helping organize the RL Beyond Rewards Workshop, cheering proudly for our 2 orals at the RL Safety Workshop (rlsafetyworkshop.github.io) and perhaps posting a meme or two on RL_Conference! As usual, DM if you'd like to talk imitation, RLHF, or what's next :).

30 likes, 0 replies, 1 repost

Gokul Swamy

@g_k_swamy

a month ago

10th floor!

4 likes, 0 replies, 0 reposts

RL Beyond Rewards Workshop

@rlbrew_2024

a month ago

We had a thriving morning poster session, followed by another one at 3:30. Swing on by to the Marriott Center on the 11th Floor then to participate!

5 likes, 0 replies, 2 reposts

Gokul Swamy

@g_k_swamy

a month ago

If you enjoyed / missed Jingwu Tang's talk on multi-agent IL (arxiv.org/abs/2406.04219) or Nico's talk on efficient IRL without compounding errors (rlbrew-workshop.github.io/papers/15_effi…) at #RLC2024, stop by the RL Safety / RL Beyond Rewards Workshop poster sessions this afternoon to hear more!

33 likes, 2 replies, 6 reposts

Harshit Sikchi

@harshit_sikchi

a month ago

Maybe this alone made RLC worth it 🥹

528 likes, 14 replies, 9 reposts

Gokul Swamy

@g_k_swamy

a month ago

If you weren’t able to make it to our workshop, the talks are now public!

13 likes, 0 replies, 0 reposts

Gokul Swamy

@g_k_swamy

25 days ago

Cool variant of SPO that learns a latent-conditioned policy for *controllable* generation. Leverages an under-appreciated benefit of preference models: they always produce outputs in [0, 1], making outputs more comparable / tradeoffs more reasonable than across reward models.

20 likes, 1 reply, 4 reposts

Abhishek Gupta

@abhishekunique7

24 days ago

Sriyash Poddar Yanming Wan Given latent conditional reward, optimizing policies with this is hard, due to scale ambiguity in RLHF methods. We show that methods like self-play optimization (SPO from Gokul Swamy) can help, since rewards correspond to likelihoods instead of arbitrarily scaled utilities (3/7)

1 like, 1 reply, 1 repost

Sanjiban Choudhury

@sanjibac

23 days ago

It was a very humbling and optimistic experience to spend a week coding with these high school students. I barely knew how to code at their age, and some of these students were coding up complex search algorithms on real robots in a matter of hours. Thank you CATALYST students!!

24 likes, 0 replies, 6 reposts

Gokul Swamy

@g_k_swamy

15 days ago

The more time I spend in my PhD, the more I find myself agreeing with these points.

24 likes, 1 reply, 1 repost

Gokul Swamy

@g_k_swamy

12 days ago

ideal PhD application SOP:

24 likes, 2 replies, 0 reposts

Murtaza Dalal

@mihdalal

9 days ago

Can a single neural network policy generalize over poses, objects, obstacles, backgrounds, scene arrangements, in-hand objects, and start/goal states? Introducing Neural MP: A generalist policy for solving motion planning tasks in the real world 🤖 1/N

237 likes, 4 replies, 61 reposts

Kensuke Nakamura

@kensukenk

8 days ago

Not all prediction errors are made equal! In our new #corl2024 paper, we use the mathematical notion of regret to automatically identify when prediction failures actually led to downstream performance degradation. Website: cmu-intentlab.github.io/not-all-errors/ (1/n)

47 likes, 1 reply, 10 reposts