Sungdong Kim (@sungdongkim4)'s Twitter Profile
Sungdong Kim

@sungdongkim4

Research Scientist @ NAVER Cloud; MS&PhD student @ KAIST #NLP #LLM #Alignment

ID: 1390479880698007555

07-05-2021 01:33:40

92 Tweets

419 Followers

183 Following

🤔 Do we always need a human preference for effective LLM alignment after an SFT stage? Our answer is NO 🙅‍♂️

We present a ✨preference-free alignment approach✨, leveraging an off-the-shelf retriever with effective regularizer functions: Regularized Relevance Reward (R^3). [1/n]