Dipendra Misra – Medium

Learning to Generate Better than your LLM

We recently put out a paper proposing a new way of fine-tuning a Large Language Model (LLM) using a hybrid imitation learning-reinforcement…

Jul 12, 2023

HOMER: Provable Exploration in Reinforcement Learning

This week at ICML 2020, Mikael Henaff, Akshay Krishnamurthy, John Langford and I have a new paper on a new reinforcement learning (RL)…

Jul 15, 2020

Dipendra Misra

Machine learning and NLP Researcher at Microsoft Research, New York. https://dipendramisra.com/.
