Hi! I am Ankesh Anand, a Research Scientist in AI at Google DeepMind, London, working on RL for reasoning and new capabilities in Gemini models. Most recently, I played a key role in building Gemini Flash Thinking and Project Mariner.
I previously did a PhD at Mila with Aaron Courville, working on Self-Supervised Learning and Reinforcement Learning. I have also worked as a research intern at DeepMind, London, with Jessica Hamrick, and at Microsoft Research, Montreal, with Devon Hjelm and Philip Bachman.
Earlier, I graduated from IIT Kharagpur with a Bachelor's and Master's in Mathematics and Computing. I have also spent time at VISA, HackerEarth and Google Summer of Code.
Check out my recent blog post on how we should think about RL in the era of foundation models!