Hi! I am Ankesh Anand, a Research Scientist in AI at Google DeepMind, London, working on large-scale multimodal models in the Gemini project. I also worked with the Blueshift team to improve mathematical reasoning in large language models like Minerva using RL planning methods.

I recently finished working as a PhD student at Mila with Aaron Courville on Representation Learning and Reinforcement Learning. I have also worked as a research intern at DeepMind, London with Jessica Hamrick, and at Microsoft Research, Montreal with Devon Hjelm and Philip Bachman.

Earlier, I graduated from IIT Kharagpur with a Bachelor's and Master's in Mathematics and Computing. I have also spent time at VISA, HackerEarth and Google Summer of Code.

Check out my recent blog post on how we should think about RL in the era of foundation models!

Publications

Procedural Generalization by Planning with Self-Supervised World Models
Ankesh Anand, Jacob Walker, Yazhe Li, Eszter Vértes, Julian Schrittwieser, Sherjil Ozair, Théophane Weber, Jessica B. Hamrick
ICLR 2022
Pretraining Representations for Data-Efficient Reinforcement Learning
Max Schwarzer, Nitarshan Rajkumar, Michael Noukhovitch, Ankesh Anand, Laurent Charlin, Devon Hjelm, Philip Bachman, Aaron Courville
NeurIPS 2021
Data-Efficient RL with Self-Predictive Representations
Max Schwarzer*, Ankesh Anand*, Rishab Goel, R Devon Hjelm, Aaron Courville, Philip Bachman
ICLR 2021, Spotlight
Unsupervised State Representation Learning in Atari
Ankesh Anand*, Evan Racah*, Sherjil Ozair*, Yoshua Bengio, Marc-Alexandre Côté, R Devon Hjelm
NeurIPS 2019
Blindfold Baselines for Embodied QA
Ankesh Anand, Eugene Belilovsky, Kyle Kastner, Hugo Larochelle, Aaron Courville
ViGIL Workshop at NeurIPS 2018
HoME: a Household Multimodal Environment
Simon Brodeur, Ethan Perez*, Ankesh Anand*, Florian Golemo*, Luca Celotti, Florian Strub, Jean Rouat, Hugo Larochelle, Aaron Courville
ICLR 2018, Workshop Track
MMGAN: Manifold Matching Generative Adversarial Networks
Noseong Park, Ankesh Anand, Joel Moniz, Kookjin Lee, Tanmoy Chakraborty, J Choo, H Park, Youngmin Kim
ICPR 2018
We used Neural Networks to Detect Clickbaits: You won't believe what happened Next!
Ankesh Anand, Tanmoy Chakraborty, Noseong Park
ECIR 2017