Ankesh Anand

Hi! I am Ankesh Anand, a Research Scientist in AI at Google DeepMind, London. I am a core contributor to the Gemini project, working on RL for reasoning and new capabilities. Most recently, I played a key role in building Gemini 2.5 Pro, 2.5 Flash and Project Mariner.

I previously did a PhD at Mila with Aaron Courville, working on Self-Supervised Learning and Reinforcement Learning. I have also worked as a research intern at DeepMind, London with Jessica Hamrick, and at Microsoft Research, Montreal with Devon Hjelm and Philip Bachman.

Earlier, I graduated from IIT Kharagpur with a Bachelor's and Master's in Mathematics and Computing. I have also spent time at Visa, HackerEarth and Google Summer of Code.

Check out my recent blog post on how we should think about RL in the era of foundation models!

Publications

Procedural Generalization by Planning with Self-Supervised World Models

Ankesh Anand, Jacob Walker, Yazhe Li, Eszter Vértes, Julian Schrittwieser, Sherjil Ozair, Théophane Weber, Jessica B. Hamrick

ICLR 2022

Pretraining Representations for Data-Efficient Reinforcement Learning

Max Schwarzer, Nitarshan Rajkumar, Michael Noukhovitch, Ankesh Anand, Laurent Charlin, Devon Hjelm, Philip Bachman, Aaron Courville

NeurIPS 2021

Data-Efficient RL with Self-Predictive Representations

Max Schwarzer*, Ankesh Anand*, Rishab Goel, R Devon Hjelm, Aaron Courville, Philip Bachman

ICLR 2021, Spotlight

Unsupervised State Representation Learning in Atari

Ankesh Anand*, Evan Racah*, Sherjil Ozair*, Yoshua Bengio, Marc-Alexandre Côté, R Devon Hjelm

NeurIPS 2019

Blindfold Baselines for Embodied QA

Ankesh Anand, Eugene Belilovsky, Kyle Kastner, Hugo Larochelle, Aaron Courville

ViGIL Workshop at NeurIPS 2018

HoME: a Household Multimodal Environment

Simon Brodeur, Ethan Perez*, Ankesh Anand*, Florian Golemo*, Luca Celotti, Florian Strub, Jean Rouat, Hugo Larochelle, Aaron Courville

ICLR 2018, Workshop Track

MMGAN: Manifold Matching Generative Adversarial Networks

Noseong Park, Ankesh Anand, Joel Moniz, Kookjin Lee, Tanmoy Chakraborty, J Choo, H Park, Youngmin Kim

ICPR 2018

We used Neural Networks to Detect Clickbaits: You won't believe what happened Next!

Ankesh Anand, Tanmoy Chakraborty, Noseong Park

ECIR 2017