Homanga Bharadhwaj

hbharadh at cs dot cmu dot edu

I am a PhD student in the Robotics Institute, School of Computer Science, Carnegie Mellon University, advised by Abhinav Gupta and Shubham Tulsiani. I am engaged in the quest to understand intelligence by trying to simulate it. Although this quest has kept me fully occupied for the past three years, I also paint, and have a Bachelor of Arts degree in Fine Arts. Some of my paintings can be found here.

I am fortunate to have been advised by fantastic researchers over the course of my studies. In reverse chronological order, I have been a student researcher at Google Brain, a research intern at Nvidia Research, a graduate student in the CS Department, University of Toronto and a student researcher at the Vector Institute, a research intern at Preferred Networks Inc., Tokyo, and an undergrad in the CSE Department, IIT Kanpur. As an undergrad, I spent semester breaks at Mila, Montreal and the National University of Singapore.

Email  /  Github  /  Google Scholar  /  Twitter  /  Name


If you have any questions or want to collaborate, feel free to send me an email! I am always excited to learn by talking with people.


I'm interested in developing autonomous agents that learn from raw sensory inputs efficiently, and are safe to interact with. To this end, I have worked on exploration for robot learning, robustness in machine learning, and improving sample efficiency, scalability, and representations in reinforcement learning.

I recently gave an invited talk at the Talking Robotics seminar series. link

Our research on safe exploration was covered by VentureBeat.

Most significant bits
Safety / Robustness
Auditing Robot Learning for Safety and Compliance during Deployment
Homanga Bharadhwaj
CoRL 2021 (Blue Sky Position Paper) paper reviews  

Auditing robot learning algorithms for safety and compliance is important to ensure human compatibility. This position paper argues for increased synergy between the robot learning and AI Alignment communities and describes a preliminary audit framework based on human-in-the-loop learning.

Auditing AI models for Verified Deployment under Semantic Specifications
Homanga Bharadhwaj, De-An Huang, Chaowei Xiao, Anima Anandkumar, Animesh Garg
under review paper blog  

Auditing deep-learning models for human-interpretable specifications prior to deployment is important for preventing unintended consequences. These specifications can be obtained by considering variations in an interpretable latent space of a generative model.

Conservative Safety Critics for Exploration
Homanga Bharadhwaj, Aviral Kumar, Nicholas Rhinehart, Sergey Levine, Florian Shkurti, Animesh Garg
ICLR 2021 paper website reviews  

Training a critic to make conservative safety estimates by over-estimating how unsafe a particular state is can significantly reduce the number of catastrophic failures in constrained RL.

DIBS: Diversity inducing Information Bottleneck in Model Ensembles
Samarth Sinha*, Homanga Bharadhwaj*, Anirudh Goyal, Hugo Larochelle, Animesh Garg, Florian Shkurti
AAAI, 2021 (and ICML-UDL, 2020) paper  

Explicitly maximizing diversity in ensembles through adversarial learning helps improve generalization, transfer, and uncertainty estimation.

Exploration / Representation Learning
Latent Skill Exploration for Planning and Transfer
Kevin (Cheng) Xie*, Homanga Bharadhwaj*, Danijar Hafner, Animesh Garg, Florian Shkurti
ICLR 2021 paper website reviews  

Combining online planning of high level skills with an amortized low level policy can improve sample-efficiency of model-based RL for solving complex tasks, and transferring across tasks with similar dynamics.

Model-Predictive Planning via Cross-Entropy and Gradient-Based Optimization
Homanga Bharadhwaj*, Kevin (Cheng) Xie*, Florian Shkurti
L4DC, 2020 paper code reviews  

Updating the top action sequences identified by CEM with a few gradient steps improves the sample efficiency and performance of planning in model-based RL.
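The idea above can be illustrated with a minimal sketch: run CEM on a differentiable cost, then refine the elite action sequences with a few gradient steps before refitting the sampling distribution. This is a toy illustration under my own assumptions (a generic differentiable `cost_fn`, arbitrary hyperparameters), not the paper's implementation.

```python
import torch

def plan_cem_grad(cost_fn, horizon=5, act_dim=2, pop=64, n_elites=8,
                  iters=5, grad_steps=3, lr=0.1):
    """Toy CEM planner whose elite action sequences are refined by gradient descent.

    cost_fn: differentiable function mapping (pop, horizon, act_dim) -> (pop,) costs.
    Returns the mean action sequence, shape (horizon, act_dim).
    """
    mean = torch.zeros(horizon, act_dim)
    std = torch.ones(horizon, act_dim)
    for _ in range(iters):
        # Sample candidate action sequences from the current Gaussian.
        samples = mean + std * torch.randn(pop, horizon, act_dim)
        costs = cost_fn(samples)
        elite_idx = costs.topk(n_elites, largest=False).indices
        # Refine only the elite sequences with a few gradient steps.
        elites = samples[elite_idx].clone().requires_grad_(True)
        opt = torch.optim.SGD([elites], lr=lr)
        for _ in range(grad_steps):
            opt.zero_grad()
            cost_fn(elites).sum().backward()
            opt.step()
        elites = elites.detach()
        # Refit the sampling distribution to the gradient-improved elites.
        mean, std = elites.mean(0), elites.std(0) + 1e-6
    return mean
```

On a simple quadratic cost, the gradient refinement pulls the elites toward the optimum faster than resampling alone, which is the intuition behind combining the two optimizers.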

Continual Model-Based Reinforcement Learning with Hypernetworks
Philip Huang, Kevin (Cheng) Xie, Homanga Bharadhwaj, Florian Shkurti
ICRA, 2021 and Deep RL Workshop (NeurIPS 20) paper blog  

Task-conditioned hypernetworks can be used to continually adapt to varying environment dynamics, with a limited replay buffer in lifelong robot learning.
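As a minimal sketch of the task-conditioned hypernetwork idea: a small network maps a task embedding to the parameters of a dynamics model. For brevity the generated model is a single linear layer, and all names and sizes here are my own illustrative assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class TaskConditionedHypernet(nn.Module):
    """Hypernetwork mapping a task embedding to the weights of a
    (deliberately tiny) linear dynamics model."""

    def __init__(self, task_dim, in_dim, out_dim, hidden=64):
        super().__init__()
        self.in_dim, self.out_dim = in_dim, out_dim
        # Generate the target layer's weight matrix and bias from the task code.
        self.gen = nn.Sequential(
            nn.Linear(task_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, out_dim * in_dim + out_dim),
        )

    def forward(self, task_emb, x):
        params = self.gen(task_emb)
        n_w = self.out_dim * self.in_dim
        W = params[:n_w].view(self.out_dim, self.in_dim)
        b = params[n_w:]
        # Apply the generated linear dynamics model to the input.
        return x @ W.t() + b
```

Because the task embedding, rather than a large replay buffer, carries the per-task information, only the embedding and the shared generator need to be maintained as dynamics change.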

LEAF: Latent Exploration Along the Frontier
Homanga Bharadhwaj, Animesh Garg, Florian Shkurti
ICRA, 2021 paper  

Keeping track of the currently reachable frontier of states, executing a deterministic policy to reach the frontier, and following a stochastic policy beyond it can help facilitate principled exploration in RL.

MANGA: Method Agnostic Neural-policy Generalization and Adaptation
Homanga Bharadhwaj, Shoichiro Yamaguchi, Shin-ichi Maeda
ICRA, 2020 paper  

Training dynamics-conditioned policies on dynamics-randomized environments and estimating dynamics parameters from off-policy data can help achieve zero-shot adaptation in an unseen test environment.

A Data-Efficient Framework for Training and Sim-to-Real Transfer of Navigation Policies
Homanga Bharadhwaj*, Zihan Wang*, Yoshua Bengio, Liam Paull
ICRA, 2019 paper  

Adversarial domain adaptation can be used to train a gradient-descent-based planner in simulation and transfer the learned model to a real navigation environment.

D2RL: Deep Dense Architectures in Reinforcement Learning
Samarth Sinha*, Homanga Bharadhwaj*, Aravind Srinivas, Animesh Garg
Deep RL Workshop (NeurIPS 20) paper blog code reviews  

Introducing skip connections in the policy and Q-function neural networks can improve the sample efficiency of reinforcement learning algorithms across different continuous control environments.
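The D2RL-style dense connection can be sketched in a few lines: the raw input is concatenated to the input of every hidden layer, not just the first. This is a minimal illustration with hyperparameters of my own choosing, not the paper's exact networks.

```python
import torch
import torch.nn as nn

class D2RLMLP(nn.Module):
    """MLP with dense (D2RL-style) connections: the network input is
    concatenated to every hidden layer's input."""

    def __init__(self, in_dim, out_dim, hidden=256, depth=4):
        super().__init__()
        layers = [nn.Linear(in_dim, hidden)]
        for _ in range(depth - 1):
            # Later layers see both the previous features and the raw input.
            layers.append(nn.Linear(hidden + in_dim, hidden))
        self.hidden_layers = nn.ModuleList(layers)
        self.out = nn.Linear(hidden, out_dim)

    def forward(self, x):
        h = torch.relu(self.hidden_layers[0](x))
        for layer in self.hidden_layers[1:]:
            h = torch.relu(layer(torch.cat([h, x], dim=-1)))
        return self.out(h)
```

The same module can serve as the trunk of either a policy (outputting action parameters) or a Q-function (taking a state-action concatenation as input).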

DiVA: Diverse Visual Feature Aggregation for Deep Metric Learning
Timo Milbich*, Karsten Roth*, Homanga Bharadhwaj, Samarth Sinha, Yoshua Bengio, Bjorn Ommer, Joseph Paul Cohen
ECCV, 2020 paper code  

Appropriately augmenting training with multiple complementary tasks can improve generalization in Deep Metric Learning.

Generalized Adversarially Learned Inference
Yatin Dandi, Homanga Bharadhwaj, Abhishek Kumar, Piyush Rai
AAAI, 2021 paper  

Adversarially learned inference can be generalized to incorporate multiple layers of feedback through reconstructions, self-supervision, and learned knowledge.

A Generative Framework for Zero Shot Learning with Adversarial Domain Adaptation
Varun Khare, Divyat Mahajan, Homanga Bharadhwaj, VK Verma, Piyush Rai
WACV, 2020 paper code  

Adversarial domain adaptation, appropriately incorporated into a generative zero-shot learning model, can help minimize domain shift and significantly enhance generalization to unseen test classes.

RecGAN: Recurrent Generative Adversarial Networks for Recommendation Systems
Homanga Bharadhwaj, Homin Park, Brian Y. Lim
RecSys, 2018 paper

Recurrent Neural Network based Generative Adversarial Networks can learn to effectively model the latent preference trends of users in time-series recommendation.

My tryst with HCI research
De-anonymization of authors through arXiv submissions during double-blind peer review
Homanga Bharadhwaj, Dylan Turpin, Animesh Garg, Ashton Anderson
arXiv, 2020 paper blog  

In an analysis of ICLR 2019 and 2020 submissions, we find that for well-known authors, papers with arXiv preprints released during the review period have higher acceptance rates than papers without such preprints.

New tab page recommendations cause a strong suppression of exploratory web browsing behaviors
Homanga Bharadhwaj, Nisheeth Srivastava
WebSci, 2019

Passive website recommendations embedded in the new-tab displays of browsers (which recommend based on frecency) inhibit people's propensity to visit diverse information sources on the internet.

My freshman year dabble with Quantum Entanglement

Phase matching in Spontaneous Parametric Down Conversion
Suman Karan, Shaurya Aarav, Homanga Bharadhwaj, Lavanya Taneja, Girish Kulkarni, Anand K Jha
Journal of Optics (Accepted 2020)

Spontaneous Parametric Down Conversion (SPDC) is used to generate entangled photon pairs. SPDC can be studied through the lens of wave optics by making some simplifying theoretical assumptions without compromising agreement with empirical results, and under these assumptions a simulation of SPDC can be conveniently designed.


Teaching Experience and Service

Teaching Assistant (TA),
Introduction to Mobile Robotics (CSC477), Fall 2020

Teaching Assistant (TA),
Algorithmic Intelligence in Robotics (CSC375), Fall 2020

Teaching Assistant (TA),
Computational Cognitive Science (CS786), Winter 2019

Course Project Mentor,
Topics in Probabilistic Modeling and Inference (CS698), Winter 2019

Course Project Mentor,
Introduction to Machine Learning (CS771), Autumn 2018


Student Volunteer at ACM RecSys 2018, Vancouver, BC

Student Volunteer at ACM SIGIR 2018, Ann Arbor, Michigan

Reviewer for Computers in Human Behavior, ICRA 2019, ICRA 2020, IROS 2020, CoRL 2020, ACM ToCHI

I love his website design.

Miscellaneous stuff - Co-authors  /  Fav books  /  My paintings  /  Travel  /  Love  /  Name
