Homanga Bharadhwaj

homanga at cs dot toronto dot edu

Hi! I am a second-year graduate student in the Department of Computer Science at the University of Toronto, supervised by Florian Shkurti and Animesh Garg, and am closely collaborating with Sergey Levine at UC Berkeley. I am also a student researcher at the Vector Institute. My studies are supported by the Vector AI Scholarship and scholarships from the Department of Computer Science. I am engaged in the quest for Artificial General Intelligence. Although this quest has kept me fully occupied for the past three years, I also paint, and have a Bachelor of Arts degree in Fine Arts. Some of my paintings can be found here.

I am fortunate to have been advised by fantastic researchers over the course of my undergraduate studies. Most recently, I worked with Shin-ichi Maeda at Preferred Networks Inc., Tokyo on transferring policies to different dynamics configurations at test time. Previously, I worked under the guidance of Yoshua Bengio and Liam Paull at Mila, Montreal during the summer of 2018. My work at Mila focused on the problem of sim-to-real transfer of deep-learning-based planning algorithms in robotic navigation. Prior to this, I spent a wonderful winter at NUS, Singapore in Brian Lim's lab. I continued collaborating remotely and successfully completed two projects on Recommendation Systems and Multi-Task Learning respectively. At IIT Kanpur, I worked with Nisheeth Srivastava on Human-Computer Interaction and with Piyush Rai on Zero-Shot Learning and Program Correction.

I am currently a research scientist intern at Nvidia Research with the AI Algorithms team.

Email  /  Github  /  Google Scholar  /  Twitter  /  Name


If you have any questions or want to collaborate, feel free to send me an email! I am always excited to learn more by talking with people.


I'm interested in deep reinforcement learning, robotics, machine learning, computer vision and explainable AI. Much of my current research is about exploration in the context of robotic policy learning. Previously, I focused on developing efficient planning algorithms for robotic navigation/locomotion and learning transferable representations. Even before that, I was involved in designing deep learning (and meta-learning) based algorithms, primarily for recommender systems, and trying to make those algorithms explainable. I have explored a wide breadth of research topics around AI during my undergraduate studies, which have led to multiple primary-author publications. Some of my papers are mentioned below.

I recently gave an invited talk at the Talking Robotics seminar series. link

Our research on safe exploration was covered by VentureBeat.

Most significant bits
New Analysis
De-anonymization of authors through arXiv submissions during double-blind peer review
Homanga Bharadhwaj, Dylan Turpin, Animesh Garg, Ashton Anderson
arXiv, 2020 paper blog  

In an analysis of ICLR 2020 and 2019 papers, we find a positive correlation between releasing preprints on arXiv and acceptance rates of papers by well-known authors. For well-known authors, acceptance rates for papers with arXiv preprints are higher than for those without preprints released during review.

Skill Transfer via Partially Amortized Hierarchical Planning
Kevin (Cheng) Xie*, Homanga Bharadhwaj*, Danijar Hafner, Animesh Garg, Florian Shkurti
ICLR 2021 paper website reviews  

Combining online planning of high-level skills with an amortized low-level policy can improve the sample efficiency of model-based RL for solving complex tasks and for transferring across tasks with similar dynamics.

D2RL: Deep Dense Architectures in Reinforcement Learning
Samarth Sinha*, Homanga Bharadhwaj*, Aravind Srinivas, Animesh Garg
Deep RL Workshop (NeurIPS 20) paper blog code reviews  

Introducing skip connections in the policy and Q function neural networks can improve sample efficiency of reinforcement learning algorithms across different continuous control environments

Conservative Safety Critics for Exploration
Homanga Bharadhwaj, Aviral Kumar, Nicholas Rhinehart, Sergey Levine, Florian Shkurti, Animesh Garg
ICLR 2021 paper website reviews  

Training a critic to make conservative safety estimates, by overestimating how unsafe a particular state is, can significantly reduce the number of catastrophic failures in constrained RL

Model-Predictive Planning via Cross-Entropy and Gradient-Based Optimization
Homanga Bharadhwaj*, Kevin (Cheng) Xie*, Florian Shkurti
L4DC, 2020 paper code reviews  

Updating the top action sequences identified by CEM through a few gradient steps helps improve sample efficiency and performance of planning in Model-based RL

Continual Model-Based Reinforcement Learning with Hypernetworks
Philip Huang, Kevin (Cheng) Xie, Homanga Bharadhwaj, Florian Shkurti
ICRA, 2021 and Deep RL Workshop (NeurIPS 20) paper blog  

Task-conditioned hypernetworks can be used to continually adapt to varying environment dynamics, with a limited replay buffer in lifelong robot learning

LEAF: Latent Exploration Along the Frontier
Homanga Bharadhwaj, Animesh Garg, Florian Shkurti
ICRA, 2021 paper  

Keeping track of the currently reachable frontier of states, and executing a deterministic policy to reach the frontier followed by a stochastic policy beyond, can help facilitate principled exploration in RL

MANGA: Method Agnostic Neural-policy Generalization and Adaptation
Homanga Bharadhwaj, Shoichiro Yamaguchi, Shin-ichi Maeda
ICRA, 2020 paper  

Training dynamics-conditioned policies on dynamics-randomized environments and estimating dynamics parameters from off-policy data can help achieve zero-shot adaptation in an unseen test environment

A Data-Efficient Framework for Training and Sim-to-Real Transfer of Navigation Policies
Homanga Bharadhwaj*, Zihan Wang*, Yoshua Bengio, Liam Paull
ICRA, 2019 paper  

Adversarial domain adaptation can be used for training a gradient-descent-based planner in simulation and transferring the learned model to a real navigation environment.

DIBS: Diversity inducing Information Bottleneck in Model Ensembles
Samarth Sinha*, Homanga Bharadhwaj*, Anirudh Goyal, Hugo Larochelle, Animesh Garg, Florian Shkurti
AAAI, 2021 (and ICML-UDL, 2020) paper code  

Explicitly maximizing diversity in ensembles through adversarial learning helps improve generalization, transfer, and uncertainty estimation

DiVA: Diverse Visual Feature Aggregation for Deep Metric Learning
Timo Milbich*, Karsten Roth*, Homanga Bharadhwaj, Samarth Sinha, Yoshua Bengio, Bjorn Ommer, Joseph Paul Cohen
ECCV, 2020 paper code  

Appropriately augmenting training with multiple complementary tasks can improve generalization in Deep Metric Learning.

Generalized Adversarially Learned Inference
Yatin Dandi, Homanga Bharadhwaj, Abhishek Kumar, Piyush Rai
AAAI, 2021 paper  

Adversarially learned inference can be generalized to incorporate multiple layers of feedback through reconstructions, self-supervision, and learned knowledge.

A Generative Framework for Zero Shot Learning with Adversarial Domain Adaptation
Varun Khare, Divyat Mahajan, Homanga Bharadhwaj, VK Verma, Piyush Rai
WACV, 2020 paper code  

Adversarial Domain Adaptation appropriately incorporated in a Generative Zero Shot Learning model can help minimize domain shift and significantly enhance generalization on the unseen test classes

RecGAN: Recurrent Generative Adversarial Networks for Recommendation Systems
Homanga Bharadhwaj, Homin Park, Brian Y. Lim
RecSys, 2018 paper

Recurrent Neural Network based Generative Adversarial Networks can learn to effectively model the latent preference trends of users in time-series recommendation.

A Hierarchical Multi-Task Learning Framework for Healthy Drink Recognition
Homin Park, Homanga Bharadhwaj, Brian Y. Lim
IJCNN, 2019  

A Hierarchical Multi-Task Learning model can leverage several auxiliary tasks like detection of container properties to overcome the limitation of insufficient visual cues for drinks, while predicting the healthiness of drinks from their images

My tryst with HCI research
New tab page recommendations cause a strong suppression of exploratory web browsing behaviors
Homanga Bharadhwaj, Nisheeth Srivastava
WebSci, 2019

Passive website recommendations embedded in the new tab displays of browsers (that recommend based on frecency) inhibit people's propensity to visit diverse information sources on the internet

Less significant bits
Meta-Learning for User Cold-Start Recommendation
Homanga Bharadhwaj
IJCNN, 2019  

A Meta-Learning strategy can be used to develop a recommendation model that performs reasonably well for a wide range of users and that can be cost-effectively updated at test time for a specific user

A Synchrophasor Assisted Optimal Features based Scheme for Fault Detection and Classification
Homanga Bharadhwaj, Avinash Kumar, Abheejeet Mohapatra
IJCNN, 2019

An optimal-features classifier developed using evolutionary heuristics can be used for real-time fault detection and identification

Layer-wise Relevance Propagation for Explainable Recommendations
Homanga Bharadhwaj
EARS Workshop, SIGIR, 2018

Layer-wise relevance propagation can be used for explaining the predictions of a convolutional neural network based recommendation model

Explanations for Temporal Recommendations
Homanga Bharadhwaj, Shruti Joshi
XAI Workshop, IJCAI, 2018  

A neighborhood style explanation scheme can be used as an auxiliary mechanism for interpreting the predictions of a Recurrent Neural Network based temporal recommendation model

My freshman year dabble with Quantum Entanglement

Phase matching in Spontaneous Parametric Down Conversion
Suman Karan, Shaurya Aarav, Homanga Bharadhwaj, Lavanya Taneja, Girish Kulkarni, Anand K Jha
Journal of Optics (Accepted 2020)

Spontaneous Parametric Down Conversion (SPDC) is used to generate entangled photon pairs. SPDC can be studied through the lens of Wave Optics by making some simplifying theoretical assumptions without compromising on empirical results. Also, a simulation for SPDC can be conveniently designed, given the assumptions.


Teaching Experience and Service

Teaching Assistant (TA),
Introduction to Mobile Robotics (CSC477), Fall 2020

Teaching Assistant (TA),
Algorithmic Intelligence in Robotics (CSC375), Fall 2020

Teaching Assistant (TA),
Computational Cognitive Science (CS786), Winter 2019

Course Project Mentor,
Topics in Probabilistic Modeling and Inference (CS698), Winter 2019

Course Project Mentor,
Introduction to Machine Learning (CS771), Autumn 2018


Student Volunteer at ACM RecSys 2018, Vancouver, BC

Student Volunteer at ACM SIGIR 2018, Ann Arbor, Michigan

Reviewer for Computers in Human Behavior, ICRA 2019, ICRA 2020, IROS 2020, CoRL 2020, ACM ToCHI

I love his website design.

Miscellaneous stuff - Co-authors  /  Fav books  /  My paintings  /  Travel  /  Love  /  Name
