Homanga Bharadhwaj

hbharadh at cs dot cmu dot edu

I am a PhD student in the Robotics Institute, School of Computer Science, Carnegie Mellon University, advised by Abhinav Gupta and Shubham Tulsiani. I am engaged in the quest to understand intelligence by trying to simulate it. Although this quest has kept me fully occupied for the past three years, I also paint, and have a Bachelor of Arts degree in Fine Arts. Some of my paintings can be found here.

I am fortunate to have been advised by fantastic researchers over the course of my studies. In reverse chronological order, I have been a student researcher at Google Brain, a research intern at Nvidia Research, a graduate student in the CS Department, University of Toronto and a student researcher at the Vector Institute, a research intern at Preferred Networks Inc., Tokyo, and an undergrad in the CSE Department, IIT Kanpur. As an undergrad, I spent semester breaks at Mila, Montreal and the National University of Singapore.

Email  /  Github  /  Google Scholar  /  Twitter  /  Name


If you have any questions or want to collaborate, feel free to send me an email! I am always excited to learn by talking with people.


I'm interested in developing autonomous agents that learn from raw sensory inputs efficiently, and are safe to interact with. To this end, I have worked on exploration for robot learning, robustness in machine learning, and improving sample efficiency, scalability, and representations in reinforcement learning.

I recently gave an invited talk at the Talking Robotics seminar series. link

Our research on safe exploration was covered by VentureBeat.

Most significant bits
Safety / Robustness
Auditing Robot Learning for Safety and Compliance during Deployment
Homanga Bharadhwaj
CoRL 2021 (Blue Sky Position Paper) paper reviews  

Auditing robot learning algorithms for safety and compliance is important to ensure human compatibility. This position paper argues for increased synergy between the robot learning and AI Alignment communities and describes a preliminary audit framework based on human-in-the-loop learning.

Auditing AI models for Verified Deployment under Semantic Specifications
Homanga Bharadhwaj, De-An Huang, Chaowei Xiao, Anima Anandkumar, Animesh Garg
under review paper blog  

Auditing deep-learning models for human-interpretable specifications prior to deployment is important for preventing unintended consequences. These specifications can be obtained by considering variations in an interpretable latent space of a generative model.

Conservative Safety Critics for Exploration
Homanga Bharadhwaj, Aviral Kumar, Nicholas Rhinehart, Sergey Levine, Florian Shkurti, Animesh Garg
ICLR 2021 paper website reviews  

Training a critic to make conservative safety estimates by over-estimating how unsafe a particular state is can significantly reduce the number of catastrophic failures in constrained RL.

DIBS: Diversity inducing Information Bottleneck in Model Ensembles
Samarth Sinha*, Homanga Bharadhwaj*, Anirudh Goyal, Hugo Larochelle, Animesh Garg, Florian Shkurti
AAAI, 2021 (and ICML-UDL, 2020) paper  

Explicitly maximizing diversity in ensembles through adversarial learning helps improve generalization, transfer, and uncertainty estimation.

Exploration / Representation Learning
Latent Skill Exploration for Planning and Transfer
Kevin (Cheng) Xie*, Homanga Bharadhwaj*, Danijar Hafner, Animesh Garg, Florian Shkurti
ICLR 2021 paper website reviews  

Combining online planning of high level skills with an amortized low level policy can improve sample-efficiency of model-based RL for solving complex tasks, and transferring across tasks with similar dynamics.

Model-Predictive Planning via Cross-Entropy and Gradient-Based Optimization
Homanga Bharadhwaj*, Kevin (Cheng) Xie*, Florian Shkurti
L4DC, 2020 paper code reviews  

Updating the top action sequences identified by CEM with a few gradient steps improves the sample efficiency and performance of planning in model-based RL.
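The idea above can be illustrated with a minimal sketch: run CEM on a differentiable cost, then refine the elite action sequences with a few gradient steps before refitting the sampling distribution. This is a toy illustration under my own assumptions (a generic differentiable `cost_fn`, arbitrary hyperparameters), not the paper's implementation.

```python
import torch

def plan_cem_grad(cost_fn, horizon=5, act_dim=2, pop=64, n_elites=8,
                  iters=5, grad_steps=3, lr=0.1):
    """Toy CEM planner whose elite action sequences are refined by gradient descent.

    cost_fn: differentiable function mapping (pop, horizon, act_dim) -> (pop,) costs.
    Returns the mean action sequence, shape (horizon, act_dim).
    """
    mean = torch.zeros(horizon, act_dim)
    std = torch.ones(horizon, act_dim)
    for _ in range(iters):
        # Sample candidate action sequences from the current Gaussian.
        samples = mean + std * torch.randn(pop, horizon, act_dim)
        costs = cost_fn(samples)
        elite_idx = costs.topk(n_elites, largest=False).indices
        # Refine only the elite sequences with a few gradient steps.
        elites = samples[elite_idx].clone().requires_grad_(True)
        opt = torch.optim.SGD([elites], lr=lr)
        for _ in range(grad_steps):
            opt.zero_grad()
            cost_fn(elites).sum().backward()
            opt.step()
        elites = elites.detach()
        # Refit the sampling distribution to the gradient-improved elites.
        mean, std = elites.mean(0), elites.std(0) + 1e-6
    return mean
```

On a simple quadratic cost, the gradient refinement pulls the elites toward the optimum faster than resampling alone, which is the intuition behind combining the two optimizers.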

Continual Model-Based Reinforcement Learning with Hypernetworks
Philip Huang, Kevin (Cheng) Xie, Homanga Bharadhwaj, Florian Shkurti
ICRA, 2021 and Deep RL Workshop (NeurIPS 20) paper blog  

Task-conditioned hypernetworks can be used to continually adapt to varying environment dynamics, with a limited replay buffer in lifelong robot learning.
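As a minimal sketch of the task-conditioned hypernetwork idea: a small network maps a task embedding to the parameters of a dynamics model. For brevity the generated model is a single linear layer, and all names and sizes here are my own illustrative assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class TaskConditionedHypernet(nn.Module):
    """Hypernetwork mapping a task embedding to the weights of a
    (deliberately tiny) linear dynamics model."""

    def __init__(self, task_dim, in_dim, out_dim, hidden=64):
        super().__init__()
        self.in_dim, self.out_dim = in_dim, out_dim
        # Generate the target layer's weight matrix and bias from the task code.
        self.gen = nn.Sequential(
            nn.Linear(task_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, out_dim * in_dim + out_dim),
        )

    def forward(self, task_emb, x):
        params = self.gen(task_emb)
        n_w = self.out_dim * self.in_dim
        W = params[:n_w].view(self.out_dim, self.in_dim)
        b = params[n_w:]
        # Apply the generated linear dynamics model to the input.
        return x @ W.t() + b
```

Because the task embedding, rather than a large replay buffer, carries the per-task information, only the embedding and the shared generator need to be maintained as dynamics change.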

LEAF: Latent Exploration Along the Frontier
Homanga Bharadhwaj, Animesh Garg, Florian Shkurti
ICRA, 2021 paper  

Keeping track of the currently reachable frontier of states, executing a deterministic policy to reach the frontier, and following a stochastic policy beyond it can help facilitate principled exploration in RL.

MANGA: Method Agnostic Neural-policy Generalization and Adaptation
Homanga Bharadhwaj, Shoichiro Yamaguchi, Shin-ichi Maeda
ICRA, 2020 paper  

Training dynamics-conditioned policies on dynamics-randomized environments and estimating dynamics parameters from off-policy data can help achieve zero-shot adaptation in an unseen test environment.

A Data-Efficient Framework for Training and Sim-to-Real Transfer of Navigation Policies
Homanga Bharadhwaj*, Zihan Wang*, Yoshua Bengio, Liam Paull
ICRA, 2019 paper  

Adversarial domain adaptation can be used to train a gradient-descent-based planner in simulation and transfer the learned model to a real navigation environment.

D2RL: Deep Dense Architectures in Reinforcement Learning
Samarth Sinha*, Homanga Bharadhwaj*, Aravind Srinivas, Animesh Garg
Deep RL Workshop (NeurIPS 20) paper blog code reviews  

Introducing skip connections in the policy and Q-function neural networks can improve the sample efficiency of reinforcement learning algorithms across different continuous control environments.
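The D2RL-style dense connection can be sketched in a few lines: the raw input is concatenated to the input of every hidden layer, not just the first. This is a minimal illustration with hyperparameters of my own choosing, not the paper's exact networks.

```python
import torch
import torch.nn as nn

class D2RLMLP(nn.Module):
    """MLP with dense (D2RL-style) connections: the network input is
    concatenated to every hidden layer's input."""

    def __init__(self, in_dim, out_dim, hidden=256, depth=4):
        super().__init__()
        layers = [nn.Linear(in_dim, hidden)]
        for _ in range(depth - 1):
            # Later layers see both the previous features and the raw input.
            layers.append(nn.Linear(hidden + in_dim, hidden))
        self.hidden_layers = nn.ModuleList(layers)
        self.out = nn.Linear(hidden, out_dim)

    def forward(self, x):
        h = torch.relu(self.hidden_layers[0](x))
        for layer in self.hidden_layers[1:]:
            h = torch.relu(layer(torch.cat([h, x], dim=-1)))
        return self.out(h)
```

The same module can serve as the trunk of either a policy (outputting action parameters) or a Q-function (taking a state-action concatenation as input).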

DiVA: Diverse Visual Feature Aggregation for Deep Metric Learning
Timo Milbich*, Karsten Roth*, Homanga Bharadhwaj, Samarth Sinha, Yoshua Bengio, Bjorn Ommer, Joseph Paul Cohen
ECCV, 2020 paper code  

Appropriately augmenting training with multiple complementary tasks can improve generalization in Deep Metric Learning.

Generalized Adversarially Learned Inference
Yatin Dandi, Homanga Bharadhwaj, Abhishek Kumar, Piyush Rai
AAAI, 2021 paper  

Adversarially learned inference can be generalized to incorporate multiple layers of feedback through reconstructions, self-supervision, and learned knowledge.

A Generative Framework for Zero Shot Learning with Adversarial Domain Adaptation
Varun Khare, Divyat Mahajan, Homanga Bharadhwaj, VK Verma, Piyush Rai
WACV, 2020 paper code  

Adversarial domain adaptation, appropriately incorporated into a generative zero-shot learning model, can help minimize domain shift and significantly enhance generalization to unseen test classes.

RecGAN: Recurrent Generative Adversarial Networks for Recommendation Systems
Homanga Bharadhwaj, Homin Park, Brian Y. Lim
RecSys, 2018 paper

Recurrent Neural Network based Generative Adversarial Networks can learn to effectively model the latent preference trends of users in time-series recommendation.

My tryst with HCI research
De-anonymization of authors through arXiv submissions during double-blind peer review
Homanga Bharadhwaj, Dylan Turpin, Animesh Garg, Ashton Anderson
arXiv, 2020 paper blog  

In an analysis of ICLR 2019 and 2020 submissions, we find that for well-known authors, papers with arXiv preprints released during the review period have higher acceptance rates than papers without such preprints.

New tab page recommendations cause a strong suppression of exploratory web browsing behaviors
Homanga Bharadhwaj, Nisheeth Srivastava
WebSci, 2019

Passive website recommendations embedded in the new-tab displays of browsers (which recommend based on frecency) inhibit people's propensity to visit diverse information sources on the internet.

My freshman year dabble with Quantum Entanglement

Phase matching in Spontaneous Parametric Down Conversion
Suman Karan, Shaurya Aarav, Homanga Bharadhwaj, Lavanya Taneja, Girish Kulkarni, Anand K Jha
Journal of Optics (Accepted 2020)

Spontaneous Parametric Down Conversion (SPDC) is used to generate entangled photon pairs. SPDC can be studied through the lens of wave optics by making some simplifying theoretical assumptions without compromising agreement with empirical results, and under these assumptions a simulation of SPDC can be conveniently designed.


Teaching Experience and Service

Teaching Assistant (TA),
Introduction to Mobile Robotics (CSC477), Fall 2020

Teaching Assistant (TA),
Algorithmic Intelligence in Robotics (CSC375), Fall 2020

Teaching Assistant (TA),
Computational Cognitive Science (CS786), Winter 2019

Course Project Mentor,
Topics in Probabilistic Modeling and Inference (CS698), Winter 2019

Course Project Mentor,
Introduction to Machine Learning (CS771), Autumn 2018


Student Volunteer at ACM RecSys 2018, Vancouver, BC

Student Volunteer at ACM SIGIR 2018, Ann Arbor, Michigan

Reviewer for Computers in Human Behavior, ICRA 2019, ICRA 2020, IROS 2020, CoRL 2020, ACM ToCHI

I love his website design.

Miscellaneous stuff - Co-authors  /  Fav books  /  My paintings  /  Travel  /  Love  /  Name
