Vijay Kumar B G

I am a Senior Researcher at NEC Laboratories America. Before joining NEC, I was a Research Scientist at PARC, and prior to that I was a Research Fellow at the Australian Centre for Robotic Vision (ACRV), working with Prof. Ian Reid and Dr. Gustavo Carneiro. Before joining ACRV, I was a Researcher at the Advanced Research Group, Samsung Research. I completed my Ph.D. in Computer Science at Queen Mary University of London under Prof. Ioannis Patras. I received my MS in Electrical Engineering from IIT Madras.

Email  /  Scholar  /  LinkedIn

Research

I am interested in machine learning and computer vision, with a current focus on LLM agents, multimodal large language models, vision-language understanding, and self-supervised representation learning.

Selected Publications

Self-Training Large Language Models for Improved Visual Program Synthesis With Visual Reinforcement
Zaid Khan, Vijay Kumar B G, Samuel Schulter, Yun Fu, Manmohan Chandraker.
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2024
Project page / Paper / Arxiv
Generating Enhanced Negatives for Training Language-Based Object Detectors
Shiyu Zhao, Long Zhao, Vijay Kumar B G, Yumin Suh, Dimitris Metaxas, Manmohan Chandraker, Samuel Schulter.
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2024
Code / Arxiv
Taming Self-Training for Open-Vocabulary Object Detection
Shiyu Zhao, Samuel Schulter, Long Zhao, Zhixing Zhang, Vijay Kumar B G, Yumin Suh, Manmohan Chandraker, Dimitris Metaxas.
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2024
Code / Arxiv
LLM-Assist: Enhancing Closed-Loop Planning with Language-Based Reasoning
S P Sharan, Francesco Pittaluga, Vijay Kumar B G, Manmohan Chandraker.
arXiv, 2023
Project page / Video / Arxiv
Exploring Question Decomposition for Zero-shot VQA
Zaid Khan, Vijay Kumar B G, Samuel Schulter, Yun Fu, Manmohan Chandraker.
Neural Information Processing Systems (NeurIPS), 2023
Project page / Arxiv
DP-Mix: Mixup-based Data Augmentation for Differentially Private Learning
Wenxuan Bao, Francesco Pittaluga, Vijay Kumar B G, Vincent Bindschaedler.
Neural Information Processing Systems (NeurIPS), 2023
Code / Paper
OmniLabel: A Challenging Benchmark for Language-Based Object Detection
Samuel Schulter, Vijay Kumar B G, Yumin Suh, Konstantinos M. Dafnis, Zhixing Zhang, Shiyu Zhao, Dimitris Metaxas.
IEEE International Conference on Computer Vision (ICCV), 2023 (Oral: acceptance rate < 3%)
Project page / Video / Arxiv
Q: How to Specialize Large Vision-Language Models to Data-Scarce VQA Tasks? A: Self-Train on Unlabeled Images!
Zaid Khan, Vijay Kumar B G, Samuel Schulter, Xiang Yu, Yun Fu, Manmohan Chandraker.
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2023
Code / Arxiv
Single-Stream Multi-Level Alignment for Vision Language Pretraining
Zaid Khan, Vijay Kumar B G, Samuel Schulter, Xiang Yu, Yun Fu, Manmohan Chandraker.
European Conference on Computer Vision (ECCV), 2022
Code / Arxiv
Exploiting Unlabeled Data with Vision and Language Models for Object Detection
Shiyu Zhao, Zhixing Zhang, Samuel Schulter, Long Zhao, Vijay Kumar B G, Anastasis Stathopoulos, Manmohan Chandraker, Dimitris Metaxas.
European Conference on Computer Vision (ECCV), 2022
Code / Arxiv
STRIVE: Scene Text Replacement In Videos
Vijay Kumar B G, Jeyasri Subramanian, Varnith Chordia, Eugene Bart, Shaobo Fang, Kelly Guan, Raja Bala.
IEEE International Conference on Computer Vision (ICCV), 2021
Data / Project page / Arxiv
A Theoretically Sound Upper Bound on the Triplet Loss for Improving the Efficiency of Deep Distance Metric Learning
Thanh-Toan Do, Toan Tran, Ian Reid, Vijay Kumar B G, Tuan Hoang, Gustavo Carneiro.
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019
Arxiv
Multi-modal Cycle-consistent Generalized Zero-Shot Learning
Rafael Felix, Vijay Kumar B G, Ian Reid, Gustavo Carneiro.
European Conference on Computer Vision (ECCV), 2018
Code / Arxiv
Bayesian Semantic Instance Segmentation in Open Set World
Trung Pham, Vijay Kumar B G, Thanh-Toan Do, Gustavo Carneiro, and Ian Reid.
European Conference on Computer Vision (ECCV), 2018
Code / Arxiv
Cartman: The low-cost Cartesian Manipulator that won the Amazon Robotics Challenge
D. Morrison et al.
IEEE International Conference on Robotics and Automation (ICRA), 2018
Media / Arxiv
Smart Mining for Deep Metric Learning
Vijay Kumar B G*, Ben Harwood*, Gustavo Carneiro, Ian Reid, and Tom Drummond.
IEEE International Conference on Computer Vision (ICCV), 2017
Arxiv
DeepSetNet: Predicting Sets with Deep Neural Networks
Seyed Hamid Rezatofighi, Vijay Kumar B G, Anton Milan, Ehsan Abbasnejad, Antony Dick, Ian Reid.
IEEE International Conference on Computer Vision (ICCV), 2017 (Spotlight: acceptance rate < 5%)
Arxiv
Unsupervised CNN for Single View Depth Estimation: Geometry to the Rescue
Ravi Garg, Vijay Kumar B G, Gustavo Carneiro, and Ian Reid.
European Conference on Computer Vision (ECCV), 2016
Code / Arxiv
Learning Local Image Descriptors with Deep Siamese and Triplet Convolutional Networks by Minimizing Global Loss Functions
Vijay Kumar B G, Gustavo Carneiro, and Ian Reid.
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016 (Spotlight: acceptance rate < 10%)
Code / Arxiv

Template Credits Jon Barron