Vijay Kumar B G
I am a Senior Researcher at NEC Laboratories America. Before joining NEC, I was a Research Scientist at PARC, and prior to that I was a Research Fellow at the Australian Centre for Robotic Vision (ACRV), working with Prof. Ian Reid and Dr. Gustavo Carneiro. Before joining ACRV, I was a Researcher at the Advanced Research Group, Samsung Research. I completed my Ph.D. in Computer Science at Queen Mary University of London under Prof. Ioannis Patras. I received my MS in Electrical Engineering from IIT Madras.
Email / Scholar / LinkedIn
|
|
Research
I am interested in machine learning and computer vision, with a current focus on LLM agents, multimodal large language models, vision-language understanding, and self-supervised representation learning.
|
Selected Publications
|
|
Self-Training Large Language Models for Improved Visual Program Synthesis With Visual Reinforcement
Zaid Khan, Vijay Kumar B G, Samuel Schulter, Yun Fu, Manmohan Chandraker.
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2024
Project page / Paper / arXiv
|
|
Generating Enhanced Negatives for Training Language-Based Object Detectors
Shiyu Zhao, Long Zhao, Vijay Kumar B G, Yumin Suh, Dimitris Metaxas, Manmohan Chandraker, Samuel Schulter.
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2024
Code / arXiv
|
|
Taming Self-Training for Open-Vocabulary Object Detection
Shiyu Zhao, Samuel Schulter, Long Zhao, Zhixing Zhang, Vijay Kumar B G, Yumin Suh, Manmohan Chandraker, Dimitris Metaxas.
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2024
Code / arXiv
|
|
LLM-Assist: Enhancing Closed-Loop Planning with Language-Based Reasoning
S P Sharan, Francesco Pittaluga, Vijay Kumar B G, Manmohan Chandraker.
arXiv, 2023
Project page / Video / arXiv
|
|
Exploring Question Decomposition for Zero-shot VQA
Zaid Khan, Vijay Kumar B G, Samuel Schulter, Yun Fu, Manmohan Chandraker.
Neural Information Processing Systems (NeurIPS), 2023
Project page / arXiv
|
|
DP-Mix: Mixup-based Data Augmentation for Differentially Private Learning
Wenxuan Bao, Francesco Pittaluga, Vijay Kumar B G, Vincent Bindschaedler.
Neural Information Processing Systems (NeurIPS), 2023
Code / Paper
|
|
OmniLabel: A Challenging Benchmark for Language-Based Object Detection
Samuel Schulter, Vijay Kumar B G, Yumin Suh, Konstantinos M. Dafnis, Zhixing Zhang, Shiyu Zhao, Dimitris Metaxas.
IEEE International Conference on Computer Vision (ICCV), 2023 (Oral: acceptance rate < 3%)
Project page / Video / arXiv
|
|
Q: How to Specialize Large Vision-Language Models to Data-Scarce VQA Tasks? A: Self-Train on Unlabeled Images!
Zaid Khan, Vijay Kumar B G, Samuel Schulter, Xiang Yu, Yun Fu, Manmohan Chandraker.
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2023
Code / arXiv
|
|
Single-Stream Multi-Level Alignment for Vision Language Pretraining
Zaid Khan, Vijay Kumar B G, Samuel Schulter, Xiang Yu, Yun Fu, Manmohan Chandraker.
European Conference on Computer Vision (ECCV), 2022
Code / arXiv
|
|
Exploiting Unlabeled Data with Vision and Language Models for Object Detection
Shiyu Zhao, Zhixing Zhang, Samuel Schulter, Long Zhao, Vijay Kumar B G, Anastasis Stathopoulos, Manmohan Chandraker, Dimitris Metaxas.
European Conference on Computer Vision (ECCV), 2022
Code / arXiv
|
|
STRIVE: Scene Text Replacement In Videos
Vijay Kumar B G, Jeyasri Subramanian, Varnith Chordia, Eugene Bart, Shaobo Fang, Kelly Guan, Raja Bala.
IEEE International Conference on Computer Vision (ICCV), 2021
Data / Project page / arXiv
|
|
A Theoretically Sound Upper Bound on the Triplet Loss for Improving the Efficiency of Deep Distance Metric Learning
Thanh-Toan Do, Toan Tran, Ian Reid, Vijay Kumar B G, Tuan Hoang, Gustavo Carneiro.
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019
arXiv
|
|
Multi-modal Cycle-consistent Generalized Zero-Shot Learning
Rafael Felix, Vijay Kumar B G, Ian Reid, Gustavo Carneiro.
European Conference on Computer Vision (ECCV), 2018
Code / arXiv
|
|
Bayesian Semantic Instance Segmentation in Open Set World
Trung Pham, Vijay Kumar B G, Thanh-Toan Do, Gustavo Carneiro, and Ian Reid.
European Conference on Computer Vision (ECCV), 2018
Code / arXiv
|
|
Cartman: The low-cost Cartesian Manipulator that won the Amazon Robotics Challenge
D. Morrison et al.
IEEE International Conference on Robotics and Automation (ICRA), 2018
Media / arXiv
|
|
Smart Mining for Deep Metric Learning
Vijay Kumar B G*, Ben Harwood*, Gustavo Carneiro, Ian Reid, and Tom Drummond.
IEEE International Conference on Computer Vision (ICCV), 2017
arXiv
|
|
DeepSetNet: Predicting Sets with Deep Neural Networks
Seyed Hamid Rezatofighi, Vijay Kumar B G, Anton Milan, Ehsan Abbasnejad, Anthony Dick, Ian Reid.
IEEE International Conference on Computer Vision (ICCV), 2017 (Spotlight: acceptance rate < 5%)
arXiv
|
|
Unsupervised CNN for Single View Depth Estimation: Geometry to the Rescue
Ravi Garg, Vijay Kumar B G, Gustavo Carneiro, and Ian Reid.
European Conference on Computer Vision (ECCV), 2016
Code / arXiv
|
|
Learning Local Image Descriptors with Deep Siamese and Triplet Convolutional Networks by Minimizing Global Loss Functions
Vijay Kumar B G, Gustavo Carneiro, and Ian Reid.
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016 (Spotlight: acceptance rate < 10%)
Code / arXiv
|
|