About Me

I am a Research Engineer passionate about machine learning and computer vision, currently working on AI solutions at Avataar.ai.

Life Updates

October 2024

Our paper "TACLE" has been accepted at WACV 2025! 🎉

July 2024

Started a new role as Research Engineer-1 at Avataar.ai 🚀

June 2024

Graduated from IISc Bangalore with an M.Tech in AI 🎓

Education

M.Tech in Artificial Intelligence

Indian Institute of Science (IISc), Bangalore

2022 - 2024

CGPA: 8.0/10.0

B.Tech in Electrical Engineering

Bhagalpur College of Engineering, Bhagalpur

2018 - 2021

CGPA: 8.75/10.0

Diploma in Electrical Engineering

Government Polytechnic Muzaffarpur, Muzaffarpur

2015 - 2018

Percentage: 77.73%

Secondary School (10th)

Bihar School Examination Board

Utkramit M S Parmanandpur Shraniganj

2015

Percentage: 60%

Experience

Research Engineer-1 - Avataar.ai

July 2024 - Present

  • Built an end-to-end pipeline that automatically generates lifestyle images using the FLUX model and ControlNets, making product imagery more appealing and realistic for customers.
  • Modified the diffusion model’s sampling procedure to improve object reconstruction, followed by intrinsic decomposition for realistic product relighting.
  • Built classification systems for low-data scenarios using pre-trained models such as CLIP, BLIP2 and Qwen2.5, enhancing accuracy in product categorization.
  • Enhanced segmentation accuracy by implementing BiRefNet models and integrating SAM with YOLO-World for complex scenes.
  • Improved object detection by benchmarking frameworks including YOLO-World and Florence.

Teaching Assistant - IISc Bangalore

Subject: Signal Processing in Practice

Jan 2024 - Apr 2024

  • Integrated continual learning frameworks (L2P, DualPrompt) to mitigate catastrophic forgetting in neural networks.
  • Built self-supervised models using MoCo and SimCLR for robust visual representation learning.
  • Developed adaptive prompt-based learning with dynamic token expansion and attention mechanisms.

Teaching Assistant - IISc Bangalore

Subject: Digital Image Processing

Aug 2023 - Dec 2023

  • Developed DFT-based frequency domain filtering for advanced image denoising and enhancement.
  • Implemented SIFT and Normalized Cut for precise feature detection and image segmentation.
  • Optimized deep learning models using EfficientNet-B0 with custom classifiers for efficient convergence.

Projects

Developed an LSTM-based video classification model utilizing pre-trained image embeddings from CLIP and SigLIP, along with fine-tuned VideoMAE for end-to-end classification.

  • Trained LSTM using visual features extracted with CLIP and SigLIP, achieving 52% and 53% test accuracy, respectively, and saving the checkpoint with the highest validation accuracy.
  • Fine-tuned VideoMAE on the cricket shots dataset, achieving 66.05% test accuracy, while saving model checkpoints after each epoch and monitoring validation performance.
  • Prepared the video dataset by extracting frames, computing embeddings, and assigning labels, while implementing detailed logging and leveraging Hugging Face Hub for model sharing.
Python, CLIP, SigLIP, LSTM, VideoMAE, Hugging Face

Virtual Try-On

Developed a deep learning-based Virtual Try-On pipeline integrating segmentation, garment transformation, and try-on synthesis for realistic virtual clothing visualization.

  • Built an AI-powered Virtual Try-On pipeline combining Florence2 and IDM-VTON models for automated garment transfer and visualization.
  • Enhanced system performance by implementing a 3-stage framework with CAT-TryOff model for improved color and pattern consistency.
  • Optimized system latency by developing a single-stage solution using Any2Any-Tryon and FLUX-based architectures.
  • Conducted iterative improvements through failure analysis, addressing key challenges in garment fitting and pattern preservation.
Python, PyTorch, Florence2, IDM-VTON, CAT-TryOff, Any2Any-Tryon, Hugging Face

Publications

TACLE: Task and Class-aware Exemplar-free Semi-supervised Class Incremental Learning

Jayateja Kalla*, Rohit Kumar*, Soma Biswas
WACV 2025

Skills

Research Interests

  • Image Processing & Computer Vision
  • Large Language Models (LLMs)
  • Natural Language Processing (NLP)

Languages

  • Python
  • SQL
  • MATLAB

Deep Learning Frameworks

  • PyTorch
  • TensorFlow

Tools

  • SLURM
  • AWS
  • GitHub
  • ComfyUI

Relevant Coursework

  • Digital Image Processing
  • Advanced Image Processing
  • Computer Vision
  • Introduction to NLP
  • LLMs for Practical NLP
  • Deep Learning for NLP
  • Digital Video Perception and Algorithms
  • Linear Algebra
  • Stochastic Models and Applications
  • Pattern Recognition and Neural Networks
  • Game Theory
  • Computational Methods of Optimization

Achievements

Entrance Exams

GATE EE

All India Rank: 221 | Score: 803 | Marks: 73.00

2022

GATE IN

All India Rank: 227 | Score: 670 | Marks: 67.33

2022

GATE EE

All India Rank: 1683 | Score: 634 | Marks: 55.33

2021

BCECE [LE]

Rank: 1

2018

Certifications

Agents Course

GenAI Hackathon

Programming in MATLAB

Power System Certificate

Volunteering

Organizing Committee Member

EE Summer School 2023

July 2023

Contact