Artificial Intelligence · Machine Learning

Hi, I’m Masud Ahmed

Graduate Research Assistant · Department of Information Systems, University of Maryland, Baltimore County

Welcome to my personal webpage! I have completed my Ph.D. in Information Systems and have over six years of experience developing multimodal foundation models, transformer-based generative systems, and autoregressive diffusion architectures for vision and language understanding. My expertise includes real-time human–machine interaction, high-performance computing, multimodal LLMs, agentic AI, and vision–text alignment. I am passionate about applying AI to improve efficiency, integrity, and user experience.

Information Systems · UMBC Baltimore, MD, USA

Education

Ph.D. in Information Systems

University of Maryland, Baltimore County

Supervisor: Dr. Nirmalya Roy

CGPA: 3.90/4.00

B.Sc. in Electrical and Electronic Engineering

University of Dhaka

Supervisor: Dr. Md Atiqur Rahman Ahad

CGPA: 3.18/4.00

Research

Research Areas & Projects

Theoretical interests, application domains, datasets, and ongoing projects.

Theoretical

  • Domain Adaptation, Continual Learning, Self-Supervised Learning, Active Learning, Foundation Models, Transformers, Large Language Models, Large Vision Models

Application

  • Computer Vision, Natural Language Processing, Healthcare, Robotics, Wearable Device Data Analysis, Sensor Data Analysis

Programming & Frameworks

  • Python, C++, C, SQL (Oracle), MATLAB, HTML, R, ROS (Robot Operating System)
  • PyTorch, Hugging Face Transformers, JAX, TensorFlow, spaCy

Projects

Selected Research Projects

Transformer-based Semantic Segmentation on Continuous-valued Embedding
  • Eliminated reliance on discrete codebooks, reducing information loss and enhancing spatial-contextual feature learning
  • Proposed an RGB image-conditioned generation model using diffusion loss and transformer for continuous-valued semantic embedding
  • Improved robustness against noise, artifacts, and lighting variation, and enabled zero-shot domain adaptation across diverse conditions
  • Validated across four public datasets with ablation studies on image size, color mapping, and distribution shifts
Hyperbolic Text-guided Semantic Segmentation
  • Proposed a text-guided hyperbolic semantic segmentation framework using the Lorentz model to capture hierarchical pixel-label relationships in low-dimensional space
  • Introduced a novel Lorentz entailment cone loss enabling pixel embeddings to align with class text prototypes, enforcing semantic hierarchy and improving interpretability
  • Achieved competitive mIoU across multiple datasets and improved zero-shot and uncertainty-aware segmentation
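
For readers unfamiliar with the Lorentz model mentioned above, the following is a minimal sketch of its basic geometry, not the project's implementation. It assumes unit negative curvature (-1), and the helper names (lift_to_hyperboloid, lorentz_inner, lorentz_distance) are hypothetical. It covers only the hyperboloid lift and geodesic distance; the entailment cone loss itself is not shown.

```python
import math


def lift_to_hyperboloid(v):
    """Map a Euclidean vector onto the unit hyperboloid (assumed curvature -1).

    The time-like coordinate is chosen so that <x, x>_L = -1.
    """
    time = math.sqrt(1.0 + sum(c * c for c in v))
    return [time] + list(v)


def lorentz_inner(x, y):
    """Lorentzian inner product: -x0*y0 + sum_i xi*yi."""
    return -x[0] * y[0] + sum(a * b for a, b in zip(x[1:], y[1:]))


def lorentz_distance(x, y):
    """Geodesic distance between two points on the hyperboloid."""
    # Clamp to 1.0 so rounding error never pushes acosh outside its domain.
    return math.acosh(max(1.0, -lorentz_inner(x, y)))


# Hypothetical 2-D embeddings: a class prototype at the origin and a pixel.
origin = lift_to_hyperboloid([0.0, 0.0])
pixel = lift_to_hyperboloid([0.3, -0.4])
print(lorentz_distance(origin, pixel))
```

In this geometry, distance from the origin grows with the Euclidean norm of the embedding, which is one reason hyperbolic spaces are often used to place general concepts near the origin and specific ones farther out.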
Active Learning for Semantic Segmentation in Mobile Robotics
  • Developed a real-time framework for active selection of informative regions in visual data for continual learning in semantic segmentation
  • Applied entropy-driven ranking and a cyclical feedback loop
  • Reduced data transfer overhead, improving model performance with minimal labeled data
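
As an illustration of what entropy-driven ranking can look like, here is a minimal sketch under stated assumptions rather than the project's actual pipeline. It assumes each image region comes with per-pixel softmax probabilities, scores a region by its mean Shannon entropy, and selects the top-k regions for labeling. The function names and the example data are hypothetical.

```python
import math


def shannon_entropy(probs):
    """Shannon entropy (in nats) of one class-probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def rank_regions_by_entropy(region_probs, k):
    """Return the k region names with the highest mean per-pixel entropy.

    region_probs maps a region name to a list of per-pixel probability vectors.
    """
    scores = {
        name: sum(shannon_entropy(p) for p in pixels) / len(pixels)
        for name, pixels in region_probs.items()
    }
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k]


# Hypothetical softmax outputs for three regions with one pixel each.
example_regions = {
    "a": [[0.98, 0.01, 0.01]],       # confident prediction, low entropy
    "b": [[1 / 3, 1 / 3, 1 / 3]],    # maximally uncertain, high entropy
    "c": [[0.7, 0.2, 0.1]],          # moderately uncertain
}
print(rank_regions_by_entropy(example_regions, 2))
```

Sending only the highest-entropy regions for annotation is one way such a scheme can cut data transfer, since confident regions never leave the device.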
Distributed Collaborative Robotics and Federated Learning in Vision
  • Developed a framework for Federated Class-Incremental Learning (FCIL) that enables collaborative training of machine learning models across geographically distributed agents without sharing raw data
  • Combined virtual simulations and real-world data collected from multiple physical sites, enabling domain adaptation to learn from both simulated and real environments
  • Improved real-time decision-making by enabling agents to adapt to evolving environments and data streams, reducing reliance on extensive real-world data collection
Semantic Clustering Innovation: Novel Categories Discovery (NCD)
  • Developed an NCD-based algorithm for clustering novel data based on known class semantics, overcoming pseudo-labeling limitations
  • Leveraged data sampling and the multinoulli distribution for implicit semantic clustering without extensive annotations
  • Aligned class neuron activation distributions through Monte Carlo sampling, explored directional statistics, and conducted ablation studies to advance state-of-the-art clustering approaches
Learning the Optical & Physiological Mechanics of rPPG with Self-Supervision
  • In this computational biology project, proposed a self-supervised learning approach for estimating heart rate from remote photoplethysmography (rPPG) signals obtained from skin videos without the need for synchronized ground truth annotations
  • Developed a contrastive learning-based pretraining strategy to learn the underlying diffusion signals' frequency, phase, and temporal coherence from unlabeled video frame sequences
Strata and Viewpoint Invariant Encoding for Robust Video Action Recognition
  • Addressed the challenge of robust video action recognition (VAR) in diverse settings with varying viewpoints and sensors
  • Proposed a joint optimization method leveraging contrastive and adversarial losses for learning sensor- and viewpoint-invariant representations from unlabeled synchronous multiview (MV) video data
  • Collected a large-scale, time-synchronized MV video dataset encompassing diverse settings, actions, viewpoints, and sensor properties

Publications & Profiles

  • Lorentz Entailment Cone for Semantic Segmentation
    2026 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)
    Accepted for Publication
  • CAM-Seg: A Continuous-valued Embedding Approach for Semantic Image Generation
    2025 International Conference on Computer Vision (ICCV)
    Paper Link
  • RRPIPS: Respiratory Waveform Reconstruction using Persistent Independent Particles Tracking from Video
    2025 IEEE/ACM Conference on Connected Health: Applications, Systems and Engineering Technologies (CHASE)
    Paper Link
  • Respiratory Rate and Heart Rate in Facial Videos through the Lens of Temporal Pixel Variation
    2025 IEEE/ACM Conference on Connected Health: Applications, Systems and Engineering Technologies (CHASE)
    Paper Link
  • ARSFineTune: On-the-Fly Tuning of Vision Models for Unmanned Ground Vehicles
    2024 IEEE International Conference on Distributed Computing in Smart Systems and the Internet of Things (DCOSS-IoT)
    Paper Link
  • VIVAR: Learning View-Invariant Embedding for Video Action Recognition
    2024 International Conference on Video and Image Processing (ICVIP)
    Paper Link
  • An Online Continuous Semantic Segmentation Framework With Minimal Labeling Efforts
    2023 IEEE International Conference on Smart Computing (SMARTCOMP)
    Paper Link
  • NEV-NCD: Negative Learning, Entropy, and Variance Regularization Based Novel Action Categories Discovery
    2023 IEEE International Conference on Image Processing (ICIP)
    Paper Link
  • SrPPG: Semi-Supervised Adversarial Learning for Remote Photoplethysmography with Noisy Data
    2023 IEEE International Conference on Smart Computing (SMARTCOMP)
    Paper Link
  • Self-rPPG: Learning the Optical & Physiological Mechanics of Remote Photoplethysmography with Self-Supervision
    2022 IEEE/ACM International Conference on Connected Health: Applications, Systems and Engineering Technologies (CHASE)
    Paper Link
  • GADAN: Generative Adversarial Domain Adaptation Network For Debris Detection Using Drone
    2022 IEEE International Conference on Distributed Computing in Sensor Systems (DCOSS)
    Paper Link
  • Benchmarking domain adaptation for semantic segmentation
    2022 SPIE Defense + Commercial Sensing
    Paper Link
  • Recognition of human locomotion on various transportations fusing smartphone sensors
    2021 Elsevier Pattern Recognition Letters
    Paper Link
  • Static postural transition-based technique and efficient feature extraction for sensor-based activity recognition
    2021 Elsevier Pattern Recognition Letters
    Paper Link
  • Action recognition using Kinematics Posture Feature on 3D skeleton joint locations
    2021 Elsevier Pattern Recognition Letters
    Paper Link
  • Temporal Clustering Based Thermal Condition Monitoring in Building
    2020 Elsevier Sustainable Computing: Informatics and Systems (SUSCOM)
    Paper Link
  • OU-ISIR wearable sensor-based gait challenge: Age and gender
    2019 IEEE International Conference on Biometrics (ICB)
    Paper Link
  • POIDEN: position and orientation independent deep ensemble network for the classification of locomotion and transportation modes
    2019 ACM International Joint Conference on Pervasive and Ubiquitous Computing (UbiComp)
    Paper Link
  • Challenges in Sensor-based Human Activity Recognition and a Comparative Analysis of Benchmark Datasets: A Review
    2019 International Conference on Informatics, Electronics & Vision (ICIEV)
    Paper Link
  • An Approach to Classify Human Activities in Real-time from Smartphone Sensor Data
    2019 International Conference on Informatics, Electronics & Vision (ICIEV)
    Paper Link
  • A comparative approach to classification of locomotion and transportation modes using smartphone sensor data
    2018 ACM International Joint Conference on Pervasive and Ubiquitous Computing (UbiComp)
    Paper Link

Work Experience

Graduate Research Assistant

University of Maryland, Baltimore County

Beyond Research

Additional Information

Global Competition Awards

UMBC 3 Minute Thesis People's Choice

5 April 2022
UMBC, Maryland, USA

SHL Challenge 2018

12 October 2018
ACM UbiComp’18, Singapore

TechKriti 2017

23 March 2017 – 26 March 2017
IIT Kanpur, India

NASA Space Apps Challenge 2017

2 May 2017 – 3 March 2017
IUB, Bangladesh

Community Involvement

Skills

  • Math Skill: Linear Algebra, Differential Equations, Probability and Statistics
  • Hardware Skill: Arduino- and PIC microcontroller-based projects, Raspberry Pi
  • Language Skill: Fluent in English

Contact