Artificial Intelligence · Machine Learning

Hi, I’m Masud Ahmed

Graduate Research Assistant · Department of Information Systems, University of Maryland, Baltimore County

Welcome to my personal webpage! I have completed my Ph.D. in Information Systems and have over 6 years of experience developing foundation multimodal models, transformer-based generative systems, and autoregressive diffusion architectures for vision and language understanding. I am an expert in real-time human–machine interaction, high-performance computing, multimodal LLMs, Agentic AI, and vision–text alignment. I am passionate about applying AI to improve efficiency, integrity, and user experience.


Information Systems · UMBC Baltimore, MD, USA

Education

Ph.D. in Information Systems

University of Maryland, Baltimore County

Supervisor: Dr. Nirmalya Roy

CGPA: 3.90/4.00

B.Sc. in Electrical and Electronic Engineering

University of Dhaka

Supervisor: Dr. Md Atiqur Rahman Ahad

CGPA: 3.18/4.00

Research

Research Areas & Projects

Theoretical interests, application domains, datasets, and ongoing projects.

Theoretical

Domain Adaptation, Continual Learning, Self-Supervised Learning, Active Learning, Foundation Model, Transformer, Large Language Model, Large Vision Model

Application

Computer Vision, Natural Language Processing, Healthcare, Robotics, Wearable Device Data Analysis, Sensor Data Analysis

Programming & Frameworks

Python, C++, C, SQL (Oracle), MATLAB, HTML, R, ROS (Robot Operating System)
PyTorch, Hugging Face Transformers, JAX, TensorFlow, spaCy

Datasets

Projects

Selected Research Projects

Transformer-based Semantic Segmentation on Continuous-valued Embedding

Eliminated reliance on discrete codebooks, reducing information loss and enhancing spatial-contextual feature learning
Proposed an RGB image-conditioned generation model using a diffusion loss and a transformer for continuous-valued semantic embedding
Improved robustness against noise, artifacts, and lighting variation, and enabled zero-shot domain adaptation across diverse conditions
Validated across four public datasets with ablation studies on image size, color mapping, and distribution shifts

Hyperbolic Text-guided Semantic Segmentation

Proposed a text-guided hyperbolic semantic segmentation framework using the Lorentz model to capture hierarchical pixel-label relationships in low-dimensional space
Introduced a novel Lorentz entailment cone loss enabling pixel embeddings to align with class text prototypes, enforcing semantic hierarchy and improving interpretability
Achieved competitive mIoU across multiple datasets and improved zero-shot and uncertainty-aware segmentation

Active Learning for Semantic Segmentation in Mobile Robotics

Developed a real-time framework for active selection of informative regions in visual data for continual learning in semantic segmentation
Applied entropy-driven ranking with a cyclical feedback loop
Reduced data transfer overhead and improved model performance with minimal labeled data
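
Entropy-driven ranking of image regions can be sketched as follows. This is a minimal illustration, not the project's implementation: it assumes each region is scored by the mean Shannon entropy of its per-pixel class probabilities, and the names `rank_regions_by_entropy` and `budget` are hypothetical.

```python
import math

def shannon_entropy(probs):
    """Shannon entropy (in nats) of one class-probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def rank_regions_by_entropy(region_probs, budget):
    """Return the ids of the `budget` most uncertain regions.

    region_probs maps a region id to a non-empty list of per-pixel
    class-probability vectors predicted by the segmentation model.
    """
    scores = {
        region_id: sum(shannon_entropy(p) for p in pixels) / len(pixels)
        for region_id, pixels in region_probs.items()
    }
    # Highest mean entropy first: these regions are sent for labeling.
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:budget]

# Hypothetical usage with three single-pixel regions and two classes.
regions = {
    "a": [[0.5, 0.5]],  # maximally uncertain
    "b": [[1.0, 0.0]],  # fully confident
    "c": [[0.9, 0.1]],  # mildly uncertain
}
print(rank_regions_by_entropy(regions, budget=2))
```

Selecting only the top-ranked regions is what keeps the amount of transferred and labeled data small under this assumption.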

Distributed Collaborative Robotics and Federated Learning in Vision

Developed a framework for Federated Class-Incremental Learning (FCIL) that enables collaborative training of machine learning models across geographically distributed agents without sharing raw data
Combined virtual simulations and real-world data collected from multiple physical sites, enabling domain adaptation to learn from both simulated and real environments
Improved decision-making capabilities in real-time by enabling agents to adapt to evolving environments and data streams, reducing reliance on extensive real-world data collection

Semantic Clustering Innovation: Novel Categories Discovery (NCD)

Developed an NCD-based algorithm for clustering novel data based on known class semantics, overcoming pseudo-labeling limitations
Leveraged data sampling and a multinoulli distribution for implicit semantic clustering without extensive annotations
Aligned class neuron activation distributions through Monte Carlo sampling, explored directional statistics, and conducted ablation studies to advance state-of-the-art clustering approaches

Learning the Optical & Physiological Mechanics of rPPG with Self-Supervision

In this computational biology project, proposed a self-supervised learning approach for estimating heart rate from remote photoplethysmography (rPPG) signals obtained from skin videos without the need for synchronized ground truth annotations
Developed a contrastive learning-based pretraining strategy to learn the underlying diffusion signals' frequency, phase, and temporal coherence from unlabeled video frame sequences

Strata and Viewpoint Invariant Encoding for Robust Video Action Recognition

Addressed the challenge of robust video action recognition (VAR) in diverse settings with varying viewpoints and sensors
Proposed a joint optimization method leveraging contrastive and adversarial losses for learning sensor- and viewpoint-invariant representations from unlabeled synchronous multiview (MV) video data
Collected a large-scale, time-synchronized MV video dataset encompassing diverse settings, actions, viewpoints, and sensor properties

Publications & Profiles

Lorentz Entailment Cone for Semantic Segmentation
2026 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)
Accepted for Publication
CAM-Seg: A Continuous-valued Embedding Approach for Semantic Image Generation
2025 International Conference on Computer Vision (ICCV)
Paper Link
RRPIPS: Respiratory Waveform Reconstruction using Persistent Independent Particles Tracking from Video
2025 IEEE/ACM Conference on Connected Health: Applications, Systems and Engineering Technologies (CHASE)
Paper Link
Respiratory Rate and Heart Rate in Facial Videos through the Lens of Temporal Pixel Variation
2025 IEEE/ACM Conference on Connected Health: Applications, Systems and Engineering Technologies (CHASE)
Paper Link
ARSFineTune: On-the-Fly Tuning of Vision Models for Unmanned Ground Vehicles
2024 IEEE International Conference on Distributed Computing in Smart Systems and the Internet of Things (DCOSS-IoT)
Paper Link
VIVAR: Learning View-Invariant Embedding for Video Action Recognition
2024 International Conference on Video and Image Processing (ICVIP)
Paper Link
An Online Continuous Semantic Segmentation Framework With Minimal Labeling Efforts
2023 IEEE International Conference on Smart Computing (SMARTCOMP)
Paper Link
NEV-NCD: Negative Learning, Entropy, and Variance Regularization Based Novel Action Categories Discovery
2023 IEEE International Conference on Image Processing (ICIP)
Paper Link
SrPPG: Semi-Supervised Adversarial Learning for Remote Photoplethysmography with Noisy Data
2023 IEEE International Conference on Smart Computing (SMARTCOMP)
Paper Link
Self-rPPG: Learning the Optical & Physiological Mechanics of Remote Photoplethysmography with Self-Supervision
2022 IEEE/ACM International Conference on Connected Health: Applications, Systems and Engineering Technologies (CHASE)
Paper Link
GADAN: Generative Adversarial Domain Adaptation Network For Debris Detection Using Drone
2022 IEEE International Conference on Distributed Computing in Sensor Systems (DCOSS)
Paper Link
Benchmarking domain adaptation for semantic segmentation
2022 SPIE Defense + Commercial Sensing
Paper Link
Recognition of human locomotion on various transportations fusing smartphone sensors
2021 Elsevier Pattern Recognition Letters
Paper Link
Static postural transition-based technique and efficient feature extraction for sensor-based activity recognition
2021 Elsevier Pattern Recognition Letters
Paper Link
Action recognition using Kinematics Posture Feature on 3D skeleton joint locations
2021 Elsevier Pattern Recognition Letters
Paper Link
Temporal Clustering Based Thermal Condition Monitoring in Building
2020 Elsevier Sustainable Computing: Informatics and Systems (SUSCOM)
Paper Link
OU-ISIR wearable sensor-based gait challenge: Age and gender
2019 IEEE International Conference on Biometrics (ICB)
Paper Link
POIDEN: position and orientation independent deep ensemble network for the classification of locomotion and transportation modes
2019 ACM International Joint Conference on Pervasive and Ubiquitous Computing (UbiComp)
Paper Link
Challenges in Sensor-based Human Activity Recognition and a Comparative Analysis of Benchmark Datasets: A Review
2019 International Conference on Informatics, Electronics & Vision (ICIEV)
Paper Link
An Approach to Classify Human Activities in Real-time from Smartphone Sensor Data
2019 International Conference on Informatics, Electronics & Vision (ICIEV)
Paper Link
A comparative approach to classification of locomotion and transportation modes using smartphone sensor data
2018 ACM International Joint Conference on Pervasive and Ubiquitous Computing (UbiComp)
Paper Link

Work Experience

Postdoctoral Researcher

University of Maryland, Baltimore County

Graduate Research Assistant

University of Maryland, Baltimore County

Assistant Researcher

Yagi Laboratory, Osaka University

Beyond Research

Additional Information

Global Competition Awards

UMBC 3 Minute Thesis People's Choice

5 April 2022
UMBC, Maryland, USA

SHL Challenge 2018

12 October 2018
ACM UbiComp’18, Singapore

TechKriti 2017

23 March 2017 – 26 March 2017
IIT Kanpur, India

NASA Space Apps Challenge 2017

2 May 2017 – 3 March 2017
IUB, Bangladesh

Community Involvement

GEARS (UMBC), Co-Chair
IEEE, Student Member

Skills

Math Skills: Linear Algebra, Differential Equations, Probability and Statistics
Hardware Skills: Arduino- and PIC microcontroller-based projects, Raspberry Pi
Language Skills: Fluent in English

Contact

mahmed10@umbc.edu
1000 Hilltop Circle, ITE 457, Baltimore, MD 21250, USA
Click here to send message via Google form