About Me

I’m an AI Engineer and Researcher focused on advancing deep learning and intelligent systems across computer vision, natural language processing, and immersive technologies. My work explores the intersection of AI and human-computer interaction—developing interactive environments, intelligent avatars, and real-time applications that enhance how we learn, communicate, and engage with digital systems. Driven by curiosity and innovation, I aim to build scalable, impactful solutions that push the boundaries of AI in both research and real-world contexts.

Publications

Framework for Combating Counterfeit Products

Combating Counterfeit Products in Smart Cities with Digital Twin Technology

2023 IEEE International Smart Cities Conference (ISC2), Bucharest, Romania, 2023

Framework for Violence Detection

Action Knowledge Graph for Violence Detection Using Audiovisual Features

2024 IEEE International Conference on Consumer Electronics (ICCE), Las Vegas, NV, USA, 2024

Framework for Breast Cancer Classification

BreastUS: Vision Transformer for Breast Cancer Classification Using Breast Ultrasound Images

2022 16th International Conference on Signal-Image Technology & Internet-Based Systems (SITIS), Dijon, France, 2022

Projects

Facial Emotion Recognition for Neurological Healthcare

Developed at NTNU as part of the ALAMEDA project, this work analyzes facial expressions for pain assessment and emotional-state monitoring in neurological healthcare. I contributed to both the concept and the technical development, integrating advanced AI methodologies for accurate analysis.

Technologies: Python, Computer Vision, Deep Learning

ZapAura: Virtual Learning Platform

ZapAura is a virtual learning platform built on a customized version of Mozilla Hubs. It enhances educational experiences with features such as AI-powered teaching assistants, full-body avatars, real-time lip-sync, and multilingual support.

Technologies: JavaScript, TypeScript, Three.js, A-Frame, Natural Language Processing

Visual Avatar Assistant

The Visual Avatar Assistant leverages a LLaMA 3 model fine-tuned on haptics and multimedia data to enhance educational experiences. This enables a deeper understanding of interactive content and a more immersive, engaging learning environment.

Technologies: JavaScript, TypeScript, Natural Language Processing, ElevenLabs

Get in Touch

LinkedIn

Muhammad Saad

Twitter

Muhammad Saad