ABOUT

Maroof Abdul Aziz
M.Sc. Robotic Systems Engineering @ RWTH Aachen
AI/LLM Developer | Internship & Thesis @ Audi | Ex-Mercedes-Benz
I’m a Master's student in Robotic Systems Engineering at RWTH Aachen, specializing in AI for language and vision. My work focuses on developing and optimizing large language models (LLMs), speech processing pipelines, and computer vision systems.
With industry experience at Audi and Mercedes-Benz, and a first-author IEEE publication, I bridge academic research and real-world AI deployment—building scalable, efficient, and impactful machine learning systems.
EXPERIENCE

Master Thesis – LLM Optimization
Audi AG
Oct 2024 – Present
- Topic: Optimization of Small Language Models for Embedded Voice Assistance.
- Research and develop language models tailored for specific use cases, utilizing fine-tuning, model compression and acceleration techniques.
- Built synthetic datasets simulating realistic vehicle assistant queries and tool usage.

Internship – ChatGPT TTS Integration
Audi AG
Apr 2024 – Sep 2024
- Project: ChatGPT integration into the voice assistant's online speech processing.
- Developed test datasets and evaluated multilingual language models using LLMs for text-matching and language processing tasks.
- Conducted performance testing for scalable backends in automotive speech systems and automated reporting in CI/CD pipelines.
- Contributed to Agile (Scrum) development processes and streamlined data analysis workflows.

Research Assistant – Medical Imaging
RWTH Aachen University
Aug 2023 – Mar 2024
- Developed and tested deep learning models for detection, classification, and grading of tumors in high-resolution images.
- Conducted literature and dataset research.
- Published in IEEE ICIP 2024.

Senior Product Design Engineer
Mercedes-Benz R&D India
Dec 2015 – Jul 2022
- Developed Python-based automation tools for design and validation processes.
- Collaborated with cross-functional teams to prototype and design cost-effective, manufacturable automotive exterior parts.
PROJECTS
Driver Drowsiness Detection
May 2023 – Jul 2023
- Built a drowsiness detection model using CNNs and facial landmarks on 40K+ driver images.
- Achieved 90% accuracy with a custom CNN and 86% with ResNet50 transfer learning.
- Used OpenCV for facial landmark detection to identify drowsy behavior.
- Compared traditional CNN, OpenCV, and transfer learning approaches for performance.
- Collaborated with the TechLabs Aachen team during the Digital Shaper Program.

LangGraph Agent Deployment
Feb 2024 – Apr 2024
- Full-stack RAG application using FastAPI, LangGraph, and Streamlit for tool-using LLM agents.
- Document ingestion from websites, PDFs, and SQL into a Qdrant vector store using LlamaIndex.
- OpenAI and Groq models with session memory, tool calls, and LLM-based reranking.
- CI/CD pipeline with GitHub Actions to automate testing and deploy Docker containers to AWS EC2.
- User-friendly chat interface for uploading data, switching models, and visualizing agent reasoning.

Master Thesis: Optimizing Small Language Models
Jan 2024 – May 2024
- Optimized small language models for CPU-only embedded systems and in-vehicle voice assistants.
- Designed and fine-tuned models using QLoRA with special tokens for tool-call accuracy.
- Applied structured pruning and quantization (GPTQ, GGUF) for efficient model compression.
- Built synthetic datasets simulating realistic vehicle assistant queries and tool usage.
- Benchmarked models using real-world metrics: latency, memory, accuracy, and on-device inference speed.

LangChain RAG Agent Suite
Nov 2023 – Jan 2024
- Modular RAG framework using LangChain + LlamaIndex.
- RAG pipelines with multi-source ingestion: web, PDFs, and SQL databases.
- Wikipedia, Arxiv, and Tavily tools for agent-based question answering.
- Used FAISS for vector storage and LLM-based reranking for response relevance.
- FastAPI backend and Streamlit frontend for interactive multi-modal user experience.

Cancer Detection on Whole Slide Images
Jul 2023 – Oct 2023
- Cancer detection and subtyping using whole-slide histopathology images.
- Attention-based heat maps to highlight tumor regions for interpretability and patch selection.
- Integrated patch-level annotations and center loss to improve classification accuracy and feature separation.
- Improved preprocessing with white-patch filtering and magnification normalization for consistent patch quality.
- Enhanced RCC subtype classification, achieving better AUC, bACC, and F1 than baseline models.

PUBLICATIONS
Deep Learning Approach for Renal Cell Carcinoma Detection
IEEE ICIP 2024
A deep learning method for detecting renal cell carcinoma using histopathological images.
View Document ↗

Optimization of Small Language Models for Embedded Voice Assistance
Master's Thesis, 2024
My Master's thesis focused on the use of small language models (SLMs) on edge devices.
View Document ↗
CONTACT
I’m always open to discussing new projects or opportunities. Let's connect!