Bukhari's dissertation defense Friday, April 17

Moomal Bukhari's Ph.D. dissertation defense will be Friday, April 17 at 3 p.m. in 19 Avery Hall and via Zoom.

Ph.D. Dissertation Defense: Moomal Bukhari
Friday, April 17
3 p.m.
19 Avery Hall
Zoom: https://unl.zoom.us/j/99518165233

"When AI Turns Adversary: Privacy of Machine Learning Models"

"Machine learning (ML) models are increasingly deployed across industries and are often publicly shared to promote transparency, reproducibility, and reuse. Yet releasing or exposing trained models, whether via repositories, APIs, or downstream integrations, raises significant privacy risks: adversaries can probe models to extract aggregate characteristics of their training data, infer membership, or recover sensitive attributes, even without direct access to raw records. These risks become especially consequential in high-stakes domains such as e-health, where models trained on electronic health records (EHRs) or clinical registries can inadvertently reveal population-level properties or sensitive cohort attributes. As model sharing and reuse proliferate, the probability of accidental or intentional exposure correspondingly increases, underscoring the need for rigorous evaluation of property-level privacy leakage.

Given these concerns, we focus on property inference (PI) attacks, which infer global properties of a model’s training data from the released model itself and thereby pose a significant risk to data secrecy and user privacy. The first part of this dissertation presents a representation-learning–induced PI attack that uses a Variational Auto-Encoder (VAE) to learn complex data distributions from shadow models and decide whether a target model’s training set exhibits a property P.
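The shadow-model pipeline described here can be sketched in a few lines. The sketch below is illustrative only: a linear autoencoder (PCA) stands in for the VAE, the shadow-model features are synthetic Gaussian vectors, and the helper names (`shadow_features`, `fit_linear_ae`, `infer_property`) and the two-sigma threshold are hypothetical choices, not the dissertation's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def shadow_features(has_property, n_models=40, dim=16):
    """Stand-in for feature vectors extracted from shadow models.
    Models trained on data exhibiting property P cluster around a
    shifted mean (a synthetic assumption for this sketch)."""
    shift = 1.0 if has_property else 0.0
    return rng.normal(loc=shift, scale=0.5, size=(n_models, dim))

def fit_linear_ae(X, k=4):
    """Fit a k-dimensional linear autoencoder via PCA (SVD)."""
    mu = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mu, full_matrices=False)
    return mu, Vt[:k]                  # mean and latent directions

def recon_error(x, mu, comps):
    """Reconstruction MSE of one feature vector under the autoencoder."""
    z = comps @ (x - mu)               # encode
    xhat = mu + comps.T @ z            # decode
    return float(np.mean((x - xhat) ** 2))

# Fit the autoencoder on shadow models whose data HAS property P.
X_p = shadow_features(True)
mu, comps = fit_linear_ae(X_p)

# Calibrate a decision threshold on held-out shadow models with P.
cal = np.array([recon_error(x, mu, comps) for x in shadow_features(True)])
threshold = cal.mean() + 2 * cal.std()

def infer_property(target_vec):
    """Decide whether the target model's training set exhibits P:
    low reconstruction error means the target looks like the P shadows."""
    return recon_error(target_vec, mu, comps) <= threshold
```

The core idea carries over to the full method: models trained on data with property P reconstruct well under an autoencoder fit on such shadow models, while models without the property incur a noticeably higher reconstruction error, and no labeled shadow data is needed to make the decision.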

Evaluated on a non-health benchmark (US Census) and multiple e-health datasets (e.g., Framingham, Sepsis, CDC Diabetes), the method attains strong accuracy (up to 94.29%) while using fewer shadow models, and it operates entirely on unlabeled data, unlike state-of-the-art meta-classifiers that rely on labeled shadow data, yielding stable performance across thresholds and reduced data dependence. A comparison with popular meta-classifier-based property inference attacks shows that the proposed attack not only achieves a higher success rate but does so with half the training data and fewer shadow models.

Building on the first objective, the second part of this dissertation integrates ensemble learning to further amplify attack reliability. A VAE first produces a property decision and a reconstruction-error confidence signal (MSE); these signals are then fused by a Random Forest Classifier (RFC) to refine the final PI prediction. Tested across tabular datasets (US Census and multiple healthcare datasets) and an image dataset (CelebA), the VAE-RFC hybrid consistently improves over the VAE alone, reaching an accuracy of up to 99.1%, and remains robust against recent Property Unlearning defenses that degrade many conventional meta-classifiers to near-random guessing (~50%). These findings underscore the effectiveness of coupling representation learning with ensemble learning for stronger and more reliable PI attacks across both tabular and image domains.
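The fusion step can be sketched with synthetic signals. Below, bootstrap-aggregated decision stumps serve as a toy stand-in for the Random Forest Classifier; the feature values, labels, and helper names (`fit_stump`, `forest_fit`, `forest_predict`) are all hypothetical, not the dissertation's code.

```python
import numpy as np

rng = np.random.default_rng(1)

# Two fused signals per target model (synthetic example values):
#   column 0: VAE property decision (0/1), column 1: reconstruction MSE.
# Label 1 = training data exhibits property P.
X = np.array([[1, 0.10], [1, 0.15], [0, 0.12], [1, 0.80],
              [0, 0.85], [0, 0.90], [1, 0.20], [0, 0.95]])
y = np.array([1, 1, 1, 1, 0, 0, 1, 0])

def fit_stump(X, y):
    """Best single-feature threshold split by misclassification error."""
    best = None
    for f in range(X.shape[1]):
        for t in np.unique(X[:, f]):
            for sign in (1, -1):
                pred = (sign * (X[:, f] - t) <= 0).astype(int)
                err = np.mean(pred != y)
                if best is None or err < best[0]:
                    best = (err, f, t, sign)
    return best[1:]                       # (feature, threshold, sign)

def forest_fit(X, y, n_trees=25):
    """Bootstrap-aggregated stumps: a toy stand-in for the RFC fuser."""
    stumps = []
    for _ in range(n_trees):
        idx = rng.integers(0, len(X), len(X))   # bootstrap resample
        stumps.append(fit_stump(X[idx], y[idx]))
    return stumps

def forest_predict(stumps, x):
    """Majority vote over the ensemble's stump decisions."""
    votes = [int(s * (x[f] - t) <= 0) for f, t, s in stumps]
    return int(np.mean(votes) >= 0.5)

model = forest_fit(X, y)
```

The design point this illustrates: the ensemble can exploit both signals jointly, so a confident VAE decision can override a borderline MSE value (and vice versa), which is where the hybrid gains its extra reliability over the VAE alone.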

Extending beyond privacy leakage in shared models, the third part of this dissertation investigates the adversarial security of modern computer vision pipelines, particularly multi-object tracking (MOT) systems widely used in surveillance, autonomous systems, and safety-critical monitoring. This study introduces a motion-aware adversarial attack targeting YOLO-based detection-and-tracking pipelines by exploiting both the detector’s non-maximum suppression mechanism and the tracker’s Kalman filter-based motion prediction. Unlike prior attacks that focus primarily on single-frame detection errors, the proposed method propagates adversarial perturbations across consecutive frames so that false detections evolve into temporally coherent phantom trajectories. As a result, the tracker is misled into persistently tracking non-existent objects over time.
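Why temporally coherent phantom detections are so effective can be seen in a minimal 1-D constant-velocity Kalman filter (a toy sketch; the matrices and noise values below are illustrative choices, not the actual BoT-SORT or ByteTrack internals): once fake detections are consistent with the motion model, the filter locks onto them and starts predicting the phantom's next position on its own.

```python
import numpy as np

# Constant-velocity Kalman filter in 1-D: state = [position, velocity].
F = np.array([[1.0, 1.0], [0.0, 1.0]])   # motion model (dt = 1)
H = np.array([[1.0, 0.0]])               # only position is observed
Q = np.eye(2) * 1e-3                     # process noise (illustrative)
R = np.array([[1e-1]])                   # measurement noise (illustrative)

def kf_step(x, P, z):
    """One predict/update cycle; z is the (possibly fake) detection."""
    # Predict
    x = F @ x
    P = F @ P @ F.T + Q
    # Update
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)
    x = x + K @ (np.array([z]) - H @ x)
    P = (np.eye(2) - K @ H) @ P
    return x, P

# Feed the tracker a phantom object moving at a steady 2 px/frame.
x, P = np.array([0.0, 0.0]), np.eye(2)
for t in range(1, 30):
    x, P = kf_step(x, P, z=2.0 * t)

# After enough coherent fake detections the filter "believes" the motion:
# its velocity estimate sits near 2 and its next prediction lands exactly
# where the attacker will place the next perturbation.
predicted_next = (F @ x)[0]
```

This is the attack surface in miniature: the filter's own prediction keeps the phantom track alive and tells the adversary where the next perturbation should go, so the perturbations and the motion model reinforce each other across frames.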

Instantiated on YOLOv8 and YOLOX with BoT-SORT and ByteTrack, and evaluated on the MOT17 benchmark, the proposed attack significantly degrades tracking performance by increasing false positives and false negatives while sharply reducing key tracking metrics such as IDF1 and MOTA. For example, in the YOLOX + BoT-SORT setting, IDF1 drops from 0.560 to 0.0901 and MOTA falls from 0.708 to -0.449, an absolute decrease of 1.157 (115.7 percentage points), while mostly tracked trajectories collapse from 371 to 7. These results demonstrate that our attack drives tracking performance into the negative-MOTA regime, rendering trackers effectively unusable. Similarly, in YOLOX + ByteTrack, MOTA decreases from 0.805 to -0.287 and mostly tracked trajectories drop from 322 to 4. These findings reveal that motion prediction, although intended to improve temporal consistency and robustness, can itself become an exploitable attack surface.

Together, the three parts of this dissertation present a broader view of machine learning security: trained models are vulnerable not only to privacy leakage through property inference, but also to adversarial manipulation in deployed decision-making pipelines. Overall, this work highlights pressing risks associated with the public release, reuse, and operational deployment of ML systems, especially in high-stakes domains such as e-health and safety-critical vision applications."

Committee:
Dr. Muhammad Naveed Aman, Advisor
Dr. Rahul Purandare
Dr. Bhuvana Gopal
Dr. Aemal Khattak