Saeed Vahidian

Google Scholar ID: 8Jd1aUEAAAAJ
University of California, San Diego (UCSD), Duke University
Machine Learning Systems, Generative AI, Synthetic Data, Explainable AI
Citations & Impact
All-time
  • Citations: 856
  • H-index: 12
  • i10-index: 15
  • Publications: 20
  • Co-authors: 13
Resume (English only)
Academic Achievements
  • Jan 2025: One paper accepted at ICLR.
  • Oct 2024: Two papers submitted to ICLR.
  • Sep 2024: One paper submitted to Journal of Machine Learning Research (JMLR).
  • Jun 2024: Two papers accepted at CVPR.
  • Jun 2024: Chair at CVPR 2024; organized the 1st Workshop on Dataset Distillation for Computer Vision.
  • Jun 2024: Co-organized the 3rd FedVision Workshop at CVPR.
  • Feb 2024: One paper accepted at ECCV.
  • Research supported by major grants, including:
      - $20.39M AI Institute for Edge Computing Leveraging Next Generation Networks (Athena);
      - $600K National Science Foundation (NSF) grant.
  • Notable publications include:
      - 'CoreInfer: Accelerating Large Language Model Inference with Semantics-Inspired Adaptive Sparse Activation' (2024, arXiv);
      - 'Group Distributionally Robust Dataset Distillation with Risk Minimization' (ICLR 2025).
Research Experience
  • Postdoctoral Scholar, Duke University, Apr 2023 – Present.
  • PhD Student, UC San Diego, Sep 2018 – Apr 2023.
  • ML Researcher (Summer Intern), Qualcomm, Summer 2021.
  • Invited Collaboration on DDF project, NASA, 2022.
  • Research Collaboration, Stanford University, 2018–2019.
  • Research Collaboration, McGill University, 2014–2016.
Background
  • Currently a postdoctoral researcher at Duke University, working with Prof. Yiran Chen.
  • Research focuses on Multimodal Synthetic Data Generation (Vision ↔ Language ↔ Audio) to advance how foundation models learn across modalities.
  • Designs high-fidelity, controllable synthetic datasets to enable robust training pipelines for next-generation multimodal AI systems.
  • Work is built upon three core pillars: Controllability (fine-grained control over generated data), Explainability (understanding why and how synthetic data is created), and Robust Learning on Synthetic Data (enhancing model generalization and robustness).
  • Also explores Edge Intelligence for privacy-aware synthetic data generation and federated training under computational and privacy constraints.