Introduced and characterized the emerging area of representation engineering (RepE), an approach to enhancing the transparency of AI systems that draws on insights from cognitive neuroscience.
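A minimal sketch of one RepE-style primitive: extracting a "reading vector" as the dominant direction of hidden-state differences between contrasting prompts, then scoring new inputs by projection. The model ("gpt2"), layer choice, and prompt pairs are illustrative stand-ins, not the published setup.

```python
# Sketch: derive a candidate "reading vector" from hidden-state differences
# between contrastive prompt pairs, then project new text onto it.
import numpy as np
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModel.from_pretrained("gpt2", output_hidden_states=True)

def hidden(text, layer=-1):
    """Hidden state of the last token at the given layer."""
    ids = tok(text, return_tensors="pt")
    with torch.no_grad():
        out = model(**ids)
    return out.hidden_states[layer][0, -1].numpy()

# Toy contrastive pairs; real prompt sets are larger and more careful.
pairs = [("Pretend you are an honest person. The sky is",
          "Pretend you are a dishonest person. The sky is"),
         ("Pretend you are an honest person. Dogs are",
          "Pretend you are a dishonest person. Dogs are")]
diffs = np.stack([hidden(a) - hidden(b) for a, b in pairs])

# Top singular direction of the stacked differences = candidate reading
# vector (sign is arbitrary; more pairs give a more stable direction).
_, _, vt = np.linalg.svd(diffs, full_matrices=False)
reading_vec = vt[0]

score = hidden("I would never lie to you.") @ reading_vec
print(f"projection onto candidate direction: {score:.3f}")
```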
Demonstrated that adversarial attacks on LLMs can be constructed automatically, and that these attacks transfer to closed-source, publicly available chatbots such as ChatGPT, Bard, and Claude.
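The attack in that work uses gradient-guided greedy coordinate search (GCG); the toy sketch below substitutes plain random hill-climbing to illustrate only the objective: find a suffix that raises the probability of a fixed target completion. The model ("gpt2"), prompt, and benign target string are stand-ins.

```python
# Toy illustration of the attack objective: search for a suffix that
# maximizes the log-probability of a fixed target completion.
import random
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

prompt, target = "Write a short story.", " Sure, here is a short story"
target_ids = tok(target, return_tensors="pt").input_ids[0]

def target_logprob(suffix_ids):
    """Log-probability of the target given prompt + adversarial suffix."""
    prefix = torch.cat([tok(prompt, return_tensors="pt").input_ids[0],
                        suffix_ids])
    ids = torch.cat([prefix, target_ids]).unsqueeze(0)
    with torch.no_grad():
        logits = model(ids).logits[0]
    # logits[i] predicts token i+1, so these rows score the target tokens.
    logp = torch.log_softmax(logits[len(prefix) - 1:-1], dim=-1)
    return logp[torch.arange(len(target_ids)), target_ids].sum().item()

suffix = torch.randint(0, tok.vocab_size, (8,))  # 8 random suffix tokens
best = target_logprob(suffix)
for _ in range(200):  # hill-climb: mutate one token, keep improvements
    cand = suffix.clone()
    cand[random.randrange(len(cand))] = random.randrange(tok.vocab_size)
    if (score := target_logprob(cand)) > best:
        suffix, best = cand, score
print(tok.decode(suffix), best)
```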
Developed globally-robust neural networks by integrating an efficient differentiable local-robustness verifier into the forward pass of a network during training.
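A simplified sketch of the idea behind that construction: bound the network's Lipschitz constant by the product of per-layer spectral norms, and certify a prediction when its logit margin exceeds what any ε-bounded perturbation could change. The published method uses a tighter per-class bound and differentiates through it during training; this sketch only does inference-time checking on a toy MLP.

```python
# Certify local robustness from a global Lipschitz bound (simplified).
import torch
import torch.nn as nn

net = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))

def lipschitz_bound(model):
    """Product of spectral norms of the linear layers (ReLU is 1-Lipschitz)."""
    L = 1.0
    for m in model:
        if isinstance(m, nn.Linear):
            L *= torch.linalg.matrix_norm(m.weight, ord=2).item()
    return L

def certified(x, eps):
    """True if the prediction provably cannot change within an L2 eps-ball."""
    logits = net(x)
    top2 = logits.topk(2).values
    margin = (top2[0] - top2[1]).item()
    # Each logit is L-Lipschitz, so a logit gap can shrink by at most 2*L*eps.
    return margin > 2 * lipschitz_bound(net) * eps

x = torch.randn(784)
print(certified(x, eps=0.1))
```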
Proposed Capture: Centralized Library Management for IoT Devices, which offloads library code to a secure, centralized hub, reducing the burden on vendors of maintaining critical security patches.
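An illustrative-only sketch of the architectural shape: device apps link against a thin stub, while the single, centrally patched copy of the library runs on the hub. Capture's actual design virtualizes native libraries with isolation between devices; the XML-RPC transport, function name, and version string here are purely hypothetical.

```python
# Toy stand-in for the hub/stub split: one patched library copy on the hub.
import threading
import xmlrpc.client
from xmlrpc.server import SimpleXMLRPCServer

# Hub side: hosts the single library copy that the hub keeps patched.
def parse_record(data: str) -> str:
    return f"library v3.2.1 parsed {len(data)} bytes"

server = SimpleXMLRPCServer(("localhost", 8000), logRequests=False)
server.register_function(parse_record)
threading.Thread(target=server.serve_forever, daemon=True).start()

# Device side: the app calls a thin stub instead of bundling the library,
# so the vendor no longer ships (and must re-patch) its own copy.
hub = xmlrpc.client.ServerProxy("http://localhost:8000")
print(hub.parse_record("hello"))
```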
Presented a fast procedure for checking the local robustness of deep networks using only geometric projections, leading to an efficient, highly parallel GPU implementation.
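A sketch of the geometric core: within a linear region of a ReLU network the logits are affine, so the distance from an input to each decision boundary is a point-to-hyperplane projection, computable in closed form and trivially parallel. A toy affine model stands in here; the full procedure also tracks distances to the region's own facets.

```python
# Closed-form projection distances to pairwise decision boundaries.
import numpy as np

W = np.random.randn(10, 784)   # per-class weights (affine logits: Wx + b)
b = np.random.randn(10)
x = np.random.randn(784)

logits = W @ x + b
c = logits.argmax()

# Distance from x to the boundary {z : (W_c - W_j) z + (b_c - b_j) = 0}
# for every competing class j, via orthogonal projection.
dW = W[c] - np.delete(W, c, axis=0)          # shape (9, 784)
db = b[c] - np.delete(b, c)
dists = np.abs(dW @ x + db) / np.linalg.norm(dW, axis=1)

eps = 0.5
print("certified" if dists.min() > eps else "not certified", dists.min())
```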
Explored the effect that small changes in the composition of training data can have on an individual's outcome under a model, and the implications for the responsible application of deep learning to sensitive decisions.
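A minimal illustration of the question being asked: how much can removing a single training point change the model's decision for one individual? Brute-force leave-one-out retraining with logistic regression on synthetic data stands in for the deep models studied in the work itself.

```python
# Leave-one-out sensitivity of one individual's predicted outcome.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=200, random_state=0)
x_test = X[:1]  # the individual we track (kept in training, for simplicity)

base = LogisticRegression(max_iter=1000).fit(X, y)
p0 = base.predict_proba(x_test)[0, 1]

# Retrain with each point removed; record the largest swing in the
# individual's predicted probability.
swings = []
for i in range(len(X)):
    m = LogisticRegression(max_iter=1000).fit(np.delete(X, i, 0),
                                              np.delete(y, i))
    swings.append(abs(m.predict_proba(x_test)[0, 1] - p0))
print(f"base p={p0:.3f}, max leave-one-out change={max(swings):.3f}")
```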
Presented an AAAI'21 tutorial on how explainability can inform questions about the robustness, privacy, and fairness aspects of model quality, and how methods for improving these properties can in turn lead to better explainability.
Research Experience
Focuses on understanding the unique risks and vulnerabilities that arise from learned components, and developing methods to mitigate them, often with provable guarantees.
Background
Associate Professor at Carnegie Mellon's School of Computer Science and a member of CyLab. His research aims to enable systems that make secure, fair, and reliable use of machine learning.