Taehyun Cho

Google Scholar ID: kVi85ZgAAAAJ
Seoul National University
Reinforcement Learning
Citations & Impact (all-time)
  • Citations: 51
  • H-index: 5
  • i10-index: 2
  • Publications: 14
  • Co-authors: 0
Resume (English only)
Academic Achievements
  • Publications:
    - 'Policy-labeled Preference Learning: Is Preference Enough for RLHF?' (ICML 2025)
    - 'Bellman Unbiasedness: Toward Provably Efficient Distributional Reinforcement Learning with General Value Function Approximation' (ICML 2025)
    - 'Spectral-Risk Safe Reinforcement Learning with Convergence Guarantees' (NeurIPS 2024)
    - 'Pitfall of Optimism: Distributional Reinforcement Learning by Randomizing Risk Criterion' (NeurIPS 2023)
    - 'SPQR: Controlling Q-ensemble Independence with Spiked Random Model for Reinforcement Learning' (NeurIPS 2023)
    - 'On the Convergence of Continual Learning with Adaptive Methods' (UAI 2023)
    - 'Adaptive Methods for Nonconvex Continual Learning' (NeurIPS 2022)
    - 'Perturbed Quantile Regression for Distributional Reinforcement Learning' (NeurIPS 2022)
    - 'Chebyshev Polynomial Codes: Task Entanglement-based Coding for Distributed Matrix Multiplication' (ICML 2021)
    - 'Optimized Shallow Neural Networks for Sum-Rate Maximization in Energy Harvesting Downlink Multiuser NOMA Systems' (IEEE Journal on Selected Areas in Communications)
    - 'An Efficient Neural Network Architecture for Rate Maximization in Energy Harvesting Downlink Channels' (2020 IEEE International Symposium on Information Theory (ISIT))
Research Experience
  • Research Projects:
    - Preprints: 'An Axiomatization of Process Score Model', etc.
    - Work in progress: 'Off-policy Direct Preference Optimization with Monotonic Improvement Guarantee', etc.
    - Work in progress: 'Policy Optimization with Process Regret Model', etc.
  • Position: Ph.D. Candidate
Education
  • Ph.D. Candidate (final year), Electrical and Computer Engineering, Seoul National University (advisor: Jungwoo Lee)
  • Bachelor's degree in Mathematics, Korea University
Background
  • Research Interests: Sequential decision-making under uncertainty, particularly in the context of human feedback
  • Specialization: Distributional reinforcement learning (DistRL), reinforcement learning from human feedback (RLHF), and regret analysis
  • Goal: To develop mathematical models and optimization methods for human-in-the-loop systems, uncovering both theoretical insights and practical algorithms for robust decision-making
  • Current Interests: Reasoning LLM agents and regret-based decision theory
Miscellany
  • Currently seeking postdoctoral opportunities in the theoretical foundations of reinforcement learning or reasoning-LLM research