Nikki Lijing Kuang

Google Scholar ID: XYhmg74AAAAJ

University of California San Diego

Reinforcement Learning, Foundation Models, Bayesian Inference

Contact

Email: nikki.kuang@gmail.com

Publications

7 items

Residual Skill Optimization for Text-to-SQL Ensembles

2026

OLIVIA: Online Learning via Inference-time Action Adaptation for Decision Making in LLM ReAct Agents

2026

Skill-R1: Agent Skill Evolution via Reinforcement Learning

2026

Beyond Mode Elicitation: Diversity-Preserving Reinforcement Learning via Latent Diffusion Reasoner

2026

LaDiR: Latent Diffusion Enhances LLMs for Text Reasoning

2025

Almost Linear Convergence under Minimal Score Assumptions: Quantized Transition Diffusion

2025

Diffusion-BBO: Diffusion-Based Inverse Modeling for Online Black-Box Optimization

2024

Resume (English only)

Academic Achievements

Published multiple papers, including 'Towards Personalized Language Models via Inference-time Human Preference Optimization' (NeurIPS 2024 AFM), 'Diff-BBO: Diffusion-Based Inverse Modeling for Black-Box Optimization' (NeurIPS 2024 BDU), and 'Log-concave Sampling from a Convex Body with a Barrier: a Robust and Unified Dikin Walk' (NeurIPS 2024), among others. Has given invited talks at academic conferences.

Research Experience

Interned at IBM Research, Amazon, and Honda Research Institute, working on LLMs for personalization, RL for ranking and recommendation systems, and robotics. Experienced in fine-tuning LLMs and reward models, designing CoT prompting and reasoning frameworks, LLM decoding, and training R1-style reasoning LLMs with RL (e.g., PPO, GRPO).

Education

PhD candidate in Computer Science at UC San Diego (expected 2025), advised by Prof. Yian Ma; MSc in Computer Science from UC San Diego (2020).

Background

Primary research interests span reinforcement learning (RL), foundation models, and Bayesian inference, with a focus on fundamental challenges in sequential decision making under uncertainty. Recent work centers on LLM alignment and reasoning, in particular the role RL plays in both. The goal is to design practical, provably efficient algorithms with performance guarantees that deliver both statistical and computational benefits.

Miscellany

Received awards including the NSF AIVO Travel Grant. Served as a reviewer for international conferences (e.g., NeurIPS, AISTATS, AAAI, ICML, ICLR, ISIT) and journals (e.g., IEEE Transactions on Circuits and Systems for Video Technology, IEEE Transactions on Information Theory).

Co-authors

12 total