Xin-Qiang Cai

Google Scholar ID: rtMUMooAAAAJ

RIKEN Center for Advanced Intelligence Project

Machine Learning, Reinforcement Learning, Imitation Learning

Citations & Impact

All-time

Citations

152

H-index

8

i10-index

7

Publications

20

Co-authors

0

Publications

7 items

VI-CuRL: Stabilizing Verifier-Independent RL Reasoning via Confidence-Guided Variance Reduction

2026

Cited

0

Positive-Unlabeled Reinforcement Learning Distillation for On-Premise Small Models

2026

Cited

0

Rethinking Toxicity Evaluation in Large Language Models: A Multi-Label Perspective

2025

Cited

0

Reinforcement Learning with Verifiable yet Noisy Rewards under Imperfect Verifiers

2025

Cited

0

PIG-Nav: Key Insights for Pretrained Image Goal Navigation Models

2025

Cited

0

UC-MOA: Utility-Conditioned Multi-Objective Alignment for Distributional Pareto-Optimality

2025

Cited

0

Offline Reinforcement Learning with Domain-Unlabeled Data

2024

Cited

0

Resume (English only)

Background

Research Interests: Reward Modeling, Reinforcement Learning & Imitation Learning, Learning with Weak Supervision.

Biography: Currently a postdoctoral researcher in the Imperfect Information Learning Team at RIKEN Center for Advanced Intelligence Project (AIP), led by Professor Masashi Sugiyama.

Miscellany

Recent activities include attending two joint workshops: one between the University of Melbourne and RIKEN-AIP, held in Melbourne (July 3–4, 2025), and one between the University of Sydney and RIKEN-AIP, held in Sydney (July 7–8, 2025).

Co-authors

0 total