Fengshuo Bai

Google Scholar ID: rzt0quQAAAAJ

Shanghai Jiao Tong University

Embodied AI, AI Alignment, Reinforcement Learning, Preference-based Learning

Citations & Impact

All-time citations: 187

Publications

9 items

ToolPRM: Fine-Grained Inference Scaling of Structured Outputs for Function Calling

2025

DexFlyWheel: A Scalable and Self-improving Data Generation Framework for Dexterous Manipulation

2025

AdaptFlow: Adaptive Workflow Optimization via Meta-Learning

2025

A Survey on Vision-Language-Action Models: An Action Tokenization Perspective

2025

EuroCon: Benchmarking Parliament Deliberation for Political Consensus Finding

2025

Amulet: ReAlignment During Test Time for Personalized Preference Adaptation of LLMs

2025

Retrieval Dexterity: Efficient Object Retrieval in Clutters with Dexterous Hand

2025

β-DQN: Improving Deep Q-Learning By Evolving the Behavior

2025

Resume

Academic Achievements

1. Paper 'STAR: Efficient Preference-based Reinforcement Learning via Dual Regularization' accepted to NeurIPS 2025.
2. Paper 'DexFlyWheel: A Scalable and Self-improving Data Generation Framework for Dexterous Manipulation' accepted to NeurIPS 2025 and selected as Spotlight.
3. Paper 'ValueDCG: Measuring Comprehensive Human Value Understanding Ability of Language Models' accepted to NeurIPS 2025 Workshop on Regulatable ML (RegML).
4. Paper 'AdaptFlow: Adaptive Workflow Optimization via Meta-Learning' accepted to EMNLP 2025.
5. Released survey paper 'A Survey on Vision-Language-Action Models: An Action Tokenization Perspective'.
6. Paper 'Roadmap on Incentive Compatibility for AI Alignment and Governance in Sociotechnical Systems' accepted to AGI 2025 and selected as Oral.
7. Released paper 'Retrieval Dexterity: Efficient Object Retrieval in Clutters with Dexterous Hand'.
8. Paper 'Amulet: ReAlignment During Test Time for Personalized Preference Adaptation of LLMs' accepted to ICLR 2025.
9. Paper 'GRAIT: Gradient-Driven Refusal-Aware Instruction Tuning for Effective Hallucination Mitigation' accepted to NAACL 2025.
10. Paper 'β-DQN: Improving Deep Q-Learning By Evolving the Behavior' accepted to ICML 2025 and selected as Oral.

Research Experience

Has contributed to multiple research projects, with papers accepted in 2025 at conferences including NeurIPS, ICML, ICLR, EMNLP, and NAACL.

Education

Ph.D. student in the Department of Computer Science and Engineering, Shanghai Jiao Tong University, co-advised by Prof. Yaodong Yang and Prof. Ying Wen. Selected into the Wen-Tsun Wu AI Honorary Doctoral Class in 2023, advised by Prof. Cewu Lu.

Background

Currently a Ph.D. student in the Department of Computer Science and Engineering, Shanghai Jiao Tong University, and a member of PAIR-Lab. Research interests include dexterous manipulation, preference-based reinforcement learning, and AI alignment.

Co-authors

7 total