3. BlockFFN: Towards End-Side Acceleration-Friendly Mixture-of-Experts with Chunk-Level Activation Sparsity, COLM 2025
4. Document Segmentation Matters for Retrieval-Augmented Generation, ACL 2025 Findings
5. MiniCPM4: Ultra-Efficient LLMs on End Devices, Preprint 2025
6. APB: Accelerating Distributed Long-Context Inference by Passing Compressed Context Blocks across GPUs, ACL 2025
7. Ultra-FineWeb: Efficient Data Filtering and Verification for High-Quality LLM Training Data, Preprint 2025
8. InfLLM: Training-Free Long-Context Extrapolation for LLMs with an Efficient Context Memory, NeurIPS 2024
9. Fine-Grained Legal Argument-Pair Extraction via Coarse-Grained Pre-training, COLING 2024
10. Exploring the Benefit of Activation Sparsity in Pre-training, ICML 2024
11. Configurable Foundation Models: Building LLMs from a Modular Perspective, Preprint 2024
12. Enhancing Legal Case Retrieval via Scaling High-Quality Synthetic Query-Candidate Pairs, EMNLP 2024
Research Experience
Postdoctoral Researcher, Natural Language Processing Lab, Department of Computer Science and Technology, Tsinghua University.
Education
Ph.D., Department of Computer Science and Technology, Tsinghua University. Advisors: Professor Maosong Sun and Professor Zhiyuan Liu.
B.S., Department of Computer Science and Technology, Tsinghua University.
Background
Research Interests: the intersection of natural language processing and large language models. Received doctoral and bachelor's degrees from Tsinghua University.