One paper, 'What if Othello-Playing Language Models Could See?', was accepted to EMNLP 2025.
Presented RAVENEA, a large-scale benchmark for culture-aware multimodal retrieval-augmented visual understanding tasks.
Introduced ChatMotion, a multimodal multi-agent framework for human motion analysis.
Conducted a vector space alignment study to investigate whether vision and language models share concepts.
Launched FoodieQA, a fine-grained image-text dataset for Chinese food culture understanding.
Analyzed the robustness of the retrieval-augmented captioning model SmallCap and proposed methods to improve its performance.
Research Experience
Spent two wonderful years doing research at CoAStaL.
Education
Received a Master's degree in Computer Science from the University of Copenhagen, advised by Prof. Anders Søgaard; currently pursuing a PhD at the University of Copenhagen and the University of Cambridge, advised by Prof. Serge Belongie and Prof. Ivan Vulić, respectively.
Background
An ELLIS PhD student whose research lies at the intersection of natural language processing and computer vision, with a focus on drawing insights from human cognition. Enthusiastic about exploring language grounding in multimodal contexts and investigating the linguistic and cognitive characteristics of models.
Miscellany
Links: GitHub / BlueSky / Google Scholar / X / Email