Federico Cocchi

Google Scholar ID: BRG3e1EAAAAJ

PhD student, University of Modena and Reggio Emilia

Computer Vision

Citations & Impact

Citations (all-time): 346

Publications

4 items: three dated 2025 and one on arXiv.org (2024)

Resume (English only)

Academic Achievements

Paper 'Augmenting Multimodal LLMs with Self-Reflective Tokens for Knowledge-based Visual Question Answering' accepted at CVPR 2025
Paper 'The (R)Evolution of Multimodal Large Language Models: A Survey' accepted at ACL Findings 2024
Paper 'Wiki-LLaVA: Hierarchical Retrieval-Augmented Generation for Multimodal LLMs' accepted at CVPR Workshop 2024
Published research at top-tier venues including CVPR, ECCV, ACL, ICCV Workshop, and ICPR
Proposed the ReflectiVA model, which integrates external knowledge via reflective tokens for improved knowledge-based VQA
Developed Wiki-LLaVA, which uses a hierarchical retrieval pipeline to augment MLLMs with external knowledge without compromising standard benchmark performance

Background

Final-year PhD student in the Italian National PhD Program in Artificial Intelligence
Research focuses on Multimodal LLMs, especially at the intersection of vision and language
Aims to enhance the reasoning and understanding capabilities of these models
Explores post-training techniques to enrich models with retrieval and reranking using multimodal data
Uses HPC systems for multi-GPU/multi-node foundation model training through collaboration with CINECA
Research interests include Generative AI, Computer Vision, and Natural Language Processing

Co-authors

10 total