Federico Cocchi

Google Scholar ID: BRG3e1EAAAAJ
PhD student, University of Modena and Reggio Emilia
Computer Vision
Citations & Impact
  • Citations (all-time): 346
  • H-index: 7
  • i10-index: 7
  • Publications: 13
  • Co-authors: 10
Resume (English only)
Academic Achievements
  • Paper 'Augmenting Multimodal LLMs with Self-Reflective Tokens for Knowledge-based Visual Question Answering' accepted at CVPR 2025
  • Paper 'The (R)Evolution of Multimodal Large Language Models: A Survey' accepted at ACL Findings 2024
  • Paper 'Wiki-LLaVA: Hierarchical Retrieval-Augmented Generation for Multimodal LLMs' accepted at CVPR Workshop 2024
  • Published research at top-tier venues, including CVPR, ECCV, ACL, ICCV Workshop, and ICPR
  • Proposed the ReflectiVA model, which integrates external knowledge via reflective tokens to improve knowledge-based VQA
  • Developed Wiki-LLaVA, which uses a hierarchical retrieval pipeline to augment MLLMs with external knowledge without compromising standard benchmark performance
Background
  • Final-year PhD student in the Italian National PhD Program in Artificial Intelligence
  • Research focuses on Multimodal LLMs, especially at the intersection of vision and language
  • Aims to enhance the reasoning and understanding capabilities of these models
  • Explores post-training techniques to enrich models with retrieval and reranking using multimodal data
  • Uses HPC systems for multi-GPU, multi-node training of foundation models through a collaboration with CINECA
  • Research interests include Generative AI, Computer Vision, and Natural Language Processing