Grandes Modelos de Linguagem Multimodais (MLLMs): Da Teoria à Prática

📅 2026-02-11

📝 Abstract
Multimodal Large Language Models (MLLMs) combine the natural language understanding and generation capabilities of LLMs with perception in modalities such as image and audio, a key advance in contemporary AI. This chapter presents the fundamentals of MLLMs and surveys emblematic models. It also explores practical techniques for preprocessing, prompt engineering, and building multimodal pipelines with LangChain and LangGraph. Supplementary material for further hands-on study is publicly available at https://github.com/neemiasbsilva/MLLMs-Teoria-e-Pratica. Finally, the chapter discusses challenges and highlights promising trends.
Problem


Multimodal Large Language Models
natural language understanding
multimodal perception
AI integration
multimodal pipelines
Innovation


Multimodal Large Language Models
LangChain
LangGraph
prompt engineering
multimodal pipelines
Neemias da Silva
Universidade Tecnológica Federal do Paraná (UTFPR) - Curitiba, Brasil
Júlio C. W. Scholz
Universidade Tecnológica Federal do Paraná (UTFPR) - Curitiba, Brasil
John Harrison
Universidade Tecnológica Federal do Paraná (UTFPR) - Curitiba, Brasil
Marina Borges
Universidade Tecnológica Federal do Paraná (UTFPR) - Curitiba, Brasil
Paulo Ávila
Universidade Tecnológica Federal do Paraná (UTFPR) - Curitiba, Brasil
Frances A Santos
Universidade Estadual de Campinas (UNICAMP), Campinas, Brasil
Myriam Delgado
Federal University of Technology of Paraná
Natural Computing, Computational Intelligence, Fuzzy Systems, Optimization
Rodrigo Minetto
Federal University of Technology of Paraná, DAINF, Curitiba, Brazil
Image Processing and Computer Vision
Thiago H Silva
University of Toronto, Toronto, Canada