Grandes Modelos de Linguagem Multimodais (MLLMs): Da Teoria \`a Pr\'atica

📅 2026-02-11

📈 Citations: 0

✨ Influential: 0

📄 PDF

career value

210K/year

Technology Category

Application Category

📝 Abstract

Multimodal Large Language Models (MLLMs) combine the natural language understanding and generation capabilities of LLMs with perception skills in modalities such as image and audio, representing a key advancement in contemporary AI. This chapter presents the main fundamentals of MLLMs and emblematic models. Practical techniques for preprocessing, prompt engineering, and building multimodal pipelines with LangChain and LangGraph are also explored. For further practical study, supplementary material is publicly available online: https://github.com/neemiasbsilva/MLLMs-Teoria-e-Pratica. Finally, the chapter discusses the challenges and highlights promising trends.

Problem

Research questions and friction points this paper is trying to address.

Multimodal Large Language Models

natural language understanding

multimodal perception

AI integration

multimodal pipelines

Innovation

Methods, ideas, or system contributions that make the work stand out.

Multimodal Large Language Models

LangChain

LangGraph