🤖 AI Summary
This study addresses the challenge of automating narrative construction and cultural analysis for television archives. We propose a dynamic semantic retrieval framework: starting from ASR transcripts of 1,547 Italian television videos, we segment the text into sentence-level units and embed each unit to build a searchable repository of video fragments. By integrating large language models (LLMs) with retrieval-augmented generation (RAG), the method enables theme-driven, cross-temporal recombination of fragments, generating narrative montages that balance ironic juxtaposition with thematic coherence. Unlike conventional approaches that rely on static metadata, our framework is the first to deeply couple semantic cataloging, vector-based retrieval, and LLM-based generation for the recontextualization of broadcast archives. The project releases open-source datasets and code, offering a reusable methodological paradigm for media history, digital humanities, and AI-mediated cultural research.
📝 Abstract
This paper introduces AI Blob!, an experimental system designed to explore the potential of semantic cataloging and Large Language Models (LLMs) for the retrieval and recontextualization of archival television footage. Drawing methodological inspiration from Italian television programs such as Blob (RAI Tre, 1989-), AI Blob! integrates automatic speech recognition (ASR), semantic embeddings, and retrieval-augmented generation (RAG) to organize and reinterpret archival content. The system processes a curated dataset of 1,547 Italian television videos by transcribing the audio, segmenting the transcripts into sentence-level units, and embedding these segments into a vector database for semantic querying. When a user enters a thematic prompt, the LLM generates a range of linguistically and conceptually related queries that guide the retrieval and recombination of audiovisual fragments. These fragments are algorithmically selected and structured into narrative sequences, producing montages that emulate editorial practices of ironic juxtaposition and thematic coherence. By foregrounding dynamic, content-aware retrieval over static metadata schemas, AI Blob! demonstrates how semantic technologies can facilitate new approaches to archival engagement, enabling novel forms of automated narrative construction and cultural analysis. The project contributes to ongoing debates in media historiography and AI-driven archival research, offering both a conceptual framework and a publicly available dataset to support further interdisciplinary experimentation.
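The retrieval pipeline described in the abstract (sentence-level segmentation, embedding, query expansion, and similarity search) can be illustrated with a minimal sketch. This is not the authors' implementation: the bag-of-words `embed` function stands in for a real semantic embedding model, `expand_queries` is a hard-coded placeholder for the LLM step, an in-memory list replaces the vector database, and all names (`Fragment`, `build_index`, `retrieve`, the sample video identifiers) are hypothetical.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class Fragment:
    video_id: str  # hypothetical identifier of the source video
    index: int     # position of the sentence within its transcript
    text: str      # sentence-level transcript unit


def split_sentences(transcript: str) -> list[str]:
    """Naive segmentation: split after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", transcript.strip())
    return [p for p in parts if p]


def embed(text: str) -> Counter:
    """Toy bag-of-words vector (assumption: stands in for a semantic embedding)."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(transcripts: dict[str, str]) -> list[tuple[Fragment, Counter]]:
    """Segment every transcript and store one vector per sentence."""
    index = []
    for video_id, transcript in transcripts.items():
        for i, sentence in enumerate(split_sentences(transcript)):
            index.append((Fragment(video_id, i, sentence), embed(sentence)))
    return index


def expand_queries(theme: str) -> list[str]:
    """Placeholder for the LLM step that proposes related queries (hard-coded here)."""
    related = {"politics": ["election vote", "parliament debate"]}
    return [theme] + related.get(theme, [])


def retrieve(index, theme: str, top_k: int = 3) -> list[Fragment]:
    """Rank fragments by their best similarity to any expanded query."""
    query_vectors = [embed(q) for q in expand_queries(theme)]
    scored = []
    for fragment, vector in index:
        score = max(cosine(vector, qv) for qv in query_vectors)
        if score > 0:
            scored.append((score, fragment))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [fragment for _, fragment in scored[:top_k]]


# Usage with two invented transcripts (illustrative data only).
transcripts = {
    "news_01": "The parliament opened the debate today. The weather will be sunny.",
    "quiz_07": "Welcome to the show! Who will win the election vote tonight?",
}
index = build_index(transcripts)
for fragment in retrieve(index, "politics"):
    print(fragment.video_id, fragment.index, fragment.text)
```

Scoring each fragment against every expanded query, rather than only the original theme, is what lets fragments from unrelated programs surface for the same prompt. In a real system the ordering of the retrieved fragments into a montage would be a further step, which this sketch leaves out.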