🤖 AI Summary
This study addresses the challenge of integrating and analyzing heterogeneous multimodal data (video, audio, notes, and gaze) from collaborative design workshops, where collaboration mechanisms and decision-making processes often remain opaque. We propose a practical, modular, and adaptable framework covering workshop setup, multimodal data acquisition, AI-based artifact extraction, and visual analysis, implemented in an interactive visual analysis system named reCAPit. The system combines a multimodal streamgraph of activity and attention, temporally aligned topic cards that summarize participants' discussions, and drill-down access to the raw data. We conducted six workshops across different themes and examine two of them in detail as case studies: social science research on urban planning and a design study on band-practice visualization. We also report considerations for planning workshops and challenges drawn from our own experience and from interviews with workshop experts, with the aim of making workshop results more data-rich and their dissemination more transparent.
📝 Abstract
An essential task in analyzing collaborative design processes, such as workshops in design studies, is identifying design outcomes and understanding how collaboration between participants shaped the results and led to decisions. However, findings are typically restricted to a consolidated textual form based on notes from interviews or observations. Integrating different sources of observations is challenging because the collected data are both voluminous and heterogeneous. To address this challenge, we propose a practical, modular, and adaptable framework comprising workshop setup, multimodal data acquisition, AI-based artifact extraction, and visual analysis. Our interactive visual analysis system, reCAPit, allows the flexible combination of different modalities, including video, audio, notes, and gaze, to analyze and communicate important workshop findings. A multimodal streamgraph displays activity and attention in the working area, temporally aligned topic cards summarize participants' discussions, and drill-down techniques allow inspecting the raw data of the included sources. As part of our research, we conducted six workshops across different themes, ranging from social science research on urban planning to a design study on band-practice visualization. These two workshops are examined in detail and described as case studies. Further, we present considerations for planning workshops and challenges that we derive from our own experience and from interviews we conducted with workshop experts. Our research extends the existing methodology of collaborative design workshops by promoting data-rich acquisition of multimodal observations, combined AI-based extraction and interactive visual analysis, and transparent dissemination of results.