🤖 AI Summary
This study investigates how speaking time is shared in video-mediated conversations, focusing on turn-level temporal equity and the interactional dynamics that produce it. We propose the first taxonomy of speaking-time sharing dynamics and develop a computational analytical framework to model multi-scale temporal distributions (such as turn-taking rhythm, response latency, and dominance-switch frequency) using a large corpus of video dialogues between strangers. Our findings reveal: (1) participants prefer conversations with more equal aggregate speaking time, and this preference is strongest among less-dominant speakers; and (2) even when macro-level equity is matched, distinct micro-level dynamic patterns yield markedly different subjective experiences. These results advance theoretical understanding of interpersonal coordination and provide empirical foundations for designing fairer, more natural human–computer dialogue systems.
📝 Abstract
An intrinsic aspect of every conversation is the way talk-time is shared between multiple speakers. Conversations can be balanced, with each speaker claiming a similar amount of talk-time, or imbalanced, with one speaker talking disproportionately more. Such overall distributions are the consequence of continuous negotiations between the speakers throughout the conversation: who should be talking at every point in time, and for how long?
In this work we introduce a computational framework for quantifying both the conversation-level distribution of talk-time between speakers and the lower-level dynamics that lead to it. We derive a typology of talk-time sharing dynamics structured by several intuitive axes of variation. By applying this framework to a large dataset of video-chats between strangers, we confirm that, perhaps unsurprisingly, different conversation-level distributions of talk-time are perceived differently by speakers, with balanced conversations being preferred over imbalanced ones, especially by those who end up talking less. We then show that, even when they lead to the same level of overall balance, different types of talk-time sharing dynamics are perceived differently by the participants, highlighting the relevance of our newly introduced typology. Finally, we discuss how our framework offers new tools to designers of computer-mediated communication platforms, for both human-human and human-AI communication.
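To make the distinction between conversation-level balance and lower-level dynamics concrete, here is a minimal Python sketch. It is an illustration only: the abstract does not specify the paper's measures, so the input format (a list of `(speaker, start_sec, end_sec)` tuples), the function names `talk_time_shares` and `window_leaders`, and the fixed-window notion of "who leads" are all assumptions made for this example.

```python
from collections import defaultdict


def talk_time_shares(utterances):
    """Return each speaker's fraction of the total talk-time.

    `utterances` is a list of (speaker, start_sec, end_sec) tuples.
    This input format is an assumption made for illustration.
    """
    totals = defaultdict(float)
    for speaker, start, end in utterances:
        totals[speaker] += end - start
    grand_total = sum(totals.values())
    return {speaker: t / grand_total for speaker, t in totals.items()}


def window_leaders(utterances, window_sec, duration_sec):
    """Return, for each fixed-length window, the speaker who talked most in it.

    A window in which nobody speaks yields None. Ties go to the speaker
    encountered first. Fixed windows are a hypothetical stand-in for
    whatever lower-level dynamics the framework actually measures.
    """
    leaders = []
    start = 0.0
    while start < duration_sec:
        end = min(start + window_sec, duration_sec)
        totals = defaultdict(float)
        for speaker, u_start, u_end in utterances:
            # Portion of this utterance that falls inside the window.
            overlap = min(u_end, end) - max(u_start, start)
            if overlap > 0:
                totals[speaker] += overlap
        leaders.append(max(totals, key=totals.get) if totals else None)
        start = end
    return leaders


# A made-up two-minute conversation: balanced overall, but each minute
# is dominated by a different speaker.
example = [
    ("A", 0, 30),
    ("B", 30, 40),
    ("A", 40, 60),
    ("B", 60, 110),
    ("A", 110, 120),
]

shares = talk_time_shares(example)
leaders = window_leaders(example, window_sec=60, duration_sec=120)
print(shares)
print(leaders)
```

In this made-up example both speakers hold an equal share of the total talk-time, yet the per-window view shows that one speaker leads the first minute and the other leads the second. That is the kind of difference a conversation-level measure alone cannot capture.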