🤖 AI Summary
Current socially intelligent agent (SIA) frameworks lack user-centered design, real-time responsiveness, and robust multimodal coordination. These gaps hinder the development of human-aligned SIAs.
Method: We propose Estuary, an open-source, low-latency, multimodal framework for building real-time socially interactive agents. For the first time, we integrate the Rapid Assessment Process (RAP) into SIA framework development, informed by in-depth domain expert interviews to identify core bottlenecks, and validated through structured end-user surveys and iterative architectural prototyping.
Contribution/Results: (1) Empirical identification of fundamental bottlenecks in real-time response, human-centered interaction modeling, and cross-modal coordination; (2) Demonstration of Estuary’s sub-100ms end-to-end latency, horizontal scalability, and natural multimodal interaction support; (3) Introduction of the first RAP-based SIA framework design paradigm, accompanied by an open-source implementation and engineering best-practice guidelines, establishing a methodological and technical foundation for next-generation human-centered SIAs.
📝 Abstract
This case study presents our user-centered design model for Socially Intelligent Agent (SIA) development frameworks, drawing on our experience developing Estuary, an open-source multimodal framework for building low-latency, real-time socially interactive agents. We use the Rapid Assessment Process (RAP) to collect the views of leading SIA researchers on the current state of the art in SIA development and on how well Estuary might address current research gaps. We do this through a series of end-user interviews conducted by a fellow researcher in the community. We hope that our findings will not only assist the continued development of Estuary but also guide the development of future frameworks and technologies for SIAs.