🤖 AI Summary
Traditional travel planning platforms struggle to deliver personalization, adapt to changing conditions, and support real-time interaction. To address this, we propose a graph-structured multi-agent framework powered by large language models (LLMs) for end-to-end personalized travel planning. The approach coordinates strategy-oriented and information-oriented agents, combining symbolic reasoning with conversational understanding to parse user intent, optimize dynamically under multiple constraints (e.g., budget, weather, time), and refine itineraries iteratively through natural language. The system incorporates structured tool invocation and a map-based feedback loop to support interpretability and adaptation to real-time context. In human-in-the-loop experiments scored with a rubric-based GPT-4 assessment, the full system averages 8.5/10, compared with 7.2 for the no-strategy variant and 6.8 for the no-external-API variant. The gains are most pronounced in itinerary feasibility, with qualitative feedback also pointing to better time use and integration of real-time context.
📝 Abstract
Planning trips is a cognitively intensive task involving conflicting user preferences, dynamic external information, and multi-step temporal-spatial optimization. Traditional platforms often fall short: they provide static results, lack contextual adaptation, and fail to support real-time interaction or intent refinement. Our approach, Vaiage, addresses these challenges through a graph-structured multi-agent framework built around large language models (LLMs) that serve as both goal-conditioned recommenders and sequential planners. LLMs infer user intent, suggest personalized destinations and activities, and synthesize itineraries that align with contextual constraints such as budget, timing, group size, and weather. Through natural language interaction, structured tool use, and map-based feedback loops, Vaiage enables adaptive, explainable, and end-to-end travel planning grounded in both symbolic reasoning and conversational understanding. To evaluate Vaiage, we conducted human-in-the-loop experiments using rubric-based GPT-4 assessments and qualitative feedback. The full system achieved an average score of 8.5 out of 10, outperforming the no-strategy (7.2) and no-external-API (6.8) variants, particularly in feasibility. Qualitative analysis indicated that agent coordination, especially between the Strategy and Information Agents, significantly improved itinerary quality by optimizing time use and integrating real-time context. These results demonstrate the effectiveness of combining LLM reasoning with symbolic agent coordination in open-ended, real-world planning tasks.
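To make the idea of a graph of cooperating agents concrete, the following is a minimal sketch, not the Vaiage implementation. It assumes each agent is a plain function that updates a shared state and returns the name of the next node. All names here (`TripState`, `information_agent`, `strategy_agent`, `run_graph`, `STUB_WEATHER`, the candidate activities) are hypothetical, and the external API and LLM calls are replaced by hard-coded stubs.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

# Hypothetical stand-in for a weather API: destination -> is it raining?
STUB_WEATHER: Dict[str, bool] = {"Lisbon": False, "Bergen": True}


@dataclass
class TripState:
    """Shared state passed from node to node (all fields are illustrative)."""
    destination: str
    budget: float
    hours_available: float
    raining: bool = False
    candidates: List[dict] = field(default_factory=list)
    itinerary: List[str] = field(default_factory=list)
    spent: float = 0.0
    hours_used: float = 0.0


def information_agent(state: TripState) -> Optional[str]:
    """Gather context. A real system would call external tools here."""
    state.raining = STUB_WEATHER.get(state.destination, False)
    state.candidates = [
        {"name": "museum", "cost": 20.0, "hours": 2.0, "outdoor": False},
        {"name": "boat tour", "cost": 60.0, "hours": 3.0, "outdoor": True},
        {"name": "old town walk", "cost": 0.0, "hours": 1.5, "outdoor": True},
    ]
    return "strategy"  # edge to the next node in the graph


def strategy_agent(state: TripState) -> Optional[str]:
    """Greedily keep candidates that satisfy weather, budget, and time limits."""
    for c in state.candidates:
        if state.raining and c["outdoor"]:
            continue  # skip outdoor activities in the rain
        if state.spent + c["cost"] > state.budget:
            continue  # would exceed the budget
        if state.hours_used + c["hours"] > state.hours_available:
            continue  # would exceed the available time
        state.itinerary.append(c["name"])
        state.spent += c["cost"]
        state.hours_used += c["hours"]
    return None  # terminal node


Node = Callable[[TripState], Optional[str]]
NODES: Dict[str, Node] = {
    "information": information_agent,
    "strategy": strategy_agent,
}


def run_graph(nodes: Dict[str, Node], start: str, state: TripState) -> TripState:
    """Follow edges until a node returns None."""
    current: Optional[str] = start
    while current is not None:
        current = nodes[current](state)
    return state


rainy = run_graph(NODES, "information", TripState("Bergen", 50.0, 4.0))
sunny = run_graph(NODES, "information", TripState("Lisbon", 50.0, 4.0))
print(rainy.itinerary, sunny.itinerary)
```

In this toy setup, removing the information node corresponds loosely to the no-external-API variant and removing the strategy node to the no-strategy variant. The greedy filter is only a placeholder for whatever optimization the Strategy Agent actually performs.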