🤖 AI Summary
This work addresses the core challenge in low-data personalization for LLM agents—minimizing user data disclosure while preserving task utility—under realistic structural constraints (e.g., logical dependencies, class quotas, hierarchical rules) that violate standard subset selection assumptions.
Method: We formally model these constraints as a laminar matroid, which casts structured personalization as submodular maximization under a matroid constraint, with a (1−1/e) approximation guarantee via continuous greedy. The approach combines knowledge graph compilation, macro-facet abstraction, and continuous greedy optimization to select a small, high-utility set of user data that satisfies the matroid constraints.
Contribution/Results: Experiments demonstrate substantial reduction in user data disclosure while maintaining high task utility. To our knowledge, this is the first framework for LLM personalization that provides both theoretical rigor—via structured constraint formalization—and practical efficacy, establishing a novel paradigm for privacy-sensitive, constraint-aware personalization.
📝 Abstract
Personalizing Large Language Model (LLM) agents requires conditioning them on user-specific data, creating a critical trade-off between task utility and data disclosure. While the utility of adding user data often exhibits diminishing returns (i.e., submodularity), enabling near-optimal greedy selection, real-world personalization is complicated by structural constraints. These include logical dependencies (e.g., selecting fact A requires fact B), categorical quotas (e.g., select at most one writing style), and hierarchical rules (e.g., select at most two social media preferences, of which at most one can be for a professional network). These constraints violate the assumptions of standard subset selection algorithms. We propose a principled method to formally model such constraints. We introduce a compilation process that transforms a user's knowledge graph with dependencies into a set of abstract macro-facets. Our central result is a proof that common hierarchical and quota-based constraints over these macro-facets form a valid laminar matroid. This theoretical characterization lets us cast structured personalization as submodular maximization under a matroid constraint, enabling greedy selection with constant-factor guarantees (and a (1-1/e) guarantee via continuous greedy) for a much richer and more realistic class of problems.
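To make the quota and hierarchy constraints concrete, the following is a minimal sketch of greedy selection under a laminar family of capacity constraints. It is an illustration only, not the paper's implementation: the macro-facet names, the `covers` mapping, and the coverage-style utility are hypothetical stand-ins, and the compilation of dependencies into macro-facets is assumed to have already happened. It uses plain greedy, for which the classical guarantee under a matroid constraint is 1/2 for monotone submodular utilities; the (1-1/e) guarantee mentioned above relies on continuous greedy, which is not shown here.

```python
from typing import Callable, Dict, FrozenSet, List, Set, Tuple

# A capacity constraint: (group of macro-facets, maximum number selectable).
Constraint = Tuple[FrozenSet[str], int]


def is_laminar(constraints: List[Constraint]) -> bool:
    """Check that every pair of groups is either nested or disjoint."""
    groups = [group for group, _ in constraints]
    for i, a in enumerate(groups):
        for b in groups[i + 1:]:
            if a & b and not (a <= b or b <= a):
                return False
    return True


def is_independent(selected: Set[str], constraints: List[Constraint]) -> bool:
    """A set is independent if it respects every group capacity."""
    return all(len(selected & group) <= cap for group, cap in constraints)


def greedy_select(
    ground: Set[str],
    constraints: List[Constraint],
    utility: Callable[[Set[str]], float],
) -> Set[str]:
    """Repeatedly add the feasible item with the largest positive marginal gain."""
    selected: Set[str] = set()
    remaining = set(ground)
    while True:
        base = utility(selected)
        best, best_gain = None, 0.0
        for item in sorted(remaining):  # sorted for deterministic tie-breaking
            candidate = selected | {item}
            if not is_independent(candidate, constraints):
                continue
            gain = utility(candidate) - base
            if gain > best_gain:
                best, best_gain = item, gain
        if best is None:
            return selected
        selected.add(best)
        remaining.discard(best)


# Hypothetical macro-facets and the task aspects each one covers.
covers: Dict[str, Set[str]] = {
    "style_formal": {"tone", "register"},
    "style_casual": {"tone"},
    "social_linkedin": {"career", "network"},
    "social_xing": {"career", "region"},
    "social_twitter": {"news"},
    "social_instagram": {"photos", "news"},
}


def coverage_utility(selected: Set[str]) -> float:
    """Number of distinct aspects covered: a monotone submodular function."""
    covered: Set[str] = set()
    for item in selected:
        covered |= covers[item]
    return float(len(covered))


social = frozenset(k for k in covers if k.startswith("social_"))
constraints: List[Constraint] = [
    (frozenset({"style_formal", "style_casual"}), 1),    # at most one writing style
    (social, 2),                                         # at most two social media preferences
    (frozenset({"social_linkedin", "social_xing"}), 1),  # at most one professional network
]

assert is_laminar(constraints)
chosen = greedy_select(set(covers), constraints, coverage_utility)
print(sorted(chosen))
```

The three groups mirror the examples in the abstract: the professional-network group is nested inside the social media group, and the writing-style group is disjoint from both, so the family is laminar. Facets that belong to no group are left unconstrained in this sketch.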