🤖 AI Summary
This work addresses the limitations of existing data-driven approaches in effectively integrating domain-specific temporal constraints and logical rules, which often leads to suboptimal predictive performance and insufficient compliance with regulatory requirements. To overcome this, the authors propose a neuro-symbolic learning framework that formalizes domain knowledge as differentiable linear temporal logic and first-order logic constraints. The framework employs a two-stage optimization mechanism: an initial data-driven pretraining phase followed by a satisfaction-guided dynamic pruning step that retains only those rules that are both effective and logically consistent. Evaluated on four real-world event logs, the method significantly outperforms purely data-driven baselines, particularly in scenarios with scarce compliant examples, simultaneously improving both prediction accuracy and adherence to prescribed logical rules.
📝 Abstract
Predictive modeling on sequential event data is critical for applications such as fraud detection and healthcare monitoring. Existing data-driven approaches learn correlations from historical data but fail to incorporate the domain-specific sequential constraints and logical rules that govern event relationships, which limits both accuracy and regulatory compliance. For example, healthcare procedures must follow specific sequences, and financial transactions must adhere to compliance rules. We present a neuro-symbolic approach that integrates domain knowledge as differentiable logical constraints using Logic Tensor Networks (LTNs). We formalize control-flow, temporal, and payload knowledge using Linear Temporal Logic and first-order logic. Our key contribution is a two-stage optimization strategy that addresses LTNs' tendency to satisfy logical formulas at the expense of predictive accuracy. The approach uses a weighted axiom loss during pretraining to prioritize learning from data, followed by rule pruning that retains only consistent, contributive axioms based on satisfaction dynamics. Evaluation on four real-world event logs shows that domain knowledge injection significantly improves predictive performance, and that the two-stage optimization is essential: without it, injected knowledge can severely degrade performance. The approach excels particularly in compliance-constrained scenarios with limited compliant training examples, outperforming purely data-driven baselines while ensuring adherence to domain constraints.
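The two-stage idea can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the fuzzy operators (Reichenbach implication, probabilistic-sum "eventually"), the function names, and the pruning thresholds `min_final` and `min_gain` are all assumptions chosen for illustration, and the abstract does not specify the actual pruning criteria.

```python
# Illustrative sketch only: operators, names, and thresholds are assumptions,
# not taken from the paper.

def implies(a: float, b: float) -> float:
    """Reichenbach fuzzy implication on truth values in [0, 1]."""
    return 1.0 - a + a * b


def eventually(truths: list[float]) -> float:
    """Soft 'eventually' over a finite trace (probabilistic-sum disjunction)."""
    prod = 1.0
    for t in truths:
        prod *= 1.0 - t
    return 1.0 - prod


def combined_loss(data_loss: float, axiom_sats: dict[str, float],
                  axiom_weight: float) -> float:
    """Stage 1: data loss plus a down-weighted penalty for unsatisfied axioms."""
    mean_sat = sum(axiom_sats.values()) / len(axiom_sats)
    return data_loss + axiom_weight * (1.0 - mean_sat)


def prune_axioms(sat_history: dict[str, list[float]],
                 min_final: float = 0.5, min_gain: float = 0.0) -> list[str]:
    """Stage 2: keep axioms whose satisfaction ended high enough and did not
    drop over pretraining (a hypothetical stand-in for 'satisfaction dynamics')."""
    kept = []
    for name, history in sat_history.items():
        gain = history[-1] - history[0]
        if history[-1] >= min_final and gain >= min_gain:
            kept.append(name)
    return sorted(kept)


# Hypothetical rule: "if a claim is approved, a payment eventually follows".
approved = 1.0
payment_per_step = [0.0, 0.2, 0.9]
rule_sat = implies(approved, eventually(payment_per_step))

# Small axiom weight so that the data term dominates during pretraining.
loss = combined_loss(data_loss=0.4,
                     axiom_sats={"approve_then_pay": rule_sat},
                     axiom_weight=0.1)

# Satisfaction recorded at the start and end of pretraining for three axioms.
kept = prune_axioms({
    "approve_then_pay": [0.6, 0.8],   # rose and ended high: kept
    "no_double_refund": [0.7, 0.4],   # ended below min_final: pruned
    "review_before_pay": [0.9, 0.6],  # dropped during pretraining: pruned
})
```

Keeping `axiom_weight` small in the first stage mirrors the stated goal of prioritizing data learning; in a real LTN setup these truth values would come from differentiable tensor operations rather than plain floats.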