🤖 AI Summary
This paper addresses sequential hypothesis testing for streaming data under constraints of limited communication bandwidth, scarce computational resources, and stringent privacy requirements. The authors propose the Query/Hit (Q/H) learning framework, wherein a party without direct access to the source performs hypothesis testing solely by issuing symbolic queries and observing response latency, without accessing raw data or transmitting sensitive features. To their knowledge, this is the first formalization of such a query-latency-driven sequential inference paradigm. They design the Dynamic Scout-Sentinel Algorithm (DSSA), which integrates a mutual information neural estimator to enable adaptive, low-overhead query selection. Grounded in sequential analysis, DSSA jointly targets statistical efficiency and privacy preservation. Empirical evaluation on real-world datasets, including mouse trajectories, typesetting patterns, and touch interaction logs, shows reductions in both probability of error and detection delay compared to several baselines.
📝 Abstract
This work introduces the Query/Hit (Q/H) learning model. The setup consists of two agents. One agent, Alice, has access to a streaming source, while the other, Bob, does not have direct access to the source. Communication occurs through sequential Q/H pairs: Bob sends a sequence of source symbols (queries), and Alice responds with the waiting time until each query appears in the source stream (hits). This model is motivated by scenarios with communication, computation, and privacy constraints that limit real-time access to the source. The error exponent for sequential hypothesis testing under the Q/H model is characterized, and a querying strategy, the Dynamic Scout-Sentinel Algorithm (DSSA), is proposed. The strategy employs a mutual information neural estimator to compute the error exponent associated with each query and to select the query with the highest efficiency. Extensive empirical evaluations on both synthetic and real-world datasets -- including mouse movement trajectories, typesetting patterns, and touch-based user interactions -- compare the proposed strategy with baselines in terms of probability of error, query choice, and time-to-detection.
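The Q/H exchange described above can be illustrated with a minimal sketch. This is not the paper's implementation: the i.i.d. toy source, the alphabet, and the helper names (`hit_time`, `alice_stream`) are all assumptions made for illustration. Bob observes only the waiting time until his queried symbol appears, never the stream itself.

```python
import random

def hit_time(stream, query):
    """Bob's observation: the waiting time (number of symbols emitted)
    until `query` first appears in `stream`. Hypothetical helper."""
    for t, symbol in enumerate(stream, start=1):
        if symbol == query:
            return t
    return None  # the query never appeared in the (finite) stream

def alice_stream(probs, rng, length=10_000):
    """Alice's side: a toy i.i.d. streaming source over a finite alphabet.
    `probs` maps each symbol to its emission probability (assumed model)."""
    symbols = list(probs)
    weights = list(probs.values())
    return (rng.choices(symbols, weights=weights)[0] for _ in range(length))

# One Q/H pair: Bob queries the symbol "b"; Alice answers with the hit time.
rng = random.Random(0)
stream = alice_stream({"a": 0.7, "b": 0.3}, rng)
wait = hit_time(stream, "b")
```

Under each hypothesis the source distribution differs, so the distribution of `wait` differs too; a sequential test on the observed hit times is what the Q/H model formalizes, with DSSA choosing which symbol to query next.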