Active Measurement of Two-Point Correlations

📅 2026-04-06
📈 Citations: 0
Influential: 0
🤖 AI Summary
This work addresses the problem of efficiently estimating the two-point correlation function (2PCF) of a sparse subset, such as star clusters, within a massive point set. To avoid the cost of constructing a fully labeled catalog, the authors propose a human-in-the-loop framework in which a pretrained classifier guides an adaptive sampling procedure that selects the most informative points for human annotation. After each annotation, a novel unbiased estimator yields pair-count estimates and associated confidence intervals across multiple distance bins simultaneously. Compared with simple Monte Carlo sampling, the adaptive strategy achieves substantially lower estimation variance with significantly fewer annotations, enabling scalable and statistically grounded 2PCF measurements on astronomy datasets.
📝 Abstract
Two-point correlation functions (2PCFs) are widely used to characterize how points cluster in space. In this work, we study the problem of measuring the 2PCF over a large set of points, restricted to a subset satisfying a property of interest. An example comes from astronomy, where scientists measure the 2PCF of star clusters, which make up only a tiny subset of possible sources within a galaxy. This task typically requires careful labeling of sources to construct catalogs, which is time-consuming. We present a human-in-the-loop framework for efficient estimation of the 2PCF of target sources. By leveraging a pre-trained classifier to guide sampling, our approach adaptively selects the most informative points for human annotation. After each annotation, it produces unbiased estimates of pair counts across multiple distance bins simultaneously. Compared to simple Monte Carlo approaches, our method achieves substantially lower variance while significantly reducing annotation effort. We introduce a novel unbiased estimator, sampling strategy, and confidence interval construction that together enable scalable and statistically grounded measurement of two-point correlations in astronomy datasets.
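The idea of classifier-guided, unbiased pair counting can be illustrated with a minimal sketch. This is not the paper's estimator or its adaptive sampling strategy; it swaps in a plain, non-adaptive importance-sampling scheme. The function names (`exact_pair_counts`, `sampled_pair_counts`), the `oracle` callable standing in for a human annotator, the `mix` parameter, and the synthetic data are all assumptions made for illustration. Pairs of points are drawn with probability proportional to classifier scores, and each draw is reweighted by the inverse of its sampling probability so that the per-bin pair counts remain unbiased.

```python
import numpy as np

def exact_pair_counts(pos, labels, edges):
    """Brute-force target-target pair counts per distance bin (needs all labels)."""
    idx = np.flatnonzero(labels)
    diff = pos[idx, None, :] - pos[None, idx, :]
    dist = np.linalg.norm(diff, axis=-1)
    upper = np.triu_indices(len(idx), k=1)  # count each unordered pair once
    counts, _ = np.histogram(dist[upper], bins=edges)
    return counts.astype(float)

def sampled_pair_counts(pos, scores, oracle, edges, n_draws, rng, mix=0.1):
    """Importance-sampled estimate of the same counts (hypothetical sketch).

    Only the drawn points are passed to `oracle` for labeling.
    Returns (estimate, standard_error), one entry per distance bin.
    """
    n = len(pos)
    # Proposal: classifier scores mixed with a uniform floor so q > 0 everywhere.
    q = (1.0 - mix) * scores / scores.sum() + mix / n
    i = rng.choice(n, size=n_draws, p=q)
    j = rng.choice(n, size=n_draws, p=q)
    dist = np.linalg.norm(pos[i] - pos[j], axis=1)
    # Weight is zero unless both points are distinct targets; the factor 2
    # converts ordered pairs into unordered pairs.
    w = oracle(i) * oracle(j) * (i != j) / (2.0 * q[i] * q[j])
    s1, _ = np.histogram(dist, bins=edges, weights=w)
    s2, _ = np.histogram(dist, bins=edges, weights=w ** 2)
    est = s1 / n_draws
    var = np.maximum(s2 / n_draws - est ** 2, 0.0) / n_draws
    return est, np.sqrt(var)

# Synthetic demo (assumed data): 400 points, roughly 10% of them targets.
rng = np.random.default_rng(0)
n = 400
pos = rng.uniform(0.0, 1.0, size=(n, 2))
labels = (rng.uniform(size=n) < 0.1).astype(int)  # hidden ground truth
scores = np.clip(0.8 * labels + 0.1 + 0.05 * rng.normal(size=n), 0.01, 1.0)
edges = np.array([0.0, 0.2, 0.4, 0.6])
oracle = lambda idx: labels[idx]  # stands in for a human annotator

est, se = sampled_pair_counts(pos, scores, oracle, edges, 200_000, rng)
exact = exact_pair_counts(pos, labels, edges)
print("estimate:", np.round(est, 1))
print("std err: ", np.round(se, 2))
print("exact:   ", exact)
```

The demo uses a large number of draws only to check that the estimate agrees with the brute-force count; with so many draws every point ends up labeled. A practical version would cache oracle answers so that each point is annotated at most once, and would use far fewer draws. The uniform floor controlled by `mix` keeps every sampling probability positive, which the unbiasedness argument requires.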
Problem

Research questions and friction points this paper is trying to address.

two-point correlation function
active measurement
astronomy
sparse subset
point clustering
Innovation

Methods, ideas, or system contributions that make the work stand out.

two-point correlation function
active sampling
human-in-the-loop
unbiased estimation
adaptive annotation