NCorr-FP: A Neighbourhood-based Correlation-preserving Fingerprinting Scheme for Intellectual Property Protection of Structured Data

📅 2025-05-09
📈 Citations: 0
Influential: 0
🤖 AI Summary
This paper addresses three challenges in sharing structured data: verifying ownership, tracing unauthorised redistribution, and limiting the loss of statistical fidelity caused by embedded marks. It proposes a neighbourhood-aware fingerprint embedding method that preserves associations between attributes by relying on local record similarity and nonparametric density estimation; the fingerprint is recovered by reversing the embedding logic. The reported results combine imperceptibility (Hellinger distance <0.005, KL divergence ≈0), high downstream utility (prediction accuracy drop <1%), and strong robustness: 100% detection success even under 50% record deletion, and resilience against 5-party collusion and adaptive attacks. The method also natively supports heterogeneous (mixed-type) data. To our knowledge, this is the first work to unify statistical fidelity, practical efficiency, and security robustness in structured-data fingerprinting.

📝 Abstract
Ensuring data ownership and traceability of unauthorised redistribution are central to safeguarding intellectual property in shared data environments. Data fingerprinting addresses these challenges by embedding recipient-specific marks into the data, typically via content modifications. We propose NCorr-FP, a Neighbourhood-based Correlation-preserving Fingerprinting system for structured tabular data with the main goal of preserving statistical fidelity. The method uses local record similarity and density estimation to guide the insertion of fingerprint bits. The embedding logic is then reversed to extract the fingerprint from a potentially modified dataset. Extensive experiments confirm its effectiveness, fidelity, utility and robustness. Results show that fingerprints are virtually imperceptible, with minute Hellinger distances and KL divergences, even at high embedding ratios. The system also maintains high data utility for downstream predictive tasks. The method achieves 100% detection confidence under substantial data deletions and remains robust against adaptive and collusion attacks. Satisfying all these requirements concurrently on mixed-type datasets highlights the strong applicability of NCorr-FP to real-world data settings.
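The abstract measures imperceptibility with Hellinger distance and KL divergence between the original and fingerprinted data. As a minimal sketch of how these two quantities can be computed for a single categorical column, the following uses empirical category frequencies. The function names (`value_distribution`, `hellinger`, `kl_divergence`), the toy column, and the `eps` clipping are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def value_distribution(column, categories):
    """Empirical probability of each category in a column."""
    counts = np.array([np.sum(column == c) for c in categories], dtype=float)
    return counts / counts.sum()

def hellinger(p, q):
    """Hellinger distance between two discrete distributions (0 = identical, 1 = disjoint)."""
    return float(np.sqrt(0.5 * np.sum((np.sqrt(p) - np.sqrt(q)) ** 2)))

def kl_divergence(p, q, eps=1e-12):
    """KL divergence D(p || q); eps clipping (an assumption) avoids log(0)."""
    p = np.clip(p, eps, None)
    q = np.clip(q, eps, None)
    p, q = p / p.sum(), q / q.sum()
    return float(np.sum(p * np.log(p / q)))

# Toy column: one value is altered, standing in for a fingerprint mark.
original = np.array(["a"] * 50 + ["b"] * 30 + ["c"] * 20)
marked = original.copy()
marked[0] = "b"

categories = np.unique(original)
p = value_distribution(original, categories)
q = value_distribution(marked, categories)
h, kl = hellinger(p, q), kl_divergence(p, q)
print(f"Hellinger: {h:.5f}, KL: {kl:.6f}")
```

Both values stay close to zero when only a few cells change, which is the sense in which a fingerprint can be called statistically imperceptible. Numerical columns would first need binning or a density estimate before the same formulas apply.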
Problem

Research questions and friction points this paper is trying to address.

Protecting structured data intellectual property via fingerprinting
Preserving statistical fidelity while embedding fingerprints
Ensuring robustness against data modifications and attacks
Innovation

Methods, ideas, or system contributions that make the work stand out.

Uses local record similarity for fingerprinting
Preserves statistical fidelity via density estimation
Reverses embedding logic for fingerprint extraction
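The three ideas above can be illustrated with a deliberately simplified toy scheme. This is not the NCorr-FP algorithm: the paper's density estimation is replaced here by a plain above/below-median rule on the neighbours' values, and every name (`neighbour_values`, `_plan`, `embed`, `extract`), the parameters `k` and `ratio`, and the synthetic table are assumptions made for illustration. The sketch only shows the general pattern: a keyed pseudo-random plan selects cells, each new value is drawn from similar records so that correlations are roughly kept, and extraction reruns the same logic and takes a majority vote per bit.

```python
import numpy as np

def neighbour_values(table, row, col, k):
    """Values of column `col` among the k records nearest to `row`,
    with distance measured on all the other columns."""
    others = np.delete(table, col, axis=1)
    dist = np.linalg.norm(others - others[row], axis=1)
    dist[row] = np.inf  # a record is not its own neighbour
    return table[np.argsort(dist)[:k], col]

def _plan(key, n_rows, n_cols, n_bits, ratio):
    """Keyed plan: which rows are marked, in which column, carrying which bit."""
    rng = np.random.default_rng(key)
    selected = rng.random(n_rows) < ratio
    cols = rng.integers(n_cols, size=n_rows)
    bit_idx = rng.integers(n_bits, size=n_rows)
    picks = rng.random(n_rows)  # used to choose a value from the candidate pool
    return selected, cols, bit_idx, picks

def embed(table, bits, key, k=10, ratio=0.25):
    n_rows, n_cols = table.shape
    selected, cols, bit_idx, picks = _plan(key, n_rows, n_cols, len(bits), ratio)
    marked = table.copy()
    for row in np.flatnonzero(selected):
        vals = neighbour_values(table, row, cols[row], k)
        median = np.median(vals)
        # Bit 1: take a neighbour value above the median; bit 0: at or below it.
        pool = vals[vals > median] if bits[bit_idx[row]] else vals[vals <= median]
        if pool.size:
            marked[row, cols[row]] = pool[int(picks[row] * pool.size)]
    return marked

def extract(marked, n_bits, key, k=10, ratio=0.25):
    n_rows, n_cols = marked.shape
    selected, cols, bit_idx, _ = _plan(key, n_rows, n_cols, n_bits, ratio)
    votes = np.zeros((n_bits, 2), dtype=int)
    for row in np.flatnonzero(selected):
        vals = neighbour_values(marked, row, cols[row], k)
        bit = int(marked[row, cols[row]] > np.median(vals))
        votes[bit_idx[row], bit] += 1
    # Majority vote per fingerprint bit.
    return [int(v[1] > v[0]) for v in votes]

# Synthetic correlated table: four noisy views of one latent variable.
rng = np.random.default_rng(0)
latent = rng.normal(size=(2000, 1))
table = latent + 0.3 * rng.normal(size=(2000, 4))

bits = [1, 0, 1, 1, 0, 0, 1, 0]
marked = embed(table, bits, key=42)
recovered = extract(marked, len(bits), key=42)
print(bits, recovered)
```

Because each bit is embedded in many records, individual votes can be wrong (for example when a neighbour was itself modified) without changing the majority. The same redundancy is what lets a scheme of this kind tolerate some record deletion, since the remaining records still carry votes for every bit.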