AI Summary
Limited availability of labeled casing collar locator (CCL) signal data hinders the generalization of casing collar identification models in downhole logging. Method: This paper proposes a lightweight data augmentation framework tailored to downhole logging that adds label smoothing regularization (LSR), time scaling, random cropping, and multiple sampling on top of standardization and label distribution smoothing (LDS). The framework is evaluated with AlexNet-based models to ease small-sample training. Contribution/Results: On a real-world CCL waveform dataset, the proposed method raises the F1 scores of the two benchmark models from 0.937 and 0.952 to 1.0, supporting its value for depth measurement and its engineering applicability. It offers a reusable approach to intelligent downhole identification in low-data regimes.
Abstract
Accurate downhole depth measurement is essential for oil and gas well operations, directly influencing reservoir contact, production efficiency, and operational safety. Collar correlation using a casing collar locator (CCL) is fundamental for precise depth calibration. While neural network-based CCL signal recognition has made significant progress in collar identification, preprocessing methods for such applications remain underdeveloped. Moreover, the limited availability of real well data poses substantial challenges for training neural network models, which require extensive datasets. This paper presents a system integrated into downhole tools for CCL signal acquisition to facilitate dataset construction. We propose comprehensive preprocessing methods for data augmentation and evaluate their effectiveness using our AlexNet-based neural network models. Through systematic experiments across combinations of configurations, we analyze the contribution of each augmentation method. Results demonstrate that standardization, label distribution smoothing (LDS), and random cropping are fundamental requirements for model training, while label smoothing regularization (LSR), time scaling, and multiple sampling significantly enhance model generalization. With the proposed augmentation methods, the F1 scores of our two benchmark models improve from 0.937 and 0.952 to as high as 1.0 for both. Validation on real CCL waveforms confirms the effectiveness and practical applicability of our approach. This work addresses the gap in data augmentation methodologies for training casing collar recognition models in CCL data-limited environments.
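To make the augmentation steps named above more concrete, the following is a minimal sketch of four of them (standardization, random cropping, time scaling, and label smoothing) on a one-dimensional waveform with per-sample binary labels. It is an illustration under stated assumptions, not the paper's implementation: the function names, the linear-interpolation resampling, the per-sample label layout, and the value `eps=0.1` are all hypothetical choices, and LDS and multiple sampling are not shown.

```python
import random
import statistics


def standardize(signal):
    """Rescale a waveform to zero mean and unit variance."""
    mean = statistics.fmean(signal)
    std = statistics.pstdev(signal)
    if std == 0:
        return [0.0 for _ in signal]
    return [(x - mean) / std for x in signal]


def random_crop(signal, labels, length, rng):
    """Cut a window of `length` samples at a random offset, keeping labels aligned."""
    start = rng.randrange(len(signal) - length + 1)
    return signal[start:start + length], labels[start:start + length]


def time_scale(values, factor):
    """Stretch (factor > 1) or compress (factor < 1) by linear interpolation."""
    n_out = max(2, round(len(values) * factor))
    out = []
    for i in range(n_out):
        pos = i * (len(values) - 1) / (n_out - 1)
        lo = int(pos)
        hi = min(lo + 1, len(values) - 1)
        frac = pos - lo
        out.append(values[lo] * (1 - frac) + values[hi] * frac)
    return out


def smooth_labels(labels, eps=0.1):
    """Label smoothing for two classes: 1 -> 1 - eps/2, 0 -> eps/2."""
    return [y * (1 - eps) + eps / 2 for y in labels]


# Toy waveform and collar marks (made-up values, not real CCL data).
rng = random.Random(0)
wave = [0.0, 0.1, 0.0, 2.0, -2.0, 0.0, 0.1, 0.0]
marks = [0, 0, 0, 1, 1, 0, 0, 0]

x, y = random_crop(standardize(wave), marks, length=6, rng=rng)
x = time_scale(x, 1.5)                  # stretch the signal window
y = smooth_labels(time_scale(y, 1.5))   # resample labels the same way, then soften
```

In this sketch the labels are resampled with the same factor as the signal so the two stay aligned sample by sample; a real pipeline might instead rescale collar positions and regenerate the labels.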