🤖 AI Summary
The opaque provenance of open-source large language models (LLMs) and the difficulty of attributing LoRA-adapted models to their base models hinder trust, accountability, and regulatory compliance.
Method: We propose the first formal method for identifying the origin of LoRA fine-tuned models. By modeling the low-rank structure of weight residuals and applying singular value decomposition together with subspace alignment, our approach extracts obfuscation-invariant features that robustly identify the base model and enable estimation of the LoRA rank used during fine-tuning.
Contribution/Results: The method maintains high accuracy under strong obfuscating transformations, including weight permutation and scaling, and yields interpretable verification results. Evaluated on 31 real-world open-source LLMs, it achieves reliable attribution across diverse architectures and training configurations. Our work establishes a new benchmark for LLM provenance tracing and provides a foundational framework for model lineage authentication in open-model ecosystems.
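To make the core intuition concrete, the sketch below is a toy illustration (our own construction, not the paper's implementation; all variable names are hypothetical). A LoRA fine-tune adds a low-rank update BA to the base weights, so the residual against the true base model has (near-)rank r; its singular value spectrum both reveals that rank and is unchanged by permutation obfuscation:

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 256, 8  # hypothetical hidden size and LoRA rank

# Synthetic base weight and a rank-r LoRA update (illustrative only)
W_base = rng.normal(size=(d, d))
B = rng.normal(size=(d, r))
A = rng.normal(size=(r, d))
W_ft = W_base + B @ A  # fine-tuned weight = base + low-rank residual

# Residual against the candidate base model is (near-)rank-r
residual = W_ft - W_base
s = np.linalg.svd(residual, compute_uv=False)

# Estimate the LoRA rank by counting singular values above a relative tolerance
est_rank = int(np.sum(s > 1e-8 * s[0]))

# Singular values are invariant to a permutation obfuscation of the rows,
# since permutation matrices are orthogonal
P = np.eye(d)[rng.permutation(d)]
s_perm = np.linalg.svd(P @ residual, compute_uv=False)
print(est_rank, np.allclose(s, s_perm))
```

In practice the residual is only approximately low-rank (continued pre-training or merging perturbs it), which is why the paper's method relies on the structure of the spectrum and subspace alignment rather than an exact rank count.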
📝 Abstract
As large language models (LLMs) continue to advance, their deployment often involves fine-tuning to enhance performance on specific downstream tasks. However, this customization is sometimes accompanied by misleading claims about a model's origins, raising significant concerns about transparency and trust within the open-source community. Existing model verification techniques typically assess functional, representational, or weight similarity, but these approaches often fail against obfuscation techniques such as permutation and scaling transformations. To address this limitation, we propose Origin-Tracer, a novel detection method that rigorously determines whether a model has been fine-tuned from a specified base model and can additionally extract the LoRA rank used during fine-tuning, providing a more robust verification framework. To our knowledge, this framework is the first formalized approach aimed specifically at pinpointing the source of model fine-tuning. We empirically validate our method on thirty-one diverse open-source models under conditions that simulate real-world obfuscation, analyze its effectiveness, and discuss its limitations. The results demonstrate the effectiveness of our approach and indicate its potential to establish new benchmarks for model verification.