AI Summary
Large language models (LLMs) exhibit identifiable output characteristics, termed "natural fingerprints", even when trained on identical data, revealing latent biases and making models behaviorally distinguishable.
Method: Through controlled training experiments, we systematically investigate how minor perturbations (including parameter scale, optimization configurations, and random seeds) influence fingerprint emergence. We integrate cross-model text fingerprinting, statistical significance testing, and attribution analysis to isolate causal factors.
Contribution/Results: We provide the first empirical evidence that (1) LLM provenance can be accurately traced (with >92% accuracy) despite zero variation in training data; and (2) low-level training variables, particularly random seeds, play a decisive role in shaping these fingerprints. Our findings establish a novel paradigm for analyzing the origins of implicit model biases and offer quantitative foundations for enhancing LLM behavioral controllability and interpretability.
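To make the provenance-tracing idea concrete, here is a minimal sketch of text-based model attribution: build a character n-gram frequency profile per model from its generated samples, then attribute a new text to the model with the most similar profile. This is an illustrative toy, not the paper's actual method; the sample texts and model names (`model_A`, `model_B`) are hypothetical.

```python
from collections import Counter
import math

def ngram_profile(texts, n=3):
    """Relative character n-gram frequencies over a corpus of texts."""
    counts = Counter()
    for t in texts:
        for i in range(len(t) - n + 1):
            counts[t[i:i + n]] += 1
    total = sum(counts.values())
    return {g: c / total for g, c in counts.items()}

def cosine(p, q):
    """Cosine similarity between two sparse frequency profiles."""
    dot = sum(v * q.get(k, 0.0) for k, v in p.items())
    norm_p = math.sqrt(sum(v * v for v in p.values()))
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    return dot / (norm_p * norm_q) if norm_p and norm_q else 0.0

def attribute(text, profiles, n=3):
    """Return the model whose n-gram profile best matches the text."""
    tp = ngram_profile([text], n)
    return max(profiles, key=lambda m: cosine(tp, profiles[m]))

# Toy corpora standing in for outputs of two models (hypothetical data)
samples = {
    "model_A": ["the cat sat on the mat", "the dog sat on the log"],
    "model_B": ["colourless green ideas sleep", "furiously green ideas sleep"],
}
profiles = {m: ngram_profile(ts) for m, ts in samples.items()}
print(attribute("the cat sat on the log", profiles))  # → model_A
```

In practice, far richer features (token distributions, model likelihoods, learned classifiers) would be used, but the principle is the same: systematic distributional differences in generated text can identify the source model.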
Abstract
Large language models (LLMs) often exhibit biases -- systematic deviations from expected norms -- in their outputs. These range from overt issues, such as unfair responses, to subtler patterns that can reveal which model produced them. We investigate the factors that give rise to identifiable characteristics in LLMs. Since LLMs model the training data distribution, it is reasonable to expect that differences in training data naturally lead to such characteristics. However, our findings reveal that even when LLMs are trained on exactly the same data, it is still possible to distinguish the source model from its generated text. We refer to these unintended, distinctive characteristics as natural fingerprints. By systematically controlling training conditions, we show that natural fingerprints can emerge from subtle differences in the training process, such as parameter sizes, optimization settings, and even random seeds. We believe that understanding natural fingerprints offers new insight into the origins of unintended bias and ways to improve control over LLM behavior.