🤖 AI Summary
Problem: Existing studies lack a deep mechanistic understanding of how function calling (FC) enhances large language models' (LLMs') instruction following and safety.
Method: This paper introduces, for the first time, causal intervention techniques at both the layer level and the token level to systematically dissect FC's impact on internal representations and reasoning pathways. Experiments are conducted on four mainstream LLMs and two benchmark datasets.
Contribution/Results: We find that FC significantly strengthens neural activations associated with compliance in critical layers, thereby improving accuracy in intent interpretation and malicious input detection. Specifically, FC achieves an average performance gain of around 135% over conventional prompting methods on malicious input detection, substantially boosting LLM safety robustness. Our work establishes a novel paradigm for interpretable and controllable regulation of LLM behavior through fine-grained, causally grounded intervention.
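The layer-level causal intervention described above can be pictured as activation patching: cache a layer's activation from one run, substitute it into another run, and measure how the output changes. Below is a minimal toy sketch of that idea, using a two-layer tanh network as a stand-in for transformer layers; the setup and all names are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 2-layer network standing in for a stack of transformer layers.
W = [rng.normal(size=(4, 4)) for _ in range(2)]

def forward(x, patch_layer=None, patch_act=None):
    """Run the toy model, optionally overwriting one layer's
    activation with a cached one (the causal intervention)."""
    acts = []
    h = x
    for i, Wi in enumerate(W):
        h = np.tanh(h @ Wi)
        if i == patch_layer:
            h = patch_act          # intervene: replace the activation
        acts.append(h)
    return h, acts

x_clean = rng.normal(size=4)
x_corrupt = rng.normal(size=4)

_, clean_acts = forward(x_clean)    # cache activations from the clean run
base_out, _ = forward(x_corrupt)    # baseline run on the corrupted input
patched_out, _ = forward(x_corrupt, patch_layer=0,
                         patch_act=clean_acts[0])

# The size of the output shift attributable to that layer's activation.
effect = np.linalg.norm(patched_out - base_out)
print(effect)
```

A token-level variant would patch the activation at a single token position rather than the whole layer; comparing effect sizes across layers or positions localizes where FC reshapes the model's computation.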
📝 Abstract
Function calling (FC) has emerged as a powerful technique for enabling large language models (LLMs) to interact with external systems and perform structured tasks. However, the mechanisms through which it influences model behavior remain largely under-explored. Moreover, we discover that beyond its regular usage, FC can substantially enhance LLMs' compliance with user instructions. These observations motivate us to leverage causality, a canonical analysis method, to investigate how FC works within LLMs. In particular, we conduct layer-level and token-level causal interventions to dissect FC's impact on the model's internal computational logic when responding to user queries. Our analysis confirms the substantial influence of FC and reveals several in-depth insights into its mechanisms. To further validate our findings, we conduct extensive experiments comparing the effectiveness of FC-based instructions against conventional prompting methods. We focus on enhancing LLM safety robustness, a critical LLM application scenario, and evaluate four mainstream LLMs across two benchmark datasets. The results are striking: FC yields an average performance improvement of around 135% over conventional prompting methods in detecting malicious inputs, demonstrating its promising potential to enhance LLM reliability and capability in practical applications.