Digging Into the Internal: Causality-Based Analysis of LLM Function Calling

📅 2025-09-18
📈 Citations: 0
Influential: 0
🤖 AI Summary
Existing studies lack a mechanistic understanding of how function calling (FC) enhances large language models' (LLMs') instruction following and safety. Method: the paper introduces, for the first time, causal intervention techniques at both the layer and token levels to systematically dissect FC's impact on internal representations and reasoning pathways, with experiments across mainstream LLMs and two benchmark datasets. Contribution/Results: FC significantly strengthens compliance-related neural activations in critical layers, improving accuracy in intent interpretation and malicious-input detection; in particular, FC achieves an average 135% performance gain over conventional prompting methods on malicious-input detection, substantially boosting LLM safety robustness. This work establishes a novel paradigm for interpretable and controllable regulation of LLM behavior through fine-grained, causally grounded intervention.
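For readers unfamiliar with causal interventions, the following minimal Python sketch shows what a layer-level intervention (activation patching) can look like in practice. It is an illustrative assumption, not the paper's exact procedure: the model (gpt2), the two prompts, and the hook placement are stand-ins.

```python
# Minimal sketch of a layer-level causal intervention (activation patching),
# assuming a GPT-2-style Hugging Face model; the paper's models and exact
# patching procedure may differ.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "gpt2"  # stand-in; the paper evaluates four mainstream LLMs
tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL).eval()

def hidden_states(prompt):
    """Run the model once and collect per-layer hidden states."""
    ids = tok(prompt, return_tensors="pt")
    with torch.no_grad():
        out = model(**ids, output_hidden_states=True)
    return ids, out.hidden_states  # (embeddings, layer 1, ..., layer N)

# "Clean" run with an FC-style instruction, "corrupted" run with plain prompting.
_, clean_hs = hidden_states("Call check_input(text=...) before answering: ...")
corrupt_ids, _ = hidden_states("Please refuse unsafe requests: ...")

def patch_layer(layer_idx):
    """Overwrite one layer's output on the corrupted run with clean activations."""
    def hook(module, inputs, output):
        h = output[0] if isinstance(output, tuple) else output
        n = min(h.shape[1], clean_hs[layer_idx + 1].shape[1])
        h[:, :n, :] = clean_hs[layer_idx + 1][:, :n, :]
        return output
    return hook

for layer_idx, block in enumerate(model.transformer.h):
    handle = block.register_forward_hook(patch_layer(layer_idx))
    with torch.no_grad():
        logits = model(**corrupt_ids).logits
    handle.remove()
    # Compare these next-token logits against the un-patched run to estimate
    # this layer's causal contribution (comparison omitted for brevity).
```

Sweeping over layers and measuring how far each patch moves the output is the standard way to localize which layers carry a behavior-inducing signal.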

📝 Abstract
Function calling (FC) has emerged as a powerful technique for enabling large language models (LLMs) to interact with external systems and perform structured tasks. However, the mechanisms through which it influences model behavior remain largely under-explored. Moreover, we discover that beyond its regular usage, FC can substantially enhance the compliance of LLMs with user instructions. These observations motivate us to leverage causality, a canonical analysis method, to investigate how FC works within LLMs. In particular, we conduct layer-level and token-level causal interventions to dissect FC's impact on the model's internal computational logic when responding to user queries. Our analysis confirms the substantial influence of FC and reveals several in-depth insights into its mechanisms. To further validate our findings, we conduct extensive experiments comparing the effectiveness of FC-based instructions against conventional prompting methods. We focus on enhancing LLM safety robustness, a critical application scenario, and evaluate four mainstream LLMs across two benchmark datasets. The results are striking: FC yields an average performance improvement of around 135% over conventional prompting methods in detecting malicious inputs, demonstrating its promising potential to enhance LLM reliability and capability in practical applications.
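To make the comparison concrete, here is a hedged sketch of the two instruction styles being contrasted. The function name check_malicious_input and its schema are hypothetical stand-ins, not the paper's actual definitions (an OpenAI-style tool schema is shown; other providers use similar formats).

```python
# Illustration of conventional prompting vs. an FC-based safety instruction.
user_query = "..."  # the input under test

# Conventional prompting: safety instruction embedded in the system prompt.
conventional_messages = [
    {"role": "system",
     "content": "If the user's request is malicious, refuse to answer."},
    {"role": "user", "content": user_query},
]

# FC-based instruction: the same safety check exposed as a callable tool,
# so the model must produce a structured classification before answering.
tools = [{
    "type": "function",
    "function": {
        "name": "check_malicious_input",  # hypothetical function
        "description": "Classify whether the user's input is malicious "
                       "before producing a final answer.",
        "parameters": {
            "type": "object",
            "properties": {
                "user_input": {"type": "string"},
                "is_malicious": {"type": "boolean"},
            },
            "required": ["user_input", "is_malicious"],
        },
    },
}]
fc_messages = [
    {"role": "system",
     "content": "Always call check_malicious_input on the user's input first."},
    {"role": "user", "content": user_query},
]
```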
Problem

Research questions and friction points this paper is trying to address.

Analyzing how function calling influences LLM behavior mechanisms
Investigating FC's impact on model internal computational logic
Enhancing LLM safety robustness in malicious input detection
Innovation

Methods, ideas, or system contributions that make the work stand out.

Causality-based analysis of LLM function calling
Layer-level and token-level causal interventions (a token-level sketch follows this list)
Function calling enhances LLM safety and robustness
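Complementing the layer-level sketch above, a token-level intervention patches a single token position rather than a whole layer's sequence. A minimal sketch, assuming the same GPT-2-style model and clean_hs activation cache as in the earlier example:

```python
# Token-level variant of the intervention above: only one token position
# (e.g., the function-name token in the FC instruction) is patched.
import torch

def patch_token(model, corrupt_ids, clean_hs, layer_idx, position):
    """Patch a single token position at one layer and return the logits."""
    def hook(module, inputs, output):
        h = output[0] if isinstance(output, tuple) else output
        h[:, position, :] = clean_hs[layer_idx + 1][:, position, :]
        return output

    handle = model.transformer.h[layer_idx].register_forward_hook(hook)
    try:
        with torch.no_grad():
            logits = model(**corrupt_ids).logits
    finally:
        handle.remove()
    return logits

# Sweeping layer_idx and position yields a (layer x token) map of causal
# effects, which is how token-level analyses typically localize which
# input tokens drive the behavior change.
```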
👥 Authors
Zhenlan Ji
The Hong Kong University of Science and Technology
Software Engineering
Daoyuan Wu
Lingnan University, Hong Kong. Past affiliations: HKUST, NTU, CUHK, SMU, PolyU
Large Language Model, AI Security, Blockchain Security, Mobile Security, Software Security
Wenxuan Wang
Renmin University of China
Pingchuan Ma
The Hong Kong University of Science and Technology
Shuai Wang
The Hong Kong University of Science and Technology
Lei Ma
The University of Tokyo & University of Alberta