🤖 AI Summary
To address the limited reasoning capabilities of Japanese large language models (LLMs), which stem from insufficient training resources, this paper proposes a training-free method for transferring reasoning capability. Following the task vector paradigm, the core idea is to extract "reasoning vectors" (weight differences between a reasoning model and its base) from high-performing English or multilingual reasoning models, and then add them to the weights of Japanese LLMs. A lightweight post-training strategy complements the transfer, enabling cross-lingual, low-overhead reasoning enhancement. Experiments demonstrate significant performance gains across multiple Japanese reasoning benchmarks without requiring additional labeled data or full-parameter fine-tuning. Importantly, the method preserves the original model architecture and parameter count while generalizing robustly. This work establishes a scalable, broadly applicable paradigm for enhancing reasoning capabilities in low-resource-language LLMs.
📝 Abstract
Post-training methods have improved the performance and reasoning capability of mainstream large language models (LLMs), but the same is difficult to achieve for Japanese LLMs due to the resources required. Inspired by task vectors, which capture the change in weights before and after training on a specific task, we obtain reasoning vectors from reasoning LLMs and apply them to Japanese LLMs to boost their performance. Despite the limited resources available for improving Japanese LLMs, we present a simple and effective way to obtain substantial improvements, and we hope to inspire similar work for other languages.
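The task-vector operation described above can be sketched as simple per-parameter arithmetic: subtract a base model's weights from its reasoning-tuned counterpart, then add the resulting difference to a target model that shares the same architecture. The sketch below is illustrative only; parameter names are hypothetical, real models use tensors rather than scalars, and the scaling factor `alpha` is a common knob in task-vector arithmetic, not necessarily the paper's exact recipe.

```python
def extract_task_vector(base_weights, tuned_weights):
    """Task vector = tuned - base, computed per parameter."""
    return {k: tuned_weights[k] - base_weights[k] for k in base_weights}

def apply_task_vector(target_weights, task_vector, alpha=1.0):
    """Add the scaled task vector to a target model's weights."""
    return {k: target_weights[k] + alpha * task_vector[k] for k in target_weights}

# Toy scalar "weights" (hypothetical names; real checkpoints hold tensors).
base_en = {"layer.w": 1.0}       # English base model
reasoning_en = {"layer.w": 1.5}  # same model after reasoning post-training
japanese = {"layer.w": 0.8}      # Japanese LLM with the same architecture

reasoning_vector = extract_task_vector(base_en, reasoning_en)  # {"layer.w": 0.5}
merged = apply_task_vector(japanese, reasoning_vector)         # {"layer.w": 1.3}
```

Because the operation is pure addition over existing parameters, no gradient updates or labeled data are needed, which is what makes the approach training-free and architecture-preserving.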