🤖 AI Summary
To address the underdeveloped open-source ecosystem for Japanese large language models (LLMs), this work introduces the first cross-institutional, cross-domain, full-stack open collaboration paradigm, one that keeps data curation, model training, and evaluation autonomous, transparent, and reproducible end to end. Built on the Transformer architecture, the approach incorporates Japanese-specific tokenization, rigorous cleaning of a high-quality corpus, multi-stage pretraining, and instruction fine-tuning, all backed by fully documented, reproducible training pipelines. The initiative unites more than 1,500 researchers from industry and academia and has publicly released the LLM-jp model series (e.g., LLM-jp-13b), which achieves state-of-the-art results on Japanese benchmarks including JA-MMLU and JCommonsenseQA. Its core contribution is the first high-performance, fully open, and reproducible foundational LLM ecosystem for Japanese, which now serves as the de facto standard base model in the Japanese AI community.
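As a concrete entry point, below is a minimal sketch of loading and prompting one of the released models with Hugging Face `transformers`. The hub ID `llm-jp/llm-jp-13b-v1.0` and the generation settings are assumptions for illustration, not details stated in the summary or abstract; check the project page for the exact released checkpoints.

```python
# Minimal sketch: load a released LLM-jp checkpoint and generate a completion.
# Assumes the checkpoint is published on the Hugging Face Hub under the ID
# below (an assumption; substitute the actual released model name if it differs).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "llm-jp/llm-jp-13b-v1.0"  # assumed hub ID for the base 13B model

# The tokenizer is Japanese-specific, as described in the summary.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # half precision to fit the 13B model in memory
    device_map="auto",           # spread layers across available devices
)

# Japanese prompt: "What is a large language model?"
prompt = "大規模言語モデルとは何ですか？"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Illustrative sampling settings; tune for your use case.
outputs = model.generate(
    **inputs,
    max_new_tokens=128,
    do_sample=True,
    temperature=0.7,
    top_p=0.95,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

Note that this loads the base pretrained model; the instruction-tuned variants mentioned in the summary are released separately under their own hub IDs.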
📝 Abstract
This paper introduces LLM-jp, a cross-organizational project for the research and development of Japanese large language models (LLMs). LLM-jp aims to develop open-source and strong Japanese LLMs, and as of this writing, more than 1,500 participants from academia and industry are working together for this purpose. This paper presents the background of the establishment of LLM-jp, summaries of its activities, and technical reports on the LLMs developed by LLM-jp. For the latest activities, visit https://llm-jp.nii.ac.jp/en/.