LLM-jp: A Cross-organizational Project for the Research and Development of Fully Open Japanese LLMs

📅 2024-07-04
🏛️ arXiv.org
📈 Citations: 3
Influential: 0
🤖 AI Summary
To address the underdeveloped open-source ecosystem for Japanese large language models (LLMs), this work introduces a cross-institutional, full-stack open collaboration aimed at end-to-end autonomy, transparency, and reproducibility across data curation, model training, and evaluation. Built on the Transformer architecture, the approach combines Japanese-specific tokenization, rigorous corpus cleaning, multi-stage pretraining, and instruction fine-tuning, with fully documented, reproducible training pipelines. The initiative unites more than 1,500 researchers from industry and academia and has publicly released the LLM-jp model series (e.g., LLM-jp-13b), which reports state-of-the-art results on Japanese benchmarks including JA-MMLU and JCommonsenseQA. Its core contribution is a high-performance, fully open, and reproducible foundational LLM ecosystem for Japanese, which now serves as a standard base model in the Japanese AI community.
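The corpus-cleaning step mentioned above can be illustrated with a minimal sketch. This is not the project's actual pipeline: the helper names and thresholds below are illustrative assumptions, showing one common heuristic for Japanese corpora, filtering documents by the fraction of Japanese characters they contain.

```python
import re

# Hiragana, katakana, and common CJK ideograph ranges.
JP_CHAR = re.compile(r"[\u3040-\u30FF\u4E00-\u9FFF]")

def japanese_ratio(text: str) -> float:
    """Fraction of non-whitespace characters that are Japanese."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        return 0.0
    return sum(1 for c in chars if JP_CHAR.match(c)) / len(chars)

def clean_corpus(docs, min_ratio=0.5, min_len=10):
    """Keep documents that are long enough and predominantly Japanese.

    The thresholds here are illustrative, not the values used by LLM-jp.
    """
    return [d for d in docs
            if len(d) >= min_len and japanese_ratio(d) >= min_ratio]

docs = [
    "これは日本語のテキストです。自然言語処理の研究に使います。",
    "This document is entirely English and would be filtered out.",
    "短い",  # too short to keep
]
kept = clean_corpus(docs)
print(len(kept))  # only the Japanese document survives
```

Real pipelines layer many more filters (deduplication, perplexity scoring, PII removal) on top of simple heuristics like this one.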

📝 Abstract
This paper introduces LLM-jp, a cross-organizational project for the research and development of Japanese large language models (LLMs). LLM-jp aims to develop open-source and strong Japanese LLMs, and as of this writing, more than 1,500 participants from academia and industry are working together for this purpose. This paper presents the background of the establishment of LLM-jp, summaries of its activities, and technical reports on the LLMs developed by LLM-jp. For the latest activities, visit https://llm-jp.nii.ac.jp/en/.
Problem

Research questions and friction points this paper is trying to address.

Open-source
Large Language Model
Japanese Language
Innovation

Methods, ideas, or system contributions that make the work stand out.

Open-source
Collaborative Effort
Advanced Japanese Language Processing
👥 Authors
LLM-jp:
Akiko Aizawa
Eiji Aramaki
Bowen Chen
Hiroyuki Deguchi (NTT): Natural Language Processing, Machine Translation
Rintaro Enomoto
Kazuki Fujii (Institute of Science Tokyo): Systems for Machine Learning
Kensuke Fukumoto
Takuya Fukushima
Namgi Han
Yuto Harada
Chikara Hashimoto
Tatsuya Hiraoka (Mohamed bin Zayed University of Artificial Intelligence): Natural Language Processing
Shohei Hisada
Sosuke Hosokawa
Lu Jie (Tsinghua University): Integrated Circuit Design
Keisuke Kamata
T. Kanazawa
H. Kanezashi
Hiroshi Kataoka
Daisuke Kawahara (Waseda University): Computational Linguistics, Natural Language Processing
Seiya Kawano
Atsushi Keyaki
Keisuke Kiryu
Hirokazu Kiyomaru
Takashi Kodama (National Institute of Informatics): Natural Language Processing
Takahiro Kubo
Yohei Kuga
Ryoma Kumon
Shuhei Kurita (National Institute of Informatics): Deep Learning, Large Language Models, Computer Vision
S. Kurohashi
Conglong Li (Senior Research Scientist at Google DeepMind): Natural Language Processing, Deep Learning, Distributed Systems
Hiroshi Matsuda
Yusuke Miyao
Sakae Mizuki (Hottolink, Inc. / Institute of Science Tokyo): Machine Learning, Natural Language Processing, Representation Learning, Computational Statistics
Yugo Murawaki
Ryo Nakamura
Taishi Nakamura (Institute of Science Tokyo): Artificial General Intelligence, Large Language Models, Machine Learning
Kouta Nakayama (RIKEN): Natural Language Processing
Tomoka Nakazato
Takuro Niitsuma
Jiro Nishitoba
Yusuke Oda (NAIST): Natural Language Processing, Machine Translation, Software Engineering
Hayato Ogawa
Takumi Okamoto
Naoaki Okazaki (Institute of Science Tokyo): Natural Language Processing, Artificial Intelligence, Machine Learning
Yohei Oseki (University of Tokyo): Computational Linguistics, Cognitive Science
Koki Ryu
Rafał Rzepka
Keisuke Sakaguchi (Tohoku University): Natural Language Processing, Machine Learning, Psycholinguistics
S. Sasaki
Satoshi Sekine (LLMC, NII): Natural Language Processing
Kohei Suda
Saku Sugawara (National Institute of Informatics): Natural Language Processing, Computational Linguistics
Hiroaki Sugiyama (NTT Communication Science Labs.): Artificial Intelligence, Dialog Management, Cognitive Developmental Science, Language Acquisition
Hisami Suzuki
Jun Suzuki (Tohoku University): Natural Language Processing, Machine Learning, Artificial Intelligence
T. Suzumura
Kyosuke Takami (Osaka Kyoiku University): Learning Analytics, Bullying, Neuroscience
Masashi Takeshita
Masahiro Tanaka
K. Taura
A. Tolmachev
Nobuhiro Ueda (NEC Corporation): Natural Language Processing
Zhen Wan (Kyoto University): Natural Language Processing, Information Extraction
Shuntaro Yada
Sakiko Yahata
Yuya Yamamoto
Hitomi Yanaka (The University of Tokyo, RIKEN): Natural Language Processing, Semantics
Rio Yokota (Institute of Science Tokyo): High Performance Computing, Large-Scale Deep Learning, Hierarchical Low-Rank Matrices, GPU Computing
Koichiro Yoshino (Tokyo Institute of Technology / GRP, RIKEN): Spoken Dialogue Systems, Natural Language Processing, Spoken Language Processing, Human-Robot