Publications
- 'When Does Metadata Conditioning (NOT) Work for Language Model Pre-Training? A Study with Context-Free Grammars' (COLM 2025)
- 'Efficient Construction of Model Family through Progressive Training Using Model Expansion' (COLM 2025)
- 'User-Guided Correction of Reconstruction Errors in Structure-from-Motion' (IUI 2025)
- 'Spike No More: Stabilizing the Pre-training of Large Language Models' (COLM 2025)
- 'B2T Connection: Serving Stability and Performance in Deep Transformers' (ACL Findings 2023)
- 'Decomposing NeRF for Editing via Feature Field Distillation' (NeurIPS 2022)
- 'Diverse Lottery Tickets Boost Ensemble from a Single Pretrained Model' (BigScience 2022)
- 'VocabEncounter: NMT-powered Vocabulary Learning by Presenting Computer-Generated Usages of Foreign Words into Users' Daily Lives' (CHI 2022)
Research Experience
- Researcher at Preferred Networks, Inc.
- Specially Appointed Associate Professor (Visiting) at the Center for Language AI Research, Tohoku University
Research Topics
- High-quality 3D/4D reconstruction
- Neural scene representations with semantic or language features
- Tractable training methods for large language models
- Analyzing the influence of training instances via deterministic dropout
- Subnetworks for building diverse models at once
- Random parameters in neural networks
- Controlling robots/agents with language or other interactions
- Language-model-based conditional data augmentation
- Entity-centric representations in NLP
Education
Holds a PhD in Information Science.
Background
Research interests include machine learning, natural language processing, and 2D/3D/4D computer vision. Broadly interested in uncovering surprising and useful findings and building applications across various AI fields.