Scholar

Zengyi Qin

Google Scholar ID: lwwVd7sAAAAJ

Massachusetts Institute of Technology

Multi-modal LLMs and Agents

Citations & Impact

All-time citations: 1,522

Publications

6 items (2 from 2026, 4 from 2025)

Resume (English only)

Academic Achievements

Papers: JetMoE: Reaching LLaMA2 Performance with 0.1M Dollars; OpenVoice: Versatile Instant Voice Cloning; MeloTTS: A high-quality multi-lingual multi-accent text-to-speech library; DreamVoice: Text-Guided Voice Conversion; MonoGRNet: A General Framework for Monocular 3D Object Detection

Awards: MyShell publicly listed on major crypto exchanges

Patents: Not explicitly mentioned

Projects: JetMoE-8B, OpenVoice, MyShell, MeloTTS, DreamVoice, MonoGRNet

Research Experience

Led the development of JetMoE-8B, a model pre-trained and post-trained from scratch under tight compute and data constraints for less than 0.1M USD, which outperforms LLaMA2-7B.

Led the development of OpenVoice, an audio foundation model that lets users clone any voice and generate speech in various styles and languages.

Co-founded the MyShell platform, which has 6 million users, more than 200,000 AI agents built, and over 1 billion interactions with AI agents.

Education

PhD: Massachusetts Institute of Technology (2020-2025)

Visiting Researcher: Stanford University (2019), Mentor: Stanford Vision and Learning Lab

BE: Tsinghua University (2016-2020), Major: Electronic Engineering

Background

Research Interests: AI models, voice cloning, 3D computer vision

Professional Field: Electronic Engineering

Bio: Researcher and entrepreneur; primary author of several widely recognized AI models.

Miscellany