A Hybrid Approach for EMF Code Generation: Code Templates Meet Large Language Models

📅 2025-12-05
📈 Citations: 0
✨ Influential: 0
🤖 AI Summary
To address the tension between rigid template-based code generation and the high error rates of large language model (LLM)-generated code, this paper proposes iEcoreGen, a hybrid framework for LLM-augmented model-driven development. It leverages EMF/Ecore metamodels to formally specify system structure, decomposes requirements into operation-level specifications, and employs domain-specific templates to generate Java skeleton code annotated with specification-conforming docstrings; remaining method bodies are then completed by an LLM. This approach combines the correctness guarantees of model-driven engineering with the flexibility of LLMs. Evaluation across five mainstream LLMs and 20 diverse tasks shows that iEcoreGen achieves significantly higher pass@k than pure-LLM baselines, while maintaining comparable compilation@k. Ablation studies confirm the effectiveness of three core components: template-guided scaffolding, specification embedding in docstrings, and collaborative LLM-based method completion.
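The summary reports pass@k and compilation@k without defining them. As a minimal sketch, assuming the widely used unbiased estimator pass@k = 1 - C(n-c, k) / C(n, k), where n samples are generated per task and c of them pass all tests, the per-task value could be computed as follows. The class and method names are illustrative and do not come from the paper; compilation@k would presumably follow the same formula with c counting samples that compile.

```java
// Sketch of the standard unbiased pass@k estimator (an assumed definition;
// the page itself does not spell out how the metric is computed).
final class PassAtK {
    private PassAtK() {}

    /**
     * Estimates pass@k for a single task.
     *
     * @param n total number of samples generated for the task
     * @param c number of samples that passed all tests
     * @param k sample budget, with 1 <= k <= n
     */
    static double estimate(int n, int c, int k) {
        if (n <= 0 || c < 0 || c > n || k < 1 || k > n) {
            throw new IllegalArgumentException("require 0 <= c <= n and 1 <= k <= n");
        }
        if (n - c < k) {
            return 1.0; // every size-k subset contains at least one passing sample
        }
        // C(n-c, k) / C(n, k) written as a running product to avoid overflow.
        double allFail = 1.0;
        for (int i = n - c + 1; i <= n; i++) {
            allFail *= 1.0 - (double) k / i;
        }
        return 1.0 - allFail;
    }
}
```

For example, with n = 10 samples of which c = 3 pass, `PassAtK.estimate(10, 3, 1)` evaluates to 0.3 up to floating-point rounding. A benchmark score would then be the mean of this value over all tasks.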

📝 Abstract
Template-based and LLM-based code generation are both key enablers of automated software development. The former provides correctness guarantees but is rigid when requirements are complex, whereas LLMs offer high flexibility at the risk of producing faulty code. This paper proposes iEcoreGen, a hybrid approach that integrates the Eclipse Modeling Framework (EMF) and LLMs. In EMF, an Ecore model defines a system structure and acts as a blueprint for code generation. iEcoreGen decomposes requirements to derive operation specifications, uses EMF's template-based generator to produce initial Java code, and serializes specifications into docstrings. LLMs are then invoked to complete and fix unimplemented methods. We assessed iEcoreGen on twenty code-generation tasks across five LLMs. It surpasses LLM-only baselines on pass@k and performs on par with them on compilation@k. An ablation study clarified the contribution of each component of iEcoreGen. Overall, the findings indicate that LLM-enhanced model-driven development is a promising path toward more efficient software automation.
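To make the division of labor concrete, here is a hedged sketch of what a template-generated skeleton with a specification docstring, and one possible completion, might look like. The domain (`Account`, `withdraw`) and the wording of the specification are hypothetical and not taken from the paper. The stub mirrors the way EMF's generator leaves unimplemented operations throwing `UnsupportedOperationException` under a `@generated` tag; the sketch itself has no EMF dependency so that it compiles standalone.

```java
// Hypothetical example: the class, the operation, and the specification text are
// illustrative assumptions, not output produced by iEcoreGen.
class Account {
    protected double balance;

    double getBalance() { return balance; }

    void setBalance(double newBalance) { balance = newBalance; }

    /**
     * <!-- begin-user-doc -->
     * Operation specification (assumed wording):
     * if amount is positive and does not exceed the balance, subtract it
     * from the balance and return true; otherwise leave the balance
     * unchanged and return false.
     * <!-- end-user-doc -->
     * @generated
     */
    boolean withdraw(double amount) {
        // TODO: implement this method (left open by the template generator)
        throw new UnsupportedOperationException();
    }
}

// One body an LLM might produce from the docstring above. A subclass is used
// only to keep the stub and the completion side by side in one listing.
class CompletedAccount extends Account {
    /** @generated NOT */
    @Override
    boolean withdraw(double amount) {
        if (amount <= 0 || amount > balance) {
            return false; // specification: reject and leave the balance unchanged
        }
        balance -= amount;
        return true;
    }
}
```

The structural parts (field, accessors, method signature) come from the model and need no LLM involvement; only the body behind the TODO is open-ended, which is the part the abstract assigns to the LLM.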
Problem

Research questions and friction points this paper is trying to address.

Integrates EMF templates with LLMs for code generation
Ensures correctness while handling complex software requirements
Improves automated development by combining structured and flexible methods
Innovation

Methods, ideas, or system contributions that make the work stand out.

Hybrid approach integrates EMF templates with LLMs
Decomposes requirements to generate initial Java code
Uses LLMs to complete and fix unimplemented methods
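The page does not describe how the "complete and fix" step is driven. One plausible reading, sketched below purely as an assumption, is a bounded loop in which a candidate body is generated from the docstring and signature, checked (for instance by compiling it), and regenerated with the diagnostic appended to the prompt when the check fails. `BodyGenerator` and `BodyChecker` are hypothetical stand-ins for an LLM client and a compiler or test check; neither is an API from the paper or from EMF.

```java
// Hypothetical stand-in for an LLM client.
interface BodyGenerator {
    String generate(String prompt);
}

// Hypothetical stand-in for a compile or test check.
interface BodyChecker {
    /** Returns an empty string if the body is accepted, otherwise a diagnostic message. */
    String diagnose(String body);
}

final class CompleteAndFix {
    private CompleteAndFix() {}

    /**
     * Asks the generator for a method body and retries with feedback until the
     * checker accepts it or maxRounds attempts have been made.
     * Returns the last attempt, or null if maxRounds is not positive.
     */
    static String run(String signature, String docstring,
                      BodyGenerator llm, BodyChecker checker, int maxRounds) {
        String prompt = "Implement the method body.\n" + docstring + "\n" + signature;
        String body = null;
        for (int round = 0; round < maxRounds; round++) {
            body = llm.generate(prompt);
            String diagnostic = checker.diagnose(body);
            if (diagnostic.isEmpty()) {
                return body; // accepted by the checker
            }
            // Feed the failure back so the next attempt can repair the body.
            prompt = prompt + "\nPrevious attempt:\n" + body + "\nProblem:\n" + diagnostic;
        }
        return body; // may still be failing after the last round
    }
}
```

Bounding the number of rounds keeps the cost per method predictable; a body that still fails after the last round would have to be handled by the caller.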
Xiao He
School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing 100083, China
Ru Chen
School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing 100083, China
Zeqing Zhang
The University of Hong Kong
robotic manipulation, multi-agent system, collision detection
Yanling Wang
Zhipu AI
Data Mining, Natural Language Processing
Qiuyan Dong
Beijing Institute of Mechanical and Electrical Engineering, Beijing, China