🤖 AI Summary
To address the limited diversity and lack of semantic audio feedback in LLM-generated code for audio creative programming (e.g., live coding), this paper proposes a code–audio embedding alignment framework. Methodologically, it constructs a cross-modal embedding space in which nonlinear mapping functions align code representations with corresponding audio features (such as spectrograms and rhythmic patterns), jointly leveraging large language models, audio feature extraction, and embedding space modeling. The key contribution is a predictive and interpretable code-to-audio semantic mapping model, addressing the "black-box" limitation inherent in conventional generative models. Experimental results demonstrate improvements in both the musical intent consistency and the acoustic diversity of generated code, enabling efficient user exploration of diverse musical expressions.
📝 Abstract
LLM-powered code generation has the potential to revolutionize creative coding endeavors, such as live coding, by enabling users to focus on structural motifs over syntactic details. In such domains, when prompting an LLM, users may benefit from considering multiple varied code candidates to better realize their musical intentions. Code generation models, however, struggle to present unique and diverse code candidates, and offer no direct insight into the code's audio output. To better establish a relationship between code candidates and the audio they produce, we investigate the topology of the mapping between code and audio embedding spaces. We find that code and audio embeddings do not exhibit a simple linear relationship; however, a learned predictive model shows that an embedding alignment map can be constructed. In support of musically diverse output, we present a model that, given code, predicts the output audio embedding, establishing a code–audio embedding alignment map.
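The core idea of the alignment map, learning a nonlinear function from code embeddings to audio embeddings, can be sketched as a small regression problem. The sketch below is illustrative only: the dimensions, the synthetic data, and the one-hidden-layer MLP are assumptions standing in for real code-LLM and audio-encoder embeddings, not the paper's actual architecture or training setup.

```python
import numpy as np

# Illustrative sketch: learn a nonlinear map f(code_embedding) -> audio_embedding.
# All dimensions and data are synthetic placeholders standing in for embeddings
# from a code model and an audio feature encoder.
rng = np.random.default_rng(0)
n, d_code, d_audio, h = 256, 32, 16, 64

# Synthetic "code" embeddings, paired with "audio" embeddings produced by a
# hidden nonlinear ground-truth map plus a little noise.
X = rng.normal(size=(n, d_code))
W_true = rng.normal(size=(d_code, d_audio))
Y = np.tanh(X @ W_true) + 0.01 * rng.normal(size=(n, d_audio))

# One-hidden-layer tanh MLP trained by plain gradient descent on MSE.
W1 = 0.1 * rng.normal(size=(d_code, h)); b1 = np.zeros(h)
W2 = 0.1 * rng.normal(size=(h, d_audio)); b2 = np.zeros(d_audio)
lr = 0.05

losses = []
for step in range(300):
    H = np.tanh(X @ W1 + b1)          # hidden activations, shape (n, h)
    pred = H @ W2 + b2                # predicted audio embeddings
    err = pred - Y
    losses.append(float((err ** 2).mean()))

    # Backpropagate through both layers.
    gW2 = H.T @ err / n
    gb2 = err.mean(axis=0)
    dH = (err @ W2.T) * (1 - H ** 2)  # derivative of tanh
    gW1 = X.T @ dH / n
    gb1 = dH.mean(axis=0)

    W2 -= lr * gW2; b2 -= lr * gb2
    W1 -= lr * gW1; b1 -= lr * gb1

print(f"MSE: {losses[0]:.4f} -> {losses[-1]:.4f}")
```

If the fit succeeds, the training loss drops well below its initial value, which is the sense in which an alignment map "could be learned" even though the code-to-audio relationship is not linear.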