🤖 AI Summary
This work proposes a novel approach to audio equalization that leverages large language models (LLMs) for context-aware, language-driven dynamic tuning, addressing the limitations of traditional manual equalization in adapting to changing listening contexts such as shifting emotions or environments. The method combines in-context learning with parameter-efficient fine-tuning and aligns model outputs with population-level preference distributions derived from controlled listening experiments. Experimental results show that the proposed system significantly outperforms both random-sampling and static-preset baselines on distribution-alignment metrics, supporting the use of LLMs as "artificial equalizers" that interpret natural language cues to deliver personalized, context-sensitive audio adjustments.
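To make the text-to-EQ mapping concrete, below is a minimal hypothetical sketch (not the authors' code): an LLM is prompted to emit per-band gains for a described listening context. The `call_llm` stub, the five-band layout, and the ±12 dB gain range are all assumptions for illustration.

```python
# Hypothetical sketch: mapping a natural language listening context to
# parametric EQ band gains via an LLM prompt. Not the authors' implementation.
import json

def call_llm(prompt: str) -> str:
    # Stand-in for any chat-completion API; wire up your own LLM client here.
    raise NotImplementedError

EQ_BANDS_HZ = [60, 230, 910, 3600, 14000]  # assumed 5-band layout

def eq_from_context(context: str) -> dict[int, float]:
    """Ask the LLM for per-band gains (dB) for a described listening context."""
    prompt = (
        "You are an audio equalizer. Given the listening context below, "
        f"return JSON mapping each center frequency in {EQ_BANDS_HZ} (Hz) "
        "to a gain in dB between -12 and 12.\n"
        f"Context: {context}\nJSON:"
    )
    gains = json.loads(call_llm(prompt))
    # Clamp to a safe range in case the model drifts outside it.
    return {int(f): max(-12.0, min(12.0, float(g))) for f, g in gains.items()}

# Usage: eq_from_context("late-night jazz on laptop speakers in a quiet room")
```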
📝 Abstract
Conventional audio equalization is a static process that requires cumbersome manual adjustment to adapt to changing listening contexts (e.g., mood, location, or social setting). In this paper, we introduce a Large Language Model (LLM)-based alternative that maps natural language text prompts to equalization settings, enabling a conversational approach to sound system control. Using data collected from a controlled listening experiment, our models exploit in-context learning and parameter-efficient fine-tuning techniques to reliably align with population-preferred equalization settings. Our evaluation methods, which leverage distributional metrics that capture users' varied preferences, show statistically significant improvements in distributional alignment over random sampling and static preset baselines. These results indicate that LLMs could function as "artificial equalizers," contributing to the development of more accessible, context-aware, and expert-level audio tuning methods.
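The abstract does not specify which distributional metrics are used; as one plausible illustration, the sketch below scores alignment via the per-band 1-D Wasserstein distance between model-sampled EQ gains and population preference samples, compared against the random-sampling and static-preset baselines mentioned above. The data here is synthetic toy data, not the paper's.

```python
# Hypothetical sketch of a distributional alignment score: mean per-band
# 1-D Wasserstein distance between model samples and population preferences.
# Lower is better. The paper's actual metrics may differ.
import numpy as np
from scipy.stats import wasserstein_distance

def alignment_score(model_gains: np.ndarray, population_gains: np.ndarray) -> float:
    """Both arrays have shape (num_samples, num_bands), values in dB."""
    return float(np.mean([
        wasserstein_distance(model_gains[:, b], population_gains[:, b])
        for b in range(population_gains.shape[1])
    ]))

rng = np.random.default_rng(0)
population = rng.normal(loc=3.0, scale=2.0, size=(200, 5))   # toy preference data
random_baseline = rng.uniform(-12, 12, size=(200, 5))        # random-sampling baseline
static_preset = np.tile([2.0, 1.0, 0.0, 1.0, 3.0], (200, 1)) # static-preset baseline

print(alignment_score(random_baseline, population))  # typically large
print(alignment_score(static_preset, population))    # smaller but nonzero
```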