Feature Selection via Graph Topology Inference for Soundscape Emotion Recognition

📅 2025-09-20

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

This paper addresses the opacity and unreliability of models that relate acoustic features to the affective dimensions of arousal and valence in soundscape emotion recognition (SER). It proposes a graph topology inference framework based on linear structural equation models (SEM), which combines an information criterion with a generalized elbow detector to learn a sparse graph of the relations between features and emotional outputs, and which reports an uncertainty interval alongside the selected sparsity level. Experiments on the Emo-Soundscapes dataset include visualizations of the inferred feature-emotion relations. Several findings agree with previous studies, but the inferred graph also shows a strong connection between arousal and valence, which challenges the conventional assumption that the two dimensions are orthogonal.

📝 Abstract

Research on soundscapes has shifted the focus of environmental acoustics from noise levels to the perception of sounds, incorporating contextual factors. Soundscape emotion recognition (SER) models perception using a set of features, with arousal and valence commonly regarded as sufficient descriptors of affect. In this work, we blend graph learning techniques with a novel information criterion to develop a feature selection framework for SER. Specifically, we estimate a sparse graph representation of feature relations using linear structural equation models (SEM) tailored to the widely used Emo-Soundscapes dataset. The resulting graph captures the relations between input features and the two emotional outputs. To determine the appropriate level of sparsity, we propose a novel generalized elbow detector, which provides both a point estimate and an uncertainty interval. We conduct an extensive evaluation of our methods, including visualizations of the inferred relations. While several of our findings align with previous studies, the graph representation also reveals a strong connection between arousal and valence, challenging common SER assumptions.
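To make the idea of a sparse linear SEM graph concrete, here is a minimal sketch, not the paper's estimator. It assumes the model Z ≈ Z A with a zero diagonal and fits each column of A with a separate lasso regression (node-wise regression), which is a common stand-in but does not by itself identify edge direction. The helper names (`lasso_cd`, `sparse_sem`), the penalty value, and the synthetic data (five features plus an arousal and a valence column) are all hypothetical and are not taken from Emo-Soundscapes.

```python
import numpy as np

def soft_threshold(z, t):
    """Soft-thresholding operator used by the lasso update."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_cd(X, y, lam, n_iter=200):
    """Coordinate descent for (0.5 / n) * ||y - X w||^2 + lam * ||w||_1."""
    n, p = X.shape
    w = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    r = y.copy()                          # residual for w = 0
    for _ in range(n_iter):
        for k in range(p):
            if col_sq[k] == 0.0:
                continue
            r += X[:, k] * w[k]           # remove coordinate k from the fit
            rho = X[:, k] @ r / n
            w[k] = soft_threshold(rho, lam) / col_sq[k]
            r -= X[:, k] * w[k]           # add the updated coordinate back
    return w

def sparse_sem(Z, lam):
    """Estimate A in Z ≈ Z A with zero diagonal, one lasso fit per node."""
    d = Z.shape[1]
    A = np.zeros((d, d))
    for j in range(d):
        others = [k for k in range(d) if k != j]
        A[others, j] = lasso_cd(Z[:, others], Z[:, j], lam)
    return A

# Synthetic stand-in data (assumption): 5 features, then arousal, then valence.
rng = np.random.default_rng(0)
F = rng.standard_normal((500, 5))
arousal = 1.5 * F[:, 0] - 1.0 * F[:, 2] + rng.standard_normal(500)
valence = 0.8 * arousal + 1.0 * F[:, 4] + 0.3 * rng.standard_normal(500)
Z = np.column_stack([F, arousal, valence])
Z = (Z - Z.mean(axis=0)) / Z.std(axis=0)  # standardize every node

A = sparse_sem(Z, lam=0.05)               # A[i, j]: weight of node i in node j's equation
selected_for_valence = np.flatnonzero(A[:, 6])
print("nodes selected for valence:", selected_for_valence)
```

Reading the nonzero entries of the arousal and valence columns of `A` gives a feature selection for each output, and the entry linking arousal to valence is where a connection between the two affective dimensions would show up in a graph of this kind.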

Problem

Research questions and friction points this paper is trying to address.

Developing a feature selection framework for soundscape emotion recognition

Estimating a sparse graph representation of feature relations using SEM

Challenging common SER assumptions about the arousal-valence relationship

Innovation

Methods, ideas, or system contributions that make the work stand out.

Graph learning with information criterion for feature selection

Sparse graph representation using structural equation models

Generalized elbow detector for sparsity level determination
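The source does not describe how the generalized elbow detector works, so the following is only an illustration of the general idea of picking a sparsity level from a criterion curve. It uses the classical chord-distance elbow heuristic and, as a purely hypothetical way to express uncertainty, reports the range of candidates whose distance is within a fraction `tol` of the maximum. The function name, the `tol` parameter, and the example curve are assumptions made for this sketch.

```python
import numpy as np

def elbow_by_chord_distance(x, y, tol=0.9):
    """Return the elbow index and a (low, high) index range of near-elbow points."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Rescale both axes to [0, 1] so the distance does not depend on units.
    xn = (x - x.min()) / (x.max() - x.min())
    yn = (y - y.min()) / (y.max() - y.min())
    # Perpendicular distance of every point to the chord joining the endpoints.
    dx, dy = xn[-1] - xn[0], yn[-1] - yn[0]
    dist = np.abs(dx * (yn - yn[0]) - dy * (xn - xn[0])) / np.hypot(dx, dy)
    k = int(np.argmax(dist))
    # Hypothetical interval: all points nearly as far from the chord as the elbow.
    near = np.flatnonzero(dist >= tol * dist[k])
    return k, (int(near.min()), int(near.max()))

# Made-up criterion values for 1..10 retained edges (illustration only).
levels = np.arange(1, 11)
criterion = np.array([10.0, 6.0, 3.0, 1.5, 1.2, 1.0, 0.9, 0.85, 0.8, 0.78])

k, (lo, hi) = elbow_by_chord_distance(levels, criterion)
print("elbow at level", levels[k], "with near-elbow range", levels[lo], "to", levels[hi])
```

A wide near-elbow range signals a flat curve where the choice of sparsity level is weakly determined, which is the kind of information an uncertainty interval is meant to convey.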

🔎 Similar Papers

Graph Neural Networks in EEG-based Emotion Recognition: A Survey