AI Summary
This work proposes Dynamic Meta-Metrics (DMM), a framework that addresses a limitation of conventional machine translation evaluation metrics: they rely on static or language-specific combination weights and cannot adapt to the characteristics of individual source sentences. DMM adjusts the combination weights of existing metrics in two ways: hard conditioning, which clusters source sentences and fits an interpretable combiner per cluster, and an exploratory soft conditioning, in which the weights vary continuously with source-cluster responsibilities. The combiners studied include linear models, Gaussian processes, and multilayer perceptrons (MLPs). On multilingual data from the WMT Metrics Shared Task, evaluated with pairwise agreement measures at the system and segment levels, MLP-based combinations outperform the linear and Gaussian process ensembles, and soft conditioning yields gains over linear models.
Abstract
We propose Dynamic Meta-Metrics (DMM), a framework for machine translation evaluation that learns source-sentence conditioned combinations of existing metrics. Rather than relying on a single static ensemble or language-specific weighting, DMM adapts the metric combination based on properties of the source segment. We study hard conditioning, which fits an interpretable combiner per cluster, and an exploratory soft-conditioned extension whose weights vary continuously with source-cluster responsibilities. We evaluate DMM on the WMT Metrics Shared Task data across multiple language pairs using pairwise agreement measures at the system and segment levels. Across settings, MLP-based combinations outperform linear and Gaussian process-based ensembles, and introducing soft conditioning yields gains over linear models.
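To make the distinction between hard and soft conditioning concrete, the following is a minimal sketch and not the authors' implementation. It assumes that source sentences are represented by feature vectors `X`, that clusters come from plain k-means, that responsibilities are a softmax over negative squared distances to the centroids, and that the combiner is a linear least-squares model (the abstract's MLP and Gaussian process combiners are not reproduced here). All function names and the `temperature` parameter are hypothetical.

```python
import numpy as np

def sq_dists(X, centroids):
    """Squared Euclidean distance from each row of X to each centroid."""
    return ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=-1)

def kmeans(X, n_clusters, n_iter=20, seed=0):
    """Plain Lloyd's k-means (assumed clustering method); returns centroids."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=n_clusters, replace=False)].copy()
    for _ in range(n_iter):
        labels = sq_dists(X, centroids).argmin(axis=1)
        for c in range(n_clusters):
            if np.any(labels == c):  # keep the old centroid if a cluster empties
                centroids[c] = X[labels == c].mean(axis=0)
    return centroids

def responsibilities(X, centroids, temperature=1.0):
    """Soft cluster memberships: softmax over negative squared distances."""
    logits = -sq_dists(X, centroids) / temperature
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    r = np.exp(logits)
    return r / r.sum(axis=1, keepdims=True)

def fit_hard(M, y, labels, n_clusters):
    """Hard conditioning: one least-squares weight vector per cluster."""
    W = np.zeros((n_clusters, M.shape[1]))
    for c in range(n_clusters):
        mask = labels == c
        if mask.any():
            W[c] = np.linalg.lstsq(M[mask], y[mask], rcond=None)[0]
    return W

def predict_hard(M, labels, W):
    """Combine metric scores with the weights of each segment's cluster."""
    return (M * W[labels]).sum(axis=1)

def fit_soft(M, y, R):
    """Soft conditioning: weights w(x) = sum_c R[x, c] * W[c], fit jointly.

    The model is linear in the products R[i, c] * M[i, j], so a single
    least-squares solve over those expanded features recovers W.
    """
    n, k = M.shape
    feats = (R[:, :, None] * M[:, None, :]).reshape(n, -1)
    W = np.linalg.lstsq(feats, y, rcond=None)[0]
    return W.reshape(R.shape[1], k)

def predict_soft(M, R, W):
    """Per-segment weights R @ W vary continuously with responsibilities."""
    return ((R @ W) * M).sum(axis=1)

# Synthetic stand-ins: 4-d source features, 3 metric scores, noisy targets.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
M = rng.uniform(size=(200, 3))
y = M @ np.array([0.5, 0.3, 0.2]) + 0.01 * rng.normal(size=200)

centroids = kmeans(X, n_clusters=2)
labels = sq_dists(X, centroids).argmin(axis=1)
R = responsibilities(X, centroids)

hard_pred = predict_hard(M, labels, fit_hard(M, y, labels, 2))
soft_pred = predict_soft(M, R, fit_soft(M, y, R))
```

In this sketch, hard conditioning is the special case of soft conditioning in which each row of `R` is one-hot, so the two variants share the same parameter shape (one weight vector per cluster) and differ only in how sharply a segment is assigned to clusters.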