Publication: EuqAud: Detecting Gender Bias in Audio Datasets Using Polynomial Regression-Based Metric
Type:
Article
Date
2026
Publisher
Institute of Electrical and Electronics Engineers Inc.
Abstract
With the growing adoption of audio-based AI systems in high-stakes domains such as healthcare, law enforcement, and social media, ensuring fairness, particularly with regard to gender bias, has become critically important. While prior work on fairness has predominantly focused on disparities in model performance, bias inherent in training datasets remains underexplored. To address this gap, we propose EuqAud, a novel, pre-trained, and traceable fairness metric that quantifies gender bias in audio datasets using raw acoustic features such as pitch, energy, amplitude, and voice activity. Unlike methods that depend on demographic labels such as race, age, or language, EuqAud is designed to be demographic- and language-agnostic, enhancing its applicability across diverse contexts. The score is computed using an equation derived from polynomial regression with L2 regularization (Ridge regression), yielding robust and generalizable outputs. It ranges from −10 to 10, where 0 denotes a neutral dataset, positive scores indicate male-dominant bias, and negative scores indicate female-dominant bias. For clarity, bias severity is categorized by score magnitude into three tiers: Neutral (|EuqAud| < 2), Moderate Bias (2 ≤ |EuqAud| ≤ 6), and Strong Bias (|EuqAud| > 6). Evaluation across multiple datasets demonstrates high predictive performance, with R² values between 0.95 and 0.99. By focusing on dataset-level bias rather than model outcomes, EuqAud offers a scalable and rigorous solution for advancing fairness in audio-based AI systems.
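The scoring convention in the abstract can be illustrated with a minimal Python sketch. It does not reproduce the authors' regression equation; it only shows how a score that has already been computed might be read. The function name `describe_euqaud` is hypothetical, and applying the tier thresholds to the absolute value of the score is an assumption made here so that negative (female-dominant) scores are tiered the same way as positive ones.

```python
def describe_euqaud(score: float) -> tuple[str, str]:
    """Map a EuqAud-style score in [-10, 10] to (direction, severity).

    Hypothetical helper: the thresholds come from the abstract, but applying
    them to the score's magnitude is an assumption of this sketch.
    """
    if not -10.0 <= score <= 10.0:
        raise ValueError("score must lie in the range [-10, 10]")

    # Assumption: severity tiers are defined on the absolute value.
    magnitude = abs(score)
    if magnitude < 2:
        severity = "Neutral"
    elif magnitude <= 6:
        severity = "Moderate Bias"
    else:
        severity = "Strong Bias"

    # The sign gives the direction of the imbalance.
    if score > 0:
        direction = "male-dominant"
    elif score < 0:
        direction = "female-dominant"
    else:
        direction = "neutral"

    return direction, severity


if __name__ == "__main__":
    for s in (-7.5, 0.0, 2.0, 6.0, 6.1):
        print(s, describe_euqaud(s))
```

Under these assumptions, a score of −7.5 would be read as a strong female-dominant bias, while a score of 2.0 falls at the lower edge of the moderate tier.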
Keywords
Audio datasets, bias detection, EuqAud, gender bias, polynomial regression, responsible AI
