AI startup Artie has released a new tool to measure bias in speech recognition systems. The Artie Bias Corpus (ABC) was put together using Mozilla’s Common Voice corpus, and includes 1,712 individual audio clips and transcriptions.
Those clips represent 2.4 hours of total audio, and include recordings of people speaking in 17 distinct English accents. Those people are divided into eight unique age ranges between 18 and 80, each with three gender categories. All of the demographic information is self-reported, and was provided with the consent of the speaker.
Artie is hoping that developers will be able to use the corpus to identify and eventually eliminate bias in speech recognition systems, which will in turn help prevent discrimination based on age, gender, accent, and other factors. For example, a recent study found that leading speech recognition systems (from companies like Apple, Amazon, and Google) were collectively far less accurate when applied to black voices (a 35 percent error rate) than white ones (19 percent).
Artie itself is currently developing a platform for mobile games that leverage AI in some capacity. However, the ABC tool can ultimately be used to assess any speech recognition system, regardless of industry.
“Bias can render a technology unusable for someone because of their demographic,” said Artie lead scientist and Mozilla research fellow Josh Meyer. “Even for well-resourced languages like English, state of the art speech recognizers cannot understand all native accents reliably, and they often understand men better than women. The solution is to face the problem, and work toward solutions.”
To test its tool, Artie used ABC to analyze Mozilla’s open source DeepSpeech models, finding that they favored U.S. and British accents. Artie also found that a publicly available Google model had more gender bias than a publicly available counterpart from Amazon.
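For readers curious what such a comparison might look like in practice, the sketch below computes word error rate (WER) separately for each demographic group and returns the rates side by side. It is a minimal illustration, not Artie's published method: the function names, the `accent_a`/`accent_b` labels, and the sample transcripts are all made up for the example and are not drawn from the Artie Bias Corpus.

```python
from collections import defaultdict


def word_edit_distance(ref_words, hyp_words):
    """Levenshtein distance counted over whole words, not characters."""
    prev = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion (word missing from hypothesis)
                curr[j - 1] + 1,     # insertion (extra word in hypothesis)
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def wer_by_group(samples):
    """Return {group: WER} for (group, reference, hypothesis) triples.

    WER here is total word errors divided by total reference words,
    pooled across all clips in the group.
    """
    errors = defaultdict(int)
    words = defaultdict(int)
    for group, reference, hypothesis in samples:
        ref_words = reference.lower().split()
        hyp_words = hypothesis.lower().split()
        errors[group] += word_edit_distance(ref_words, hyp_words)
        words[group] += len(ref_words)
    return {g: errors[g] / words[g] for g in words if words[g] > 0}


# Hypothetical data: group labels and transcripts are invented for illustration.
samples = [
    ("accent_a", "turn on the lights", "turn on the lights"),
    ("accent_a", "play the next song", "play the next song"),
    ("accent_b", "turn on the lights", "turn on the light"),
    ("accent_b", "play the next song", "play next song"),
]

rates = wer_by_group(samples)
print(rates)
```

A large gap between groups on comparable material is the kind of signal a bias corpus is meant to surface. A real evaluation would also need enough clips per group for the difference to be statistically meaningful.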
The results show that bias is still a problem for many voice recognition providers, even if that bias is not as well publicized as the racial bias that has been found in many facial recognition systems.
Source: Venture Beat
July 17, 2020 – by Eric Weiss