A group of seven influential women studying algorithmic bias, AI, and technology have released a spoken word piece called “Voicing Erasure.” The project highlights racial bias in the speech recognition systems made by tech giants and recognizes the overlooked contributions of female scholars and researchers in the field.

A report titled “Racial disparities in automated speech recognition” was also published roughly a week ago. The authors found that automatic speech recognition systems from Apple, Amazon, Google, IBM, and Microsoft collectively achieve an average word error rate of 35% for African-American voices, versus 19% for white voices. These systems transcribe speech to text and power AI assistants like Alexa, Cortana, and Siri.
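The 35% and 19% figures are word error rates (WER), the standard metric for ASR accuracy. As a minimal illustrative sketch (not code from the study), WER is the word-level edit distance between a reference transcript and the system’s output, divided by the number of words in the reference:

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level Levenshtein distance / reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    # Dynamic-programming edit distance over words.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i  # all deletions
    for j in range(len(hyp) + 1):
        d[0][j] = j  # all insertions
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + sub)  # substitution
    return d[len(ref)][len(hyp)] / max(len(ref), 1)

# One substitution out of five reference words -> WER of 0.2 (20%).
print(wer("she had your dark suit", "she had a dark suit"))
```

A WER of 35% thus means roughly one in three words is transcribed incorrectly, relative to what the speaker actually said.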

The Voicing Erasure project is a product of the Algorithmic Justice League, a group founded by Joy Buolamwini. Participants in the spoken word piece include former White House CTO Megan Smith; Race After Technology author Ruha Benjamin; Design Justice author Sasha Costanza-Chock; and Kimberlé Crenshaw, a professor of law at Columbia Law School and UCLA.

“We cannot let the promise of AI overshadow real and present harms,” Benjamin said in the piece.

In 2018 and 2019, Buolamwini and collaborators carried out audits of facial recognition bias that are frequently cited by lawmakers and activists. The team’s findings are recognized as central to understanding race and gender disparities in the performance of facial recognition systems from tech giants like Amazon and Microsoft. Buolamwini was also part of the Coded Bias documentary, which premiered at the Sundance Film Festival earlier this year, and “AI, Ain’t I A Woman?,” a play on an 1851 Sojourner Truth speech with a similar name.

Additional audits are in the works, Buolamwini told VentureBeat, but the performance piece was made to underscore racial disparities we already know exist in automated speech recognition. The Voicing Erasure project also highlights the ways voice assistants often reinforce gender stereotypes. In an effort to roll back some of that gendered bias, most major assistants today offer both masculine and feminine voice options, with the exception of Amazon’s Alexa.

The poetic protest also recognizes the sexism female researchers can encounter in the field, pointing to a New York Times article about the bias report that cites multiple male authors but fails to recognize lead author Allison Koenecke, who appears in Voicing Erasure. Algorithms of Oppression author Dr. Safiya Noble, who has also been critical of tech journalists, participated in the spoken word project.

“Racial disparities in automated speech recognition” was published in the Proceedings of the National Academy of Sciences by a team of 10 researchers from Stanford University and Georgetown University. They found that Microsoft’s automatic speech recognition tech performed best, while Apple’s and Google’s performed worst.

Each system transcribed speech from 42 white speakers and 73 African-American speakers, drawn from data sets containing nearly 20 hours of voice recordings. Researchers focused on voice data from Humboldt County and Sacramento, California, drawing on data sets with African-American Vernacular English (AAVE), like Voices of California and the Corpus of Regional African American Language (CORAAL).
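To make the comparison concrete, here is a hypothetical sketch of the kind of aggregation such an audit involves: averaging per-speaker word error rates within each group. The field names and numbers below are invented placeholders for illustration, not data from the study.

```python
# Hypothetical per-group aggregation of ASR audit results.
# The entries below are invented placeholders, NOT figures from the study.
from statistics import mean

results = [
    {"group": "African-American", "wer": 0.41},
    {"group": "African-American", "wer": 0.30},
    {"group": "white", "wer": 0.18},
    {"group": "white", "wer": 0.21},
]

# Collect per-speaker word error rates, then average within each group.
by_group: dict[str, list[float]] = {}
for r in results:
    by_group.setdefault(r["group"], []).append(r["wer"])

for group, rates in sorted(by_group.items()):
    print(f"{group}: mean WER {mean(rates):.2f}")
```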

The authors said these discrepancies likely stem from speech recognition systems being trained on insufficient audio data from African-American speakers. They said the error rates also highlight the need for speech recognition system makers, academics, and governments sponsoring research to invest in inclusivity.

“Such an effort, we believe, should entail not only better collection of data on AAVE speech but also better collection of data on other nonstandard varieties of English, whose speakers may similarly be burdened by poor ASR performance — including those with regional and nonnative-English accents,” the report reads. “We also believe developers of speech recognition tools in industry and academia should regularly assess and publicly report their progress along this dimension.”

In statements following the release of the study, Google and IBM Watson pledged to do more to correct this type of bias.
