The Gradient, 08/04/2024

A Brief Overview of Gender Bias in AI

By Yennie Jun
AI models reflect, and often exaggerate, existing gender biases from the real world. It is important to quantify such biases present in models in order to properly address and mitigate them.

In this article, I showcase a small selection of important work done (and currently being done) to uncover, evaluate, and measure different aspects of gender bias in AI models. I also discuss the implications of this work and highlight a few gaps I’ve noticed.

But What Even Is Bias?

All of these terms (“AI”, “gender”, and “bias”) can be somewhat overused and ambiguous. “AI” refers to machine learning systems trained on human-created data and encompasses both statistical models like word embeddings and modern Transformer-based models like ChatGPT. “Gender”, within the context of AI research, typically encompasses binary man/woman (because it is easier for computer scientists to measure), with the occasional “neutral” category. Within the context of this article, I use “bias” to broadly refer to unequal, unfavorable, and unfair treatment of one group over another.

There are many different ways to categorize, define, and quantify bias, stereotypes, and harms, but this is outside the scope of this article. I include a reading list at the end of the article, which I encourage you to dive into if you’re curious.

A Short History of Studying Gender Bias in AI

Here, I cover a very small sample of papers I’ve found influential in the study of gender bias in AI. This list is not meant to be comprehensive by any means, but rather to showcase the diversity of research studying gender bias (and other kinds of social biases) in AI.

Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings (Bolukbasi et al., 2016)

Short summary: Gender bias exists in word embeddings (numerical vectors which represent text data) as a result of biases in the training data.

Longer summary: Given the analogy “man is to king as woman is to x”, the authors used simple arithmetic on word embeddings to find that x = queen fits best.

Figure: Subtracting the vector representation for “man” from “woman” gives a similar result to subtracting the vector representation for “king” from “queen”. From Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings.

However, the authors also found sexist analogies in the embeddings, such as:

- He is to carpentry as she is to sewing
- Father is to doctor as mother is to nurse
- Man is to computer programmer as woman is to homemaker

Figure: Subtracting the vector representation for “man” from “woman” gives a similar result to subtracting the vector representation for “computer programmer” from “homemaker”. From Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings.

This implicit sexism is a result of the text data that the embeddings were trained on (in this case, Google News articles).

Figure: Gender stereotypes and gender-appropriate analogies found in word embeddings, for the analogy “she is to X as he is to Y”. From Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings.

Mitigations: The authors propose a methodology for debiasing word embeddings based on a set of definitional gender words (such as female, male, woman, man, girl, boy, sister, brother), which are used to identify a “gender direction” in the embedding space; gender-neutral words are then debiased along that direction.
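To make the vector arithmetic and the debiasing idea concrete, here is a minimal sketch, not the authors’ actual code: it assumes the gensim library and its downloadable “word2vec-google-news-300” vectors, and the definitional pairs and helper functions are my own illustrative choices.

```python
# Minimal sketch in the spirit of Bolukbasi et al. (2016); not the authors' code.
import numpy as np
import gensim.downloader as api

# Word2vec vectors trained on Google News, the same kind of embeddings the paper studied.
kv = api.load("word2vec-google-news-300")

# (1) Analogy arithmetic: vec("king") - vec("man") + vec("woman") lands closest to vec("queen").
print(kv.most_similar(positive=["king", "woman"], negative=["man"], topn=3))

# (2) A rough "gender direction", averaged over a few definitional pairs.
pairs = [("woman", "man"), ("girl", "boy"), ("she", "he"), ("female", "male")]
gender_dir = np.mean([kv[a] - kv[b] for a, b in pairs], axis=0)
gender_dir /= np.linalg.norm(gender_dir)

def gender_score(vec):
    # Cosine similarity with the gender direction (positive leans "female", negative "male").
    return float(np.dot(vec, gender_dir) / np.linalg.norm(vec))

def neutralize(word):
    # Remove the component of the word's vector that lies along the gender direction.
    vec = kv[word]
    return vec - np.dot(vec, gender_dir) * gender_dir

for word in ["programmer", "homemaker", "doctor", "nurse", "brother", "sister"]:
    print(f"{word:12s} before: {gender_score(kv[word]):+.3f}  after: {gender_score(neutralize(word)):+.3f}")
```

The paper’s full method is more careful than this sketch: it finds the gender subspace with PCA over many definitional pairs, neutralizes only words that should be gender-neutral (so brother/sister keep their gender), and adds an “equalize” step so that gendered pairs stay symmetric around the neutral words.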
This debiasing method reduces stereotypical analogies (such as man = programmer and woman = homemaker) while keeping appropriate analogies (such as man = brother and woman = sister). The method applies only to word embeddings, so it doesn’t quite carry over to the more complicated Transformer-based AI systems we have now (e.g. LLMs like ChatGPT). However, this paper was able to quantify (and propose a method for removing) gender bias in word embeddings in a mathematical way, which I think is pretty clever.

Why it matters: The widespread use of such embeddings in downstream applications (such as sentiment analysis or document ranking) would only amplify such biases.

Gender Shades: Intersectional Accuracy Disparities in Commercial Gender Classification (Buolamwini and Gebru, 2018)

Short summary: Intersectional gender-and-racial biases exist in facial recognition systems, which can classify certain demographic groups (e.g. darker-skinned females) with much lower accuracy than other groups (e.g. lighter-skinned males).

Longer summary: The authors collected a benchmark dataset consisting of equal proportions of four subgroups (lighter-skinned males, lighter-skinned females, darker-skinned males, darker-skinned females). They evaluated three commercial gender classifiers and found all of them to perform better on male faces than female faces, better on lighter faces than darker faces, and worst on darker female faces (with error rates up to 34.7%). In contrast, the maximum error rate for lighter-skinned male faces was 0.8% (a sketch of this kind of subgroup breakdown appears below).

Table: The accuracy of three different facial classification systems on four different subgroups. Table sourced from the Gender Shades overview website.

Mitigation: In direct response to this paper,
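As a concrete illustration of the disaggregated evaluation described in the Gender Shades summary above, here is a minimal sketch; the DataFrame, its column names, and the toy predictions are hypothetical, not the authors’ benchmark.

```python
# Minimal sketch of an intersectional error-rate audit in the spirit of Gender Shades.
# The data and column names below are made up for illustration.
import pandas as pd

# One row per face image: skin-type group, labeled gender, and the classifier's prediction.
df = pd.DataFrame({
    "skin":       ["lighter", "lighter", "lighter", "darker", "darker", "darker"],
    "gender":     ["male", "female", "female", "male", "female", "female"],
    "prediction": ["male", "female", "female", "male", "male",   "male"],
})

df["error"] = (df["prediction"] != df["gender"]).astype(float)

# An aggregate error rate hides the disparity...
print("overall error rate:", df["error"].mean())

# ...so report the error rate for each intersectional subgroup separately,
# as Gender Shades did for lighter/darker x male/female.
print(df.groupby(["skin", "gender"])["error"].mean())
```

The point of the intersectional breakdown is that a model can look accurate on average, or even along gender and skin type separately, while failing badly on one specific subgroup.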