1. Evaluating and mitigating social bias
Welcome back. In this video, we'll learn about the key social biases that can appear in AI models and explore methods to evaluate and mitigate them.
2. What do we mean by social bias?
Social bias refers to systematic unfairness in AI systems, typically unfairness that favors or disadvantages certain groups.
As generative AI becomes more prevalent in everyday life, such bias can have serious social consequences. Imagine an AI hiring tool that only interviews people whose names sound a certain way, or a medical AI trained only on data from patients who don't represent the broader population.
But what is fair? Fairness might mean different things to different people.
Inherent societal blind spots and differing perspectives make this a difficult topic,
but focusing on broadly shared human values can help us seek fairness and unbiased outcomes when generative AI is developed and applied.
3. Where bias appears
Bias can appear in training data,
the model itself,
and how the model is used.
Let's look at each.
4. Bias in data
First, training data may lack diversity or misrepresent groups, leading to skewed outputs.
A model can only generate responses similar to its training data. If the training data overrepresents, underrepresents, or misrepresents certain groups, the model's outputs will reflect that.
For example, if a model is trained only on purple squares, it can only generate purple squares.
However, if it's trained on a variety of shapes and colors, it will be able to generate a variety of shapes and colors.
5. Bias in models
Second, assumptions or optimization choices might make a model pursue a narrow goal that results in bias.
For example, consider a generative AI developed to write political speeches. Its goal of winning an election may have unintended consequences, such as stirring up group rivalries to win votes.
The training data may be unbiased and users might have no intention of creating ill will, but models can still produce results that we perceive as biased.
6. Bias in use
Third, unfair outcomes can occur when users apply generative AI in careless or malicious ways.
For example, a user may use detailed prompting tricks to get a video generation AI to make fake videos that promote biased or harmful narratives. Even if the AI itself is unbiased, this user applies it in a way that creates biased outcomes.
We'll discuss this further in a future video.
7. Identifying bias in data and models
To fix bias, we must first detect it. Some key techniques are as follows.
Representation analysis compares how the model refers to different groups. For example, we can check whether the model uses very different language when referring to men versus women, which may indicate gender bias.
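As a rough illustration, here is a minimal Python sketch of such an analysis. It assumes we have already collected model completions for prompts that mention two groups; the example completions and descriptor words below are invented purely for illustration.

```python
from collections import Counter

# Hypothetical completions from a generative model for prompts about two groups.
# In practice, you would collect these from your own model.
outputs_group_a = [
    "the engineer was brilliant and decisive",
    "a confident leader who takes charge",
]
outputs_group_b = [
    "the engineer was helpful and caring",
    "a supportive colleague who assists others",
]

def word_counts(texts):
    """Count how often each word appears across a list of completions."""
    counts = Counter()
    for text in texts:
        counts.update(text.lower().split())
    return counts

counts_a = word_counts(outputs_group_a)
counts_b = word_counts(outputs_group_b)

# Descriptors that appear far more often for one group than the other
# can flag skewed language worth a closer human review.
for word in ["brilliant", "confident", "helpful", "caring"]:
    print(word, counts_a[word], counts_b[word])
```

In real evaluations, this idea scales up to thousands of prompts and richer measures such as sentiment or toxicity scores.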
Fairness metrics can be calculated by algorithms that evaluate models for equal treatment, opportunity, and accuracy across groups. Because they are computed automatically, these metrics can cover many more examples and detect subtle biases that humans might miss.
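To make this concrete, here is a small sketch of two widely used group fairness metrics, demographic parity and equal opportunity, computed on hypothetical hiring decisions. The arrays below are invented example data, not output from any real system.

```python
import numpy as np

# Hypothetical binary decisions (e.g., "invite to interview"), true outcomes,
# and a group label for each person.
y_pred = np.array([1, 0, 1, 1, 0, 1, 0, 0])
y_true = np.array([1, 0, 1, 0, 0, 1, 1, 0])
group  = np.array(["a", "a", "a", "a", "b", "b", "b", "b"])

def selection_rate(pred, mask):
    """Share of positive decisions within one group."""
    return pred[mask].mean()

def true_positive_rate(pred, true, mask):
    """Share of truly qualified people the model selects within one group."""
    qualified = mask & (true == 1)
    return pred[qualified].mean()

mask_a, mask_b = group == "a", group == "b"

# Demographic parity gap: difference in selection rates between groups.
dp_gap = selection_rate(y_pred, mask_a) - selection_rate(y_pred, mask_b)

# Equal opportunity gap: difference in true positive rates between groups.
eo_gap = true_positive_rate(y_pred, y_true, mask_a) - true_positive_rate(y_pred, y_true, mask_b)

print(f"Demographic parity gap: {dp_gap:.2f}")
print(f"Equal opportunity gap:  {eo_gap:.2f}")
```

Gaps near zero suggest similar treatment on that metric; larger gaps are a signal to investigate further.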
Finally, there are human audits, where people review a model's outputs to identify bias.
8. Mitigating bias in data and models
How do we address bias? Many strategies exist; here are several common ones.
Training data can be diversified to address underrepresentation.
The model can be adjusted to give more weight to certain kinds of data. This is helpful when fewer examples exist for groups that need better representation.
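One common way to do this is to reweight training examples so that underrepresented groups count for more in the loss. Here is a minimal sketch using invented group labels; the inverse-frequency weighting shown is just one reasonable choice.

```python
import numpy as np

# Hypothetical group labels for a training set where group "b" is underrepresented.
groups = np.array(["a"] * 90 + ["b"] * 10)

# Weight each example inversely to its group's frequency so that
# underrepresented groups contribute more to the training loss.
labels, counts = np.unique(groups, return_counts=True)
weight_per_group = {g: len(groups) / (len(labels) * c) for g, c in zip(labels, counts)}
sample_weights = np.array([weight_per_group[g] for g in groups])

print(weight_per_group)  # roughly {'a': 0.56, 'b': 5.0}
# Many training APIs accept such weights, for example a sample_weight
# argument in scikit-learn's fit() or a weighted sampler/loss in deep learning code.
```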
Adversarial training can be used, as in a GAN. Train a separate model to detect bias in the generative AI's outputs, then use that feedback to adjust the generative model.
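Here is a simplified sketch of that idea in PyTorch, using synthetic data: a small encoder and task head learn the main prediction, while an adversary tries to recover a group attribute from the encoder's representation, and the main model is penalized whenever the adversary succeeds. The architecture, data, and penalty weight are illustrative assumptions, not a production recipe.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Synthetic data: 200 examples, 8 features, a binary task label y, and a binary group attribute g.
X = torch.randn(200, 8)
y = torch.randint(0, 2, (200,)).float()
g = torch.randint(0, 2, (200,)).float()

encoder = nn.Sequential(nn.Linear(8, 16), nn.ReLU())
task_head = nn.Linear(16, 1)   # predicts the task label
adversary = nn.Linear(16, 1)   # tries to predict the group attribute

opt_main = torch.optim.Adam(list(encoder.parameters()) + list(task_head.parameters()), lr=1e-2)
opt_adv = torch.optim.Adam(adversary.parameters(), lr=1e-2)
bce = nn.BCEWithLogitsLoss()
lam = 1.0  # how strongly to penalize recoverable group information

for step in range(200):
    # 1) Update the adversary to predict the group from the (frozen) representation.
    h = encoder(X).detach()
    adv_loss = bce(adversary(h).squeeze(), g)
    opt_adv.zero_grad()
    adv_loss.backward()
    opt_adv.step()

    # 2) Update the main model: do well on the task while fooling the adversary.
    h = encoder(X)
    task_loss = bce(task_head(h).squeeze(), y)
    fool_loss = bce(adversary(h).squeeze(), g)
    main_loss = task_loss - lam * fool_loss
    opt_main.zero_grad()
    main_loss.backward()
    opt_main.step()

print(f"final task loss: {task_loss.item():.3f}, adversary loss: {adv_loss.item():.3f}")
```

The feedback loop is the key point: the adversary's ability to detect group information becomes a training signal that pushes the model toward less biased behavior.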
We can also continuously evaluate and improve our models. This includes making responsible adjustments as new de-biasing techniques arise, and engaging diverse stakeholders throughout development so that potential biases are surfaced and addressed early on.
9. Let's practice!
Time to practice our understanding of biases in generative AI and the strategies to mitigate them.