1. Parameter estimation
In this lesson, we'll introduce the process of estimating the parameters in mixture models.
2. The problem
To start, imagine we have twenty points and we want to cluster them with a mixture model.
Here you can observe how they are ordered on the x axis.
3. Assumptions
Let's assume the points come from two Gaussian distributions.
So, the parameters are the two means, the two proportions, and the two standard deviations.
Moreover, let's suppose the values of the standard deviations are both equal to one.
How can we estimate the four remaining parameters, that is, the two means and the two proportions?
To answer this, you will learn an iterative method that is separated into two steps.
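To make these assumptions concrete, here is a minimal sketch in R that simulates twenty points from two Gaussians with standard deviation one. The seed, the means (3 and 5), and the proportion (0.3) are illustrative assumptions and are not taken from the lesson's data.

```r
# Minimal sketch: simulate twenty points from a two-component Gaussian mixture.
# All numeric values below are assumptions chosen for illustration.
set.seed(1)                      # hypothetical seed, for reproducibility only
n <- 20
prop_red <- 0.3                  # assumed proportion of the red component
means <- c(red = 3, blue = 5)    # assumed component means

# Draw a component label for each point, then draw the point itself
# from the corresponding Gaussian with standard deviation fixed to 1.
component <- sample(c("red", "blue"), size = n, replace = TRUE,
                    prob = c(prop_red, 1 - prop_red))
x <- rnorm(n, mean = means[component], sd = 1)

sort(x)  # the points ordered along the x axis
```

In the real problem we only observe x; the labels in component are exactly what the mixture model has to recover.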
4. Two steps
The first step assumes that we somehow know the probabilities of belonging to each cluster, represented by the colour of the points, and tries to estimate the means and the proportions of the Gaussians.
The second step assumes that we somehow know the means and the proportions of the Gaussians, and tries to estimate the probabilities.
5. Step 1: Known probabilities
When we know the probabilities, the data will have three columns: one with the original values, represented by x, and two more with the probabilities of belonging to each colour.
In this case, the estimation of the means is a weighted average of the observations, where the weights are the probabilities themselves.
6. Step 1: Known probabilities
Thus, to estimate the mean of the red points, for example, we sum each observation multiplied by its probability of being red and divide by the total probability of being red, as shown in the code.
The result for the red points is a sample mean of 2.8 and for the blue points, 5.1.
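As a rough sketch of this weighted average in base R, assuming a small made-up data frame (these five rows and the column names prob_red and prob_blue are hypothetical, so the results will not match the 2.8 and 5.1 quoted above):

```r
# Hypothetical data: values and the probability of belonging to each colour.
# These five rows are made up for illustration; they are not the lesson's data.
data <- data.frame(
  x        = c(2.1, 2.9, 3.6, 4.8, 5.5),
  prob_red = c(0.9, 0.8, 0.5, 0.1, 0.05)
)
data$prob_blue <- 1 - data$prob_red

# Weighted average: sum(x * weight) / sum(weight)
mean_red  <- sum(data$x * data$prob_red)  / sum(data$prob_red)
mean_blue <- sum(data$x * data$prob_blue) / sum(data$prob_blue)

# Base R's weighted.mean() computes the same quantity
all.equal(mean_red, weighted.mean(data$x, data$prob_red))
```

Points that are almost certainly red pull the red mean strongly, while points that are probably blue barely move it.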
To estimate the proportions, we calculate the fraction corresponding to the red points as the sample mean of the probability of being red. The same applies to the blue ones.
We can observe that the red points represent around 30 percent of the points, so the blue points are about 70 percent.
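The proportion estimate can be sketched the same way. The probabilities below are hypothetical, so the resulting proportions differ from the 30 and 70 percent mentioned above:

```r
# Hypothetical probabilities of being red for five points (made up for illustration)
prob_red  <- c(0.9, 0.8, 0.5, 0.1, 0.05)
prob_blue <- 1 - prob_red

# The estimated proportion of each cluster is the average probability
# of belonging to that cluster.
prop_red  <- mean(prob_red)
prop_blue <- mean(prob_blue)

prop_red + prop_blue  # the two proportions sum to one
```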
7. Step 1: Known probabilities
Then, the distributions that characterize each cluster look as shown in the image.
8. Step 2: Known means and proportions
Now, what would happen if we do know the means and the proportions of the distributions?
How can we estimate the colour of the points with this information?
For this example, let's consider that the blue Gaussian has a mean of 5 and the red one of 3.
Also, the proportion for the blue is 70% and for the red is 30%.
9. Example: one point
To illustrate the procedure, let's take a look at the orange point depicted in the figure and try to figure out its probability of belonging to each distribution.
10. Example: probability from red
Since we already know the density of each distribution, this step is direct.
The probability given by the red Gaussian to the point, that is, its density at the point weighted by its proportion, is 0.115.
11. Example: probability from blue
And the probability given by the blue Gaussian to the orange point is 0.065.
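One way these two values could be computed in R is with dnorm(), weighting each density by the proportion of its Gaussian. The position of the orange point is not stated in the lesson; the value 3.29 used below is an assumption, chosen only because it roughly reproduces both quoted numbers.

```r
# Assumed position of the orange point (hypothetical, not given in the lesson)
x_orange <- 3.29

# Density of each Gaussian at the point, weighted by its proportion.
# Means (3 and 5), proportions (0.3 and 0.7) and sd = 1 come from the lesson.
prob_from_red  <- 0.3 * dnorm(x_orange, mean = 3, sd = 1)
prob_from_blue <- 0.7 * dnorm(x_orange, mean = 5, sd = 1)

# Approximately 0.115 for red and 0.065 for blue
round(c(red = prob_from_red, blue = prob_from_blue), 3)
```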
12. Step 2: Scaled probabilities
To compare both probabilities though, we need to scale them by the total probability.
For the orange point example, the scaled probability of being blue is 0.065 over the sum of 0.065 and 0.115, which gives 0.36.
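The arithmetic for the orange point can be checked in a couple of lines of R, using the two values quoted above. The names prob_red and prob_blue are just labels picked for this sketch.

```r
# Values quoted in the lesson for the orange point
prob_from_red  <- 0.115
prob_from_blue <- 0.065

# Scale by the total so that the two probabilities sum to one
total     <- prob_from_red + prob_from_blue
prob_red  <- prob_from_red  / total
prob_blue <- prob_from_blue / total

round(c(red = prob_red, blue = prob_blue), 2)  # red 0.64, blue 0.36
```

So the orange point is more likely red than blue, even though the blue cluster is the larger one.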
To do it in R for every point, we can create the new variables prob_from_red and prob_from_blue, which measure the probabilities given by each distribution.
Then, using these two variables, we create the two scaled probabilities, which are the actual probabilities of belonging to each cluster.
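A minimal base R sketch of this step might look as follows. The x values are hypothetical, the output column names prob_red and prob_blue are assumed for illustration, and the lesson's own code may well use a different style; only prob_from_red and prob_from_blue are named in the lesson.

```r
# Hypothetical observations (made up for illustration)
data <- data.frame(x = c(2.1, 2.9, 3.6, 4.8, 5.5))

# Known parameters from the lesson: means 3 and 5, proportions 0.3 and 0.7, sd 1
data$prob_from_red  <- 0.3 * dnorm(data$x, mean = 3, sd = 1)
data$prob_from_blue <- 0.7 * dnorm(data$x, mean = 5, sd = 1)

# Scale by the total so each row's probabilities sum to one
total <- data$prob_from_red + data$prob_from_blue
data$prob_red  <- data$prob_from_red  / total
data$prob_blue <- data$prob_from_blue / total

data
```

Because dnorm() is vectorised, the same two lines handle every point at once, with no explicit loop.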
13. Summary
You saw that when we know the probabilities, the estimation of the parameters can be done easily.
Conversely, when we know the parameters, we can estimate the probabilities.
In the next lesson, you will see that the estimation is done iteratively, alternating between these two steps.
14. Let's practice!
Now it's your turn.