Estimation given the probabilities
Parameter estimation for mixture models is not an easy task. However, if you are given the probability that each observation belongs to each cluster, estimating the cluster means and proportions becomes straightforward.
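As a sketch of why this is straightforward, the estimates reduce to probability-weighted averages. The notation here is an assumption, not taken from the course: x_i is the i-th observation, p_{ik} is the given probability that observation i belongs to cluster k, and n is the number of observations.

```latex
\hat{\mu}_k = \frac{\sum_{i=1}^{n} p_{ik}\, x_i}{\sum_{i=1}^{n} p_{ik}},
\qquad
\hat{\pi}_k = \frac{1}{n} \sum_{i=1}^{n} p_{ik},
\qquad k = 1, 2 .
```

With two clusters, p_{i1} + p_{i2} = 1 for every observation, so the two estimated proportions also sum to 1.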
In this exercise, you will use a dataset generated from two Gaussian distributions, called gaussian_sample_with_probs. In its original form it has only the column x, but here you are also given the probabilities of belonging to each cluster (prob_cluster1 and prob_cluster2). The aim is to estimate the parameters and then visualize the estimated mixture.
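To make the idea concrete, here is a minimal base-R sketch on simulated data. It does not use the course dataset: the data frame toy, the true means (5 and 10), the standard deviation (1), and the mixing proportion (0.3) are all hypothetical values chosen for illustration. Only the column names x, prob_cluster1, and prob_cluster2 mirror the exercise.

```r
# Hypothetical toy data, NOT the course dataset: two Gaussian components
# with assumed means 5 and 10, sd 1, and 30% of points in cluster 1.
set.seed(1)
n <- 500
z <- rbinom(n, size = 1, prob = 0.3)   # 1 = cluster 1, 0 = cluster 2
x <- ifelse(z == 1,
            rnorm(n, mean = 5, sd = 1),
            rnorm(n, mean = 10, sd = 1))

# Soft assignments: probability of each cluster given x, computed from the
# (assumed known) true parameters purely to have probabilities to work with.
d1 <- 0.3 * dnorm(x, mean = 5, sd = 1)
d2 <- 0.7 * dnorm(x, mean = 10, sd = 1)
toy <- data.frame(x = x,
                  prob_cluster1 = d1 / (d1 + d2),
                  prob_cluster2 = d2 / (d1 + d2))

# Means: average of x weighted by the cluster probabilities
mean_cluster1 <- sum(toy$x * toy$prob_cluster1) / sum(toy$prob_cluster1)
mean_cluster2 <- sum(toy$x * toy$prob_cluster2) / sum(toy$prob_cluster2)

# Proportions: average probability of belonging to each cluster
props_cluster1 <- mean(toy$prob_cluster1)
props_cluster2 <- 1 - props_cluster1

print(c(mean_cluster1 = mean_cluster1, mean_cluster2 = mean_cluster2))
print(c(props_cluster1 = props_cluster1, props_cluster2 = props_cluster2))
```

The estimates should land close to the values used to simulate the data. The exercise's own template expresses the same kind of calculation with dplyr's summarise().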
This exercise is part of the course Mixture Models in R.
Have a go at this exercise by completing this sample code.
# Estimation of the means
___ <- ___ %>%
  summarise(mean_cluster1 = sum(___ * prob_cluster1) / sum(prob_cluster1),
            mean_cluster2 = sum(x * ___) / sum(___))
means_estimates

# Estimation of the proportions
props_estimates <- ___ %>%
  summarise(props_cluster1 = ___(prob_cluster1),
            props_cluster2 = 1 - ___)
props_estimates