
Calculating the probabilities

In the last exercise, you saw how to estimate the means and proportions when the probabilities are given. The aim of this exercise is the reverse: to estimate the probabilities when the means and proportions are known. Assume the means of clusters 1 and 2 are 10 and 50, respectively, and that cluster 1 represents 35 percent of the population (so cluster 2 represents the remaining 65 percent).

Also, since we are only concerned with parameter estimation, assume both standard deviations (sd) are 10. The dataset gaussian_sample is available in your workspace.
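Under these assumptions, the probability that an observation x belongs to each cluster follows from Bayes' rule. As a sketch, writing \phi(x; \mu, \sigma) for the Gaussian density with mean \mu and standard deviation \sigma (notation introduced here for illustration, not part of the exercise):

```latex
P(\text{cluster 1} \mid x) =
  \frac{0.35\,\phi(x; 10, 10)}{0.35\,\phi(x; 10, 10) + 0.65\,\phi(x; 50, 10)},
\qquad
P(\text{cluster 2} \mid x) =
  \frac{0.65\,\phi(x; 50, 10)}{0.35\,\phi(x; 10, 10) + 0.65\,\phi(x; 50, 10)}
```

The shared denominator is what scales the two values so that they sum to 1 for every x.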

This exercise is part of the course

Mixture Models in R


Exercise instructions

  • Create a new data frame called gaussian_sample_with_probs containing the estimated probabilities for clusters 1 and 2. To do so, create two new variables called prob_cluster1 and prob_cluster2. Remember to scale the probabilities so that they sum to 1 for each observation.
  • Check out the first 6 observations of gaussian_sample_with_probs.
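To illustrate what scaling means here, the following is a minimal base-R sketch on three made-up values of x. These values are hypothetical and are not taken from gaussian_sample. It uses only dnorm() and the proportions and parameters stated above.

```r
# Hypothetical values of x, chosen for illustration only
# (not taken from gaussian_sample).
x <- c(5, 30, 55)

# Unscaled weights: mixing proportion times Gaussian density.
prob_from_cluster1 <- 0.35 * dnorm(x, mean = 10, sd = 10)
prob_from_cluster2 <- 0.65 * dnorm(x, mean = 50, sd = 10)

# Scale so that the two probabilities sum to 1 for every observation.
total <- prob_from_cluster1 + prob_from_cluster2
prob_cluster1 <- prob_from_cluster1 / total
prob_cluster2 <- prob_from_cluster2 / total

# Show the scaled probabilities next to x.
print(round(data.frame(x, prob_cluster1, prob_cluster2), 3))
```

At x = 30, which is equally far from both means, the two densities are equal, so the scaled probabilities reduce to the mixing proportions 0.35 and 0.65. Values close to 10 get a probability near 1 for cluster 1, and values close to 50 get a probability near 1 for cluster 2.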

Hands-on interactive exercise

Have a go at this exercise by completing this sample code.

# Create data frame with probabilities
gaussian_sample_with_probs <- gaussian_sample %>% 
  ___(prob_from_cluster1 = 0.35 * ___(___, mean = 10, sd = 10),
         prob_from_cluster2 = 0.65 * dnorm(___, mean = 50, sd = 10),
         prob_cluster1 = ___/(prob_from_cluster1 + prob_from_cluster2),
         prob_cluster2 = ___/(prob_from_cluster1 + prob_from_cluster2)) %>%
  select(x, prob_cluster1, prob_cluster2) 
         
head(___)