Calculating the probabilities
In the last exercise, you saw how to estimate the means and proportions when the probabilities are provided. The aim of this exercise is the reverse: to estimate the probabilities when the means and proportions are known. Assume the means of clusters 1 and 2 are 10 and 50, respectively, and that cluster 1 represents 35 percent of the population (so cluster 2 represents the remaining 65 percent). Also, since we are only concerned with parameter estimation, assume both standard deviations (sd) are 10. The data set gaussian_sample is available in your workspace.
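The quantity to estimate can be written with Bayes' rule. The notation here is an assumption for illustration and is not part of the original exercise: \phi(x; \mu, \sigma) stands for the normal density with mean \mu and standard deviation \sigma, evaluated at an observation x.

```latex
P(\text{cluster } 1 \mid x)
  = \frac{0.35\,\phi(x;\,10,\,10)}
         {0.35\,\phi(x;\,10,\,10) + 0.65\,\phi(x;\,50,\,10)},
\qquad
P(\text{cluster } 2 \mid x) = 1 - P(\text{cluster } 1 \mid x)
```

Dividing by the sum in the denominator is the scaling step: it makes the two probabilities add up to 1 for every observation.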
This exercise is part of the course
Mixture Models in R
Exercise instructions
- Create a new data frame called gaussian_sample_with_probs with the estimated probabilities for clusters 1 and 2. To do so, create two new variables called prob_cluster1 and prob_cluster2. Remember to scale the probabilities.
- Check out the first 6 observations of gaussian_sample_with_probs.
Hands-on interactive exercise
Try this exercise by completing this sample code.
# Create data frame with probabilities
gaussian_sample_with_probs <- gaussian_sample %>%
  ___(prob_from_cluster1 = 0.35 * ___(___, mean = 10, sd = 10),
      prob_from_cluster2 = 0.65 * dnorm(___, mean = 50, sd = 10),
      prob_cluster1 = ___/(prob_from_cluster1 + prob_from_cluster2),
      prob_cluster2 = ___/(prob_from_cluster1 + prob_from_cluster2)) %>%
  select(x, prob_cluster1, prob_cluster2)

# Inspect the first 6 observations
head(___)
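
One way to sanity-check the scaling step without the gaussian_sample data is to evaluate it by hand at a few points. The sketch below is only an illustration: the three x values are hypothetical, chosen at the two cluster means and at the midpoint between them, and it uses base R only. At x = 30 both normal densities are equal, so the scaled probabilities should reduce to the assumed proportions, 0.35 and 0.65.

```r
# Hypothetical observations (not taken from gaussian_sample):
# the mean of cluster 1, the midpoint, and the mean of cluster 2
x <- c(10, 30, 50)

# Unscaled weights: assumed proportion times normal density
prob_from_cluster1 <- 0.35 * dnorm(x, mean = 10, sd = 10)
prob_from_cluster2 <- 0.65 * dnorm(x, mean = 50, sd = 10)

# Scale so the two probabilities sum to 1 for each observation
total <- prob_from_cluster1 + prob_from_cluster2
prob_cluster1 <- prob_from_cluster1 / total
prob_cluster2 <- prob_from_cluster2 / total

print(data.frame(x, prob_cluster1, prob_cluster2))
```

Observations near 10 should be assigned almost entirely to cluster 1 and observations near 50 almost entirely to cluster 2, which is a quick check that the weights and means have not been swapped.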