1. Poisson Mixture Models
So far, you have learned how to apply mixture models when the data can be modeled by either a Gaussian or Bernoulli distribution.
In the next two lessons, we will explore a third distribution, the Poisson, which is useful when we are working with count data.
2. The crimes dataset
To introduce the framework, we'll use the crimes dataset, an example of count data that records the number of crimes committed in the city of Chicago.
Recall that the counts are separated by type of crime and that each row represents a community in the city.
3. The problem to solve
We want to identify groups or clusters of communities that are more or less dangerous to live in depending on the type and number of crimes committed.
4. Comparison of Poisson with Bernoulli
Recall that the Bernoulli distribution can describe discrete events where the outcome has two possibilities.
The Poisson distribution also describes discrete events, but the outcome can take, in theory, any non-negative integer value (0, 1, 2, and so on). The parameter that characterizes this distribution is usually called lambda and can be thought of as the expected outcome value.
In this histogram, for example, a lambda of 250 has been used.
Observe that, of the 100 simulated values, most fall close to 250.
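As a rough illustration of the contrast described above, the following sketch draws from both distributions and plots the Poisson sample. The seed, the Bernoulli probability of 0.3, and the variable names are assumptions chosen for the example; only the lambda of 250 and the sample size of 100 come from the slide.

```r
# Assumed seed, used only to make the example reproducible
set.seed(42)

# Bernoulli draws: a binomial with a single trial, so outcomes are 0 or 1
# (the probability 0.3 is an arbitrary choice for illustration)
bernoulli_draws <- rbinom(100, size = 1, prob = 0.3)

# Poisson draws: non-negative integer counts centred on lambda
poisson_draws <- rpois(100, lambda = 250)

# Compare the values each distribution can produce
table(bernoulli_draws)
range(poisson_draws)

# Histogram similar in spirit to the one on the slide
hist(poisson_draws,
     main = "100 draws from a Poisson with lambda = 250",
     xlab = "Simulated count")
```

The Bernoulli draws only ever take two values, while the Poisson draws spread over a range of counts around lambda.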
5. Poisson distribution
The Poisson is a popular distribution for modeling the number of times an event occurs in a fixed period.
For example, it can model the number of car accidents in a year, the number of emails someone receives in a day or, closer to the crimes dataset, the number of robberies that occur in a year in a specific area of the city.
As we mentioned before, this distribution is characterized by lambda, which is the average number of events in the corresponding interval of time.
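To make the role of lambda concrete, here is a small sketch that compares the standard Poisson probability formula, exp(-lambda) * lambda^k / k!, with R's built-in `dpois()`, and then checks that the average of many simulated values is close to lambda. The value lambda = 4, the seed, and the variable names are assumptions picked for the example.

```r
# Assumed rate, chosen only for illustration
lambda <- 4

# Counts at which to evaluate the probability of observing k events
k <- 0:10

# Poisson probability mass function written out by hand
manual_pmf <- exp(-lambda) * lambda^k / factorial(k)

# Agreement with the built-in density function
all.equal(manual_pmf, dpois(k, lambda = lambda))

# The mean of a large sample should be close to lambda
set.seed(1)
sample_mean <- mean(rpois(100000, lambda = lambda))
sample_mean
```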
6. Sample of Poisson distribution
To simulate samples from this distribution, we use the `rpois()` function, whose arguments are the number of samples and the value of lambda.
Here, 100 observations are simulated from a Poisson distribution with a lambda of 100.
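The call described above might look like the following minimal sketch. The seed and the variable name `poisson_sample` are assumptions; the sample size and lambda are the ones mentioned in the narration.

```r
# Assumed seed, used only to make the example reproducible
set.seed(1541)

# First argument: number of samples; second argument: lambda
poisson_sample <- rpois(n = 100, lambda = 100)

# Inspect the first few counts and the sample mean
head(poisson_sample)
mean(poisson_sample)
```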
7. Sample of multivariate Poisson distribution
Similar to the multivariate Bernoulli distribution, we can also have a multivariate case for the Poisson.
Here's how to simulate 100 observations from a multivariate Poisson where each column has a different lambda.
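One possible way to write this simulation is sketched below, treating the columns as independent Poisson variables. The three lambda values, the column names, and the seed are assumptions made up for the example; they are not taken from the slide.

```r
# Assumed seed, used only to make the example reproducible
set.seed(1541)

n_obs <- 100

# Hypothetical lambdas, one per column
lambdas <- c(var_1 = 20, var_2 = 85, var_3 = 140)

# Draw each column independently with its own lambda;
# sapply() binds the three vectors into a 100 x 3 matrix
poisson_multi <- sapply(lambdas, function(l) rpois(n_obs, lambda = l))

head(poisson_multi)

# Column means should sit near the corresponding lambdas
colMeans(poisson_multi)
```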
8. Count data as (multi) Poisson distribution
You can extend this notion to the crimes dataset, where instead of having just three variables, we are provided with thirteen.
9. Poisson mixture model
To summarise the problem framework, the distribution we will use for the crimes dataset is a multivariate Poisson distribution.
Also, for this particular dataset, and to illustrate another feature of the `flexmix` package, we will fit models with 1 to 15 clusters and pick the one that minimizes the BIC (Bayesian information criterion), which I will cover in the next lesson.
For now, think of the BIC as a useful exploratory tool for choosing the number of clusters when you have little prior information about the data.
The parameters to be estimated are the lambdas of each multivariate Poisson distribution. Remember that since the crimes dataset has 13 variables, each multivariate Poisson distribution will have 13 lambdas. We also need to estimate the proportions of the clusters.
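As a preview of this framework, the sketch below fits Poisson mixtures with `flexmix` on simulated data rather than on the crimes dataset. Everything about the data is an assumption made for illustration: two clusters, three variables, the lambda values, and the seed. To keep the example quick, it tries 1 to 4 clusters instead of the 1 to 15 mentioned above.

```r
library(flexmix)

# Assumed seed, used only to make the example reproducible
set.seed(2017)

# Simulated stand-in for the crimes data: 2 clusters, 3 count variables
n_obs <- 200
true_cluster <- rep(1:2, each = n_obs / 2)
true_lambdas <- rbind(c(5, 20, 50),    # hypothetical lambdas, cluster 1
                      c(30, 60, 10))   # hypothetical lambdas, cluster 2
counts <- matrix(rpois(n_obs * 3, lambda = true_lambdas[true_cluster, ]),
                 nrow = n_obs)

# Fit one multivariate Poisson mixture per candidate number of clusters
fits <- stepFlexmix(counts ~ 1,
                    k = 1:4,
                    nrep = 3,
                    model = FLXMCmvpois(),
                    verbose = FALSE)

# Keep the model with the smallest BIC
best_fit <- getModel(fits, which = "BIC")

# Estimated cluster proportions
prior(best_fit)

# Estimated lambdas: one per variable for each cluster
parameters(best_fit)
```

With k clusters and d variables, such a model has k * d lambdas plus k - 1 free proportions, since the proportions must sum to one. For the 13 variables of the crimes dataset, that is 14k - 1 parameters.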
10. Let's practice!
Before moving on to the `flexmix` function, let's practice working with this distribution.