
Understanding Bayesian methods

1. Understanding Bayesian methods

Some smartphones and apps predict the user's destination to offer routes and traffic estimates without the user even asking. If you didn't know about machine learning, the phone's ability to predict the future this way might seem a bit like magic! The phone obviously keeps a record of the user's past locations. It then uses this data to forecast the user's most probable future location, much like a meteorologist estimates the precipitation probability in a weather report. A branch of statistics called Bayesian methods applies the work of 18th-century statistician Thomas Bayes, who proposed rules for estimating probabilities in light of historic data. By applying these methods to my own location tracking data, you will learn how probability estimates can forecast action. Let's see where the data finds me!

2. Estimating probability

This map shows the number of times my phone recorded my position at four different locations. Based on this data, my phone can predict that at any given time my most probable location is at work, because I was there 57-point-5 percent of the time: 23 of the past 40 times it checked. This illustrates how the probability of an event is estimated from historic data; it is the number of times the event happened, divided by the number of times it could have happened. But even though I am at work a lot, the phone should not predict I am there all the time. Instead, it should incorporate additional data like time of day to better tailor its predictions to the situation. This requires an understanding of how to combine information from several events into a single probability estimate.
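This counting rule is simple enough to sketch in a few lines. Here is a minimal Python illustration; the work count of 23 out of 40 comes from the lesson, while the split across the other three locations is hypothetical.

```python
# Estimate the probability of an event from historic data:
# P(event) = times it happened / times it could have happened.
# "work": 23 of 40 checks (from the lesson); the other counts are made up.
location_counts = {"work": 23, "home": 10, "restaurant": 3, "store": 4}

total = sum(location_counts.values())        # 40 location checks in all
p_work = location_counts["work"] / total     # 23 / 40

print(p_work)  # 0.575, i.e. 57.5 percent
```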

3. Joint probability and independent events

When events occur together, they have a joint probability. Their intersection can be depicted using a Venn diagram like those shown here. These show that there is a much greater probability that I am at work in the afternoon than in the evening; the overlap is much greater for work and afternoon. The joint probability of two events is computed by finding the proportion of observations in which they occurred together. Sometimes one event does not influence the probability of another. These are said to be independent events. For example, my location is unrelated to most other users' locations. Knowing where they are does not provide information about where I might be. This notion of independent events will be important later on. However, many of the other data elements my phone collects, such as the time and date, are VERY predictive of where I may be. When one event is predictive of another, they are called dependent.
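Both ideas, joint probability as a proportion of co-occurrences and the independence check, can be sketched from a log of observations. The records below are entirely hypothetical, not the lesson's data; they just show the arithmetic.

```python
# Hypothetical log of (location, period) observations.
observations = [
    ("work", "afternoon"), ("work", "afternoon"), ("work", "afternoon"),
    ("work", "evening"),
    ("home", "evening"), ("home", "evening"), ("home", "afternoon"),
    ("restaurant", "evening"),
]
n = len(observations)

def p(event):
    """Marginal probability: proportion of observations containing the event."""
    return sum(event in obs for obs in observations) / n

def p_joint(a, b):
    """Joint probability: proportion of observations where both occurred."""
    return sum(obs == (a, b) for obs in observations) / n

print(p_joint("work", "afternoon"))  # 3/8 = 0.375
# For independent events, P(A and B) equals P(A) * P(B).
# Here the product is 0.5 * 0.5 = 0.25, not 0.375: the events are dependent.
print(p("work") * p("afternoon"))
```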

4. Conditional probability and dependent events

Dependent events are the basis of prediction with Bayesian methods. Conditional probability expresses exactly how one event depends on another. The formula shows that the probability of event A given B is equal to their joint probability divided by the probability of B. We can use this to compute the 4% probability that I am at work, given the knowledge that it is evening. In comparison, there is an 80% chance I am at work in the afternoon.
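The formula P(A | B) = P(A and B) / P(B) can be computed directly from counts, since the total number of observations cancels out of the ratio. The counts below are hypothetical, chosen only to reproduce the lesson's 4% and 80% figures.

```python
# P(work | evening) = P(work and evening) / P(evening)
# Dividing raw counts gives the same answer, because the total cancels.
n_evening = 25            # hypothetical: checks made in the evening
n_work_and_evening = 1    # hypothetical: of those, times I was at work
p_work_given_evening = n_work_and_evening / n_evening
print(p_work_given_evening)  # 0.04, i.e. 4 percent

n_afternoon = 20          # hypothetical: checks made in the afternoon
n_work_and_afternoon = 16 # hypothetical: of those, times I was at work
p_work_given_afternoon = n_work_and_afternoon / n_afternoon
print(p_work_given_afternoon)  # 0.8, i.e. 80 percent
```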

5. Making predictions with Naive Bayes

The algorithm known as "Naive Bayes" applies Bayesian methods to estimate the conditional probability of an outcome. The naivebayes package provides a function to build this model. Because the location depends on the time of day, it is specified as location-tilde-time-of-day; this form is called the R formula interface, which relates the outcome to be predicted to its predictors. The corresponding predict() function computes conditional probabilities to predict a future location based on the future conditions.
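To make the mechanics concrete, here is a hand-rolled sketch of the same idea in Python rather than R's naivebayes package. With a single predictor, Naive Bayes amounts to choosing the location that maximizes P(location) × P(time | location), which is proportional to P(location | time). The training log and its counts are hypothetical.

```python
# Hypothetical training log of (time_of_day, location) pairs; in the R course
# the model is fit with naive_bayes(location ~ time_of_day, data = ...).
log = [
    ("afternoon", "work"), ("afternoon", "work"), ("afternoon", "work"),
    ("afternoon", "work"), ("afternoon", "home"),
    ("evening", "home"), ("evening", "home"), ("evening", "home"),
    ("evening", "restaurant"), ("evening", "work"),
]

def predict(time_of_day):
    """Return the most probable location given the time of day.

    Scores each location by prior * likelihood, the Bayes numerator:
    P(location) * P(time_of_day | location).
    """
    locations = {loc for _, loc in log}

    def score(loc):
        n_loc = sum(l == loc for _, l in log)
        n_time_and_loc = sum(row == (time_of_day, loc) for row in log)
        prior = n_loc / len(log)            # P(location)
        likelihood = n_time_and_loc / n_loc # P(time_of_day | location)
        return prior * likelihood

    return max(locations, key=score)

print(predict("afternoon"))  # work
print(predict("evening"))    # home
```

In the real course, the naivebayes package handles these counts for you: the formula interface names the outcome and predictors, and predict() applies the fitted conditional probabilities to new data.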

6. Let's practice!

In the next exercises, you'll apply what you've learned to a real dataset that tracked my location over time.
