Predictive analytics

1. Predictive analytics

After learning all about descriptive and diagnostic analytics, it is time to take a closer look at predictive analytics.

2. Analytics overview

Descriptive and diagnostic analytics are concerned with the past or, at most, the present; predictive and prescriptive analytics look at the future. Predictive analytics focuses on identifying possible outcomes and their probabilities, helping us investigate what will most likely happen based on what we know from the data.

3. Why use predictive analytics?

Predictive analytics can help anticipate the most likely outcome of an event, such as predicting whether a machine will soon fail based on maintenance data so that the company can organize repairs in advance. It is also often used to forecast how a specific trend will continue, for example, to predict stock prices or the evolution of a pandemic. Although it is usually linked to future events, predictive analytics can also be used to predict something that is not yet known in the present: for example, predicting which disease a patient most likely has based on their symptoms, or predicting whether a transaction is fraudulent based on its characteristics. Either way, predictions are always associated with a degree of uncertainty, and that uncertainty grows the further we move away from the data we have.

4. Common techniques

One of the most common approaches used in predictive analytics is machine learning, a set of techniques where computers learn from existing patterns in data to make predictions on new data. Within machine learning, a distinction can be made between classification and regression. Classification predicts categories or membership in a group, like predicting whether a customer will cancel their subscription in the near future. In contrast, regression predicts numeric values, such as predicting housing prices based on the characteristics of a neighborhood. Time series forecasting focuses on predicting future values over time, like predicting sales revenue for the next quarter. Predictive text analytics is a sub-group of predictive techniques focused on text, typically used to predict which category a text belongs to, for example, whether an email is spam or not. Regardless of the exact technique used, there are some common steps all predictive models follow, which we'll discuss in more detail on the following slide.
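To make the difference between regression and classification concrete, here is a minimal sketch in plain Python. The data, the function names (`fit_line`, `fit_threshold`, `predict_churn`) and the choice of very simple models (a least-squares line and a single cut-off value) are all assumptions made for illustration; real projects would typically use a machine learning library and far more data.

```python
from statistics import mean


def fit_line(xs, ys):
    """Regression: least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    slope = (
        sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
        / sum((x - x_bar) ** 2 for x in xs)
    )
    return slope, y_bar - slope * x_bar


def fit_threshold(xs, labels):
    """Classification: cut-off halfway between the two class means."""
    mean_0 = mean(x for x, label in zip(xs, labels) if label == 0)
    mean_1 = mean(x for x, label in zip(xs, labels) if label == 1)
    return (mean_0 + mean_1) / 2


def predict_churn(tickets, threshold):
    """Return 1 (will cancel) if the value is above the cut-off, else 0."""
    return 1 if tickets > threshold else 0


# Regression on made-up housing data: size in square meters vs. price.
sizes = [50, 70, 90, 110]
prices = [150, 210, 270, 330]
slope, intercept = fit_line(sizes, prices)
predicted_price = slope * 100 + intercept  # a numeric value

# Classification on made-up customer data: support tickets vs. churn (0/1).
tickets = [0, 1, 2, 6, 7, 8]
churned = [0, 0, 0, 1, 1, 1]
threshold = fit_threshold(tickets, churned)
predicted_label = predict_churn(5, threshold)  # a category

print(predicted_price, predicted_label)
```

The regression call returns a number on a continuous scale, while the classification call returns one of two labels, which is the distinction described above.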

5. Predictive modeling

The first step consists of defining what we want to predict. Second, we collect and prepare the data related to what we want to predict. Typically, the data is split into a training set for building the model and a smaller test set for evaluation. In the third step, the training set is used to build and train the model until it provides accurate predictions on the training data. When the model is ready, its predictions are interpreted and evaluated on the test data, using pre-determined metrics like accuracy, which is the percentage of correct predictions. Usually, multiple metrics are used to evaluate a predictive model thoroughly. Finally, the model can be further fine-tuned if necessary.
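The steps above can be sketched in a few lines of plain Python. Everything here is an assumption made for illustration: the data is randomly generated, the helper names (`train_test_split`, `train`, `accuracy`) are hypothetical, the 80/20 split is just a common choice, and the "model" is a single cut-off value that assumes the positive class tends to have larger values.

```python
import random
from statistics import mean


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle the rows and hold out a smaller test set."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_test = round(len(rows) * test_fraction)
    return rows[n_test:], rows[:n_test]


def train(rows):
    """'Train' a one-number model: the midpoint between the class means."""
    mean_neg = mean(x for x, y in rows if y == 0)
    mean_pos = mean(x for x, y in rows if y == 1)
    return (mean_neg + mean_pos) / 2


def accuracy(rows, threshold):
    """Fraction of rows where the predicted label matches the true label."""
    correct = sum((x > threshold) == bool(y) for x, y in rows)
    return correct / len(rows)


# Steps 1 and 2: decide what to predict (label 0 or 1) and prepare data.
# The values are synthetic: class 1 is centered on 7, class 0 on 2.
rng = random.Random(42)
data = []
for i in range(100):
    label = i % 2
    data.append((rng.gauss(7.0 if label else 2.0, 1.5), label))

train_rows, test_rows = train_test_split(data)

# Step 3: build the model on the training set only.
threshold = train(train_rows)

# Step 4: evaluate on the held-out test set.
test_accuracy = accuracy(test_rows, threshold)
print(f"threshold={threshold:.2f}, test accuracy={test_accuracy:.0%}")
```

Evaluating on rows the model never saw during training is what makes the accuracy figure a fair estimate of how it would perform on new data.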

6. Case study: World Cup winner

Before any major sports event, many try to predict who the eventual winner will be, from animals with supposed predictive abilities to supercomputers. The computer models use data like team ratings, player ratings, current rankings, the estimated difficulty of the matches, and so on to predict who will most likely win, as well as other things like how likely a team is to reach the knockout phase or the final.
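As a rough illustration of how ratings can be turned into winning chances, the sketch below simulates a four-team knockout many times. It is not the method used by any particular model: the team names and ratings are made up, the bracket is hypothetical, and the Elo-style formula for converting a rating difference into a win probability is an assumption chosen because it is simple and widely known.

```python
import random


def win_probability(rating_a, rating_b):
    """Elo-style chance that team A beats team B (assumed formula)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))


def play(team_a, team_b, ratings, rng):
    """Draw a random winner according to the win probability."""
    p = win_probability(ratings[team_a], ratings[team_b])
    return team_a if rng.random() < p else team_b


def simulate_knockout(ratings, n_runs=10_000, seed=1):
    """Estimate each team's chance of winning a four-team knockout."""
    rng = random.Random(seed)
    teams = list(ratings)
    wins = dict.fromkeys(teams, 0)
    for _ in range(n_runs):
        # Hypothetical bracket: 1st vs. 4th and 2nd vs. 3rd, then a final.
        finalist_1 = play(teams[0], teams[3], ratings, rng)
        finalist_2 = play(teams[1], teams[2], ratings, rng)
        wins[play(finalist_1, finalist_2, ratings, rng)] += 1
    return {team: count / n_runs for team, count in wins.items()}


# Made-up ratings for illustration only.
ratings = {"Team A": 2000, "Team B": 1800, "Team C": 1700, "Team D": 1600}
chances = simulate_knockout(ratings)
for team, chance in chances.items():
    print(f"{team}: {chance:.1%}")
```

The output is a probability per team rather than a single named winner, which reflects the uncertainty that comes with any prediction.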

7. Let's practice!

Time to practice what you've learned!
