Regression models

1. Regression models

In our final chapter, we'll discuss some fundamental modeling and machine learning concepts: linear regression, logistic regression, dealing with missing data and outliers, and the bias-variance tradeoff. Let's get things started with regression models.

2. Getting started

We'll quickly review regression. Regression is a technique used to model and analyze the relationships between variables, and how the variables contribute to producing a particular outcome. More concretely, it's a way to determine which variables have an impact, which don't, which factors interact, and how certain we are about this. In this lesson, we'll review the two most common regression techniques: linear regression and logistic regression. But first, some assumptions.

3. Assumptions

To use regression models effectively, and linear regression in particular, we need the true relationship between the variables to be linear, the errors to be normally distributed and homoscedastic (meaning they have constant variance), and the observations to be independent of one another. Interviewers may ask you to walk through these assumptions during problem solving.

4. Linear regression

Simple linear regression involves one independent and one dependent variable with a linear relationship. This results in a fit that will look similar to this plot.

5. Linear regression

Let's dissect this formula. We are solving for Y, the dependent variable, which is our output. It is calculated as the y-intercept, plus the population slope coefficient times the independent variable X, plus a random, irreducible error term. More variables can be included by adding a beta coefficient for each additional factor. Note that sometimes you will see only the linear component, the intercept and slope, without the random error term.
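The formula itself is not reproduced in the transcript, so here is a sketch of it in conventional notation. The symbols (beta for coefficients, epsilon for the error term, p for the number of variables) are the usual textbook choices and are an assumption about what the slide shows.

```latex
% Simple linear regression: intercept + slope * X + error
y = \beta_0 + \beta_1 x + \varepsilon

% With additional variables, one beta coefficient per factor
y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p + \varepsilon

% Linear component only, as it is sometimes written without the error term
\hat{y} = \beta_0 + \beta_1 x
```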

6. Example: linear regression

To implement linear regression in Python, we'll call on the scikit-learn package. After creating the linear regression object and changing any default parameters, simply call its fit method on your data to create your model.
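The code from the slide is not part of the transcript, so the following is a minimal sketch of the steps just described. The data is synthetic and the variable names are illustrative assumptions, not the lesson's own.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data (an assumption for illustration): y = 2 + 0.8 * x with no noise
X = np.arange(10, dtype=float).reshape(-1, 1)  # scikit-learn expects a 2-D feature array
y = 2.0 + 0.8 * X.ravel()

# Create the linear regression object; defaults are kept here
model = LinearRegression()

# Calling fit estimates the intercept and slope from the data
model.fit(X, y)

# Use the fitted model to predict the output for a new input
new_prediction = model.predict(np.array([[10.0]]))
print(new_prediction)
```

Any default you want to change, such as `fit_intercept`, is passed when the object is created, before calling `fit`.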

7. Example: linear regression

Let's take things a step further by looking at the coefficients. Since we only have one independent variable in this example, there is only one coefficient. It is essentially the slope of the line, and tells us that for every 1-unit increase in the independent variable, the dependent variable increases by 0.8 units.
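To make the interpretation of the slope concrete, here is a small sketch on hypothetical noise-free data built to have a slope of 0.8. The data and names are assumptions chosen to mirror the value mentioned above, not the lesson's dataset.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical noise-free data generated from y = 2 + 0.8 * x
X = np.array([[0.0], [1.0], [2.0], [3.0], [4.0]])
y = 2.0 + 0.8 * X.ravel()

model = LinearRegression().fit(X, y)

slope = model.coef_[0]        # one coefficient because there is one feature
intercept = model.intercept_  # the fitted y-intercept

# Raising x by one unit changes the prediction by the slope
pred_at_3, pred_at_4 = model.predict(np.array([[3.0], [4.0]]))
print(slope, intercept, pred_at_4 - pred_at_3)
```

The difference between the two predictions matches the coefficient, which is what "the dependent variable changes by the slope per unit of the independent variable" means in practice.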

8. Logistic regression

Another regression technique is logistic regression, one of the most common machine learning algorithms for two-class classification. As you can see here, while linear regression gives us a continuous output, logistic regression produces a discrete output: a class label. Under the hood, it uses the sigmoid function to compute the probability that each observation belongs to a class, and the predicted label follows from that probability.

9. Logistic regression

The sigmoid function is also called the logistic function. It gives us an S-shaped curve that takes any real number and maps it to a value between 0 and 1.
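As a rough illustration, the sigmoid can be written as 1 / (1 + e^(-z)). The sketch below uses only the standard library; the function name `sigmoid` and the two-branch form are illustrative choices, not something the lesson prescribes.

```python
import math

def sigmoid(z: float) -> float:
    """Logistic function: maps a real number to a value between 0 and 1."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    # For negative z, use an algebraically equivalent form so exp never overflows
    exp_z = math.exp(z)
    return exp_z / (1.0 + exp_z)

# Large negative inputs approach 0, zero maps to 0.5, large positive inputs approach 1
for z in (-5.0, 0.0, 5.0):
    print(z, sigmoid(z))
```

Splitting on the sign of `z` is a common numerical-stability habit; for moderate inputs the single-line formula gives the same result.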

10. Example: logistic regression

Similar to linear regression, we can create a logistic regression object and then fit the model to our data. As you can see, the logistic regression object comes with a whole slew of default parameters, most of which you don't need to worry about.
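A minimal sketch of this step follows, assuming synthetic two-feature data in place of the lesson's dataset. `get_params` is used here to list the defaults; the slide may display them differently.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical two-feature, two-class data: class 1 when the features sum above zero
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# Create the model with its defaults, then fit it to the data
clf = LogisticRegression()
clf.fit(X, y)

# Inspect the parameters the model was created with
default_params = clf.get_params()
print(default_params)
```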

11. Example: logistic regression

Again, we view the coefficients from our logistic regression. Since we used two independent variables, we get two coefficients back. Note that their magnitudes are only comparable when you normalize your data first; if the features are on different scales, you can't draw conclusions about their relative importance from the coefficient sizes. We can also print the accuracy to see how our model performed. Here, it correctly classified around 85 percent of the observations in the test set. Other noteworthy functions include the model's predict method and NumPy's ravel function, which flattens an array during data preparation.
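The pieces mentioned above can be sketched together as follows. Everything here is an assumption for illustration: the data is synthetic, `StandardScaler` is one of several ways to put features on a common scale, and the accuracy it prints will not match the lesson's figure.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Hypothetical data: two features on very different scales
rng = np.random.default_rng(42)
X = rng.normal(size=(400, 2)) * np.array([1.0, 50.0])
signal = X[:, 0] + X[:, 1] / 50.0 + rng.normal(scale=0.5, size=400)

# Labels often arrive as a column; ravel flattens shape (n, 1) to (n,)
y_column = (signal > 0).astype(int).reshape(-1, 1)
y = y_column.ravel()

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Scale using statistics from the training set only, then fit
scaler = StandardScaler().fit(X_train)
clf = LogisticRegression().fit(scaler.transform(X_train), y_train)

print(clf.coef_)  # two coefficients, one per feature

# predict returns class labels; score returns accuracy on the given data
predictions = clf.predict(scaler.transform(X_test))
accuracy = clf.score(scaler.transform(X_test), y_test)
print(accuracy)
```

Fitting the scaler on the training split alone keeps information from the test set out of the model, so the reported accuracy reflects unseen data.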

12. Summary

To summarize, we briefly reviewed regression models and their assumptions, and then discussed linear regression and logistic regression in more detail.

13. Let's prepare for the interview!

Let's put this knowledge to work in the exercises!