Linear regression model

1. Linear regression model

Hi again! Great job on the covariance and the correlation coefficient. In this video, we will review linear regression models.

2. Linear regression model

You are likely to get asked about linear regression models by an interviewer if your future role involves machine learning. In your job, you might be asked to predict, for example, the impact of rainfall on fruit production,

3. Linear regression model

or the impact of house area on its price.

4. Plot

Here, we see data for two variables. There is a linear relationship between them. Thanks to this relationship, we can use the value of one variable to estimate the probable value of the other.

5. Plot

The linear regression model leverages the knowledge it got from all the gathered data. When presented with a new data point, the model provides its best guess - a prediction.

6. Linear regression model

A linear regression model consists of several elements: a dependent variable, also called the response; one or more independent, or explanatory, variables; parameters; and an error. The error is the part of the data that we can't explain with a straight line.
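Written out as an equation, these elements fit together as follows. The notation is an assumption chosen for illustration: i indexes observations, p is the number of explanatory variables, and the Greek letters are the conventional symbols for the parameters and the error.

```latex
y_i = \beta_0 + \beta_1 x_{i1} + \dots + \beta_p x_{ip} + \varepsilon_i
```

Here y_i is the response, the x_{ij} are the explanatory variables, the betas are the parameters, and epsilon_i is the error for observation i.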

7. Linear predictor function

To predict a value, we combine the explanatory variables with coefficients. Each beta coefficient reflects how much the outcome variable changes for a one-unit change in its predictor variable, with the other predictors held constant.
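The one-unit interpretation can be checked with a short derivation. This is a sketch in assumed notation (a hat marks a predicted value, and j picks out any one of the p predictors): raising one predictor by one unit while leaving the others unchanged shifts the prediction by exactly that predictor's coefficient.

```latex
\begin{aligned}
\hat{y}(x_1, \dots, x_p) &= \beta_0 + \beta_1 x_1 + \dots + \beta_p x_p \\
\hat{y}(x_1, \dots, x_j + 1, \dots, x_p) - \hat{y}(x_1, \dots, x_j, \dots, x_p) &= \beta_j (x_j + 1) - \beta_j x_j = \beta_j
\end{aligned}
```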

8. Simple linear model - interpretation

A simple linear regression model has exactly one explanatory variable. It can be visualized with a straight line.

9. Simple linear model - interpretation

The beta zero parameter reflects the intercept of the line.

10. Simple linear model - interpretation

The beta one parameter tells us how much the prediction changes

11. Simple linear model - interpretation

when the explanatory variable increases by one. In other words, it's the slope of the function.

12. Log-transformation

If the relationship between variables is not linear, you can use a transformation to achieve linearity. A common approach for skewed data is to apply a logarithm to one or more of the variables.
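As a minimal sketch of the idea, the R code below compares a straight-line fit before and after log-transforming the response. The data is synthetic and generated only for illustration (the sample size, coefficients, and noise level are all assumptions), so the numbers say nothing about any real data set.

```r
# Synthetic data (an assumption for illustration): y grows exponentially
# with x, so y is right-skewed and the relationship is curved.
set.seed(42)
x <- runif(200, min = 0, max = 5)
y <- exp(1 + 0.8 * x + rnorm(200, sd = 0.3))

fit_raw <- lm(y ~ x)       # straight line on the original scale
fit_log <- lm(log(y) ~ x)  # straight line after log-transforming the response

# R-squared is only a rough guide here, because the two responses
# are measured on different scales.
r2_raw <- summary(fit_raw)$r.squared
r2_log <- summary(fit_log)$r.squared
print(c(raw = r2_raw, log = r2_log))

# On the log scale the slope should land close to the 0.8 used to simulate.
print(coef(fit_log))
```

Keep in mind that after the transformation the coefficients describe changes in log(y), not in y itself.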

13. Assumptions

For a linear regression model to be appropriate, the data needs to satisfy several assumptions. The relationship between the variables needs to be linear. The errors must be normally distributed and homoscedastic, which means that their variance must be constant. The observations need to be independent.

14. Linear model in R

To fit a linear regression model in R, you can use the lm function. Within the function, you need to specify a formula and the source data.
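For example, a call to lm might look like the sketch below. It uses R's built-in cars data set (stopping distance versus speed) as a stand-in, which is an assumption made here so that the example runs on its own.

```r
# Fit stopping distance as a linear function of speed.
# Formula: response ~ explanatory variable; data: the source data frame.
model <- lm(dist ~ speed, data = cars)

# Estimated parameters: "(Intercept)" is beta zero, "speed" is beta one.
print(coef(model))

# Fuller report: coefficient table, residual standard error, R-squared.
print(summary(model))
```

More explanatory variables can be added on the right-hand side of the formula, separated by a plus sign.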

15. Linear model in R

To predict a value with a linear regression model, use the predict function applied to a model and specify a new set of data points.
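A short sketch of predict, again assuming the built-in cars data set as a stand-in. The new speeds are hypothetical values picked for illustration. The column name in the new data frame must match the explanatory variable used in the formula.

```r
# Refit the model so this example runs on its own.
model <- lm(dist ~ speed, data = cars)

# Hypothetical new data points; the column must be named `speed`,
# exactly as in the model formula.
new_points <- data.frame(speed = c(10, 15, 20))

predicted <- predict(model, newdata = new_points)
print(predicted)

# The same predictions computed by hand: beta zero + beta one * speed.
by_hand <- coef(model)[["(Intercept)"]] + coef(model)[["speed"]] * new_points$speed
print(by_hand)
```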

16. Diagnostic plots

If you use the plot function on a linear regression model, R will output four diagnostic plots. These diagnostic plots can help you determine if the assumptions of a linear regression model are met.
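A minimal sketch of producing the plots, assuming the built-in cars data set as a stand-in. By default, plot shows the four plots one after another; the par call is an optional choice that arranges them in a two-by-two grid.

```r
model <- lm(dist ~ speed, data = cars)

# Arrange the four diagnostic plots in a 2 x 2 grid,
# remembering the previous graphics settings.
old_par <- par(mfrow = c(2, 2))
plot(model)

# Restore the previous graphics settings.
par(old_par)
```

When this runs in a non-interactive session, R writes the plots to a file instead of opening a window.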

17. Diagnostic plots

The first plot, residuals against fitted values, shows whether the residuals have non-linear patterns. If the residuals are spread evenly around the horizontal line without distinct patterns, that is a good indication that you do not have non-linear relationships.

18. Diagnostic plots

The second plot is a Q-Q plot, which can tell you whether the residuals are normally distributed. They are if the points line up well along the straight dashed line.

19. Diagnostic plots

The third plot shows whether the residuals are spread equally along the range of fitted values. A horizontal line with equally and randomly spread points implies that the homoscedasticity assumption is met.

20. Diagnostic plots

The last plot helps us find influential cases. Watch out for outlying values in the upper or lower right corner, outside the dashed lines. These cases have a strong influence on the regression results. If we exclude them from the analysis, the results may change noticeably. You may want to take a closer look at them individually. Is there anything special about these observations?
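One way to follow up on this plot numerically is sketched below. It assumes the built-in cars data set as a stand-in and uses Cook's distance, the measure that the dashed lines in R's residuals-versus-leverage plot are based on. Refitting without the most influential row is just one possible check, not a recommendation to delete data.

```r
model <- lm(dist ~ speed, data = cars)

# Cook's distance: one value per observation; larger means more influence.
cooks <- cooks.distance(model)

# The three observations with the largest influence.
top <- head(sort(cooks, decreasing = TRUE), 3)
print(top)

# Refit without the single most influential row and compare coefficients.
most_influential <- which.max(cooks)
refit <- lm(dist ~ speed, data = cars[-most_influential, ])
print(rbind(all_rows = coef(model), one_row_dropped = coef(refit)))
```

If the coefficients move a lot after dropping a single row, that observation deserves the closer individual look described above.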

21. Summary

To wrap up, we've covered the linear regression model, the linear predictor function, the lm function in R, and diagnostic plots.

22. Let's practice!

Let's practice fitting linear regression models before your interview!