Fit a linear model
We'll now fit a linear model, because linear models are simple and easy to interpret. Once the model is fit, we can see which predictor variables appear to have a meaningful linear relationship with the target, as well as the magnitude of their effect on it. We judge whether a predictor is significant from the p-value of its coefficient, which comes from a t-test of the null hypothesis that the coefficient equals 0. The p-value is the probability of obtaining a coefficient estimate at least as far from zero as the one observed if the true coefficient were zero; it is not the probability that the coefficient is zero. Typically, we take a p-value of less than 0.05 to mean the coefficient is significantly different from 0.
This exercise is part of the course
Machine Learning for Finance in Python
Exercise instructions
- Fit the linear model (using the .fit() method) and save the results in the results variable.
- Print out the results summary with the .summary() function.
- Print out the p-values from the results (the .pvalues property of results).
- Make predictions from the train_features and test_features using the .predict() function of our results object.
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
import statsmodels.api as sm

# Create the linear model and complete the least squares fit
model = sm.OLS(train_targets, train_features)
results = model.____ # fit the model
print(results.____)
# Examine the p-values
# Coefficients with p < 0.05 are typically considered significantly different from 0
print(results.____)
# Make predictions from our model for train and test sets
train_predictions = results.predict(train_features)
test_predictions = ____