Interpreting the coefficients
The linear regression model for flight duration as a function of distance takes the form
\(\text{duration} = \alpha + \beta \times \text{distance}\)
where
- \(\alpha\) is the intercept (the component of duration that does not depend on distance, in minutes) and
- \(\beta\) is the coefficient, also called the slope (the rate at which duration increases with distance, in minutes per km).
By looking at the coefficients of your model you will be able to infer
- how much of the average flight duration is actually spent on the ground and
- what the average speed is during a flight.
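To see why the coefficients carry these two meanings, here is a minimal sketch in plain Python rather than PySpark. It assumes duration is measured in minutes and distance in km. The data are invented: every flight is given 40 minutes on the ground and a speed of 800 km/h, and the helper name fit_line is hypothetical.

```python
# Synthetic, noise-free flights: 40 minutes on the ground plus travel at 800 km/h.
# These numbers are invented for illustration only.
GROUND_MINUTES = 40.0
SPEED_KM_PER_HOUR = 800.0

distances = [500.0, 1000.0, 2000.0, 4000.0]  # km
durations = [GROUND_MINUTES + 60.0 * d / SPEED_KM_PER_HOUR for d in distances]  # minutes


def fit_line(x, y):
    """Ordinary least squares for y = alpha + beta * x; returns (alpha, beta)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    beta = sxy / sxx
    alpha = mean_y - beta * mean_x
    return alpha, beta


alpha, beta = fit_line(distances, durations)

# alpha is in minutes: the part of the duration that does not depend on distance.
# beta is in minutes per km, so 60 / beta converts it to km per hour.
avg_speed = 60.0 / beta

print(round(alpha, 6))      # time on the ground, minutes
print(round(beta, 6))       # minutes per km
print(round(avg_speed, 6))  # km per hour
```

Because the synthetic data follow the model exactly, the fit should recover the ground time as the intercept and the speed as 60 divided by the slope. With real flights the same reading applies, but only as an average.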
The linear regression model is available as regression.
This exercise is part of the course Machine Learning with PySpark.
Exercise instructions
- What's the intercept?
- What are the coefficients? This is a vector.
- Extract the element from the vector which corresponds to the slope for distance.
- Find the average speed in km per hour.
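The slope has to be pulled out of a vector because a regression model keeps one coefficient per feature, in the order the features were supplied. The sketch below uses a plain Python list as a stand-in for that vector. The feature names and the value are hypothetical and are not taken from the course's regression object.

```python
# Hypothetical stand-ins: a fitted model keeps one coefficient per feature,
# in the same order as the features it was trained on.
feature_names = ["distance"]   # assumed: a single predictor, distance in km
coefficients = [0.075]         # invented value, minutes per km

# Look up the position of the feature, then index the vector with it.
position = feature_names.index("distance")
minutes_per_km = coefficients[position]

print(position)
print(minutes_per_km)
```

With a single predictor the slope is simply the first element, but looking the position up by name keeps the code correct if more features are added later.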
Hands-on interactive exercise
Have a go at this exercise by completing this sample code.
# Intercept (average minutes on ground)
inter = regression.____
print(inter)
# Coefficients
coefs = ____.____
print(coefs)
# Average minutes per km
minutes_per_km = ____.____[____]
print(minutes_per_km)
# Average speed in km per hour
avg_speed = ____ / ____
print(avg_speed)