Exercise

# Data type and distribution family

In this lesson you learned about the different data types that can be modeled using generalized linear models (GLMs). In this exercise you will review those data types and apply the correct distribution family to fit a GLM.

Instructions 1/3

## Question

Consider a study in which you are trying to predict the number of bike crossings over the Brooklyn Bridge in New York City given the daily temperature.

Use the Console to view the top five rows of the dataset `bike`, which contains your variables. For this you can use the `pandas` `head()` function.
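As a rough sketch of what that Console call looks like, the snippet below builds a small stand-in DataFrame and calls `head()` on it. The values are invented purely for illustration and are not the real `bike` data; only the column names `Brooklyn_B` and `Avg_Temp` come from the exercise.

```python
import pandas as pd

# Hypothetical stand-in for the exercise's `bike` dataset; the values below
# are invented for illustration and are not the real data.
bike = pd.DataFrame({
    "Avg_Temp": [66.0, 63.0, 70.5, 59.0, 61.5, 68.0],
    "Brooklyn_B": [2297, 1704, 3216, 504, 1538, 2807],
})

# head() returns the first five rows by default
top_rows = bike.head()
print(top_rows)
```

In the exercise itself `bike` is already loaded, so `bike.head()` alone should be enough.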

In the data you will find the variables you need to train your model. The two variables you need are:

- `Brooklyn_B`: the number of bike crossings over the Brooklyn Bridge
- `Avg_Temp`: average daily temperature in New York City

You visualize the data using a scatterplot and obtain the following:

You decide to fit a GLM. Now, considering the response, **the number of bike crossings**, which distribution family would you choose for fitting the GLM?

### Possible answers

`Binomial()`

`Gaussian()`

`Poisson()`