
Model review and comparison

1. Model review and comparison

In this lesson, we'll do a complete review and comparison of the models we've worked with so far. First, we'll review the basics of each model and its implementation, then discuss model evaluation, and wrap up with a comparison of pros and cons to get a better idea of when to use deep learning versus more classical machine learning methods.

2. Model review

We've covered a few models so far in the course. First, recall Logistic Regression, which finds a relevant decision boundary. Next, we looked at Decision Trees, which use conditional statements in a tree format, where each path to a leaf represents a particular classification rule. Random Forests are an ensemble of many Decision Trees that use bagging (bootstrap aggregation) to reduce the overfitting of any one particular tree. Lastly, we learned the multi-layer perceptron (MLP) model, which applies a nonlinear activation function to linear combinations of inputs. Now that we've done a high-level review of the models, let's jump into a comparison of their implementations.
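
As a quick reference, here's a minimal sketch of how each of these models can be instantiated in scikit-learn. The hyperparameter values shown are illustrative assumptions, not settings from the course:

from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.neural_network import MLPClassifier

# Logistic Regression: learns a linear decision boundary
log_reg = LogisticRegression()

# Decision Tree: each root-to-leaf path encodes a classification rule
tree = DecisionTreeClassifier(max_depth=5)

# Random Forest: a bagged ensemble of trees to reduce overfitting
forest = RandomForestClassifier(n_estimators=100)

# MLP: nonlinear activations applied to linear combinations of inputs
mlp = MLPClassifier(hidden_layer_sizes=(64, 32), activation='relu')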

3. Model implementation

The implementations of the models we've covered share some high-level similarities, along with a few key differences. Each model can use standard transformations, and regularization can be applied to reduce overfitting. Probability scores and label predictions can be produced through the predict_proba and predict methods, respectively. The main differences are in the parameters of each model and how those parameters are configured. For example, tree-based models like Decision Trees and Random Forests have parameters such as the maximum depth and the criterion used to measure the quality of a split. Models like logistic regression and neural networks both have weights on linear combinations of the input variables and apply their own nonlinear function to produce outputs. Once the models are implemented, we can evaluate them. Let's briefly review that process now.
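
For instance, here's a minimal sketch of that shared interface. The synthetic data and the specific parameter values are assumptions for illustration, standing in for the course dataset:

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier

# Synthetic data stands in for the course dataset
X, y = make_classification(n_samples=500, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Tree-specific parameters: a depth limit and the split-quality criterion
model = RandomForestClassifier(max_depth=5, criterion='gini', random_state=0)
model.fit(X_train, y_train)

labels = model.predict(X_test)              # hard label predictions
scores = model.predict_proba(X_test)[:, 1]  # probability of the positive class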

4. Model evaluation

After making predictions, we can use both the model's predicted labels and its probability scores to assess the key classification metrics: the confusion matrix, which lays out the four categories of outcomes; precision; recall; the F-beta score; and the AUC of the ROC curve. The implementations in sklearn are shown here as a recap. Now that we've reviewed model implementation and evaluation, let's explore the main pros and cons of neural networks versus the other models we've learned.
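
Since the on-screen recap isn't reproduced in this transcript, the following sketch (continuing from the example above, with labels and scores holding the predicted labels and probability scores) shows one way these metrics map to sklearn.metrics:

from sklearn.metrics import (confusion_matrix, precision_score,
                             recall_score, fbeta_score, roc_auc_score)

# Label-based metrics take the predicted labels
print(confusion_matrix(y_test, labels))     # the four categories of outcomes
print(precision_score(y_test, labels))
print(recall_score(y_test, labels))
print(fbeta_score(y_test, labels, beta=1))  # beta=1 recovers the F1 score

# The ROC AUC uses the probability scores rather than the hard labels
print(roc_auc_score(y_test, scores))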

5. Main pros and cons of using neural networks

Neural networks have three main pros and cons versus traditional machine learning techniques. The first pro is data scalability: neural network performance keeps improving as more data becomes available. For example, many pre-trained image recognition models are trained on over a million images. On the flip side, neural networks are less effective on smaller datasets, which matters when data is difficult or expensive to acquire. The second pro is reduced explicit feature engineering: the network's layered structure learns implicit features from combinations of the inputs. The flip side is that the resulting model is much less interpretable; for example, while logistic regression parameters are interpretable, the same is not true of neural network weights. The last pro is that neural networks are adaptable and transferable across domains such as natural language processing, video games, and so on. The con is that such versatility requires large networks and large volumes of data.

6. Let's practice!

Now that we've done a high-level overview of model review and comparison, let's practice some implementation! In particular, we'll review the models we've covered thus far with a focus on neural networks.