1. Introduction to deep learning
2. Imagine you work for a bank
Imagine you work for a bank, and you need to build a model predicting how many transactions each customer will make next year. You have predictive data, or features, like each customer's age, bank balance, whether they are retired, and so on.
3. Example as seen by linear regression
We'll get to deep learning in a moment, but for comparison, consider how a simple linear regression model works for this problem. Linear regression embeds the assumption that the outcome, in this case how many transactions a customer makes, is the sum of individual parts. It starts by asking, "What is the average?" Then it adds the effect of age, then the effect of bank balance, and so on. So the linear regression model isn't identifying the interactions between these parts, or how they jointly affect banking activity.
Say we plot predictions from this model. We draw one line with the predictions for retired people, and another with the predictions for those still working. We put current bank balance on the horizontal axis, and the vertical axis shows the predicted number of transactions. The left graph shows predictions from a model with no interactions: we simply add up the effects of retirement status and current bank balance. The lack of interactions is reflected in the two lines being parallel. That's probably unrealistic, but it's an assumption of the linear regression model.
The graph on the right shows predictions from a model that allows interactions, and the lines no longer need to be parallel.
17. Interactions
Neural networks are a powerful modeling approach that accounts for interactions like these especially well. Deep learning, the focus of this course, is the use of especially powerful neural networks. Because deep learning models account for these types of interactions so well, they perform great on most prediction problems you've seen before. But their ability to capture extremely complex interactions also allows them to do amazing things with text, images, videos, audio, source code, and almost anything else you could imagine doing data science with.
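The difference between the two graphs can be sketched in code. Below is a minimal illustration, using made-up synthetic data (the feature names and coefficients are assumptions for this example, not part of the course), of fitting a linear regression with and without an explicit interaction term:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical synthetic data: transactions depend on balance differently
# for retired vs. working customers (an interaction effect).
rng = np.random.default_rng(0)
balance = rng.uniform(0, 10_000, size=200)
retired = rng.integers(0, 2, size=200)
transactions = 50 + 0.01 * balance - 20 * retired - 0.004 * balance * retired

# Model 1: no interaction -- effects are purely additive,
# so predictions form parallel lines for the two groups.
X_plain = np.column_stack([balance, retired])
m1 = LinearRegression().fit(X_plain, transactions)

# Model 2: add an explicit balance*retired interaction feature,
# letting each group have its own slope.
X_inter = np.column_stack([balance, retired, balance * retired])
m2 = LinearRegression().fit(X_inter, transactions)

print("shared slope (no interaction):", m1.coef_[0])
print("working slope:", m2.coef_[0],
      "retired slope:", m2.coef_[0] + m2.coef_[2])
```

Here the interaction had to be added by hand as an extra feature; the point of what follows is that neural networks learn such interactions on their own.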
18. Course structure
The first two chapters of this course focus on conceptual knowledge about deep learning. This part will be hard, but it will prepare you to debug and tune deep learning models on conventional prediction problems, and it will lay the foundation for progressing towards those new and exciting applications. You'll see this pay off in the third and fourth chapter.
19. Build and tune deep learning models using keras
You will write code like this to build and tune deep learning models using keras, solving many of the same modeling problems you might previously have solved with scikit-learn. As a first look at how deep learning models capture interactions and achieve these results, we'll modify the diagram you saw a moment ago.
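The code shown on the slide isn't reproduced in this transcript, but a minimal keras model of the kind built later in the course might look like the following sketch. The feature count and layer sizes here are illustrative assumptions, not the course's actual values:

```python
# A minimal sketch of a keras regression model; the number of features
# and the layer sizes are illustrative assumptions.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

n_features = 3  # e.g. age, bank balance, retirement status

model = Sequential([
    Dense(50, activation='relu', input_shape=(n_features,)),  # hidden layer
    Dense(32, activation='relu'),                             # hidden layer
    Dense(1),                                                 # output: predicted transactions
])
model.compile(optimizer='adam', loss='mean_squared_error')
```

After compiling, the model would be trained with `model.fit(features, target)`, much like an estimator's `fit` call in scikit-learn.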
20. Deep learning models capture interactions
Here there is an interaction between retirement status and bank balance. Instead of having them separately affect the outcome, we calculate a function of these variables that accounts for their interaction, and use that to predict the outcome. Even this graphic oversimplifies reality, where most things interact with each other in some way, and real neural network models account for far more interactions. So the diagram for a simple neural network looks like this.
23. Interactions in neural network
On the far left, we have something called the input layer. This represents our predictive features, like age or income. On the far right, we have the output layer: the prediction from our model, in this case the predicted number of transactions. All layers that are not the input or output layer are called hidden layers. They are called hidden because, while the inputs and outputs correspond to visible things that happened in the world and can be stored as data, the values in the hidden layers aren't something we have data about or observe directly. Nevertheless, each dot in a hidden layer, called a node, represents an aggregation of information from our input data, and each node adds to the model's ability to capture interactions. So the more nodes we have, the more interactions we can capture.
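This aggregation can be made concrete with a tiny forward pass. The sketch below uses made-up weights purely for illustration; each hidden node combines all the inputs into a single number, and the output combines the hidden nodes:

```python
import numpy as np

# Forward pass through a tiny network: 2 inputs -> 2 hidden nodes -> 1 output.
# The weights are made-up numbers purely for illustration.
inputs = np.array([3, 5])            # e.g. scaled age and bank balance

hidden_weights = np.array([[2, -1],  # weights into hidden node 0
                           [4,  2]]) # weights into hidden node 1
output_weights = np.array([2, 7])    # weights from hidden nodes to output

# Each hidden node aggregates all the inputs into one value.
hidden_values = hidden_weights @ inputs   # array([ 1, 22])

# The output layer combines the hidden nodes' values into the prediction.
output = output_weights @ hidden_values   # 156
print(output)
```

Because every hidden node sees every input, a node's value depends on the inputs jointly rather than one at a time, which is what lets the network represent interactions.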
28. Let's practice!