1. Dense layers
In this chapter, we will focus on training neural networks in TensorFlow. We will start with an overview of a frequently used component of neural networks: the dense layer.
2. The linear regression model
Throughout this chapter, we'll make use of a dataset on credit card default. It contains features, such as marital status and payment amount, which we'll use to predict a target, default.
Here, we have the familiar linear regression model. We take marital status, which is 1, and bill amount, which is 3. We then multiply the inputs by their respective weights, zero point one and minus zero point two five, and sum.
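To make the arithmetic concrete, here is a minimal sketch of that weighted sum in plain Python. The variable names are our own, and we assume there is no intercept, since none is mentioned here.

```python
# Feature values from the example
marital_status = 1
bill_amount = 3

# Weights from the example
w_marital_status = 0.1
w_bill_amount = -0.25

# Linear regression prediction (no intercept assumed): the weighted sum of the inputs
prediction = marital_status * w_marital_status + bill_amount * w_bill_amount

print(prediction)  # approximately -0.65
```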
3. What is a neural network?
So how do we get from a linear regression to a neural network? By adding a hidden layer, which, in this case, consists of two nodes.
Each hidden layer node takes our two inputs, multiplies them by their respective weights, and sums them together. We also typically pass the hidden layer output to an activation function, but we will come back to that later.
Finally, we sum together the outputs of the two hidden nodes to compute our prediction for default. This entire process of generating a prediction is referred to as forward propagation.
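The forward propagation steps just described can be sketched in plain Python. This is an illustration only: the function name forward is our own, the weights into the second hidden node are hypothetical, and we leave out the activation function, as we have not covered it yet.

```python
def forward(inputs, hidden_weights, output_weights):
    """Forward propagation through one hidden layer, with no activation function."""
    # Each hidden node computes a weighted sum of all the inputs
    hidden = [
        sum(x * w for x, w in zip(inputs, node_weights))
        for node_weights in hidden_weights
    ]
    # The output node combines the hidden node outputs
    return sum(h * w for h, w in zip(hidden, output_weights))


inputs = [1, 3]  # marital status, bill amount

hidden_weights = [
    [0.1, -0.25],  # weights into hidden node 1 (reusing the regression weights)
    [0.5, 0.2],    # weights into hidden node 2 (hypothetical values)
]

# Output weights of 1 reduce the final step to a plain sum of the hidden outputs
output_weights = [1.0, 1.0]

prediction = forward(inputs, hidden_weights, output_weights)
print(prediction)  # approximately 0.45
```

With two hidden nodes and two inputs, the hidden layer holds four weights, one for each input and node pair.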
4. What is a neural network?
In this chapter, we will construct neural networks with three types of layers: an input layer, some number of hidden layers, and an output layer. The input layer consists of our features. The output layer contains our prediction. Each hidden layer takes inputs from the previous layer, applies numerical weights to them, sums them together, and then applies an activation function.
In the neural network graph, we have applied a particular type of hidden layer called a dense layer. A dense layer applies weights to all nodes from the previous layer. We will use dense layers throughout this chapter to construct networks.
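Because a dense layer connects every node in the previous layer to every one of its own nodes, its number of weights is the product of the two layer sizes. A small sketch, using a hypothetical helper name of our own, makes this explicit:

```python
def dense_parameter_count(n_inputs, n_nodes, use_bias=True):
    """Count the trainable parameters in a dense layer."""
    # One weight for every (input node, layer node) pair
    n_weights = n_inputs * n_nodes
    # One bias per node in the layer, if a bias is used
    n_biases = n_nodes if use_bias else 0
    return n_weights + n_biases


# Two inputs feeding two hidden nodes, without biases: 4 weights
print(dense_parameter_count(2, 2, use_bias=False))

# The same layer with one bias per node: 4 weights + 2 biases = 6 parameters
print(dense_parameter_count(2, 2))
```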
5. A simple dense layer
Let's look at a simple example of a dense layer.
We'll first define a constant tensor that contains the marital status and age data as the input layer.
We then initialize weights as a variable, since we will train those weights to predict the output from the inputs.
We also define a bias, which will play a similar role to the intercept in the linear regression model.
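In code, these three definitions might look like the following sketch. The numeric values are illustrative assumptions of ours, not values taken from the credit card dataset.

```python
import tensorflow as tf

# Input layer: one observation with two features (marital status, age)
inputs = tf.constant([[1.0, 35.0]])

# Weights are a variable because they will be updated during training.
# One weight per input feature, feeding a single node.
weights = tf.Variable([[-0.05], [-0.01]])

# The bias is also trainable and plays the role of the intercept
bias = tf.Variable([0.5])

print(inputs.shape, weights.shape, bias.shape)
```

The inputs have shape (1, 2) and the weights have shape (2, 1), so the two can be matrix multiplied to produce a single value per observation.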
6. A simple dense layer
Finally, we define a dense layer. Note that we first perform a matrix multiplication of the inputs by the weights and assign that to the tensor named product.
We then add product to the bias and apply a non-linear transformation, in this case the sigmoid function. This is called the activation function and we will explore this in more depth in the next video, but do not worry about it for now.
Furthermore, note that the bias is not associated with a feature and is analogous to the intercept in a linear regression. We will typically not draw it in neural network diagrams for simplicity.
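Putting these steps together, a low-level dense layer can be sketched as follows. The input, weight, and bias values are illustrative assumptions, and we use tf.math.sigmoid for the activation; other ways of applying a sigmoid would work equally well.

```python
import tensorflow as tf

# Illustrative values: one observation with two features
inputs = tf.constant([[1.0, 35.0]])
weights = tf.Variable([[-0.05], [-0.01]])
bias = tf.Variable([0.5])

# Step 1: matrix multiply the inputs by the weights
product = tf.matmul(inputs, weights)

# Step 2: add the bias and apply the sigmoid activation function
dense = tf.math.sigmoid(product + bias)

print(dense.numpy())  # approximately [[0.525]]
```

Here the product is -0.4, adding the bias gives 0.1, and the sigmoid maps that to a value between 0 and 1.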
7. Defining a complete model
Note that TensorFlow also comes with higher-level operations, such as tf.keras.layers.Dense, which allow us to skip the linear algebra.
In this example, we take input data and convert it to a 32-bit float tensor. We then define a first hidden dense layer using tf.keras.layers.Dense. The first argument specifies the number of outgoing nodes, and the second argument is the activation function. By default, a bias will be included.
Note that we've also applied the first dense layer to our data by calling it with inputs as an argument.
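As a rough sketch of this first layer, assume a small made-up dataset of four observations and three features. The data values and the choice of 10 nodes are our own assumptions for illustration.

```python
import tensorflow as tf

# Hypothetical data: 4 observations, 3 features
# (marital status, age, bill amount)
data = [
    [1, 35, 3],
    [0, 52, 1],
    [1, 28, 7],
    [0, 41, 2],
]

# Convert the input data to a 32-bit float tensor
inputs = tf.constant(data, dtype=tf.float32)

# First hidden dense layer: 10 outgoing nodes, sigmoid activation.
# The layer is constructed and then called on inputs; a bias is included by default.
dense1 = tf.keras.layers.Dense(10, activation='sigmoid')(inputs)

print(dense1.shape)  # (4, 10)
```

The weights are initialized randomly by the layer constructor, so the output values will differ between runs, but the shape will always be one row per observation and one column per node.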
8. Defining a complete model
We can easily define another dense layer, which takes the output of the first dense layer as an argument and then reduces the number of nodes.
The output layer reduces this again to one node.
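The complete stack of layers can be sketched as below. The dataset and the layer sizes of 10, 5, and 1 are illustrative assumptions, and we use a sigmoid activation throughout for simplicity.

```python
import tensorflow as tf

# Hypothetical data: 4 observations, 3 features
data = [
    [1, 35, 3],
    [0, 52, 1],
    [1, 28, 7],
    [0, 41, 2],
]
inputs = tf.constant(data, dtype=tf.float32)

# First hidden layer: 3 features in, 10 nodes out
dense1 = tf.keras.layers.Dense(10, activation='sigmoid')(inputs)

# Second hidden layer: takes the output of dense1 and reduces it to 5 nodes
dense2 = tf.keras.layers.Dense(5, activation='sigmoid')(dense1)

# Output layer: reduces again to a single node, the prediction for default
outputs = tf.keras.layers.Dense(1, activation='sigmoid')(dense2)

print(dense1.shape, dense2.shape, outputs.shape)  # (4, 10) (4, 5) (4, 1)
```

Since the weights are untrained and randomly initialized, the predictions here are not meaningful yet; the sketch only shows how the layers connect.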
9. High-level versus low-level approach
Finally, let's compare the high-level and low-level approaches.
The high-level approach relies on complex operations in high-level APIs, such as Keras and Estimators, reducing the amount of code needed. The weights and the mathematical operations will typically be hidden by the layer constructor.
The low-level approach uses linear algebra, which allows for the construction of any model.
TensorFlow allows us to use either approach or even combine them.
10. Let's practice!
You now know how to construct dense layers using both the high-level and low-level approaches, so let's do that in some exercises!