
Introduction to GANs

1. Introduction to GANs

Hello again! Let's learn about image generation!

2. Generative Adversarial Networks

This is a picture of a cat. However, it is quite a special cat in that it does not exist. This image comes from the website thesecatsdonotexist.com where we can find infinitely many non-existent cats. They were all artificially generated using a technique known as Generative Adversarial Networks, or GANs for short. GANs are generative models able to create completely new data samples similar to the training data they are given.

3. Pokemon Sprites Dataset

Throughout this chapter, we will be working with the Pokemon Sprites dataset, available from the PokeAPI. It consists of about 1300 sprite images of Pokemon, the animal-like creatures from a popular Japanese video game. Our goal is to use GANs to generate completely new Pokemon!

4. GANs architecture

Let's discuss how GANs work. Their architecture contains a neural network called a generator. We can think of it as a fraudster trying to produce forged paintings.

5. GANs architecture

The Generator model receives random noise as input, and produces an image, in our case, a Pokemon sprite. The noise is a tensor of random values drawn from a standard normal distribution.
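As a quick illustration, here is how such a noise input could be sampled in PyTorch; the values of batch_size and in_dim below are arbitrary examples, not fixed by the course.

```python
import torch

# Illustrative sketch: the generator's input is a batch of noise vectors
# sampled from a standard normal distribution.
batch_size, in_dim = 32, 64   # example values, not prescribed by the course
noise = torch.randn(batch_size, in_dim)  # shape: (32, 64)
```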

6. GANs architecture

At this point, a second neural network called the discriminator enters the scene. We can think of the discriminator as the police officer attempting to catch art forgers.

7. GANs architecture

Its job is to distinguish between real and fake images.

8. GANs learning process

The generator and the discriminator are trained in tandem but with conflicting objectives. This is referred to as adversarial training. The generator learns to produce realistic-looking images that fool the discriminator, while the discriminator learns to tell the increasingly convincing fakes from real images. These conflicting goals should ensure that each network gradually becomes better at its task during training. In the end, the generator will hopefully be able to generate realistic images.
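To make the adversarial setup concrete, below is a minimal sketch of one training step, assuming a PyTorch setup with a binary cross-entropy loss on raw scores (BCEWithLogitsLoss) and pre-built gen and disc models with their optimizers gen_opt and disc_opt; the function name train_step and the exact loss choice are illustrative assumptions, not prescribed by the course.

```python
import torch
import torch.nn as nn

# Assumed loss: binary cross-entropy on the discriminator's raw scores.
criterion = nn.BCEWithLogitsLoss()

def train_step(real_images, gen, disc, gen_opt, disc_opt, in_dim):
    batch_size = real_images.size(0)

    # Discriminator update: real images should score high, generated fakes low.
    disc_opt.zero_grad()
    noise = torch.randn(batch_size, in_dim)
    fake_images = gen(noise).detach()  # detach: don't update the generator here
    disc_loss = (
        criterion(disc(real_images), torch.ones(batch_size, 1))
        + criterion(disc(fake_images), torch.zeros(batch_size, 1))
    ) / 2
    disc_loss.backward()
    disc_opt.step()

    # Generator update: try to make the discriminator score fakes as real.
    gen_opt.zero_grad()
    fake_images = gen(torch.randn(batch_size, in_dim))
    gen_loss = criterion(disc(fake_images), torch.ones(batch_size, 1))
    gen_loss.backward()
    gen_opt.step()

    return disc_loss.item(), gen_loss.item()
```

Note how the two updates pull in opposite directions: the discriminator is rewarded for separating real from fake, while the generator is rewarded for erasing that separation.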

9. Basic Generator

Let's build a basic generator. We start by defining the Generator class. In the __init__ method, we define a sequential network consisting of three generator blocks produced using a custom gen_block function. Each generator block is a linear layer followed by batch normalization and a ReLU activation. Notice how each block increases the feature size, so we go from the small input noise to the large output image. The specific numbers of neurons in each layer are chosen arbitrarily here. After the generator blocks, we append a linear layer and a sigmoid activation. In the forward method, we pass the input through the sequential network we defined. This generator takes a random noise vector of size in_dim as input and produces an output image of size out_dim.
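A sketch of what this generator could look like in PyTorch is shown below; the hidden-layer sizes (256, 512, 1024) are example values chosen only to grow from the small noise vector toward the output image, not fixed by the course.

```python
import torch.nn as nn

# Each generator block: linear layer -> batch norm -> ReLU.
def gen_block(in_dim, out_dim):
    return nn.Sequential(
        nn.Linear(in_dim, out_dim),
        nn.BatchNorm1d(out_dim),
        nn.ReLU(),
    )

class Generator(nn.Module):
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.generator = nn.Sequential(
            gen_block(in_dim, 256),    # feature size grows with each block
            gen_block(256, 512),
            gen_block(512, 1024),
            nn.Linear(1024, out_dim),  # map to the flattened output image
            nn.Sigmoid(),              # squash pixel values into (0, 1)
        )

    def forward(self, x):
        # x: noise of shape (batch_size, in_dim)
        return self.generator(x)
```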

10. Basic Discriminator

Let's turn our attention to the discriminator now. The concept is quite similar. We start by defining the Discriminator class. Next, we define the sequential network, this time consisting of discriminator blocks created using a custom disc_block function. Each discriminator block consists of a single linear layer followed by a leaky ReLU activation. Notice how the first discriminator block maps the input to size 1024, while all the subsequent blocks decrease the feature size, until we arrive at a single number in the last linear layer. In the forward method, we pass the input through all the layers. This discriminator takes an image of size in_dim as input and produces an output of size 1: a single prediction of whether the input is a real or a fake image.
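Below is a corresponding sketch of the discriminator in PyTorch; the leaky ReLU slope of 0.2 and the intermediate sizes after the first 1024-unit block are illustrative assumptions.

```python
import torch.nn as nn

# Each discriminator block: linear layer -> leaky ReLU.
def disc_block(in_dim, out_dim):
    return nn.Sequential(
        nn.Linear(in_dim, out_dim),
        nn.LeakyReLU(0.2),
    )

class Discriminator(nn.Module):
    def __init__(self, in_dim):
        super().__init__()
        self.disc = nn.Sequential(
            disc_block(in_dim, 1024),  # first block maps the input to size 1024
            disc_block(1024, 512),     # subsequent blocks shrink the feature size
            disc_block(512, 256),
            nn.Linear(256, 1),         # single raw score: real vs. fake
        )

    def forward(self, x):
        # x: flattened image of shape (batch_size, in_dim)
        return self.disc(x)
```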

11. Let's practice!

In the next video, we will learn about GANs with convolutional layers. But now, to get familiar with the concept, it's your turn to build a basic linear GAN!
