**Why does my Deep Convolutional GAN, trained with my step sequence, not perform as well as the GFG DC-GAN tutorial example, even though my code follows the same logic and closely resembles the GFG version?**

As the question suggests, it seems I'm missing something major and there is certainly a big blunder somewhere. The problem is that, despite days of manually debugging these few lines of code, I haven't been able to track down the issue. I would appreciate any help with this. Thank you.

Note: We are both using the same default dataset: tf.keras.datasets.fashion_mnist.load_data()[0]
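That is, the training images come from:

import tensorflow as tf

# load_data() returns ((x_train, y_train), (x_test, y_test));
# indexing with [0] keeps just the training split
(x_train, y_train) = tf.keras.datasets.fashion_mnist.load_data()[0]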

My Notebook: Google Colab
GFG Version: Deep Convolutional GAN with Keras - GeeksforGeeks

  1. Results with my code vs. the GFG version of the code

(GFG version of the code, after 1 epoch with batch size 128)

(My version of the code, after 60 epochs with batch size 1000)

As you can see, my model fails severely to converge to the desired result; for some reason I observe what looks like mode collapse, or something else that prevents the model from converging.

I'm posting the code of my training sequence below, along with a brief explanation of what it does:

def training_sequence(batch_size=1000, epochs=60):
    for epoch in range(epochs):
        print("Epoch: " + str(epoch + 1) + " of " + str(epochs))

        # Training Discriminator: freeze the generator, unfreeze the discriminator
        GAN.layers[0].trainable = False
        GAN.layers[1].trainable = True

        # Labels: 1 for real samples, 0 for fake samples
        fake_generated_examples = generator(generate_random_sequence(size=int(batch_size / 2)))
        zeros = tf.zeros((int(batch_size / 2), 1, 1, 1))

        # Train discriminator on fake samples
        discriminator.fit(fake_generated_examples, zeros)

        # Train discriminator on real samples (a fresh half-batch slice each epoch)
        ones = tf.ones((int(batch_size / 2), 1, 1, 1))
        real_examples = x_train[int(batch_size / 2) * epoch : int(batch_size / 2) * (epoch + 1)].reshape(-1, 28, 28, 1)
        discriminator.fit(real_examples, ones)

        # Training Generator: unfreeze the generator, freeze the discriminator
        GAN.layers[0].trainable = True
        GAN.layers[1].trainable = False

        # train generator
        GAN.fit(generate_random_sequence(size=int(batch_size / 2)), ones)

There are two models (I'll post their summaries just after introducing them):

  1. Discriminator: takes an input of shape (28, 28, 1) and outputs [[[[1]]]] for real samples (the ones from the dataset) and [[[[0]]]] for fake ones (the ones generated by the Generator).
  2. GAN: a Sequential stack of the Generator and Discriminator models. It takes a random input of shape (1, 100) and returns the output from the discriminator (see the sketch just below).
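For concreteness, the stacking looks roughly like this (a sketch; the optimizer and loss here are placeholders, not necessarily the exact ones in my notebook):

# GAN.layers[0] is the generator, GAN.layers[1] is the discriminator
GAN = tf.keras.Sequential([generator, discriminator])
GAN.compile(optimizer="adam", loss="binary_crossentropy")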

generate_random_sequence(size) is just a helper function that generates size random noise vectors, each of shape (1, 100), which are fed into the generator.
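A minimal sketch of that helper (assuming normally distributed noise; the exact distribution doesn't matter for the point here):

import tensorflow as tf

def generate_random_sequence(size):
    # `size` noise vectors, each of shape (1, 100), matching the generator's input
    return tf.random.normal((size, 1, 100))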

int(batch_size / 2) * epoch in the code simply selects the next half-batch of instances from the dataset on each epoch loop, advancing the window every iteration so that no sample is selected twice and the data stays unique across epochs.
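For example, with batch_size=1000 the window advances by 500 images each epoch:

half = int(batch_size / 2)  # 500
# epoch 0 -> x_train[0:500], epoch 1 -> x_train[500:1000], epoch 2 -> x_train[1000:1500], ...
real_examples = x_train[half * epoch : half * (epoch + 1)].reshape(-1, 28, 28, 1)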

Here’s my model architecture:


(My Model Architecture)

As you can see, it is very similar to the one from GFG.

So, as is apparent from the whole mess above, it certainly seems I've missed something that others haven't. I don't know what the issue is yet, and I'd really appreciate guidance from someone experienced. Thank you, everyone.

After a quick look, try the following:

  1. Train the Discriminator
  • Train on real samples (output = 1)
  • Train on fake samples (output = 0)

You seem to be doing this, but you may find better results with two separate passes here.

  2. Train the generator

You seem to be asking the generator to predict ones as the output?

# train generator
GAN.fit(generate_random_sequence(size=int(batch_size / 2)), ones)

This is incorrect. You need to feed fake images into the discriminator with a predicted output of 1 (i.e. real). That discriminator loss is what you use for the generator.
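In rough pseudocode, that generator step looks something like this (a sketch in the GradientTape style of the official tutorial linked below; the optimizer, loss, and half-batch size are illustrative assumptions):

import tensorflow as tf

cross_entropy = tf.keras.losses.BinaryCrossentropy()
generator_optimizer = tf.keras.optimizers.Adam(1e-4)

noise = generate_random_sequence(size=500)  # half-batch of noise, as in your loop
with tf.GradientTape() as tape:
    fake_images = generator(noise, training=True)             # generator produces fakes
    predictions = discriminator(fake_images, training=False)  # discriminator scores them
    # Generator loss: how far the discriminator's verdict on fakes is from "real" (1)
    gen_loss = cross_entropy(tf.ones_like(predictions), predictions)

# Gradients flow back through the discriminator but are applied to the generator only
grads = tape.gradient(gen_loss, generator.trainable_variables)
generator_optimizer.apply_gradients(zip(grads, generator.trainable_variables))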

I’m also not sure what you’re trying to achieve by setting some layers to untrainable?

This is a better, and official, TensorFlow tutorial on DCGANs for MNIST

Hello @Mark_Strefford ,

Thank you for the reply. Let me explain the scenario here a bit.

I'm using a single model, GAN, composed by connecting end-to-end:

  1. a generator model (which takes a (1, 100) random input and generates a (28, 28, 1) image), and
  2. a discriminator (which takes a (28, 28, 1) image and outputs [[[[1]]]] or [[[[0]]]] based on whether it's a real image or a generated one).

So when I feed a random sequence of shape (1, 100) into the GAN model, it returns [[[[1]]]] or [[[[0]]]] based on whether the discriminator judges the image to be real or synthesized.

Once I have trained the discriminator on a few samples from the batch (real and fake images), I lock the discriminator layer and train the generator model like this:

  1. The generator produces its own output from the given random input and passes it through the discriminator, all inside the GAN model with the second layer (the discriminator) set to untrainable, and it outputs 0s or 1s in the format above. A difference between the predicted and actual outputs exists! So the GAN model is fitted against the actual output, and since the discriminator layer of the GAN is untrainable, only the generator gets trained this way (the usual Keras pattern for this is sketched below). That is exactly the same logic the GFG page's code uses.
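Roughly, the freezing setup is (a sketch; note that Keras only picks up the trainable flag when a model is compiled, so the usual pattern is to compile the discriminator on its own and compile the combined GAN with the discriminator already frozen):

# Discriminator compiled standalone: trainable when discriminator.fit(...) is called
discriminator.compile(optimizer="adam", loss="binary_crossentropy")

# Freeze the discriminator *before* compiling the combined model, so that
# GAN.fit(...) only updates the generator's weights
discriminator.trainable = False
GAN = tf.keras.Sequential([generator, discriminator])
GAN.compile(optimizer="adam", loss="binary_crossentropy")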

Also, thank you for the official link. I saw it a while ago, but for now I'd refrain from using GradientTape, which that official tutorial uses for training the models.

Hopefully my explanation was clear. If not, kindly point out which part isn't.

Did you fix this?

Have you tried displaying the training images before they are fed into the GAN? That will give you confidence you’re actually training it on the right data.
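For example, something like this (a quick matplotlib sketch; real_examples is the batch from your training loop, and the images may need rescaling depending on your preprocessing):

import matplotlib.pyplot as plt

def show_batch(images, n=16):
    # Plot the first n images of a batch in a 4x4 grid
    plt.figure(figsize=(4, 4))
    for i in range(n):
        plt.subplot(4, 4, i + 1)
        plt.imshow(images[i].reshape(28, 28), cmap="gray")
        plt.axis("off")
    plt.show()

# e.g. inspect the real samples right before discriminator.fit(...)
show_batch(real_examples)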