Implementing a Conditional GAN for MNIST Digit Generation

1. Import Libraries

import tensorflow as tf
from tensorflow.keras import layers, models
import numpy as np
import matplotlib.pyplot as plt

We import TensorFlow and Keras to build and train the models, NumPy for data handling, and Matplotlib to visualize the generated images.

2. Load and Preprocess the Data

(x_train, y_train), (_, _) = tf.keras.datasets.mnist.load_data()
# Scale pixels to [-1, 1] to match the generator's tanh output
x_train = (x_train.astype('float32') / 127.5 - 1.0).reshape(-1, 28, 28, 1)
y_train = tf.keras.utils.to_categorical(y_train, 10)

We load the MNIST dataset and scale the pixel values to the range [-1, 1], matching the tanh activation of the generator's output layer. The labels are converted into one-hot encoded vectors.
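
As a quick sanity check, you can print the shapes and value range before moving on; the expected values in the comments assume the preprocessing above:

print(x_train.shape)                  # (60000, 28, 28, 1)
print(x_train.min(), x_train.max())   # -1.0 1.0
print(y_train.shape)                  # (60000, 10)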

3. Define the Generator

def build_generator(latent_dim):
    input_label = layers.Input(shape=(10,))
    # Project the one-hot label to a 7x7 map that can be concatenated with the
    # noise feature map (an Embedding layer expects integer indices, not one-hot vectors)
    label_embedding = layers.Dense(7 * 7 * 1)(input_label)
    label_embedding = layers.Reshape((7, 7, 1))(label_embedding)

    input_noise = layers.Input(shape=(latent_dim,))
    noise_dense = layers.Dense(7 * 7 * 128)(input_noise)
    noise_dense = layers.Reshape((7, 7, 128))(noise_dense)

    merge = layers.Concatenate()([noise_dense, label_embedding])
    upsample = layers.Conv2DTranspose(128, kernel_size=4, strides=2, padding='same')(merge)     # 7x7 -> 14x14
    upsample = layers.ReLU()(upsample)
    upsample = layers.Conv2DTranspose(128, kernel_size=4, strides=2, padding='same')(upsample)  # 14x14 -> 28x28
    upsample = layers.ReLU()(upsample)
    output = layers.Conv2D(1, kernel_size=7, activation='tanh', padding='same')(upsample)       # pixels in [-1, 1]
    return models.Model([input_noise, input_label], output)

The generator network takes a noise vector and a label as input. The label is projected through a dense layer and reshaped to match the spatial dimensions of the noise feature map. The two are concatenated and passed through upsampling layers (Conv2DTranspose) to produce a 28x28 image.
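
To confirm the wiring, it helps to build a throwaway generator and check the output shape on dummy inputs (gen, z, and lbl are illustrative names, not part of the tutorial's models):

gen = build_generator(100)
z = np.random.normal(0, 1, (5, 100))                       # 5 noise vectors
lbl = tf.keras.utils.to_categorical([0, 1, 2, 3, 4], 10)   # 5 one-hot labels
print(gen.predict([z, lbl]).shape)                         # expected: (5, 28, 28, 1)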

4. Define the Discriminator

def build_discriminator():
    input_image = layers.Input(shape=(28, 28, 1))
    input_label = layers.Input(shape=(10,))
    # Project the one-hot label to an image-sized map so it can be
    # stacked with the input image as an extra channel
    label_embedding = layers.Dense(28 * 28)(input_label)
    label_embedding = layers.Reshape((28, 28, 1))(label_embedding)

    merge = layers.Concatenate()([input_image, label_embedding])
    conv1 = layers.Conv2D(64, kernel_size=3, strides=2, padding='same')(merge)
    conv1 = layers.LeakyReLU(alpha=0.2)(conv1)
    conv2 = layers.Conv2D(64, kernel_size=3, strides=2, padding='same')(conv1)
    conv2 = layers.LeakyReLU(alpha=0.2)(conv2)
    flatten = layers.Flatten()(conv2)
    output = layers.Dense(1, activation='sigmoid')(flatten)
    return models.Model([input_image, input_label], output)

The discriminator network takes an image and a label as input. The label is projected and reshaped to the same spatial size as the image and concatenated with it as an extra channel. The combined input is passed through convolutional layers to classify the image as real or fake.
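
A similar sketch verifies that the discriminator maps an (image, label) pair to a single probability; it reuses the illustrative gen, z, and lbl from the generator check above:

disc = build_discriminator()
fake_images = gen.predict([z, lbl])             # untrained generator output
print(disc.predict([fake_images, lbl]).shape)   # expected: (5, 1), values in (0, 1)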

5. Compile the Models

latent_dim = 100
generator = build_generator(latent_dim)
discriminator = build_discriminator()
discriminator.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

discriminator.trainable = False
noise = layers.Input(shape=(latent_dim,))
label = layers.Input(shape=(10,))
generated_image = generator([noise, label])
validity = discriminator([generated_image, label])
cgan = models.Model([noise, label], validity)
cgan.compile(optimizer='adam', loss='binary_crossentropy')

We build and compile the generator and discriminator models. The discriminator is compiled first so it can be trained on its own; its weights are then frozen inside the combined cGAN model, so training the cGAN updates only the generator. The cGAN takes noise and labels as input and outputs the discriminator's validity score for the generated image.
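
Keras captures a model's trainable state when compile() is called, so the discriminator compiled above still learns through its own train_on_batch calls, while the copy inside the cGAN (compiled after the flag was flipped) stays frozen. A quick illustrative check:

print(len(generator.trainable_weights))  # weights updated when training the cGAN
print(len(cgan.trainable_weights))       # same count: the discriminator's weights are excluded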

6. Training the cGAN

def train_cgan(epochs, batch_size=128):
    real = np.ones((batch_size, 1))
    fake = np.zeros((batch_size, 1))

    for epoch in range(epochs):
        idx = np.random.randint(0, x_train.shape[0], batch_size)
        real_images = x_train[idx]
        labels = y_train[idx]

        noise = np.random.normal(0, 1, (batch_size, latent_dim))
        gen_labels = np.random.randint(0, 10, batch_size)
        gen_labels = tf.keras.utils.to_categorical(gen_labels, 10)
        generated_images = generator.predict([noise, gen_labels], verbose=0)  # silence per-call progress bars

        d_loss_real = discriminator.train_on_batch([real_images, labels], real)
        d_loss_fake = discriminator.train_on_batch([generated_images, gen_labels], fake)
        d_loss = 0.5 * np.add(d_loss_real, d_loss_fake)

        noise = np.random.normal(0, 1, (batch_size, latent_dim))
        valid_y = np.ones((batch_size, 1))
        g_loss = cgan.train_on_batch([noise, gen_labels], valid_y)

        if epoch % 1000 == 0:
            print(f"Epoch {epoch} - D Loss: {d_loss}, G Loss: {g_loss}")
            save_generated_images(epoch, generator)

def save_generated_images(epoch, generator, examples=10, dim=(1, 10), figsize=(10, 1)):
    noise = np.random.normal(0, 1, (examples, latent_dim))
    sampled_labels = np.arange(0, 10).reshape(-1, 1)
    sampled_labels = tf.keras.utils.to_categorical(sampled_labels, 10)
    generated_images = generator.predict([noise, sampled_labels])
    generated_images = 0.5 * generated_images + 0.5

    plt.figure(figsize=figsize)
    for i in range(examples):
        plt.subplot(dim[0], dim[1], i + 1)
        plt.imshow(generated_images[i, :, :, 0], cmap='gray')
        plt.axis('off')
    plt.tight_layout()
    plt.savefig(f"cgan_generated_image_epoch_{epoch}.png")
    plt.close()

train_cgan(epochs=10000, batch_size=64)

The train_cgan function runs the cGAN's training loop. Each iteration (loosely called an epoch here, though it processes a single batch) alternates between training the discriminator on real and generated images and training the generator through the combined model. The generator learns to produce images that match the given labels while fooling the discriminator. The save_generated_images function periodically saves one sample of each digit so you can monitor training progress.
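
Once training has finished, the generator alone can synthesize any chosen class. A minimal sketch, assuming the models trained above are still in memory (the digit 7 is an arbitrary example):

noise = np.random.normal(0, 1, (10, latent_dim))
labels = tf.keras.utils.to_categorical(np.full(10, 7), 10)  # ten copies of class "7"
digits = generator.predict([noise, labels])
digits = 0.5 * digits + 0.5  # rescale from [-1, 1] to [0, 1] for display
plt.figure(figsize=(10, 1))
for i in range(10):
    plt.subplot(1, 10, i + 1)
    plt.imshow(digits[i, :, :, 0], cmap='gray')
    plt.axis('off')
plt.show()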

By following these steps, you can build a conditional GAN that generates specific MNIST digits from their class labels. This example shows how conditioning information can be incorporated into the GAN framework for more controlled, targeted data generation.
