Implementing a WGAN for MNIST Digit Generation

1. Import Libraries

import tensorflow as tf
from tensorflow.keras import layers, models
import numpy as np
import matplotlib.pyplot as plt

We use TensorFlow and Keras to build and train the models, NumPy for data manipulation, and Matplotlib to visualize the generated images.

2. Define ClipConstraint


class ClipConstraint(tf.keras.constraints.Constraint):
    """Clips weights to [-clip_value, clip_value] after each update."""
    def __init__(self, clip_value):
        self.clip_value = clip_value

    def __call__(self, weights):
        # Invoked by Keras on the layer's kernel after each weight update
        return tf.clip_by_value(weights, -self.clip_value, self.clip_value)

    def get_config(self):
        # Allows the constraint to be serialized along with the model
        return {'clip_value': self.clip_value}
This class defines a constraint that clips the critic's weights to a fixed range after every update. Weight clipping is the original WGAN's mechanism for (approximately) enforcing the Lipschitz constraint that the Wasserstein formulation requires.
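
As a quick standalone check (not part of the training pipeline), you can call the constraint directly on a tensor to see the clipping behavior; Keras applies it the same way to a layer's kernel after each update:

const = ClipConstraint(0.01)
w = tf.constant([-0.5, 0.005, 0.3])
print(const(w).numpy())  # expected: [-0.01   0.005  0.01 ]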

3. Load and Preprocess the Data


(x_train, _), (_, _) = tf.keras.datasets.mnist.load_data()
x_train = (x_train.astype('float32') - 127.5) / 127.5  # scale pixels to [-1, 1]
x_train = np.expand_dims(x_train, axis=-1)              # add channel dim -> (28, 28, 1)
We load the MNIST dataset and scale the pixel values to the range [-1, 1], which matches the tanh output of the generator defined below. A channel dimension is then added so each image has shape (28, 28, 1), as the convolutional layers expect.
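
An optional sanity check confirms the scaling and the added channel dimension:

print(x_train.shape)                 # (60000, 28, 28, 1)
print(x_train.min(), x_train.max())  # -1.0 1.0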

4. Define the Generator

def build_generator():
    model = models.Sequential()
    # Project the 100-dim noise vector to a 7x7x128 feature map
    model.add(layers.Dense(128 * 7 * 7, activation="relu", input_dim=100))
    model.add(layers.Reshape((7, 7, 128)))
    model.add(layers.UpSampling2D())  # 7x7 -> 14x14
    model.add(layers.Conv2D(128, kernel_size=4, padding="same"))
    model.add(layers.BatchNormalization(momentum=0.8))
    model.add(layers.Activation("relu"))
    model.add(layers.UpSampling2D())  # 14x14 -> 28x28
    model.add(layers.Conv2D(64, kernel_size=4, padding="same"))
    model.add(layers.BatchNormalization(momentum=0.8))
    model.add(layers.Activation("relu"))
    # tanh keeps outputs in [-1, 1], matching the preprocessed data
    model.add(layers.Conv2D(1, kernel_size=4, padding="same", activation='tanh'))
    return model

The generator maps a 100-dimensional noise vector to a 28x28 image through a dense projection followed by upsampling and convolutional layers. Batch normalization and ReLU activations help keep training stable, and the final tanh activation keeps outputs in [-1, 1].
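
Before wiring the generator into the full model, a one-off forward pass (illustrative only) verifies that the upsampling path produces the expected 28x28 output:

gen = build_generator()
noise = np.random.normal(0, 1, (4, 100)).astype('float32')
print(gen(noise).shape)  # expected: (4, 28, 28, 1)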

5. Define the Critic

def build_critic():
    const = ClipConstraint(0.01)
    model = models.Sequential()
    # Strided convolutions downsample 28x28 -> 14x14 -> 7x7
    model.add(layers.Conv2D(64, kernel_size=3, strides=2, padding="same", input_shape=[28, 28, 1], kernel_constraint=const))
    model.add(layers.LeakyReLU(alpha=0.2))
    model.add(layers.Conv2D(128, kernel_size=3, strides=2, padding="same", kernel_constraint=const))
    model.add(layers.LeakyReLU(alpha=0.2))
    model.add(layers.Flatten())
    # Linear output: an unbounded critic score, not a probability
    model.add(layers.Dense(1))
    return model

The critic uses strided convolutional layers with LeakyReLU activations and ends in a single linear unit; unlike a standard GAN discriminator, it outputs an unbounded score rather than a probability. The ClipConstraint keeps the convolutional kernels within a fixed range, which enforces the Lipschitz constraint the WGAN objective depends on.
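
As with the generator, a quick forward pass (again just a sketch) shows that each image is mapped to a single unbounded score:

crit = build_critic()
print(crit(x_train[:4]).shape)  # expected: (4, 1)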

6. Compile the Models

def wasserstein_loss(y_true, y_pred):
    # mean(y_true * y_pred): with labels -1 (real) and +1 (fake), the critic
    # learns to score real images high and generated images low
    return tf.reduce_mean(y_true * y_pred)

generator = build_generator()
critic = build_critic()
critic.compile(optimizer=tf.keras.optimizers.RMSprop(learning_rate=0.00005), loss=wasserstein_loss)

z = layers.Input(shape=(100,))
img = generator(z)
critic.trainable = False  # freeze the critic while the generator trains
validity = critic(img)
combined = models.Model(z, validity)
combined.compile(optimizer=tf.keras.optimizers.RMSprop(learning_rate=0.00005), loss=wasserstein_loss)



The generator and critic are compiled separately with the Wasserstein loss, which simply multiplies labels by critic scores and averages. Following the original WGAN paper, both use the RMSprop optimizer with a small learning rate for stability. The combined model chains the generator into the frozen critic, so generator updates flow through the critic's score.
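
To see why the -1/+1 labels used in the training loop below implement the Wasserstein objective, consider the loss on a batch with hypothetical critic scores (the values here are made up purely for illustration):

real_scores = tf.constant([[2.0], [1.5]])   # hypothetical critic scores on real images
fake_scores = tf.constant([[-0.5], [0.3]])  # hypothetical critic scores on fakes
# Labels -1 for real and +1 for fake turn the total loss into
#   -mean(critic(real)) + mean(critic(fake)),
# whose negative is the critic's estimate of the Wasserstein distance
loss = (wasserstein_loss(-tf.ones_like(real_scores), real_scores)
        + wasserstein_loss(tf.ones_like(fake_scores), fake_scores))
print(loss.numpy())  # approximately -1.85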

7. Training the WGAN

def train_wgan(epochs, batch_size=64, sample_interval=200):
    (x_train, _), (_, _) = tf.keras.datasets.mnist.load_data()
    x_train = (x_train.astype('float32') - 127.5) / 127.5
    x_train = np.expand_dims(x_train, axis=-1)

    # WGAN labels: -1 for real images, +1 for generated ones
    valid = -np.ones((batch_size, 1))
    fake = np.ones((batch_size, 1))
    clip_value = 0.01
    n_critic = 5  # critic updates per generator update

    for epoch in range(epochs):
        # Train the critic several times per generator step
        for _ in range(n_critic):
            idx = np.random.randint(0, x_train.shape[0], batch_size)
            imgs = x_train[idx]
            noise = np.random.normal(0, 1, (batch_size, 100))
            gen_imgs = generator.predict(noise, verbose=0)
            d_loss_real = critic.train_on_batch(imgs, valid)
            d_loss_fake = critic.train_on_batch(gen_imgs, fake)
            d_loss = 0.5 * np.add(d_loss_real, d_loss_fake)
            # Clip all critic weights, including biases, which the
            # kernel-level ClipConstraint does not cover
            for layer in critic.layers:
                weights = layer.get_weights()
                weights = [np.clip(w, -clip_value, clip_value) for w in weights]
                layer.set_weights(weights)

        # Train the generator to push critic scores toward the "real" label
        noise = np.random.normal(0, 1, (batch_size, 100))
        g_loss = combined.train_on_batch(noise, valid)

        if epoch % sample_interval == 0:
            print(f"{epoch} [D loss: {d_loss}] [G loss: {g_loss}]")
            sample_images(epoch, generator)

def sample_images(epoch, generator, image_grid_rows=5, image_grid_columns=5):
    noise = np.random.normal(0, 1, (image_grid_rows * image_grid_columns, 100))
    gen_imgs = generator.predict(noise, verbose=0)
    gen_imgs = 0.5 * gen_imgs + 0.5  # rescale from [-1, 1] to [0, 1] for display
    fig, axs = plt.subplots(image_grid_rows, image_grid_columns, figsize=(10, 10), sharey=True, sharex=True)
    cnt = 0
    for i in range(image_grid_rows):
        for j in range(image_grid_columns):
            axs[i, j].imshow(gen_imgs[cnt, :, :, 0], cmap='gray')
            axs[i, j].axis('off')
            cnt += 1
    plt.savefig(f"wgan_generated_image_epoch_{epoch}.png")
    plt.close()

train_wgan(epochs=10000, batch_size=64, sample_interval=1000)

The train_wgan function drives the training loop, alternating between critic and generator updates. The critic is trained n_critic (here 5) times per generator update so that its score remains a useful estimate of the Wasserstein distance. At regular intervals, sample_images saves a grid of generated digits so you can track training progress.

Following these steps, you can train a WGAN that generates realistic MNIST digits. The Wasserstein distance provides a smoother training signal than the standard GAN loss, which mitigates problems such as mode collapse and training instability and makes this setup a solid foundation for further generative modeling.
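
Once training finishes, the generator can be used on its own. As a minimal sketch (the filename is an assumption, not part of the tutorial), you might save it and draw fresh digits from random noise:

# Save the trained generator for later reuse (path is illustrative)
generator.save('wgan_mnist_generator.keras')

# Generate a fresh batch of digits from random noise
noise = np.random.normal(0, 1, (10, 100))
digits = generator.predict(noise, verbose=0)
digits = 0.5 * digits + 0.5  # rescale from [-1, 1] back to [0, 1] for display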
