
Paper Reading: (DCGAN) Unsupervised representation learning with deep convolutional generative adversarial networks

Unsupervised representation learning with deep convolutional generative adversarial networks

(2016 ICLR)

Alec Radford, Luke Metz, Soumith Chintala

Notes

Contributions

We propose and evaluate a set of constraints on the architectural topology of Convolutional GANs that make them stable to train in most settings. We name this class of architectures Deep Convolutional GANs (DCGAN). Moreover, we use the trained discriminators for image classification tasks, showing competitive performance with other unsupervised algorithms. Finally, we show that the generators have interesting vector arithmetic properties allowing for easy manipulation of many semantic qualities of generated samples.

Method

We identified a family of architectures that resulted in stable training across a range of datasets and allowed for training higher resolution and deeper generative models. These are the architecture guidelines for stable Deep Convolutional GANs:

  1. Replace any pooling layers with strided convolutions (discriminator) and fractional-strided convolutions (generator).
  2. Use batchnorm in both the generator and the discriminator.
  3. Remove fully connected hidden layers for deeper architectures.
  4. Use ReLU activation in generator for all layers except for the output, which uses Tanh.
  5. Use LeakyReLU activation in the discriminator for all layers.
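Guideline 1 swaps fixed pooling for learned resampling. The spatial arithmetic can be sanity-checked in plain Python; the helper names below are mine, a sketch of the "same"-padding size formulas, not code from the paper:

```python
import math

def strided_conv_out(size, stride):
    # "same"-padded strided convolution (discriminator): downsamples by stride
    return math.ceil(size / stride)

def fractional_strided_out(size, stride):
    # "same"-padded transposed (fractional-strided) convolution (generator):
    # upsamples by stride, the inverse of the strided conv above
    return size * stride

# Discriminator path: 64 -> 32 -> 16 -> 8 -> 4 via four stride-2 convolutions
s = 64
for _ in range(4):
    s = strided_conv_out(s, 2)

# Generator path inverts it: 4 -> 8 -> 16 -> 32 -> 64
g = 4
for _ in range(4):
    g = fractional_strided_out(g, 2)

print(s, g)  # 4 64
```

Because the two operations are exact inverses in spatial size, the generator and discriminator can mirror each other's topology without any pooling layers.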

Training Process

Code: https://github.com/eriklindernoren/Keras-GAN/blob/master/dcgan/dcgan.py

# Build the generator
self.generator = self.build_generator()

# The generator takes noise as input and generates imgs
z = Input(shape=(self.latent_dim,))
img = self.generator(z)

# For the combined model we will only train the generator
self.discriminator.trainable = False

# The discriminator takes generated images as input and determines validity
valid = self.discriminator(img)

# The combined model  (stacked generator and discriminator)
# Trains the generator to fool the discriminator
self.combined = Model(z, valid)
self.combined.compile(loss='binary_crossentropy', optimizer=optimizer)
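The `'binary_crossentropy'` loss in the compile call can be written out in NumPy to show why training the combined model with all-ones ("valid") labels pushes the generator to fool the frozen discriminator. A minimal sketch; the function name is mine:

```python
import numpy as np

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    # Keras-style mean binary cross-entropy; eps avoids log(0)
    y_pred = np.clip(y_pred, eps, 1 - eps)
    return float(np.mean(-(y_true * np.log(y_pred)
                           + (1 - y_true) * np.log(1 - y_pred))))

# The combined model labels generated images as "valid" (1), so the
# generator's loss falls as the frozen discriminator is fooled.
fooled     = binary_crossentropy(np.ones(4), np.array([0.9, 0.8, 0.95, 0.85]))
not_fooled = binary_crossentropy(np.ones(4), np.array([0.1, 0.2, 0.05, 0.15]))
print(fooled < not_fooled)  # True
```

Freezing `self.discriminator` before compiling `self.combined` matters: only the generator's weights receive gradients from this loss.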
           

Classifying CIFAR-10 using GANs as a feature extractor

To evaluate the quality of the representations learned by DCGANs for supervised tasks, we train a DCGAN on Imagenet-1k and then use the discriminator’s convolutional features from all layers, max-pooling each layer’s representation to produce a 4 × 4 spatial grid. These features are then flattened and concatenated to form a 28672-dimensional vector, and a regularized linear L2-SVM classifier is trained on top of them.
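Max-pooling every layer to a 4 × 4 grid means the final vector length is 16 times the summed channel counts. A small NumPy sketch of the pool-flatten-concatenate step, using made-up layer shapes (the channel counts here are illustrative, not the paper's; with the paper's discriminator the result is 28672 dimensions):

```python
import numpy as np

def maxpool_to_grid(feat, grid=4):
    # Max-pool an (H, W, C) feature map down to a (grid, grid, C) spatial grid
    h, w, c = feat.shape
    hs, ws = h // grid, w // grid
    out = np.empty((grid, grid, c))
    for i in range(grid):
        for j in range(grid):
            out[i, j] = feat[i*hs:(i+1)*hs, j*ws:(j+1)*ws].max(axis=(0, 1))
    return out

# Hypothetical per-layer discriminator feature maps, shaped (H, W, C)
layers = [np.random.rand(32, 32, 8),
          np.random.rand(16, 16, 16),
          np.random.rand(8, 8, 32)]

vector = np.concatenate([maxpool_to_grid(f).ravel() for f in layers])
print(vector.shape)  # (896,) = 4*4*(8+16+32)
```

The resulting fixed-length vector is what the linear L2-SVM is trained on; the GAN itself is never fine-tuned for classification.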

Code in Generator & Discriminator

Code: https://github.com/eriklindernoren/Keras-GAN/blob/master/dcgan/dcgan.py

Generator

def build_generator(self):

    model = Sequential()

    model.add(Dense(128 * 7 * 7, activation="relu", input_dim=self.latent_dim))
    model.add(Reshape((7, 7, 128)))
    model.add(UpSampling2D())
    model.add(Conv2D(128, kernel_size=3, padding="same"))
    model.add(BatchNormalization(momentum=0.8))
    model.add(Activation("relu"))
    model.add(UpSampling2D())
    model.add(Conv2D(64, kernel_size=3, padding="same"))
    model.add(BatchNormalization(momentum=0.8))
    model.add(Activation("relu"))
    model.add(Conv2D(self.channels, kernel_size=3, padding="same"))
    model.add(Activation("tanh"))

    model.summary()

    noise = Input(shape=(self.latent_dim,))
    img = model(noise)

    return Model(noise, img)
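Assuming the repository's MNIST configuration (28 × 28 images, `channels = 1`), the layer arithmetic of the generator above can be traced in plain Python; this is a sketch of the shapes, not the actual Keras graph:

```python
def trace_generator(channels=1):
    # Shape produced after each stage of the generator above, as (H, W, C)
    shapes = [(7, 7, 128)]            # Dense(128*7*7) + Reshape((7, 7, 128))
    shape = shapes[0]
    for filters in (128, 64):         # two UpSampling2D + Conv2D("same") blocks
        h, w, _ = shape
        shape = (h * 2, w * 2, filters)  # UpSampling2D doubles H and W;
        shapes.append(shape)             # "same" convolution keeps them
    h, w, _ = shape
    shapes.append((h, w, channels))   # final Conv2D(channels) + tanh
    return shapes

print(trace_generator())
# [(7, 7, 128), (14, 14, 128), (28, 28, 64), (28, 28, 1)]
```

Note this Keras port uses `UpSampling2D` followed by a stride-1 convolution rather than the paper's fractional-strided (transposed) convolutions; both double the spatial size per block.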
           

Discriminator

def build_discriminator(self):

    model = Sequential()

    model.add(Conv2D(32, kernel_size=3, strides=2, input_shape=self.img_shape, padding="same"))
    model.add(LeakyReLU(alpha=0.2))
    model.add(Dropout(0.25))
    model.add(Conv2D(64, kernel_size=3, strides=2, padding="same"))
    model.add(ZeroPadding2D(padding=((0,1),(0,1))))
    model.add(BatchNormalization(momentum=0.8))
    model.add(LeakyReLU(alpha=0.2))
    model.add(Dropout(0.25))
    model.add(Conv2D(128, kernel_size=3, strides=2, padding="same"))
    model.add(BatchNormalization(momentum=0.8))
    model.add(LeakyReLU(alpha=0.2))
    model.add(Dropout(0.25))
    model.add(Conv2D(256, kernel_size=3, strides=1, padding="same"))
    model.add(BatchNormalization(momentum=0.8))
    model.add(LeakyReLU(alpha=0.2))
    model.add(Dropout(0.25))
    model.add(Flatten())
    model.add(Dense(1, activation='sigmoid'))

    model.summary()

    img = Input(shape=self.img_shape)
    validity = model(img)

    return Model(img, validity)
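With the same MNIST assumption (28 × 28 × 1 input), the discriminator's "same"-padded stride-2 convolutions shrink the image to the 4 × 4 × 256 map that `Flatten` feeds into the final `Dense` layer. Tracing the sizes in plain Python (a sketch, not the Keras graph):

```python
import math

def trace_discriminator(img_shape=(28, 28, 1)):
    h, w, _ = img_shape
    h, w = math.ceil(h / 2), math.ceil(w / 2)  # Conv2D(32, strides=2, "same"): 28 -> 14
    h, w = math.ceil(h / 2), math.ceil(w / 2)  # Conv2D(64, strides=2, "same"): 14 -> 7
    h, w = h + 1, w + 1                        # ZeroPadding2D(((0,1),(0,1))): 7 -> 8
    h, w = math.ceil(h / 2), math.ceil(w / 2)  # Conv2D(128, strides=2, "same"): 8 -> 4
    # Conv2D(256, strides=1, "same") keeps 4x4; Flatten yields the Dense input size
    return h * w * 256

print(trace_discriminator())  # 4096
```

The asymmetric `ZeroPadding2D` is there precisely because 7 is odd: padding to 8 before the stride-2 convolution gives a clean 4 × 4 grid.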
           

Results

