Day 93 — Adversarial Autoencoder (AAE) for MNIST Dataset

Today's topic: generating simulated MNIST data with an Adversarial Autoencoder



Notes

Today I'm continuing the code study of the same GitHub project; this time the topic is the Adversarial Autoencoder. I already wrote an introduction to adversarial autoencoders in an earlier post, so refer to that for background. Today the focus is purely on the Keras code itself.

This author writes all of the code in a fairly consistent object-oriented style, so everything below lives inside class AdversarialAutoencoder().
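For the snippets below to run on their own, they need the usual imports from the repo's script. This is a minimal set inferred from what the code actually uses (Keras 2 API; the exact import lines in the original file may differ slightly):

from keras.datasets import mnist
from keras.layers import Input, Dense, Flatten, Reshape, Lambda, LeakyReLU
from keras.models import Sequential, Model
from keras.optimizers import Adam
import keras.backend as K
import matplotlib.pyplot as plt
import numpy as np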


  • __init__()
def __init__(self):
    self.img_rows = 28
    self.img_cols = 28
    self.channels = 1
    self.img_shape = (self.img_rows, self.img_cols, self.channels)
    self.latent_dim = 10

    optimizer = Adam(0.0002, 0.5)

    # Build and compile the discriminator
    self.discriminator = self.build_discriminator()
    self.discriminator.compile(loss='binary_crossentropy',
                               optimizer=optimizer,
                               metrics=['accuracy'])

    # Build the encoder / decoder
    self.encoder = self.build_encoder()
    self.decoder = self.build_decoder()

    img = Input(shape=self.img_shape)

    # The generator takes the image, encodes it and
    # reconstructs it from the encoding
    encoded_repr = self.encoder(img)
    reconstructed_img = self.decoder(encoded_repr)

    # For the adversarial_autoencoder model we will only train the generator
    self.discriminator.trainable = False

    # The discriminator determines validity of the encoding
    validity = self.discriminator(encoded_repr)

    # The adversarial_autoencoder model (stacked generator and discriminator)
    self.adversarial_autoencoder = Model(img, [reconstructed_img, validity])
    self.adversarial_autoencoder.compile(loss=['mse', 'binary_crossentropy'],
                                         loss_weights=[0.999, 0.001],
                                         optimizer=optimizer)
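Note the loss_weights in the final compile call: the combined model's loss is a weighted sum of the pixel-wise MSE and the adversarial cross-entropy, and the weights make reconstruction dominate. A quick worked example with invented per-batch numbers:

# Hypothetical per-batch loss values, purely for illustration
mse_loss = 0.05        # reconstruction (MSE) term
bce_loss = 0.70        # adversarial (binary cross-entropy) term
total = 0.999 * mse_loss + 0.001 * bce_loss
print(total)           # 0.05065 -- the gradient is dominated by reconstruction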

The construction logic is the same as yesterday's basic GAN implementation, with one difference: there is no single self.generator here. self.discriminator is an instance built by build_discriminator(), while the generator role is played by the encoder-decoder pair from build_encoder() and build_decoder(). Their definitions follow.

  • Discriminator
def build_discriminator(self):
    model = Sequential()
    model.add(Dense(512, input_dim=self.latent_dim))
    model.add(LeakyReLU(alpha=0.2))
    model.add(Dense(256))
    model.add(LeakyReLU(alpha=0.2))
    model.add(Dense(1, activation="sigmoid"))
    model.summary()

    encoded_repr = Input(shape=(self.latent_dim,))
    validity = model(encoded_repr)
    return Model(encoded_repr, validity)

The resulting model summary:
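Since a Dense layer has inputs × units + units parameters, the summary's counts can be verified by hand (with latent_dim = 10 as set in __init__):

def dense_params(n_in, n_out):
    # weights plus biases of a fully connected layer
    return n_in * n_out + n_out

total = (dense_params(10, 512)     # 5,632
         + dense_params(512, 256)  # 131,328
         + dense_params(256, 1))   # 257
print(total)                       # 137,217 trainable parameters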

Honestly, the Discriminator and the Generator don't differ all that much here anyway.

  • Generator
def build_encoder(self):
    img = Input(shape=self.img_shape)
    h = Flatten()(img)
    h = Dense(512)(h)
    h = LeakyReLU(alpha=0.2)(h)
    h = Dense(512)(h)
    h = LeakyReLU(alpha=0.2)(h)
    mu = Dense(self.latent_dim)(h)
    log_var = Dense(self.latent_dim)(h)
    # Sample the latent code: z = mu + epsilon * exp(log_var / 2)
    # (the snippet originally used the Keras 1 merge() API; Lambda is the
    # Keras 2 equivalent)
    latent_repr = Lambda(lambda p: p[0] +
                         K.random_normal(K.shape(p[0])) *
                         K.exp(p[1] / 2))([mu, log_var])
    return Model(img, latent_repr)
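The mu / log_var pair plus the sampling step is the same reparameterization trick used in a VAE: the encoder outputs the parameters of a Gaussian, and the latent code is a differentiable sample from it. A minimal NumPy sketch of that computation (values invented for illustration):

import numpy as np

mu = np.array([0.5, -1.0])                  # hypothetical encoder outputs
log_var = np.array([0.0, 0.2])
epsilon = np.random.normal(size=mu.shape)   # noise drawn from N(0, I)
z = mu + epsilon * np.exp(log_var / 2)      # sampled latent code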

and

def build_decoder(self):
    model = Sequential()
    model.add(Dense(512, input_dim=self.latent_dim))
    model.add(LeakyReLU(alpha=0.2))
    model.add(Dense(512))
    model.add(LeakyReLU(alpha=0.2))
    model.add(Dense(np.prod(self.img_shape), activation='tanh'))
    model.add(Reshape(self.img_shape))
    model.summary()

    z = Input(shape=(self.latent_dim,))
    img = model(z)
    return Model(z, img)

The resulting model summary. Note the tanh on the output layer: it matches the [-1, 1] rescaling applied to the training data in train() below.

  • Training loop
def train(self, epochs, batch_size=128, sample_interval=50):
    # Load the dataset
    (X_train, _), (_, _) = mnist.load_data()

    # Rescale -1 to 1
    X_train = (X_train.astype(np.float32) - 127.5) / 127.5
    X_train = np.expand_dims(X_train, axis=3)

    # Adversarial ground truths
    valid = np.ones((batch_size, 1))
    fake = np.zeros((batch_size, 1))

    for epoch in range(epochs):

        # Train Discriminator
        # Select a random batch of images
        idx = np.random.randint(0, X_train.shape[0], batch_size)
        imgs = X_train[idx]

        latent_fake = self.encoder.predict(imgs)
        latent_real = np.random.normal(size=(batch_size, self.latent_dim))

        # Train the discriminator
        d_loss_real = self.discriminator.train_on_batch(latent_real, valid)
        d_loss_fake = self.discriminator.train_on_batch(latent_fake, fake)
        d_loss = 0.5 * np.add(d_loss_real, d_loss_fake)

        # Train Generator
        g_loss = self.adversarial_autoencoder.train_on_batch(imgs, [imgs, valid])

        # If at save interval => save generated image samples
        if epoch % sample_interval == 0:
            # Plot the progress
            print("%d [D loss: %f, acc: %.2f%%] [G loss: %f, mse: %f]" %
                  (epoch, d_loss[0], 100 * d_loss[1], g_loss[0], g_loss[1]))
            self.sample_images(epoch)

It's not hard to see that the training logic here is exactly the same as yesterday's; the only thing that really changed is the Generator's architecture.
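For completeness, a typical entry point looks like the sketch below; the exact hyperparameter values are my assumption, in the style of the repo's other scripts:

if __name__ == '__main__':
    aae = AdversarialAutoencoder()
    aae.train(epochs=20000, batch_size=32, sample_interval=200)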

  • Helper Functions
def sample_images(self, epoch):
    r, c = 5, 5
    z = np.random.normal(size=(r * c, self.latent_dim))
    gen_imgs = self.decoder.predict(z)
    # Rescale images from [-1, 1] back to [0, 1]
    gen_imgs = 0.5 * gen_imgs + 0.5

    fig, axs = plt.subplots(r, c)
    cnt = 0
    for i in range(r):
        for j in range(c):
            axs[i, j].imshow(gen_imgs[cnt, :, :, 0], cmap='gray')
            axs[i, j].axis('off')
            cnt += 1
    fig.savefig("images/mnist_%d.png" % epoch)
    plt.close()

A very simple output helper. The other helper function just saves the trained models, so I won't include it here.
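One detail worth double-checking is the range bookkeeping between train() and sample_images(): pixels are rescaled to [-1, 1] to match the decoder's tanh output, then mapped back to [0, 1] for display. A quick NumPy verification:

import numpy as np

x = np.array([0., 127.5, 255.])    # raw grayscale pixel values
scaled = (x - 127.5) / 127.5       # -> [-1., 0., 1.], the tanh output range
recovered = 0.5 * scaled + 0.5     # -> [0., 0.5, 1.], valid input for imshow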


Observations after tuning the parameters

  1. At the start of training, the Discriminator's accuracy is very high, sometimes even hitting 100%. On reflection this makes sense: its job is to judge whether a latent code was drawn from the prior, and early on the encoder's outputs look nothing like samples from a standard normal, so they should be easy to tell apart.
  2. Raising the latent dimension latent_dim from 10 to 50 did not change the time to finish training, but the results got worse (the images were still very blurry at epoch 20,000).
  3. Adjusting the learning rate made no noticeable difference in training time, but it seemed to take more epochs to get sharper images.
  4. Trying a larger batch_size = 128, training slowed down noticeably, and even after quite a few epochs (~20K) the Discriminator's accuracy could still reach around 80%, which seems to suggest the Generator is on the weak side at that point? The final images still had spots that were visibly off to the naked eye.