I hoped that transfer learning would allow me to do more with less. Less training data, less computing power, and less time. In the context of image generation with GANs, transfer learning means taking a neural network (GAN) trained on one set of data, then switching to a different set of data and continuing to train briefly.
Robbie Barrat demonstrated this technique by training a GAN on landscape images, then briefly switching the training data to abstract paintings:
In previous blog posts, I have attempted to compensate for having a relatively small training dataset by using an undertrained StyleGAN, creating a category in GANGogh, and interpolating between different parts of a trained GAN’s latent space (see Evolition here).
I used several implementations of DCGANs: Soumith Chintala’s DCGAN (here), Robbie Barrat’s implementation of Soumith’s GAN (here), and Taehoon Kim’s TensorFlow version (here). All three implement the option of resuming training from a checkpoint, which makes it possible to pause training and swap the data. From what I can see, this does not automatically produce usable results. After all, the Generator network has been trained around the features common to one dataset, so swapping in a completely different dataset is just as likely to confuse the network as to enhance it. The other half of the GAN is the Discriminator network, and its job changes completely when the data is switched. Letting the training run for more than a couple of epochs (an epoch being one complete pass through all the training data) generally results in a garbled mess.
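The workflow those implementations enable can be sketched in a few lines of PyTorch. This is a minimal, illustrative toy, not any of the repositories above: the tiny network sizes, the `ckpt.pth` filename, and the stand-in random batches are all my own placeholders. The point is the shape of the process: train on dataset A, checkpoint, reload, then train briefly on dataset B.

```python
# Sketch of checkpoint-based transfer between datasets in a DCGAN-style
# setup. Sizes and names are illustrative, not from any cited repo.
import torch
import torch.nn as nn

nz, ngf, ndf, nc = 16, 8, 8, 3  # tiny latent/feature/channel sizes

netG = nn.Sequential(  # toy generator: latent vector -> 8x8 image
    nn.ConvTranspose2d(nz, ngf, 4, 1, 0), nn.ReLU(),
    nn.ConvTranspose2d(ngf, nc, 4, 2, 1), nn.Tanh())
netD = nn.Sequential(  # toy discriminator: 8x8 image -> realness score
    nn.Conv2d(nc, ndf, 4, 2, 1), nn.LeakyReLU(0.2),
    nn.Conv2d(ndf, 1, 4, 1, 0), nn.Sigmoid())

optG = torch.optim.Adam(netG.parameters(), lr=2e-4, betas=(0.5, 0.999))
optD = torch.optim.Adam(netD.parameters(), lr=2e-4, betas=(0.5, 0.999))
bce = nn.BCELoss()

def train_epochs(batches, epochs):
    for _ in range(epochs):
        for real in batches:
            # discriminator step: real batch vs. generated batch
            optD.zero_grad()
            d_real = netD(real).view(-1)
            loss_real = bce(d_real, torch.ones_like(d_real))
            fake = netG(torch.randn(real.size(0), nz, 1, 1))
            d_fake = netD(fake.detach()).view(-1)
            loss_fake = bce(d_fake, torch.zeros_like(d_fake))
            (loss_real + loss_fake).backward()
            optD.step()
            # generator step: try to fool the discriminator
            optG.zero_grad()
            d_fake = netD(fake).view(-1)
            bce(d_fake, torch.ones_like(d_fake)).backward()
            optG.step()

# Phase 1: train on the first dataset (random tensors stand in for images).
first_dataset = [torch.randn(4, nc, 8, 8) for _ in range(3)]
train_epochs(first_dataset, epochs=2)
torch.save({"G": netG.state_dict(), "D": netD.state_dict()}, "ckpt.pth")

# Phase 2: resume from the checkpoint, swap in the second dataset, and
# train only briefly, so the new features blend in rather than take over.
ckpt = torch.load("ckpt.pth")
netG.load_state_dict(ckpt["G"])
netD.load_state_dict(ckpt["D"])
second_dataset = [torch.randn(4, nc, 8, 8) for _ in range(3)]
train_epochs(second_dataset, epochs=1)

sample = netG(torch.randn(1, nz, 1, 1))  # one generated image tensor
```

Note that both networks' weights survive the swap; only the data changes, which is why the Discriminator's job shifts under it and why more than an epoch or two on the new data tends to wash out what was learned from the first.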
I did manage to produce some interesting images, based on Robbie Barrat’s GAN trained first on portrait paintings, then switching the data to my renaissance faces. One of the first things I noticed was the subdued palette coming in. My renaissance faces training dataset includes drawings as well as paintings, some of them two-colour line drawings, and these sorts of features began appearing in the output.
This technique definitely shows promise; however, training on two quite different datasets doesn’t seem to be the way to get the most predictable results, as the image above indicates. With the two datasets I’ve used, the output lacks predictability, but that makes it useful for generating unexpected results, e.g. a painted torso with a hand-drawn face, or a second face appearing out of the sitter’s clothing. For now, experimentation seems to be the way forward. I am making more and hope to gain some insight into which datasets work well together.