Style Transfer using a CycleGAN [WIP]

We make a Neural Network paint like Monet!

Generative Neural Networks are a very different story compared to "normal" Neural Networks. Since they are trained to learn the probability distribution of the data itself rather than the conditional distribution of a target given the input features, you can sample from them just like any other probability distribution and "create" your own data. A very well-known example of this is the Generative Adversarial Network (GAN), in which two rivaling networks are trained to generate realistic data based on a training input.

A visually appealing example of this type of architecture is the CycleGAN. Conceptually, it is an adversarial network designed to learn a mapping between two different datasets, so that data from one domain can be turned into data that looks like it came from the other. A lot of additional information as well as some examples of this concept can be found on the original paper authors' page here.

Now we want to take real photographs and use a CycleGAN network to turn them into something that looks just like a Monet painting. This concept is commonly called Style Transfer: with it, you could for example take any painter's style and apply it to your heart's content to any image, creating a realistic rendition of that image in that painter's style, provided you have enough real sample paintings to sufficiently train such a network.

This tutorial assumes that you have successfully installed the Core Engine. If you have not done so, please refer to the installation section.

Authentication

First, create a client to interact with the Core Engine.

import cengine

client = cengine.Client(username=<YOUR_USERNAME>, password=<YOUR_PASSWORD>)

Now, create a provider if you haven't already. It serves as the link between your own cloud infrastructure and your Core Engine account. For this tutorial, we will show you how to set up your Google Cloud Platform (GCP) as a provider.

# PROVIDER_NAME: name of your provider
# SERVICE_ACCOUNT: path to your service_account json file
# BUCKET_NAME: path to your google cloud storage bucket, gs://your_bucket
# Create a provider
gcp_provider = client.create_provider(name=PROVIDER_NAME,
                                      provider_type='gcp',
                                      args={'service_account': SERVICE_ACCOUNT,
                                            'artifact_store': BUCKET_NAME})

The dataset

This tutorial is based on the Kaggle competition on CycleGANs. After registering there, you can download the data (Monet images and real photos in this case) either using the Kaggle API or by following the instructions on the website.
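
If you go the Kaggle API route, the download can be scripted. Below is a minimal sketch using the official kaggle Python package; the competition slug gan-getting-started and the target directory are assumptions on my part, so double-check them against the competition page.

# A sketch only: assumes the kaggle package is installed and your API token
# is configured under ~/.kaggle/kaggle.json. The competition slug is an
# assumption taken from the competition URL and may need adjusting.
from kaggle.api.kaggle_api_extended import KaggleApi

api = KaggleApi()
api.authenticate()
api.competition_download_files('gan-getting-started', path='data/')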

If you want to try out another painter or style instead, the GitHub repository of the CycleGAN creators has you covered, with paintings by Cezanne and Van Gogh as well as Ukiyo-e prints. These images can be downloaded by running a shell script provided inside that repository.

Creating the datasource

Before creating the image data source, you have to add a labels.json file. If you have not created an image data source before, consider reading the documentation on images on GCS with the Core Engine. You can use the Python script prepare_gan_images.py in the GitHub repository for this tutorial to prepare your source.
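
To make the labeling convention concrete: the exact schema of labels.json is defined by prepare_gan_images.py in the tutorial repository, but purely as a hypothetical illustration, a mapping from image files to the dummy label (0 for Monet paintings, 1 for real photos) could look like this:

# Hypothetical illustration only -- the actual schema is whatever
# prepare_gan_images.py in the tutorial repository produces.
import json

labels = {
    "monet_jpg/0a5075d42a.jpg": 0,   # 0 = Monet painting
    "photo_jpg/0a0c3a6d07.jpg": 1,   # 1 = real photograph
}

with open("labels.json", "w") as f:
    json.dump(labels, f)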

If you want to create two or more different datasets with the prepare_gan_images.py script mentioned above, either specify different target folders each time or delete the target folder after the previous run, as the script neither asserts an empty target directory nor removes existing files in it.

After downloading the CycleGAN dataset from Kaggle onto your local machine and creating a labels.json file with the script above, you have to upload them to a Google Cloud Storage (GCS) bucket that resides in the same Google Cloud project as your orchestrator.

The Kaggle dataset also ships with pre-made TFRecords of both the Monet images and the real photos alongside the actual images themselves. Be sure not to upload these TFRecords to your GCS bucket; upload only the actual images plus the labels.json file from the target directory that you set in prepare_gan_images.py!
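
The upload itself can be done with gsutil or, if you prefer to stay in Python, with the google-cloud-storage client. The snippet below is a minimal sketch under that assumption; TARGET_DIR and BUCKET_NAME are placeholders for the prepare_gan_images.py target directory and your bucket name.

# A sketch only: uploads the prepared images plus labels.json (assuming it
# sits inside TARGET_DIR) with the google-cloud-storage client.
# TARGET_DIR and BUCKET_NAME are placeholders for your own values.
import os
from google.cloud import storage

storage_client = storage.Client()
bucket = storage_client.bucket(BUCKET_NAME)

for root, _, files in os.walk(TARGET_DIR):
    for fname in files:
        local_path = os.path.join(root, fname)
        blob_name = os.path.relpath(local_path, TARGET_DIR)
        bucket.blob(blob_name).upload_from_filename(local_path)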

Now, create the datasource with the Python SDK as follows:

# YOUR_BUCKET_PATH: Path to your GCS bucket with the Monet/real images
# and the labels.json file
gan_datasource = client.create_datasource(
    name='CycleGANDatasource',
    source='local_image',
    type='image',
    provider_id=gcp_provider.id,
    args={"base_path": <YOUR_BUCKET_PATH>})
print(gan_datasource)

And that's it! Creating the datasource will take some time; expect about 15-20 minutes. Afterwards, commit your datasource like this:

# Create a commit of your datasource
gan_datasource_commit = client.commit_datasource(datasource_id=gan_datasource.id)
print(gan_datasource_commit)

Splitting the data

Now we have to separate the Monet paintings from the real photos in order to feed them into the CycleGAN. The reason for this split is that the CycleGAN expects its input as pairs of one Monet image and one real image. To facilitate that, we can use the "dummy" label we added while creating the labels.json file above: the convention we followed there is that the label value 0 indicates a Monet image, while the value 1 indicates a real image. Thus, we can set up our data split like this:

from cengine import Method, PipelineConfig
# Start with a template
c = PipelineConfig()
# Configure the dataset split
c.split.categorize.by = 'label'
c.split.categorize.categories = {"train": [0], "eval": [1]}

Note that we named the splits "train" and "eval" simply because these names are expected by the Core Engine; in this special case, both splits are used for training. Now it's time to set up a custom transform for our data!

Custom transform

Next up, we have to set up a custom transform. To save space, the images are natively stored in binary format in BigQuery. Hence, before they can be processed by our GAN, they need to be decoded and normalized. Both of these steps happen inside the Transform component, and the Core Engine allows us to wrap them into one operation by injecting a custom transform. The code is very simple and shown below:

# my_transform.py
import tensorflow as tf

def decode_and_reshape_image(input_):
    # Decode each raw image byte string in the batch into a uint8 tensor
    image = tf.map_fn(
        lambda x: tf.io.decode_image(x[0], channels=3),
        input_,
        dtype=tf.uint8)
    # Normalize pixel values from [0, 255] to [-1, 1]
    image = (tf.cast(image, tf.float32) / 127.5) - 1
    # Restore the batched image shape: (batch, height, width, channels)
    image = tf.reshape(image, [-1, 256, 256, 3])
    return image

Let us briefly look at two particular intricacies of this code. First, the actual image decoding happens inside a tf.map_fn call; this is because the transform operates on batches of data, so the decode_image function cannot be called on the batch directly. Instead, using the map paradigm, it is applied to every element of the image batch separately.

The second thing to pay attention to is the shape semantics inside the transform component. After the image is cast to a float tensor and normalized to the range between -1 and 1, a tf.reshape operation takes place in the last line before the return statement. Whereas normally you would reshape your image to the shape height x width x num_channels (or num_channels x height x width, depending on the format you use), here you need to add an extra placeholder -1 for the batch dimension at the front.
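
If you want to convince yourself of the resulting shapes locally, here is a small, optional sanity check. It is a sketch under the assumption that each example arrives as a single encoded image byte string, which mirrors how the transform above indexes x[0].

# Optional local sanity check for the transform above (a sketch, not part of
# the pipeline). Two random 256x256 RGB images are PNG-encoded and passed
# through decode_and_reshape_image as a batch of byte strings.
import numpy as np
import tensorflow as tf
from my_transform import decode_and_reshape_image

raw = [tf.io.encode_png(
           tf.constant(np.random.randint(0, 256, (256, 256, 3), dtype=np.uint8))
       ).numpy() for _ in range(2)]
batch = tf.constant([[r] for r in raw])   # shape (2, 1): one byte string per example

images = decode_and_reshape_image(batch)
print(images.shape)   # expected: (2, 256, 256, 3), values in [-1, 1]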

In order to use the custom transform defined above in your own GAN pipeline, you have to push the function to your workspace. This can easily be done over the Python SDK, too:

# LOCAL_PATH: Path to the python file on your system
# containing the "decode_and_reshape_image" transform
client.push_function(name='decode_and_reshape_image',
                     function_type='transform',
                     local_path=LOCAL_PATH,
                     udf_name='decode_and_reshape_image')

To add the custom transform to your pipeline configuration, run the following code in the Python SDK on the same PipelineConfig object as above:

# Configure non-default preprocessing with our custom transform
c.features.add(['image'])
c.features['image'].transform = Method(method='decode_and_reshape_image@latest')
# Configure your labels
c.labels.add(['label'])

Since we only added the image feature to our config, none of the other metadata we had will be accessible during training. That is fine, because the only metadata we had was shape information, which is inferred by our custom transform anyway. The label, however, is a separate key and will still be available.

After the transform, we have numpy array representations of our images that we can feed into our CycleGAN network. Now it is time to add the actual CycleGAN model to our pipeline.

Adding the CycleGAN model

Conceptually, the model we use here is the same as the one found in the Introductory Notebook on the Kaggle CycleGAN challenge. There are a few caveats along the way, though:

  1. Since the model has a more complicated architecture, we subclass the keras.Model class and override the train_step method to our own liking (a stripped-down sketch of this pattern is shown after this list). On the latest TensorFlow version, this alone would be sufficient: the model then incorporates the custom train_step method and uses it inside of the standard fit function. But since we are on TensorFlow 2.1, that functionality is not there yet, so we override the fit function as well, adding the most important things like TensorBoard logging and custom loss logging ourselves.

  2. The output model will be very big (in excess of 1GB!), which can pose a problem upon deployment on GCAIP. Because of that limitation, we only return the Monet Generator network that lets us create Monet-styled images looking like impressionist paintings.
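
To make point 1 more concrete, here is a heavily stripped-down sketch of the subclassing pattern, not the tutorial's actual model code: the generator and discriminator definitions, the identity loss and the fit override are all omitted, and the constructor arguments are placeholders for networks and losses defined elsewhere.

# A heavily simplified sketch of subclassing keras.Model with a custom
# train_step; not the model used in this tutorial. Generators, discriminators,
# optimizers and loss functions are assumed to be passed in from the outside.
import tensorflow as tf
from tensorflow import keras

class CycleGan(keras.Model):
    def __init__(self, monet_gen, photo_gen, monet_disc, photo_disc, lambda_cycle=10):
        super(CycleGan, self).__init__()
        self.m_gen, self.p_gen = monet_gen, photo_gen
        self.m_disc, self.p_disc = monet_disc, photo_disc
        self.lambda_cycle = lambda_cycle

    def compile(self, gen_opt, disc_opt, gen_loss, disc_loss, cycle_loss):
        super(CycleGan, self).compile()
        self.gen_opt, self.disc_opt = gen_opt, disc_opt
        self.gen_loss, self.disc_loss, self.cycle_loss = gen_loss, disc_loss, cycle_loss

    def train_step(self, data):
        real_monet, real_photo = data
        with tf.GradientTape(persistent=True) as tape:
            # Photo -> Monet -> photo and Monet -> photo -> Monet cycles
            fake_monet = self.m_gen(real_photo, training=True)
            cycled_photo = self.p_gen(fake_monet, training=True)
            fake_photo = self.p_gen(real_monet, training=True)
            cycled_monet = self.m_gen(fake_photo, training=True)

            # Adversarial loss plus weighted cycle-consistency loss
            gen_total = (self.gen_loss(self.m_disc(fake_monet, training=True))
                         + self.gen_loss(self.p_disc(fake_photo, training=True))
                         + self.lambda_cycle * (self.cycle_loss(real_monet, cycled_monet)
                                                + self.cycle_loss(real_photo, cycled_photo)))
            disc_total = (self.disc_loss(self.m_disc(real_monet, training=True),
                                         self.m_disc(fake_monet, training=True))
                          + self.disc_loss(self.p_disc(real_photo, training=True),
                                           self.p_disc(fake_photo, training=True)))

        gen_vars = self.m_gen.trainable_variables + self.p_gen.trainable_variables
        disc_vars = self.m_disc.trainable_variables + self.p_disc.trainable_variables
        self.gen_opt.apply_gradients(zip(tape.gradient(gen_total, gen_vars), gen_vars))
        self.disc_opt.apply_gradients(zip(tape.gradient(disc_total, disc_vars), disc_vars))
        return {"gen_loss": gen_total, "disc_loss": disc_total}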

For more information, please check the source code in the GitHub repository for this tutorial. As before with the custom transform, we first push the model to our workspace and then add the trainer section to our pipeline's configuration.

# LOCAL_PATH: Path to the python file on your system
# containing the "my_gan_model" function
client.push_function(name='cycle_gan',
                     function_type='model',
                     local_path=LOCAL_PATH,
                     udf_name='my_gan_model')

The custom model can be added to your config like this, assuming that you registered your model under the name cycle_gan in the Core Engine. Naturally, you can choose any other name you wish. The @latest suffix is for versioning purposes; here, it means that the pipeline should run with the newest version of the CycleGAN model. The model expects the number of training epochs as well as a regularization parameter as inputs, which we fill here with (hopefully) sensible default values.

# Configure your training with the custom CycleGAN model
c.trainer.fn = "cycle_gan@latest"
c.trainer.params = {'epochs': 25,
                    'lambda_cycle': 10}

Additionally, we want to skip the evaluation of our model for now. To do this, we add the following lines to our pipeline configuration:

# Configure to skip evaluation with these settings
c.evaluator.slices = []
c.evaluator.metrics = []

Adding GCAIP Training and Serving Backends

Another big plus of the Core Engine is that you can easily add more powerful backends to process your pipelines - and that is exactly what we are going to do here. We will use the AI Platform Service by Google Cloud to train our model on a GPU and then deploy it on AI Platform as well. After that, you can send it requests from a Jupyter Notebook to get your Monet-styled, generated images!

First of all, push your finished pipeline using this SDK command:

active_workspace = client.get_workspaces()[0]
gan_pipeline = client.push_pipeline(name='CycleGANPipeline',
                                    config=c,
                                    workspace_id=active_workspace.id)

Now it is time to bring out the big guns. We will use a 32-core machine to handle preprocessing and a GCAIP-powered backend to handle training and serving. To set this up, you need no more than the following lines:

orchestrator_args = {
    "machine_type": "n1-standard-32",
    "zone": "europe-west1-b",
    "preemptible": True,
}
training_backend = gcp_provider.name + "_" + "gcaip_training"
serving_backend = gcp_provider.name + "_" + "gcaip_serving"

And we are all set! Now let us train the pipeline.

gan_pipeline_run = client.train_pipeline(
    pipeline_id=gan_pipeline.id,
    datasource_commit_id=gan_datasource_commit.id,
    orchestration_args=orchestrator_args,
    training_backend=training_backend,
    serving_backend=serving_backend)

Since the network is so large, with multiple convolution-based upsampling and downsampling operations, training for 25 epochs as specified above will take a considerable amount of time, even on a GPU machine on AI Platform. But once it is done, you should have a deployed model ready to receive images and turn them into Monet paintings!

Getting predictions from your deployed CycleGAN model

After some time, your model should have successfully completed training and been pushed to an instance on AI Platform. You can best confirm this using your own Google Cloud Console. In general, if you only see green checkmarks, that indicates that things went well. Now let's go and paint some pictures!

First, head over to your AI Platform console and find the model that you just trained. You will need to copy its name, which should be right there when you open the Models tab.
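
As a rough idea of what the prediction step looks like under the hood, here is a sketch using the Google API client for AI Platform online prediction. PROJECT_ID and MODEL_NAME are placeholders, and the exact structure of the instances payload depends on the serving signature of the exported Monet generator, so treat this only as an outline; the notebook described below handles the details.

# A sketch of AI Platform online prediction, not the tutorial's notebook code.
# PROJECT_ID and MODEL_NAME are placeholders; the instances payload must match
# the serving signature of the deployed Monet generator (an assumption here).
import numpy as np
from googleapiclient import discovery

# One preprocessed photo as a nested list of floats in [-1, 1]
photo = np.random.uniform(-1, 1, (256, 256, 3)).astype(np.float32)
instances = [photo.tolist()]

service = discovery.build('ml', 'v1')
model_path = 'projects/{}/models/{}'.format(PROJECT_ID, MODEL_NAME)
response = service.projects().predict(name=model_path,
                                      body={'instances': instances}).execute()
predictions = response['predictions']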

Go to the root of your local cengine repository, activate the virtual environment that comes with it and start a new Jupyter Notebook Server:

# Substitute CENGINE_ROOT with the root of the repository on your local machine
cd CENGINE_ROOT
source venv/bin/activate
jupyter notebook

Now after starting the notebook server, navigate the explorer tree to the CycleGAN tutorial in the Core Engine repository and fire up the tf_serving_example.ipynb notebook. Follow the instructions there and create some Monet magic!

Monet's lost work?!?