High-Resolution Daytime Translation Without Domain Labels
posted on: arXiv 2020
In this paper, the authors propose an image-to-image translation model that does not rely on domain labels during either training or testing. The model can transfer style either from a specific reference image or by sampling from a learned style distribution. Quantitative and qualitative comparisons show performance comparable to models that require labels at least at training time. The authors also propose a post-processing enhancement network that upsamples results to higher resolution, and they demonstrate that the model generalizes to domains beyond landscapes and to video.
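As a concrete illustration of this interface, below is a minimal PyTorch sketch of a label-free translator built from a content encoder, a style encoder, and a decoder. All layer choices, the style dimensionality, and the conditioning scheme here are my own illustrative assumptions, not the authors' architecture; the sketch only shows the two inference modes (style taken from a reference image vs. style sampled from a prior).

```python
# Minimal sketch of the label-free translation interface summarized above.
# Layer sizes, style_dim, and the concatenation-based conditioning are
# illustrative assumptions, not the authors' implementation.
import torch
import torch.nn as nn

class HiDTSketch(nn.Module):
    def __init__(self, style_dim: int = 3):
        super().__init__()
        self.style_dim = style_dim
        # Content encoder: image -> spatial content tensor (toy stand-in).
        self.content_enc = nn.Sequential(
            nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        # Style encoder: image -> low-dimensional style vector.
        self.style_enc = nn.Sequential(
            nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(64, style_dim),
        )
        # Decoder: content tensor conditioned on a style vector -> image.
        self.decoder = nn.Sequential(
            nn.Conv2d(64 + style_dim, 3, 3, padding=1), nn.Tanh(),
        )

    def translate(self, x: torch.Tensor, style_ref: torch.Tensor = None) -> torch.Tensor:
        content = self.content_enc(x)
        if style_ref is not None:
            # Mode 1: transfer style from a specific reference image.
            style = self.style_enc(style_ref)
        else:
            # Mode 2: sample a style from the prior (assuming the learned
            # style distribution is regularized toward a standard normal).
            style = torch.randn(x.size(0), self.style_dim, device=x.device)
        # Toy conditioning: broadcast the style spatially and concatenate
        # (the paper conditions the decoder differently, e.g., via AdaIN).
        s = style[:, :, None, None].expand(-1, -1, *content.shape[2:])
        return self.decoder(torch.cat([content, s], dim=1))

model = HiDTSketch()
day = torch.rand(1, 3, 128, 128)
night_ref = torch.rand(1, 3, 128, 128)
out = model.translate(day, style_ref=night_ref)  # reference-guided translation
```

With this interface, a timelapse-style video could in principle be produced by repeatedly translating one still image with style vectors interpolated frame by frame, which matches the video use case noted in the pros below.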
Pros:
It reduces the video translation task to image-to-image translation: the model is trained only on still landscape images, yet can generate video frames.
Compared to previous methods that need labels at training time (or at both training and test time), domain labels are not required here, which greatly eases data collection and facilitates research.
They propose a style distribution loss that constrains the style space to be both representative and diverse (see the first sketch after this list).
The post-processing enhancement model is practical for image generation tasks, and the combination of skip connections with adaptive instance normalization appears effective (see the second sketch after this list).
It effectively integrates components from many state-of-the-art methods into one working system.
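Below is a hedged sketch of what the style distribution loss could look like; the paper's exact formulation may differ. Here I assume simple moment matching: the empirical mean and variance of style codes extracted from real images are pushed toward a standard normal prior, so that styles sampled at test time stay in-distribution.

```python
# Assumed formulation: match the first two moments of extracted style
# codes to N(0, I). This is an illustrative stand-in, not the paper's loss.
import torch

def style_distribution_loss(styles: torch.Tensor) -> torch.Tensor:
    """styles: (batch, style_dim) codes produced by the style encoder."""
    mean = styles.mean(dim=0)
    var = styles.var(dim=0, unbiased=False)
    # Push the empirical mean toward 0 and the variance toward 1.
    return (mean ** 2).mean() + ((var - 1.0) ** 2).mean()
```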
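And here is a minimal sketch of the skip-connection-plus-AdaIN combination praised above; the block structure and skip placement are assumptions for illustration, not the authors' enhancement network.

```python
# One residual block combining AdaIN conditioning with a skip connection.
# Shapes and placement are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AdaINSkipBlock(nn.Module):
    def __init__(self, channels: int, style_dim: int):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, 3, padding=1)
        # Map the style vector to per-channel scale and shift parameters.
        self.affine = nn.Linear(style_dim, 2 * channels)

    def forward(self, x: torch.Tensor, style: torch.Tensor) -> torch.Tensor:
        h = self.conv(x)
        # Instance-normalize, then modulate with style-derived statistics.
        h = F.instance_norm(h)
        gamma, beta = self.affine(style).chunk(2, dim=1)
        h = gamma[:, :, None, None] * h + beta[:, :, None, None]
        # The skip connection preserves high-resolution input detail.
        return x + h
```

The appeal of this combination is that AdaIN injects the target style globally while the skip path carries fine spatial detail through unchanged, which is plausibly why it works well for upsampling.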
Cons:
The model is quite complicated, with five components and six kinds of losses, which makes training and reproduction harder.
Training and evaluation focus specifically on landscape images, so the demonstrated scope of the task is narrow.