Daily Reading 20200513

Example-Guided Style-Consistent Image Synthesis from Semantic Labeling

posted on: CVPR 2019

In this paper, they present a method for example-guided image synthesis with style consistency from general-form semantic labels, focusing mainly on face, dance, and street-view image synthesis tasks. Built on pix2pixHD, their network contains 1) a generator, which takes a semantic map x, a style example I, and its corresponding label F(I) as input and outputs a synthetic image; 2) a standard discriminator to distinguish real images from fake ones given the conditional inputs; and 3) a style-consistency discriminator, operating on image pairs, to detect whether the synthetic output and the guidance image I are style-compatible. During training, they propose sampling style-consistent and style-inconsistent image pairs from videos to give the model style awareness. They also introduce style-consistency adversarial losses as well as a semantic consistency loss with adaptive weights to produce plausible results, and they perform qualitative and quantitative comparisons on several applications.
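A minimal PyTorch-style sketch of the pairwise style-consistency objective described above, assuming hinge-form adversarial losses and hypothetical names (`StyleConsistencyDiscriminator`, `style_consistency_loss`, `frame_a`/`frame_b`/`frame_other`); the paper's actual architecture and loss formulation may differ:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F  # PyTorch functional, not the paper's labeling F(.)

class StyleConsistencyDiscriminator(nn.Module):
    """Scores whether two images are style-compatible.
    Operates on channel-wise concatenated image pairs (3 + 3 = 6 channels)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(6, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(128, 1, 4, stride=1, padding=1),  # patch-level logits
        )

    def forward(self, img_a, img_b):
        return self.net(torch.cat([img_a, img_b], dim=1))

def style_consistency_loss(d_style, frame_a, frame_b, frame_other, fake):
    """Discriminator-side style-consistency terms for one training step.

    frame_a, frame_b : two frames sampled from the SAME video  -> consistent pair
    frame_other      : a frame sampled from a DIFFERENT video  -> inconsistent pair
    fake             : generator output guided by frame_a's style
    """
    # A real style-consistent pair should score high.
    loss_consistent = F.relu(1.0 - d_style(frame_a, frame_b)).mean()
    # A style-inconsistent pair should score low.
    loss_inconsistent = F.relu(1.0 + d_style(frame_a, frame_other)).mean()
    # The synthetic image paired with its guidance image should score low for
    # the discriminator (the generator is trained to push this score up).
    loss_fake = F.relu(1.0 + d_style(frame_a, fake.detach())).mean()
    return loss_consistent + loss_inconsistent + loss_fake
```

Sampling the positive pair from the same video and the negative from a different one is what gives the model style awareness for free from video data, without any manual style annotation.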

Pros:

  1. For earlier image-to-image synthesis methods, it is hard to tell whether the network has actually learned a new data distribution. Here the style example explicitly conditions the output on a particular distribution, so a single network can produce outputs from different data distributions.

Cons:

  1. Although they build their network on pix2pixHD, they do not adopt its multi-scale architecture and limit the input size to 256×256.

  2. The results can be significantly affected by the performance of the state-of-the-art semantic labeling function F(·).