Daily Reading 20200521

GeneGAN: Learning Object Transfiguration and Attribute Subspace from Unpaired Data

posted on: BMVC2017

GeneGAN proposes a deterministic generative model that learns disentangled attribute subspaces from weakly labeled data via adversarial training. Fed two unpaired sets of images (with and without an object), GeneGAN uses an Encoder to split an image's representation into two parts: an object-attribute part and a background part. The object attribute may be eyeglasses, a smile, a hairstyle, or a lighting condition. By swapping the object feature fed to the Decoder, GeneGAN can render different styles of the same person, e.g., turning a smiling face into a non-smiling one. Besides the reconstruction loss and the usual adversarial loss, the authors introduce a nulling loss, which disentangles object features from background features, and a parallelogram loss, which constrains the pixel values of the two child images against those of the two parent images. Their experiments are conducted on aligned faces.
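The feature swap and the two extra losses can be sketched with toy numpy tensors. The linear `encode`/`decode` maps below stand in for the paper's networks, and the exact loss forms (L1 norms) are my assumptions for illustration, not the paper's definitions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for the Encoder/Decoder: an invertible linear map, so
# decode(encode(x)) reconstructs x exactly. The encoding is split into a
# background part `b` and an object-attribute part `o`.
W_enc = rng.standard_normal((8, 8))
W_dec = np.linalg.inv(W_enc)

def encode(x):
    h = W_enc @ x
    return h[:4], h[4:]          # (background, object attribute)

def decode(b, o):
    return W_dec @ np.concatenate([b, o])

Au = rng.standard_normal(8)      # parent image WITH the object (e.g. eyeglasses)
B0 = rng.standard_normal(8)      # parent image WITHOUT the object

bA, oA = encode(Au)
bB, oB = encode(B0)

# Swap the object features between the two images.
Bu = decode(bB, oA)              # child: B's background now carries A's object
A0 = decode(bA, oB)              # child: object removed from A

# Nulling loss (assumed L1): the object part of the object-free image's
# encoding should be pushed toward zero, so no object info hides there.
nulling_loss = np.abs(oB).sum()

# Parallelogram loss: parents and children should balance in pixel space,
# Au + B0 ≈ A0 + Bu.
parallelogram_loss = np.abs((Au + B0) - (A0 + Bu)).sum()
```

For this linear toy the parallelogram identity holds exactly (decoding is linear in the concatenated features); in the paper it is only an approximate constraint enforced by the loss.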

Pros:

  1. Compared with CycleGAN, GeneGAN is simpler, with only one generator and one discriminator, and achieves good performance on face attribute transfiguration on the CelebA and Multi-PIE datasets.

  2. The way of learning from weakly labeled, unpaired data is inspiring: two unpaired sets of images, with and without some object, effectively form a 0/1 labeling over all the training data.

Cons:

  1. The constraints they presented hold only approximately, so information may still leak between the object and background features.

  2. The object feature is not clearly defined. For eyeglasses it can capture color, type, size, etc., while for hairstyle it mainly captures hair direction and ignores color. Perhaps this follows previous works, but the paper leaves me wondering.