Disentangling content and style via unsupervised geometry distillation

Wayne Wu; Kaidi Cao; Cheng Li; Chen Qian; Chen Change Loy

Research output: Contribution to conference › Paper › peer-review

11 Citations (Scopus)

Abstract

It is challenging to disentangle an object into two orthogonal spaces of content and style since each can influence the visual observation differently and unpredictably. It is rare for one to have access to a large amount of data to help separate the influences. In this paper, we present a novel framework to learn this disentangled representation in a completely unsupervised manner. We address this problem in a two-branch Autoencoder framework. For the structural content branch, we project the latent factor into a soft structured point tensor and constrain it with losses derived from prior knowledge. This constraint encourages the branch to distill geometry information. Another branch learns the complementary style information. The two branches form an effective framework that can disentangle an object's content-style representation without any human annotation. We evaluate our approach on four image datasets, on which we demonstrate superior disentanglement and visual analogy quality on both synthesized and real-world data. We are able to generate photo-realistic images at 256 × 256 resolution that are clearly disentangled in content and style.

Original language	English
Publication status	Published - 2019
Externally published	Yes
Event	2019 Deep Generative Models for Highly Structured Data, DGS@ICLR 2019 Workshop - New Orleans, United States; Duration: May 6 2019 → …

Conference

Conference	2019 Deep Generative Models for Highly Structured Data, DGS@ICLR 2019 Workshop
Country/Territory	United States
City	New Orleans
Period	5/6/19 → …

Bibliographical note

Publisher Copyright:
© Deep Generative Models for Highly Structured Data, DGS@ICLR 2019 Workshop. All rights reserved.

ASJC Scopus Subject Areas

Linguistics and Language
Language and Linguistics
Education
Computer Science Applications

Cite this

@conference{218b882646764dc0a0f8c95a1fcc49b1,

title = "Disentangling content and style via unsupervised geometry distillation",

abstract = "It is challenging to disentangle an object into two orthogonal spaces of content and style since each can influence the visual observation differently and unpredictably. It is rare for one to have access to a large number of data to help separate the influences. In this paper, we present a novel framework to learn this disentangled representation in a completely unsupervised manner. We address this problem in a two-branch Autoencoder framework. For the structural content branch, we project the latent factor into a soft structured point tensor and constrain it with losses derived from prior knowledge. This constraint encourages the branch to distill geometry information. Another branch learns the complementary style information. The two branches form an effective framework that can disentangle object's content-style representation without any human annotation. We evaluate our approach on four image datasets, on which we demonstrate the superior disentanglement and visual analogy quality both in synthesized and real-world data. We are able to generate photo-realistic images with 256 × 256 resolution that are clearly disentangled in content and style.",

author = "Wayne Wu and Kaidi Cao and Cheng Li and Chen Qian and Loy, {Chen Change}",

note = "Publisher Copyright: {\textcopyright} Deep Generative Models for Highly Structured Data, DGS@ICLR 2019 Workshop. All rights reserved.; 2019 Deep Generative Models for Highly Structured Data, DGS@ICLR 2019 Workshop ; Conference date: 06-05-2019",

year = "2019",

language = "English",

}

TY - CONF

T1 - Disentangling content and style via unsupervised geometry distillation

AU - Wu, Wayne

AU - Cao, Kaidi

AU - Li, Cheng

AU - Qian, Chen

AU - Loy, Chen Change

PY - 2019

Y1 - 2019

N2 - It is challenging to disentangle an object into two orthogonal spaces of content and style since each can influence the visual observation differently and unpredictably. It is rare for one to have access to a large number of data to help separate the influences. In this paper, we present a novel framework to learn this disentangled representation in a completely unsupervised manner. We address this problem in a two-branch Autoencoder framework. For the structural content branch, we project the latent factor into a soft structured point tensor and constrain it with losses derived from prior knowledge. This constraint encourages the branch to distill geometry information. Another branch learns the complementary style information. The two branches form an effective framework that can disentangle object's content-style representation without any human annotation. We evaluate our approach on four image datasets, on which we demonstrate the superior disentanglement and visual analogy quality both in synthesized and real-world data. We are able to generate photo-realistic images with 256 × 256 resolution that are clearly disentangled in content and style.

UR - http://www.scopus.com/inward/record.url?scp=85083950796&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85083950796&partnerID=8YFLogxK

M3 - Paper

AN - SCOPUS:85083950796

T2 - 2019 Deep Generative Models for Highly Structured Data, DGS@ICLR 2019 Workshop

Y2 - 6 May 2019

ER -
