Transgaga: Geometry-aware unsupervised image-to-image translation

Wayne Wu; Kaidi Cao; Cheng Li; Chen Qian; Chen Change Loy

doi:10.1109/CVPR.2019.00820

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

89 Citations (Scopus)

Abstract

Unsupervised image-to-image translation aims at learning a mapping between two visual domains. However, learning a translation across large geometry variations always ends up with failure. In this work, we present a novel disentangle-and-translate framework to tackle the complex objects image-to-image translation task. Instead of learning the mapping on the image space directly, we disentangle image space into a Cartesian product of the appearance and the geometry latent spaces. Specifically, we first introduce a geometry prior loss and a conditional VAE loss to encourage the network to learn independent but complementary representations. The translation is then built on appearance and geometry space separately. Extensive experiments demonstrate the superior performance of our method to other state-of-the-art approaches, especially in the challenging near-rigid and non-rigid objects translation tasks. In addition, by taking different exemplars as the appearance references, our method also supports multimodal translation. Project page: https://wywu.github.io/projects/TGaGa/TGaGa.html.
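
The data flow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the names `Latent`, `Domain`, `translate`, `make_toy_domain`, `toy_geometry_map` and `toy_appearance_map` are hypothetical, the "networks" are trivial list operations, and the geometry prior loss and conditional VAE loss are omitted. It only shows an image being split into a (geometry, appearance) pair, each part being translated separately, and an optional exemplar supplying the appearance code.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

Vector = List[float]


@dataclass
class Latent:
    """A point in the product space: (geometry code, appearance code)."""
    geometry: Vector
    appearance: Vector


@dataclass
class Domain:
    """Hypothetical per-domain encoders and decoder (stand-ins for networks)."""
    encode_geometry: Callable[[Vector], Vector]
    encode_appearance: Callable[[Vector], Vector]
    decode: Callable[[Latent], Vector]


def translate(
    x: Vector,
    src: Domain,
    dst: Domain,
    geometry_map: Callable[[Vector], Vector],
    appearance_map: Callable[[Vector], Vector],
    exemplar: Optional[Vector] = None,
) -> Vector:
    """Translate x from src to dst by mapping each latent factor separately."""
    # Disentangle the source image into its two codes.
    z = Latent(src.encode_geometry(x), src.encode_appearance(x))
    # Geometry is always carried across by its own mapping.
    g = geometry_map(z.geometry)
    # Appearance comes from a target-domain exemplar if one is given,
    # otherwise from the appearance mapping.
    if exemplar is not None:
        a = dst.encode_appearance(exemplar)
    else:
        a = appearance_map(z.appearance)
    return dst.decode(Latent(g, a))


def make_toy_domain() -> Domain:
    """Toy assumption: first two values are geometry, the rest appearance."""
    return Domain(
        encode_geometry=lambda x: x[:2],
        encode_appearance=lambda x: x[2:],
        decode=lambda z: z.geometry + z.appearance,
    )


def toy_geometry_map(g: Vector) -> Vector:
    """Placeholder geometry translation: scale every coordinate by 2."""
    return [2.0 * v for v in g]


def toy_appearance_map(a: Vector) -> Vector:
    """Placeholder appearance translation: shift every value by 1."""
    return [v + 1.0 for v in a]
```

Calling `translate` with different `exemplar` values while keeping the same input yields outputs that share geometry but differ in appearance, which is the sense in which exemplar-driven translation is multimodal.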

Original language	English
Title of host publication	Proceedings - 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2019
Publisher	IEEE Computer Society
Pages	8004-8013
Number of pages	10
ISBN (Electronic)	9781728132938
DOIs	https://doi.org/10.1109/CVPR.2019.00820
Publication status	Published - Jun 2019
Externally published	Yes
Event	32nd IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2019 - Long Beach, United States; Duration: Jun 16 2019 → Jun 20 2019

Publication series

Name	Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
Volume	2019-June
ISSN (Print)	1063-6919

Conference

Conference	32nd IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2019
Country/Territory	United States
City	Long Beach
Period	6/16/19 → 6/20/19

Bibliographical note

Publisher Copyright:
© 2019 IEEE.

ASJC Scopus Subject Areas

Software
Computer Vision and Pattern Recognition

Keywords

And Body Pose
Deep Learning
Face
Gesture
Image and Video Synthesis
Representation Learning
Vision Applications and Syst

Access to Document

10.1109/CVPR.2019.00820

Cite this

Wu, W., Cao, K., Li, C., Qian, C., & Loy, C. C. (2019). Transgaga: Geometry-aware unsupervised image-to-image translation. In Proceedings - 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2019 (pp. 8004-8013). Article 8954399 (Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition; Vol. 2019-June). IEEE Computer Society. https://doi.org/10.1109/CVPR.2019.00820

@inproceedings{81794e19664d45a1b04b720846167a5a,

title = "Transgaga: Geometry-aware unsupervised image-to-image translation",

abstract = "Unsupervised image-to-image translation aims at learning a mapping between two visual domains. However, learning a translation across large geometry variations always ends up with failure. In this work, we present a novel disentangle-and-translate framework to tackle the complex objects image-to-image translation task. Instead of learning the mapping on the image space directly, we disentangle image space into a Cartesian product of the appearance and the geometry latent spaces. Specifically, we first introduce a geometry prior loss and a conditional VAE loss to encourage the network to learn independent but complementary representations. The translation is then built on appearance and geometry space separately. Extensive experiments demonstrate the superior performance of our method to other state-of-the-art approaches, especially in the challenging near-rigid and non-rigid objects translation tasks. In addition, by taking different exemplars as the appearance references, our method also supports multimodal translation. Project page: https://wywu.github.io/projects/TGaGa/TGaGa.html.",

keywords = "And Body Pose, Deep Learning, Face, Gesture, Image and Video Synthesis, Representation Learning, Vision Applications and Syst",

author = "Wayne Wu and Kaidi Cao and Cheng Li and Chen Qian and Loy, {Chen Change}",

note = "Publisher Copyright: {\textcopyright} 2019 IEEE.; 32nd IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2019 ; Conference date: 16-06-2019 Through 20-06-2019",

year = "2019",

month = jun,

doi = "10.1109/CVPR.2019.00820",

language = "English",

series = "Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition",

publisher = "IEEE Computer Society",

pages = "8004--8013",

booktitle = "Proceedings - 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2019",

address = "United States",

}

Wu, W, Cao, K, Li, C, Qian, C & Loy, CC 2019, Transgaga: Geometry-aware unsupervised image-to-image translation. in Proceedings - 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2019., 8954399, Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol. 2019-June, IEEE Computer Society, pp. 8004-8013, 32nd IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2019, Long Beach, United States, 6/16/19. https://doi.org/10.1109/CVPR.2019.00820

Transgaga: Geometry-aware unsupervised image-to-image translation. / Wu, Wayne; Cao, Kaidi; Li, Cheng et al.
Proceedings - 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2019. IEEE Computer Society, 2019. p. 8004-8013 8954399 (Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition; Vol. 2019-June).

TY - GEN

T1 - Transgaga

T2 - 32nd IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2019

AU - Wu, Wayne

AU - Cao, Kaidi

AU - Li, Cheng

AU - Qian, Chen

AU - Loy, Chen Change

PY - 2019/6

Y1 - 2019/6

N2 - Unsupervised image-to-image translation aims at learning a mapping between two visual domains. However, learning a translation across large geometry variations always ends up with failure. In this work, we present a novel disentangle-and-translate framework to tackle the complex objects image-to-image translation task. Instead of learning the mapping on the image space directly, we disentangle image space into a Cartesian product of the appearance and the geometry latent spaces. Specifically, we first introduce a geometry prior loss and a conditional VAE loss to encourage the network to learn independent but complementary representations. The translation is then built on appearance and geometry space separately. Extensive experiments demonstrate the superior performance of our method to other state-of-the-art approaches, especially in the challenging near-rigid and non-rigid objects translation tasks. In addition, by taking different exemplars as the appearance references, our method also supports multimodal translation. Project page: https://wywu.github.io/projects/TGaGa/TGaGa.html.

AB - Unsupervised image-to-image translation aims at learning a mapping between two visual domains. However, learning a translation across large geometry variations always ends up with failure. In this work, we present a novel disentangle-and-translate framework to tackle the complex objects image-to-image translation task. Instead of learning the mapping on the image space directly, we disentangle image space into a Cartesian product of the appearance and the geometry latent spaces. Specifically, we first introduce a geometry prior loss and a conditional VAE loss to encourage the network to learn independent but complementary representations. The translation is then built on appearance and geometry space separately. Extensive experiments demonstrate the superior performance of our method to other state-of-the-art approaches, especially in the challenging near-rigid and non-rigid objects translation tasks. In addition, by taking different exemplars as the appearance references, our method also supports multimodal translation. Project page: https://wywu.github.io/projects/TGaGa/TGaGa.html.

KW - And Body Pose

KW - Deep Learning

KW - Face

KW - Gesture

KW - Image and Video Synthesis

KW - Representation Learning

KW - Vision Applications and Syst

UR - http://www.scopus.com/inward/record.url?scp=85078743815&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85078743815&partnerID=8YFLogxK

U2 - 10.1109/CVPR.2019.00820

DO - 10.1109/CVPR.2019.00820

M3 - Conference contribution

AN - SCOPUS:85078743815

T3 - Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition

SP - 8004

EP - 8013

BT - Proceedings - 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2019

PB - IEEE Computer Society

Y2 - 16 June 2019 through 20 June 2019

ER -

Wu W, Cao K, Li C, Qian C, Loy CC. Transgaga: Geometry-aware unsupervised image-to-image translation. In Proceedings - 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2019. IEEE Computer Society. 2019. p. 8004-8013. 8954399. (Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition). doi: 10.1109/CVPR.2019.00820
