Self-supervised learning via conditional motion propagation

Xiaohang Zhan; Xingang Pan; Ziwei Liu; Dahua Lin; Chen Change Loy

doi:10.1109/CVPR.2019.00198

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

43 Citations (Scopus)

Abstract

Intelligent agents naturally learn from motion. Various self-supervised algorithms have leveraged motion cues to learn effective visual representations. The hurdle is that motion is both ambiguous and complex, so previous works either suffer from degraded learning efficacy or resort to strong assumptions about object motion. In this work, we design a new learning-from-motion paradigm to bridge these gaps. Instead of explicitly modeling the motion probabilities, we design the pretext task as a conditional motion propagation problem. Given an input image and sparse flow guidance on it, our framework seeks to recover the full-image motion. Compared to the alternatives, our framework has several appealing properties: (1) Using sparse flow guidance during training resolves the inherent motion ambiguity, thus easing feature learning. (2) Solving the pretext task of conditional motion propagation encourages the emergence of kinematically sound representations that possess greater expressive power. Extensive experiments demonstrate that our framework learns structural and coherent features and achieves state-of-the-art self-supervision performance on several downstream tasks, including semantic segmentation, instance segmentation, and human parsing. Furthermore, our framework is successfully extended to several useful applications such as semi-automatic pixel-level annotation.
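
The input/target structure of the pretext task described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the uniform random sampling of guidance points, the hypothetical helper names `sample_sparse_guidance` and `endpoint_error`, the extra mask channel, and the use of average endpoint error as the training signal. No network is trained; the sketch only shows how a dense flow field could be reduced to sparse guidance and how a prediction would be compared against the full-image motion.

```python
import numpy as np


def sample_sparse_guidance(dense_flow, num_points, rng):
    """Keep the flow vectors at `num_points` randomly chosen pixels.

    Uniform random sampling is an assumption for illustration only.
    Returns the sparse flow (zeros elsewhere) and a binary mask that
    marks the pixels where guidance is provided.
    """
    h, w, _ = dense_flow.shape
    flat_idx = rng.choice(h * w, size=num_points, replace=False)
    ys, xs = np.unravel_index(flat_idx, (h, w))
    sparse_flow = np.zeros_like(dense_flow)
    mask = np.zeros((h, w, 1), dtype=dense_flow.dtype)
    sparse_flow[ys, xs] = dense_flow[ys, xs]
    mask[ys, xs] = 1.0
    return sparse_flow, mask


def endpoint_error(pred_flow, target_flow):
    """Average Euclidean distance between predicted and target flow vectors."""
    return float(np.mean(np.linalg.norm(pred_flow - target_flow, axis=-1)))


rng = np.random.default_rng(0)
# Stand-in for a dense optical flow field of shape (height, width, 2).
dense_flow = rng.normal(size=(32, 48, 2)).astype(np.float32)
sparse_flow, mask = sample_sparse_guidance(dense_flow, num_points=10, rng=rng)

# Guidance tensor that a model would receive together with the image;
# the model's target would be the full `dense_flow`.
guidance = np.concatenate([sparse_flow, mask], axis=-1)
```

In this sketch a model would take the image plus `guidance` and output a flow field that is scored against `dense_flow`, for example with `endpoint_error`.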

Original language	English
Title of host publication	Proceedings - 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2019
Publisher	IEEE Computer Society
Pages	1881-1889
Number of pages	9
ISBN (Electronic)	9781728132938
DOIs	https://doi.org/10.1109/CVPR.2019.00198
Publication status	Published - Jun 2019
Externally published	Yes
Event	32nd IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2019 - Long Beach, United States
Duration	Jun 16 2019 → Jun 20 2019

Publication series

Name	Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
Volume	2019-June
ISSN (Print)	1063-6919

Conference

Conference	32nd IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2019
Country/Territory	United States
City	Long Beach
Period	6/16/19 → 6/20/19

Bibliographical note

Publisher Copyright:
© 2019 IEEE.

ASJC Scopus Subject Areas

Software
Computer Vision and Pattern Recognition

Keywords

Representation Learning
Scene Analysis and Understanding

Access to Document

10.1109/CVPR.2019.00198

Cite this

Zhan, X., Pan, X., Liu, Z., Lin, D., & Loy, C. C. (2019). Self-supervised learning via conditional motion propagation. In Proceedings - 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2019 (pp. 1881-1889). Article 8954463 (Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition; Vol. 2019-June). IEEE Computer Society. https://doi.org/10.1109/CVPR.2019.00198

@inproceedings{00e570cd39c74c6197d800bc28bf61e4,

title = "Self-supervised learning via conditional motion propagation",

keywords = "Representation Learning, Scene Analysis and Understanding",

author = "Xiaohang Zhan and Xingang Pan and Ziwei Liu and Dahua Lin and Loy, {Chen Change}",

note = "Publisher Copyright: {\textcopyright} 2019 IEEE.; 32nd IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2019 ; Conference date: 16-06-2019 Through 20-06-2019",

year = "2019",

month = jun,

doi = "10.1109/CVPR.2019.00198",

language = "English",

series = "Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition",

publisher = "IEEE Computer Society",

pages = "1881--1889",

booktitle = "Proceedings - 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2019",

address = "United States",

}

Zhan, X, Pan, X, Liu, Z, Lin, D & Loy, CC 2019, Self-supervised learning via conditional motion propagation. in Proceedings - 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2019., 8954463, Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol. 2019-June, IEEE Computer Society, pp. 1881-1889, 32nd IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2019, Long Beach, United States, 6/16/19. https://doi.org/10.1109/CVPR.2019.00198

Self-supervised learning via conditional motion propagation. / Zhan, Xiaohang; Pan, Xingang; Liu, Ziwei et al.
Proceedings - 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2019. IEEE Computer Society, 2019. p. 1881-1889 8954463 (Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition; Vol. 2019-June).

TY - GEN

T1 - Self-supervised learning via conditional motion propagation

AU - Zhan, Xiaohang

AU - Pan, Xingang

AU - Liu, Ziwei

AU - Lin, Dahua

AU - Loy, Chen Change

PY - 2019/6

Y1 - 2019/6

KW - Representation Learning

KW - Scene Analysis and Understanding

UR - http://www.scopus.com/inward/record.url?scp=85072047797&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85072047797&partnerID=8YFLogxK

U2 - 10.1109/CVPR.2019.00198

DO - 10.1109/CVPR.2019.00198

M3 - Conference contribution

AN - SCOPUS:85072047797

T3 - Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition

SP - 1881

EP - 1889

BT - Proceedings - 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2019

PB - IEEE Computer Society

T2 - 32nd IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2019

Y2 - 16 June 2019 through 20 June 2019

ER -

Zhan X, Pan X, Liu Z, Lin D, Loy CC. Self-supervised learning via conditional motion propagation. In Proceedings - 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2019. IEEE Computer Society. 2019. p. 1881-1889. 8954463. (Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition). doi: 10.1109/CVPR.2019.00198
