Abstract
This paper presents Dense Siamese Network (DenseSiam), a simple unsupervised learning framework for dense prediction tasks. It learns visual representations by maximizing the similarity between two views of one image under two types of consistency: pixel consistency and region consistency. Concretely, DenseSiam first maximizes pixel-level spatial consistency according to the exact location correspondence in the overlapped area. It also extracts a batch of region embeddings, corresponding to sub-regions of the overlapped area, which are contrasted for region consistency. In contrast to previous methods that require negative pixel pairs, momentum encoders, or heuristic masks, DenseSiam benefits from the simple Siamese network and optimizes consistency at different granularities. It also shows that simple location correspondence and interacted region embeddings are effective enough to learn the similarity. We apply DenseSiam to ImageNet and obtain competitive improvements on various downstream tasks. We also show that, with only a few extra task-specific losses, the simple framework can directly perform dense prediction tasks. On an existing unsupervised semantic segmentation benchmark, it surpasses state-of-the-art segmentation methods by 2.1 mIoU at 28% of the training cost. Code and models are released at https://github.com/ZwwWayne/DenseSiam.
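The pixel-consistency idea in the abstract can be illustrated with a small sketch. It is not the released implementation. It assumes two views are rectangular crops of one image, that feature maps have the same resolution as the crops (stride 1), and that similarity is measured with negative cosine similarity, a common choice in Siamese frameworks that the abstract does not specify. The function names `overlap_box`, `cosine`, and `pixel_consistency_loss` are hypothetical, and region consistency and any training machinery are left out.

```python
import math


def overlap_box(box_a, box_b):
    """Intersection of two crop boxes (x0, y0, x1, y1) in image coordinates.

    Returns None when the two views do not overlap.
    """
    x0, y0 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    x1, y1 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    return (x0, y0, x1, y1) if x0 < x1 and y0 < y1 else None


def cosine(u, v):
    """Cosine similarity of two feature vectors (0.0 if either has zero norm)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def pixel_consistency_loss(feat_a, box_a, feat_b, box_b):
    """Mean negative cosine similarity over pixels shared by both views.

    feat_x[row][col] is the feature vector of view x at that location.
    Assumption: features are at stride 1, so image coordinates map to
    view-local coordinates by subtracting the crop's top-left corner.
    """
    box = overlap_box(box_a, box_b)
    if box is None:
        return 0.0
    x0, y0, x1, y1 = box
    total, count = 0.0, 0
    for y in range(y0, y1):
        for x in range(x0, x1):
            # Same image location, looked up in each view's own coordinates.
            u = feat_a[y - box_a[1]][x - box_a[0]]
            v = feat_b[y - box_b[1]][x - box_b[0]]
            total -= cosine(u, v)
            count += 1
    return total / count


# Toy usage: a 4x4 "image" whose per-pixel feature encodes its position.
image = [[[x + 1.0, y + 1.0] for x in range(4)] for y in range(4)]


def crop(box):
    """Cut a view out of the toy image (stands in for an encoder output)."""
    return [row[box[0]:box[2]] for row in image[box[1]:box[3]]]


view_a, view_b = (0, 0, 3, 3), (1, 1, 4, 4)
loss = pixel_consistency_loss(crop(view_a), view_a, crop(view_b), view_b)
```

In this toy case both views carry identical features at corresponding locations, so the loss sits at its minimum. With real encoder outputs, the two views would differ through augmentation and the loss would pull corresponding pixels together.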
Original language | English |
---|---|
Title of host publication | Computer Vision – ECCV 2022 - 17th European Conference, Proceedings |
Editors | Shai Avidan, Gabriel Brostow, Moustapha Cissé, Giovanni Maria Farinella, Tal Hassner |
Publisher | Springer Science and Business Media Deutschland GmbH |
Pages | 464-480 |
Number of pages | 17 |
ISBN (Print) | 9783031200557 |
Publication status | Published - 2022 |
Externally published | Yes |
Event | 17th European Conference on Computer Vision, ECCV 2022 - Tel Aviv, Israel. Duration: Oct 23 2022 → Oct 27 2022 |
Publication series
Name | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
---|---|
Volume | 13690 LNCS |
ISSN (Print) | 0302-9743 |
ISSN (Electronic) | 1611-3349 |
Conference
Conference | 17th European Conference on Computer Vision, ECCV 2022 |
---|---|
Country/Territory | Israel |
City | Tel Aviv |
Period | 10/23/22 → 10/27/22 |
Bibliographical note
Publisher Copyright: © 2022, The Author(s), under exclusive license to Springer Nature Switzerland AG.
ASJC Scopus Subject Areas
- Theoretical Computer Science
- General Computer Science