PSANet: Point-wise spatial attention network for scene parsing

Hengshuang Zhao; Yi Zhang; Shu Liu; Jianping Shi; Chen Change Loy; Dahua Lin; Jiaya Jia

doi:10.1007/978-3-030-01240-3_17

Corresponding author: Hengshuang Zhao

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

269 Citations (Scopus)

Abstract

We notice information flow in convolutional neural networks is restricted inside local neighborhood regions due to the physical design of convolutional filters, which limits the overall understanding of complex scenes. In this paper, we propose the point-wise spatial attention network (PSANet) to relax the local neighborhood constraint. Each position on the feature map is connected to all the other ones through a self-adaptively learned attention mask. Moreover, information propagation in bi-direction for scene parsing is enabled. Information at other positions can be collected to help the prediction of the current position and vice versa, information at the current position can be distributed to assist the prediction of other ones. Our proposed approach achieves top performance on various competitive scene parsing datasets, including ADE20K, PASCAL VOC 2012 and Cityscapes, demonstrating its effectiveness and generality.
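
The two directions of information flow described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes the feature map is flattened into a list of N positions, that the attention masks are given as plain N x N weight matrices rather than learned, and that the two results are concatenated per position. The function names `collect`, `distribute`, and `bidirectional` are hypothetical.

```python
# Illustrative sketch only, under the assumptions stated above.
# features: list of N positions, each a list of C floats.
# attn: N x N list of lists; row i holds the weights associated with position i.

def collect(features, attn):
    """Position i gathers from every position j, weighted by attn[i][j]."""
    n, c = len(features), len(features[0])
    return [
        [sum(attn[i][j] * features[j][k] for j in range(n)) for k in range(c)]
        for i in range(n)
    ]


def distribute(features, attn):
    """Position j sends its feature to every position i, weighted by attn[j][i]."""
    n, c = len(features), len(features[0])
    return [
        [sum(attn[j][i] * features[j][k] for j in range(n)) for k in range(c)]
        for i in range(n)
    ]


def bidirectional(features, attn_collect, attn_distribute):
    """Concatenate collected and distributed features (an assumed fusion step)."""
    gathered = collect(features, attn_collect)
    spread = distribute(features, attn_distribute)
    return [g + s for g, s in zip(gathered, spread)]
```

In the collect direction the weights in row i decide what position i receives; in the distribute direction the weights in row j decide where position j's feature goes, which is why the two functions index the mask in transposed order.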

Original language	English
Title of host publication	Computer Vision – ECCV 2018 - 15th European Conference, 2018, Proceedings
Editors	Martial Hebert, Vittorio Ferrari, Cristian Sminchisescu, Yair Weiss
Publisher	Springer Verlag
Pages	270-286
Number of pages	17
ISBN (Print)	9783030012397
DOIs	https://doi.org/10.1007/978-3-030-01240-3_17
Publication status	Published - 2018
Externally published	Yes
Event	15th European Conference on Computer Vision, ECCV 2018 - Munich, Germany; Duration: Sept 8 2018 → Sept 14 2018

Publication series

Name	Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume	11213 LNCS
ISSN (Print)	0302-9743
ISSN (Electronic)	1611-3349

Conference

Conference	15th European Conference on Computer Vision, ECCV 2018
Country/Territory	Germany
City	Munich
Period	9/8/18 → 9/14/18

Bibliographical note

Publisher Copyright:
© Springer Nature Switzerland AG 2018.

ASJC Scopus Subject Areas

Theoretical Computer Science
General Computer Science

Keywords

Adaptive context aggregation
Bi-direction information flow
Point-wise spatial attention
Scene parsing
Semantic segmentation

Cite this

Zhao, H., Zhang, Y., Liu, S., Shi, J., Loy, C. C., Lin, D., & Jia, J. (2018). PSANet: Point-wise spatial attention network for scene parsing. In M. Hebert, V. Ferrari, C. Sminchisescu, & Y. Weiss (Eds.), Computer Vision – ECCV 2018 - 15th European Conference, 2018, Proceedings (pp. 270-286). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 11213 LNCS). Springer Verlag. https://doi.org/10.1007/978-3-030-01240-3_17

Zhao, Hengshuang ; Zhang, Yi ; Liu, Shu et al. / PSANet : Point-wise spatial attention network for scene parsing. Computer Vision – ECCV 2018 - 15th European Conference, 2018, Proceedings. editor / Martial Hebert ; Vittorio Ferrari ; Cristian Sminchisescu ; Yair Weiss. Springer Verlag, 2018. pp. 270-286 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)).

@inproceedings{fbf21e1cd9a746ceac1bca557904b7f1,

title = "PSANet: Point-wise spatial attention network for scene parsing",

abstract = "We notice information flow in convolutional neural networks is restricted inside local neighborhood regions due to the physical design of convolutional filters, which limits the overall understanding of complex scenes. In this paper, we propose the point-wise spatial attention network (PSANet) to relax the local neighborhood constraint. Each position on the feature map is connected to all the other ones through a self-adaptively learned attention mask. Moreover, information propagation in bi-direction for scene parsing is enabled. Information at other positions can be collected to help the prediction of the current position and vice versa, information at the current position can be distributed to assist the prediction of other ones. Our proposed approach achieves top performance on various competitive scene parsing datasets, including ADE20K, PASCAL VOC 2012 and Cityscapes, demonstrating its effectiveness and generality.",

keywords = "Adaptive context aggregation, Bi-direction information flow, Point-wise spatial attention, Scene parsing, Semantic segmentation",

author = "Hengshuang Zhao and Yi Zhang and Shu Liu and Jianping Shi and Loy, {Chen Change} and Dahua Lin and Jiaya Jia",

note = "Publisher Copyright: {\textcopyright} Springer Nature Switzerland AG 2018. 15th European Conference on Computer Vision, ECCV 2018; Conference date: 08-09-2018 through 14-09-2018",

year = "2018",

doi = "10.1007/978-3-030-01240-3_17",

language = "English",

isbn = "9783030012397",

series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",

volume = "11213",

publisher = "Springer Verlag",

pages = "270--286",

editor = "Martial Hebert and Vittorio Ferrari and Cristian Sminchisescu and Yair Weiss",

booktitle = "Computer Vision – ECCV 2018 - 15th European Conference, 2018, Proceedings",

address = "Germany",

}

Zhao, H, Zhang, Y, Liu, S, Shi, J, Loy, CC, Lin, D & Jia, J 2018, PSANet: Point-wise spatial attention network for scene parsing. in M Hebert, V Ferrari, C Sminchisescu & Y Weiss (eds), Computer Vision – ECCV 2018 - 15th European Conference, 2018, Proceedings. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 11213 LNCS, Springer Verlag, pp. 270-286, 15th European Conference on Computer Vision, ECCV 2018, Munich, Germany, 9/8/18. https://doi.org/10.1007/978-3-030-01240-3_17

PSANet: Point-wise spatial attention network for scene parsing. / Zhao, Hengshuang; Zhang, Yi; Liu, Shu et al.
Computer Vision – ECCV 2018 - 15th European Conference, 2018, Proceedings. ed. / Martial Hebert; Vittorio Ferrari; Cristian Sminchisescu; Yair Weiss. Springer Verlag, 2018. p. 270-286 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 11213 LNCS).

TY - GEN

T1 - PSANet: Point-wise spatial attention network for scene parsing

T2 - 15th European Conference on Computer Vision, ECCV 2018

AU - Zhao, Hengshuang

AU - Zhang, Yi

AU - Liu, Shu

AU - Shi, Jianping

AU - Loy, Chen Change

AU - Lin, Dahua

AU - Jia, Jiaya

N1 - Publisher Copyright: © Springer Nature Switzerland AG 2018.

PY - 2018

Y1 - 2018

AB - We notice information flow in convolutional neural networks is restricted inside local neighborhood regions due to the physical design of convolutional filters, which limits the overall understanding of complex scenes. In this paper, we propose the point-wise spatial attention network (PSANet) to relax the local neighborhood constraint. Each position on the feature map is connected to all the other ones through a self-adaptively learned attention mask. Moreover, information propagation in bi-direction for scene parsing is enabled. Information at other positions can be collected to help the prediction of the current position and vice versa, information at the current position can be distributed to assist the prediction of other ones. Our proposed approach achieves top performance on various competitive scene parsing datasets, including ADE20K, PASCAL VOC 2012 and Cityscapes, demonstrating its effectiveness and generality.

KW - Adaptive context aggregation

KW - Bi-direction information flow

KW - Point-wise spatial attention

KW - Scene parsing

KW - Semantic segmentation

UR - http://www.scopus.com/inward/record.url?scp=85055131955&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85055131955&partnerID=8YFLogxK

U2 - 10.1007/978-3-030-01240-3_17

DO - 10.1007/978-3-030-01240-3_17

M3 - Conference contribution

AN - SCOPUS:85055131955

SN - 9783030012397

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 270

EP - 286

BT - Computer Vision – ECCV 2018 - 15th European Conference, 2018, Proceedings

A2 - Hebert, Martial

A2 - Ferrari, Vittorio

A2 - Sminchisescu, Cristian

A2 - Weiss, Yair

PB - Springer Verlag

Y2 - 8 September 2018 through 14 September 2018

ER -

Zhao H, Zhang Y, Liu S, Shi J, Loy CC, Lin D et al. PSANet: Point-wise spatial attention network for scene parsing. In Hebert M, Ferrari V, Sminchisescu C, Weiss Y, editors, Computer Vision – ECCV 2018 - 15th European Conference, 2018, Proceedings. Springer Verlag. 2018. p. 270-286. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)). doi: 10.1007/978-3-030-01240-3_17
