Slicing convolutional neural network for crowd video understanding

Jing Shao, Chen Change Loy, Kai Kang, Xiaogang Wang

Research output: Chapter in Book/Report/Conference proceedingConference contribution

80 Citations (Scopus)

Abstract

Learning and capturing both appearance and dynamic representations are pivotal for crowd video understanding. Convolutional Neural Networks (CNNs) have shown its remarkable potential in learning appearance representations from images. However, the learning of dynamic representation, and how it can be effectively combined with appearance features for video analysis, remains an open problem. In this study, we propose a novel spatio-temporal CNN, named Slicing CNN (S-CNN), based on the decomposition of 3D feature maps into 2D spatio-and 2D temporal-slices representations. The decomposition brings unique advantages: (1) the model is capable of capturing dynamics of different semantic units such as groups and objects, (2) it learns separated appearance and dynamic representations while keeping proper interactions between them, and (3) it exploits the selectiveness of spatial filters to discard irrelevant background clutter for crowd understanding. We demonstrate the effectiveness of the proposed S-CNN model on the WWW crowd video dataset for attribute recognition and observe significant performance improvements to the state-of-the-art methods (62.55% from 51.84% [21]).

Original languageEnglish
Title of host publicationProceedings - 29th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2016
PublisherIEEE Computer Society
Pages5620-5628
Number of pages9
ISBN (Electronic)9781467388504
DOIs
Publication statusPublished - Dec 9 2016
Externally publishedYes
Event29th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2016 - Las Vegas, United States
Duration: Jun 26 2016Jul 1 2016

Publication series

NameProceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
Volume2016-December
ISSN (Print)1063-6919

Conference

Conference29th IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2016
Country/TerritoryUnited States
CityLas Vegas
Period6/26/167/1/16

Bibliographical note

Publisher Copyright:
© 2016 IEEE.

ASJC Scopus Subject Areas

  • Software
  • Computer Vision and Pattern Recognition

Cite this