Reference-Based Image and Video Super-Resolution via C2-Matching

Yuming Jiang; Kelvin C.K. Chan; Xintao Wang; Chen Change Loy; Ziwei Liu

doi:10.1109/TPAMI.2022.3231089

Yuming Jiang, Kelvin C.K. Chan, Xintao Wang, Chen Change Loy, Ziwei Liu*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

10 Citations (Scopus)

Abstract

Reference-based Super-Resolution (Ref-SR) has recently emerged as a promising paradigm to enhance a low-resolution (LR) input image or video by introducing an additional high-resolution (HR) reference image. Existing Ref-SR methods mostly rely on implicit correspondence matching to borrow HR textures from reference images to compensate for the information loss in input images. However, performing local transfer is difficult because of two gaps between input and reference images: the transformation gap (e.g., scale and rotation) and the resolution gap (e.g., HR and LR). To tackle these challenges, we propose C2-Matching in this work, which performs explicit robust matching crossing transformation and resolution. 1) To bridge the transformation gap, we propose a contrastive correspondence network, which learns transformation-robust correspondences using augmented views of the input image. 2) To address the resolution gap, we adopt teacher-student correlation distillation, which distills knowledge from the easier HR-HR matching to guide the more ambiguous LR-HR matching. 3) Finally, we design a dynamic aggregation module to address the potential misalignment issue between input images and reference images. In addition, to faithfully evaluate the performance of Reference-based Image Super-Resolution (Ref Image SR) under a realistic setting, we contribute the Webly-Referenced SR (WR-SR) dataset, which mimics the practical usage scenario. We also extend C2-Matching to the Reference-based Video Super-Resolution (Ref VSR) task, where an image taken in a similar scene serves as the HR reference image. Extensive experiments demonstrate that our proposed C2-Matching significantly outperforms state-of-the-art methods by up to 0.7 dB on the standard CUFED5 benchmark and also boosts the performance of video super-resolution when the C2-Matching component is incorporated into Video SR pipelines. Notably, C2-Matching also shows great generalizability on the WR-SR dataset as well as robustness across large scale and rotation transformations. Code and datasets are available at https://github.com/yumingj/C2-Matching.
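
The two matching ideas named in the abstract, explicit correspondence matching and teacher-student correlation distillation, can be illustrated with a small sketch. This is not the authors' implementation (that lives in the repository linked above); it is a toy version under stated assumptions: features are plain lists of floats rather than learned network outputs, similarity is cosine similarity, the distillation loss is a KL divergence between softmax-normalized correlation rows, and every function name and the temperature value are hypothetical choices made for illustration.

```python
import math


def l2_normalize(vec):
    """Scale a feature vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def correlation(query_feats, ref_feats):
    """Cosine similarity between every query feature and every reference feature."""
    q = [l2_normalize(v) for v in query_feats]
    r = [l2_normalize(v) for v in ref_feats]
    return [[sum(a * b for a, b in zip(qv, rv)) for rv in r] for qv in q]


def match(corr):
    """Explicit matching: for each query position, pick the most similar reference position."""
    return [max(range(len(row)), key=row.__getitem__) for row in corr]


def softmax(row, temperature):
    """Numerically stable softmax of one correlation row."""
    m = max(row)
    exps = [math.exp((x - m) / temperature) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_corr, student_corr, temperature=0.1):
    """Mean KL(teacher || student) over query positions.

    The teacher correlation stands in for HR-HR matching and the student
    correlation for LR-HR matching; the temperature is an arbitrary choice.
    """
    loss = 0.0
    for t_row, s_row in zip(teacher_corr, student_corr):
        p = softmax(t_row, temperature)
        q = softmax(s_row, temperature)
        loss += sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
    return loss / len(teacher_corr)


# Toy features: three reference positions and two query positions.
ref = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
hr_query = [[0.9, 0.1], [0.1, 0.9]]   # sharp features, as in HR-HR matching
lr_query = [[0.6, 0.4], [0.5, 0.6]]   # blurred features, as in LR-HR matching

teacher = correlation(hr_query, ref)
student = correlation(lr_query, ref)
print(match(teacher), match(student))
print(distillation_loss(teacher, student))
```

In this toy setup the blurred query features drift toward the same reference position, which is the kind of ambiguity the distillation term is meant to penalize: the loss is zero when the student reproduces the teacher's correlation and grows as the two diverge.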

Original language	English
Pages (from-to)	8874-8887
Number of pages	14
Journal	IEEE Transactions on Pattern Analysis and Machine Intelligence
Volume	45
Issue number	7
DOIs	https://doi.org/10.1109/TPAMI.2022.3231089
Publication status	Published - Jul 1 2023
Externally published	Yes

Bibliographical note

Publisher Copyright:
© 1979-2012 IEEE.

ASJC Scopus Subject Areas

Software
Computer Vision and Pattern Recognition
Computational Theory and Mathematics
Artificial Intelligence
Applied Mathematics

Keywords

Image super-resolution
reference-based super-resolution
video super-resolution

Cite this

@article{09b9a28560854aacb3c7aaf353642956,

title = "Reference-Based Image and Video Super-Resolution via C2-Matching",

abstract = "Reference-based Super-Resolution (Ref-SR) has recently emerged as a promising paradigm to enhance a low-resolution (LR) input image or video by introducing an additional high-resolution (HR) reference image. Existing Ref-SR methods mostly rely on implicit correspondence matching to borrow HR textures from reference images to compensate for the information loss in input images. However, performing local transfer is difficult because of two gaps between input and reference images: the transformation gap (e.g., scale and rotation) and the resolution gap (e.g., HR and LR). To tackle these challenges, we propose C2-Matching in this work, which performs explicit robust matching crossing transformation and resolution. 1) To bridge the transformation gap, we propose a contrastive correspondence network, which learns transformation-robust correspondences using augmented views of the input image. 2) To address the resolution gap, we adopt teacher-student correlation distillation, which distills knowledge from the easier HR-HR matching to guide the more ambiguous LR-HR matching. 3) Finally, we design a dynamic aggregation module to address the potential misalignment issue between input images and reference images. In addition, to faithfully evaluate the performance of Reference-based Image Super-Resolution (Ref Image SR) under a realistic setting, we contribute the Webly-Referenced SR (WR-SR) dataset, which mimics the practical usage scenario. We also extend C2-Matching to the Reference-based Video Super-Resolution (Ref VSR) task, where an image taken in a similar scene serves as the HR reference image. Extensive experiments demonstrate that our proposed C2-Matching significantly outperforms state-of-the-art methods by up to 0.7 dB on the standard CUFED5 benchmark and also boosts the performance of video super-resolution when the C2-Matching component is incorporated into Video SR pipelines. Notably, C2-Matching also shows great generalizability on the WR-SR dataset as well as robustness across large scale and rotation transformations. Code and datasets are available at https://github.com/yumingj/C2-Matching.",

keywords = "Image super-resolution, reference-based super-resolution, video super-resolution",

author = "Yuming Jiang and Chan, {Kelvin C.K.} and Xintao Wang and Loy, {Chen Change} and Ziwei Liu",

note = "Publisher Copyright: {\textcopyright} 1979-2012 IEEE.",

year = "2023",

month = jul,

day = "1",

doi = "10.1109/TPAMI.2022.3231089",

language = "English",

volume = "45",

pages = "8874--8887",

journal = "IEEE Transactions on Pattern Analysis and Machine Intelligence",

issn = "0162-8828",

publisher = "IEEE Computer Society",

number = "7",

}

TY - JOUR

T1 - Reference-Based Image and Video Super-Resolution via C2-Matching

AU - Jiang, Yuming

AU - Chan, Kelvin C.K.

AU - Wang, Xintao

AU - Loy, Chen Change

AU - Liu, Ziwei

PY - 2023/7/1

Y1 - 2023/7/1

AB - Reference-based Super-Resolution (Ref-SR) has recently emerged as a promising paradigm to enhance a low-resolution (LR) input image or video by introducing an additional high-resolution (HR) reference image. Existing Ref-SR methods mostly rely on implicit correspondence matching to borrow HR textures from reference images to compensate for the information loss in input images. However, performing local transfer is difficult because of two gaps between input and reference images: the transformation gap (e.g., scale and rotation) and the resolution gap (e.g., HR and LR). To tackle these challenges, we propose C2-Matching in this work, which performs explicit robust matching crossing transformation and resolution. 1) To bridge the transformation gap, we propose a contrastive correspondence network, which learns transformation-robust correspondences using augmented views of the input image. 2) To address the resolution gap, we adopt teacher-student correlation distillation, which distills knowledge from the easier HR-HR matching to guide the more ambiguous LR-HR matching. 3) Finally, we design a dynamic aggregation module to address the potential misalignment issue between input images and reference images. In addition, to faithfully evaluate the performance of Reference-based Image Super-Resolution (Ref Image SR) under a realistic setting, we contribute the Webly-Referenced SR (WR-SR) dataset, which mimics the practical usage scenario. We also extend C2-Matching to the Reference-based Video Super-Resolution (Ref VSR) task, where an image taken in a similar scene serves as the HR reference image. Extensive experiments demonstrate that our proposed C2-Matching significantly outperforms state-of-the-art methods by up to 0.7 dB on the standard CUFED5 benchmark and also boosts the performance of video super-resolution when the C2-Matching component is incorporated into Video SR pipelines. Notably, C2-Matching also shows great generalizability on the WR-SR dataset as well as robustness across large scale and rotation transformations. Code and datasets are available at https://github.com/yumingj/C2-Matching.

KW - Image super-resolution

KW - reference-based super-resolution

KW - video super-resolution

UR - http://www.scopus.com/inward/record.url?scp=85146222873&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85146222873&partnerID=8YFLogxK

U2 - 10.1109/TPAMI.2022.3231089

DO - 10.1109/TPAMI.2022.3231089

M3 - Article

C2 - 37015431

AN - SCOPUS:85146222873

SN - 0162-8828

VL - 45

SP - 8874

EP - 8887

JO - IEEE Transactions on Pattern Analysis and Machine Intelligence

JF - IEEE Transactions on Pattern Analysis and Machine Intelligence

IS - 7

ER -