Control Color: Multimodal Diffusion-Based Interactive Image Colorization

Zhexin Liang, Zhaochen Li, Shangchen Zhou, Chongyi Li, Chen Change Loy*

*Corresponding author for this work

Research output: Contribution to journal, Article, peer-review

Abstract

Despite the many existing colorization methods, several limitations remain, such as a lack of user interaction, inflexibility in local colorization, unnatural color rendering, insufficient color variation, and color overflow. To address these issues, we introduce Control Color (CtrlColor), a multi-modal colorization method that leverages the pre-trained Stable Diffusion (SD) model, offering promising capabilities in highly controllable interactive image colorization (Fig. 1). While several diffusion-based methods have been proposed, supporting colorization in multiple modalities remains non-trivial. In this study, we aim to tackle both unconditional and conditional image colorization (conditioned on text prompts, strokes, or exemplars) and to address color overflow and incorrect colors within a unified framework. Apart from accepting text prompts as conditions, we present an effective way to encode user strokes to enable precise local color manipulation and employ a practical method to learn the implicit color distribution in exemplars, adding versatility to our approach. We also introduce a novel module based on self-attention and a content-guided deformable autoencoder to address the long-standing issues of color overflow and inaccurate coloring. Extensive comparisons show that our model outperforms state-of-the-art image colorization methods both qualitatively and quantitatively. Project page: https://zhexinliang.github.io/Control_Color/

Original language: English
Journal: International Journal of Computer Vision
Publication status: Accepted/In press - 2025
Externally published: Yes

Bibliographical note

Publisher Copyright:
© The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2025.

ASJC Scopus Subject Areas

  • Software
  • Computer Vision and Pattern Recognition
  • Artificial Intelligence

Keywords

  • Diffusion model
  • Image colorization
  • Interactive control
  • Multi-modality
