Uncertainty-Aware Source-Free Adaptive Image Super-Resolution with Wavelet Augmentation Transformer

CVPR 2024

1MAIS & CRIPAC, CASIA, 2University of Chinese Academy of Sciences, 3University of Science and Technology of China, 4OPPO Research Institute, 5The Hong Kong Polytechnic University, 6ShanghaiTech University

SODA-SR achieves state-of-the-art performance without source data.

Abstract

Unsupervised Domain Adaptation (UDA) can effectively address domain gap issues in real-world image Super-Resolution (SR) by accessing both the source and target data. Considering privacy policies or transmission restrictions of source data in practical scenarios, we propose a SOurce-free Domain Adaptation framework for image SR (SODA-SR) to address this issue, i.e., adapt a source-trained model to a target domain with only unlabeled target data. SODA-SR leverages the source-trained model to generate refined pseudo-labels for teacher-student learning. To better utilize pseudo-labels, we propose a novel wavelet-based augmentation method, named Wavelet Augmentation Transformer (WAT), which can be flexibly incorporated with existing networks, to implicitly produce useful augmented data. WAT learns low-frequency information of varying levels across diverse samples, which is aggregated efficiently via deformable attention. Furthermore, an uncertainty-aware self-training mechanism is proposed to improve the accuracy of pseudo-labels, with inaccurate predictions being rectified by uncertainty estimation. To acquire better SR results and avoid overfitting pseudo-labels, several regularization losses are proposed to constrain target LR and SR images in the frequency domain. Experiments show that without accessing source data, SODA-SR outperforms state-of-the-art UDA methods in both synthetic→real and real→real adaptation settings, and is not constrained by specific network architectures.

Method

Framework

One target LR input image, together with its seven geometrically augmented variants (i.e., rotations and flips of the input), is fed into the teacher model to generate the refined pseudo-label. The Softmax normalization in the teacher model is replaced by Gumbel-Softmax. For each LR input image, the teacher model runs multiple times to generate N pseudo-labels, whose per-pixel mean and variance are computed for uncertainty estimation.
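The pipeline above can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: `teacher` is a hypothetical callable standing in for the source-trained SR model (whose stochasticity in the paper comes from Gumbel-Softmax sampling inside the network), and images are assumed to be `(C, H, W)` arrays.

```python
import numpy as np

def geometric_augment(x):
    """Return the 8 geometric variants of an image: 4 rotations,
    each with and without a horizontal flip."""
    variants = []
    for k in range(4):
        r = np.rot90(x, k, axes=(-2, -1))
        variants.append(r)
        variants.append(np.flip(r, axis=-1))
    return variants

def inverse_augment(y, idx):
    """Undo the idx-th augmentation produced by geometric_augment,
    mapping a prediction back to the original orientation."""
    k, flipped = idx // 2, idx % 2 == 1
    if flipped:
        y = np.flip(y, axis=-1)          # undo the flip first (applied last)
    return np.rot90(y, -k, axes=(-2, -1))  # then undo the rotation

def refined_pseudo_label(teacher, lr, n_passes=4):
    """For each of n_passes stochastic teacher passes, average the
    predictions over all 8 variants (mapped back to the original
    orientation); then return the per-pixel mean as the refined
    pseudo-label and the per-pixel variance as the uncertainty."""
    samples = []
    for _ in range(n_passes):
        outs = [inverse_augment(teacher(v), i)
                for i, v in enumerate(geometric_augment(lr))]
        samples.append(np.mean(np.stack(outs), axis=0))
    stack = np.stack(samples)
    return stack.mean(axis=0), stack.var(axis=0)
```

With a deterministic teacher the variance collapses to zero; in the actual framework the Gumbel-Softmax noise makes each pass differ, so the variance map highlights unreliable pixels.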


Visual Illustration of UE

Pixels with higher error in the pseudo-label are assigned lower confidence, which demonstrates the effectiveness of uncertainty estimation (UE).
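One common way to turn such a variance map into per-pixel confidence is an exponential down-weighting of the supervision loss. The sketch below uses `exp(-variance)` as the confidence, which is an illustrative choice, not necessarily the paper's exact rectification formula.

```python
import numpy as np

def uncertainty_weighted_l1(student_sr, pseudo_label, variance):
    """Per-pixel L1 loss against the pseudo-label, down-weighted where the
    teacher's predictions are uncertain (high variance -> low confidence).
    The exp(-variance) mapping is an illustrative assumption."""
    confidence = np.exp(-variance)
    return float(np.mean(confidence * np.abs(student_sr - pseudo_label)))
```

Pixels where the teacher's stochastic passes disagree contribute less to the student's update, so unreliable pseudo-label regions are effectively rectified rather than memorized.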


LAM Illustration of WAT

The results indicate that WAT enables the SR model to utilize a wider range of pixels for reconstruction.


Experiments

Quantitative Comparison

Comparison with state-of-the-art UDA methods for ×4 SR. Across six real→real tasks, our method achieves the best PSNR and SSIM and the second-best LPIPS.


Comparison with typical self-supervised SR methods for ×4 SR. Since our method preserves the domain-invariant knowledge of the pre-trained source model and reduces the cross-domain discrepancy through model adaptation, it performs much better than the self-supervised SR methods.


Qualitative Comparison

Visual comparison for ×4 SR on the DRealSR dataset (Sony→Panasonic). Our method not only recovers the correct structure of the buildings but also generates clear details, whereas other methods may produce deformed structures and blurry results. Our results are closest to those of models trained with target labels.


Contact

If you have any questions, please feel free to contact Yuang Ai at shallowdream555@gmail.com.

BibTeX

@InProceedings{ai2024sodasr,
  author    = {Ai, Yuang and Zhou, Xiaoqiang and Huang, Huaibo and Zhang, Lei and He, Ran},
  title     = {Uncertainty-Aware Source-Free Adaptive Image Super-Resolution with Wavelet Augmentation Transformer},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  month     = {June},
  year      = {2024},
  pages     = {8142--8152}
}