Diffusion-Based Adaptation for Classification of Unknown Degraded Images

1Tokyo Institute of Technology
2Teikyo Heisei University
CVPR Workshops 2024
Figure 1: Architecture of the proposed method. The top block shows the overall inference process, and the bottom block shows the training process, which uses knowledge distillation from pre-trained teacher networks. The symbol © denotes concatenation of the inputs. Grey/blue and orange/blue blocks represent the pre-trained teacher and student networks, respectively.

Abstract

Classification of unknown degraded images is essential in practical applications, since image degradation models are usually unknown. Diffusion-based models provide enhanced performance for image enhancement and restoration from degraded images. In this study, we use a diffusion-based model for adaptation instead of restoration. Restoration from a degraded image aims to recover the degradation-free clean image, while adaptation transforms the degraded image toward the clean image domain. However, diffusion models struggle to perform image adaptation under certain degradations because the degradation models are unknown. To address the issue of imperfectly adapted clean images from diffusion models in the classification of degraded images, we propose a novel Diffusion-based Adaptation for Unknown Degraded images (DiffAUD) method based on robust classifiers trained on a few known degradations. Our proposed method complements diffusion models and consistently generalizes well across different types of degradations with varying severities. DiffAUD improves performance over the baseline diffusion model and clean classifier on the ImageNet-C dataset by 5.5%, 5%, and 5% with ResNet-50, Swin Transformer (Tiny), and ConvNeXt-Tiny backbones, respectively. Moreover, we show that training classifiers using known degradations provides significant performance gains for classifying degraded images.

Method

Motivation: The performance of typical classifiers drops significantly under unknown degradations. Hence, we employ a DDPM to shift degraded images toward the domain of clean images. We inherently assume that the domain of adapted images is better suited for classification than the unknown degraded images themselves. Indeed, previous studies such as DDA have shown that a DDPM helps improve the classification of unknown degraded images. The DDA method applies an ensemble of classifiers trained on clean images to both the degraded and adapted inputs to mitigate the limitations of imperfectly adapted images. In contrast, we train two separate classifiers on adapted and degraded images, which substantially improves classification performance for both. In particular, a classifier trained on adapted images from a limited set of known degradations anticipates imperfections in the adapted image, thereby contributing to the robustness of our proposed method. Similarly, a classifier trained on degraded images from a few dissimilar known degradations helps our method handle degraded images directly.

Notations: We categorize three types of training images, i.e., clean, degraded, and adapted, denoted x_c, x_d, and x_a, respectively. Clean images are natural images without degradation, degraded images are synthesized using a specific degradation model, and adapted images are sampled by applying a DDPM to degraded images. Furthermore, there are two types of classifiers in our study, i.e., simple classifiers and distilled classifiers. Both are trained using image and label pairs, where the images are clean, degraded, or adapted. We denote simple classifiers trained with clean, adapted, and degraded images as f_c, f_a, and f_d, respectively; likewise, distilled classifiers trained using clean, adapted, and degraded images are denoted g_c, g_a, and g_d, respectively. In addition, two other symbols are used in our study: D denotes the DDPM process for adaptation, such as the one described in DDA, and E denotes the ensemble, which combines the outputs of a set of distinct classifiers.
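The distilled classifiers are trained via knowledge distillation from pre-trained teacher networks (Figure 1), but the exact objective is not spelled out on this page. The sketch below therefore uses the standard distillation loss, combining hard-label cross-entropy with a temperature-softened KL term between teacher and student distributions; the function name and the `alpha`/`T` weighting are illustrative assumptions, not the paper's stated values.

```python
import numpy as np

def log_softmax(z):
    """Numerically stable log-softmax over the last axis."""
    z = z - z.max(axis=-1, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=-1, keepdims=True))

def distillation_loss(student_logits, teacher_logits, label, alpha=0.5, T=2.0):
    """Hard-label cross-entropy plus temperature-softened KL divergence
    to the teacher's output distribution (standard knowledge distillation)."""
    ce = -log_softmax(student_logits)[label]             # hard-label term
    log_ps = log_softmax(student_logits / T)             # softened student
    log_pt = log_softmax(teacher_logits / T)             # softened teacher
    p_t = np.exp(log_pt)
    kl = float(np.sum(p_t * (log_pt - log_ps))) * T * T  # soft-target term
    return alpha * ce + (1.0 - alpha) * kl
```

When the student matches the teacher exactly, the KL term vanishes and only the cross-entropy on the true label remains, so the student is still anchored to the ground-truth annotations while absorbing the teacher's softened predictions.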

Proposed Method: We propose DiffAUD, i.e., diffusion-based adaptation for unknown degraded images, as illustrated in Figure 1. The top block shows the overall process for classifying degraded images, which applies a diffusion model followed by an ensemble of the two distilled classifiers trained on adapted and degraded images to obtain the final classification prediction. To form the ensemble, we sum the logits of the two classifiers before the softmax function and take the highest-scoring class as the prediction. Our proposed method consists of three steps:
  1. Apply the DDPM to the degraded image to yield an adapted image.
  2. Feed the adapted image to a distilled classifier trained on adapted images from known degradations; in parallel, feed the degraded image directly to a distilled classifier trained on known degraded images.
  3. Apply the ensemble to the outputs of the two distilled classifiers to obtain the predicted class.
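The three steps above can be sketched as follows. Here `adapt`, `g_a`, and `g_d` are placeholders for the DDPM adaptation process and the two distilled classifiers (the names are illustrative), and the ensemble sums logits before the softmax as described:

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over the last axis."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def diffaud_predict(x_d, adapt, g_a, g_d):
    """DiffAUD inference: adapt the degraded image with the DDPM,
    run both distilled classifiers, sum their logits, and take the
    argmax of the softmax as the predicted class."""
    x_a = adapt(x_d)                        # Step 1: DDPM adaptation
    logits = g_a(x_a) + g_d(x_d)            # Step 2: classify adapted and degraded inputs
    return int(np.argmax(softmax(logits)))  # Step 3: logit-sum ensemble prediction
```

Since softmax is monotonic, summing logits before it yields the same argmax as summing the logits alone; the softmax is kept here only to mirror the description above.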

Sample training images

We show a few sample images used for training in our study. Each sample figure includes clean images together with pairs of their corresponding known degraded images and adapted images from the diffusion model.

Results

To provide an in-depth view of the ImageNet-C dataset, Figure 3 shows accuracy at different severity levels, averaged over all respective corruptions, for different backbones. As the severity level increases from 1 to 5, performance naturally drops for all methods. On the ResNet-50 backbone, the performance of the DDA method approaches that of the clean classifier toward low severity levels. Similarly, on the Swin-Tiny and ConvNeXt-Tiny backbones, we see the same pattern: the clean classifier performs comparably or, in fact, better than DDA at lower degradation levels. While DDA performs decently compared to the clean classifier at higher severity levels, severity levels are often unknown in real-world images, making the DDA method much more prone to performance reduction at lower severity levels. Next, the performance of the simple and distilled classifiers is very close; however, the distilled classifiers perform slightly better on the Swin-Tiny and ConvNeXt-Tiny backbones, showing the effectiveness of our distillation strategy for training classifiers.

On the other hand, our proposed method DiffAUD consistently performs substantially better at all severity levels than both the clean classifier and the DDA method, which shows that DiffAUD is robust to lower adaptation quality from diffusion models following the same diffusion process as DDA. This makes our work a significant step toward higher robustness and generalization, as it can work with different diffusion models and adaptation processes.

Figure 3: Performance with several backbones on the ImageNet-C dataset at different severity levels, averaged over all corruptions.

BibTeX

@InProceedings{Daultani_2024_CVPR,
    author    = {Daultani, Dinesh and Tanaka, Masayuki and Okutomi, Masatoshi and Endo, Kazuki},
    title     = {Diffusion-Based Adaptation for Classification of Unknown Degraded Images},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops},
    month     = {June},
    year      = {2024},
    pages     = {5982-5991}
}