CNN-based Image Recognition for Degraded Images

Recognizing images that suffer from various levels of degradation is important in practical applications.
The objective of this project is to construct recognition networks for degraded images.


1. Ensemble Approach [3], [4]

We propose a convolutional neural network that classifies degraded images by combining a restoration network with ensemble learning [3], [4]. The ensemble weights are estimated automatically from the estimated degradation level. The proposed network classifies degraded images well over various levels of degradation.
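The idea of weighting ensemble members by an estimated degradation level can be sketched as follows. This is only a minimal illustration, not the architecture of [3], [4]: the two branches, the normalized level in [0, 1], and the sigmoid gate with its `midpoint` and `sharpness` parameters are all assumptions made for the example.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_weight(level, midpoint=0.5, sharpness=10.0):
    """Hypothetical gate: weight of branch B grows with the estimated level.

    `level` is assumed to be a degradation estimate normalized to [0, 1].
    """
    return 1.0 / (1.0 + math.exp(-sharpness * (level - midpoint)))


def ensemble_predict(logits_a, logits_b, level):
    """Blend the class probabilities of two branches (assumed setup)."""
    w = ensemble_weight(level)
    probs_a = softmax(logits_a)
    probs_b = softmax(logits_b)
    return [(1.0 - w) * a + w * b for a, b in zip(probs_a, probs_b)]
```

With a low estimated level the blend follows branch A, and with a high level it follows branch B, so a single model can cover the whole range of degradation.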

Fig. 1 Classification networks of degraded images.


We compared the proposed network with the existing methods shown in Fig. 1. We use the CIFAR-10 dataset for training and testing. "(a)-org" and "(a)-jpg" denote classification networks trained with original CIFAR-10 images and JPEG CIFAR-10 images, respectively. "(b)-org" and "(b)-res" denote sequential networks whose classification networks are trained with original CIFAR-10 images and restored JPEG CIFAR-10 images, respectively. Figure 2 shows that the proposed method outperforms the other networks in almost all cases.

Fig. 2 Accuracy of JPEG CIFAR-10 with VGG-like.


2. Feature Adjustor [6]

The ensemble approach might underperform the classification network trained with clean images only when the input is a clean image. To overcome this drawback, we propose a network that learns to classify degraded images and to estimate their degradation levels as multi-task learning [6], as shown in Fig. 3. The proposed network is based on consistency regularization of image features between clean images and degraded images. It can classify degraded images without sacrificing performance on clean images.
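A multi-task objective of this kind can be sketched as the sum of three terms: a classification loss, a degradation-level loss, and a feature consistency term. The sketch below is an illustration under stated assumptions, not the loss of [6]: the squared-error level loss, the mean-squared feature distance, and the weights `lambda_level` and `lambda_consistency` are hypothetical choices.

```python
import math


def cross_entropy(probs, target):
    """Negative log-likelihood of the target class."""
    return -math.log(max(probs[target], 1e-12))


def mse(a, b):
    """Mean squared difference between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def feature_adjustor_loss(class_probs, label, level_pred, level_true,
                          feat_degraded, feat_clean,
                          lambda_level=1.0, lambda_consistency=1.0):
    """Hypothetical multi-task loss with consistency regularization."""
    # Task 1: classify the degraded image.
    loss_cls = cross_entropy(class_probs, label)
    # Task 2: estimate the degradation level (regression is an assumption).
    loss_level = (level_pred - level_true) ** 2
    # Regularizer: pull degraded-image features toward clean-image features.
    loss_consistency = mse(feat_degraded, feat_clean)
    return (loss_cls
            + lambda_level * loss_level
            + lambda_consistency * loss_consistency)
```

The consistency term is zero when the features of the degraded image equal those of its clean counterpart, which is the condition under which a classifier trained on clean images keeps its performance.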

Fig. 3 Feature adjustor.


Figure 4 shows the classification performance of the proposed feature adjustor and of the source network trained with clean images only. "Degrade" in Fig. 4 denotes the classification network trained with degraded and clean images. "Distillation" in Fig. 4 denotes the classification network trained with degraded and clean images using knowledge distillation. "Degrade" and "Distillation" have the same network architecture as the source network. The proposed feature adjustor outperforms the other networks not only for degraded images but also for clean images.
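For readers unfamiliar with the "Distillation" baseline, a common form of the knowledge distillation loss is sketched below. It is the generic temperature-scaled formulation, not necessarily the exact setup used in [6]; the `temperature` and `alpha` values, and the assumption that the teacher is the source network, are illustrative only.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over raw class scores."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=4.0, alpha=0.5):
    """Generic distillation loss: hard-label term plus softened KL term."""
    # Hard term: cross-entropy of the student against the true label.
    hard = -math.log(softmax(student_logits)[label])
    # Soft term: KL(teacher || student) on temperature-softened outputs.
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    soft = sum(t * math.log(t / s)
               for t, s in zip(p_teacher, p_student) if t > 0.0)
    # The T^2 factor keeps the soft term's gradient scale comparable.
    return alpha * hard + (1.0 - alpha) * temperature ** 2 * soft
```

When the student reproduces the teacher's logits the soft term vanishes and only the hard-label term remains.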

Fig. 4 Accuracy of JPEG CIFAR-10 with ShakePyramidNet.


3. Layer-Wise Feature Adjustor [7]

The previous feature adjustor, which we call a single-layer feature adjustor, focuses on the final image features of a feature extractor. We extend the feature adjustor to image features at multiple layers and call the result the "layer-wise feature adjustor." Figure 5 shows the structure of the layer-wise feature adjustor based on SegNet, a semantic segmentation network. Unlike the single-layer feature adjustor, the layer-wise feature adjustor does not infer any degradation levels.
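The difference between the two variants can be illustrated by where the feature consistency is measured. The sketch below is a simplified assumption, not the formulation of [7]: features are plain lists of numbers, the distance is a mean squared difference, and `layer_weights` is a hypothetical parameter.

```python
def mse(a, b):
    """Mean squared difference between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def single_layer_consistency(feats_degraded, feats_clean):
    """Compare only the final features of the extractor."""
    return mse(feats_degraded[-1], feats_clean[-1])


def layer_wise_consistency(feats_degraded, feats_clean, layer_weights=None):
    """Compare features at every layer and sum the weighted distances."""
    if len(feats_degraded) != len(feats_clean):
        raise ValueError("both inputs must list the same number of layers")
    if layer_weights is None:
        # Equal weighting is an assumption made for this sketch.
        layer_weights = [1.0] * len(feats_clean)
    return sum(w * mse(fd, fc)
               for w, fd, fc in zip(layer_weights, feats_degraded, feats_clean))
```

A mismatch in an early layer is invisible to the single-layer measure when the final features happen to agree, whereas the layer-wise measure penalizes it directly.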

Fig. 5 Layer-wise feature adjustor.


Table 1 shows the mIoUs of semantic segmentation networks on CamVid trained under different conditions, where the degradation type is JPEG distortion. "Clean" denotes a SegNet trained with clean images only. "Degrade" denotes a SegNet trained with JPEG images and clean images. "Single-layer" denotes a single-layer feature adjustor trained with JPEG and clean images using "Clean" as the source network. "Layer-wise" denotes a layer-wise feature adjustor trained with JPEG and clean images using "Clean" as the source network. According to Table 1, only "Layer-wise" keeps the clean-image mIoU of "Clean" (0.575) and also achieves the highest mIoU at every JPEG quality factor, so it recognizes both high-quality and low-quality images well. Figure 6 shows sample segmentation results for CamVid under JPEG distortion.

Table 1. mIoUs for JPEG-distorted CamVid images

JPEG quality factor   Clean   Degrade   Single-layer   Layer-wise
Clean images          0.575   0.543     0.575          0.575
90                    0.572   0.543     0.572          0.574
70                    0.567   0.541     0.569          0.573
50                    0.563   0.539     0.565          0.572
30                    0.545   0.534     0.552          0.566
10                    0.460   0.505     0.506          0.536
Average               0.547   0.534     0.557          0.566
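For reference, the mean intersection over union (mIoU) reported in Table 1 is the per-class IoU averaged over classes. A minimal sketch on flattened label lists is given below. The `ignore_label` parameter and the choice to skip classes that never occur are assumptions of this example, not details taken from [7].

```python
def mean_iou(pred, truth, num_classes, ignore_label=None):
    """Mean IoU over classes for flattened per-pixel label lists."""
    intersection = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, truth):
        if t == ignore_label:
            continue  # skip pixels without a valid ground-truth class
        if p == t:
            intersection[t] += 1
            union[t] += 1
        else:
            # A wrong pixel enlarges the union of both classes involved.
            union[p] += 1
            union[t] += 1
    # Classes absent from both prediction and ground truth are skipped.
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else 0.0
```

For example, with predictions `[0, 0, 1, 1]` and ground truth `[0, 1, 1, 1]`, class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mIoU is 7/12.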


[Image grid. Columns (JPEG quality factor): Clean image, Quality 90, Quality 50, Quality 10. Rows: Input, Ground truth, Clean, Degrade, Layer-wise.]

Fig. 6 Sample images of the segmentation results for CamVid.


Publications

[1] CNN-based Classification of Degraded Images
Kazuki Endo, Masayuki Tanaka, Masatoshi Okutomi, Image Processing: Algorithms and Systems Conference, at IS&T Electronic Imaging 2020.
[PDF (move to ingenta CONNECT)]
[Reproduction Code]

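[2] Classification of Degraded Images Using Convolutional Neural Networks (in Japanese)
Kazuki Endo, Masayuki Tanaka, Masatoshi Okutomi, The 26th Symposium on Sensing via Image Information (SSII2020), June, 2020.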
[Poster (PDF)]
[Reproduction Code]

[3] Classifying degraded images over various levels of degradation
Kazuki Endo, Masayuki Tanaka, Masatoshi Okutomi, Proceedings of IEEE International Conference on Image Processing (ICIP2020), October, 2020.
[PDF (move to arXiv)]
[Reproduction Code]

[4] CNN-Based Classification of Degraded Images with Awareness of Degradation Levels
Kazuki Endo, Masayuki Tanaka, Masatoshi Okutomi, IEEE Transactions on Circuits and Systems for Video Technology, Early Access, 2020.
[Abstract in IEEE Xplore]
[PrePrint PDF]
[Reproduction Code]

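[5] A Classification Network for Degraded Images That Handles Various Degradation Levels (in Japanese)
Kazuki Endo, Masayuki Tanaka, Masatoshi Okutomi, The 27th Symposium on Sensing via Image Information (SSII2021), June, 2021.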
[Reproduction Code]

[6] CNN-Based Classification of Degraded Images Without Sacrificing Clean Images
Kazuki Endo, Masayuki Tanaka, Masatoshi Okutomi, IEEE Access, Early Access, 2021.
[Abstract&PDF in IEEE Xplore]
[Reproduction Code]
[Model Files (pth files, 17GB)]

[7] Semantic Segmentation of Degraded Images Using Layer-Wise Feature Adjustor
Kazuki Endo, Masayuki Tanaka, Masatoshi Okutomi, Proceedings of Winter Conference on Applications of Computer Vision (WACV2023), January 2023 (to appear).
[PDF (forthcoming)]
[Reproduction Code]