This project proposes a new metric, the area under the accuracy-accuracy curve (AUAAC), which simultaneously evaluates in-distribution (IND) task accuracy and out-of-distribution (OOD) detection accuracy.
Determining whether input data are out-of-distribution (OOD) is important for real-world applications of machine learning. Various approaches to OOD detection have been proposed, and there is growing interest in evaluating their performance. A commonly employed approach to OOD detection is to train the network model on an in-distribution (IND) task and then apply a threshold to the estimated probability of unknown data. However, current evaluation metrics assess only the OOD detection performance while neglecting the IND task performance. To address this issue, we propose a new evaluation metric for OOD detection. Our novel metric, the area under the accuracy-accuracy curve (AUAAC), is designed to simultaneously evaluate both IND task performance and OOD detection performance. Specifically, it estimates the accuracy of the IND task and of OOD detection for all thresholds and then calculates the area under the resulting accuracy-accuracy curve. Flaws within the training dataset, such as contaminated labels or inaccurate annotations, prevent the network from properly performing the IND task. Nevertheless, the network may still distinguish whether new input data are IND, because it was previously exposed to IND data and trained on their features. The proposed AUAAC can assess such malfunctions, whereas existing evaluation metrics overlook the performance of the IND task and cannot identify such issues.
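The threshold sweep described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and the abstract does not give the exact definitions, so the following are assumptions: a sample is treated as IND when its confidence is at or above the threshold; OOD detection accuracy is the fraction of all samples (IND and OOD) assigned to the correct side; IND task accuracy is the fraction of IND samples that are both accepted and correctly classified; and the area is a trapezoidal sum over the points sorted by OOD detection accuracy. The function names `accuracy_pairs` and `area_under_curve` are hypothetical.

```python
def accuracy_pairs(ind_conf, ind_correct, ood_conf):
    """Return (ood_detection_acc, ind_task_acc) for every candidate threshold.

    ind_conf:    confidence scores of IND test samples
    ind_correct: whether each IND sample's predicted class is correct
    ood_conf:    confidence scores of OOD test samples
    All definitions here are assumptions made for illustration.
    """
    # Every observed score is a candidate threshold; +inf rejects everything.
    thresholds = sorted(set(ind_conf) | set(ood_conf))
    thresholds.append(float("inf"))
    n_ind, n_ood = len(ind_conf), len(ood_conf)

    pairs = []
    for t in thresholds:
        ind_accepted = sum(c >= t for c in ind_conf)   # IND kept as IND
        ood_rejected = sum(c < t for c in ood_conf)    # OOD flagged as OOD
        accepted_and_correct = sum(
            bool(ok) and c >= t for c, ok in zip(ind_conf, ind_correct)
        )
        detection_acc = (ind_accepted + ood_rejected) / (n_ind + n_ood)
        task_acc = accepted_and_correct / n_ind
        pairs.append((detection_acc, task_acc))
    return pairs


def area_under_curve(pairs):
    """Trapezoidal area under the points, sorted by the first coordinate."""
    pts = sorted(pairs)
    area = 0.0
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


if __name__ == "__main__":
    # Toy scores, invented purely for illustration.
    ind_conf = [0.9, 0.8, 0.4]
    ind_correct = [True, False, True]
    ood_conf = [0.3, 0.6]
    pairs = accuracy_pairs(ind_conf, ind_correct, ood_conf)
    print(pairs)
    print(area_under_curve(pairs))
```

Under these assumptions, a network that separates IND from OOD well but misclassifies IND samples yields low IND task accuracy at every threshold, and therefore a small area, which is the failure mode the abstract says existing OOD-only metrics miss.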
AUAAC: Area Under Accuracy-Accuracy Curve for Evaluating Out-of-Distribution Detection [Paper]
Wonjik Kim, Masayuki Tanaka, Masatoshi Okutomi
Pacific-Rim Symposium on Image and Video Technology (PSIVT) 2023.