Confidence Calibration of CNNs in Medical Image Databases

Nancy López-Miguel, Raquel Diaz-Hernández, Leopoldo Altamirano-Robles

Abstract


Disease classification with convolutional neural networks (CNNs) has evolved significantly, providing practical tools for medical imaging. Despite these advances, the confidence of the predictions these networks produce is rarely evaluated. Some authors have proposed confidence calibration methods to address this problem, but these methods have seldom been evaluated in medical contexts. In this paper, two confidence calibration methods (Mixup and Temperature Scaling) are applied to three medical image databases (MIDBs), along with a baseline case using the Geometric Shapes Dataset. The MIDBs analyzed are BCS-DBT, BreakHis, and a lung disease database. Our results demonstrate the importance of confidence calibration in medical image classification for the following reasons: 1) Model predictions in medical imaging must be reliable and backed by an accurate measure of their confidence. 2) Calibration methods help identify erroneous or unreliable predictions made by the model. 3) Applying confidence calibration methods reduces overconfident predictions and, in general, improves the predictions made by the model. In the three databases analyzed, Mixup alone and Mixup combined with Temperature Scaling give the best results, with Expected Calibration Error (ECE) values between 0.0037 and 0.0671 and accuracy values ranging from 77.91 to 97.52.
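To make the quantities named above concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes the common formulations: Temperature Scaling divides the logits by a scalar T before the softmax, ECE is the bin-size-weighted gap between accuracy and mean confidence over equal-width bins, and Mixup blends two samples and their one-hot labels with a Beta-distributed weight. The function names, the toy logits, the bin count, and the `alpha` value are illustrative assumptions.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Softmax of logits divided by a temperature T > 0 (T = 1 leaves them unchanged)."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def expected_calibration_error(confidences, correct, n_bins=10):
    """Bin-size-weighted average of |accuracy - mean confidence| over equal-width bins."""
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)  # conf == 1.0 falls in the last bin
        bins[idx].append((conf, ok))
    n = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue  # empty bins contribute nothing
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / n) * abs(accuracy - avg_conf)
    return ece


def mixup(x1, y1, x2, y2, alpha=0.2, rng=random):
    """Blend two inputs and their one-hot labels with lam ~ Beta(alpha, alpha)."""
    lam = rng.betavariate(alpha, alpha)
    x = [lam * a + (1 - lam) * b for a, b in zip(x1, x2)]
    y = [lam * a + (1 - lam) * b for a, b in zip(y1, y2)]
    return x, y, lam


# Toy two-class logits and labels (hypothetical values, not data from the paper).
logits = [[4.0, 0.0], [3.0, 0.0], [0.0, 5.0], [2.0, 0.0]]
labels = [0, 1, 1, 0]

for T in (1.0, 2.0):
    probs = [softmax(z, T) for z in logits]
    confs = [max(p) for p in probs]
    hits = [p.index(max(p)) == y for p, y in zip(probs, labels)]
    print(f"T={T}: ECE={expected_calibration_error(confs, hits):.4f}")
```

In practice the temperature is usually fitted on a held-out validation set by minimizing the negative log-likelihood; the fixed values of T above only illustrate how dividing the logits changes the reported confidence without changing the predicted class.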



Keywords


Confidence calibration, Mixup, Temperature Scaling (TS)
