Discovering Diagnostic Features Used by a CNN in Plant Species Identification
Abstract
An approach is proposed to improve the explainability (interpretability) of convolutional neural networks (CNNs) that identify plant species from leaf images. Specifically, a methodology is established to discover the most decisive diagnostic features used by a CNN when identifying 63 native plant species from Costa Rica. The result is a CNN that not only identifies plant species but also explains each identification through a heat map and a translation of that map into a table of the diagnostic features used in classical taxonomy (e.g., apex, primary vein, and leaf base), each with a weight that describes its relative importance. To achieve this, a CNN was trained on leaf images of 63 vascular plant species from Costa Rica. Once the network was trained, the Layer-wise Relevance Propagation (LRP) technique was applied to a subset I of 50 leaf images, distributed uniformly across 10 species (five images per species), to visualize as heat maps the representations learned by the internal layers of the CNN. A taxonomist was then asked to perform the equivalent task manually, annotating the same 50 leaf images in I by graphically highlighting the features that were most significant in their expert judgment (feature maps). Finally, the heat maps and feature maps were compared algorithmically to determine how closely the hottest areas used by the CNN match the features used in classical taxonomy.
Keywords
Convolutional neural network; heat map; layer-wise relevance propagation; deep learning; interpretability; automated plant species identification
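The abstract does not state which similarity measure was used to compare heat maps with feature maps, nor how the trait weights were computed. The following is a minimal sketch of one plausible approach, not the authors' method. It assumes that the "hottest area" is every pixel at or above a fraction of the maximum relevance, that similarity is measured by intersection over union (IoU) against the taxonomist's binary mask, and that a trait's weight is its share of the positive relevance. The function names, the threshold, the toy 4x4 heat map, and the trait regions are all hypothetical.

```python
import numpy as np

def hottest_mask(heat_map, rel_threshold=0.5):
    """Binary mask of pixels whose relevance is at least rel_threshold * max.

    Assumption: the abstract does not define "hottest"; a relative
    threshold is used here purely for illustration.
    """
    return heat_map >= rel_threshold * heat_map.max()

def iou(mask_a, mask_b):
    """Intersection over union of two boolean masks (0.0 if both are empty)."""
    union = np.logical_or(mask_a, mask_b).sum()
    if union == 0:
        return 0.0
    return float(np.logical_and(mask_a, mask_b).sum() / union)

def trait_weights(heat_map, trait_masks):
    """Share of the positive relevance that falls inside each trait region."""
    positive = np.clip(heat_map, 0.0, None)
    total = positive.sum()
    return {
        name: float(positive[mask].sum() / total) if total > 0 else 0.0
        for name, mask in trait_masks.items()
    }

# Toy heat map standing in for an LRP output (hypothetical values).
heat = np.zeros((4, 4))
heat[0, 0], heat[0, 1], heat[3, 3] = 0.9, 0.8, 0.3

# Toy expert annotation: the taxonomist highlights the top row as the apex.
apex = np.zeros((4, 4), dtype=bool)
apex[0, :] = True
base = np.zeros((4, 4), dtype=bool)
base[3, :] = True

similarity = iou(hottest_mask(heat), apex)
weights = trait_weights(heat, {"apex": apex, "leaf base": base})
print(similarity, weights)
```

In this toy case, two of the four annotated apex pixels are hot, and most of the positive relevance falls in the apex region. Other choices, such as a top-k pixel selection or a rank correlation between the two maps, would be equally reasonable under the same description.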