An important milestone in digital pathology was reached in 2016 with the Camelyon16 Grand Challenge (https://camelyon16.grand-challenge.org/Home/) held by the International Symposium on Biomedical Imaging. Participants were tasked with developing a computer vision-based approach for the automated detection of metastatic breast cancer in whole slide images (WSIs) of sentinel lymph node biopsies. The challenge had two main objectives: 1) detect instances of cancer at the slide level, and 2) localize tumor regions within the WSI. The top performer was a group from Harvard and MIT that achieved an area under the receiver operating characteristic curve (AUC) of 0.994 for the first objective of slide-level detection, and a free-response receiver operating characteristic (FROC) score of 0.807 for the second objective of tumor localization. A group from Google later improved upon the published results, achieving an FROC score of 0.885 for tumor localization. For reference, the pathologist who participated in this study achieved an AUC of 0.966 and an FROC score of 0.733 on the Camelyon16 data set.
Both groups used a deep learning-based approach that we at Visikol currently leverage in our 3Screen™ image analysis software, which we use in our contract research services. A single whole slide image is often several gigabytes in size, which makes it impractical to feed directly into a convolutional neural network (CNN). To address this issue, each whole slide image is split into many smaller images called “patches”. To create the training set, millions of patches are extracted from the dataset of WSIs, each accompanied by a label of either ‘normal’ or ‘tumor’ derived from pathologist annotation. The training set typically contains far fewer tumor patches than normal patches, so image augmentation is used to create additional samples for the under-represented class. Image augmentation applies noise, affine transformations, rotations, and other modifications to existing images to increase the number of samples in the training dataset.
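The tiling and augmentation steps above can be sketched in a few lines of NumPy. This is an illustrative toy example, not the 3Screen™ implementation: the patch size, the non-overlapping tiling scheme, and the choice of rotations and flips as augmentations are assumptions for demonstration.

```python
import numpy as np

def extract_patches(wsi, patch_size):
    """Tile a whole slide image array (H, W, C) into non-overlapping square patches."""
    h, w = wsi.shape[:2]
    patches = []
    for y in range(0, h - patch_size + 1, patch_size):
        for x in range(0, w - patch_size + 1, patch_size):
            patches.append(wsi[y:y + patch_size, x:x + patch_size])
    return patches

def augment(patch):
    """Return 8 variants of a patch: 4 rotations, each with and without a horizontal flip."""
    variants = []
    for k in range(4):                       # 0, 90, 180, 270 degree rotations
        rotated = np.rot90(patch, k)
        variants.append(rotated)
        variants.append(np.fliplr(rotated))  # mirrored copy of each rotation
    return variants

# Toy "slide": a 1024x1024 RGB array yields 16 patches of 256x256;
# augmenting a single tumor patch multiplies its count by 8.
wsi = np.zeros((1024, 1024, 3), dtype=np.uint8)
patches = extract_patches(wsi, 256)
augmented = augment(patches[0])
print(len(patches), len(augmented))  # 16 8
```

In practice, patches are usually sampled only from tissue regions (background is filtered out) and augmentation is applied on the fly during training rather than materialized up front.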
The InceptionV3 architecture is a CNN architecture that has achieved high accuracy on many image classification tasks and was therefore selected for training a CNN to predict whether an image patch belongs to the normal or tumor class. On a validation set of 10,000 patches from the Camelyon16 dataset, our 3Screen™-trained CNN achieved a patch-wise accuracy of 97.0%, which is comparable to the 98.4% patch-wise accuracy reported by Dayong et al.
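The patch-wise accuracy reported above is simply the fraction of validation patches whose predicted class matches the pathologist label. A minimal sketch of that evaluation, using simulated classifier outputs in place of real CNN predictions (the labels, probabilities, and 0.5 decision threshold here are all hypothetical):

```python
import numpy as np

# Hypothetical validation set: one tumor probability per patch from the CNN,
# and the pathologist-derived ground-truth label (0 = normal, 1 = tumor).
rng = np.random.default_rng(0)
labels = rng.integers(0, 2, size=10_000)
# Simulate a reasonably accurate classifier: scores clustered near the true label.
probs = np.clip(labels + rng.normal(0, 0.25, size=10_000), 0, 1)

predictions = (probs >= 0.5).astype(int)            # threshold the tumor probability
patch_accuracy = (predictions == labels).mean()     # fraction of patches classified correctly
print(f"patch-wise accuracy: {patch_accuracy:.3f}")
```

Note that patch-wise accuracy is only an intermediate metric: the slide-level AUC and localization FROC scores discussed earlier are computed after aggregating these per-patch predictions back into a tumor probability map over the whole slide.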
Figure 1. Inception v3 architecture (source: Medium)