An important milestone in digital pathology was reached in 2016 during the Camelyon16 Grand Challenge (https://camelyon16.grand-challenge.org/Home/) held by the International Symposium on Biomedical Imaging. Participants were tasked with developing a computer vision-based approach for the automated detection of metastatic breast cancer in whole slide images (WSIs) of sentinel lymph node biopsies. The challenge had two main objectives: 1) detect instances of cancer at the slide level, and 2) localize tumor regions within the WSI. The top performer was a group from Harvard and MIT that achieved an area under the receiver operating characteristic curve (AUC) of 0.994 for the first objective of slide-level detection and a free-response receiver operating characteristic (FROC) score of 0.807 for the second objective of tumor localization [3]. After the challenge, a group from Google improved upon the published results, achieving an FROC score of 0.885 for tumor localization [1]. For reference, the pathologist evaluated in the challenge achieved an AUC of 0.966 and an FROC score of 0.733 on the Camelyon16 data set.
Both groups used a deep learning-based approach that we at Visikol currently leverage in our 3Screen™ image analysis software and in our contract research services. A single whole slide image is often several gigabytes in size, which makes it impractical to feed directly into a convolutional neural network (CNN). To address this, each WSI is split into many smaller images called "patches". To create the training set, millions of patches are extracted from the dataset of WSIs, each accompanied by a label of either 'normal' or 'tumor' derived from pathologist annotations. Because there is typically a large class imbalance between the number of normal patches and tumor patches, image augmentation is used to create additional samples for the underrepresented class. Image augmentation applies added noise, affine transformations, rotations, and other modifications to existing images to increase the number of samples in the training dataset, as illustrated in the sketch below.
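As a concrete illustration, the snippet below sketches this patch extraction and augmentation workflow in Python, assuming the OpenSlide library for WSI reading. The `tumor_mask` array and `extract_patches` helper are hypothetical names, and the augmentation parameters are illustrative assumptions rather than our production settings.

```python
# A minimal sketch of patch extraction and augmentation, assuming the
# OpenSlide library for WSI reading. 'tumor_mask' is a hypothetical
# binary numpy array aligned to level-0 pixel coordinates.
import numpy as np
import openslide
from tensorflow.keras.preprocessing.image import ImageDataGenerator

PATCH = 256  # patch edge length in pixels at full resolution (level 0)

def extract_patches(slide_path, tumor_mask, stride=PATCH):
    """Yield (patch, label) pairs; a patch is labeled 'tumor' when its
    center pixel falls inside the pathologist-annotated tumor mask."""
    slide = openslide.OpenSlide(slide_path)
    width, height = slide.dimensions
    for y in range(0, height - PATCH + 1, stride):
        for x in range(0, width - PATCH + 1, stride):
            patch = slide.read_region((x, y), 0, (PATCH, PATCH)).convert("RGB")
            cy, cx = y + PATCH // 2, x + PATCH // 2
            label = "tumor" if tumor_mask[cy, cx] else "normal"
            yield np.asarray(patch), label

# Augment the minority 'tumor' class with rotations, flips, and small
# shifts to offset the class imbalance in the training set.
augmenter = ImageDataGenerator(
    rotation_range=90,
    horizontal_flip=True,
    vertical_flip=True,
    width_shift_range=0.05,
    height_shift_range=0.05,
)
```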
The InceptionV3 architecture is a CNN architecture that has achieved a high degree of accuracy on many image classification tasks and was therefore selected for training a CNN to predict whether an image patch belongs to the normal or tumor class [2]. On a validation set of 10,000 patches from the Camelyon16 dataset, our 3Screen™-trained CNN achieved a patch-wise accuracy of 97.0%, comparable to the 98.4% patch-wise accuracy reported by Wang et al. [3].
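For readers who want to reproduce this kind of setup, the following is a minimal Keras sketch of a binary patch classifier built on InceptionV3. The input size, optimizer, and other hyperparameters are assumptions for illustration, not the exact 3Screen™ configuration.

```python
# Sketch of a binary patch classifier built on InceptionV3 (Keras).
# Hyperparameters here are illustrative, not our production settings.
from tensorflow.keras.applications import InceptionV3
from tensorflow.keras.layers import Dense, GlobalAveragePooling2D
from tensorflow.keras.models import Model

# ImageNet-pretrained backbone, with the original classification head
# removed so a binary tumor/normal head can be attached.
base = InceptionV3(weights="imagenet", include_top=False,
                   input_shape=(256, 256, 3))
x = GlobalAveragePooling2D()(base.output)
output = Dense(1, activation="sigmoid")(x)  # P(patch is tumor)

model = Model(inputs=base.input, outputs=output)
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
# Training would then use the labeled patches from the previous step:
# model.fit(train_patches, train_labels, validation_data=..., epochs=...)
```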
Figure 1. Inception v3 architecture (source: Medium)
With a CNN trained to classify normal and tumor image patches, a tumor 'heatmap' can be assembled by applying a sliding-window approach to the entire WSI. Starting at the top-left corner of the WSI, an image patch is extracted and classified using the trained CNN. The output from the CNN, the probability that the patch belongs to the tumor class, is projected onto the heatmap. The window then slides to a new position in the WSI and the process repeats until every region of the WSI has been classified and projected onto the heatmap, as sketched below. Tumor-containing regions in the heatmap will have values close to 1.0, indicating a near-100% probability that the region contains tumor, while normal regions will have values closer to 0.0.
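The sketch below shows one way to assemble such a heatmap, reusing the trained `model` from the previous sketch. The `build_heatmap` helper and the simple divide-by-255 input scaling are illustrative assumptions.

```python
# Sketch of sliding-window heatmap assembly. 'slide' is an open
# openslide.OpenSlide handle and 'model' is the trained classifier
# from the previous sketch; both names are illustrative.
import numpy as np

def build_heatmap(slide, model, patch=256, stride=256):
    """Return a 2D array where each cell holds P(tumor) for one patch."""
    width, height = slide.dimensions
    rows, cols = height // stride, width // stride
    heatmap = np.zeros((rows, cols), dtype=np.float32)
    for r in range(rows):
        for c in range(cols):
            region = slide.read_region((c * stride, r * stride), 0,
                                       (patch, patch)).convert("RGB")
            x = np.asarray(region, dtype=np.float32)[None] / 255.0
            # The CNN outputs the probability of the patch belonging to
            # the tumor class; project it onto the matching heatmap cell.
            heatmap[r, c] = float(model.predict(x, verbose=0)[0, 0])
    return heatmap
```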
We have used this approach with several of our clients to assist them in the automated characterization of whole slides. We find that it greatly reduces the cost of pathology analysis while providing highly accurate results.
Figure 3. Representative result of classification of a lymph node whole slide image. Original image with tumor metastasis outlined (left); heatmap showing the predicted probability of metastatic regions (right).
References
- [1] Liu, Y., Gadepalli, K., Norouzi, M., Dahl, G. E., Kohlberger, T., Boyko, A., … & Hipp, J. D. (2017). Detecting cancer metastases on gigapixel pathology images. arXiv preprint arXiv:1703.02442.
- [2] Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., & Wojna, Z. (2016). Rethinking the inception architecture for computer vision. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 2818-2826).
- [3] Wang, D., Khosla, A., Gargeya, R., Irshad, H., & Beck, A. H. (2016). Deep learning for identifying metastatic breast cancer. arXiv preprint arXiv:1606.05718.