Classification with Machine Learning

2019-06-07

Overview

Evaluation of histological tissue sections is a critical tool for diagnosis and understanding of disease, and an indispensable tool for assessment of therapy in drug discovery and development. It is well known that there is fundamental and important prognostic data within images obtained from histological sections. The ability to quantitatively extract and analyze image features from digital pathology slide images that may not be visually discernible by a pathologist offers the opportunity for better modeling of disease and potentially improved prediction of disease aggressiveness and patient outcome.

With significant advances in computer software and hardware in the last decade, machine learning has become a pivotal tool used in a huge variety of applications. Major advances in computer vision and image processing techniques have contributed significantly to bio-imaging applications. Machine learning is especially well adapted to the classification and categorization of images based on the image content.

For classification and categorization, machine learning methods fall into two broad classes: supervised and unsupervised. In supervised machine learning, a human operator annotates image data to build a dataset that is used to “train” the algorithm to recognize the important structures, patterns, and features of each category. Supervised machine learning is especially useful when such a training set exists and the features used to identify and classify the image data are known. Random Forest, Naive Bayes, and Support Vector Machine algorithms are supervised methods, and deep learning and other neural networks are most often trained in a supervised fashion as well. Many supervised techniques act as probability models that indicate the likelihood of a sample image falling into each of the trained categories. This approach is particularly valuable for classifying image data into well-defined categories identified by human operators.
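As a minimal sketch of the supervised idea described above, the following Python example trains a Random Forest on a small annotated set and returns per-class probabilities for new samples. It assumes scikit-learn and NumPy are available. The three features, the two class names, and all numeric values are invented for illustration and do not come from any real dataset or from Visikol's workflow.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Hypothetical training set: 200 annotated image patches, each summarized
# by 3 made-up features (e.g. mean intensity, texture, nuclear density).
n_per_class = 100
stroma = rng.normal(loc=[0.3, 0.2, 0.1], scale=0.05, size=(n_per_class, 3))
tumor = rng.normal(loc=[0.6, 0.5, 0.4], scale=0.05, size=(n_per_class, 3))
X_train = np.vstack([stroma, tumor])
y_train = np.array([0] * n_per_class + [1] * n_per_class)  # 0 = stroma, 1 = tumor

# Train the classifier on the annotated patches.
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)

# Unlabeled patches: the classifier returns one probability per trained class.
X_new = np.array([[0.31, 0.22, 0.09], [0.58, 0.52, 0.41]])
proba = clf.predict_proba(X_new)  # one row per patch, one column per class
predicted = clf.classes_[proba.argmax(axis=1)]
```

Each row of `proba` sums to 1, so it can be read as the estimated likelihood that a patch belongs to each trained category. Other supervised models with a `predict_proba` method could be substituted in the same way.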

Unsupervised machine learning is used when no training set or predefined categories exist; it categorizes image data based on features extracted from the image data. Rather than training data, it requires careful selection of the features used to classify images (e.g. nuclear morphology). Unsupervised machine learning techniques include clustering analysis (e.g. K-means and hierarchical agglomerative clustering) and dimensionality reduction (e.g. principal component analysis). These approaches are particularly valuable for exploratory analysis, when one seeks to examine the underlying structure of data gathered from image sets without explicitly defining categories and training sets for the model.
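The unsupervised techniques named above can be illustrated with a short Python sketch, assuming scikit-learn and NumPy. The feature table is synthetic: the column meanings (nuclear area, circularity, eccentricity), the two underlying groups, and the choice of two clusters are assumptions made only for this example.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)

# Hypothetical feature table: one row per tissue section; the columns are
# made-up nuclear morphology measurements (area, circularity, eccentricity).
group_a = rng.normal(loc=[40.0, 0.90, 0.30], scale=[3.0, 0.02, 0.03], size=(15, 3))
group_b = rng.normal(loc=[70.0, 0.70, 0.60], scale=[3.0, 0.02, 0.03], size=(15, 3))
features = np.vstack([group_a, group_b])

# Put the features on a common scale so that area does not dominate distances.
scaled = StandardScaler().fit_transform(features)

# K-means groups the sections without any labels; k=2 is a choice, not a given.
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(scaled)

# PCA projects the same data onto two axes, e.g. for a scatter plot.
pca = PCA(n_components=2)
scores = pca.fit_transform(scaled)
explained = pca.explained_variance_ratio_
```

No annotations are used at any point. The grouping in `labels` and the coordinates in `scores` come only from the structure of the feature table, which is why the choice of features matters so much in this setting.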

Visikol offers machine learning-based classification of histological sections as a service to clients using supervised or unsupervised techniques depending on the specific requirements of the project.

Protocol

Instrument: Aperio XT2 Slide Scanner
Analysis Method: Brightfield Imaging
Markers: Hematoxylin and Eosin (H&E); immunohistochemical staining
Sample Submission: Whole tissue fixed and stored in PBS with 0.05% azide; Formalin Fixed Paraffin Embedded (FFPE) tissue blocks; tissues embedded in OCT; pre-stained and mounted slides; digitized slide images
Imaging Parameters: 20X, 40X magnification
Data Delivery: Results of machine learning classification
  Supervised: Overlay of probability maps, heatmap of overall probability of classes
  Unsupervised: Cluster dendrograms, principal component plots, numerical results of feature extraction

General Procedure – Supervised Classification

  1. Tissue samples are transferred to Visikol in PBS w/ 0.05% azide or in a form most appropriate for the customer (e.g. FFPE, OCT compound).
  2. The samples are processed, sectioned, and stained with Hematoxylin and Eosin or IHC.
  3. The sample slides are imaged with a high-throughput slide scanner at the desired magnification.
  4. Alternatively, mounted and stained slides or digitized images of H&E or IHC slides can be sent for analysis.
  5. The client identifies the classes used to train the machine learning algorithm.
  6. The training set of images is annotated to identify the classes.
  7. The machine learning algorithm is trained using the training dataset.
  8. The trained machine learning classifier is applied to the unknown images.
  9. The resulting image probability maps and quantification report are then transferred to the customer.
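Steps 5 through 9 of the supervised procedure can be sketched in Python as a tile-based probability map, assuming scikit-learn and NumPy. This is an illustrative outline, not the actual pipeline. The synthetic "slides", the tile size, the two per-tile features (mean and standard deviation of intensity), and the helper names `tile_features` and `synthetic_slide` are all hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

TILE = 16  # tile edge length in pixels (arbitrary choice for this sketch)

def tile_features(image, tile=TILE):
    """Split a 2-D image into non-overlapping tiles and return the
    (mean, std) of each tile plus the shape of the tile grid."""
    rows, cols = image.shape[0] // tile, image.shape[1] // tile
    tiles = image[:rows * tile, :cols * tile].reshape(rows, tile, cols, tile)
    tiles = tiles.transpose(0, 2, 1, 3).reshape(rows * cols, tile * tile)
    feats = np.column_stack([tiles.mean(axis=1), tiles.std(axis=1)])
    return feats, (rows, cols)

def synthetic_slide(rng, size=64):
    """Stand-in for a scanned slide: the left half is darker and noisier to
    mimic a dense region; the mask plays the role of the annotation."""
    image = rng.normal(0.8, 0.02, size=(size, size))
    mask = np.zeros((size, size), dtype=bool)
    mask[:, : size // 2] = True
    image[mask] = rng.normal(0.4, 0.10, size=mask.sum())
    return image, mask

rng = np.random.default_rng(2)

# Steps 5-6: an annotated training slide gives one label per tile.
train_img, train_mask = synthetic_slide(rng)
X_train, _ = tile_features(train_img)
mask_feats, _ = tile_features(train_mask.astype(float))
y_train = (mask_feats[:, 0] > 0.5).astype(int)  # positive if mostly annotated

# Step 7: train the classifier on the annotated tiles.
clf = RandomForestClassifier(n_estimators=50, random_state=0)
clf.fit(X_train, y_train)

# Step 8: apply it to an unseen slide and reshape into a probability map.
test_img, _ = synthetic_slide(rng)
X_test, grid = tile_features(test_img)
prob_map = clf.predict_proba(X_test)[:, 1].reshape(grid)

# Step 9: one simple quantification, the fraction of tiles called positive.
positive_fraction = float((prob_map > 0.5).mean())
```

In practice the probability map would be upsampled and overlaid on the slide image. Real tissue would also call for richer features or a learned representation rather than two intensity statistics.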

General Procedure – Unsupervised Classification

  1. Tissue samples are transferred to Visikol in PBS w/ 0.05% azide or in a form most appropriate for the customer (e.g. FFPE, OCT compound).
  2. The samples are processed, sectioned, and stained with Hematoxylin and Eosin or IHC.
  3. The sample slides are imaged with a high-throughput slide scanner at the desired magnification.
  4. Alternatively, mounted and stained slides or digitized images of H&E or IHC slides can be sent for analysis.
  5. Quantitative features (e.g. nuclear morphology) are extracted using image processing techniques.
  6. An unsupervised machine learning algorithm is applied to the image data.
  7. The resulting classification dendrogram, principal component plots, or heatmaps are delivered to the customer.
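Steps 6 and 7 of the unsupervised procedure can be sketched with hierarchical agglomerative clustering, assuming SciPy and NumPy. The sample names, the two features, and the use of Ward linkage with a two-cluster cut are assumptions chosen for illustration; the source does not specify a linkage method or a number of clusters.

```python
import numpy as np
from scipy.cluster.hierarchy import dendrogram, fcluster, linkage

rng = np.random.default_rng(3)

# Hypothetical output of step 5: rows = samples, columns = extracted features.
sample_names = [f"control_{i}" for i in range(4)] + [f"treated_{i}" for i in range(4)]
control = rng.normal(loc=[50.0, 0.85], scale=[2.0, 0.02], size=(4, 2))
treated = rng.normal(loc=[65.0, 0.70], scale=[2.0, 0.02], size=(4, 2))
features = np.vstack([control, treated])

# Z-score each feature so that measurement units do not drive the distances.
z = (features - features.mean(axis=0)) / features.std(axis=0)

# Step 6: agglomerative clustering with Ward linkage on Euclidean distances.
tree = linkage(z, method="ward")

# Step 7: cut the tree into two clusters and read off the dendrogram leaf
# order, which is also the row order typically used for a clustered heatmap.
cluster_ids = fcluster(tree, t=2, criterion="maxclust")
leaf_order = dendrogram(tree, no_plot=True, labels=sample_names)["ivl"]
```

Passing `no_plot=True` returns the dendrogram layout without drawing it, so the same `tree` can later be rendered with a plotting library or used to reorder heatmap rows.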

Representative Data

Figure 1. Results of supervised machine learning classification of potential tumor regions in a sentinel lymph node section.

Figure 2. Representation of process of extraction of features for use in unsupervised machine learning classification. In this example, the nuclear channel is extracted and segmented to measure nuclear morphology, and the resultant data is utilized to classify tissue samples according to variations in nuclear morphology.
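The segmentation and measurement step described in Figure 2 can be sketched in Python, assuming scikit-image and NumPy. The input here is a synthetic nuclear channel containing two artificial "nuclei"; how the channel is extracted from a stained slide is not shown, and the choice of Otsu thresholding and of these particular shape descriptors is an assumption rather than the method behind the figure.

```python
import numpy as np
from skimage.filters import threshold_otsu
from skimage.measure import label, regionprops

rng = np.random.default_rng(4)

# Synthetic stand-in for an extracted nuclear channel: two bright shapes
# ("nuclei") on a dark, slightly noisy background.
yy, xx = np.mgrid[0:100, 0:100]
nuclear = np.zeros((100, 100), dtype=float)
nuclear[(yy - 30) ** 2 + (xx - 30) ** 2 <= 10 ** 2] = 1.0           # round nucleus
nuclear[((yy - 70) / 8) ** 2 + ((xx - 65) / 20) ** 2 <= 1.0] = 1.0  # elongated nucleus
nuclear += rng.normal(0.0, 0.05, size=nuclear.shape)

# Segment: global Otsu threshold, then connected-component labeling.
mask = nuclear > threshold_otsu(nuclear)
labeled = label(mask)

# Measure per-nucleus morphology; regions are returned in label order.
morphology = [
    {"area": r.area, "eccentricity": r.eccentricity, "solidity": r.solidity}
    for r in regionprops(labeled)
]
```

Each dictionary in `morphology` is one row of the kind of feature table that clustering or principal component analysis would then operate on. Touching or overlapping nuclei in real tissue would need an additional splitting step that this sketch omits.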

Figure 3. Heatmap of results of clustering of the quantitative measurements of nuclear morphology from a set of histological samples, depicting similarities and differences between tissue sections.

Figure 4. Result of principal component analysis on the quantitative measurements of nuclear morphology from a set of histological samples showing the shift in phenotype from diseased to healthy upon drug treatment.
