

Fig. 1 | Breast Cancer Research


From: A whole slide image-based machine learning approach to predict ductal carcinoma in situ (DCIS) recurrence risk

Fig. 1

WSI method for stratifying DCIS patients by recurrence risk. The first step of the pipeline automatically annotates each patient’s whole surgical H&E slide into prognostically informative tissue classes. For this automated annotation, the whole virtual slide is (a) preprocessed through whole-slide color normalization and down-sampling. (b) A sliding window then passes over the whole slide and extracts non-overlapping image tiles, which are (c) color deconvoluted to yield the hematoxylin image, from which (d) values for 166 texture features are extracted. These features are (e) input into a random forest annotation classifier, which (f) outputs the probability of each tile belonging to a specific class (malignant ducts of DCIS, the surrounding breast parenchyma/ducts, blood vessels, and stromal regions with and without dense immune infiltration [immune cells occupying at least 50% of the tile area]). The tile-level outputs are combined to produce (g) a whole-slide annotation. The second step extracts tissue architecture features and features of the spatial relationships between these tissue classes from the previously annotated slides and compiles them into the “full-slide” feature set. For the prediction of DCIS recurrence risk, (h) each annotation is analyzed through (i) feature distributions, spatial features that compare distances between different classes, and other features such as region confidence. (j) The final (optimized) feature list, with the patient’s follow-up (recurrence) data as labels, is used to train (k) a random forest recurrence risk classifier that predicts (l) high versus low risk of recurrence and allows for the recommendation of optimal therapy.
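
As a rough illustration of steps (b), (f), (g), and (i), the sketch below lays out a non-overlapping tile grid, turns per-tile class probabilities into a slide-level annotation, and computes one simple spatial feature (the smallest distance between tiles of two classes). It is not the authors' implementation: the function names, the class labels, the use of tile origins in pixel units for distances, and the idea of keeping the winning probability as a stand-in for confidence are all assumptions made for this example. In the described pipeline the probabilities would come from the random forest annotation classifier; here they are supplied by the caller.

```python
from math import hypot

# Hypothetical labels for the five tissue classes named in the caption.
CLASSES = [
    "dcis_duct",
    "benign_parenchyma",
    "blood_vessel",
    "stroma",
    "immune_dense_stroma",
]


def tile_origins(width, height, tile):
    """Top-left corners of non-overlapping tiles that fit fully in the slide."""
    return [
        (x, y)
        for y in range(0, height - tile + 1, tile)
        for x in range(0, width - tile + 1, tile)
    ]


def annotate(tile_probs):
    """Map each tile origin to (most probable class, that class's probability).

    tile_probs: dict of origin -> list of probabilities, ordered as CLASSES.
    """
    annotation = {}
    for origin, probs in tile_probs.items():
        best = max(range(len(probs)), key=probs.__getitem__)
        annotation[origin] = (CLASSES[best], probs[best])
    return annotation


def min_class_distance(annotation, class_a, class_b):
    """Smallest Euclidean distance between tile origins of two classes.

    Returns None when either class is absent from the annotation.
    """
    a = [o for o, (c, _) in annotation.items() if c == class_a]
    b = [o for o, (c, _) in annotation.items() if c == class_b]
    if not a or not b:
        return None
    return min(hypot(ax - bx, ay - by) for ax, ay in a for bx, by in b)
```

A feature such as `min_class_distance(annotation, "dcis_duct", "immune_dense_stroma")` would be one entry in a slide-level feature vector; the full feature set described in the caption is considerably richer.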
