Breast tissue acquisition and preparation
Fresh human breast tissue specimens from 34 patients were acquired through a protocol approved by The University of Texas MD Anderson Cancer Center and Rice University Institutional Review Boards, and each participant gave written informed consent. Fresh breast tissue was acquired from patients undergoing surgery to excise a clinically abnormal lesion. The procedure for tissue preparation has been described previously [24]. In brief, two tissue specimens - one grossly abnormal and one grossly normal in appearance were acquired from each patient for image acquisition and evaluation; each specimen measured approximately 15 × 15 mm2 in size, with thickness varying from 2 to 7 mm. Within 30 min of surgical excision, breast tissue specimens were stained for 1 min in a solution of 0.01 % proflavine in 1X phosphate-buffered saline (PBS). Proflavine is a nuclear contrast agent [30, 31], which has been used to stain breast tissue, oral mucosa, Barrett’s esophagus, cervical tissue, and sarcoma in previous studies [24, 25, 32–37]. Following topical application of proflavine, specimens were washed with 1X PBS and then immediately imaged.
Image acquisition and evaluation
Confocal fluorescence images were acquired from multiple sites within each specimen using a multi-wavelength scanning confocal microscope (Vivascope 2500®, Caliber Imaging and Diagnostics Inc., Andover, MA, USA) as described previously [24, 25, 38]. Following topical application of proflavine and the PBS wash, each tissue specimen was positioned on the microscope stage and imaged using 2.1 ± 0.4 mW power at 488 nm laser excitation, and the fluorescence was detected in a band pass of 550 ± 44 nm with a 30× water immersion lens. At these settings, the lateral and axial resolution was 1.0 μm and 5.0 μm, respectively, in the center of the 750 × 750 μm2 field of view. A 12 × 12 mm2 composite image was created for both sides of each tissue specimen. To create the composite image, images were acquired from contiguous sites in a grid pattern (maximum area 12.2 × 12.2 mm2) over the surface of the specimen at an approximate depth of 20 μm. Following image acquisition, specimens were kept moist in 1X PBS and were submitted for routine histologic preparation and fixation. Samples were stained with H&E and fixed on microscope slides for histologic assessment.
A board-certified, breast pathologist (author S. Krishnamurthy) viewed composite confocal images and fixed tissue specimens stained with H&E using a conventional light microscope to identify sites that corresponded to the same approximate location in the specimen based on similar image morphology. Specifically we selected in-focus confocal microscope fields of view that contain representative examples of characteristic benign and malignant breast features. Thus, at each site, a corresponding pair of confocal and H&E images were available from a 750 × 750 μm2 field of view. At each site, the H&E images of fixed tissue specimens were used as a reference standard to identify breast architectural features that should be present in corresponding confocal images [1, 24]. Benign breast features identified in reference H&E images included adipose and fibrous tissue, lobules, non-hyperplastic ducts, and ductal hyperplasia. Malignant breast features identified in reference H&E images included: ductal carcinoma in situ (DCIS), invasive ductal carcinoma (IDC), and invasive lobular carcinoma (ILC). Benign breast features were identified in specimens acquired from 30 patients and malignant features were identified in specimens from 22 patients.
Nuclear segmentation and connected components algorithms for identifying nuclei
A technique called maximally stable extremal regions (MSER) was used to segment nuclei from confocal images of proflavine-stained breast tissue. MSER has been used previously in the image-processing community for automatic reconstruction of three-dimensional scenes; here we have adapted it to segment nuclei from high-resolution fluorescence confocal microscopy images [39]. MSER employs intensity thresholding; however, no global or optimal threshold is sought; rather all thresholds are tested and the stability of the isolated connected components (i.e. nuclei) is evaluated. All possible thresholds from 0 to 255 are applied to an image and the sets of connected components (as well as their area) are stored (Fig. 1a-d). This yields a data structure in which the area of each connected component is stored as a function of the intensity threshold. Finally, the intensity thresholds which correspond to local minima in the rate of change of the area function are selected as thresholds producing MSER.
In order to apply MSER to our images, five tuning parameters associated with MSER had to be selected. The first two parameters, which included the minimum area (MinArea) and maximum area (MaxArea) of the connected components, are related to the expected size of nuclei. These parameters were selected based on the biologically expected range of nuclear diameters. Specifically, other groups have found nuclear volume to range from approximately 200 to 1500 μm3, which corresponds to 7 to 14 μm in diameter [23]. Therefore, MaxArea was set to 500 pixels, which corresponds to 19 μm in diameter, which is larger than the expected nuclear size for our images. MinArea was set to 35 pixels, which corresponds to 5 μm in diameter, which is smaller than the expected nuclear size for our images. The next set of parameters is related to the intensity thresholds and includes maximum variation (MaxVariation), minimum diversity (MinDiversity), and Delta. MaxVariation is the maximum variation allowed within a region that corresponds to a potential nucleus. MinDiversity is employed if there are two nested maximally stable regions. Specifically, if the diversity between the two nested regions is less than MinDiversity, then the nested region is removed. Lastly, Delta is the stability threshold. The stability of a region is defined as the relative variation of the region area when the intensity is changed by Delta/2. These intensity parameters were systematically tuned through applying a range of values to representative images in order to select the best value for each parameter. Specifically, one input parameter was varied over a wide range while other input parameters were held constant. For each iteration, the area fraction (AF) from representative images of tumor and normal tissue was calculated and overlays of the features isolated with that particular setting were displayed. The values that led to the largest differences in AF between tumor and normal tissues, while isolating features that approximately corresponded to nuclei or nucleoli, were selected. An illustration of this tuning approach can be found in Figure S1 in Additional file 1. Specifically, MaxVariation was set equal to 2.5, MinDiversity to 0.5, and Delta to 6. These parameter values are in terms of relative intensity, which for our 8-bit images ranges from 0 to 255.
After nuclei were isolated with MSER, a connected components algorithm was applied in order to calculate parameters such as nuclear density and diameter. In the connected components algorithm, all touching or connected pixels are assumed to belong to the same cell nucleus. Parameters include nuclear density (the number of nuclei in a unit area), area fraction (the total nuclear area divided by the total area), minimum inter-nuclear distance (the distance from a nucleus center to the next closest nucleus center), and nuclear diameter (the length of the major axis of each nucleus). Nuclear density and AF represent scalar variables – only one value is returned for each image, while the minimum inter-nuclear distance (IND) and nuclear diameter represent vector variables – a value is calculated for each nucleus in the image. In order to consolidate the vector variables into a scalar value, several summary statistics were evaluated, including mean, median, mode, interquartile range, and standard deviation.
Ductal segmentation algorithm and quantification of ductal parameters
An algorithm was developed to measure ductal parameters, which segments non-hyperplastic ducts, ductal hyperplasia, and DCIS lesions based on the intensity of proflavine staining (Fig. 1e-l). To reduce noise and increase image contrast, a Wiener lowpass filter was first applied (Fig. 1e) followed by contrast-limited adaptive histogram equalization (CLAHE, Fig. 1f). Images were converted from grayscale to binary using a user-defined threshold based on relative intensity. The mean threshold used to segment ducts was 107 ± 27 (range: 52–168) on a scale of 0 to 255 for 8-bit images (Fig. 1g). It was not possible to select a universal threshold, because in order to accurately segment ducts from surrounding tissue, it is necessary to isolate both nuclei in the duct walls and inter-nuclear space between them. The relative intensity of these features differed between images due to the variation in illumination power used for image acquisition and the variation in proflavine staining. Areas smaller than the upper threshold for cell nuclei (approximately equivalent to 280 μm2 or 500 pixels, with a diameter of 19 μm [40]) were removed to avoid segmenting individual nuclei outside of the duct walls. Individual ducts were manually segmented using a user-defined polygon selection tool to define architectural features corresponding to breast ducts (Fig. 1h). After application of the ductal segmentation algorithm, the binary confocal image showed the segmented duct walls (Fig. 1h) and the outer and inner boundaries (Fig. 1i, j) of the duct used to measure ductal parameters.
Following segmentation of ducts, a number of ductal parameters were measured based on the properties of the inner and outer duct boundaries. The outer boundary defines the outer edge of the duct wall and the inner boundary defines the inner edge of the duct wall; the lumen. The width of the duct wall was measured at every pixel on the outer edge of the duct wall. This was done by finding the shortest distance between every point on the outer boundary and the nearest point on the inner boundary (Fig. 1k). Duct wall width was measured for each non-hyperplastic duct, ductal hyperplasia, and DCIS lesion and the vector of values were summarized by calculating the mean, median, mode, interquartile range, and standard deviation. Other scalar parameters measured include the area of the duct wall, area of the lumen, area of an ellipse approximating the duct wall, area of an ellipse approximating the lumen, lengths of the major and minor axes for the duct and the lumen, solidity of the duct and the lumen, and eccentricity of the duct and the lumen (Fig. 1l).
Statistical analysis and model building
Nuclear parameters were calculated for all sites (n = 259) and ductal parameters were calculated for all sites that contained ducts (n = 50), and the diagnostic performance of each image parameter was individually assessed by determining the classification accuracy. Two-class linear discriminant analysis was performed to classify malignant from benign breast architectural features based on each individual nuclear or ductal parameter; receiver operator characteristic (ROC) curves were constructed and area under the curve (AUC) was calculated for each ROC curve. Sensitivity and specificity values were determined at the optimal cutpoint. Parameters were sorted by accuracy for classification of neoplasia based on AUC values. Boxplots were created for the parameters with the highest AUCs. A Student t test for samples with unequal variances was used to identify statistically significant differences between mean parameter values measured in benign and malignant tissues. This analysis was performed to evaluate individual nuclear and ductal parameters to incorporate into a classification model.
Next we sought to develop a multivariate model to yield optimal separation between benign and malignant tissues. Toward that end, all 33 nuclear and ductal variables were used as input for a classification and regression tree (CART) function in Matlab. Decision trees were constructed using the automated Matlab function classregtree, which selects parameters and cutpoints that lead to the optimal classification of benign and malignant breast architectural features. Decision trees were pruned to prevent a single nuclear or ductal from being used at more than one node within the tree. Pruning was also performed to prevent the number of categories for classification of malignant breast features from exceeding three: the number of malignant tissue types (IDC, ILC, and DCIS). After construction, decision tree nodes were pruned by finding the next higher node whose decision point led to two categories, one with a majority of neoplastic sites, and one with a majority of benign sites. A custom leave-one-out cross-validation algorithm was also developed in order to calculate the cross-validated sensitivity and specificity. Specifically, 258 of the 259 data points were used to build a CART model, which contained the same two variables at the first and second decision points. Specifically the standard deviation of IND (StdIND) was the first decision point and the number of lumens was the second decision point. However, with each iteration of leave-one-out cross-validation, the cutoff value of StdIND could vary. The cutoff value associated with the number of lumens (number of lumens >1) was held constant because biologically normal ducts are expected to only contain a single lumen; therefore, this was considered to be the optimal and only logical cutoff value and therefore was held constant. Then the model was applied to the remaining data point, which was classified as either benign or malignant. This process was repeated for all 259 data points, and the calculated diagnosis for each image was compared to the known diagnosis in order to calculate sensitivity and specificity for the cross-validated model. The performance of the decision tree was characterized by computing sensitivity and specificity for classification of malignant breast architectural features. Additionally, sensitivity and specificity were calculated for each individual histologic type of malignant tissue in order to determine the relative classification accuracy for IDC, ILC, and DCIS sites. For example, in order to calculate sensitivity for IDC, true positives were defined as IDC sites that had been classified as malignant by the decision tree, and false negatives were defined as IDC sites that had been classified as benign. Specificity was calculated by defining true negatives as benign sites that were correctly classified in the decision tree and false positives were defined as benign sites that were incorrectly classified. An ROC curve was constructed for the decision tree model. All sites were sorted in order of ascending StdIND value and then sensitivity and specificity for classification of neoplasia were calculated at every StdIND value. The cutoff value for number of lumens was held constant at one lumen because biologically normal ducts are expected to only contain a single lumen. AUC was calculated based on the resulting ROC curve.