Non-invasive optical spectroscopic monitoring of breast development during puberty
- Lothar Lilge†1, 2,
- Mary Beth Terry†3, 4,
- Jane Walter2,
- Dushanthi Pinnaduwage5,
- Gord Glendon5,
- Danielle Hanna5,
- Mai-Liis Tammemagi5,
- Angela Bradbury6,
- Saundra Buys7,
- Mary Daly8,
- Esther M. John9, 10,
- Julia A. Knight5, 11 and
- Irene L. Andrulis5, 12
© The Author(s). 2017
Received: 16 November 2016
Accepted: 12 January 2017
Published: 6 February 2017
Tanner staging (TS), a five-stage classification ranging from no breast tissue (TS1) to full breast development (TS5), is used in both health research and clinical care to assess the onset of breast development (TS2) and the duration of each stage. Currently, TS is assessed both visually and through palpation, but non-invasive, objective methods could improve comparisons across settings.
We used optical spectroscopy (OS) measures from 102 girls at the Ontario site of the LEGACY girls study (average age 12 years, range 10.0–15.4 years) to determine whether breast tissue optical properties map to each TS. We further examined whether these properties differed by age, body mass index (BMI), and breast cancer risk score (BCRS) by examining the major principal components (PC).
Age and BMI increased linearly with increasing TS. Eight PCs explained 99.9% of the variation in OS data. Unlike the linear increase with age and BMI, OS components had distinct patterns by TS: the onset of breast development (TS1 to TS2) was marked by elevation of PC3 scores indicating an increase in adipose tissue and decrease in signal from the pectoral muscle; transition to TS3 was marked by elevation of PC6 and PC7 and decline of PC2 scores indicating an increase in glandular or dense tissue; and transition to TS4+ by decline of PC2 scores representing a further increase in glandular tissue relative to adipose tissue. Of the eight PCs, three component scores (PC4, PC5, and PC8) remained in the best-fitting model of BCRS, suggesting different levels of collagen in the breast tissue by BCRS.
Our results suggest that serial measures of OS, a non-invasive assessment of breast tissue characteristics, can be used as an objective outcome that does not rely on visual inspection or palpation, for studying drivers of breast development.
Keywords: Optical spectroscopy; Breast development; Breast cancer family history; Tanner staging; LEGACY girls study
Breast cancer (BC) incidence is increasing in women under age 40 years in the US and is increasingly common worldwide in women under age 50 years [2, 3]. Decline in the age of breast development may account for some of the change. Age of menarche, a long-established risk factor for breast cancer, has been relatively stable in recent decades. As the interval between early breast development and the age at menarche (referred to as pubertal tempo), during which the breast may be more susceptible to carcinogens, has widened, it is essential to have other measures of pubertal development. Height, age at breast development, age at menarche, and increased tempo were each independently associated with an increase in BC risk in a large prospective cohort study. Compared with height and age at menarche, age at breast development has been more challenging to determine.
Breast development is often assessed using Tanner stages (TS), which are routinely used in clinical evaluation. TS range from TS1 to TS5 and are evaluated separately for breast and pubic hair. This paper focuses on breast TS: TS1 refers to no breast development; TS2 to the first appearance of breast buds; TS3 to the stage where the areola and breast are larger than just buds but the areola does not stick out away from the breast; TS4 to the stage where the nipple is raised above the breast; and TS5 to the mature breast. Tanner stage is generally assessed by a clinician using visual inspection followed by palpation, but can also be evaluated by self-reporting or maternal reporting using drawings of TS with explanatory text. Parental reports and self-reports of TS have been less reliable and valid than clinician reports, with parents being more accurate reporters of TS in children before age 11 years and children being more accurate reporters after age 11.
Breast development can also be tracked through imaging methods, although most imaging methods, such as dual-energy x-ray absorptiometry, magnetic resonance imaging, or mammography, are too expensive to use routinely in young girls and/or involve exposing the breast to ionizing radiation. Breast tissue composition is associated with mammographic breast density (MBD), which represents the connective and glandular versus the adipose tissue fraction [10–12]. The tissue components giving rise to MBD have distinct optical absorption spectra, which led to the development of optical spectroscopy (OS) methods to examine breast tissue composition using visible and near-infrared light. OS has been shown to identify women of mammographic screening age who have >75% MBD and who are at elevated risk of BC, with sensitivity and specificity >0.9 [14, 15]. Studies in younger women (31–40 years of age) showed strong associations with parity, another well-established BC risk factor. Here we present an extension of the OS technique adapted for the developing breast of girls aged ≥10 years, to demonstrate the utility of this method for detecting breast development TS, adjusting for age, BMI, and breast cancer risk score (BCRS). We further examined whether BCRS was associated with OS components.
The participants in this study were from the Ontario site of the LEGACY girls study, an NCI-funded prospective cohort of 1040 girls enrolled at ages 6–13 years at five study sites in the US and Canada. Half of the girls come from families with a positive BC family history (BCFH+), defined as having at least one first- or second-degree relative diagnosed with BC. Girls without a breast cancer family history (BCFH-) had no first- or second-degree relative with BC. All participating institutions obtained Institutional Review Board approval (for more details see www.legacygirlstudy.org).
Of the girls from the Ontario site who were 10 years and older and who were invited to participate in the OS study, 93% accepted and completed baseline and follow-up measures. There were 105 Ontario girls initially eligible for this pilot study, with 102 complete datasets for analysis.
OS instrumentation and data preparation for final analysis
The OS approach was similar to that previously described in adult women [18, 19] except that it used light diffusely reflected from the tissue rather than transmitted through the breast, as TS1 to TS3 do not provide sufficient tissue to place the optical fiber bundles at opposite sides of the breast for transmission experiments [20, 21]. Reflectance quantification covered the 635–1060 nm spectral range. A 5-mm fiber bundle delivered broadband light from a halogen lamp to the skin surface and a 3-mm fiber bundle collected the diffusely reflected photons, guiding them to the holographic transmission spectrophotometer (PPO, Kitchener, ON, Canada) with a cooled 256 × 1440 pixel CCD (Photometrics, NJ, USA). A black flexible template (shown in Additional file 1: Figure S1A) provided reproducible inter-optode distances and absorbed all photons reaching the surface. The participant was in the supine position and optical measurements were made at four quadrants (Additional file 1: Figure S1B), superior, lateral, inferior, and medial, on each breast, resulting in eight diffuse reflectance spectra per participant. The light source irradiance (approximately 180 mW cm−2) equals approximately twice the noontime solar exposure during the summer solstice in Boston, MA, USA, but does not contain UV or blue spectral components. Exposure times were 2–80 s per spectrum.
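As a rough check on the exposure figures above, the following Python sketch converts the reported irradiance and exposure-time range into radiant exposure per spectrum. The 100 mW cm−2 solar reference is an assumption introduced here (a nominal clear-sky value), not a figure taken from the study, and the helper name is hypothetical.

```python
# Values reported in the text.
IRRADIANCE_MW_CM2 = 180.0      # approximate light source irradiance
EXPOSURE_S = (2.0, 80.0)       # shortest and longest exposure per spectrum

# Assumption (not from the paper): nominal clear-sky solar irradiance.
SOLAR_REF_MW_CM2 = 100.0


def radiant_exposure_j_cm2(irradiance_mw_cm2, seconds):
    """Energy per unit area (J/cm^2) = irradiance (W/cm^2) x time (s)."""
    return irradiance_mw_cm2 / 1000.0 * seconds


low, high = (radiant_exposure_j_cm2(IRRADIANCE_MW_CM2, t) for t in EXPOSURE_S)
ratio_to_sun = IRRADIANCE_MW_CM2 / SOLAR_REF_MW_CM2

print(f"{low:.2f} to {high:.1f} J/cm^2 per spectrum")
print(f"{ratio_to_sun:.1f} times the assumed solar reference")
```

Under that assumed reference the ratio comes out a little below two, which is in line with the "approximately twice" stated in the text.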
Spectra were corrected for exposure time and dark signal, and a 7-point boxcar smoothing algorithm was applied, followed by a cubic spline interpolation to sample spectra at 1 nm increments. Spectra were corrected for variations in the instrument throughput using a high albedo reflection standard, resulting in effective light attenuation spectra. Corrected spectra were mean-centered for principal component analysis (PCA). While two inter-optode distances (1.5 and 3 cm) were used, the short distance at times resulted in suspected detector saturation effects and was not considered further in this analysis; thus, 840 spectra were used to determine orthogonal PCA spectra, reducing the dimensionality of information in each original spectrum. The first eight component vectors (PC1–PC8) (see Additional file 1: Figure S2) represent 99.99% of the variation seen in the complete dataset. Each principal component (PC) spectrum represents different optical tissue features, including light scattering by cellular and structural components, absorption dominated by the five main breast tissue components (water, lipid, oxy-hemoglobin (HbO2), deoxy-hemoglobin (Hb), and collagen), and residual absorption by as yet unidentified chromophores. As the breast develops homogeneously bilaterally and only tissue average properties are sought, each PC score was averaged over both breasts, resulting in an OS dataset comprising 105 girls, each having one score for each of the eight principal components (PC1–PC8).
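The processing chain described above can be illustrated with a minimal Python sketch using NumPy. It is not the authors' code: the function names, the logarithmic form of the attenuation, the edge handling of the smoothing filter, and the synthetic data are all assumptions made for illustration, and the cubic spline resampling step is omitted to keep the example short.

```python
import numpy as np


def boxcar_smooth(spectrum, width=7):
    """Moving-average (boxcar) filter; edges are handled by reflection."""
    pad = width // 2
    padded = np.pad(spectrum, pad, mode="reflect")
    return np.convolve(padded, np.ones(width) / width, mode="valid")


def to_attenuation(raw, dark, exposure_s, reference_rate):
    """Dark- and exposure-correct, smooth, then normalise to the standard.

    The -log(ratio) form of the attenuation is an assumption of this sketch.
    """
    rate = boxcar_smooth((raw - dark) / exposure_s)
    return -np.log(rate / reference_rate)


def pca(spectra, n_components=8):
    """PCA via SVD of mean-centred spectra (one spectrum per row)."""
    centred = spectra - spectra.mean(axis=0)
    u, s, vt = np.linalg.svd(centred, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    scores = u[:, :n_components] * s[:n_components]
    return scores, vt[:n_components], explained[:n_components]


# Synthetic stand-in data: 105 participants x 8 sites, 635-1060 nm at 1 nm.
rng = np.random.default_rng(0)
n_girls, n_sites, n_wavelengths = 105, 8, 426
spectra = rng.normal(size=(n_girls * n_sites, 5)) @ rng.normal(size=(5, n_wavelengths))
spectra += 0.01 * rng.normal(size=spectra.shape)

scores, components, explained = pca(spectra)

# Average the scores over all sites of each participant
# (rows are assumed to be ordered participant by participant).
per_girl = scores.reshape(n_girls, n_sites, -1).mean(axis=1)
print(per_girl.shape, float(explained.sum()))
```

Because the component spectra returned by the SVD are orthonormal, the per-participant averages can be taken directly on the scores without re-fitting the decomposition.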
Breast cancer risk score
We calculated a continuous probability score reflecting each girl’s estimated absolute lifetime risk of breast cancer. We estimated the BCRS based on available detailed pedigree data, allowing us to calculate a risk score using the breast and ovarian analysis of disease incidence and carrier estimation algorithm (BOADICEA) [22–24].
Complete data on age, body mass index (BMI) collected through clinical measures, Tanner breast stage assessed by a guardian, and BCRS were available for 102 of the 105 girls. Due to a small sample of girls in the TS5 group (n = 6) and the fact that some adolescent girls go directly from TS3 to TS5 without a TS4 or do not progress to TS5, we combined TS4 and TS5 for the analyses.
We used descriptive statistics to summarize the data. Analysis of variance (ANOVA) and univariate logistic regression were performed to identify covariates among PC1–PC8 scores, age, BMI, and BCRS that predict breast stage. We also used random forest analysis to examine the influence of all covariates together in the prediction, as multivariate logistic regression predictions are not reliable in a small dataset with many covariates. The features selected were used in multivariate logistic regression models. Linear discriminant analysis (LDA), with 60% of the data used as a training set and the rest used as a test set, was applied to measure the predictive ability. We also examined the ability of OS measurements to predict BCRS. The best predictive model was selected by Akaike’s information criterion (AIC). Before conducting the above analyses, each PC score and BCRS were rescaled by dividing by the corresponding interquartile range for meaningful interpretation of the results. Correlation, regression, ANOVA, and LDA analyses were performed using SAS 9.1 software (SAS Institute, Inc.), and the other analyses and plots were produced using R statistical software, version 2.15.0 (http://www.r-project.org).
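Two of the steps above, rescaling by the interquartile range and choosing a regression model by AIC, can be sketched in Python as follows. The study used SAS and R, so this is only an illustration: the exhaustive search over predictor subsets, the function names, and the synthetic data (including the arbitrary coefficients) are assumptions of the sketch, not the authors' procedure.

```python
from itertools import combinations

import numpy as np


def iqr_scale(x):
    """Divide each column (or a 1-D vector) by its interquartile range."""
    q1, q3 = np.percentile(x, [25, 75], axis=0)
    return x / (q3 - q1)


def ols_aic(X, y):
    """AIC of a Gaussian linear model with intercept.

    Uses n*ln(RSS/n) + 2k, dropping constants shared by all models.
    """
    n = len(y)
    design = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    rss = float(np.sum((y - design @ beta) ** 2))
    k = design.shape[1] + 1  # regression coefficients plus error variance
    return n * np.log(rss / n) + 2 * k


def best_subset_by_aic(X, y, names):
    """Exhaustive search over non-empty predictor subsets (assumed strategy)."""
    best_aic, best_names = np.inf, ()
    for size in range(1, X.shape[1] + 1):
        for cols in combinations(range(X.shape[1]), size):
            aic = ols_aic(X[:, list(cols)], y)
            if aic < best_aic:
                best_aic, best_names = aic, tuple(names[c] for c in cols)
    return best_aic, best_names


# Synthetic example: 102 girls, eight PC scores, outcome driven by three of them.
rng = np.random.default_rng(1)
names = [f"PC{i}" for i in range(1, 9)]
pcs = iqr_scale(rng.normal(size=(102, 8)))
outcome = 0.15 + 0.02 * (-pcs[:, 3] + pcs[:, 4] - pcs[:, 7])
outcome = iqr_scale(outcome + 0.005 * rng.normal(size=102))

aic, chosen = best_subset_by_aic(pcs, outcome, names)
print(chosen)
```

With eight candidate predictors there are only 255 subsets, so an exhaustive search is cheap; a stepwise search would be a reasonable alternative for larger predictor sets.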
Study cohort characteristics
[Table: Characteristics of the cohort. Only the row labels and summary statistics below survive.]
Age (years): 10 to <12; 12 to <14; ≥14. Mean age = 12.03, SD = 1.43, minimum = 10.01, maximum = 15.38
BMI: 10 to <15; 15 to <20; 20 to <25. Mean BMI = 18.58, SD = 3.11, minimum = 12.46, maximum = 33.18
Breast Tanner stage
Breast cancer risk category: 0.12 ≤ BCRS < 0.2; BCRS ≥ 0.2. Mean BCRS = 0.15, SD = 0.05, minimum = 0.11, maximum = 0.30
The association between OS measurements and breast TS
[Table: Association between breast Tanner stage and age, BMI, BCRS, and OS principal component scores, by breast Tanner stage]
[Table: Multinomial logistic regression results for TS2 versus TS1 breast stage prediction]
[Table: Binary logistic regression results for late (TS3–TS5) vs. early (TS1–TS2) breast stage prediction]
[Table: Binary logistic regression results for TS4–TS5 vs. TS1–TS3 breast stage prediction]
The association between OS measurements and BCRS
[Table: Breast cancer risk score (BCRS): simple and multivariate regression results. Panels: simple regression analysis; multiple regression analysis (best model by AIC)]
[Table: Breast cancer risk score (BCRS): simple and multivariate regression results in the late (TS3–TS5) subgroup. Panels: simple regression analysis; multiple regression analysis (best model by AIC)]
Using PCA of visible and near-infrared (NIR) spectra from breast tissue, we were able to capture over 99% of the variation in breast tissue optical properties through eight PCs. Unlike the linear increase with age and BMI, OS components had distinct patterns by TS, suggesting that OS can be used to objectively identify breast TS.
During early-stage breast development, the majority of the optical information pertains to the skin, subcutaneous tissue including the adipose tissue and the pectoral muscle, whereas for the later TS the optical signal of the pectoral muscle is replaced by the actual breast tissue. The PC scores that are correlated with each stage are sufficient to capture the changing ratios of muscle to adipose to glandular tissue within the optically sampled volume in girls’ chests during puberty.
Spectroscopically, the most striking features in the PC spectra are the strong peaks at 930 nm and 970 nm representing lipid and water absorption, respectively. These peaks both appear inversely in PC1 and are visible in PC2, PC3, PC5, and PC6, and are not statistically significant, reflecting a change in the adipose (lipid) and proliferating glandular (water) tissue. While the spectral components of the main tissue chromophores are overlapping (see Additional file 1: Figure S1C), the short wavelength range is dominated by the hemoglobins, whereas the long wavelength range is affected by collagen .
The current PCA analysis, while being somewhat difficult to visualize, nevertheless provides strong evidence of the ability to stage breast development in an objective manner. Each of the current PCs carries information on the various tissue chromophores as shown in Additional file 2: Table S3. The final separation of the chromophores requires significant additional computation. As Additional file 2: Table S3 illustrates, the separate PCs are related to a set of chromophores but it is the direction of these relationships and the strengths of these associations that change as the breast develops. In Additional file 2: Table S3, we show the correlation and the P values for PC1–8 and each chromophore. PC1, which accounts for the greatest variation, is dominated by the overall attenuation rather than the contributions of specific chromophores. The other components, however, reveal how there is additional adipose and dense tissue as the breast develops, that the ratio between the two changes, and that there is less signal from the pectoral muscle.
For example, PC2 scores are related to the amount of dense tissue, which increases as the breast matures from TS2 to TS4. At the transition from TS1 to TS2, which is the onset of breast development, PC3 scores become positive and remain positive through TS4, signaling an increase in lipids or adipose tissue as the breast develops. Thus, the onset of breast development is marked by an increase in adipose tissue. In addition, the PC3 scores have a large negative component at shorter wavelengths, indicating a reduction in hemoglobin and/or myoglobin within the optically measured tissue volume, that is, breast tissue with lower relative blood volume and less contribution from the pectoral muscle compared to TS1 (see Additional file 2: Table S3). The increased relative absorption by lipids at the expense of water, and hence glandular tissue, is also present, as shown by the declining contribution of the PC2 scores. Transition to TS3 was also marked by an increase in PC6 scores, reflecting additional lipid content, and an increase in PC7 scores, reflecting lower collagen.
Interestingly, although PC4 and PC5 scores did not map clearly to TS, they differed by BCRS. As Additional file 2: Table S3 reveals, high PC4 scores indicate increased collagen in the optically measured tissue volume and decreased hemoglobin content and oxygenation, and high PC5 scores indicate less lipid.
We identified OS-derived principal components (PC2, PC3, PC6, and PC7) that mapped to breast developmental stage. In particular, the complementarity of spectral features in PC2 and PC6 and the unique short wavelength absorption in PC3 are sufficient to capture the changing ratio of muscle to adipose to glandular tissue in girls’ chests during puberty, as noted by the multivariate regression results (Tables 3, 4, and 5) and the variable importance random forest plots (Additional file 1: Figures S4A-B). Thus, this preliminary study suggests that OS-derived measures have the potential to predict breast developmental stage in preteen and teen girls.
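The hold-out evaluation of predictive ability described in the methods (LDA trained on 60% of the data and tested on the remainder) can be sketched in Python as below. This is an illustration only: the study ran LDA in SAS, and the pooled-covariance formulation, the function names, and the synthetic four-feature data standing in for PC2, PC3, PC6, and PC7 scores are assumptions of the sketch.

```python
import numpy as np


def fit_lda(X, y):
    """Estimate class means, priors, and the pooled within-class precision."""
    classes = np.unique(y)
    means = np.array([X[y == c].mean(axis=0) for c in classes])
    priors = np.array([np.mean(y == c) for c in classes])
    centred = X - means[np.searchsorted(classes, y)]
    pooled = centred.T @ centred / (len(y) - len(classes))
    return classes, means, priors, np.linalg.pinv(pooled)


def predict_lda(model, X):
    """Assign each row to the class with the largest linear discriminant."""
    classes, means, priors, precision = model
    proj = means @ precision  # row k holds m_k' S^-1
    # Discriminant: x' S^-1 m_k - 0.5 m_k' S^-1 m_k + ln(prior_k)
    scores = X @ proj.T - 0.5 * np.sum(proj * means, axis=1) + np.log(priors)
    return classes[np.argmax(scores, axis=1)]


def split_60_40(n, rng):
    """Random 60% training / 40% test index split."""
    order = rng.permutation(n)
    cut = int(round(0.6 * n))
    return order[:cut], order[cut:]


# Synthetic example: 102 girls, stage groups 0..3 standing for TS1, TS2, TS3, TS4+.
rng = np.random.default_rng(2)
stage = np.arange(102) % 4
shift = np.array([1.5, -1.5, 1.0, 1.0])  # arbitrary per-stage mean shift
features = stage[:, None] * shift + rng.normal(size=(102, 4))

train, test = split_60_40(len(stage), rng)
model = fit_lda(features[train], stage[train])
accuracy = float(np.mean(predict_lda(model, features[test]) == stage[test]))
print(len(train), len(test), accuracy)
```

A single random split gives a noisy accuracy estimate with about 40 test samples, so repeating the split with different seeds would give a better sense of the variability.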
Furthermore, three OS-derived principal components (PC4, PC5, and PC8 scores) together best predicted BCRS. The PC4 and PC8 scores correlated negatively and significantly with BCRS, indicating that those with higher scores in these variables tend to come from BCFH- families. The PC5 scores correlated positively with BCRS, implying that those with higher scores in this variable tend to come from BCFH+ families. It is of interest that the lipid-water ratio, previously identified as a breast cancer risk factor in adult women, is not prominent in these spectra, but there is strong absorption at the short wavelengths and at long wavelengths beyond 970 nm; this suggests that the relative hemoglobin and collagen contributions may play a role in BCFH status.
We have found that a non-invasive imaging method can be used to accurately classify girls by breast developmental stage. As the onset of breast development and the duration in each stage may map to increased breast cancer susceptibility, studies of pubertal development can use objective OS imaging methods, either alone or in combination with more subjective measures of breast development based on maternal or self-report of breast development stages, to more accurately predict breast development changes over time.
AIC: Akaike’s information criterion
ANOVA: Analysis of variance
BCFH: Breast cancer family history
BCRS: Breast cancer risk score
BMI: Body mass index
LDA: Linear discriminant analysis
MBD: Mammographic breast density
PCA: Principal component analysis
The authors thank the LEGACY girls and family members for continuing contributions to the study, and our colleagues at the participating clinics. We also acknowledge the diligent work of Brenda Ornelas, Jennifer Xanthopoulos, Victoria Kuta, Jennifer Batchelor, Rohini Gosai, Pauline Susanto, and Nayana Weerasooriya, who assisted in data collection. We thank the contributing clinical centers (Clinical Genetics at Trillium Health Partners - Credit Valley Hospital, Cancer Risk Assessment Centre at the Juravinski Cancer Centre, Princess Margaret Hospital Familial Breast and Ovarian Cancer Clinic, Mount Sinai Familial Breast Cancer Clinic, and Granovsky Gluskin Family Medicine Centre of Mount Sinai Hospital).
This work was supported by the National Cancer Institute at the National Institutes of Health (Grants CA138638 to E.M. John, CA138819 to M.B. Daly, CA138822 to M.B. Terry, and CA138844 to I.L. Andrulis) and the Canadian Breast Cancer Foundation (I.L. Andrulis). L. Lilge acknowledges support from the Ontario Ministry of Health and Long Term Care. I.L. Andrulis holds the Anne and Max Tanenbaum Chair in Molecular Medicine at Mount Sinai Hospital and the University of Toronto.
Availability of data and materials
Please contact the corresponding author for additional information on how to obtain the study data.
LL designed the optical spectroscopy device, maintained instrument calibration, converted transmission measurements into effective attenuation data for further analysis, participated in the design and acquisition of the data, conceptualized the analyses, directed the data analysis and interpretation, and participated in writing the manuscript. MBT conceptualized the design of the overall parent study and participated in the assembly of the data, conceptualized the analyses, directed the data analysis and interpretation, and participated in writing the manuscript. JW designed the optical spectroscopy device, maintained instrument calibration, converted transmission measurements into effective attenuation data for further analysis, and participated in acquisition of the data and writing the manuscript. DP conceptualized the design of the analyses, analyzed the data, and participated in interpretation and in writing the manuscript. GG participated in the design of the overall parent study, acquisition of the data, and writing the manuscript. DH and MT participated in design, acquisition of data, and writing the manuscript. AB, SBB, MD, and EMJ conceptualized the design of the overall parent study and participated in interpretation of the data and writing the manuscript. JAK conceptualized the design of the overall parent study and participated in the acquisition of the data, analysis and interpretation, and writing the manuscript. ILA conceptualized the design of the study and the analyses presented, and participated in the acquisition of the data, data analysis and interpretation, and writing the manuscript. All authors approved the final manuscript as submitted and agreed to be accountable for all aspects of the work.
The authors declare that they have no competing interests.
Consent for publication
Ethics approval and consent to participate
All participating institutions obtained Institutional Review Board approval to conduct the study: Mount Sinai Hospital (#08-0281-A), University Health Network (#09-0379-CE), University of Utah (IRB 00047298), Fox Chase Cancer Center (#11-803), Columbia University (AAAC5578), Cancer Prevention Institute of California (2009-005), and the Committee for the Protection of Human Subjects of the California Health and Human Subjects Agency (12-12-0950). Mothers/guardians provided written informed consent, and the girls provided assent based on institutional standards. For more details see www.legacygirlstudy.org.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
- Johnson RH, Chien FL, Bleyer A. Incidence of breast cancer with distant involvement among women in the United States, 1976 to 2009. JAMA. 2013;309:800–5.
- Torre LA, et al. Global cancer incidence and mortality rates and trends–an update. Cancer Epidemiol Biomarkers Prev. 2016;25(1):16–27.
- Colditz GA, Rosner BA, Speizer FE. Risk factors for breast cancer according to family history of breast cancer. For the Nurses’ Health Study Research Group. J Natl Cancer Inst. 1996;88(6):365–71.
- Biro FM, Greenspan LC, Galvez MP. Puberty in girls of the 21st century. J Pediatr Adolesc Gynecol. 2012;25(5):289–94.
- Euling SY, et al. Examination of US puberty-timing data from 1940 to 1994 for secular trends: panel finding. Pediatrics. 2008;121 Suppl 3:S172–91.
- de Muinich Keizer SM, Mul D. Trends in pubertal development in Europe. Hum Reprod Update. 2001;7(3):287–91.
- Bodicoat DH, et al. Timing of pubertal stages and breast cancer risk: the Breakthrough Generations Study. Breast Cancer Res. 2014;16(1):R18.
- Morris NM, Udry JR. Validation of a self-administered instrument to assess stage of adolescent development. J Youth Adolesc. 1980;9(3):271–80.
- Terry MB, et al. Comparison of clinical, maternal, and self pubertal assessments: implications for health studies. Pediatrics. 2016;138(1):e20154571. doi:10.1542/peds.2015-4571.
- Boyd NF, et al. Mammographic density and the risk and detection of breast cancer. N Engl J Med. 2007;356(3):227–36.
- Byrne C, et al. Mammographic features and breast cancer risk: effects with time, age, and menopause status. J Natl Cancer Inst. 1995;87(21):1622–9.
- Nelson HD, et al. Risk factors for breast cancer for women aged 40 to 49 years: a systematic review and meta-analysis. Ann Intern Med. 2012;156(9):635–48.
- Blackmore KM, Knight JA, Lilge L. Association between transillumination breast spectroscopy and quantitative mammographic features of the breast. Cancer Epidemiol Biomarkers Prev. 2008;17(5):1043–50.
- Blyschak KSM, Jong R, Lilge L. Classification of breast tissue density by optical transillumination spectroscopy: optical and physiological effects governing predictive value. Med Phys. 2004;31(6):1398–414.
- Blackmore KM, Knight JA, Walter J, Lilge L. The association between breast tissue optical content and mammographic density in pre- and post-menopausal women. PLoS One. 2015;10(1):e0115851.
- Knight JA, et al. Optical spectroscopy of the breast in premenopausal women reveals tissue variation with changes in age and parity. Med Phys. 2010;37(2):419–26.
- John EM, et al. The LEGACY girls study: growth and development in the context of breast cancer family history. Epidemiology. 2016;27(3):438–48.
- Cerussi A, et al. In vivo absorption, scattering, and physiologic properties of 58 malignant breast tumors determined by broadband diffuse optical spectroscopy. J Biomed Opt. 2006;11(4):044005.
- Shah N, et al. Noninvasive functional optical spectroscopy of human breast tissue. Proc Natl Acad Sci U S A. 2001;98(8):4420–5.
- Dick SL, Lilge L. Optical reflectance spectroscopy for prospective studies on breast cancer risk in adolescent girls. Am J Epidemiol. 2006;163(11):S97–7.
- Simick MK, et al. Non-ionizing near-infrared radiation transillumination spectroscopy for breast tissue density and assessment of breast cancer risk. J Biomed Opt. 2004;9(4):794–803.
- Antoniou AC, et al. The BOADICEA model of genetic susceptibility to breast and ovarian cancer. Br J Cancer. 2004;91(8):1580–90.
- Antoniou AC, et al. The BOADICEA model of genetic susceptibility to breast and ovarian cancers: updates and extensions. Br J Cancer. 2008;98(8):1457–66.
- Lee AJ, et al. BOADICEA breast cancer risk prediction model: updates to cancer incidences, tumour pathology and web interface. Br J Cancer. 2014;110(2):535–45.
- Akaike H. A new look at the statistical model identification. IEEE Trans Autom Control. 1974;AC-19(6):716–23.
- Taroni P, et al. Seven-wavelength time-resolved optical mammography extending beyond 1000 nm for breast collagen quantification. Opt Express. 2009;17(18):15932–46.