Dataset | Age range (years) | Median age (years) | Mean age (years) | SD of age (years) | Sample size | Outlier | Type | Method |
---|---|---|---|---|---|---|---|---|
GSE88883 | 18–82 | 37 | 37.2 | 13.6 | 100 | 0 | N | 450K |
GSE74214 | 13–80 | 54.5 | 49 | 20.8 | 18 | 3 | N | 450K |
GSE101961 | 17–76 | 38 | 38.2 | 12.2 | 121 | 2 | N | 450K |
GSE69914 | 18–80 | 51 | 49.5 | 14.4 | 49 | 10 | N | 450K |
GSE69914 | 30–86 | 51 | 51.1 | 12.2 | 42 | 7 | N-adj | 450K |
GSE160233 | 33–82 | 50 | 52 | 14.3 | 29 | NA | N-adj | DREAM |
TCGA | 28–90 | 56.5 | 57.5 | 15.3 | 97 | 17 | N-adj | 450K |
- DNA methylation datasets in normal and normal-adjacent breast tissues. Summary of the characteristics of the publicly available and in-house-generated DNA methylation datasets used in this study. Raw (IDAT) files were downloaded for all other 450K datasets, and each dataset was re-normalized within itself to match the normalization of the GSE69914 dataset, for which raw files were not available. The data were normalized using beta-mixture quantile normalization (BMIQ) through the ChAMP R package. The DREAM dataset was generated in-house as described in the “Methods” section. The “Outlier” column indicates the number of outliers identified in each dataset.
- N normal, N-adj normal-adjacent, NA not applicable