| 1 |
|
Subtype specific breast cancer event prediction
We investigate the potential to enhance breast cancer event predictors by exploiting subtype information. We do this with a two-stage approach that first determines a sample's subtype using a recent module-driven approach, and then constructs a subtype-specific predictor to predict a metastasis event within five years. Our methodology is validated on a large compendium of microarray breast cancer datasets, including 43 replicate array pairs for assessing subtyping stability. Note that stratifying by subtype strongly reduces the training set sizes available to construct the individual predictors, which may decrease performance. Besides sample size, other factors, like unequal class distributions and differences in the number of samples per subtype, easily obscure a fair comparison between subtype-specific predictors constructed on different subtypes, and also between subtype-specific and subtype a-specific predictors. Therefore, we constructed a completely balanced experimental design, in which none of the above factors play a role, and show that subtype-specific event predictors clearly outperform predictors that do not take subtype information into account.
|
|
| 2 |
|
An evaluation protocol for subtype-specific breast cancer event prediction
Motivation: In recent years, increasing evidence has appeared that breast cancer may not constitute a single disease at the molecular level, but comprises a heterogeneous set of subtypes. This suggests that instead of building a single predictor, better predictors might be constructed that solely target samples of a designated subtype. An unavoidable drawback of developing subtype-specific predictors, however, is that a stratification by subtype drastically reduces the number of samples available for their construction. It is therefore questionable whether the potential benefit of subtyping can outweigh the drawback of a severe loss in sample size. Factors like unequal class distributions and differences in the number of samples per subtype further complicate comparisons. Results: We present several evaluation strategies that facilitate a comprehensive comparison between subtype-specific predictors and predictors that do not take subtype information into account. Emphasis lies on careful control of sample size as well as class and subtype distributions. The methodology is applied to a large breast cancer compendium involving over 1500 arrays, using a state-of-the-art subtyping scheme. We show that the resulting subtype-specific predictors outperform those that do not take subtype information into account, especially when sample size considerations are taken into account.
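The control of sample size and of class and subtype distributions described above can be illustrated with a minimal sketch. This is not the authors' protocol: the function name `balanced_subsample`, the tuple layout, and the toy data are all assumptions made for illustration. It simply draws the same number of samples from every (subtype, event) combination, so that predictors trained on different subtypes see equally sized, equally balanced training sets.

```python
import random
from collections import Counter, defaultdict


def balanced_subsample(samples, n_per_cell, seed=0):
    """Draw exactly n_per_cell samples for every (subtype, event) cell.

    samples: list of (sample_id, subtype, event) tuples (hypothetical layout).
    Raises ValueError if the smallest cell holds fewer than n_per_cell samples.
    """
    rng = random.Random(seed)
    cells = defaultdict(list)
    for sample in samples:
        _, subtype, event = sample
        cells[(subtype, event)].append(sample)

    smallest = min(len(members) for members in cells.values())
    if n_per_cell > smallest:
        raise ValueError(
            f"n_per_cell={n_per_cell} exceeds smallest cell size {smallest}"
        )

    chosen = []
    for key in sorted(cells):  # fixed order keeps the draw reproducible
        chosen.extend(rng.sample(cells[key], n_per_cell))
    return chosen


# Toy data (invented): two subtypes with unequal sizes and class ratios.
cell_sizes = {("A", 0): 12, ("A", 1): 5, ("B", 0): 7, ("B", 1): 4}
toy_samples = []
for (subtype, event), size in cell_sizes.items():
    for i in range(size):
        toy_samples.append((f"{subtype}{event}_{i}", subtype, event))

balanced = balanced_subsample(toy_samples, n_per_cell=4)
cell_counts = Counter((subtype, event) for _, subtype, event in balanced)
print(cell_counts)
```

The smallest cell bounds the size of every training set, which is the loss in sample size that the abstract weighs against the benefit of subtyping.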
|
|
| 3 |
|
Neonatal mortality prediction using real-time medical measurements
Current neonatal illness scoring systems are not designed to predict outcomes for individual patients, but rather can provide an overview of a population of patients for objective comparison when reporting outcomes. Having more patient-specific predictions may help physicians make better treatment decisions in a Neonatal Intensive Care Unit (NICU) environment. We developed neonatal mortality prediction models using C5.0 decision tree software that met criteria for clinically useful results (>50-60% sensitivity, >90% specificity) for individual patients using data from real-time medical measurement devices. The models were evaluated to identify: (1) the model with the best performance based on minimizing false positives, and (2) the attributes used most often in the best clinically useful models. Performance results showed that the mortality model using summary data during the first 48 hours after NICU admission provided, on average, the highest sensitivity and specificity with the fewest false positives (sensitivity=63%, specificity=94%, positive predictive value=38%), exceeding the performance criteria requested by our clinical partners. The attributes used most often in the best models for predicting mortality with our data were: mean blood pressure, serum pH, immature/total neutrophil ratio, serum sodium, serum glucose, respiratory rate, heart rate, and pO2 blood oxygen level.
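As a reminder of how the three reported measures relate to a confusion matrix, here is a minimal sketch. The counts are invented for illustration and are not taken from the study; the function names and the choice of 0.5 as the lower sensitivity bound (the abstract states ">50-60%") are likewise assumptions.

```python
def confusion_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, positive predictive value)."""
    sensitivity = tp / (tp + fn)  # fraction of deaths that were flagged
    specificity = tn / (tn + fp)  # fraction of survivors correctly cleared
    ppv = tp / (tp + fp)          # fraction of flagged patients who died
    return sensitivity, specificity, ppv


def meets_criteria(sensitivity, specificity, min_sens=0.5, min_spec=0.9):
    """Check the clinical usefulness thresholds (bounds are assumptions)."""
    return sensitivity > min_sens and specificity > min_spec


# Hypothetical counts: 10 deaths among 110 patients.
sens, spec, ppv = confusion_metrics(tp=6, fp=10, tn=90, fn=4)
print(sens, spec, ppv)

# The figures reported in the abstract satisfy the stated criteria.
reported_ok = meets_criteria(0.63, 0.94)
```

With a rare outcome, even a small false-positive rate among the many survivors can outnumber the true positives, which is why a model can combine high specificity with a modest positive predictive value.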
|
|
| 4 |
|
Prediction of extubation failure for neonates with respiratory distress syndrome using the MIMIC-II Clinical Database
Extubation failure (EF) is an ongoing problem in the neonatal intensive care unit (NICU). Nearly 25% of neonates fail their first extubation attempt, requiring re-intubations that are associated with risk factors and financial costs. We identified 179 mechanically ventilated neonatal patients who were intubated within 24 hours of birth in the MIMIC-II intensive care database. We analyzed data from the patients 2 hours prior to their first extubation attempt, and developed a prediction algorithm to distinguish patients whose extubation attempt was successful from those who had EF. From an initial list of 57 candidate features, our machine learning approach narrowed down to six features useful for building an EF prediction model: monocyte cell count, rapid shallow breathing index, fraction of inspired oxygen (FiO2), heart rate, PaO2/FiO2 ratio (where PaO2 is the partial pressure of oxygen in arterial blood), and work of breathing index. The algorithm achieved an area under the receiver operating characteristic curve (AUC) of 0.871 and a sensitivity of 70.1% at 90% specificity.
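The two reported measures, AUC and sensitivity at a fixed specificity, can be computed from classifier scores as in the sketch below. This is a generic illustration, not the authors' pipeline: the function names and the toy scores are assumptions, and a higher score is taken to mean a higher predicted risk of EF.

```python
def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a pair.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def sensitivity_at_specificity(scores, labels, target_spec=0.90):
    """Return (sensitivity, threshold) at the lowest threshold that
    reaches target_spec; a sample is called positive if score > threshold."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    for t in sorted(set(scores)):  # specificity is non-decreasing in t
        specificity = sum(1 for s in neg if s <= t) / len(neg)
        if specificity >= target_spec:
            sensitivity = sum(1 for s in pos if s > t) / len(pos)
            return sensitivity, t
    raise ValueError("no threshold found")  # unreachable for non-empty input


# Invented toy scores: 10 successful extubations (0) and 5 failures (1).
neg_scores = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.35, 0.40, 0.45, 0.70]
pos_scores = [0.33, 0.50, 0.60, 0.80, 0.90]
toy_scores = neg_scores + pos_scores
toy_labels = [0] * len(neg_scores) + [1] * len(pos_scores)

toy_auc = auc(toy_scores, toy_labels)
toy_sens, toy_threshold = sensitivity_at_specificity(toy_scores, toy_labels)
print(toy_auc, toy_sens, toy_threshold)
```

Fixing the specificity first and then reading off the sensitivity mirrors how the abstract reports its operating point; the AUC summarizes performance across all thresholds.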
|
|