Fig. 2 | BMC Systems Biology

Fig. 2

From: A computational framework for complex disease stratification from multiple large-scale datasets

Process proposed for handling high levels of non-random missing data. If fewer than 10% of values are missing, the data are imputed, then tested for association (the imputation process can create artificial associations, which would skew the downstream analysis) and submitted to a sensitivity analysis. If more than 10% of values are missing, we either collapse the feature/patient to a binary (presence/absence) scheme and run a χ² test for a difference in detection rates, or explore several imputation methods and interpret the results with great caution.
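
The decision rule in this caption can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the use of `None` for missing values, median imputation, the two-group labels, and the handling of exactly 10% missing (routed to the binary branch here) are all hypothetical choices. The association test and sensitivity analysis that follow imputation, and the alternative of comparing several imputation methods, are not shown.

```python
import math
from statistics import median

MISSING_THRESHOLD = 0.10  # the 10% cut-off named in the caption


def missing_fraction(values):
    """Fraction of entries that are missing (None)."""
    return sum(v is None for v in values) / len(values)


def impute_median(values):
    """Fill missing entries with the median of the observed ones.

    Median imputation is an assumption made for this sketch; the caption
    does not say which imputation method is used.
    """
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def chi2_detection_test(values, groups):
    """Chi-square test (1 df, no continuity correction) on the 2x2 table
    of detected vs. missing counts in two groups."""
    labels = sorted(set(groups))
    if len(labels) != 2:
        raise ValueError("this sketch handles exactly two groups")
    counts = {g: [0, 0] for g in labels}  # [detected, missing] per group
    for v, g in zip(values, groups):
        counts[g][0 if v is not None else 1] += 1
    (a, b), (c, d) = counts[labels[0]], counts[labels[1]]
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0, 1.0  # a row or column is empty: no evidence of a difference
    stat = n * (a * d - b * c) ** 2 / denom
    # Survival function of the chi-square distribution with 1 degree of freedom
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


def handle_feature(values, groups):
    """Route one feature through the imputation or the binary branch."""
    frac = missing_fraction(values)
    if frac < MISSING_THRESHOLD:
        return {"route": "impute", "missing_fraction": frac,
                "values": impute_median(values)}
    stat, p_value = chi2_detection_test(values, groups)
    return {"route": "binary", "missing_fraction": frac,
            "chi2": stat, "p_value": p_value}
```

For example, `handle_feature([1.0] * 19 + [None], ["A"] * 10 + ["B"] * 10)` takes the imputation branch because 5% of values are missing, while a feature observed in every patient of group A and in none of group B takes the binary branch and yields a large χ² statistic.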
