A platform to automatically build balance-based disease prediction models and discover microbial biomarkers from microbiome data


Identifying key taxonomic biomarkers and restoring commensal gut microbiota are vital for the precision diagnosis and treatment of many human diseases. DisBalance was developed to address the key issues in microbiome-based binary classification:

  • The Distal DBA (distal discriminative balance analysis) method is embedded to select a set of distal discriminative balances as input features for regularized LR (logistic regression) modelling.
  • 2111 gut-distributed disease-taxon associations were extracted from MicroPhenoDB as built-in evidence.
  • Multiple strategies for mining microbial biomarkers are supplied to facilitate both model-driven and knowledge-driven discovery.
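
For orientation, a balance in compositional data analysis is usually defined as a scaled log-ratio of the geometric means of two groups of taxa, and a logistic regression model maps the selected balances to a class probability. The formulas below are a sketch of that standard form, not taken from the DisBalance documentation. The notation is an assumption: x_+ and x_- are the abundances of the r numerator and s denominator taxa, g(.) is the geometric mean, b_1, ..., b_k are the selected balances, and beta_0, ..., beta_k are the fitted coefficients. The regularization penalty is omitted because the text does not state which one is used.

```latex
b = \sqrt{\frac{r\,s}{r+s}}\;\ln\frac{g(x_{+})}{g(x_{-})},
\qquad
P(\text{disease} \mid b_1,\dots,b_k)
  = \frac{1}{1 + \exp\!\bigl(-(\beta_0 + \sum_{j=1}^{k} \beta_j b_j)\bigr)}
```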

A complete analysis of a UC (ulcerative colitis) demo dataset from GMrepo, presented on the "demo" tab, showcases the use of DisBalance.

Figure: Workflow of DisBalance (metagenomic balance network)

Model Building

Only two files are required to build a balance-based disease prediction model:

  1. Feature data: rows are feature IDs and columns are sample IDs. The abundances may be either raw counts or relative abundances.
  2. Sample metadata: two columns are required. The first column gives the sample ID and the second gives the class name of that sample. If the MicroPhenoDB evidence is to be used for subsequent taxonomic biomarker discovery, the disease name must be given as its MeSH (Medical Subject Headings) ID.

Check the Demo part for the detailed tutorial.
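
The two input tables described above can be sketched and sanity-checked as follows. This is a minimal illustration under stated assumptions, not the format DisBalance enforces: the tab delimiter, the header rows, the feature and sample names, the "control" label and all helper function names are hypothetical. D003093 is the MeSH ID for ulcerative colitis, used here as the disease class name.

```python
import csv
import io

# Hypothetical feature table: rows are feature IDs, columns are sample IDs.
FEATURES_TSV = """feature_id\tS1\tS2\tS3
g__Bacteroides\t120\t80\t15
g__Faecalibacterium\t30\t55\t2
g__Escherichia\t0\t4\t60
"""

# Hypothetical metadata: sample ID first, class name second.
METADATA_TSV = """sample_id\tclass
S1\tcontrol
S2\tcontrol
S3\tD003093
"""

def read_features(text):
    """Return (sample_ids, {feature_id: [abundance per sample]})."""
    rows = list(csv.reader(io.StringIO(text), delimiter="\t"))
    sample_ids = rows[0][1:]
    table = {row[0]: [float(v) for v in row[1:]] for row in rows[1:]}
    return sample_ids, table

def read_metadata(text):
    """Return {sample_id: class_name}, skipping the header row."""
    rows = list(csv.reader(io.StringIO(text), delimiter="\t"))
    return {row[0]: row[1] for row in rows[1:]}

def check_inputs(sample_ids, table, labels):
    """Check that the two tables agree and describe a binary problem."""
    missing = [s for s in sample_ids if s not in labels]
    if missing:
        raise ValueError(f"samples without a class label: {missing}")
    classes = sorted({labels[s] for s in sample_ids})
    if len(classes) != 2:
        raise ValueError(f"expected exactly two classes, got {classes}")
    for feature_id, values in table.items():
        if len(values) != len(sample_ids):
            raise ValueError(f"{feature_id}: wrong number of columns")
        if any(v < 0 for v in values):
            raise ValueError(f"{feature_id}: negative abundance")
    return classes

sample_ids, table = read_features(FEATURES_TSV)
labels = read_metadata(METADATA_TSV)
classes = check_inputs(sample_ids, table, labels)
```

Running a check like this before upload catches the most common mismatch, namely sample IDs that appear in the feature data but not in the metadata.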

Risk Prediction

Once the optimized disease prediction model is established, it can be applied to predict the disease risk of new samples. When the balance-based model, the SBP (sequential binary partition) matrix and the new feature data are submitted, a disease risk probability is calculated for each sample.
Check the Demo part for the detailed tutorial.
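
The prediction step can be sketched as below: balances are computed from the new abundances using the SBP matrix and then passed through a logistic function. This is a minimal sketch under assumptions, not the DisBalance implementation. It assumes each SBP row is one balance with +1 for numerator taxa, -1 for denominator taxa and 0 for unused taxa, that balances use the standard scaled log-ratio of geometric means, and that zeros are handled with a pseudocount. The function names, the pseudocount and all numbers are made up for illustration.

```python
import math

def balances_from_sbp(abundances, sbp, pseudocount=0.5):
    """Compute one balance per SBP row for a single sample."""
    logs = [math.log(a + pseudocount) for a in abundances]  # pseudocount avoids log(0)
    values = []
    for signs in sbp:
        num = [l for l, sgn in zip(logs, signs) if sgn > 0]
        den = [l for l, sgn in zip(logs, signs) if sgn < 0]
        if not num or not den:
            raise ValueError("each balance needs numerator and denominator taxa")
        r, s = len(num), len(den)
        scale = math.sqrt(r * s / (r + s))
        # The mean of logs equals the log of the geometric mean.
        values.append(scale * (sum(num) / r - sum(den) / s))
    return values

def predict_risk(abundances, sbp, intercept, coefficients, pseudocount=0.5):
    """Logistic model applied to the balances of one sample."""
    b = balances_from_sbp(abundances, sbp, pseudocount)
    z = intercept + sum(c * v for c, v in zip(coefficients, b))
    return 1.0 / (1.0 + math.exp(-z))

# Illustrative inputs only: 2 balances over 4 taxa, made-up coefficients.
sbp = [[1, 1, -1, -1],
       [1, -1, 0, 0]]
intercept, coefficients = -0.2, [0.8, -1.1]
new_samples = {"N1": [120, 30, 0, 45], "N2": [5, 2, 60, 80]}

risks = {sid: predict_risk(x, sbp, intercept, coefficients)
         for sid, x in new_samples.items()}
```

Because balances are log-ratios, the same function works for raw counts and relative abundances as long as the taxa order matches the columns of the SBP matrix.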

Biomarker Discovery

Upload the LR coefficients of the balances and the SBP matrix generated in the steps above to infer taxon-disease associations for the taxa in the top n selected balances. DisBalance supports automated inference using either the evidenced taxon-disease associations in MicroPhenoDB or user-defined taxon-disease associations. The inference results can be explored interactively in the output panel.
Check the Demo part for the detailed tutorial.
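
The selection and lookup described above can be sketched as follows. This is a hedged illustration, not the DisBalance procedure: it assumes that "top n" means the n balances with the largest absolute LR coefficient, that SBP rows use +1/-1/0 signs, and that associations are supplied as a user-defined mapping. The taxon names, coefficients, association entries and function names are placeholders, not real MicroPhenoDB records.

```python
def top_balance_taxa(coefficients, sbp, taxa, n):
    """Pick the n balances with the largest |coefficient| and list their taxa."""
    ranked = sorted(range(len(coefficients)),
                    key=lambda j: abs(coefficients[j]),
                    reverse=True)[:n]
    selected = []
    for j in ranked:
        selected.append({
            "balance": j,
            "coefficient": coefficients[j],
            "numerator": [t for t, sgn in zip(taxa, sbp[j]) if sgn > 0],
            "denominator": [t for t, sgn in zip(taxa, sbp[j]) if sgn < 0],
        })
    return selected

def annotate(selected, associations, disease):
    """Look up each taxon of the selected balances in the association table."""
    hits = {}
    for entry in selected:
        for taxon in entry["numerator"] + entry["denominator"]:
            if (taxon, disease) in associations:
                hits[taxon] = associations[(taxon, disease)]
    return hits

# Placeholder inputs for illustration only.
taxa = ["taxon_A", "taxon_B", "taxon_C", "taxon_D"]
sbp = [[1, 1, -1, -1],
       [1, -1, 0, 0],
       [0, 0, 1, -1]]
coefficients = [0.1, -1.1, 0.8]

# User-defined associations keyed by (taxon, disease MeSH ID); values are made up.
associations = {
    ("taxon_B", "D003093"): "decrease",
    ("taxon_C", "D003093"): "increase",
    ("taxon_A", "D000000"): "increase",
}

selected = top_balance_taxa(coefficients, sbp, taxa, n=2)
hits = annotate(selected, associations, disease="D003093")
```

Taxa in the selected balances with no entry in the association table are left unannotated, which separates model-driven candidates from those that also have prior evidence.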

Contact Us