Software

Hypothesis testing (multivariate, high-dimensional, non-Euclidean data)

Two-sample tests

  • gTests: Graph-based two-sample tests.

  • gCat: Two-sample tests for categorical data that use similarity information among the categories. Useful when the number of categories is large and the contingency table is sparsely populated.
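
As a rough illustration of the idea behind graph-based two-sample tests, the sketch below pools the two samples, builds a minimum spanning tree on the pooled points, and counts the edges that join observations from different samples. An unusually small count suggests the samples differ. This is a minimal sketch under stated assumptions, not the gTests interface: it assumes Euclidean distance, an edge-count statistic, and a permutation p-value, and the function names (`mst_edges`, `between_sample_edges`, `edge_count_test`) are hypothetical.

```python
import math
import random


def mst_edges(points):
    """Prim's algorithm on the complete Euclidean graph; returns n-1 edges (i, j)."""
    n = len(points)
    in_tree = [False] * n
    best = [math.inf] * n   # cheapest known distance from the tree to each node
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((parent[u], u))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v], parent[v] = d, u
    return edges


def between_sample_edges(edges, labels):
    """Number of edges whose endpoints carry different sample labels."""
    return sum(1 for i, j in edges if labels[i] != labels[j])


def edge_count_test(sample_x, sample_y, n_perm=999, seed=0):
    """Return (observed between-sample edge count, permutation p-value).

    Assumption: small counts are evidence against equal distributions,
    so the p-value is the share of label permutations with a count
    no larger than the observed one.
    """
    rng = random.Random(seed)
    pooled = list(sample_x) + list(sample_y)
    labels = [0] * len(sample_x) + [1] * len(sample_y)
    edges = mst_edges(pooled)            # the graph is fixed; only labels are permuted
    observed = between_sample_edges(edges, labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)
        if between_sample_edges(edges, labels) <= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)
```

Because only pairwise distances enter the graph, the same sketch would apply to non-Euclidean data by swapping `math.dist` for any other dissimilarity.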

Checking covariate matching in observational studies

Multi-sample tests

Change-point analysis (multivariate, high-dimensional, non-Euclidean data)

Offline change-point detection (segmentation)

  • gSeg: Graph-based change-point detection.

  • kerSeg: Kernel-based change-point detection.
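
The graph-based approach to offline change-point detection can be sketched along the same lines: build a similarity graph on the whole sequence, and for each candidate time t count the edges that connect observations up to t with observations after t. A count far below what random time orderings produce points to a change at t. The code below is a minimal sketch under stated assumptions, not the gSeg interface: it assumes a minimum spanning tree as the graph, a single change point, and standardization by permuting the time order; `crossing_counts`, `graph_scan`, and the `margin` parameter are hypothetical names.

```python
import math
import random


def mst_edges(points):
    """Prim's algorithm on the complete Euclidean graph; returns n-1 edges (i, j)."""
    n = len(points)
    in_tree = [False] * n
    best = [math.inf] * n
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((parent[u], u))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v], parent[v] = d, u
    return edges


def crossing_counts(edges, order, n):
    """R(t) for t = 1..n-1: edges joining the first t time points to the rest.

    order[i] is the 0-based time index assigned to observation i.
    """
    diff = [0] * (n + 1)
    for i, j in edges:
        lo, hi = sorted((order[i], order[j]))
        diff[lo + 1] += 1   # the edge starts crossing at split t = lo + 1
        diff[hi + 1] -= 1   # and stops crossing after split t = hi
    counts, running = [], 0
    for t in range(1, n):
        running += diff[t]
        counts.append(running)
    return counts


def graph_scan(points, n_perm=200, margin=3, seed=0):
    """Return (estimated change point t, standardized scan statistic).

    t means the change occurs after the first t observations. The mean and
    variance of R(t) are estimated by permuting the time order (an assumption
    made here for brevity); splits within `margin` of either end are skipped.
    """
    rng = random.Random(seed)
    n = len(points)
    edges = mst_edges(points)
    observed = crossing_counts(edges, list(range(n)), n)
    sums = [0.0] * (n - 1)
    squares = [0.0] * (n - 1)
    order = list(range(n))
    for _ in range(n_perm):
        rng.shuffle(order)
        for k, r in enumerate(crossing_counts(edges, order, n)):
            sums[k] += r
            squares[k] += r * r
    best_t, best_z = None, -math.inf
    for t in range(margin, n - margin + 1):
        mean = sums[t - 1] / n_perm
        var = squares[t - 1] / n_perm - mean ** 2
        if var <= 0:
            continue
        z = (mean - observed[t - 1]) / math.sqrt(var)  # large z: too few crossing edges
        if z > best_z:
            best_t, best_z = t, z
    return best_t, best_z
```

In practice the returned statistic would still need a significance threshold; the sketch only locates the most likely split.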

Online/Sequential detection (for streaming data)

  • gStream: Graph-based change-point detection for streaming data.

DNA copy number profiling

Allele-specific copy number profiling

  • falconx: for exome sequencing data.

  • falcon: for genome sequencing data.

Pipeline (on GitHub)

  • Marathon: Integrative pipeline for profiling DNA copy number and inferring tumor phylogeny.