The machine-learning pipeline presented here uses Mapper, an algorithm that summarises complex data as a graph and helps uncover hidden patterns and groups within it. Mapper is especially useful for finding subgroups that share similar characteristics.
Developed by Dr. Ewan Carr and Dr. Raquel Iniesta, the pipeline applies recent methods in topological data analysis (TDA) to identify groups of patients who share common traits related to a specific feature of interest.
Pipeline link: https://github.com/kcl-bhi/mapper-pipeline
Corresponding paper: Carr, E., Carrière, M., Michel, B., Chazal, F., & Iniesta, R. (2021). Identifying homogeneous subgroups of patients and important features: a topological machine learning approach. BMC Bioinformatics, 22, 1-7. https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-021-04360-9
Video presentation: https://youtu.be/8dFKgj5CN5U
Abstract: This project exploits recent developments in topological data analysis to present a pipeline for clustering based on Mapper, an algorithm that reduces complex data into a one-dimensional graph. The pipeline identifies and summarises clusters based on statistically significant topological features of a point cloud. Its key strengths include the integration of prior knowledge to inform the clustering process and the selection of optimal clusters; the use of the bootstrap to restrict the search to robust topological features; the use of machine learning to inspect clusters; and the ability to incorporate mixed data types.
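To make the idea of "reducing complex data into a one-dimensional graph" more concrete, here is a minimal, standard-library-only sketch of the core Mapper construction. It is not the pipeline's implementation. The function name `mapper_graph`, its parameters (`n_intervals`, `overlap`, `eps`) and the toy data are all hypothetical, and simple distance-threshold (single-linkage) clustering stands in for whatever clusterer a real analysis would use. The sketch covers the range of a filter function with overlapping intervals, clusters the points that fall in each interval, and links clusters that share points.

```python
from itertools import combinations
from math import dist


def _single_linkage(points, members, eps):
    """Split `members` into connected components of the 'distance <= eps' graph."""
    parent = {i: i for i in members}

    def find(i):
        # Follow parent links to the root, compressing the path as we go
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(members, 2):
        if dist(points[i], points[j]) <= eps:
            parent[find(i)] = find(j)

    clusters = {}
    for i in members:
        clusters.setdefault(find(i), set()).add(i)
    return [frozenset(c) for c in clusters.values()]


def mapper_graph(points, filter_values, n_intervals=3, overlap=0.3, eps=1.0):
    """Toy Mapper (illustrative only): returns (nodes, edges).

    Each node is a frozenset of point indices; each edge is a pair of node indices.
    """
    lo, hi = min(filter_values), max(filter_values)
    length = (hi - lo) / n_intervals
    pad = overlap * length  # every interval is widened by this much on both sides

    nodes = []
    for k in range(n_intervals):
        a = lo + k * length - pad
        b = lo + (k + 1) * length + pad
        # Pull the interval back to the data: keep points whose filter value is in [a, b]
        members = [i for i, f in enumerate(filter_values) if a <= f <= b]
        nodes.extend(_single_linkage(points, members, eps))

    # Two nodes are connected whenever their clusters share at least one point
    edges = {
        (u, v)
        for u, v in combinations(range(len(nodes)), 2)
        if nodes[u] & nodes[v]
    }
    return nodes, edges


# Toy data (assumed for illustration): ten points on a line, filtered by x-coordinate
points = [(float(x), 0.0) for x in range(10)]
filter_values = [p[0] for p in points]

nodes, edges = mapper_graph(points, filter_values)
print([sorted(n) for n in nodes])  # [[0, 1, 2, 3], [3, 4, 5, 6], [6, 7, 8, 9]]
print(sorted(edges))               # [(0, 1), (1, 2)]
```

In this toy run the ten points collapse to three overlapping clusters joined in a path, which is the kind of graph summary Mapper produces. The bootstrap step, the use of prior knowledge and the handling of mixed data types described in the abstract are not covered by this sketch.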