1 Introduction

In this article, we explore the growing interface between deep learning and topology. We examine deep learning methods that make use of topological information to understand the shape of data, as well as the use of deep learning in calculating topological signatures. We broadly refer to this intersection of fields as topological deep learning. The advancements in topological deep learning have been enabled by the development of topological data analysis (TDA) over the last two decades.

TDA is a relatively recent amalgam of theory and algorithms that aims to obtain a geometric and topological understanding of data from real-world applications. The approach to data employed in TDA fundamentally differs from that in statistical learning. Rather than finding summary statistics, estimators, fitting approximate distributions, clustering, or training neural nets, TDA instead seeks to understand the properties of the geometric object, often a manifold, on which the data resides. This reflects the common intuition that data tends to lie on, or close to, a lower-dimensional manifold that is embedded in high-dimensional feature space. In this article, we sometimes refer to this as the data manifold.

The main goal of TDA is to infer information about the global structure of the data manifold, such as its connectivity and the presence of multi-dimensional holes. In the pure mathematical setting, this information is characterized by persistent homology and the related concept of Betti numbers, which count the n-dimensional holes in a manifold. With a finite set of data points, the Betti numbers are unavailable, but TDA employs various substitutes such as persistence diagrams and barcodes. An important property of the topological information obtained is its invariance to continuous deformation and scaling. This property also lends itself to robustness against perturbation and noise. Another benefit is the versatility of TDA methods, owed mostly to the abstract origins of algebraic topology: the methods are applicable to a wide variety of data types and objects, including point cloud data in Euclidean spaces, categorical data, images, and functions. TDA is backed by explainable theory but lacks the learning ability and other practical aspects of deep neural networks; neural networks, in turn, suffer from the need for large training datasets and billions of tunable parameters. Because of these differences, integrating TDA with deep neural networks poses a number of challenges.

Despite much recent activity in co-opting topological approaches in deep learning, it remains unclear what the leading approach should be, mostly because of computational and theoretical concerns. The TDA methods discussed in this paper form but a small part of the ever-expanding interface between topological data analysis and machine learning. It is important to state that this survey does not provide an exhaustive background on TDA theory and literature; for that, we refer the reader to excellent studies such as Pun et al. (2016).

Fig. 4 Topological deep learning introduces TDA methods to deep models, leading to topological neural architectures that can potentially address deep learning limitations. This is done by plugging in topological components for (a) learning feature embeddings (Sect. 3.1), (b) enhancing the learned representations (Sect. 3.2), and/or (c) regularizing the model using a topological loss (Sect. 3.3). Beyond that, (d) TDA can be used post-training to reveal insights into trained models (interpretability) (Sect. 3.4)

3 Topological deep learning (TDL)

Topological representations that incorporate structural information hold great promise for topological deep learning models (Hofer et al. 2017). Combining these cues with deep learning approaches has inherent benefits in various applications. On the flip side, deep learning approaches can be useful in overcoming some common hurdles faced by TDA approaches in estimating robust topological features. The incorporation of topological concepts into deep learning has only recently been investigated and the following benefits have been observed:

  • Global features that would otherwise be inaccessible via traditional feature maps can be extracted from input data efficiently and robustly.

  • TDA is versatile and adaptable, meaning that we are not limited to specific problems and types of data (such as images, sensor measurements, time series, graphs, etc.).

  • TDA is noise-resistant across a number of problems, which include the classification of 3D surface meshes (Som et al. 2018; Reininghaus et al. 2015; Li et al. 2014), the recognition of 2D object shapes (Turner et al. 2014), the manifold of natural image patches (Carlsson et al. 2007), the analysis of activity patterns in the visual cortex (Singh et al. 2008), and clustering (Chazal et al. 2013).

  • TDA can be applied to arbitrary data structures without any preprocessing, provided the right filtrations are used.

  • A new trend is emerging that allows efficient backpropagation through persistent homology components. This has been a long-standing challenge in TDA (further discussed in Sect. 3.3), but topological layers are now becoming compatible with deep learning and end-to-end training schemes.

We reiterate that although the combined use of TDA (more specifically, persistent homology) and deep learning has demonstrated success, there are still theoretical and computational challenges in applying TDA to data. We discuss these issues at length in Sect. 4.2.

In the rest of this section, we investigate TDA for deep learning through lenses of different magnifications and perspectives, as shown in Fig. 4. In particular, we explore the use of persistent homology in several different ways. The discussion in Sects. 3.1–3.3 focuses on the on-training integration of TDA, that is, building topological neural architectures. However, a holistic view should also consider TDA’s contribution post-training (deep topological analytics). These analytics use TDA to study the ‘shape’ of a trained model. Thus, we review works that have studied deep model complexity and interpretability using TDA in Sect. 3.4.

3.1 Learning topological features embedding

In this section, we extend the discussion of fixed vectorization methods (Sect. 2.3) by introducing deep learnable vectorization (i.e. embedding). A key advantage here is the possibility of leveraging the deep model to simultaneously learn the vectorization of the data and the representation for the target task. For example, we may parameterize the vectorization of a persistence diagram \(\textrm{PD}\) to an embedding vector \(V \in \mathbb {R}^d\) by neural layers \(f_w\), where w denotes the trainable parameters. Guided by the task loss, we can efficiently learn the mapping \(f_w: \mathrm {PD_x} \rightarrow V_{x}\) and automatically answer the question of which family of vectorizations works best for the given task.

Handling PDs with neural networks is the focus of many deep topological embedding studies. Generally, deep vectorization layers for PDs should be continuous and permutation invariant with respect to the input; the latter requirement is motivated by the set nature of the persistence diagram. Hofer et al. (2017, 2019) introduced the first learnable deep vectorization of PDs. It adopts a permutation-invariant transformation by evaluating the PD’s points against Gaussian(s) whose mean and variance are learned during training. Since permutation invariance has been explored in other deep learning problems (e.g. Deep Sets (Zaheer et al. 2017) for point clouds), some PD vectorization techniques were borrowed from that line of work. For example, PersLay (Carrière et al. 2020) builds on Deep Sets for embedding extended PDs encoding graphs and uses it for graph classification. Recently, transformers have been used for PD embedding: the Persformer (Reinauer et al. 2021) architecture showed superior performance on synthetic and graph tasks while offering some interpretability features. Note that transformers without positional encoding can be made as expressive as Deep Sets; thus, the permutation invariance requirement can be maintained.

Zhou et al. (2022) proposed TopologyNet, a novel approach that directly fits the topological representations derived from input point cloud data. This method substantially reduces the computation time for generating topological representations compared with traditional pipelines, while maintaining a small approximation error in practical scenarios. The output of TopologyNet holds potential for various downstream tasks that require efficient topological representations. Experimental evaluations incorporated TopologyNet as a topological branch within an autoencoder framework; the results demonstrated that including the topological branch led to superior topology quality in the generated point clouds compared to an autoencoder lacking such a branch. Furthermore, the latent vectors generated by a topological autoencoder were used to train a latent generative adversarial network (GAN), enabling the generation of new point clouds from Gaussian noise. Evaluation indices indicated that including the topological autoencoder in the generative adversarial network improved the quality of the newly generated point clouds, surpassing a GAN lacking the topological autoencoder.

Beyond PDs, deep embedding has been explored for other topological signatures. For example, PLLay (Kim et al. 2020) provides a layer for embedding persistence landscapes. PLLay's claim of robustness to extreme topological distortion is backed by a tight stability bound that is independent of the input complexity.

Topological embedding transforms the topological input with a complex structure into a vector representation compatible with deep models. As discussed in this section, the process uses a custom topological input layer for embedding. In the next section, we explore topological components that enhance deep learning representation and usually have the flexibility to be plugged anywhere in the network.

Algorithm 1 Deep learnable topological embedding

Algorithm 1 represents the process of embedding persistence diagrams (PDs) into a vector space using deep neural network layers. The procedure DeepTopologicalEmbedding takes a persistence diagram as input, initializes an embedding vector and neural layers, and then maps each point in the PD to the embedding vector. The process is guided by a loss function to determine the best vectorization for the given task.
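
As a concrete illustration of this procedure, the following PyTorch sketch implements a Deep Sets-style permutation-invariant embedding of a persistence diagram. It is a minimal, assumption-laden example in the spirit of the layers discussed above (Zaheer et al. 2017; Carrière et al. 2020), not the implementation of any specific paper; the networks phi and rho and all dimensions are illustrative choices.

```python
# Minimal sketch of a permutation-invariant PD embedding (Deep Sets style).
import torch
import torch.nn as nn


class PDEmbedding(nn.Module):
    def __init__(self, hidden: int = 32, out_dim: int = 16):
        super().__init__()
        # phi is applied independently to every (birth, death) point.
        self.phi = nn.Sequential(nn.Linear(2, hidden), nn.ReLU(), nn.Linear(hidden, hidden))
        # rho maps the pooled summary to the final embedding vector V.
        self.rho = nn.Linear(hidden, out_dim)

    def forward(self, pd: torch.Tensor) -> torch.Tensor:
        # pd: (n_points, 2) tensor of (birth, death) pairs; n_points may vary per diagram.
        pooled = self.phi(pd).sum(dim=0)  # symmetric pooling => permutation invariance
        return self.rho(pooled)           # embedding V in R^{out_dim}, trained via the task loss


# Usage: embed a toy diagram with three topological features.
pd = torch.tensor([[0.1, 0.9], [0.2, 0.4], [0.0, 1.2]])
v = PDEmbedding()(pd)  # shape: (16,)
```

Because the pooling operation is a sum, reordering the points of the diagram leaves the embedding unchanged, which is exactly the permutation-invariance requirement discussed above.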

3.2 Integration of topological representations

Representation learning is the process of learning features from data that can be used to improve the accuracy of the model. Deep learning excels in this regard thanks to its powerful feature learning, but having a good representation goes further than achieving good performance on a target task (Bengio et al. 2013). For example, TDA’s stability can make deep representation resilient to input perturbation (de Surrel et al. 2022). Below, we review two categories of deep topological representations.

Constrained representations One approach is to train deep neural networks to learn representations that preserve the persistent homology of the input data. Again, TDA’s versatility ensures the feasibility of this as the topological signature can be computed for both the input and the internal representation. For example, Topological Autoencoders (Moor et al. 2020) perform the alignment through a loss, minimizing the divergence between input and latent representation topologies (both captured by PDs).
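
To make the alignment idea more concrete, the following is a minimal sketch (not the authors' implementation) of a 0-dimensional variant of such a loss: for a mini-batch, the 0-dimensional persistence pairings of a Vietoris–Rips filtration correspond to the edges of a minimum spanning tree of the pairwise-distance matrix, and the loss asks the latent space to reproduce the distances selected by those pairings. The helper names are illustrative.

```python
# Sketch of a topology-preserving alignment loss in the spirit of
# Topological Autoencoders (Moor et al. 2020), restricted to dimension 0.
import torch
from scipy.sparse.csgraph import minimum_spanning_tree


def _mst_edges(dist: torch.Tensor):
    """Indices (i, j) of the MST edges of a dense distance matrix (0-dim pairings)."""
    mst = minimum_spanning_tree(dist.detach().cpu().numpy())
    rows, cols = mst.nonzero()
    return torch.as_tensor(rows, dtype=torch.long), torch.as_tensor(cols, dtype=torch.long)


def topo_alignment_loss(x: torch.Tensor, z: torch.Tensor) -> torch.Tensor:
    """Align the 0-dim topology of a batch x with that of its latent embedding z."""
    dx = torch.cdist(x, x)   # pairwise distances in input space
    dz = torch.cdist(z, z)   # pairwise distances in latent space
    ix, jx = _mst_edges(dx)  # pairings selected by the input topology
    iz, jz = _mst_edges(dz)  # pairings selected by the latent topology
    # The pairings act as fixed index sets, so the loss stays differentiable in z.
    return ((dx[ix, jx] - dz[ix, jx]) ** 2).mean() + ((dx[iz, jz] - dz[iz, jz]) ** 2).mean()
```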

Augmented representations Another approach for topological representation is augmenting the deep features with topological signatures. The Persistence Enhanced Graph Network (PEGN) (Zhao et al. 2020) developed a graph spatial convolution that builds on persistent homology. Normally, convolution filters adapt to local graph structures through the use of node degree information. In contrast, PEGN weights the message passing between nodes using neighborhood information captured by persistence images. Moreover, Graph Filtration Learning (GFL) (Hofer et al. 2020) adapts the readout operation (a graph pooling-like operation) in Graph Neural Networks (GNNs) to be topologically aware: persistence diagrams are computed from the graph node features and vectorized. Interestingly, the filtration function is learned end-to-end. The Topological Graph Layer (TOGL) (Horn et al. 2022) extends GFL's idea and learns multiple filtrations of a graph (rather than one) in an end-to-end manner.

Unlike the embedding layers (e.g. PersLay (Carrière et al. 2020)), which expect a pre-specified input type (e.g. PDs), the topological representation layers discussed in this section enjoy more flexibility regarding their input and placement in the network. This comes with the attached cost of requiring careful design choices and guarantees on the layer characteristics (e.g. consistency of gradients in Hofer et al. (2020)).

Algorithm 2 Topological representation integration in deep neural networks

The process of integrating topological representations into deep learning models is outlined in Algorithm 2. The exact method used (e.g. Topological Autoencoders, PEGN, GFL, TOGL) depends on the specific approach chosen.
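
As a hedged sketch of the augmented-representation idea (not the PEGN, GFL, or TOGL code), the snippet below concatenates a precomputed topological signature (e.g. a flattened persistence image) with learned features before the classification head; all layer sizes and names are illustrative.

```python
# Sketch: augmenting learned features with a topological signature.
import torch
import torch.nn as nn


class TopoAugmentedClassifier(nn.Module):
    def __init__(self, feat_dim: int, topo_dim: int, n_classes: int):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(feat_dim, 64), nn.ReLU())
        self.head = nn.Linear(64 + topo_dim, n_classes)

    def forward(self, x: torch.Tensor, topo_sig: torch.Tensor) -> torch.Tensor:
        h = self.backbone(x)  # learned (deep) features
        return self.head(torch.cat([h, topo_sig], dim=-1))  # features augmented with topology


# Usage with dummy tensors: 8 samples, 20 raw features, a 25-dim persistence image each.
logits = TopoAugmentedClassifier(20, 25, 3)(torch.randn(8, 20), torch.randn(8, 25))
```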

3.3 Topological loss

The most common approach for leveraging topology in deep learning is incorporating a topological penalty in the loss. The popularity of this approach stems from the fact that loss-based integration is straightforward and does not require changing the architecture or adding additional layers. The only caveat is that the loss should be differentiable and easy to compute. As noted previously, the capability of topological features to capture the complex structure of the data means that deep learning can learn robust representations guided by a topological loss. Thus, the representations are likely to be invariant to typical perturbations present in real-world datasets, such as noise and outliers. An example of this is a common persistence loss (Hu et al. 2019), which minimizes the difference between a predicted persistence diagram \(\textrm{PD}_X\) and the true diagram \(\textrm{PD}_Y\):

$$\begin{aligned} \mathcal {L}_{\text {topological}} = d(\textrm{PD}_X,\textrm{PD}_Y) \end{aligned}$$
(1)

This has been used either as a standalone loss or as a regularizer (i.e. augmenting another loss) (Hu et al. 2019) in applications such as semantic segmentation (Hu et al. 2019) or generative modeling (Wang et al. 2020).

As discussed in Sect. 3.1, PDs do not lend themselves to vector representations in Euclidean space. Moreover, the PD computation is not directly differentiable (a key requirement for using backpropagation). One strategy to resolve this is to leverage a divergence or metric that can handle PDs. The p-Wasserstein distance and the bottleneck distance are popular choices:

$$\begin{aligned} d_{p,q}(\textrm{PD}_X,\textrm{PD}_Y)&= \Big [ \inf _{\pi \in \Pi (\textrm{PD}_X, \textrm{PD}_Y) } \sum _{t \in \textrm{PD}_X} \Vert t - \pi (t)\Vert _{q}^{p} \Big ]^{\frac{1}{p}} \end{aligned}$$
(2)
$$\begin{aligned} d_{\infty }(\textrm{PD}_X,\textrm{PD}_Y)&= \inf _{\pi \in \Pi (\textrm{PD}_X, \textrm{PD}_Y) } \sup _{t \in \textrm{PD}_X} \Vert t - \pi (t)\Vert _{\infty } \end{aligned}$$
(3)

where \(t = (b_i, d_i) \in \mathbb {R}^2\) is a point in \(\mathrm {PD_X}\), \(\Pi (\textrm{PD}_X, \textrm{PD}_Y)\) denotes the set of bijections between \(\textrm{PD}_X\) and \(\textrm{PD}_Y\), and \(\Vert \cdot \Vert _q\) is the \(\ell _q\) norm. The bottleneck distance is the largest distance between corresponding points, minimized over all bijections that preserve the partial ordering of the points (i.e. we cannot match a point with a birth time greater than another point’s death time). This ensures that the topological features being matched are comparable.

The initial popularity of the bottleneck distance is perhaps fueled by a stability theorem (Cohen-Steiner et al. 2005) for PDs of continuous functions. According to this theorem, the bottleneck distance is controlled by the \(L_\infty \) distance, that is

$$\begin{aligned} d_{\infty }(\textrm{PD}_{f_1},\textrm{PD}_{f_2}) \le C \Vert f_1-f_2\Vert _{\infty } \end{aligned}$$
(4)

for some constant C. In effect, this means that the diagrams are stable with respect to small perturbations of the underlying data. A similar stability result exists for the p-Wasserstein distance. These results are the foundation of the stability guarantees given by recent deep learning works, such as the stability of the Heat Kernel Signature in graphs (Carrière et al. 2020) and the stability of mini-batch-based diagram distances in Topological Autoencoders (Moor et al. 2020).

Among the limitations of (2) and (3) is the high computational budget needed by these distances when the number of points is large. As the distance requires point-wise matching, the computational complexity is \(\mathcal {O}(n^3)\) for n points (Anirudh et al. 2016). Also, in many applications (Wang et al. 2020; Chen et al. 2019), we aim to learn a model \(f_w\) that aligns a predicted diagram \(\textrm{PD}_P\) with a target (i.e. ground truth) diagram \(\textrm{PD}_T\) by gradually moving \(\textrm{PD}_P\) points towards \(\textrm{PD}_T\). This is typically achieved by pushing w in the negative direction of \(\nabla _w \mathcal {L}_{\text {topological}}\) and, obviously, assumes that the loss is differentiable with respect to the diagram. While the Wasserstein distance satisfies this requirement in general, it can have some instability issues (Solomon et al. 2021). Below, we select a few representative papers using topological losses in various applications and show how they handle these issues.

In generative modeling, TopoGAN (Wang et al. 2020) uses a slightly modified 1-Wasserstein distance to align the diagrams of generated and real images in medical image applications. The loss ignores the death time and focuses only on the birth time of the diagram features. Framed in this way, the loss becomes similar to the Sliced Wasserstein distance (Peyré et al. 2019), which can be computed efficiently and is still differentiable. A similar loss was used by Hu et al. (2019) for segmentation, to encourage the deep model to produce output with a topology close to that of the ground truth. The cross-entropy loss is augmented with the 2-Wasserstein loss between persistence diagrams. To alleviate the computational burden, the method performs the calculation on a single small image patch (part of the image) at a time. In Clough et al. (2022), the authors rely on Betti numbers for semi-supervised image segmentation. A notable advantage here is that the output of a network trained on a small set of labeled images can still capture the actual Betti numbers correctly. This gives us the opportunity to initially train the model on a small labeled dataset guided by the Betti numbers loss \(\mathcal {L}_{\beta }\). The model is then fine-tuned using a large unlabeled dataset and guided by a loss that incorporates \(\mathcal {L}_{\beta }\). Since the estimation of Betti numbers is robust for unlabeled data, \(\mathcal {L}_{\beta }\) regularizes the second stage of training (fine-tuning). In classification, Chen et al. (2019) use a topological regularizer; to speed up the computation, it focuses on homological dimension zero, where persistence computations are particularly fast.
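
The birth-time-only loss admits a particularly compact sketch: once only the birth times are kept, the points live on the real line and the 1-Wasserstein distance reduces to comparing sorted values. The snippet below is an illustrative reading of that idea (not the papers' exact implementations) and assumes, for simplicity, that both diagrams contain the same number of points.

```python
# Sketch of a birth-time-only diagram loss (1-Wasserstein on the real line).
import torch


def birth_time_loss(births_pred: torch.Tensor, births_true: torch.Tensor) -> torch.Tensor:
    # In practice, diagrams of different sizes would be padded or matched to the diagonal.
    return (torch.sort(births_pred).values - torch.sort(births_true).values).abs().mean()
```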

Algorithm 3 Topological Loss for Deep Learning

Algorithm 3 outlines the computation of topological loss using either the p-Wasserstein distance or the bottleneck distance. The procedure TopologicalLoss takes two persistence diagrams \(\textrm{PD}_X\) and \(\textrm{PD}_Y\), and the parameters p and q, then computes the p-Wasserstein or bottleneck distance as the topological loss. This loss can be used in deep learning models to minimize the difference between predicted and true topological features.
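
For reference, the distances of Eqs. (2) and (3) are available off the shelf; the short example below uses GUDHI (assuming the gudhi package and, for the Wasserstein part, its optional POT dependency are installed). The resulting scalar can serve as the loss of Eq. (1) when gradients through the matching itself are not required.

```python
# Computing the Wasserstein and bottleneck distances between two small diagrams with GUDHI.
import numpy as np
import gudhi
from gudhi.wasserstein import wasserstein_distance

pd_x = np.array([[0.0, 1.0], [0.2, 0.5]])  # predicted diagram: (birth, death) pairs
pd_y = np.array([[0.0, 1.1], [0.3, 0.4]])  # target diagram

loss_wasserstein = wasserstein_distance(pd_x, pd_y, order=2, internal_p=2)  # Eq. (2)
loss_bottleneck = gudhi.bottleneck_distance(pd_x, pd_y)                     # Eq. (3)
```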

3.4 Deep topological analytics

The complementary value of TDA goes beyond on-training integration and constructing topological neural architectures. In fact, leveraging TDA methods post-training can be even more insightful and powerful. Currently, researchers use TDA to address deep learning transparency (Liu et al. 2020), study model complexity (Rieck et al. 2019; Carlsson and Gabrielsson 2020) and even track down answers to seemingly mysterious aspects of deep learning, e.g. why deep networks outperform shallow ones (Naitzat et al. 2020). These efforts are centered around analyzing deep models using TDA approaches; hence, we call this deep topological analytics. We explore two aspects of it below.

Quantifying structural complexity Watanabe and Yamana (2021) treat the neural network as a weighted graph G(V, E), where V and E denote the network neurons and the relevance scores (computed from weights), respectively. By computing persistence features (e.g. Betti numbers) across a filtration, we can gain insight into the network's complexity. For example, an increase in the Betti number (the occurrence of a cycle between a set of neurons) can reflect the complexity of the knowledge stored in a deep neural network. In Rieck et al. (2019), the authors follow the same line and further develop training optimization strategies (e.g. early stopping) informed by homological features.
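
As a hedged illustration of this kind of analysis (not the cited authors' code), the sketch below adds edges of the relevance-weighted graph in decreasing order of score and tracks \(\beta _0\), the number of connected components, with networkx; Betti numbers in higher dimensions would require a simplicial complex library such as GUDHI.

```python
# Sketch: beta_0 of a relevance-weighted network graph across a threshold filtration.
import networkx as nx


def betti0_across_filtration(weighted_edges, thresholds):
    """weighted_edges: iterable of (u, v, score); returns beta_0 at each threshold."""
    nodes = {u for u, _, _ in weighted_edges} | {v for _, v, _ in weighted_edges}
    betti0 = []
    for t in thresholds:
        g = nx.Graph()
        g.add_nodes_from(nodes)                                           # all neurons
        g.add_edges_from((u, v) for u, v, s in weighted_edges if s >= t)  # edges above threshold
        betti0.append(nx.number_connected_components(g))
    return betti0


# Usage: a toy 4-neuron relevance graph examined at three thresholds.
edges = [(0, 1, 0.9), (1, 2, 0.5), (2, 3, 0.2)]
print(betti0_across_filtration(edges, [0.8, 0.4, 0.1]))  # [3, 2, 1]
```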

Visual exploration of models Another use of TDA here is to provide post-hoc explanation and/or visual exploration of the internal functioning of deep models. For example, topological information provides insight into the overall structure of high-dimensional functions. The authors in Liu et al. (2020) use this to offer a scalable visual exploration tool for data-driven black-box models; this is an important research problem, and doing so in an intuitive way is a challenge. They also use topological splines to visualize the high-dimensional error landscape of the models. Similarly, TopoAct (Rathore et al. 2021) offers insightful information on the representations learned by neural networks and provides a visual exploration tool to study topological summaries of activation vectors. Works such as Polianskii (2018) shed light on how neural networks maintain the topological properties of the data when it is projected into a low-dimensional space.

DNN focused topology optimization The concepts of “Inverting Representations of Images” and “Physics-Informed Neural Networks” served as inspiration for the topology optimization via neural reparameterization framework (TONR) (Zhang et al. 2021), which aims to address a variety of topology optimization problems. In this approach, the density field is optimized by updating the DNN parameters, with the initial parameters chosen carefully. This leads to quicker training and suggests a promising direction for topology optimization.

4 Discussion

TDA is a steadily developing and promising area, with successes in a wide variety of applications. However, there are open questions in applying TDA with deep neural networks. In this section, we discuss various successes and applications of deep TDA, highlight several open challenges for future research on deep TDA in both practical and theoretical aspects, and paint a speculative picture by outlining what the future holds for persistent homology. We also note some open-source implementations available for researchers to get started.

4.1 Successes and applications

Deep TDA has demonstrated potential in a variety of challenging settings. The invariance of PH information to continuous deformation means that TDA applies well to settings where objects should have consistent shapes but may be transformed in some way. TDA also helps bridge the gap between structural information and prior knowledge. If we have prior knowledge of the topology of a class of objects, then PDs are an effective tool for classifying and comparing data against this class, even in the presence of noise or limited data. This robustness is well suited to deep learning.

A potential area of application for topological data analysis (TDA) combined with deep learning lies in multi-class segmentation tasks. In such tasks, it becomes feasible to delineate the topology of individual classes as well as the boundaries between classes. This extension can be viewed as an application of persistent homology (PH) to the issue examined by Clough et al. (2022) and Haft-Javaherian et al. (2020), where prior information was utilized to define the adjacencies among different brain regions.

TDA can produce good results on small datasets (Byrne et al. 2021; BenTaieb and Hamarneh 2016), and is especially useful for medical imaging applications, where cost and privacy concerns often limit data acquisition. Byrne et al. (2021) and BenTaieb and Hamarneh (2016) investigated the limitations of conventional deep learning training procedures when applied to small datasets. These studies reveal that such procedures rely heavily on pixel-wise loss functions, which restrict the optimization process in terms of extended or global features. They used persistent homology to construct topological loss functions that evaluate image segments against a known prior, resulting in a richer description of segmentation topology and better accuracy.

Since persistent homology describes global structure, developing topological loss functions could suppress small false positives or false negatives related to the topology of an object. For example, in segmentation tasks, techniques such as morphological operations or CRF-based post-processing are used to remove local errors, but they have no concept of global topology. The benefit of a PH-based loss is that the correct global topology can be propagated along with local label smoothness. TDA has been used in settings with limited or noisy data, such as power forecasting (Senekane et al. 2021), segmenting aerial photography (Mosinska et al. 2018) and astronomy (Murugan and Robertson 2022).

As deep learning models continue to grow in complexity and datasets continue to grow in size, scalability and efficiency become even more crucial. Future directions in TDA for deep learning involve the development of scalable algorithms and efficient computational frameworks capable of handling large-scale datasets. This would enable the application of topological data analysis to diverse domains and real-world problems.

Interpreting the decisions of deep learning models remains a challenging endeavor. TDA offers a unique perspective by providing interpretable representations of complex data. Future directions in this area will focus on developing methodologies to extract meaningful topological features and interpret their significance in the context of deep learning tasks. This will facilitate a better understanding of the decision-making process of deep neural networks and increase their trustworthiness.

Regularization plays a crucial role in preventing overfitting and improving the generalization ability of deep learning models. Future research will explore how TDA-based regularization techniques can be integrated into deep learning frameworks. This could involve incorporating topological penalties or constraints to encourage models to capture meaningful topological features, leading to improved model generalization and robustness.

Many real-world applications involve multimodal data, such as images, text, and sensor data. Combining TDA with deep learning techniques provides a promising avenue for analyzing and integrating information from multiple modalities. Future directions include the development of TDA methods that can handle multimodal data and exploit the interactions between different modalities to uncover complex relationships and structures.

Transfer learning has proven to be an effective strategy for leveraging knowledge gained from one task to improve performance on a related task. Integrating TDA into transfer learning frameworks can enable the transfer of topological knowledge between domains or datasets. This could facilitate the adaptation of deep learning models to new domains by preserving the underlying topological structure and transferring relevant information.

Moreover, deep learning may yet yield new kinds of topological representation other than PDs, with robustness to different data deformations. PH could have further applications in multi-class open-set problems (where data may have unknown classes). If the topology among classes is relatively consistent, then the object labels of unknown classes could be better predicted.

4.4 Implementations

There are a number of open-source implementations of TDA available to practitioners. Here, we present three libraries that have interfaces with deep learning architectures.

GUDHI is an open-source library that implements relevant geometric data structures and TDA algorithms, and it can be integrated into the TensorFlow framework. PersLay (Carrière et al. 2020) and RipsLayer are implementations using GUDHI that learn persistence representations from complexes and PDs. They can handle automatic differentiation and are readily integrated into deep learning architectures.
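
As a brief illustration (assuming the gudhi package is installed), the snippet below computes a persistence diagram from a point cloud via a Vietoris–Rips filtration; this is the kind of signature that layers such as PersLay or RipsLayer then consume inside a network.

```python
# Computing a persistence diagram of a point cloud with GUDHI.
import numpy as np
import gudhi

points = np.random.rand(100, 3)                               # toy point cloud in R^3
rips = gudhi.RipsComplex(points=points, max_edge_length=0.5)  # Vietoris-Rips filtration
st = rips.create_simplex_tree(max_dimension=2)                # simplices up to dimension 2
diagram = st.persistence()                                    # [(dim, (birth, death)), ...]
h1 = st.persistence_intervals_in_dimension(1)                 # 1-dimensional holes (loops)
```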

Giotto-deep is an open-source extension of the Giotto-TDA library. It aims to provide seamless integration between TDA and deep learning on top of PyTorch. The developers aim to provide several off-the-shelf architectures so that topology can be used both for pre-processing data (with a variety of available methods) and within neural networks. One such example is Persformer (Reinauer et al. 2021).

TopoModelX is a recent Python package that extends Graph Neural Networks (GNNs) for application in topological domains, demonstrating a substantial development in the field of topological deep learning. The implementation of topological neural networks in TopoModelX started as the ICML 2023 Topological Deep Learning Challenge (Papillon et al. 2023a), hosted by the second annual Topology and Geometry (TAG) in Machine Learning Workshop at ICML. Participants contributed by implementing existing topological neural network methods from the literature and applying them to train on a benchmark dataset. TopoModelX offers a robust framework and essential functionalities, enabling researchers to either implement new GNN-based TDL algorithms or apply existing methodologies from scholarly literature to their specific problems.

5 Conclusion

The recent growth in TDA and the established efficacy of deep learning have meant that the integration of these techniques has been inevitable. There is no universal paradigm for combining TDA and deep learning. This article surveyed numerous ways in which these frameworks have benefited each other. We began with an overview of the key TDA concepts. Following this, we reviewed TDA in deep learning from a variety of perspectives. We described numerous challenges and opportunities that remain in this field, as well as some observed successes.