How to Stop Looking at Data

Higher-order structured data types are essential if Artificial Intelligence (AI) is to achieve human-like capabilities. In their 2018 paper, Peter Battaglia et al. proposed that these types be higher-level graph network blocks.[1] However, such compositional theories inherit the limitations of their function-approximator hosts.
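
For orientation, here is a minimal sketch of a graph network block in the spirit of [1], in plain Python with numpy. The learned update functions of the real architecture are replaced by simple sums and means; everything here is illustrative, not the paper's implementation.

```python
import numpy as np

def gn_block(nodes, edges, senders, receivers, global_attr):
    """One pass of a graph network block in the spirit of [1]: update
    edges from their endpoints, nodes from their incoming edges, and
    the global attribute from everything. The learned update functions
    of the real architecture are replaced by sums and means."""
    # Edge update: each edge sees its sender, its receiver, and the global.
    new_edges = edges + nodes[senders] + nodes[receivers] + global_attr
    # Node update: aggregate incoming edge features per receiving node.
    agg = np.zeros_like(nodes)
    np.add.at(agg, receivers, new_edges)
    new_nodes = nodes + agg
    # Global update: aggregate over all nodes and edges.
    new_global = global_attr + new_nodes.mean(0) + new_edges.mean(0)
    return new_nodes, new_edges, new_global

# A toy graph: 3 nodes, 2 directed edges (0 -> 1, 1 -> 2), 4-dim features.
rng = np.random.default_rng(0)
nodes, edges = rng.standard_normal((3, 4)), rng.standard_normal((2, 4))
out = gn_block(nodes, edges, senders=np.array([0, 1]),
               receivers=np.array([1, 2]), global_attr=np.zeros(4))
```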

This mechanism for collating individually insignificant local data into useful global information has an analog in algebraic topology called homology. For years, applied mathematicians have avoided this abstract algebraic and topological concept when modeling complex systems. The time has come for it to shine.

Recently, Topological Data Analysis (TDA), supported by persistent homology, has emerged to fill this role. The basic idea is to stitch together overlapping patches of data points in specific ways to form an insightful quilt.
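
As a small, hedged illustration of the stitching, the sketch below samples points from a noisy circle and computes its persistence diagrams. It assumes the third-party ripser package; any persistent-homology library (GUDHI, for example) would serve equally well.

```python
import numpy as np
from ripser import ripser  # third-party: pip install ripser

# Sample 200 points from a noisy circle: one component, one loop.
rng = np.random.default_rng(0)
theta = rng.uniform(0, 2 * np.pi, 200)
points = np.c_[np.cos(theta), np.sin(theta)]
points += 0.05 * rng.standard_normal(points.shape)

# "Sew the quilt": grow balls around every point and record the birth
# and death scale of each topological feature. dgms[0] holds H0
# (connected components), dgms[1] holds H1 (loops).
dgms = ripser(points, maxdim=1)['dgms']

# One long-lived H1 interval is the signature of the underlying circle.
h1 = dgms[1]
print("most persistent loop:", h1[np.argmax(h1[:, 1] - h1[:, 0])])
```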

The quilt's topology depends on how you sew the data, and insights are quantified through its invariants and persistence properties. One such invariant is the Euler characteristic of a topological space: the alternating sum of the ranks of the homology groups (the Betti numbers) of the n-dimensional quilt. Insights come from data, specifically from its shape. Applied sheaf theory is another extension of topological data analysis.
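
A self-contained sketch of that invariant, using only numpy and the standard library: build a Vietoris–Rips complex on a point cloud and compute the Euler characteristic as the alternating sum of simplex counts, which by the Euler–Poincaré formula equals the alternating sum of the Betti numbers.

```python
import numpy as np
from itertools import combinations

def rips_complex(points, eps, max_dim=2):
    """All simplices whose vertices are pairwise within distance eps."""
    n = len(points)
    d = np.linalg.norm(points[:, None] - points[None, :], axis=-1)
    close = d <= eps
    simplices = [[(i,) for i in range(n)]]
    for k in range(1, max_dim + 1):
        simplices.append([s for s in combinations(range(n), k + 1)
                          if all(close[a, b] for a, b in combinations(s, 2))])
    return simplices

def euler_characteristic(simplices):
    """chi = sum_k (-1)^k (# of k-simplices) = sum_k (-1)^k rank H_k."""
    return sum((-1) ** k * len(sk) for k, sk in enumerate(simplices))

# Four points on a square: at this scale the complex is a hollow loop,
# so chi = 4 vertices - 4 edges + 0 triangles = 0, matching b0 - b1 = 1 - 1.
square = np.array([[0, 0], [1, 0], [1, 1], [0, 1]], float)
print(euler_characteristic(rips_complex(square, eps=1.1)))  # -> 0
```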

Topological tools like homology and sheaf theory help us answer questions about the coverage, movement, and curvature of data, enabling better decision making. In a dense telecommunication network, for instance, knowing which towers can be switched off without degrading coverage is valuable. That answer is obtained by computing representative bases for homology classes, which also helps in building a fault-tolerant network. Similar approaches apply in finance, biology, chemistry, and engineering. A toy version of the tower question is sketched below.
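
This is only a caricature of the representative-cycle computation alluded to above, with the tower layout made up for illustration: a tower is flagged as redundant when deleting it changes neither the number of connected components (b0) nor the number of coverage holes (b1) of the Rips complex built on the tower positions.

```python
import numpy as np
from itertools import combinations

def rips(points, eps):
    """Vertices, edges and triangles of the Rips complex at scale eps."""
    n = len(points)
    d = np.linalg.norm(points[:, None] - points[None, :], axis=-1)
    verts = list(range(n))
    edges = [e for e in combinations(verts, 2) if d[e] <= eps]
    tris = [t for t in combinations(verts, 3)
            if all(d[a, b] <= eps for a, b in combinations(t, 2))]
    return verts, edges, tris

def betti(verts, edges, tris):
    """b0 and b1 from the ranks of the boundary matrices (over Q)."""
    idx_v = {v: i for i, v in enumerate(verts)}
    idx_e = {e: i for i, e in enumerate(edges)}
    d1 = np.zeros((len(verts), len(edges)))        # boundary of edges
    for j, (a, b) in enumerate(edges):
        d1[idx_v[a], j], d1[idx_v[b], j] = -1, 1
    d2 = np.zeros((len(edges), len(tris)))         # boundary of triangles
    for j, (a, b, c) in enumerate(tris):
        d2[idx_e[(a, b)], j] = 1                   # d[a,b,c] = [b,c] - [a,c] + [a,b]
        d2[idx_e[(b, c)], j] = 1
        d2[idx_e[(a, c)], j] = -1
    r1 = np.linalg.matrix_rank(d1) if edges else 0
    r2 = np.linalg.matrix_rank(d2) if tris else 0
    return len(verts) - r1, len(edges) - r1 - r2   # b0, b1

# Six towers on a hexagonal ring plus a deliberately redundant seventh
# tower at the midpoint of one side.
angles = np.arange(6) * np.pi / 3
hexagon = np.c_[np.cos(angles), np.sin(angles)]
towers = np.vstack([hexagon, (hexagon[0] + hexagon[1]) / 2])

base = betti(*rips(towers, eps=1.2))
for i in range(len(towers)):
    reduced = np.delete(towers, i, axis=0)
    if betti(*rips(reduced, eps=1.2)) == base:
        print(f"tower {i} looks redundant at this scale")  # -> tower 6
```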

Though applying these tools directly to data seems advantageous, data should not be their first target. Algebraic topology offers a framework for studying the mechanistic underpinnings of the processes that generate data. It is a waste to spend it on the data itself when it can tell us more about the machinery. Tools like dimensionality reduction, visualization, neural networks, and other machine learning algorithms already operate effectively on data to detect correlations.

Topological tools can help us formulate hypotheses, which we can then verify with machine learning algorithms. Abstract algebra can automate this formulation. A chain of verified formulations forms an algorithm, one of many possible generative models of the data.

Consider a large number of strings, all of the same length, drawn from a finite alphabet; the collection could even be countably infinite, though only finitely many distinct strings of a fixed length exist over a finite alphabet. Each data point (string) is a permutation of another, and complex correlations among them can be extracted by neural networks and other machine learning tools.
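
To make this concrete, here is a hypothetical toy dataset in which every string is obtained from a seed string by one fixed transposition; the alphabet, length, and swapped positions are all arbitrary choices for illustration.

```python
import random
import string

ALPHABET = string.ascii_lowercase   # a finite alphabet
LENGTH = 8                          # every string has the same length

def transpose(s, i=2, j=5):
    """The hidden generative rule: swap the characters at positions i and j."""
    chars = list(s)
    chars[i], chars[j] = chars[j], chars[i]
    return ''.join(chars)

random.seed(0)
seeds = [''.join(random.choices(ALPHABET, k=LENGTH)) for _ in range(1000)]
dataset = [(s, transpose(s)) for s in seeds]   # aligned (input, output) pairs
print(transpose("abcdefgh"))                   # prints abfdecgh
```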

An LSTM neural network can be trained on aligned string pairs until it learns a transposition operator (or some other approximation of the permutation), thereby learning how these data points (strings) are generated. We are then left with an algorithm that generates the strings, given the alphabet as input. Such neural networks (with graph network blocks as operatives) can be combined to generate new algorithms that yield data points with interesting topological shapes.
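
A hedged sketch of that training loop, assuming PyTorch: the bidirectional LSTM lets every output position see the whole input string, and the transposition of positions 2 and 5 (chosen arbitrarily above) stands in for the hidden generative rule.

```python
import torch
import torch.nn as nn

VOCAB, LENGTH = 26, 8
torch.manual_seed(0)

def transpose(x, i=2, j=5):
    """The hidden generative rule, now on integer-encoded batches."""
    y = x.clone()
    y[:, [i, j]] = x[:, [j, i]]
    return y

class PermutationLSTM(nn.Module):
    """Reads the whole string bidirectionally, so every output position
    can see the entire input, then predicts the permuted character."""
    def __init__(self, dim=64):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, dim)
        self.lstm = nn.LSTM(dim, dim, batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * dim, VOCAB)

    def forward(self, x):                  # x: (batch, LENGTH), int64
        h, _ = self.lstm(self.embed(x))    # (batch, LENGTH, 2 * dim)
        return self.head(h)                # logits: (batch, LENGTH, VOCAB)

model = PermutationLSTM()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for step in range(500):
    x = torch.randint(0, VOCAB, (64, LENGTH))
    loss = nn.functional.cross_entropy(
        model(x).reshape(-1, VOCAB), transpose(x).reshape(-1))
    opt.zero_grad(); loss.backward(); opt.step()
print(loss.item())  # should fall toward zero as the swap is learned
```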

These higher-order algorithms can themselves be modeled as graph-to-graph blocks, which in turn act as data points for higher-dimensional neural networks. To understand the information flow inside these networks, topological tools like persistent homology can be applied, steering algorithm generation toward a goal.

To automatically generate algorithms for a given set of data points, we need to correct our method. A major flaw in the approach so far is that we never left the realm of correlations: we have a chain of representations in which n-dimensional data points are modeled by (n − 1)-dimensional composite neural networks. A simple chain of Vietoris–Rips complexes is not enough to recover a representative data-generating process (algorithm).
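
For concreteness, such a chain in its simplest form is a filtration of nested Vietoris–Rips complexes at growing scales; each complex contains the previous one, which is precisely the structure persistent homology exploits. The code below is a toy illustration, tracking only edges.

```python
import numpy as np
from itertools import combinations

def rips_edges(points, eps):
    """Edge set of the Vietoris-Rips complex at scale eps."""
    d = np.linalg.norm(points[:, None] - points[None, :], axis=-1)
    return {e for e in combinations(range(len(points)), 2) if d[e] <= eps}

rng = np.random.default_rng(1)
points = rng.random((10, 2))

# A chain of complexes: each scale's edge set contains the previous one.
prev = set()
for eps in (0.2, 0.4, 0.6, 0.8):
    edges = rips_edges(points, eps)
    assert prev <= edges          # nested, so the chain is a filtration
    print(f"eps={eps}: {len(edges)} edges")
    prev = edges
```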

In our method, the approximator host (the neural network) forces us into a cycle of identifying and generating correlations in ever higher dimensions. To break free, we step away from neural networks and other machine learning tools and look at the data set itself. Since it contains both the inputs and the outputs required by any candidate algorithm, constructing duals of the homology groups that satisfy commutative, associative, or distributive properties (the bar and cobar constructions) yields a finite cochain complex. Any representative of its canonical basis will belong to the universal set of generator algorithms for the given data points. This simplifies the search for algorithms that explain a given set of data.
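
The full bar and cobar constructions are beyond a code snippet, but their starting point, dualizing a chain complex into a cochain complex, is mechanical: the coboundary maps are the transposes of the boundary maps, and over a field the resulting cohomology has the same Betti numbers. A minimal sketch on a hollow triangle:

```python
import numpy as np

# Chain complex of a hollow triangle (3 vertices, 3 edges, no faces).
# Columns of d1 are the boundaries of edges (0,1), (0,2), (1,2): b - a.
d1 = np.array([[-1, -1,  0],
               [ 1,  0, -1],
               [ 0,  1,  1]])

# Dualizing is mechanical: the coboundary map is the transpose.
delta0 = d1.T                    # C^0 -> C^1

r = np.linalg.matrix_rank
b0 = 3 - r(d1)                   # homology: #vertices - rank d1
b1 = 3 - r(d1)                   # #edges - rank d1 (no faces, so rank d2 = 0)
h0 = 3 - r(delta0)               # cohomology: dim ker delta0
h1 = 3 - r(delta0)               # dim C^1 - rank delta0 (delta1 = 0)
print(b0, b1, h0, h1)            # -> 1 1 1 1: the ranks agree
```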