Network Science as temptation

Aug 22

The core fable of half of all physics papers is simple: "Our system can be in a whole lot of different states, but because of a funny interaction it only ends up in a few special states." This fable underlies the conceptual separation of states and interactions. I only realized that this separation is a core principle of the physics toolset once I saw it misused and misinterpreted.

Let's go through it more slowly. When modeling a system in physics language, we first define the space of possible states in which it can be, for example the positions of all the particles. At this stage the definition is binary: the states are either possible or impossible, with no degree of preference. That nuance of preference is added right after, with the addition of interactions. Interactions can come in different flavors, sometimes driving the dynamics of the system, sometimes the statistics. Sometimes the interactions affect only one degree of freedom at a time, sometimes two at a time in pairs, sometimes even more than two. The usual result of interactions is that the system selects particular states from the whole gamut. How exactly that selection works is usually the hardest yet most interesting question.
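The two-step recipe above (first the state space, then the interaction) can be sketched in a few lines. This is only a toy illustration under assumptions that are mine, not part of the argument: a chain of four two-state particles, a pairwise energy that rewards aligned neighbors, and an arbitrary coupling strength `J`.

```python
from itertools import product

N = 4    # number of particles (assumed, for illustration)
J = 1.0  # coupling strength (assumed)

# Step 1: the state space. Every configuration is simply "possible".
states = list(product([-1, +1], repeat=N))

# Step 2: a pairwise interaction between neighbors on a chain.
def energy(state):
    return -J * sum(state[i] * state[i + 1] for i in range(N - 1))

# The interaction selects the lowest-energy states from the whole gamut.
lowest = min(energy(s) for s in states)
selected = [s for s in states if energy(s) == lowest]

print(len(states), "possible states")
print(len(selected), "selected states:", selected)
```

Out of 16 possible states, the interaction picks out the two in which all particles agree. Note that no single particle prefers either value on its own; the selection is collective.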

If the interaction hits the particles one at a time, their selected states can also be found one at a time, but that is a common and less interesting case. If the interaction hits a few particles at a time, usually in pairs, state selection gets more complicated and interesting because it becomes collective. But what are the pairs that interact? For most of its history, physics was concerned with "topologically simple" spaces - and I am being loose with words here. There is a lot of cool stuff that happens in 1D, 2D, and 3D continuous space. There are good intuitions - and sometimes good reasons to care - about spaces of more than three dimensions. There are a lot of studies of what happens on regular lattices: patterns of discrete points infinitely repeated in all directions, such as a honeycomb. There were of course many models where each particle interacts with all others (well-mixed). From the 1960s onward, there was a growing interest in systems with completely random interactions, from Erdős–Rényi random graphs to spin glasses.

The examples above fall into three topological archetypes: regular, fully connected, and random. But in the real world, as recognized for decades by disciplines from computer science to sociology, the connection patterns often don't fall into those neat categories. These patterns are not fully connected, not periodic, not random - they are, for lack of a better word, complex networks. The name "network science" or "network theory" really became big after the late 1990s, right as I was entering elementary school. By the time I learned about network science circa 2014, it was an active but already quite mature field, with dedicated textbooks, conferences, and degree programs. One professor called network science "glorified accounting". Little did they know that I love glorified accounting. Little did they know about networks appearing in their own papers just a few years later.

Network science has developed lots of ways to measure structure: centrality metrics, community detection, spectra and eigenmodes, network control theory. All of these calculations are implemented in well-tested packages for all common programming languages. Unlike several older complexity paradigms, the network lens allows working directly with empirical structure data. For legacy and continuity reasons, we can still talk about the old, simple networks in the complex network language, even if sometimes it feels like we misjudged the caliber of our tools. Complexity science itself has been hijacked by network science, as written by the hijacker-in-chief. Between all these amazing advances, what could go wrong?

Like other trigger-happy computational techniques, network science and its formulas and packages dramatically reduce the time between questions and answers, and thus provide a great scientific temptation. The main risk of this temptation is confusing the answer with the question. At the baseline, qualitatively, physics is still about interactions driving systems to special corners of the state space. Is the network an interaction (question, input) or a state (answer, output)? That's a model-building dilemma that cannot be resolved by network science but needs to be addressed by you, the modeler.

Sometimes, the networks are known a priori and describe the complex pattern of interactions between parts of the system. But interactions are only one side of the system description. To capture the other side, there also needs to be a state of the system. Usually, we put a local variable on each node, and these variables get to interact along the network edges. Pertinent to our times, the node variable can be the infection status of individuals, and the edges describe how the infection can spread. The node variables can be activities of brain regions, which activate or inhibit each other along the edges. The node variables can be blood pressures, and the edges are vessels along which the blood might flow. The node variables might be the counts of different species in an ecosystem, and the directed edges show when one species uses another one as food. In all these cases, we already know the network but want to ask questions about the system state: is it an isolated disease outbreak or a pandemic? Is it a healthy brain or a schizophrenic one? Does the blood reach all parts of the organism, or are some parts systematically deprived? Is it a regular trophic cascade or an ecosystem collapse?
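The infection example can be made concrete with a deliberately crude sketch. Everything specific here is an assumption for illustration: the five-node contact network is made up, and the spreading rule (an infected node infects all its neighbors and nobody recovers) is far simpler than any real epidemic model. The point is only the division of labor: the network is fixed input, and the infection status on the nodes is the state we ask about.

```python
# Hypothetical contact network: two groups with no link between them.
network = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["a", "b"],
    "d": ["e"],
    "e": ["d"],
}

def spread(network, seed):
    """Return the set of nodes eventually infected, starting from seed.

    Crude assumed rule: an infected node infects all its neighbors,
    and nobody recovers.
    """
    infected = {seed}   # the state: which nodes are infected
    frontier = [seed]
    while frontier:
        node = frontier.pop()
        for neighbor in network[node]:   # the interaction runs along edges
            if neighbor not in infected:
                infected.add(neighbor)
                frontier.append(neighbor)
    return infected

outbreak = spread(network, "a")
print(sorted(outbreak))
print("pandemic" if len(outbreak) == len(network) else "isolated outbreak")
```

Here the network never changes; only the node variables do. Without those node variables, there would be nothing for the network to act on.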

Other times, networks are answers. In the whole space of all possible networks, the system interactions select only a special few - but which ones? This question drove the network revolution in the first place because new empirical measurements of network structures contradicted everyone's favorite Erdős–Rényi random graphs, so new models were required. Network formation and selection questions are common. Proteins in a cell need to bind their functional partners but not others, so how does the network of protein interactions arise under selective pressure in evolution? Inside, each protein is a chain of amino acid residues folded onto itself, so what is the network of residue contacts there? If we compress a granular material like dry sand, how does the network of particle contacts and forces evolve? Jumping to a very large scale, what is the distribution and connection of galaxies in the Universe and how does that depend on how we define networks?
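When the network is the answer, the model's input is a formation rule and its output is the structure. As a hedged sketch, the code below grows a network by preferential attachment, one well-known growth rule that I am swapping in here as an example, and compares it with a baseline in which the same number of edges is placed uniformly at random, in the spirit of Erdős–Rényi. The function names, network size, and random seed are all illustrative assumptions.

```python
import random
from collections import Counter

def grow_network(n_nodes, rng):
    """Grow a network one node at a time; each newcomer links to one
    existing node chosen with probability proportional to its degree."""
    edges = [(0, 1)]
    endpoints = [0, 1]   # each node appears once per edge it touches
    for new in range(2, n_nodes):
        target = rng.choice(endpoints)   # degree-proportional choice
        edges.append((new, target))
        endpoints += [new, target]
    return edges

def random_network(n_nodes, n_edges, rng):
    """Baseline: the same number of edges placed uniformly at random."""
    edges = set()
    while len(edges) < n_edges:
        a, b = rng.sample(range(n_nodes), 2)
        edges.add((min(a, b), max(a, b)))
    return list(edges)

def max_degree(edges):
    degree = Counter(node for edge in edges for node in edge)
    return max(degree.values())

rng = random.Random(0)   # fixed seed, assumed, for repeatability
grown = grow_network(1000, rng)
baseline = random_network(1000, len(grown), rng)
print("largest degree, grown network:  ", max_degree(grown))
print("largest degree, random baseline:", max_degree(baseline))
```

Both networks have the same number of nodes and edges, so any difference in the largest degree comes from the formation rule alone. Here the measured structure is the output of the model rather than its input.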

The distinction between networks as questions and networks as answers is very easy to get wrong, and I've seen many talented researchers stumped by this. Most commonly, the network describes an interaction, and the network metrics promise to describe the evolution of... what? There is nothing to evolve or select if there are no local node variables. Other times, the node states and the edge patterns evolve together, and the state/interaction duality doesn't map onto the question/answer one so neatly. Network science is a powerful tool and it is here to stay, but it doesn't free you from the basic modeling decisions.

Andrei Klishin
