
The Shape of Randomness

We often rely on shapes and patterns when navigating the world. Poison ivy or an innocent plant? A nasty rash or the imprint of the textured wall you were leaning against? Similarly, scientists often use shapes and patterns to interpret datasets. Do the points follow a straight line? Appear in clusters? On the street and in the lab, shapes help us organize information, interpret data, and even make predictions.
While some sets of data are relatively straightforward to interpret, others get messy quickly. It can be difficult to extract useful information from maps of complicated situations like the relationship between diseases and their associated genes. This is because the structures that emerge often depend on parameters chosen by researchers through a somewhat arbitrary process, making it difficult to tell when a structure is really significant. In new research recently published in the American Physical Society’s journal Physical Review E, a team of scientists from Université Laval in Canada and from the Politecnico di Torino and the ISI Foundation in Italy introduces a valuable tool for determining whether the shape of a complex dataset is actually significant.

Diseases and their associated genes are just one example of what scientists classify as a complex system. Many other systems fall into this category too—the Earth’s climate, living cells, the human brain, social structures—really any system that is difficult to describe because it contains so many moving, interacting pieces. Of course, understanding these same systems can have a profound effect on our quality of life, enabling early warning systems, targeted treatments, and effective interventions.

A common way of studying these systems is with complex networks, a way of visually representing components and their interactions and looking at the structures that emerge. For example, the traditional network approach considers each component to be a node and each interaction between two components as a line linking them together. Research shows that the network approach is effective in helping us understand many systems. However, you can lose important information by applying it to a complex system that can’t be broken down into a set of clean interactions between two components.
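To make that loss of information concrete, here is a minimal Python sketch. The component names and the `pairwise_edges` helper are hypothetical, invented for illustration. It flattens two different situations into node-and-line form: one interaction that involves three components at once, and three separate interactions between pairs.

```python
from itertools import combinations


def pairwise_edges(interactions):
    """Flatten group interactions into the two-node links a traditional network stores."""
    edges = set()
    for group in interactions:
        # Every pair of components inside a group becomes one line in the network.
        for u, v in combinations(sorted(group), 2):
            edges.add((u, v))
    return edges


# Hypothetical components A, B, and C.
one_joint_interaction = [{"A", "B", "C"}]                     # all three interact together
three_separate_pairs = [{"A", "B"}, {"B", "C"}, {"A", "C"}]   # three two-way interactions

same_network = pairwise_edges(one_joint_interaction) == pairwise_edges(three_separate_pairs)
print(same_network)  # True: the network picture cannot tell the two situations apart
```

Both situations produce the same three lines, so a traditional network has no way to record that A, B, and C acted together.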

An alternative that works better for systems like the brain and social structures is the simplicial complexes approach. This approach uses mathematical objects to capture complicated interactions. Once you encode the data from a complex system into a simplicial complex, you can extract information by looking at its shape. For example, you can ask questions like: What is the shape of this dataset from a brain? Does the shape of this dataset tell me anything about the health of the brain?
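As a rough illustration of what "looking at the shape" can mean, the Python sketch below encodes two tiny, made-up datasets as simplicial complexes and computes the Euler characteristic, one simple numerical summary of shape. The choice of this particular summary is an assumption made for illustration, as are the helper names; the article does not say which shape measures the researchers use.

```python
from itertools import combinations


def all_simplices(facets):
    """List every simplex implied by the largest recorded interactions (facets).

    In a simplicial complex, any subset of an interaction is also part of the complex.
    """
    simplices = set()
    for facet in facets:
        nodes = sorted(facet)
        for size in range(1, len(nodes) + 1):
            simplices.update(combinations(nodes, size))
    return simplices


def euler_characteristic(facets):
    """Alternating count: points minus lines plus filled triangles, and so on."""
    return sum((-1) ** (len(simplex) - 1) for simplex in all_simplices(facets))


# Hypothetical data: one three-way interaction versus three two-way interactions.
filled_triangle = [("A", "B", "C")]
hollow_triangle = [("A", "B"), ("B", "C"), ("A", "C")]

print(euler_characteristic(filled_triangle))  # 1 (3 points - 3 lines + 1 triangle)
print(euler_characteristic(hollow_triangle))  # 0 (3 points - 3 lines)
```

The two datasets contain the same points and lines, yet their shapes differ: one is a solid triangle and the other is a loop with a hole in the middle.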

Researchers have demonstrated that this approach is effective, but it has lacked a statistical foundation, says Jean-Gabriel Young, a researcher at Université Laval who co-led the project with Alice Patania (Politecnico di Torino and ISI Foundation). In other words, when you organize information into simplicial complexes, it can be hard to determine whether the shape is surprising or random, that is, whether a pattern can be explained by chance or has some significant meaning. Making this determination requires a precise model that lets you compare your observed data with randomized versions of it, and such a model hasn’t existed until now.

Young and Patania met at the Santa Fe Institute in the summer of 2015, as attendees at a summer school for PhD students focused on complexity. Young was a physics student studying network science and Patania a math student studying topological data analysis, an area focused on finding and quantifying the shape of datasets. The two worked together over the summer and kept in touch. Over the course of two later visits, one by Young to Italy and one by Patania to Canada (the latter trip funded by a grant from the Young Researcher Network on Complex Systems), the researchers combined their expertise in different areas to address the question of how to quantify randomness in the shapes of complex systems.

Working with Patania’s advisors Giovanni Petri and Francesco Vaccarino, the two developed what they call the Simplicial Configuration Model (SCM). The SCM describes all of the ways in which the components of a simplicial complex can be arranged. The team also developed an algorithm for generating many random versions of a real dataset. Together, these tools enable a researcher to compare the real data with the simplicial complexes most likely to be produced by a random system. In this way, you can determine whether an observed pattern is statistically different from one produced by chance.
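The general logic of such a comparison can be sketched in a few lines of Python. This is not the team's algorithm. The shuffling rule (keep each group's size and each member's number of groups, and scramble everything else), the toy dataset, and the pattern being counted are all assumptions chosen for illustration.

```python
import random
from itertools import combinations


def shuffle_memberships(groups, rng, max_tries=1000):
    """Randomly reassign members to groups, keeping group sizes and membership counts."""
    sizes = [len(g) for g in groups]
    slots = [member for g in groups for member in sorted(g)]
    for _ in range(max_tries):
        rng.shuffle(slots)
        shuffled, start = [], 0
        for size in sizes:
            shuffled.append(frozenset(slots[start:start + size]))
            start += size
        # Reject any shuffle that puts the same member into one group twice.
        if all(len(g) == size for g, size in zip(shuffled, sizes)):
            return shuffled
    raise RuntimeError("no valid shuffle found")


def overlapping_group_pairs(groups):
    """Hypothetical pattern of interest: pairs of groups that share two or more members."""
    return sum(1 for g, h in combinations(groups, 2) if len(set(g) & set(h)) >= 2)


def empirical_p_value(groups, statistic, n_samples=500, seed=0):
    """Fraction of random versions that show at least as much of the pattern as the real data."""
    rng = random.Random(seed)
    observed = statistic(groups)
    hits = sum(
        statistic(shuffle_memberships(groups, rng)) >= observed
        for _ in range(n_samples)
    )
    return (hits + 1) / (n_samples + 1)


# Made-up dataset: four group interactions among six components.
observed_groups = [{"a", "b", "c"}, {"a", "b", "d"}, {"c", "d", "e"}, {"e", "f"}]
p_value = empirical_p_value(observed_groups, overlapping_group_pairs)
print(p_value)
```

A small value would suggest that the pattern rarely shows up by chance under this particular shuffling rule, while a large value would suggest that randomness alone can explain it.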

The team investigated three real datasets using their model:
• the relationship between flower-visiting insects and plants,
• the relationship between human disease and genes linked by known disorder-gene associations, and
• the relationship between the individuals involved in crimes in St. Louis (suspects, victims, and witnesses).

The test results show that the observed shape of the pollinator dataset is indistinguishable from random and reveals no higher-level organization. The structure is simply the result of random interactions among the insects and plants. In the other two cases, however, the observed structures have high levels of statistical significance. The patterns are very different from what would be produced by chance alone. In the case of the crime data, this significance comes at least in part from the way the data was collected: by looking up ties between suspects, victims, and witnesses already in a database. In the disease case, the significance doesn’t come from data collection, implying that the system self-organizes in some way.

The SCM can already be applied to real situations, as shown in the case studies, although there are some technical questions that still need to be addressed. The model sets the simplicial complexes approach on a more solid, objective footing and the researchers say it could lead to new insights on the emergence of patterns and higher-order structural properties in complex systems. Given the complex systems and massive datasets that make our world go 'round, that’s a good thing for all of us.

Kendra Redmond

