By some theoretical estimates, more unique biological interactions exist than there are stars in the known universe.
The biological foundations of life are built on an unimaginably vast network of interactions, with molecules, cells, systems and organisms constantly colliding with each other.
For centuries, scientists and physicians have relied on targeted techniques and isolated observations. Through slow, iterative, shared discoveries over generations, we have developed our understanding of biology, applying fractional knowledge to enable life-changing approaches in just a subset of disease states and dysfunction.
Humanity is now entering a new era of scientific discovery, using artificial intelligence to learn and reason about complex biological challenges.
Artificial intelligence
Thoughtful implementations reveal new information to solve important problems at the intersection of biology and medicine.
The use of AI allows us to organize and perceive the complexity of biological interactions on a scale greater than what the human brain is naturally capable of. These frameworks are supported by growing experimental data enabled by rapidly improving analytical technologies.
A much-discussed example of AI in biology is AlphaFold, whose developers shared the 2024 Nobel Prize in Chemistry. It is an AI model that predicts protein structures and interactions based on statistical regularities in structural and evolutionary data.
Proteins, responsible for a huge portion of biological interactions, can now be examined systematically and virtually in hours or days. This bypasses conventional methods that require weeks, months or even years of effort.
AlphaGenome, another AI-driven model from Google DeepMind, now allows researchers to quickly and efficiently predict how gene variants contribute to genetic landscapes that cause disease and dysfunction.
These disruptive AI approaches (and others) are already being widely applied to cancer, Alzheimer’s disease, pandemic response and beyond.
Correlation versus cause and effect
Importantly, the AI field is currently dominated by modeling approaches that are statistical in nature; that is, these models learn correlations, rather than cause and effect.
This distinction is important. Statistical models are limited by the context in which they can be applied.
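The gap between correlation and cause can be made concrete with a small simulation. The sketch below is illustrative only: the variable names (`regulator`, `gene_x`, `gene_y`) and the numbers are assumptions, not taken from any real dataset. A hidden regulator drives two genes, so they correlate in observational data even though neither affects the other. When the experimenter sets `gene_x` directly, the correlation should largely vanish.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate(n_cells, intervene, rng):
    """Correlation between gene_x and gene_y across simulated cells.

    Hypothetical toy model: a hidden regulator drives both genes,
    and gene_y never depends on gene_x.
    """
    xs, ys = [], []
    for _ in range(n_cells):
        regulator = rng.gauss(0, 1)  # unobserved common cause
        if intervene:
            # Perturbation: the experimenter sets gene_x, ignoring the regulator.
            gene_x = rng.gauss(0, 1)
        else:
            gene_x = regulator + rng.gauss(0, 1)
        gene_y = regulator + rng.gauss(0, 1)
        xs.append(gene_x)
        ys.append(gene_y)
    return pearson(xs, ys)


rng = random.Random(0)  # fixed seed so the run is repeatable
observed_r = simulate(5000, intervene=False, rng=rng)
perturbed_r = simulate(5000, intervene=True, rng=rng)
print(f"observational correlation: {observed_r:.2f}")
print(f"correlation under perturbation: {perturbed_r:.2f}")
```

In this toy setup the observational correlation should land near 0.5 and the perturbed one near zero. A purely statistical model trained on the observational data would treat `gene_x` as predictive of `gene_y`, which holds only as long as the hidden regulator behaves the same way.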
This brings us to the most important overarching question in the field today: how can we capture the cause and effect of every interaction that exists within this fuzzy network we call biology?
Contemporary solutions to this question are explored through hybrid computational frameworks. These are models that combine the limited structured knowledge we have about biological systems and how they function with multimodal datasets.
But what do I mean by knowledge? From a natural science perspective: established causal mechanisms or fundamental laws in physics, chemistry and biology.
From a medical perspective: established mechanisms of disease progression or aging.
And multimodal datasets? Data obtained to observe biology and medicine from different perspectives. These could be:
- Images of biology that capture the spatial characteristics of healthy or diseased states.
- Quantitative data that measure the levels of metabolites, genes, proteins, epigenetic marks, or other aspects of biological identity and function.
- Medical data that provide information about the broad variables that may (or may not) play a role in the onset and progression of disease.
These are just a few examples. As you might imagine, this is no easy task.
Training AI models
The Arc Institute is one of many groups tackling this by learning biological representations at the cellular level.
Arc Institute researchers are training AI to understand how gene networks work together to shape the cellular identity of more than 150 million cells from different organs in the body.
Researchers then perform perturbations: deliberate, informed changes to biology that reveal the causes and effects driving biological change. These changes alter cellular function or identity.
The data obtained from these experiments inform causal mechanisms in biology.
This means informing direct cause and effect, in addition to compensatory mechanisms (how biology tries to adapt in response to changes) and biological variance (how one cell may differ in its response from another).
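A toy calculation can show why compensatory mechanisms complicate the reading of a perturbation. Everything in this sketch is hypothetical: the function `cell_output`, the genes `gene_a` and `gene_b`, and the compensation rule are assumptions chosen for illustration, not a model used by any of the groups mentioned here.

```python
def cell_output(gene_a, compensation_strength):
    """Hypothetical cell readout that depends on gene_a plus a backup gene.

    Assumption: gene_b switches on in proportion to how far gene_a
    falls below its normal level of 1.0.
    """
    shortfall = max(0.0, 1.0 - gene_a)
    gene_b = compensation_strength * shortfall
    return gene_a + gene_b


normal = cell_output(gene_a=1.0, compensation_strength=0.9)
knockdown_compensated = cell_output(gene_a=0.2, compensation_strength=0.9)
knockdown_uncompensated = cell_output(gene_a=0.2, compensation_strength=0.0)

print(f"normal cell: {normal:.2f}")
print(f"knockdown with compensation: {knockdown_compensated:.2f}")
print(f"knockdown without compensation: {knockdown_uncompensated:.2f}")
```

With compensation switched on, knocking `gene_a` down to a fifth of its normal level barely moves the readout, so a naive analysis might conclude that the gene does not matter. Measuring the backup gene as well, or repeating the perturbation where compensation is absent, exposes the direct effect.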
These results are integrated into the model architecture to optimize how well it learns to predict a statistical-causal representation of the cell state. That is, a representation that is causally informed, but that also captures statistical representations of how large numbers of features (input variables) interact.
This approach and others like it are pushing the fields of biology and medicine forward at an accelerated pace.
However, biology is very complex. The question remains how we connect one aspect of the biological state of being (such as genes expressed for a particular cell identity or function) to the many other aspects that drive identity and function in biological contexts.
Extraordinary complexity
There is no denying that causally aware AI systems have the potential to accelerate drug discovery, optimize personalized treatment recommendations, and even provide novel mechanistic solutions across the breadth of biomedical science and medicine.
However, there are significant challenges in achieving these results. Biological systems are extremely complex.
These systems are high-dimensional, meaning they operate at the intersection of a very large number of variables. They are also noisy, because biological variance makes it difficult to separate meaningful signal from background variation.
Furthermore, biology is rich in compensatory mechanisms ingrained by evolution: when one variable goes wrong, the system tries to correct or compensate for it.
Even limited causal evidence is difficult to distinguish from correlation in biological systems, whether experimentally in the laboratory or medically in the clinic.
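The combined effect of high dimensionality and noise can be sketched with random numbers alone. In the example below, every "feature" and the "outcome" are pure noise, so any correlation between them arises by chance. The sample sizes and the count of 500 features are arbitrary assumptions chosen to keep the run short.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def strongest_chance_correlation(n_samples, n_features, rng):
    """Largest absolute correlation between a noise outcome and noise features."""
    outcome = [rng.gauss(0, 1) for _ in range(n_samples)]
    best = 0.0
    for _ in range(n_features):
        feature = [rng.gauss(0, 1) for _ in range(n_samples)]
        best = max(best, abs(pearson(feature, outcome)))
    return best


rng = random.Random(1)  # fixed seed so the run is repeatable
few_samples = strongest_chance_correlation(20, 500, rng)
many_samples = strongest_chance_correlation(1000, 500, rng)
print(f"best chance correlation, 20 samples: {few_samples:.2f}")
print(f"best chance correlation, 1000 samples: {many_samples:.2f}")
```

With only 20 samples, at least one of the 500 meaningless features should appear strongly correlated with the outcome. With 1,000 samples the strongest chance correlation shrinks considerably. Real biological datasets often have far more variables than samples, which is one reason apparent patterns need experimental validation.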
There are also other challenges:
- Insufficient data, or a lack of critical information within existing datasets.
- Inconsistencies and biases in data collection, including but not limited to underrepresentation and perspective biases in many contexts.
- Ethics in AI, a topic that could fill books across health, medicine and beyond.
However, the question remains: how can we reliably implement, interpret and translate these systems into solutions, in the face of all these obstacles?
Regenerative competence
Our own team, the Biernaskie lab at the University of Calgary, applies these approaches.
We study how reindeer regenerate their antlers, both seasonally and after injury. Our work aims first to model and predict this regenerative competence, and then to facilitate it in people.
Our first goal is to regenerate healthy skin in burn survivors, or significantly improve healing outcomes.
Severe burns result in fibrotic scars, an evolutionary mechanism that protects life by minimizing the risk of bleeding and infection. The result is dysfunctional scar tissue without sweat glands, hair follicles, or most of the cell types that coordinate healthy skin.
Burns are most common in children, and the physical, social, and psychological consequences of severe burns place a significant lifelong burden on survivors.
Other labs around the world are committed to using AI to solve complex problems in health and medicine, focusing on a wide range of approaches. These range from deeper integration of omics and imaging data to improved theoretical and experimental frameworks for validating causal mechanisms, robust cyclic validation to advance predictions using preclinical experiments, and transparent, fair, and ethical frameworks.
Together, professionals from across this transdisciplinary field may be on the cusp of a new era of solutions to some of the toughest challenges in health and medicine.
