In genetics, one harmful variant can be enough to cause disease—but two can make it far more severe. One notable example is KJ, an infant diagnosed with a rare urea cycle disorder with a grim prognosis: He had a 50 percent chance of dying. His story ultimately became one of scientific collaboration and a successful personalized gene therapy. As researchers and clinicians use genome sequencing to identify genetic variants in patients, these advancements also underscore a new question: Can this information do more than diagnose disease?
The human genome contains many variants with unknown functional consequences, and some combinations of variants behave in unexpected ways. This uncertainty motivated Aimée Dudley, a geneticist at Pacific Northwest Research Institute who studies rare diseases—including urea cycle disorders—to investigate whether one could predict the functional impact of genetic variation.
In a recent study published in the Proceedings of the National Academy of Sciences, Dudley and her colleagues demonstrated that two harmful variants within a gene could restore its function (1). This ran counter to the long-held belief that combining two mutations would worsen disease. The team then developed an AI-based model to show that this interaction was predictable from protein structure. Their findings suggest that approximately four percent of human genes possess structural features that enable this interaction, and the work could, in the future, help clinical geneticists make genotype-phenotype predictions.
The team focused on argininosuccinate lyase (ASL), a key enzyme responsible for detoxifying ammonia in the body. Variants that reduce ASL activity can cause a urea cycle disorder. The researchers used yeast screening assays to measure the functional impact of thousands of individual ASL variants and variant combinations.

Image caption: Tang, Dudley, and their colleagues developed an AI-based model, trained on the enzyme ASL, that can predict whether two variants can restore protein function. (Credit: Pacific Northwest Research Institute)
To Dudley’s surprise, the screens revealed that more than 60 percent of variant pairs in which each variant alone left the protein with no activity (“dead as a doornail,” as Dudley described it) recovered 80 to 100 percent of normal activity when combined. This effect, known as intragenic complementation, was first proposed in the 1960s but had never been tested at such a scale. It is a form of positive epistasis, in which two mutations within the same gene dampen the severity of a phenotype or rescue function.
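To make the screening criterion concrete, here is a minimal Python sketch of how a variant pair might be flagged as showing intragenic complementation. It is an illustration only, not the study's analysis pipeline: the variant names, the activity values, and the 5 percent cutoff for "no activity" are all hypothetical assumptions. Only the 80 percent rescue threshold comes from the figures quoted above.

```python
from dataclasses import dataclass

# Activities are fractions of normal enzyme activity (1.0 = normal).
INACTIVE_MAX = 0.05  # ASSUMPTION: "no activity" means at most 5% of normal
RESCUE_MIN = 0.80    # from the article: pairs recovered 80 to 100 percent


@dataclass
class PairMeasurement:
    variant_a: str        # hypothetical variant label
    variant_b: str        # hypothetical variant label
    activity_a: float     # activity with variant A alone
    activity_b: float     # activity with variant B alone
    activity_pair: float  # activity with both variants together


def shows_complementation(m: PairMeasurement) -> bool:
    """True if both variants are inactive alone but the pair is rescued."""
    both_inactive = (
        m.activity_a <= INACTIVE_MAX and m.activity_b <= INACTIVE_MAX
    )
    return both_inactive and m.activity_pair >= RESCUE_MIN


# Made-up measurements, for illustration only.
measurements = [
    PairMeasurement("varA", "varB", 0.01, 0.02, 0.92),  # rescued pair
    PairMeasurement("varA", "varC", 0.01, 0.03, 0.04),  # pair stays inactive
    PairMeasurement("varD", "varB", 0.60, 0.02, 0.95),  # varD is active alone
]

flags = [shows_complementation(m) for m in measurements]
fraction = sum(flags) / len(flags)
print(flags, fraction)
```

Under these assumptions, only the first pair counts. The third pair is excluded because one of its variants retains substantial activity on its own, so its high combined activity is not a rescue of two inactive variants.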
“[Often], when we look at pathogenic mutations, we immediately assume that they cause a loss of function of the protein or the gene. Here, you have two pathogenic mutations, and you have the opposite effect,” said Avner Schlessinger, a computational structural biologist at Icahn School of Medicine at Mount Sinai, who was not involved in the study. Although the result felt counterintuitive to him, he was pleased to see that the researchers could explain it through protein structure. He added that mutation effects are often difficult to rationalize structurally, but that they are quite clear in the study’s findings. As it turned out, Dudley and the team found that this intragenic complementation was not driven by the amino acid substitutions themselves, but by their position within the 3D protein structure.
Because this phenomenon had a clear protein structural basis, the researchers developed a machine-learning algorithm to learn the rules for when two deleterious variants could restore protein function and when they couldn’t. “It was a really big surprise to me at how good it was with that,” said Dudley. The model achieved nearly 100 percent accuracy in classifying variant pairs based on their ability to cause intragenic complementation.
Next, they sought to determine if this could be generalized to another protein in a pathway outside of the urea cycle, so they examined the human fumarase protein. This enzyme is involved in energy production and DNA repair. “It causes a completely different disease. It’s only 21 percent identical at the amino acid level, but it has a very similar shape,” Dudley explained. The model produced similar results, achieving greater than 91 percent accuracy.
Emboldened by these findings, the researchers turned to the Protein Data Bank, scouring human protein structures to assess how widespread this phenomenon might be. Their analysis suggested that intragenic complementation may occur in approximately four percent of genes in the human genome.
“You can imagine doing protein families and really being able to…generate dependent epistatic models for these,” said Dudley. Schlessinger, who applies AI and machine learning to study protein structure-function relationships, echoed these sentiments. He remarked that the study is an exciting proof-of-concept.
Schlessinger also wonders whether this phenomenon occurs in more complex protein systems and in contexts where variant interactions take place outside the active site.
Dudley and her team aim to address these questions and more: how frequently these interactions occur beyond the active site and whether such interactions are structurally predictable. She also emphasized that current databases and tools were not built to capture the types of context-dependent effects the team observed in the study. With many other genes likely to exhibit this phenomenon, their work underscores the need for new approaches to incorporate this added layer of information and improve genotype-phenotype predictions.
Reference
1. Tang M, et al. Predicting epistasis across proteins by structural logic. Proc Natl Acad Sci USA. 2026;123(3):e2516291123.