Generalizing results from animal models to human patients is a critical biomedical challenge. This problem is a key cause of the large proportion of failures encountered in moving therapeutics from preclinical studies to clinical trials (1). Direct translation of observations in rodents or nonhuman primates (NHPs) to humans frequently disappoints, for reasons including discrepancies in complexity and regulation between species. Because the experiments required to understand disease biology to the degree required for ascertaining effective treatments cannot be performed in human subjects, translation from animals to humans is necessary—and needs to be improved. Systems biology and machine learning (ML) can be used to translate relationships across species. Instead of attempting to “humanize” animal experimental models, which is possible to only a limited extent, greater success may be obtained by humanizing computational models derived from animal experiments.
High-throughput DNA and RNA sequencing has made it possible to compare large animal and human datasets to search for translatable features and assess the representativeness of animal models. This comparative approach is sensitive to how phenotypic and molecular similarity are defined, choices that influence apparent translatability. For example, two independent analyses of the same mouse and human transcriptomic datasets came to opposite conclusions about the utility of mice in inflammatory disease research (2, 3). The discrepancy derived from differences in the statistical methods and in the selection of mouse data and phenotypes to compare with those of humans. Such comparative studies that use animal-to-human dataset pairs, called cross-species pairs (CSPs), are subject to these pitfalls, demonstrating a need to move from descriptive approaches to predictive models that incorporate cross-species differences in data types and phenotypes into translation.
Although CSP comparisons are potentially problematic, they can highlight biology that is challenging to translate. In a recent study, transcriptomic profiles from humans and animal models were used to identify cross-species expression of genes according to sex in 12 tissues and 4 species (4). The authors showed that sex-specific differences may have evolved after speciation and therefore may not be translatable to humans. An example that uses CSPs to identify representative animal disease models is PhenoDigm, a computational method that ranks animal models by assigning similarity scores to animal and human disease phenotypes (5). These studies expand the knowledge base of both gene-phenotype associations and animal-human phenotype associations, aiding experimental design and interpretation.
By contrast, computational humanization shifts perspective from comparisons to translating predictive models of biological associations across species, incorporating diverse molecular and phenotypic data from animals and humans. These approaches span from translation of disease-gene or disease-pathway associations in comparable data types and phenotypes to more complex signaling network, mechanistic, or data-driven computational models that integrate multiple data types and phenotypes. The features delineating these models are the extents to which they incorporate different molecular and phenotypic measurements to model and compensate for species-specific differences to characterize translatable biology.
The most basic predictive translation from animal to human is of individual molecular-to-phenotypic associations, such as those based on orthology. Theoretically, orthologs should have equivalent functions across organisms, but considerable deviation in ortholog expression between mice and humans shows that many gene-phenotype relationships are not evolutionarily conserved (6, 7). Because orthology-function relationships do not broadly apply, computational models have been developed to identify functional orthologs across species. One example uses Bayesian probability scoring to integrate transcriptomic data across tissues, cell types, and species to infer functional homology through coexpression analysis (8). Expanded orthology knowledge bases provide a resource to identify gene-phenotype associations that are translatable beyond specific CSPs.
ML has also been explored for cross-species molecular-to-phenotypic translation. A challenge these approaches navigate is that cross-species translation involves predicting human biology from nonhuman systems, that is, predicting on a test set from a different domain (species) than that of the training set. Direct generalization of a model therefore raises problems akin to those of simple CSP comparisons. To address this, most ML methods use a training set of CSPs with well-matched cross-species data and phenotypes, providing curated examples of cross-species molecular-to-phenotypic relationships for model training (supervised learning). This approach enables explicit modeling of cross-species differences and mitigates comparative issues in CSPs. Typically, ML models are validated by comparing the predicted biology to that obtained by analyzing human data alone. Cross-validated performance of this kind provides an estimate of the accuracy to be expected from the model.
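The validation idea can be made concrete with a minimal sketch of leave-one-pair-out evaluation. Everything here is an illustrative assumption rather than a method from the cited studies: the toy fold-change values are invented, the model is a single least-squares slope through the origin, and the score is the fraction of genes whose predicted direction of change matches the held-out human data.

```python
# Toy cross-species pairs: (mouse fold changes, human fold changes) per gene.
# All numbers are invented for illustration only.
pairs = [
    ([1.0, -2.0, 0.5], [0.8, -1.5, 0.2]),
    ([2.0, -1.0, 1.5], [1.0, -0.5, -0.3]),
    ([-1.0, 0.5, 1.0], [-0.7, 0.6, 0.9]),
]


def fit_slope(training_pairs):
    """Least-squares slope through the origin, pooled over all training pairs."""
    sxy = sum(m * h for mouse, human in training_pairs
              for m, h in zip(mouse, human))
    sxx = sum(m * m for mouse, _ in training_pairs for m in mouse)
    return sxy / sxx if sxx else 0.0


def leave_one_pair_out(all_pairs):
    """Hold out each pair in turn and score sign agreement on its human data."""
    scores = []
    for i, (mouse, human) in enumerate(all_pairs):
        training = all_pairs[:i] + all_pairs[i + 1:]
        slope = fit_slope(training)
        predictions = [slope * m for m in mouse]
        agree = sum((p > 0) == (h > 0) for p, h in zip(predictions, human))
        scores.append(agree / len(human))
    return scores


scores = leave_one_pair_out(pairs)
print(scores)
```

Holding out whole pairs, rather than individual genes, keeps the test data in a context the model has not seen, which is closer to the situation of predicting a new human phenotype from a new animal experiment.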
One systematic ML effort is the SBV-IMPROVER Species Translation Challenge (9). Transcriptomic and phosphoproteomic data were generated for human and rat bronchial epithelial cells under 52 stimulation conditions that modulated transcriptional regulation and pathway activity. Several translation challenges were posed, such as rat-to-human prediction of phosphoproteomic responses to stimuli as well as prediction of signaling pathway and regulatory functions. Many ML approaches—such as support vector machines, decision trees, and neural networks—performed well, but no approach was broadly effective across challenges, indicating that translating different molecular data types may require different ML models.
Others have used transcriptomic data to train ML models. Found In Translation (FIT) uses 170 mouse-human CSPs across 28 diseases to train a lasso regression model to predict gene-disease associations in humans (10). FIT trained a model for each gene individually and improved human disease gene prediction from mice by 20 to 50%. An alternative approach is to build models that reflect multigene effects to move toward systems-centric translation and reflect biological complexity. An effort that benchmarked eight ML models across 36 CSPs in inflammatory pathologies found that semi-supervised approaches, using unsupervised integration of human data with supervised models of mouse data, were effective for context-specific gene- and pathway-disease association prediction (11). These models improved the coverage of predicted pathways by up to 50%.
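A per-gene lasso model of the kind described for FIT can be illustrated with a deliberately reduced sketch. The assumptions are substantial: each gene has a single predictor (the mouse effect size), there is no intercept, the gene names and values are hypothetical, and the penalty is fixed by hand. FIT's actual features and tuning are not described here, so this shows only the general technique. With one predictor, the lasso solution has a closed form based on soft-thresholding.

```python
def soft_threshold(value, penalty):
    """Shrink value toward zero by penalty; values within the penalty become 0."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def fit_gene_lasso(mouse_effects, human_effects, penalty):
    """One-predictor lasso without intercept.

    Minimizes (1 / 2n) * sum((h - beta * m)^2) + penalty * |beta|.
    """
    n = len(mouse_effects)
    covariance = sum(m * h for m, h in zip(mouse_effects, human_effects)) / n
    scale = sum(m * m for m in mouse_effects) / n
    if scale == 0:
        return 0.0
    return soft_threshold(covariance, penalty) / scale


# Hypothetical training data: for each gene, effect sizes across three CSPs.
training = {
    "gene_a": ([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]),    # mouse tracks human
    "gene_b": ([1.0, -1.0, 2.0], [0.1, 0.2, -0.1]),  # weak, inconsistent
}

# One independently fitted model (a single coefficient) per gene.
models = {gene: fit_gene_lasso(mouse, human, penalty=0.5)
          for gene, (mouse, human) in training.items()}

# Predict the human effect for a new mouse experiment (hypothetical values).
new_mouse = {"gene_a": 1.5, "gene_b": 1.5}
predicted_human = {gene: models[gene] * effect
                   for gene, effect in new_mouse.items()}
print(models, predicted_human)
```

In this toy case the penalty shrinks the coefficient for the consistent gene slightly below its unpenalized value and sets the coefficient for the inconsistent gene to zero, so the model makes no human prediction from mouse data that have not translated reliably in training.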
The SBV-IMPROVER, FIT, and semi-supervised methods highlight some key considerations. SBV-IMPROVER showed that ML improves on direct extrapolation of animal biology to humans, but generating new training data for every animal model, disease indication, and perturbation would prohibitively limit the use of ML approaches. FIT aimed for broad utility by training on data from many disease contexts, but this potentially obscures complex, context-specific biology. Semi-supervised models leverage context-specific animal and human systems effects but sacrifice some statistical power. Comparing methods is challenging because of differences in reported metrics. Implementing standard performance metrics for ML cross-species translation could catalyze the development of more effective methods.
Because data coverage and resolution can vary across species and confound ML methods, alternative approaches have been developed for translating mixed data types and phenotypes. These methods include signaling network and mechanistic models for predicting biology across species. The flexibility of these methods enables deeper interrogation of context-specific biology, but with a trade-off in generalizability to other diseases and species. Therefore, the utility of these approaches is in repurposing the methods to other biological contexts.
Signaling network models enable integration of heterogeneous data with existing knowledge bases. For example, diseaseQUEST (12) combines disease-gene associations from genome-wide association studies (GWASs) in humans with in silico model organism functional networks. The authors applied diseaseQUEST to identify candidate genes with conserved cross-species functions in 25 diseases and traits. Behavioral screens on the top predicted genes with Parkinson’s disease (PD)–associated phenotypes in the worm Caenorhabditis elegans revealed that several genes were associated with age-dependent motility defects that mirrored PD symptoms. Computational network modeling enabled integration of genes identified in human GWASs with disease and tissue context. Network models have also been used to translate metabolic perturbations through orthology-based interaction mapping (13). Human metabolic interactions likely conserved in rats were used to humanize a genome-scale rat metabolism network. Gene responses to 76 compounds were analyzed on this network to identify species-specific metabolic biomarkers. These studies show how network integration of prior knowledge and predictive models can enable cross-species predictions.
Signaling networks also facilitate meta-analysis-based methods, in which hypotheses are assessed from multiple sources of evidence when pooling raw data is infeasible. This motivated a study in which mouse and human tumor data were integrated to study mutant KRAS oncogenic signaling (14). A meta-analysis method was developed to statistically humanize tissue-specific mouse proteomic networks with human mutations and proteomics data. Overlaying genetic screening data from human cancer cell lines on these networks identified mutant KRAS allele-specific synthetic lethality (in which loss of a gene in the context of another genetic alteration confers lethality) that was validated in human cancer cell lines. Variants of network model approaches could enable the prediction of cross-species responses to perturbations by integrating multiple data types and phenotypes. However, such responses are typically inferred using data from other contexts rather than direct measurements, and many signaling network databases are incomplete, which may lead to false negatives.
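Orthology-based interaction mapping can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the procedure used in the cited work: the gene identifiers are placeholders, the ortholog table is a hypothetical one-to-one mapping, and an interaction is transferred only when both partners have an ortholog in the other species.

```python
# Placeholder human interactions and a hypothetical one-to-one ortholog table.
human_edges = [("hA", "hB"), ("hB", "hC"), ("hC", "hD")]
human_to_rat = {"hA": "rA", "hB": "rB", "hC": "rC"}  # hD has no rat ortholog


def map_interactions(edges, ortholog_map):
    """Transfer edges whose endpoints both have orthologs; report the rest."""
    mapped = set()
    unmapped = []
    for source, target in edges:
        if source in ortholog_map and target in ortholog_map:
            # Sort endpoints so each undirected edge has one representation.
            edge = tuple(sorted((ortholog_map[source], ortholog_map[target])))
            mapped.add(edge)
        else:
            unmapped.append((source, target))
    return mapped, unmapped


rat_edges, species_specific = map_interactions(human_edges, human_to_rat)
print(sorted(rat_edges))
print(species_specific)
```

The interactions that fail to map are as informative as those that succeed, because they mark candidate species-specific biology that a shared network cannot represent.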
Sometimes, understanding cross-species mechanistic differences beyond what network or ML methods provide is required. For example, a mechanistic model integrating human and NHP antigen-specific T cell responses in tuberculosis was needed to characterize species-specific vaccination responses (15). Despite species-specific differences, a single computational model for T cell priming, proliferation, and differentiation described vaccine responses in both species through the appropriate alteration of parameter values. The ability of a single model to describe cross-species biology raises the issue of how to define cross-species parameters for ML and network models. One approach may be to use mechanistic or network information to train models as a hybrid approach to incorporate biological mechanisms into ML.
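The idea of one model structure shared across species, with only the parameter values changing, can be illustrated with a toy simulation. The equations, parameter names, and values below are hypothetical and are not taken from the cited tuberculosis model: naive cells are primed at rate k_prime, effector cells proliferate logistically at rate p up to a carrying capacity, and effectors differentiate into memory cells at rate d. Integration uses a simple forward Euler scheme.

```python
# Hypothetical species-specific parameter sets for one shared model structure.
PARAMS = {
    "human": {"k_prime": 0.2, "p": 0.5, "d": 0.10, "capacity": 1000.0},
    "nhp": {"k_prime": 0.3, "p": 0.6, "d": 0.05, "capacity": 1000.0},
}


def simulate(params, naive0=100.0, t_end=30.0, dt=0.01):
    """Forward Euler integration of a toy priming/proliferation/differentiation model.

    Returns the (naive, effector, memory) cell counts at time t_end.
    """
    naive, effector, memory = naive0, 0.0, 0.0
    for _ in range(int(round(t_end / dt))):
        primed = params["k_prime"] * naive
        growth = params["p"] * effector * (1.0 - effector / params["capacity"])
        differentiated = params["d"] * effector
        naive += dt * (-primed)
        effector += dt * (primed + growth - differentiated)
        memory += dt * differentiated
    return naive, effector, memory


# The same function describes both species; only the parameters differ.
results = {species: simulate(values) for species, values in PARAMS.items()}
for species, (naive, effector, memory) in results.items():
    print(species, round(naive, 3), round(effector, 1), round(memory, 1))
```

In a sketch like this, the species difference is carried entirely by the parameter dictionary, which is what makes the question of how to define and estimate cross-species parameters central.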
A hallmark of these approaches is a combination of cross-species or multi-omic data integration. Principles being established for within-species multi-omics integration may be generalizable to model cross-species data by using ML, network, or mechanistic models and can likely be adapted to other frameworks. The diversity of cross-species translation challenges will mandate a spectrum of different computational frameworks rather than imagining a “one size fits all” approach. New experimental technologies will produce new types of data and likely motivate the development of new computational models. Any model will need to balance generalizability, which limits biological resolution, with the need to make disease, tissue, and cell type–specific inferences in species translation. A promising way forward is to use ML approaches for discovery purposes and network, mechanistic, or emerging computational approaches to study context-specific biology. Because context-specific predictive models will necessarily use less data than will generalized approaches, new methods are needed to integrate these models with data from biological knowledge bases of orthology, network topology, and cross-species phenotypic similarity. These considerations motivate the participation of researchers who bring approaches from various disciplines—including clinical, engineering, and biological sciences—into what must become an expanding area of biomedicine.
Acknowledgments: The authors are supported by the Research Beyond Borders SHINE Program at Boehringer Ingelheim Pharmaceuticals, the NCI Cancer Systems Biology program, and the Army Institute for Collaborative Biotechnologies. We thank M. Carroll and M. Lee for their thoughtful input.