To combat viruses, bacteria and other pathogens, synthetic biology offers new technological approaches whose performance is being validated in experiments. Researchers from the Würzburg Helmholtz Institute for RNA-based Infection Research and the Helmholtz AI Cooperative applied data integration and artificial intelligence (AI) to develop a machine learning approach that can predict the efficacy of CRISPR technologies more accurately than before. The findings were published today in the journal Genome Biology.
The genome or DNA of an organism contains the blueprint for proteins and orchestrates the production of new cells. To combat pathogens, cure genetic diseases or achieve other positive effects, molecular biological CRISPR technologies are being used to specifically alter or silence genes and inhibit protein production.
One of these molecular biological tools is CRISPRi (from “CRISPR interference”). CRISPRi blocks genes and gene expression without modifying the DNA sequence. As with the CRISPR-Cas system, also known as “gene scissors,” this tool involves a ribonucleic acid (RNA), which serves as a guide RNA to direct a nuclease (Cas). In contrast to gene scissors, however, the CRISPRi nuclease only binds to the DNA without cutting it. This binding results in the corresponding gene not being transcribed and thus remaining silent.
Until now, it has been difficult to predict the performance of this method for a particular gene. Researchers from the Würzburg Helmholtz Institute for RNA-based Infection Research (HIRI), in cooperation with the University of Würzburg and the Helmholtz Artificial Intelligence Cooperation Unit (Helmholtz AI), have now developed a machine learning approach using data integration and artificial intelligence (AI) to improve such predictions in the future.
The approach
CRISPRi screens are a highly sensitive tool that can be used to investigate the effects of reduced gene expression. In their study, published today in the journal Genome Biology, the scientists used data from multiple genome-wide CRISPRi essentiality screens to train a machine learning approach. Their goal: to better predict the efficacy of the engineered guide RNAs deployed in the CRISPRi system.
“Unfortunately, genome-wide screens only provide indirect information about guide efficiency. Hence, we have applied a new machine learning method that disentangles the efficacy of the guide RNA from the impact of the silenced gene,” explains Lars Barquist. The computational biologist initiated the study and heads a bioinformatics research group at the Würzburg Helmholtz Institute, a site of the Braunschweig Helmholtz Centre for Infection Research in cooperation with the Julius-Maximilians-Universität Würzburg.
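The idea of separating a gene-level effect from a guide-level effect can be illustrated with a deliberately simple sketch. This is not the model used in the study, which is not described in detail here. It only assumes, for illustration, that each guide has a measured log2 fold change, that the gene effect can be approximated by the mean over all guides targeting that gene, and that what remains is attributed to the guide. The function name, the tuple layout and the toy values are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def split_gene_and_guide_effects(observations):
    """Split each measurement into a gene-level and a guide-level part.

    observations: list of (gene, guide_id, log2_fold_change) tuples
    (hypothetical layout chosen for this sketch).
    Returns (gene_effect, guide_residual) as two dictionaries.
    """
    # Collect all fold changes measured for guides of the same gene.
    by_gene = defaultdict(list)
    for gene, _guide_id, lfc in observations:
        by_gene[gene].append(lfc)

    # Assumption: the gene effect is the mean depletion across its guides.
    gene_effect = {gene: mean(values) for gene, values in by_gene.items()}

    # Whatever is left after removing the gene effect is attributed to the guide.
    guide_residual = {
        guide_id: lfc - gene_effect[gene]
        for gene, guide_id, lfc in observations
    }
    return gene_effect, guide_residual


# Made-up toy values, not data from the study.
toy = [
    ("geneA", "A1", -4.0),
    ("geneA", "A2", -2.0),
    ("geneB", "B1", -1.0),
    ("geneB", "B2", -0.5),
]
gene_effect, guide_residual = split_gene_and_guide_effects(toy)
print(gene_effect)
print(guide_residual)
```

In this toy case, guide A2 is depleted more strongly than B1 in absolute terms, yet relative to its own gene it is the weaker guide. That is the kind of confounding a raw screen readout cannot resolve on its own.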
Supported by additional AI tools (“Explainable AI”), the team established comprehensible design rules for future CRISPRi experiments. The study authors validated their approach by conducting an independent screen targeting essential bacterial genes, showing that their predictions were more accurate than those of previous methods.
“The results have shown that our model outperforms existing methods and provides more reliable predictions of CRISPRi performance when targeting specific genes,” says Yanying Yu, PhD student in Lars Barquist’s research group and first author of the study.
The scientists were particularly surprised to find that the guide RNA itself is not the primary factor in determining CRISPRi depletion in essentiality screens. “Certain gene-specific characteristics related to gene expression appear to have a greater impact than previously assumed,” explains Yu.
The study also reveals that integrating data from multiple data sets significantly improves predictive accuracy and enables a more reliable assessment of the efficiency of guide RNAs. “Expanding our training data by pulling together multiple experiments is important to create better prediction models. Prior to our study, lack of data was a significant limiting factor for prediction accuracy,” summarizes junior professor Barquist. The approach now published can be very helpful in planning more effective CRISPRi experiments in the future and serve both biotechnology and basic research. “Our study provides a blueprint for developing more precise tools to control bacterial gene expression and ultimately help to better understand and combat pathogens,” says Barquist.
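One common obstacle when pooling several screens is that each experiment reports depletion on its own scale. A minimal sketch of one way to handle this, assuming per-screen z-score standardization before the measurements are combined into one training table, is shown here. The article does not state how the authors integrated their data sets, so this is an illustration only; the screen names, guide identifiers and values are hypothetical.

```python
from statistics import mean, pstdev


def standardize(values):
    """Return z-scores for a list of numbers (all zeros if there is no spread)."""
    mu = mean(values)
    sd = pstdev(values)
    if sd == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sd for v in values]


def pool_screens(screens):
    """Combine several screens into one list of comparable measurements.

    screens: dict mapping screen name -> dict of guide_id -> log2 fold change
    (hypothetical layout chosen for this sketch).
    Returns a list of (screen, guide_id, z_score) tuples.
    """
    pooled = []
    for name, measurements in screens.items():
        guide_ids = list(measurements)
        # Standardize within each screen so scales are comparable across screens.
        z_scores = standardize([measurements[g] for g in guide_ids])
        pooled.extend((name, g, z) for g, z in zip(guide_ids, z_scores))
    return pooled


# Made-up toy values, not data from the study.
screens = {
    "screen_1": {"g1": -4.0, "g2": -2.0},
    "screen_2": {"g1": -1.0, "g2": -0.5, "g3": 0.0},
}
pooled = pool_screens(screens)
for row in pooled:
    print(row)
```

The pooled rows could then serve as a larger training set for a prediction model, with the screen name kept as a column so that remaining batch differences can still be accounted for.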
The results at a glance
• Gene features matter: The characteristics of targeted genes have a significant impact on guide RNA depletion in genome-wide screens.
• Data integration improves predictions: Combining data from multiple CRISPRi screens significantly improves the accuracy of prediction models and enables more reliable estimates of guide RNA efficiency.
• Designing better CRISPRi experiments: The study provides valuable insights for designing more effective CRISPRi experiments by predicting guide RNA efficiency, enabling precise gene-silencing strategies.
Funding
The study was supported by funds from the Bavarian State Ministry of Science and Art through the bayresq.net research network.