Optimal protospacer sequences recommended by ensemble deep learning for high-efficiency base editing

Hui Peng, Xiaocai Zhang, Yuansheng Liu, Yi Pan, Wilson Wen Bin Goh*, Jinyan Li*

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Identifying an optimal protospacer at a disease-associated gene is critical for designing a high-efficiency base editor that edits the gene with little bystander effect. Current machine learning methods tend to overestimate the editing efficiencies of low-efficiency editors, which leads to wrong recommendations of high-efficiency protospacers. They also predict editing efficiency and outcome proportion with two independent models, which gives rise to performance inconsistency and confuses the identification of optimal protospacers. We propose an ensemble of paired convolutional neural networks that accurately predicts outcome proportions, and we then derive editing efficiencies directly from those proportions. Our method significantly reduces the performance inconsistency between editing efficiency and outcome proportion predictions caused by the two-model approach, and it generalizes well across a range of editing platforms. Furthermore, our method recommends optimal protospacers by ranking candidates by their picking score, which we newly define as a harmonic indexing score that integrates both on-target editing efficiency and the bystander editing effect.
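The abstract does not give the formulas, so the following is only an illustrative sketch of the two ideas it names: deriving an editing efficiency from predicted outcome proportions, and ranking candidates by a harmonic-style score. Every name here (`editing_efficiency`, `picking_score`, `rank_candidates`) is hypothetical, and the specific choice of a harmonic mean of efficiency and "purity" (the share of edited reads that carry only the desired edit) is an assumption made for illustration, not the paper's definition.

```python
from typing import Dict, List, Tuple


def editing_efficiency(proportions: Dict[str, float], unedited: str) -> float:
    """Assumed derivation: efficiency is the total proportion of all
    outcomes other than the unedited sequence."""
    return sum(p for seq, p in proportions.items() if seq != unedited)


def picking_score(proportions: Dict[str, float], unedited: str, desired: str) -> float:
    """Hypothetical score: harmonic mean of editing efficiency and purity.

    Purity is the fraction of edited outcomes that match the desired
    (on-target only) outcome, so bystander edits lower the score.
    """
    eff = editing_efficiency(proportions, unedited)
    if eff <= 0.0:
        return 0.0
    purity = proportions.get(desired, 0.0) / eff
    if purity <= 0.0:
        return 0.0
    return 2.0 * eff * purity / (eff + purity)


def rank_candidates(
    candidates: List[Tuple[str, Dict[str, float], str, str]]
) -> List[Tuple[str, float]]:
    """Rank (name, proportions, unedited, desired) tuples by descending score."""
    scored = [(name, picking_score(props, unedited, desired))
              for name, props, unedited, desired in candidates]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy, made-up outcome proportions for two candidate protospacers.
candidates = [
    ("protospacer_A", {"ACGT": 0.40, "GCGT": 0.45, "GCGG": 0.15}, "ACGT", "GCGT"),
    ("protospacer_B", {"TTAC": 0.80, "TTGC": 0.05, "CTGC": 0.15}, "TTAC", "TTGC"),
]
ranking = rank_candidates(candidates)
```

Because both quantities come from the same predicted proportions, the efficiency and the score cannot disagree with each other, which is the consistency argument the abstract makes for a single-model approach.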

Original language: English
Title of host publication: ACM-BCB 2024 - 15th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics
Publisher: Association for Computing Machinery, Inc
ISBN (Electronic): 9798400713026
Publication status: Published - Dec 16 2024
Externally published: Yes
Event: 15th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics, ACM-BCB 2024 - Shenzhen, China
Duration: Nov 22 2024 to Nov 25 2024

Publication series

Name: ACM-BCB 2024 - 15th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics

Conference

Conference: 15th ACM Conference on Bioinformatics, Computational Biology, and Health Informatics, ACM-BCB 2024
Country/Territory: China
City: Shenzhen
Period: 11/22/24 to 11/25/24

Bibliographical note

Publisher Copyright:
© 2024 Copyright held by the owner/author(s).

ASJC Scopus Subject Areas

  • Computer Science Applications
  • Software
  • Biomedical Engineering
  • Health Informatics

Keywords

  • Base editing
  • Ensemble Paired Convolutional Neural Network
  • Picking score
