Published in

BioMed Central, Genome Biology, 1(24), 2023

DOI: 10.1186/s13059-023-02948-3

Links

Tools

Export citation

Search in Google Scholar

High-throughput deep learning variant effect prediction with Sequence UNET

Journal article published in 2023 by Pedro Beltrao, Alistair S. Dunham ORCID, Mohammed AlQuraishi
This paper is made freely available by the publisher.
This paper is made freely available by the publisher.

Full text: Download

Green circle
Preprint: archiving allowed
Green circle
Postprint: archiving allowed
Green circle
Published version: archiving allowed
Data provided by SHERPA/RoMEO

Abstract

AbstractUnderstanding coding mutations is important for many applications in biology and medicine but the vast mutation space makes comprehensive experimental characterisation impossible. Current predictors are often computationally intensive and difficult to scale, including recent deep learning models. We introduce Sequence UNET, a highly scalable deep learning architecture that classifies and predicts variant frequency from sequence alone using multi-scale representations from a fully convolutional compression/expansion architecture. It achieves comparable pathogenicity prediction to recent methods. We demonstrate scalability by analysing 8.3B variants in 904,134 proteins detected through large-scale proteomics. Sequence UNET runs on modest hardware with a simple Python package.