Generating Tertiary Protein Structures via Interpretable Graph Variational Autoencoders

Guo, Xiaojie; Du, Yuanqi; Tadepalli, Sivani; Zhao, Liang; Shehu, Amarda

Published in

Oxford University Press, Bioinformatics Advances, 1(1), 2021

DOI: 10.1093/bioadv/vbab036

Journal article published in 2021 by Xiaojie Guo, Yuanqi Du, Sivani Tadepalli, Liang Zhao, Amarda Shehu

Abstract

Motivation: Modeling the structural plasticity of protein molecules remains challenging. Most research has focused on obtaining one biologically active structure. This includes the recent AlphaFold2 that has been hailed as a breakthrough for protein modeling. Computing one structure does not suffice to understand how proteins modulate their interactions and even evade our immune system. Revealing the structure space available to a protein remains challenging. Data-driven approaches that learn to generate tertiary structures are increasingly garnering attention. These approaches exploit the ability to represent tertiary structures as contact or distance maps and make direct analogies with images to harness convolution-based generative adversarial frameworks from computer vision. Since such opportunistic analogies do not allow capturing highly structured data, current deep models struggle to generate physically realistic tertiary structures.

Results: We present novel deep generative models that build upon the graph variational autoencoder framework. In contrast to existing literature, we represent tertiary structures as ‘contact’ graphs, which allow us to leverage graph-generative deep learning. Our models are able to capture rich local and distal constraints and additionally compute disentangled latent representations that reveal the impact of individual latent factors. This elucidates what the factors control and makes our models more interpretable. Rigorous comparative evaluation along various metrics shows that the models we propose advance the state of the art. While there is still much ground to cover, the work presented here is an important first step, and graph-generative frameworks promise to get us to our goal of unraveling the exquisite structural complexity of protein molecules.

Availability and implementation: Code is available at https://github.com/anonymous1025/CO-VAE.

Supplementary information: Supplementary data are available at Bioinformatics Advances online.
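
Illustrative note: the abstract describes representing a tertiary structure as a ‘contact’ graph. The sketch below shows one common way such a graph can be built from residue coordinates. It is not taken from the paper or its code repository: the function name `contact_graph`, the use of one coordinate per residue (e.g. the C-alpha atom), and the 8 Å distance cutoff are all assumptions made for illustration.

```python
import math


def contact_graph(coords, threshold=8.0):
    """Build a symmetric 0/1 adjacency matrix from per-residue coordinates.

    Assumption: two residues are "in contact" when their Euclidean distance
    is below `threshold` (8.0 Angstroms here, a commonly used cutoff; the
    paper's exact definition may differ).
    """
    n = len(coords)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) < threshold:
                # Contacts are undirected, so fill both entries.
                adj[i][j] = adj[j][i] = 1
    return adj


# Hypothetical toy input: four residues on a line, 3.8 Angstroms apart.
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
adj = contact_graph(coords)
for row in adj:
    print(row)
```

In this toy input, only the first and last residues (11.4 Å apart) fall outside the cutoff. An adjacency matrix of this kind is the sort of graph input a graph variational autoencoder could consume, in contrast to treating a contact map as an image.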
