Global NEST International Conference on Environmental Science & Technology, 2023
Freshwater cyanobacteria are prominent sources of structurally diverse natural compounds. Bioactive cyanometabolites are particularly relevant to water quality and public health protection. Non-targeted analysis (NTA) by liquid chromatography-high-resolution mass spectrometry (LC-HRMS) is applied to expand the range of detected and identified metabolites. However, data analysis is challenging and subject to limitations arising from the limited availability of experimental or library-based mass spectra. We present an HRMS data analysis workflow using state-of-the-art computational tools, which we applied to samples from cyanobacterial blooms in Greek lakes. Data were pre-processed in MZmine 3 (feature detection, deconvolution, alignment, deisotoping, gap filling). Processed data were exported to GNPS for feature-based molecular networking (FBMN) and annotation against public GNPS libraries. In parallel, feature lists were processed in SIRIUS and its associated tools for de novo molecular formula annotation, database search, prediction of compound classes using molecular fingerprints, and ranking of candidates using fragmentation trees. Results were visualized and further explored in Cytoscape to enable annotation propagation. Such workflows substantially expand the chemical space of annotated cyanometabolites at the structural and compound-class levels and support the discovery of new compounds that are not included in libraries.
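
The idea of annotation propagation in a molecular network can be illustrated with a minimal Python sketch. It is not the authors' implementation and does not use any GNPS, SIRIUS, or Cytoscape API. The function name `propagate_annotations`, the 0.7 cosine threshold, and the toy feature IDs and class labels are all assumptions made for illustration. Each unannotated feature receives a suggested compound class from its directly connected, library-annotated neighbours, weighted by spectral similarity.

```python
from collections import defaultdict


def propagate_annotations(edges, annotations, min_cosine=0.7):
    """Suggest a compound class for unannotated features (one-step propagation).

    edges       -- iterable of (feature_a, feature_b, cosine_score) network edges
    annotations -- dict mapping feature ID -> compound class from a library match
    min_cosine  -- hypothetical similarity threshold; edges below it are ignored
    """
    # Build an undirected adjacency list from edges that pass the threshold.
    neighbours = defaultdict(list)
    for a, b, cosine in edges:
        if cosine >= min_cosine:
            neighbours[a].append((b, cosine))
            neighbours[b].append((a, cosine))

    suggestions = {}
    for feature, linked in neighbours.items():
        if feature in annotations:
            continue  # already annotated by a library match
        # Sum the cosine scores of annotated neighbours per compound class.
        votes = defaultdict(float)
        for other, cosine in linked:
            if other in annotations:
                votes[annotations[other]] += cosine
        if votes:
            suggestions[feature] = max(votes, key=votes.get)
    return suggestions


# Toy network (invented values, for illustration only).
edges = [
    ("F1", "F2", 0.91),
    ("F2", "F3", 0.84),
    ("F3", "F4", 0.55),  # below threshold, ignored
    ("F5", "F2", 0.78),
]
annotations = {"F1": "microcystin", "F5": "microcystin"}

suggestions = propagate_annotations(edges, annotations)
print(suggestions)
```

In this sketch only direct neighbours of library-annotated features receive a suggestion, so a feature two edges away from any annotation stays unlabelled. Propagated labels of this kind are hypotheses for manual review, not confirmed identifications.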