Published in

Springer Verlag, Lecture Notes in Computer Science, p. 139-155

DOI: 10.1007/978-3-540-73255-6_13

Fast Approximate Duplicate Detection for 2D-NMR Spectra.

Proceedings article published in 2007 by Björn Egert, Steffen Neumann, Alexander Hinneburg
This paper is available in a repository.

Preprint: archiving allowed
Postprint: archiving allowed
Published version: archiving forbidden
Data provided by SHERPA/RoMEO

Abstract

2D-Nuclear magnetic resonance (NMR) spectroscopy is a powerful analytical method to elucidate the chemical structure of molecules. In contrast to 1D-NMR spectra, 2D-NMR spectra correlate the chemical shifts of 1H and 13C simultaneously. To curate or merge large spectra libraries, a robust and fast duplicate detection is needed. We propose a definition of duplicates with the desired robustness properties mandatory for 2D-NMR experiments. A major gain in runtime performance with respect to previously proposed heuristics is achieved by mapping the spectra to simple discrete objects. We propose several appropriate data transformations for this task. In order to compensate for slight variations of the mapped spectra, we use appropriate hashing functions according to the locality sensitive hashing scheme, and identify duplicates by hash collisions.
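
The general idea in the abstract (map each spectrum to a simple discrete object, hash it with a locality sensitive scheme, and treat hash collisions as duplicate candidates) can be illustrated with a minimal sketch. This is not the authors' method: the abstract does not specify the transformations or hash functions, so the sketch assumes a fixed grid discretization of (1H, 13C) peak positions and swaps in min-hashing with banding as one well-known locality sensitive hashing scheme for sets. All function names, grid widths, and parameters are hypothetical.

```python
import math
import random
from collections import defaultdict
from itertools import combinations

PRIME = (1 << 61) - 1  # Mersenne prime for the illustrative hash family


def discretize(peaks, h_width=0.05, c_width=0.5):
    """Map (1H ppm, 13C ppm) peaks to a set of integer grid-cell keys.

    Assumption: a fixed grid with hypothetical cell widths. Peaks close to
    a cell border may land in different cells, which this sketch ignores.
    """
    cells = set()
    for h_ppm, c_ppm in peaks:
        i = math.floor(h_ppm / h_width)
        j = math.floor(c_ppm / c_width)
        # Pack the cell index into one integer (assumes non-negative shifts).
        cells.add(i * 1_000_003 + j)
    return cells


def make_hash_params(n_hashes, seed=0):
    """Draw (a, b) pairs for hash functions x -> (a * x + b) mod PRIME."""
    rng = random.Random(seed)
    return [(rng.randrange(1, PRIME), rng.randrange(0, PRIME))
            for _ in range(n_hashes)]


def minhash_signature(cells, params):
    """Min-hash signature of a non-empty set of cell keys."""
    return [min((a * x + b) % PRIME for x in cells) for a, b in params]


def candidate_duplicates(spectra, n_bands=8, rows_per_band=2, seed=0):
    """Return pairs of spectrum names that collide in at least one band."""
    params = make_hash_params(n_bands * rows_per_band, seed)
    buckets = defaultdict(set)
    for name, peaks in spectra.items():
        sig = minhash_signature(discretize(peaks), params)
        for band in range(n_bands):
            start = band * rows_per_band
            chunk = tuple(sig[start:start + rows_per_band])
            buckets[(band, chunk)].add(name)
    pairs = set()
    for names in buckets.values():
        pairs.update(combinations(sorted(names), 2))
    return pairs
```

Calling `candidate_duplicates` with a dictionary that maps spectrum names to lists of `(1H, 13C)` peak tuples returns the set of name pairs that share at least one bucket. Spectra whose peaks fall into the same grid cells always collide, while more bands or fewer rows per band make the scheme more tolerant of partially differing peak sets, at the cost of more false candidates.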