Published in

Association for Computing Machinery (ACM), SIGMOD Record, 33(2), pp. 33-38, 2004

DOI: 10.1145/1024694.1024700

Managing and Analyzing Carbohydrate Data.

This paper was not found in any repository, but could be made available legally by the author.

Full text: Unavailable

Preprint: archiving allowed
Postprint: archiving allowed
Published version: archiving forbidden
Data provided by SHERPA/RoMEO

Abstract

Carbohydrates are among the most vital molecules in multicellular organisms because they are structurally important in the construction of such organisms. In fact, all cells in nature carry carbohydrate sugar chains, or glycans, that help modulate various cell-cell events during the development of the organism. Unfortunately, informatics research on glycans has been slow in comparison to that on DNA and proteins, largely due to difficulties in the biological analysis of glycan structures. Our work applies data engineering approaches to gain some understanding of the glycan data that is currently publicly available. In particular, by modeling glycans as labeled unordered trees, we have implemented a tree-matching algorithm for measuring tree similarity. Our algorithm builds on proven, efficient methodologies from computer science that have been extended and developed for glycan data. Moreover, glycans are recognized by various agents in multicellular organisms, so to capture the patterns that might be recognized we needed to capture dependencies that seem to range beyond the directly connected nodes in a tree. Therefore, by defining glycans as labeled ordered trees, we were able to develop a new probabilistic tree model such that sibling patterns across a tree could be mined. We provide promising results from our methodologies that could prove useful for the future of glycome informatics.
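
To make the idea of tree similarity on labeled unordered trees concrete, the following is a minimal sketch and not the authors' algorithm. It assumes a glycan can be reduced to a rooted tree whose nodes carry only a monosaccharide label (linkage information on edges is ignored), and it scores two trees by the size of their largest common connected subtree, with children matched in any order. The names `Node`, `rooted_match`, and `similarity` are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass
from functools import lru_cache


@dataclass(frozen=True)
class Node:
    """A tree node with a label and an unordered collection of children."""
    label: str
    children: tuple = ()


def best_child_matching(xs, ys):
    """Best total score over one-to-one pairings of xs with ys (order ignored)."""
    @lru_cache(maxsize=None)
    def go(i, used):
        # i: index of the next child in xs; used: bitmask of ys already paired
        if i == len(xs):
            return 0
        best = go(i + 1, used)  # option: leave xs[i] unmatched
        for j, y in enumerate(ys):
            if not used & (1 << j):
                score = rooted_match(xs[i], y)
                if score:
                    best = max(best, score + go(i + 1, used | (1 << j)))
        return best

    return go(0, 0)


def rooted_match(a, b):
    """Size of the largest common subtree that maps node a onto node b."""
    if a.label != b.label:
        return 0
    return 1 + best_child_matching(a.children, b.children)


def nodes(t):
    """Yield every node of the tree rooted at t."""
    yield t
    for child in t.children:
        yield from nodes(child)


def similarity(t1, t2):
    """Number of nodes in the largest common subtree, anchored anywhere."""
    return max(rooted_match(a, b) for a in nodes(t1) for b in nodes(t2))


# Two toy trees that differ in child order and in one extra branch.
t1 = Node("Man", (Node("GlcNAc"), Node("Man", (Node("Gal"),))))
t2 = Node("Man", (Node("Man", (Node("Gal"),)), Node("GlcNAc"), Node("Fuc")))
print(similarity(t1, t2))  # 4
```

The child matching here is a bitmask search, which is exponential in the number of children per node. That is acceptable for this sketch only on the assumption that each node has few children; a production implementation would more likely use a weighted bipartite matching at each pair of nodes.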