Improving Lookup Time Complexity of Compressed Suffix Arrays using Multi-ary Wavelet Tree.

Journal article published in 2009 by Zheng Wu, Joong-Chae Na, Minhwan Kim, Dong-Kyue Kim
This paper is available in a repository.

Abstract

Given a text T of size n, we want to search for information of interest. To support fast searching, an index must be constructed by preprocessing the text. The suffix array is one such index data structure. The compressed suffix array (CSA) is a compressed index that exploits the regularity of the suffix array, and it can be compressed to the kth order empirical entropy. In this paper, we improve the lookup time complexity of the compressed suffix array by using a multi-ary wavelet tree, at the cost of more space. In our implementation, the lookup time complexity of the compressed suffix array is O(), and the space of the compressed suffix array is bits, where a is the size of the alphabet, is the kth order empirical entropy, r is the branching factor of the multi-ary wavelet tree such that and and 0 < < 1/2 is a constant.
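
For readers unfamiliar with the underlying index, the following is a minimal sketch of a plain (uncompressed) suffix array and a pattern search over it. It is not the paper's compressed structure or its multi-ary wavelet tree; it only illustrates the kind of preprocessing and lookup that the CSA compresses. The function names `build_suffix_array` and `find_occurrences` are hypothetical and chosen for illustration, and the construction is deliberately naive.

```python
def build_suffix_array(text):
    # Naive construction (an assumption for illustration): sort the start
    # positions of all suffixes by the suffix each one begins.
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text, sa, pattern):
    # Binary search over the sorted suffixes, comparing only the first
    # len(pattern) characters of each suffix against the pattern.
    m = len(pattern)

    # Leftmost suffix whose m-character prefix is >= pattern.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # Leftmost suffix whose m-character prefix is > pattern.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid

    # Every suffix in sa[start:lo] begins with the pattern.
    return sorted(sa[start:lo])


text = "banana"
sa = build_suffix_array(text)
print(sa)
print(find_occurrences(text, sa, "ana"))
```

Each search takes a logarithmic number of comparisons in n, but the array itself stores one integer per text position, which is the space cost that compressed suffix arrays aim to reduce.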