2010 Proceedings of the Seventh Workshop on Analytic Algorithmics and Combinatorics (ANALCO), p. 34-46
DOI: 10.1137/1.9781611973006.5
Full text: Download
We consider the problem of finding optimal description for general unlabeled graphs. Given a probability distribution on labeled graphs, we introduced in [4] a structural entropy as a lower bound for the lossless compression of such graphs. Specifically, we proved that the structural entropy for the Erd˝ os–Rényi random graph, in which edges are added with probability p, i n h(p) − n log n + O(n), where n is the number of vertices and h(p) = −p log p − (1 − p) log(1−p) is the entropy rate of a conventional memoryless binary source. In this paper, we prove the asymptotic equipartition property for such graphs. Then, we propose a faster compression algorithm that asymptotically achieves the structural entropy up to the first two leading terms with high probability. Our algorithm runs in O(n + e) time on average where e is the number of edges. To prove its asymptotic optimality, we introduce binary trees that one can classify as in-between tries and digital search trees. We use analytic techniques such as generating functions, Mellin transform, and poissonization to establish our findings. Our experimental results confirm theoretical results and show the usefulness of our algorithm for real-world graphs such as the Internet, biological networks, and social networks.