Published in

Society for Industrial and Applied Mathematics, SIAM Review, 46(4), pp. 647-666

DOI: 10.1137/s0036144502415960

A Measure of Similarity between Graph Vertices: Applications to Synonym Extraction and Web Searching

This paper is available in a repository.

Preprint: archiving allowed
Postprint: archiving allowed
Published version: archiving allowed
Data provided by SHERPA/RoMEO

Abstract

We introduce a concept of similarity between vertices of directed graphs. Let $G_{A}$ and $G_{B}$ be two directed graphs with, respectively, $n_{A}$ and $n_{B}$ vertices. We define an $n_{B}\times n_{A}$ similarity matrix $S$ whose real entry $s_{ij}$ expresses how similar vertex $j$ (in $G_{A}$) is to vertex $i$ (in $G_{B}$): we say that $s_{ij}$ is their similarity score. The similarity matrix can be obtained as the limit of the normalized even iterates of $S_{k+1}=BS_{k}A^{T}+B^{T}S_{k}A$, where $A$ and $B$ are adjacency matrices of the graphs and $S_{0}$ is a matrix whose entries are all equal to 1. In the special case where $G_{A}=G_{B}=G$, the matrix $S$ is square and the score $s_{ij}$ is the similarity score between the vertices $i$ and $j$ of $G$. We point out that Kleinberg's "hub and authority" method to identify web-pages relevant to a given query can be viewed as a special case of our definition in the case where one of the graphs has two vertices and a unique directed edge between them. In analogy to Kleinberg, we show that our similarity scores are given by the components of a dominant eigenvector of a nonnegative matrix. Potential applications of our similarity concept are numerous. We illustrate an application for the automatic extraction of synonyms in a monolingual dictionary.
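
The iteration described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation. The function name `similarity_matrix`, the parameters `max_even_steps` and `tol`, and the stopping rule are hypothetical, and the Frobenius norm is assumed as the normalization, since the abstract only says "normalized". The example at the end uses the special case mentioned above, taking $G_{B}$ to be the two-vertex graph with a single edge and $G_{A}$ to be a small made-up graph.

```python
import numpy as np


def similarity_matrix(A, B, max_even_steps=1000, tol=1e-12):
    """Approximate the limit of the normalized even iterates of
    S_{k+1} = B S_k A^T + B^T S_k A, starting from an all-ones S_0.

    A and B are adjacency matrices of G_A and G_B.
    The result has shape (n_B, n_A).
    Assumption: iterates are normalized with the Frobenius norm.
    """
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    S = np.ones((B.shape[0], A.shape[0]))
    S /= np.linalg.norm(S)  # Frobenius norm for a 2-D array

    for _ in range(max_even_steps):
        previous = S
        # Two updates per pass, so only even iterates are compared.
        for _ in range(2):
            S = B @ S @ A.T + B.T @ S @ A
            norm = np.linalg.norm(S)
            if norm == 0.0:
                return S  # e.g. a graph with no edges
            S = S / norm
        if np.linalg.norm(S - previous) < tol:
            break
    return S


# G_B: two vertices with one edge, vertex 0 ("hub") -> vertex 1 ("authority").
B = np.array([[0, 1],
              [0, 0]])

# G_A: a made-up graph with edges 0 -> 2 and 1 -> 2.
A = np.array([[0, 0, 1],
              [0, 0, 1],
              [0, 0, 0]])

S = similarity_matrix(A, B)
print(S)
```

With this choice of $G_{B}$, row 0 of `S` holds the hub scores of the vertices of $G_{A}$ and row 1 holds their authority scores. In this small example the odd and even iterates settle on different values, which shows why the limit is taken along the even iterates only.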