Transparent Gif

Department of Computer Science

University of California, Santa Barbara

Abstract

Scalable Nearest Neighbors with Guarantees in Large and Composite Networks

by: Petko Bogdanov and Ambuj K. Singh

Abstract:

We address the problem of k Nearest Neighbor (kNN) search in networks, according to a random walk proximity measure called Effective Importance. Our approach retrieves the exact top neighbors at query time without relying on off-line indexing or summaries of the entire network. This makes it suitable for very large dynamic networks, as well as for composite network overlays mixed at query time. We provide scalability and flexibility without compromising the quality of results due to theoretical bound guarantees that we develop and incorporate in our search procedure. We incrementally construct a subgraph of the underlying network, sufficient to obtain the exact top k neighbors. We guide the construction of the relevant subgraph in order to achieve fast refinement of the lower and upper proximity bounds, which in turn enables effective pruning of infeasible candidates. We apply our kNN search algorithm on social, information and biological networks and demonstrate the effectiveness and scalability of our approach. For networks in the order of a million nodes, our method retrieves the exact top 20 using less than 0.2% of the network edges in a fraction of a second on a conventional desktop machine without prior indexing. When employed for nearest neighbors search in composite network overlays, it scales linearly with the number of networks mixed in the overlay.

Keywords:

composite networks, graph nearest neighbor, search, kNN

Date:

October 2010

Document: 2010-17

XHTML Validation | CSS Validation
Updated 14-Nov-2005
Questions should be directed to: webmaster@cs.ucsb.edu