Bonding vertex sets over distributed graph: A betweenness aware approach


Given two sets of vertices in a graph, it is often of great interest to find out how these vertices are connected, and in particular to identify the vertices that are most prominent in the topological structure. In this work, we formally define the Vertex Set Bonding (VSB) query, which returns a minimum set of vertices with the maximum importance, with respect to total betweenness and shortest-path reachability, in connecting two sets of input vertices. Such a query is representative and could be widely applied in real-world scenarios such as logistics planning and social community bonding. The challenges are that many such applications are built on graphs too large to fit on a single server, and that VSB query evaluation turns out to be NP-hard. To cope with the scalability issue and return near-optimal results in almost real time, we propose a generic solution framework for a shared-nothing distributed environment. By developing two novel techniques, guided graph exploration and betweenness ranking on exploration, we are able to efficiently evaluate queries and obtain error-bounded results with bounded space cost. We demonstrate the effectiveness of our solution with extensive experiments over both real and synthetic large graphs on Google's Cloud platform. Compared to the exploration-only baseline method, our method achieves a speedup of several times. © 2015 VLDB Endowment 2150-8097/15/08.

Publication Title

Proceedings of the VLDB Endowment