Distance-Aware Selective Online Query Processing Over Large Distributed Graphs


Performing online selective queries against graphs is a challenging problem due to the unbounded nature of graph queries which leads to poor computation locality. It becomes even difficult when a graph is too large to be fit in the memory. Although there have been emerging efforts on managing large graphs in a distributed and parallel setting, e.g., Pregel, HaLoop and etc, these computing frameworks are designed from the perspective of scalability instead of the query efficiency. In this work, we present our solution methodology for online selective graph queries based on the shortest path distance semantic, which finds various applications in practice. The essential intuition is to build a distance-aware index for online distance-based query processing and to eliminate redundant graph traversal as much as possible. We discuss how the solution can be applied to two types of research problems, distance join and vertex set bonding, which are distance-based graph pattern discovery and finding the structure-wise bonding of vertices, respectively.

Publication Title

Data Science and Engineering