Author:
(1) Subhajit Sahu, IIIT Hyderabad, Hyderabad, Telangana, India ([email protected]).
Table of Links
Abstract and 1 Introduction
2 Related Work
3 Preliminaries
4 Approach
5.1 Experimental Setup
5.2 Performance of Dynamic Frontier PageRank
5.3 Strong Scaling of Dynamic Frontier PageRan
6 Conclusion, Acknowledgments, and References
A number of approaches have been proposed for performing incremental computation (updating PageRank values in a dynamic / evolving graph) of approximate PageRank. Chien et al. [6] identify a tiny region of the graph near the updated vertices and model the remainder of the graph as a single vertex in a new, much smaller graph. PageRanks are computed for the small graph and then transferred to the original graph. Chen et al. [5] propose a number of methods to estimate the PageRank score of a particular web page using only a small subgraph of the entire web, by expanding backwards from the target node following reverse hyperlinks. Bahmani et al. [2] analyze the efficiency of Monte Carlo methods for incremental computation of PageRank. Zhan et al. [24] propose a Monte Carlo based algorithm for PageRank tracking on dynamic networks, by maintaining 𝑅 random walks starting from each node. Pashikanti et al. [20] also follow a similar approach for updating PageRank scores on vertex and edge insertion/deletion.
A few approaches have been proposed for updating exact PageRank scores on dynamic graphs. Zhang [25] presents a simple incremental Pagerank computation system for dynamic graphs on hybrid CPU and GPU platforms that incorporates the Update-GatherApply-Scatter (UGAS) computation model. A common approach used for Dynamic PageRank algorithm, given a small change to the input graph, is to find the affected region in the preprocessing step with Breadth-First Search (BFS) or Depth-First Search (DFS) traversal from the vertices connecting the edges that were inserted or deleted, and computing PageRanks only for that region [7, 12, 13, 22]. This approach was originally proposed by Desikan et al. [7]. Kim and Choi [13] use this approach with an asynchronous implementation of PageRank. Giri et al. [12] use this approach with collaborative executions on muti-core CPUs and massively parallel GPUs. Sahu et al. [22] use this approach on a Strongly Connected Component (SCC) based decomposition of the graph to limit the computation to SCCs that are reachable from updated vertices, on multi-core CPUs and GPUs (separately). Ohsaka et al. [18] propose an approach for locally updating PageRank using the Gauss-Southwell method, where the vertex with the greatest residual is updated first — however, their algorithm is inherently sequential.
Further, Bahmani et al. [3] propose an algorithm to selectively crawl a small portion of the web to provide an estimate of true PageRank of the graph at that moment, while Berberich et al. [4] present a method to compute normalized PageRank scores that are robust to non-local changes in the graph. Their approaches are orthogonal to our Dynamic Frontier approach which focuses on the computation of the PageRank vector itself, not on the process of crawling the web or maintaining normalized scores.