Author:
(1) Subhajit Sahu, IIIT Hyderabad, Hyderabad, Telangana, India ([email protected]).
Table of Links
Abstract and 1 Introduction
2 Related Work
3 Preliminaries
4 Approach
5.1 Experimental Setup
5.2 Performance of Dynamic Frontier PageRank
5.3 Strong Scaling of Dynamic Frontier PageRan
6 Conclusion, Acknowledgments, and References
3 PRELIMINARIES
3.1 PageRank Algorithm
The PageRank, 𝑅[𝑣], of a vertex 𝑣 ∈ 𝑉 in the graph 𝐺(𝑉 , 𝐸), represents its importance and is based on the number of incoming links and their significance. Equation 1 shows how to calculate the PageRank of a vertex 𝑣 in the graph 𝐺, with 𝑉 as the set of vertices (𝑛 = |𝑉 |), 𝐸 as the set of edges (𝑚 = |𝐸|), 𝐺.𝑖𝑛(𝑣) as the incoming neighbors of vertex 𝑣, 𝐺.𝑜𝑢𝑡(𝑣) as the outgoing neighbors of vertex 𝑣, and 𝛼 as the damping factor. Each vertex starts with an initial PageRank of 1/𝑛. The power-iteration method updates these values iteratively until the change is rank values is within a specified tolerance 𝜏 value (indicating that convergence has been achieved).
Presence of dead ends is an issue that arises when computing the PageRank of a graph. A dead end is a vertex with no out-link, which forces the random surfer to jump to a random page on the web. Or equivalently, a dead end contributes its rank among all the vertices in the graph (including itself). This introduces a global teleport rank contribution that must be computed every iteration, and can be considered an overhead. We resolve this issue by adding self-loops to all the vertices in the graph [1, 15].
3.2 Dynamic Graphs
Interleaving of graph update and computation: Changes to the graph arrive in a batched manner, with updating of the graph and execution of the desired algorithm being interleaved (i.e., there is only one writer upon the graph at a given point of time). In case it is desirable to update the graph while an algorithm is still running, a snapshot of the graph needs to be obtained, upon which the desired algorithm may be executed. See for example Aspen graph processing framework which significantly minimizes the cost of obtaining a read-only snapshot of the graph [8].
3.3 Existing approaches for updating PageRank on Dynamic Graphs
3.3.1 Naive-dynamic approach. This is a straightforward approach of updating ranks of vertices in dynamic networks. Here, one initializes the ranks of vertices with ranks obtained from previous snapshot of the graph and runs the PageRank algorithm on all vertices. Rankings obtained through this method will be at least as accurate as those obtained through the static algorithm.