Author:
(1) Subhajit Sahu, IIIT Hyderabad, Hyderabad, Telangana, India ([email protected]).
Table of Links
Abstract and 1 Introduction
2 Related Work
3 Preliminaries
4 Approach
5.1 Experimental Setup
5.2 Performance of Dynamic Frontier PageRank
5.3 Strong Scaling of Dynamic Frontier PageRan
6 Conclusion, Acknowledgments, and References
ABSTRACT
KEYWORDS
Parallel PageRank algorithm, Dynamic Frontier approach
1. INTRODUCTION
PageRank [19] is an algorithm that measures the importance of nodes in a network by assigning numerical scores based on the structure of links. It finds applications in web page ranking, identifying misinformation, predicting traffic flow, and protein target identification. The increasing availability of vast amounts of data represented as graphs has led to a significant interest in parallel algorithms for computing PageRank [10–12, 23].
However, most real-world graph evolve with time. Here, frequent edge insertions and deletions make recomputing PageRank from scratch impractical, particularly for small, rapid changes. Existing strategies optimize by iterating from the prior snapshot’s ranks, reducing the number of iterations needed for convergence. For further improvements, it is essential to recompute only the ranks of vertices likely to change. A prevalent approach involves identifying reachable vertices from the updated regions of the graph, and limiting processing to such vertices. However, if updates are randomly distributed, they often fall within dense graph regions, necessitating processing of a substantial portion of the graph.
To reduce computational effort, one can incrementally expand the set of affected vertices starting from the updated graph region, rather than processing all reachable vertices from the first iteration. Additionally, it is possible to skip processing a vertex’s neighbors if the change in its rank is small and is expected to have minimal impact on the ranks of its neighboring vertices. This technical report introduces such an approach.