Algorithm for Friendship, Affair, and Love

IIT Patna: A Journey Began

Google page ranking algorithm

Crawling a website

Performance measure of a website

Google page ranking algorithm

Ravikant Mundotiya, and Shovan Bhaumik


Page rank is a numeric value which represents the importance of page present on web. When one page links to another page, it is effectively casting a vote for the other page. More votes implies more importance. Importance of the page that is casting the vote determines the importance of the vote. So for a web page to rank high, not only inbound links are important, but also the quality of links is also important. So PageRank algorithm works by counting number of links and quality of websites from where links are coming and assigning a number through some mathematical manipulation.

The original PageRank algorithm was described by Lawrence Page and Sergey Brin in a publication in 1997. Initially Google used that to return search results. Later Google heavily modified the PageRanking algorithm. Being trade secret they never disclose them. Here we are going to discuss the initial PageRanking algorithm proposed by Page and Brin. PageRank results are basically from a mathematical algorithm based on Web-graph, created by World Wide Web pages as nodes and hyper-links as edges.


Let us assume, there are total $N$ number of pages, among them many pages are interlinked to each other. Rank of any page, we can assign by the following mathematical equation. \begin{equation} PR(A) =\frac{1-d}{N}+d \frac {PR(T_{1})}{C(T_{1})}+...+ d\frac {PR(T_{n})}{C(T_{n})}, \end{equation} Where $PR(A)$ is the PageRank of page $A$, $PR$($T_i$) is the PageRank of pages $T_i$ which link to page $A$, $C$($T_i$) is the number of outbound links on page $T_i$, and $d$ is a damping factor which is typical taken around 0.85. The above equation could be written as \begin{equation} PR(P_i) = \frac {1-d}{N}+d \sum_{p_{j}\in_M (p_{i})} \frac{PR(p_{j})}{L(p_{j})} \end{equation}

Where $p_1$, $p_2$ and $p_N$ are the pages under consideration, $M$($p_i$) is the set of pages that link to $p_i$, $L$($p_j$) is the number of outbound links on page $p_j$.

< Prev.Page   1   2   3   Next page>