The exercise asks one to rank websites using a modification of PageRank. It seems that this ranking is also given by the eigenvector of the transition matrix P associated with eigenvalue 1. However, when I try to apply this idea I get a different ranking. I would appreciate any clues as to what goes wrong in my reasoning or code.
Building (heavily) on the provided solution, here is my code.
"""
Return list of pages, ordered by rank
"""
import re
from operator import itemgetter
import numpy as np

# infile, alphabet and n are assumed to be defined as in the provided solution
Q = np.zeros((n, n), dtype=int)
f = open(infile, 'r')
edges = f.readlines()
f.close()
for edge in edges:
    from_node, to_node = re.findall(r'\w', edge)
    i, j = alphabet.index(from_node), alphabet.index(to_node)
    Q[i, j] = 1
# Create the corresponding Markov matrix P
P = np.empty((n, n))
for i in range(n):
    P[i, :] = Q[i, :] / Q[i, :].sum()
v, w = np.linalg.eig(P)
ranked_pages = {alphabet[i] : w[0, i] for i in range(n)}
# Print solution, sorted from highest to lowest rank
print('Rankings\n ***')
for name, rank in sorted(ranked_pages.items(), key=itemgetter(1), reverse=True):
    print(f'{name}: {rank:.4}')
Thanks @Felix, interesting question. Your code is nearly correct. We’ll be in touch soon with a working version. (The topic might be added as a new exercise to the lecture.)
Let me briefly explain. The difference in the ranking arises from two issues in your code:
1. You use numpy.linalg.eig(), which computes the eigenvalues and right eigenvectors of a matrix. The stationary distribution of a Markov chain is defined by \phi^* = \phi^* P, so we need the left eigenvector of the transition matrix P associated with eigenvalue 1. The left eigenvectors of P are the right eigenvectors of its transpose, so you can apply numpy.linalg.eig() to P.T instead.
2. Even if you obtain the left eigenvector of the transition matrix P associated with eigenvalue 1, denoted by w, it might not be equal to the stationary distribution \phi^* we want. An eigenvector is only determined up to a scalar multiple, whereas \phi^* is a distribution: its values must be nonnegative and sum to 1.
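To make issue 1 concrete, here is a minimal sketch on a small 3-state chain. The matrix and the variable names are made up for illustration and are not taken from the exercise. Because every row of a transition matrix sums to 1, the right eigenvector for eigenvalue 1 is a constant vector and carries no ranking information, while the left eigenvector is proportional to the stationary distribution. Note also that numpy.linalg.eig() returns the eigenvectors as the columns of its second output, so the k-th eigenvector is `vecs[:, k]`, not a row.

```python
import numpy as np

# Hypothetical 3-state transition matrix (each row sums to 1), for illustration only
P = np.array([[0.50, 0.50, 0.00],
              [0.25, 0.50, 0.25],
              [0.00, 0.50, 0.50]])

# Right eigenvectors solve P @ x = lam * x; they are the COLUMNS of vecs_r
vals_r, vecs_r = np.linalg.eig(P)
k_r = np.argmin(np.abs(vals_r - 1))      # position of the eigenvalue closest to 1
right = np.real(vecs_r[:, k_r])

# Left eigenvectors of P are the right eigenvectors of P.T
vals_l, vecs_l = np.linalg.eig(P.T)
k_l = np.argmin(np.abs(vals_l - 1))
left = np.real(vecs_l[:, k_l])

right_scaled = right / right[0]          # constant vector: every state looks identical
left_scaled = left / left.sum()          # proportional to the stationary distribution

print("right eigenvector (rescaled):", right_scaled)
print("left eigenvector (rescaled): ", left_scaled)
```

The eigenvalue 1 is located with `argmin` rather than assumed to be first, since the order in which numpy.linalg.eig() returns eigenvalues is not guaranteed.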
For issue 2, take the absolute value of every element of the left eigenvector obtained above, then normalize the resulting vector by dividing it by the sum of its (now nonnegative) elements; please see my answer. (Hint: use the functions abs() and sum().)
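The normalization step can be sketched as a small helper. The function name `stationary_from_eig` and the 2-state example matrix are hypothetical, and the sketch assumes the chain has a unique stationary distribution; it is one possible way to do it, not the code from the lecture.

```python
import numpy as np

def stationary_from_eig(P):
    """Stationary distribution from the left eigenvector for eigenvalue 1.

    Hypothetical helper; assumes P is a valid transition matrix whose
    stationary distribution is unique.
    """
    P = np.asarray(P, dtype=float)
    eigvals, eigvecs = np.linalg.eig(P.T)    # left eigenvectors of P
    k = np.argmin(np.abs(eigvals - 1.0))     # eigenvalue closest to 1
    w = np.real(eigvecs[:, k])               # discard zero imaginary parts
    w = np.abs(w)                            # eig() may return the vector with all signs flipped
    return w / w.sum()                       # rescale so the entries sum to 1

# Made-up 2-state example, for illustration only
P = np.array([[0.9, 0.1],
              [0.5, 0.5]])
phi = stationary_from_eig(P)
print(phi)
print(np.allclose(phi @ P, phi))             # checks that phi = phi P
```

When the entries of the eigenvector all share one sign, dividing by `w.sum()` alone would already fix the sign; taking abs() first simply makes the intent explicit.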
With these two changes, your ranking and the associated numbers will be exactly the same as the answers in that QuantEcon lecture (please see the attachment in my next reply below).