Finite Markov Chains: Exercise 2

Felix · April 12, 2020, 12:38pm

Hi,

the exercise asks one to rank websites using a modification of PageRank. It seems that this ranking is also equal to the eigenvector of the transition matrix P that is associated with eigenvalue 1. However, when I try to apply this idea I get a different ranking. I would appreciate any clues what goes wrong in my reasoning/code.

(heavily) Building on the provided solution, here is my code.

“”"
Return list of pages, ordered by rank
“”"
import re
from operator import itemgetter
import numpy as np

infile = ‘web_graph_data.txt’
alphabet = ‘abcdefghijklmnopqrstuvwxyz’

n = 14 # Total number of web pages (nodes)

Create a matrix Q indicating existence of links

* Q[i, j] = 1 if there is a link from i to j

* Q[i, j] = 0 otherwise

Q = np.zeros((n, n), dtype=int)
f = open(infile, ‘r’)
edges = f.readlines()
f.close()
for edge in edges:
from_node, to_node = re.findall(’\w’, edge)
i, j = alphabet.index(from_node), alphabet.index(to_node)
Q[i, j] = 1

Create the corresponding Markov matrix P

P = np.empty((n, n))
for i in range(n):
P[i, :] = Q[i, :] / Q[i, :].sum()
v, w = np.linalg.eig( P )
ranked_pages = {alphabet[i] : w[0, i] for i in range(n)}

Print solution, sorted from highest to lowest rank

print(‘Rankings\n ***’)
for name, rank in sorted(ranked_pages.items(), key=itemgetter(1), reverse=1):
print(f’{name}: {rank:.4}’)

john.stachurski · April 13, 2020, 3:43am

Thanks @Felix, interesting question. Your code is nearly correct. We’ll be in touch soon with a working version. (The topic might be added as a new exercise to the lecture. )

shu · April 13, 2020, 4:28am

Thanks, @Felix and @john.stachurski. Hi @Felix, please find my answer attached here (https://colab.research.google.com/drive/1E5lxOe8xqrY0s1HiJ5LSi-i7ohvvuM1C).

Let me briefly explain. The rank problem arises from two issues in your code:

Use the numpy.linalg.eig(), which computes the eigenvalues and right eigenvectors of a matrix. In order to get the stationary distribution for Markov chains, by definition, \phi^*=\phi^* P, we need to compute the left eigenvector of the transition matrix P with respect to eigenvalue 1.
Even if you obtain the left eigenvector of the transition matrix P with respect to eigenvalue 1, denoted by w, w might not be equal to the stationary distribution \phi^* we want. Think about the definition of distribution, values in the stationary distribution \phi must be nonnegative and sum up to 1.

To solve the two issues above, you can:

For issue 1, use scipy.linalg.eig() (https://docs.scipy.org/doc/scipy/reference/generated/scipy.linalg.eig.html) instead of numpy.linalg.eig(), to compute the left eigenvector of the transition matrix P, please see my answer.
For issue 2, take the absolute value of all elements in the left eigenvector obtained above and then normalize the new vector by dividing by sum of all its nonnegative elements, please see my answer. (Hints: use functions abs() and sum().)

Now your ranking and the associated numbers will be exactly the same as the answers in that QuantEcon lecture (please find attached in my next reply below).

Thanks again for your interesting question.

shu · April 13, 2020, 4:45am

Felix · April 13, 2020, 7:49am

Thanks for quick and super helpful reply!

Topic		Replies	Views
Estimating the Transition Probability Matrix and a new lecture on Hidden Markov Model	3	2383	April 19, 2017
Outdated code in Julia Finite Markov Chains example	1	1151	September 23, 2016
Question regarding an earlier email	7	1347	August 12, 2016
Finite Markov Chain Lectures	2	1172	December 18, 2018
Estimating Markov chain probabilities in Julia using proportions Lectures	3	607	December 21, 2022