1
00:00:04,223 --> 00:00:10,652
So let's look at another example for
Markov Chains here. This is sometimes
2
00:00:10,652 --> 00:00:14,844
known as the "Mouse in a Maze" model and
we will see why in just a second. I just
3
00:00:14,844 --> 00:00:18,861
want to recall from the previous slide
that I was showing you how to find a
4
00:00:18,861 --> 00:00:23,719
steady state vector for a stochastic
matrix by solving the corresponding system
5
00:00:23,719 --> 00:00:28,734
of linear equations. So what is this
Mouse in a Maze model all about then?
6
00:00:28,734 --> 00:00:35,397
Well, I have depicted on the left, crudely,
sorry about that, a mouse placed in a
7
00:00:35,397 --> 00:00:39,198
maze. We have made the maze very simple
and in the maze there are three rooms
8
00:00:39,198 --> 00:00:43,403
labeled one, two, and three, and
the mouse is kind of free to roam about
9
00:00:43,403 --> 00:00:47,347
the maze, coming to and fro between the
various rooms. So you can think of
10
00:00:47,347 --> 00:00:54,121
the mouse's choice to transition from one
room to the next as a probabilistic event,
11
00:00:54,741 --> 00:00:59,937
and in this fashion, then, we are going to
model, in a way, a really primitive, almost
12
00:00:59,937 --> 00:01:05,865
Artificial Intelligence (AI) system here,
based on those probabilities.
13
00:01:06,943 --> 00:01:10,993
So, let's go ahead and draw a directed
graph that explicitly shows these
14
00:01:10,993 --> 00:01:15,452
transitional probabilities and then from
there we can construct our stochastic
15
00:01:15,470 --> 00:01:21,582
matrix. Right, so I have drawn the
directed graph and the corresponding
16
00:01:21,582 --> 00:01:27,354
stochastic matrix for this example. You'll
notice a few things, for one I haven't
17
00:01:27,354 --> 00:01:34,810
allowed the mouse to remain stationary
from step to step, it always moves and you
18
00:01:34,810 --> 00:01:38,886
can see that in our directed graph, for
instance if it is in room one currently,
19
00:01:38,886 --> 00:01:43,112
with probability one half it moves to
room three, and similarly with probability
20
00:01:43,112 --> 00:01:48,570
one half it moves to room two. So, zero
probability that it remains still, and this is
21
00:01:48,570 --> 00:01:52,663
true for all the rooms; you can see
the remaining transition probabilities.
22
00:01:52,663 --> 00:01:57,655
And then, just eyeballing the
stochastic matrix here to make sure
23
00:01:57,655 --> 00:01:59,752
we can sort of make sense of it.
24
00:01:59,752 --> 00:02:04,010
The way you want to read this is: I have the
columns representing my starting place
25
00:02:04,010 --> 00:02:06,821
and the rows representing my end place.
Okay?
26
00:02:06,821 --> 00:02:10,763
So, for instance, if I am in room one, the
probability of moving to, or rather
27
00:02:10,763 --> 00:02:15,425
staying in, room one is zero,
as we just mentioned a moment ago. Moving
28
00:02:15,425 --> 00:02:19,392
to maybe a different component, if I am
currently in room three, the probability
29
00:02:19,392 --> 00:02:23,236
of moving, or transitioning, to room two
is two-thirds. Let's check that
30
00:02:23,236 --> 00:02:27,474
from the directed graph: from three to
two, yes, true enough, the probability is two
31
00:02:27,474 --> 00:02:29,558
thirds. So, there is our stochastic matrix.
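As a sanity check on what we just eyeballed, the matrix can be written out in code. Here is a minimal sketch in Python; note that the lecture only reads out the room-one column and the 3-to-2 entry, so the remaining entries are inferred from the directed graph:

```python
from fractions import Fraction as F

# Columns are the starting room, rows are the ending room.
# Column 1 and the 3 -> 2 entry come straight from the lecture;
# the remaining entries are inferred from the directed graph.
P = [
    [F(0),    F(1, 3), F(1, 3)],  # ends in room one
    [F(1, 2), F(0),    F(2, 3)],  # ends in room two
    [F(1, 2), F(2, 3), F(0)],     # ends in room three
]

# Stochastic: every column sums to 1.  Zero diagonal: the
# mouse is never allowed to stay put.
for j in range(3):
    assert sum(P[i][j] for i in range(3)) == 1
    assert P[j][j] == 0
```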
32
00:02:29,558 --> 00:02:33,748
So now the question we want to answer is
this: basically, what happens as time
33
00:02:33,748 --> 00:02:40,998
marches on? Can we, in a sense, get a
cognitive blueprint of the long-term
34
00:02:40,998 --> 00:02:45,618
thinking, or the long-term behavior, of the
Mouse in the Maze? And in order to answer
35
00:02:45,618 --> 00:02:50,166
that question, we are going to go ahead and
solve for the steady state vector with
36
00:02:50,169 --> 00:02:51,775
respect to this stochastic matrix.
37
00:02:53,055 --> 00:02:57,355
Okay, so before we go ahead and solve for
the steady state vector for this
38
00:02:57,355 --> 00:03:01,110
particular problem, let's go ahead and, just
to get the mechanics down, plug in a
39
00:03:01,110 --> 00:03:05,342
vector or two and run through some calculations
for this Markov chain. So, you will recall
40
00:03:05,342 --> 00:03:10,865
that our stochastic matrix P is given as
follows. Let's suppose now that our
41
00:03:10,865 --> 00:03:15,056
initial state vector is 1, 0, 0. So, in
other words the mouse starts in room one.
42
00:03:15,056 --> 00:03:19,134
Now, if I want to figure out my
probability vector, my transitional state
43
00:03:19,134 --> 00:03:26,033
vector, after one time iteration, I just
multiply on the left by my matrix P and
44
00:03:26,033 --> 00:03:29,486
then I get, not surprisingly, the
following probability vector.
45
00:03:29,486 --> 00:03:34,799
So, after one step we have probability of
one half that the mouse is in room two and
46
00:03:34,799 --> 00:03:39,149
probability one half that the mouse is in
room three. So, if I want to turn the crank
47
00:03:39,149 --> 00:03:43,552
with the Markov process again and see what
my probabilities look like after two
48
00:03:43,552 --> 00:03:49,092
iterations, or two time steps, I multiply
again by P, and now I get this new state
49
00:03:49,092 --> 00:03:52,573
vector: one third, one third, one third, the
uniform probability vector.
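Those two hand computations can be reproduced with a short matrix-vector multiply. A sketch in Python (again, the room-two column of P is inferred from the directed graph rather than read out in the lecture):

```python
from fractions import Fraction as F

# Stochastic matrix: columns = starting room, rows = ending room.
P = [[F(0),    F(1, 3), F(1, 3)],
     [F(1, 2), F(0),    F(2, 3)],
     [F(1, 2), F(2, 3), F(0)]]

def step(x):
    """One Markov step: multiply the state vector x on the left by P."""
    return [sum(P[i][j] * x[j] for j in range(3)) for i in range(3)]

x0 = [F(1), F(0), F(0)]   # mouse starts in room one
x1 = step(x0)             # (0, 1/2, 1/2) after one step
x2 = step(x1)             # (1/3, 1/3, 1/3), the uniform vector
```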
50
00:03:52,573 --> 00:03:56,667
So, the way I would interpret this result
is to say that after two movements of the
51
00:03:56,667 --> 00:03:59,943
Mouse in the Maze, there is an equal
chance. So, in other words, a one third
52
00:03:59,943 --> 00:04:03,958
probability that the mouse is either in
room one, room two or room three.
53
00:04:03,958 --> 00:04:09,588
So the question is, what is the end
behavior of the movement of the mouse in
54
00:04:09,588 --> 00:04:13,113
the maze, in other words, what is the
steady state vector for the system?
55
00:04:13,113 --> 00:04:18,188
Let's codify the idea using the language
of mathematics. So, a steady state vector,
56
00:04:18,188 --> 00:04:22,955
we'll denote it as x, is by definition a
probability vector, in other words, the
57
00:04:22,955 --> 00:04:27,293
components of that vector sum to one. And
basically, what's happening is that when I
58
00:04:27,293 --> 00:04:31,545
multiply on the left by my stochastic
matrix, that is, when I multiply a stochastic
59
00:04:31,545 --> 00:04:36,848
matrix times a steady state vector, it
remains unchanged, so I get x back.
60
00:04:36,848 --> 00:04:41,883
Another way to put this: we can say that
our dynamical system is in
61
00:04:41,886 --> 00:04:45,196
equilibrium when it reaches the steady
state vector.
62
00:04:45,846 --> 00:04:49,473
Okay, so when I do that calculation, where
I solve for x, in other words the steady
63
00:04:49,473 --> 00:04:53,368
state vector, we get the following result:
x equals one quarter, three eighths,
64
00:04:53,368 --> 00:04:58,185
three eighths. Notice that if I add
the components of that vector together, I
65
00:04:58,185 --> 00:05:02,425
get a sum of 1. So it is, in fact, a
probability vector, which is a nice thing
66
00:05:02,425 --> 00:05:08,526
to check. And because it is a steady state
vector, to remind you, it remains fixed.
67
00:05:08,526 --> 00:05:13,164
It's a fixed point, in other words, when I
multiply by the matrix P on the left.
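The lecture finds this fixed point by solving the linear system, but one numerical way to recover the same answer is power iteration: keep multiplying by P until the state vector stops changing. A sketch under the same assumption that the room-two column of P is as inferred from the graph:

```python
from fractions import Fraction as F

P = [[F(0),    F(1, 3), F(1, 3)],
     [F(1, 2), F(0),    F(2, 3)],
     [F(1, 2), F(2, 3), F(0)]]

# Power iteration: for a regular stochastic matrix, repeatedly
# applying P drives any probability vector to the unique
# steady state vector.
x = [1.0, 0.0, 0.0]
for _ in range(100):
    x = [sum(float(P[i][j]) * x[j] for j in range(3)) for i in range(3)]

# x is now numerically the steady state (1/4, 3/8, 3/8)
assert abs(x[0] - 0.25) < 1e-9
assert abs(x[1] - 0.375) < 1e-9 and abs(x[2] - 0.375) < 1e-9
```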
68
00:05:13,164 --> 00:05:17,985
So in a sense, at some point eventually,
our Markov chain turns into an infinite
69
00:05:17,985 --> 00:05:22,784
sequence of just these steady state
vectors. So, to recall a few conceptual
70
00:05:22,784 --> 00:05:28,502
details as well here, how do we know, for
instance, that the steady state vector x
71
00:05:28,502 --> 00:05:32,954
is the only such steady state vector
associated with that matrix P? We know
72
00:05:32,954 --> 00:05:37,141
that from the theorem I mentioned
previously in this section. Namely, as
73
00:05:37,141 --> 00:05:42,410
long as the stochastic matrix P is regular,
meaning that if I take enough powers of
74
00:05:42,410 --> 00:05:48,005
it, I get strictly positive components.
Well, I haven't shown it directly here but
75
00:05:48,005 --> 00:05:52,388
if you square P, by the way, you can see
that you get strictly positive components.
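That regularity claim is easy to verify directly: square P and check that every entry is strictly positive. A quick sketch (same inferred matrix as before):

```python
from fractions import Fraction as F

P = [[F(0),    F(1, 3), F(1, 3)],
     [F(1, 2), F(0),    F(2, 3)],
     [F(1, 2), F(2, 3), F(0)]]

# Entry (i, j) of P^2 is the probability of going from room j+1
# to room i+1 in exactly two steps.
P2 = [[sum(P[i][k] * P[k][j] for k in range(3)) for j in range(3)]
      for i in range(3)]

# Every entry of P^2 is strictly positive, so P is regular.
assert all(P2[i][j] > 0 for i in range(3) for j in range(3))
```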
76
00:05:52,388 --> 00:05:55,982
So, P is regular. So, the theorem
guarantees us that if our stochastic
77
00:05:55,982 --> 00:06:01,763
matrix is regular then, there are two
consequences, one is that the steady
78
00:06:01,763 --> 00:06:06,564
state vector x is unique, so there is only one
steady state vector, or fixed point, for
79
00:06:06,564 --> 00:06:10,843
this matrix. The other guarantee which is
important in understanding and
80
00:06:10,843 --> 00:06:16,687
interpreting this problem and its answer,
is that no matter where we start in our
81
00:06:16,687 --> 00:06:22,144
Markov chain, the Markov chain will
converge to that particular vector.
82
00:06:22,144 --> 00:06:25,042
So, what does all this mean in the
context of this particular
83
00:06:25,042 --> 00:06:27,482
application of the Mouse in the Maze
problem?
84
00:06:27,482 --> 00:06:30,334
Well, it means the
following: if I were to place the mouse,
85
00:06:30,334 --> 00:06:34,510
initially, in any of the rooms in the maze,
in other words, with respect to any initial
86
00:06:34,510 --> 00:06:39,453
state vector I like and I let the mouse
run around the maze as long as it wants,
87
00:06:39,453 --> 00:06:44,425
I can then answer the question, what is
the end behavior of the mouse? And that
88
00:06:44,425 --> 00:06:49,826
answer is as follows: in the long run, the
mouse spends a quarter of its time in room
89
00:06:49,850 --> 00:06:54,596
one, three eighths of its time in room two,
and similarly, three eighths of its time in
90
00:06:54,645 --> 00:07:00,986
room three. So, there we have it.
Let's then, in retrospect, now that we have
91
00:07:00,986 --> 00:07:05,286
solved the problem, take a step back and
appreciate what a Markov chain, in
92
00:07:05,286 --> 00:07:09,716
conjunction with this procedure of
solving for a steady state vector, has
93
00:07:09,716 --> 00:07:14,552
done for us. Right at the beginning, we
posed a rather daunting and complicated
94
00:07:14,552 --> 00:07:19,189
question or series of premises. We said
knowing the probabilistic tendencies of
95
00:07:19,189 --> 00:07:25,571
this mouse, moving in the maze, what is
the long-term behavior of this dynamical
96
00:07:25,571 --> 00:07:30,634
process? And the answer, with just really
a scant few computations, in the end was
97
00:07:30,634 --> 00:07:37,136
achieved pretty readily. So, there is a
classic example of an application of the
98
00:07:37,136 --> 00:07:40,016
Markov chain. And that ends our
section 3E.
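To close the loop on that long-run interpretation, one can also just simulate the mouse and count where it spends its time; the empirical fractions should approach (1/4, 3/8, 3/8). A sketch, where the transition probabilities out of room two are inferred from the directed graph rather than stated in the lecture:

```python
import random

# Transition probabilities out of each room (the columns of P):
# room -> (possible next rooms, their probabilities)
moves = {
    1: ([2, 3], [0.5, 0.5]),      # stated in the lecture
    2: ([1, 3], [1 / 3, 2 / 3]),  # inferred from the graph
    3: ([1, 2], [1 / 3, 2 / 3]),  # 3 -> 2 with probability 2/3
}

random.seed(0)
counts = {1: 0, 2: 0, 3: 0}
room = 1                          # start in room one
steps = 200_000
for _ in range(steps):
    rooms, probs = moves[room]
    room = random.choices(rooms, probs)[0]
    counts[room] += 1

# Long-run fraction of time spent in each room; should be
# close to {1: 0.25, 2: 0.375, 3: 0.375}.
fractions = {r: counts[r] / steps for r in counts}
```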