1
00:00:10,122 --> 00:00:12,500
Okay, just to summarize.
2
00:00:12,500 --> 00:00:23,089
If we have n bits, this corresponds to 2^n
possibilities,
3
00:00:23,089 --> 00:00:30,441
and we'll always define 2^n to be N, if we
have a little n.
4
00:00:30,441 --> 00:00:46,030
If we have N possibilities, where N could
be three, this corresponds to log base 2 of n bits,
5
00:00:46,740 --> 00:00:53,960
which doesn't have to be a number, for instance,
log to the base 2 of 3 is between one and two bits.
6
00:00:53,960 --> 00:00:59,331
I guess, it's still a number but it's not a
natural number.
7
00:00:59,331 --> 00:01:02,621
Then we also looked at probabilities.
8
00:01:02,631 --> 00:01:12,108
So if the frequency of heads is m_h/m,
9
00:01:12,658 --> 00:01:19,297
I have a sequence of heads and tails,
10
00:01:19,297 --> 00:01:24,209
I count the total number of heads, I count
the total number of tails,
11
00:01:24,209 --> 00:01:31,317
I define this to be q(h) and I define q(t),
12
00:01:31,317 --> 00:01:36,799
this is equals 1-q(t), because it's either
heads or tails,
13
00:01:36,799 --> 00:01:42,138
our coins are not landing on their sides and
just staying there like that,
14
00:01:44,042 --> 00:02:11,185
then the log base 2 of the number of sequences
with frequency q(h) for heads and q(t) of tails,
15
00:02:11,904 --> 00:02:30,184
this is equal to m(-q(h) log base 2 q(h) - q(t) log base 2 q(t)),
16
00:02:30,184 --> 00:02:36,465
and we define this to be the amount of
information or entropy, and so sequences.
17
00:02:36,465 --> 00:02:39,724
And if the logarithm of this number is equal
to this,
18
00:02:39,724 --> 00:02:49,854
that implies that the total number of such
sequences, that is, sequences with this frequency,
19
00:02:49,854 --> 00:03:09,437
is approximately equal to 2^(mI) or 2^(mS).
20
00:03:10,129 --> 00:03:18,194
So this special formula, almost magical but
not magical because it's just mathematical,
21
00:03:18,194 --> 00:03:22,261
this formula which shows up all the time in
information theory,
22
00:03:22,261 --> 00:03:27,700
minus the sum of q log q or minus the sum of
p log p,
23
00:03:27,700 --> 00:03:32,831
is just a way of counting numbers of sequences
24
00:03:32,831 --> 00:03:35,671
if you're given a particular kind of frequency.
25
00:03:35,671 --> 00:03:42,170
So there's an intrinsic relationship between
frequency, probability and information,
26
00:03:42,170 --> 00:03:46,221
and that relationship is given by this formula.
27
00:03:49,411 --> 00:03:54,670
In a bit, maybe a lecture or two down the line
in these ten minute lectures,
28
00:03:54,670 --> 00:04:00,691
we'll talk about how to apply this fundamental
formula about information theory to the
29
00:04:00,691 --> 00:04:03,191
problem of communication.
30
00:04:03,591 --> 00:04:08,370
How many bits of information can you send
down a communication channel?
31
00:04:08,370 --> 00:04:12,282
For example, I'm talking right now.
32
00:04:12,282 --> 00:04:17,122
My larynx is wiggling, they're sending vibrations
through the air,
33
00:04:17,122 --> 00:04:20,334
this is a physical system, information is being
conveyed,
34
00:04:20,334 --> 00:04:24,933
but because it's a physical communication
channel, there are physical limits to how much
35
00:04:24,933 --> 00:04:28,862
information I could probably communicate to you,
36
00:04:28,862 --> 00:04:33,642
assuming, of course, you think I am communicating
any information at all.
37
00:04:34,524 --> 00:04:42,672
These fundamental formulas of information theory
allows us to put limits on the capacity of any
38
00:04:42,672 --> 00:04:51,863
communication channel, any communication
channel that's realized by a physical medium,
39
00:04:51,863 --> 00:04:57,274
and those are the only communication channels
I know about.
40
00:04:57,993 --> 00:05:06,894
But before we do this, let's talk about the
relationship between information and computation.
41
00:05:06,894 --> 00:05:13,075
Let's talk about information processing.
42
00:05:17,044 --> 00:05:22,526
This also goes under the name of computation,
43
00:05:22,526 --> 00:05:27,753
and it also goes under the name of Boolean
logic.
44
00:05:30,268 --> 00:05:32,616
Boolean logic, what is Boolean logic?
45
00:05:33,436 --> 00:05:39,039
Well George Boole was a 19th century logician
who came up with Boolean logic.
46
00:05:39,681 --> 00:05:44,550
He actually was married to Mary Everest,
who was the daughter of the surveyor of the
47
00:05:44,738 --> 00:05:49,044
British Raj of the same name after whom
Mount Everest is named.
48
00:05:49,993 --> 00:05:58,712
Boole wrote a modestly-titled book called
the Laws of Thought.
49
00:06:02,251 --> 00:06:08,044
He'd fit in right well with the kind of modesty
that goes on in the Santa Fe Institute nowadays,
50
00:06:08,044 --> 00:06:11,083
I'm sure we'd welcome him with open arms.
51
00:06:11,083 --> 00:06:17,232
So what is the content of Boole's laws of thought?
52
00:06:19,502 --> 00:06:23,212
Boole was interested in logical propositions,
53
00:06:24,433 --> 00:06:28,925
and logical propositions can either be true or false.
54
00:06:31,604 --> 00:06:35,944
By the way, if we're going to map this the way
computers do this,
55
00:06:35,944 --> 00:06:40,453
often we call true - 1 and false - 0.
56
00:06:40,453 --> 00:06:42,443
This is just a convention.
57
00:06:42,443 --> 00:06:47,483
Those of you who are more fond of zero than one
can have the opposite convention if you like,
58
00:06:47,483 --> 00:06:50,803
but I'm going to adopt this convention right here.
59
00:06:51,475 --> 00:06:59,074
So Boole was interested in questions like,
suppose I have a proposition X,
60
00:06:59,074 --> 00:07:04,535
so X is 'it's sunny'.
61
00:07:05,435 --> 00:07:12,314
And I have another proposition Y, and Y is,
for example, 'it's raining'.
62
00:07:14,057 --> 00:07:17,989
So either it's sunny or it's not sunny.
63
00:07:17,989 --> 00:07:22,209
So it could be true if it's sunny or it's
false if it's sunny.
64
00:07:22,209 --> 00:07:25,759
It could be true if it's raining or it's
false if it's raining.
65
00:07:26,253 --> 00:07:33,548
And Boole was interested in propositions
of the form X AND Y.
66
00:07:33,548 --> 00:07:38,618
X AND Y means that it's sunny and it's
raining.
67
00:07:38,618 --> 00:07:44,408
Now actually where Boole lived in Ireland,
it was probably, usually, false.
68
00:07:46,208 --> 00:07:48,899
Usually false in this case.
69
00:07:52,149 --> 00:07:58,578
And he was also interested in propositions
like X OR Y.
70
00:07:59,449 --> 00:08:02,968
Now, either it's sunny or it's raining.
71
00:08:03,449 --> 00:08:06,699
It could also be cloudy and not raining.
72
00:08:07,190 --> 00:08:10,087
So sometimes true.
73
00:08:16,208 --> 00:08:30,480
And he was also interested in propositions
like, if X then (NOT Y).
74
00:08:30,480 --> 00:08:34,087
So if it's sunny then it's not raining.
75
00:08:34,087 --> 00:08:36,530
Probably largely true.
76
00:08:37,008 --> 00:08:44,768
So he wanted to break down logical statements
into propositions of this form,
77
00:08:44,768 --> 00:08:53,052
using ideas like X AND Y, that a proposition
is true if and only if its constituents are true.
78
00:08:53,879 --> 00:09:01,069
X OR Y, this proposition is true if and only if
one or more of its constituents were true.
79
00:09:01,674 --> 00:09:06,453
Or more complicated propositions like if X
then NOT Y,
80
00:09:07,593 --> 00:09:14,265
which is some Boolean function of its constituent
propositions.
81
00:09:14,854 --> 00:09:20,075
Here is a glorious fact.
82
00:09:22,946 --> 00:09:34,844
If I say X AND Y, I write this as X ⋂ Y.
83
00:09:37,634 --> 00:09:44,925
X OR Y, I can write this as X ⋃ Y,
84
00:09:44,925 --> 00:09:48,943
this is the notation that Boole developed.
85
00:09:49,694 --> 00:09:59,004
NOT X, I write this as X with a little twiddle
in front of it (~X) to say that it is not true.
86
00:10:03,604 --> 00:10:14,765
If X then Y, I write this as X → Y, if X is
true then Y is true.
87
00:10:15,953 --> 00:10:34,554
Then Boole had a beautiful theorem that says,
"Any possible logical expression can be written
88
00:10:34,554 --> 00:10:43,877
down in terms of composing these kinds of
propositions.
89
00:10:43,877 --> 00:11:14,824
So, for example, X ⋂ (Y ⋃ ~X) → Y ⋂ (X ⋃ ~Y).
90
00:11:15,343 --> 00:11:18,253
Now I've no idea what this means.
91
00:11:21,093 --> 00:11:28,750
And what Boole gave is a set of rules to
evaluate when such expressions could be true
92
00:11:29,171 --> 00:11:31,230
and when they're not true.
93
00:11:31,230 --> 00:11:36,560
So if I just speak this out in terms of sunny
and raining,
94
00:11:36,560 --> 00:11:47,992
it's sunny and it's raining or it's not sunny
then it's raining and it is sunny or not raining.
95
00:11:47,992 --> 00:11:50,092
Sounds false to me.
96
00:11:51,582 --> 00:11:57,902
But what Boole showed is that any possible
logical proposition could be written in this way.
97
00:11:58,251 --> 00:12:01,804
And this is why he modestly entitled his book,
"The Laws of Thought".
98
00:12:01,983 --> 00:12:05,931
Because he was interested in logic, he noted
that any possible logic proposition could be
99
00:12:06,121 --> 00:12:09,492
written out in terms of these simple rules,
100
00:12:09,492 --> 00:12:14,842
and here is a even more remarkable fact,
101
00:12:15,541 --> 00:12:20,673
and the fact is, and this is not a mid-19th-century
thing,
102
00:12:20,673 --> 00:12:25,162
this is a beginning-of-the-21st-century thing,
103
00:12:25,162 --> 00:12:45,740
all digital computers are doing is evaluating
such Boolean expressions.
104
00:12:50,248 --> 00:12:55,853
A computer just breaks up things into operations.
105
00:12:55,853 --> 00:13:00,889
And we can have an operation, this is how I
write the same kind of thing in computer language,
106
00:13:00,889 --> 00:13:12,481
this says X AND Y, and this is what's called
an AND gate.
107
00:13:12,481 --> 00:13:20,822
An AND gate is simply a device in the computer
that takes in the inputs X AND Y, which are zero and one,
108
00:13:20,822 --> 00:13:32,818
and outputs X AND Y, and this is equal to one
if and only if X = 1, Y = 1, otherwise it's equal to zero.
109
00:13:34,248 --> 00:13:47,374
Similarly, we have an OR gate, which is written
like this, and the output is X OR Y,
110
00:13:47,606 --> 00:13:53,083
this is equal to one if X OR Y is equal to one.
111
00:13:53,264 --> 00:13:58,715
It's equal to zero if X = 0 and Y = 0.
112
00:13:59,061 --> 00:14:04,706
It's a physical device that implements this
logical operation OR.
113
00:14:04,706 --> 00:14:08,726
The AND gate is something that implements
the physical operation AND.
114
00:14:08,726 --> 00:14:13,365
And we have something that implements NOT X.
115
00:14:13,365 --> 00:14:21,317
So NOT X = 0 if X = 1 and it's NOT X = 1 if X = 0.
116
00:14:23,346 --> 00:14:31,138
So this is AND, OR, NOT and the final operation
is a COPY operation.
117
00:14:31,138 --> 00:14:34,216
So I take X and I make two copies of it.
118
00:14:35,066 --> 00:14:39,620
So these are the computer digital versions
of Boolean logic,
119
00:14:40,895 --> 00:14:48,742
they are actually physical devices that implement
Boole's universal laws of thought.
120
00:14:48,963 --> 00:14:53,372
Mind you, there are plenty of ways of thinking
that I do and that other people do and the
121
00:14:53,372 --> 00:15:00,880
dogs and cats do that cannot obviously be
broken down into this Boolean logic,
122
00:15:00,880 --> 00:15:03,561
so I'm not sure how universal it is.
123
00:15:04,209 --> 00:15:09,580
We have these four different kinds of gates:
AND, OR, NOT and COPY.
124
00:15:10,172 --> 00:15:12,691
And now here's a remarkable thing.
125
00:15:12,691 --> 00:15:27,488
So Boole's theorem, his universal theorem,
says that any set of digital logic operations,
126
00:15:30,541 --> 00:15:36,643
that is to say what a digital computer can do,
127
00:15:37,429 --> 00:15:46,233
including your iPhone, or if you're an Android
person, your Android phone,
128
00:15:48,002 --> 00:16:03,473
can be broken down into AND operations,
which looks like this in computer science lingua,
129
00:16:03,473 --> 00:16:07,131
OR operations, which looks like this,
130
00:16:07,131 --> 00:16:12,443
NOT operations, which are written like this
for some historical reasons,
131
00:16:12,443 --> 00:16:17,253
and COPY operations, which are written like this.
132
00:16:17,253 --> 00:16:21,724
So all your smartphone is doing, all your
computer is doing
133
00:16:21,724 --> 00:16:29,373
is breaking down logical processes and information
processing into fundamental units of combining
134
00:16:29,373 --> 00:16:32,854
these operations of AND, OR, NOT and COPY.
135
00:16:32,854 --> 00:16:39,854
And as George Boole showed in the 1840s and fifties,
it's possible to break down any kind of logical expression
136
00:16:39,854 --> 00:16:42,722
into the set of operations.
137
00:16:42,722 --> 00:16:50,973
So we have information, we have bits, we have
ways of processing and transforming those information,
138
00:16:50,973 --> 00:16:54,144
or that information, sorry,
139
00:16:54,144 --> 00:16:57,724
using elementary physical systems which
are logic gates,
140
00:16:57,724 --> 00:17:03,366
and all the digital computation is doing is
combining these sets of operations in arbitrarily
141
00:17:03,366 --> 00:17:04,964
complicated ways.