Harvard CS50 - Full Computer Science University Course - YouTube - English
is where to start.
CS50 is considered by many to be one of the best computer science courses in the
world.
Throughout a series of lectures, Dr. Malan will teach you how to think
algorithmically and solve problems efficiently.
And make sure to check the description for a lot of extra resources that go along
with the course.
[MUSIC PLAYING]
and the art of programming, back here on campus in beautiful Sanders Theatre
My name is David--
OK.
And I took this class myself some time ago, but almost didn't.
that was, like, actually fun for perhaps the first time in, what, 19 years.
and sort of bring to bear something that I'd been using every day
but didn't really know how to harness, that's been gratifying ever since,
In fact, just this past week, I looked in my old CS50 binder, which I still
of what was apparently the very first program that I wrote and submitted,
But this is a program that we'll soon see in the coming days that
does something quite simple, like print "Hello, CS50," in this case,
to the screen.
grammatical rules, programming, once you start to wrap your mind around what
it is and how it works and what these various languages are, it's so easy,
And the only experience that matters ultimately in this class is your own.
And take comfort in knowing just some months from now all of that
And if you're thinking that, OK, surely the person in front of me, to the left,
to the right, behind me, knows more than me, that's statistically not the case.
2/3 of CS50 students have never taken a CS course before, which is to say,
It helps you learn how to think more methodically, more carefully, more
to do what you want unless you are correct and precise and methodical.
Problems are all about taking input, like the problem you want to solve.
output.
We all just need to decide, whether it's Macs or PCs or phones or something
else, that we're all going to speak some common language, irrespective
And you may very well know that computers tend to speak only
And indeed, we humans have many more than that, certainly not just zeros
And so, how do you get from something as simple as a few zeros, a few ones,
on your fingers where one finger represents one person in the room,
And we'd go past just those five digits and count much higher,
Enough said.
but we might not bother writing out the two zeros at the beginning.
we're going to have to tweak these zeros and ones further to get 3.
bits, for binary digits that represent, via these different patterns,
Why?
And it's a really simple thing to just either store some electricity
1 or 0, so to speak.
You've got thousands, millions of them in your Mac or PC or phone these days.
And these are just tiny little switches that can get turned on and off.
And so these switches, really, you can think of as being like switches
like this.
And if I put it in that same kind of pattern, I don't want to just do this.
So if this was one a moment ago, what I think I did earlier was I turned it off
010.
So here is 000.
Here is 001.
If this other bulb now goes on, and that switch is turned
and all three stay on-- this, again, was what number?
AUDIENCE: Seven.
anymore because we've probably been doing math and numbers since grade
school or whatnot.
But why?
If you use three digits in decimal, and you have the ones place,
the tens place, and the hundreds place, well, why was that 1, 10, and 100?
Why 10?
just change the bases if you're using only zeros and ones.
And if you keep going, it's going to be 8s column, 16s column, 32, 64,
and so forth.
0, plus 2 times 1, plus 1 times 0, and now 3, and now 4, and now 5, and now 6,
and now 7.
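The place-value idea above (a ones column, a twos column, a fours column, and so on) can be sketched in a few lines of Python. The function name `bits_to_decimal` is made up for illustration; it is one way to do the arithmetic by hand, not anything specific to the course.

```python
def bits_to_decimal(bits: str) -> int:
    """Add up the place value of every column that holds a 1."""
    total = 0
    # Walk the pattern right to left so position 0 is the ones column.
    for position, bit in enumerate(reversed(bits)):
        if bit not in ("0", "1"):
            raise ValueError(f"not a binary digit: {bit!r}")
        total += int(bit) * 2 ** position  # columns are worth 1, 2, 4, 8, ...
    return total


# Count from zero to seven with three bits, as with the three light bulbs.
for pattern in ["000", "001", "010", "011", "100", "101", "110", "111"]:
    print(pattern, "=", bits_to_decimal(pattern))
```

With three bits the loop prints 0 through 7; a longer pattern simply adds more columns on the left.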
And, in fact, most computers would typically use at least eight at a time.
you would still use eight and have a whole bunch of zeros.
All right, so, with that said, if we can now count as high as seven
like the letter A of the English alphabet, if, at the end of the day,
Any thoughts?
Yeah.
We just have to agree and sort of write it down in some global standard.
ultimately, can only be stored, at the end of the day, as zeros and ones.
And so, some humans in a room before, decided that capital A shall be 65,
or, really, this pattern of zeros and ones inside of every computer
at a time.
have this mapping from numbers to letters, but still support numbers?
Yeah?
I like that.
Other thoughts?
Yeah.
The reason we have all of these different file formats in the world,
like JPEG and GIF and PNGs and Word documents, .docx,
and Excel files and so forth, is because a bunch of humans got in a room
we shall interpret any patterns of zeros and ones as being maybe numbers
for Excel, maybe letters in, like, a text messaging program or Google Docs,
or maybe even colors of the rainbow in something like Photoshop and more.
some hints to the computer that tells the computer, interpret it as follows.
So, similar in spirit to that, but not quite as standardized with these
prefixes.
So this system here actually has a name ASCII, the American Standard
looked inside the computer, what you technically received in this text
What might your friend have sent you as a message, if it's 72, 73, 33?
AUDIENCE: Hey.
Close.
AUDIENCE: Hi.
Why?
And this is, frankly, not the kind of thing most people know.
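For anyone who wants to check the 72, 73, 33 message without memorizing the chart, here is a small Python sketch. It leans on Python's built-in `chr` and `ord`, which agree with ASCII for these values; the variable names are just for illustration.

```python
# The three numbers from the text message, one per character.
message = [72, 73, 33]

# chr() maps a number to its character; for values below 128 this is ASCII.
text = "".join(chr(code) for code in message)
print(text)  # HI!

# The same three numbers as the eight-bit patterns a computer would store.
bits = [format(code, "08b") for code in message]
print(bits)

# ord() goes the other way, from a character back to its number.
print(ord("A"))  # 65
```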
When I said that we just need to write down this mapping earlier,
But it's important to note that when you get these patterns of zeros and ones
And even those, frankly, aren't that useful if we do out the math.
Anyone know?
Say it again?
Long story short, if we actually got into the weeds of all of these zeros
So this is useful because now we can speak, not just in terms of bytes
The problem is that, if you're using ASCII and, therefore, eight bits or one
that need many more symbols and, therefore, many more bits.
of the accented characters that you might want for some languages.
You probably send and/or receive many of these things any given day.
of zeros and ones that you're receiving, that the world has also standardized.
And when you receive them, your phone, your laptop, your desktop,
And Unicode is just a mapping of many more numbers to many more letters
with the old way of doing things with ASCII, but they might also use 16 bits.
And heck, Unicode might even use 32 bits to represent letters and numbers
and punctuation symbols and emojis.
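A quick way to see the 8, 16, and 32 bit cases side by side is to ask Python how many bytes a character needs. This sketch assumes the UTF-8 encoding, which is one common way of storing Unicode, and the three sample characters are picked only for illustration.

```python
# One plain letter, one accented letter, and one emoji (arbitrary samples).
samples = {
    "A": "A",
    "e with acute accent": "\u00e9",
    "grinning face emoji": "\U0001F600",
}

bit_counts = {}
for label, symbol in samples.items():
    encoded = symbol.encode("utf-8")       # the bytes actually stored or sent
    bit_counts[label] = len(encoded) * 8   # eight bits per byte
    print(label, "-> code point", ord(symbol), "->", bit_counts[label], "bits")
```

The plain letter still fits in 8 bits, just as it did in ASCII, while the accented letter takes 16 and the emoji takes 32.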
And, I daresay, one of the reasons we see so many emojis these days is we
has anyone ever received this decimal number, or if you prefer binary now,
has anyone ever received this pattern of zeros and ones on your phone,
Well, if you actually look this up, this esoteric sequence of zeros and ones
as well.
I started being in the habit of using the emoji that kind of looks
like this because I thought it was like woo, happy face, or whatever.
because whatever device I was using sort of looks like this, not like this.
This has happened too when what was a gun became a water
pistol in some manufacturers' eyes.
Yeah?
DAVID MALAN: Yeah, so we'll come back to this in a few weeks, in fact.
Binary is one.
Decimal is another.
Unary is another.
But hexadecimal, long story short, uses four bits per digit.
And so, four bits, if you have two digits in hex, that gives you eight.
And it's also human convention in the world of files and other things.
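To make the "four bits per digit" point concrete, here is a short Python sketch using only built-in formatting. The value 255 is chosen just because it fills all eight bits.

```python
value = 255

as_binary = format(value, "08b")  # eight binary digits
as_hex = format(value, "02x")     # two hexadecimal digits
print(as_binary)  # 11111111
print(as_hex)     # ff

# Each hex digit stands for exactly four bits, so two digits cover eight.
for digit in as_hex:
    print(digit, "->", format(int(digit, 16), "04b"))

# And back again: int() with base 16 reads hex digits.
print(int("ff", 16))  # 255
```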
Other questions?
Good question.
No?
Oh, yes?
There we go.
Sorry.
[INAUDIBLE]
DAVID MALAN: Ah, sure, and we'll come back to this, in some form,
we have, with eight bits, two possible values for the first
and then two for the next, two for the next, and so forth.
means you can have 256 total possible patterns of zeros and ones.
software often starts counting at 0 by convention and if you use one of those
you only have 255 other patterns left to count as high as therefore 255.
That's all.
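The counting argument can be checked by brute force. This Python sketch simply lists every possible pattern of eight zeros and ones using the standard library's `itertools.product`.

```python
from itertools import product

# Two choices for each of eight positions: 2 * 2 * 2 * 2 * 2 * 2 * 2 * 2.
patterns = ["".join(bits) for bits in product("01", repeat=8)]

print(len(patterns))              # 256 patterns in total
print(patterns[0], patterns[-1])  # 00000000 11111111

# One pattern is spent on zero, so the highest number left is 255.
print(int(patterns[-1], 2))       # 255
```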
Good question.
All right, so what then might we have besides these emojis and letters
and numbers?
How might a computer, do you think, knowing what you know now, represent
Like what are our options if all we've got are zeros and ones and switches?
Yeah?
AUDIENCE: RGB
This yellow dot on the screen that might be part of any of those emojis
these days, well that's some amount of red, some amount of green,
And it was only once they decided, well, let's add an eighth bit that they
a solution to the same problem of not having enough room, if you will,
But even that wasn't enough and that's why we've now gone up to 16 and 32
we have to decide how to represent the amount of red and green and blue.
Well, it turns out if all we have are zeros and ones, ergo numbers,
For instance, suppose a computer we're using, these three numbers 72, 73, 33,
but now in the context of something like Photoshop, a program for editing
respectively.
You can think of the first digit as red, second as green, third as blue.
And so ultimately when you combine that amount of red, that amount of green,
that amount of blue, it turns out it's going to resemble the shade of yellow.
And indeed, you can come up with numbers between 0 and 255
for each of those colors to mix any other color that you might want.
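Here is a small Python sketch of the same three bytes read two ways, once as text and once as a color. The `#rrggbb` hex notation is a common convention brought in here for illustration; nothing above says which notation an image editor would display.

```python
# The same three numbers as before, stored as three bytes.
data = bytes([72, 73, 33])

# Interpreted by a texting program: three ASCII characters.
as_text = data.decode("ascii")
print(as_text)  # HI!

# Interpreted by an image program: amounts of red, green, and blue (0 to 255).
red, green, blue = data
as_hex_color = f"#{red:02x}{green:02x}{blue:02x}"
print(red, green, blue, as_hex_color)  # 72 73 33 #484921
```

The bytes never change; only the program deciding how to interpret them does.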
on our phones and laptops such that you barely see the dots, they are there.
So if you look sort of awkwardly, but up close to your phone or your laptop
How might you represent a video using only zeros and ones?
Yeah?
[INAUDIBLE]
It's not just one image, it's not just one letter or a number,
interpret them using our eyes and brain that there is now
If we just came up with some convention for representing those same notes
[MUSIC PLAYING]
[CLICKING]
are just zeros and ones because they're just this grid of these pixels or dots.
Now something like musical notes like these, those of you who are musicians
each note that you saw a moment ago essentially as a sequence of numbers.
But more generally, you might think about music as having notes,
you might have the duration like how long is the note being heard or played
just the volume like how hard does a human in the real world
press down on that key and therefore how loud is that sound?
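One way to picture those three quantities is as a small record per note. This is only a sketch: the class name `Note`, the pitch numbering, and the 0 to 127 volume range are assumptions made for illustration, not a description of any particular file format.

```python
from dataclasses import dataclass


@dataclass
class Note:
    pitch: int       # which key is pressed, as a number (numbering is assumed)
    duration: float  # how long the note is held, in seconds
    volume: int      # how hard the key is pressed (0 to 127 assumed here)


# A made-up three-note melody, stored as nothing but numbers.
melody = [
    Note(pitch=60, duration=0.5, volume=90),
    Note(pitch=62, duration=0.5, volume=70),
    Note(pitch=64, duration=1.0, volume=100),
]

total_seconds = sum(note.duration for note in melody)
print(len(melody), "notes lasting", total_seconds, "seconds")
```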
It would seem that just remembering little details like that quantitatively
we can then represent really all of these otherwise analog human realities.
but at the end of the day and as fancy as those devices in years
are, it's just zeros and ones, tiny little switches or light bulbs,
Yeah, in back.
AUDIENCE: Yeah, so we talked about how different file formats kind of allow
How does a file format like .mp4 discriminate between audio and video
You allude to MP4 for video and more generally the use
It's not quite as simple when using larger files, for instance,
for instance.
Why?
you might know how many megabytes or larger even individual photographs
might be.
that uses much more math to represent the same information more minimally
In the world of multimedia, which we'll touch on a little bit in a few weeks,
look perfect to the human, but heck it's a lot cheaper and a lot easier
to distribute.
Yeah?
I mean, back in the day you might have heard of the expression a vacuum
that has allowed us to store as many and many more zeros and ones much more
closely together.
For instance, you might know by using your phone or your laptop
In fact, if you'd like to see one of the earliest computers from decades ago,
across the river here in now Allston in the new engineering building
is the Harvard Mark 1 computer that will give you a much better mental model
of just that.
But what's inside of this proverbial black box into which these inputs are
Well that's where we get this term you might have heard, too.
for solving some problem, and maybe you're using this language or that,
to this problem.
AUDIENCE: Yes.
obviously I'm eventually going to get to him, so that's what we mean by correct.
Is it efficient?
No.
I mean this is going to take forever even just to get to the Js or the Hs,
AUDIENCE: No.
Yeah?
you're only going odd number of pages, so if it's on an even number page,
of the book there was no John Harvard, I might have just errored.
Yeah, in back.
AUDIENCE: [INAUDIBLE]
And now I'm looking down, I'm looking for J, assuming first name, J Harvard,
AUDIENCE: [INAUDIBLE]
But I've now just decreased the size of this problem really in half.
So let me tear the problem in half again, throw another 250 pages away,
but really just how to harness intuition and ideas that you might already
Why is an algorithm like that if I found John Harvard better than, ultimately,
So the farther you go out here, the more pages are in the phone book.
Why?
One to one.
If, though, we use the second algorithm, flawed though it was,
between the size of the problem and the amount of time required to solve it
The implication there is that if, for instance, Cambridge and Allston,
Yeah.
and if you or your parents have any more still somewhere we could definitely
Other questions?
But thanks.
No.
Hopefully.
we're about to start coding in, it's just a way of expressing yourself
But that probably wasn't the case at least for John Harvard,
to speak, and then not just open to the middle of the left half of the book,
Why?
Because I can just repeat what I just did, but with a smaller problem
then I should open to the middle of the right half of the book,
left, if John Harvard is not on the page and he's not to the left
and he's not to the right, what should our conclusion be?
Now as an aside, it's kind of deliberate that I buried that last question
at the end because this is what happens all too often in programming,
just not considering all possible cases, corner cases if you will,
that might not happen that often, but if you don't anticipate them
Why?
Like what does this program do if John Harvard is not in the phone book
I don't know.
mistakes you're going to make early on, and me, too, 25 years later.
after a mathematician, last name Boole, that simply have yes or no answers.
Or, if you prefer, true or false answers or, heck, if you prefer 1 or 0 answers.
a 1,000-page phone book, I can get away with a 13-line program but sort
of repeat myself inherently in order to solve some problem until I get to that
last step.
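The divide-in-half phone book search can be sketched in Python as follows. This is a minimal version that assumes the names are already sorted; the function name `find_page` and every name in the sample book other than John Harvard are invented for illustration.

```python
from typing import Optional, Sequence


def find_page(names: Sequence[str], target: str) -> Optional[int]:
    """Return the index of target in a sorted list, or None if absent."""
    left, right = 0, len(names) - 1
    while left <= right:                 # there are still pages to look at
        middle = (left + right) // 2     # open to the middle
        if names[middle] == target:
            return middle                # found: "call" the person
        if target < names[middle]:
            right = middle - 1           # throw away the right half
        else:
            left = middle + 1            # throw away the left half
    return None                          # the corner case: not in the book


phone_book = sorted(["Alice Adams", "John Harvard", "Grace Hopper", "Zed Young"])
print(find_page(phone_book, "John Harvard"))  # 2
print(find_page(phone_book, "Nobody Here"))   # None
```

The final `return None` is the easy line to forget. Without it, a search for someone who is not in the book has no defined answer.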
of programs that we'll touch on before long, things like arguments and return
and that everyone in the real world these days still uses,
In fact, this version here just tries to print quote unquote, "Hello, world."
Which is, I daresay, the most canonical first thing that most
words like int, curly braces, quotes, parentheses, semicolons, and back
slashes.
Now that's not to say that you won't be able to understand this before long,
because honestly there's not that many patterns, indeed programming languages
have typically a much smaller vocabulary than any actual human language,
But you can perhaps infer I have no idea what these other lines do yet,
but "Hello, world." is presumably quote unquote what
when you're quite younger but without the same vocabulary applied
to those ideas.
The upside of what we'll soon do using Scratch, this graphical programming
language from our friends down the road at MIT, it'll let us today
start to drag and drop things that look like puzzle pieces that interlock
But for now, let's go ahead and take a 10 minute break here
that we'll also take a look at in a few weeks before long along
Today though, and for this first week, week zero, so to speak,
that will be in C and in Python and in JavaScript and other languages, too,
but in a way where we don't have to worry about the distractions of syntax.
up here that you can full screen to make bigger and this here by default
is Scratch, who can move up, down, left, right and do many more things, too.
which is helpful only when it comes time to position things on the screen.
Right now Scratch is at the default, 0,0, where x equals 0 and y equals 0.
If you were to move the cat way up to the top, x would stay zero,
If you move the cat all the way to the bottom, x would stay zero,
And if you went left, x would become negative 240 but y would stay 0,
So those numbers generally don't so much matter because you can just
be helpful just to have that mental model of up, down, left, and right.
Well let's go ahead and make perhaps the simplest of programs here.
which has a whole bunch of puzzle pieces or blocks that relate to motion.
and Boolean expressions and more each in a different color and shape.
these that might say "hello" or whatever for some number of seconds.
Or you could switch costumes, change the cat to look like a dog or a bird
Sounds, too.
You can play sounds like "meow" or anything you might import or record,
yourself.
Then there's these things Scratch calls events and the most important of these
this rectangular region has this green flag and red stop
sign up above, one of which is for Play one of which is for Stop
and so that's going to allow us to start and stop our actual programs
But you can listen for other types of events when the spacebar is pressed
you and I take for granted like every day now on our phones.
Any time you tap an icon or drag your finger or hit a button on the side.
These are what a programmer would call events,
Indeed, that's why when you tap the phone icon on your phone,
wrote software that's listening for a finger press on that particular icon.
which implies some kind of loop where we can do something again and again.
You can ask questions aka Boolean expressions like is the sprite touching
Under Operator some lower level stuff like math, but also the ability
Something and something must be true in order to make that kind of decision
before.
Says apple and banana by default, but you can type in or drag and drop
larger sentences.
But in programming you'll see that it's much more conventional not to just
give variables full singular or plural words to describe what they are.
and with the first problem set whereby once you start to assemble these puzzle
pieces and you realize, oh, would have been nice if those several pieces could
have just been replaced by one had MIT thought to give me that
one puzzle piece, you yourself can make your own blocks
to click and drag and drop this thing here when green flag clicked.
Then I'm going to grab one more, for instance under Looks,
something like initially not just Hello but the more canonical
I can go over here now and click the green flag and voila,
more user friendly than typing out the much more cryptic text that we
saw on the screen that you, too, will type out next week.
But for now, we'll just focus on these ideas, in this case, a function.
and it seems to take some form of input in the white oval, specifically Hello
comma world.
like the cat and the speech bubble are saying Hello, world.
So already even that simple drag and drop
Let's go ahead now and make the program a little more interactive so
And you might have to poke around to find these things the first time
around, but I've done this a few times so I kind of know where things are
we want, and it's going to wait for the human to type in their answer.
from the Say block, which just had this side effect of printing a speech
The ask function is even more powerful in that after it asks the human to type
something in.
I'm going to let go and ask the default question, what's your name?
don't know when I'm writing the program who's going to use it.
So let me now grab another looks block up here, say something again, and now
let me go back to Sensing and now grab the return value, represented
by this other puzzle piece, and let me just drag and drop it here.
Notice it's the same shape, even if it's not quite the same size.
Let me go and stop the old version because I don't want to say Hello,
world anymore.
Enter.
Huh.
Yeah?
AUDIENCE: Do you need to somehow add them together in the same text box?
DAVID MALAN: Yeah, we kind of want to combine them in the same text box.
And it's technically a bug because this just looks kind of stupid.
but it's just blowing past the Hello and printing David.
But let's put our finger on why this is happening.
You're right for the solution, but what's the actual fundamental problem?
In back.
to even see it because it's then saying David on the screen so fast as well.
And what we'll try to teach you over the coming weeks
the more you'll get your footing with exactly this kind of problem solving.
Hello, one second, two seconds, David, one second, two seconds, but at least
and you can just drag and let go and it'll delete itself.
Let me go down to Operators because this Join block here is the right shape.
So even if you're not sure what goes where, just focus on the shapes first.
so that the output of one becomes the input to another, but that's OK here.
Let me go ahead and zoom out, hit Stop, and hit Play.
[APPLAUSE]
Thank you.
it still fits the same mental model, but in a little more interesting way.
And notice that in this case too there's an input, otherwise known
we'll see that the input now is going to be quote unquote "What's your name?"
the function called Ask, and the output of that thing this time
Thanks to this function it's sort of handing me back like a scrap of paper
can take not one but two arguments, or inputs, and that's fine.
The inputs now are two things, Hello comma and the return value
The function now is going to be Join, the output is going to be Hello, David.
to become the input to another function, namely that first block called Say,
and that's then going to have the side effect of printing out Hello, David
on the screen.
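The chain of blocks described above, where the output of one function becomes the input to the next, can be sketched in Python. The functions `ask`, `join`, and `say` below are stand-ins written to mirror the Scratch blocks, not real library calls, and the canned answer exists only so the sketch runs without a keyboard.

```python
from typing import Callable


def ask(question: str, read_answer: Callable[[str], str] = input) -> str:
    """Pose a question and hand the answer back as a return value."""
    return read_answer(question + " ")


def join(first: str, second: str) -> str:
    """Take two arguments and return them glued together."""
    return first + second


def say(text: str) -> None:
    """No return value; the side effect is text appearing on the screen."""
    print(text)


# A canned answer stands in for the human typing their name.
answer = ask("What's your name?", read_answer=lambda prompt: "David")
greeting = join("Hello, ", answer)
say(greeting)  # Hello, David
```

Dropping the `read_answer` argument makes `ask` fall back to Python's built-in `input`, which waits for a person to type.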
are going to get, they really do fit this very simple mental model
of inputs and outputs and you just have to learn to recognize the vocabulary
But you can ultimately really kind of spice these things up.
and this one is going to give me a few new green puzzle pieces, namely
Now notice I don't have to interlock them if I'm just kind of playing around
put it there, let me throw away the Say block by just moving it
Type in David.
And voila:
PROGRAM: Hello, banana.
That's embarrassing.
David.
And now:
Hello, David.
[APPLAUSE]
OK, so we have these functions then in place, but what more can we do?
Well what about those conditionals and loops and other constructs?
and let me just spice things up with some more audio under Sound.
[MEOW]
[QUIETER MEOW]
OK.
It's kind of an underwhelming program eventually
since you'd like to think that the cat would just meow on its own, but.
[MEOW]
Well this seems like an opportunity for doing something again and again.
let me just grab a few of these, or you can even right click or Control click
[THREE MEOWS]
Is this well-designed?
Yeah?
But why?
Like this works, I'm kind of done with the assignments, what's bad about it?
AUDIENCE: There's too much repetition.
but then I'd have to change it here and then I'd have to change it here,
and God, if this were even longer that just gets tedious quickly
So I do like the repeat or the forever idea so that I don't repeat myself.
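In text form, the difference between copying blocks and using a loop looks roughly like this Python sketch. `play_meow` is a hypothetical stand-in for the Scratch sound block, and the one-second wait is left out so the example finishes instantly.

```python
def play_meow() -> str:
    """Stand-in for Scratch's play-sound block; returns what was played."""
    return "meow"


# Copy and paste: works, but every change must be made in three places.
copied = [play_meow(), play_meow(), play_meow()]

# A loop: the count lives in exactly one place.
looped = []
for _ in range(3):
    looped.append(play_meow())

print(copied == looped)  # True
```

Both versions are equally correct; the loop is simply the better design because changing 3 to another count is a one-character edit.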
So let me go ahead and throw away most of these to get rid of the duplication,
Let me grab the Repeat block for now, let me move this inside of the Repeat
block, it's going to grow to fit it, let me reconnect all this
So, better.
40 times by changing one thing, or it could just use the Forever block
pieces into just one that literally just says what it does, meow?
where I can choose a name for it and I'm going to call it Meow.
That's it.
I'm just going to drag this way down here, way down
out of mind, because now notice at top left there is a new pink puzzle
So at this point, I'd argue it doesn't really matter how Meow is implemented.
Now I have a brand new puzzle piece that just says what it is.
Why?
a year from now for the first time because you're sort of finally looking
The function itself has semantics, which conveys what's going on.
you could scroll down and start to tinker with the underlying
lets me call the Meow function, so to speak, use the Meow function
three times.
aka use the Meow function, and pass it in input that tells the puzzle
piece how many times I want it to meow?
Let me right click or Control click on the pink piece here and choose Edit,
or I could just start from scratch, no pun intended, with a new one.
Now here, rather than just give this thing a name Meow, let me go ahead
let me move the Play, Sound, and Wait, into the repeat block.
provides.
Let me connect this and now voila, I have an even fancier version of Meow
that is parameterized.
Now I'm going to scroll back up, because out of sight, out of mind,
reconnecting this new puzzle piece here that takes an input like 3 and voila,
now we're really programming, right?
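A custom block that takes an input corresponds to a function with a parameter. Here is a minimal Python sketch; the name `meow` and the parameter `n` follow the lecture's wording, while returning a list of sounds instead of playing audio is a simplification made for this example.

```python
from typing import List


def meow(n: int) -> List[str]:
    """Meow n times; the caller decides how many."""
    heard = []
    for _ in range(n):
        heard.append("meow")  # stands in for play sound, then wait
    return heard


# The program itself now reads like a description of what it does.
print(meow(3))
```

How `meow` works inside can change later without touching the line that uses it, which is the point of hiding the details out of sight.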
you can start to take a big assignment, a big problem set, something
for homework even, that feels kind of overwhelming at first, like, oh my God
But if you start to identify what are the subproblems of a bigger problem?
it's so easy to drag my feet and ugh, it's going to take forever to start,
and I start to modularize the program and say, all right, well
Meowing.
now you have a piece of the problem solved not unlike we did with the phone
book there, but in this case, we'll have presumably other problems to solve.
Let me go ahead and now when the green flag is clicked, let me go ahead
So if the cursor is touching the cat like here, something like this,
So I'm going to ask the question, when the green flag is clicked,
Grew to fill.
or like a program, if you will, let me go ahead then and choose a sound
if the cat is touching the mouse pointer then play sound meow.
Here we go.
Play.
[SILENCE]
Play.
[SILENCE]
Huh.
I'm worried it's not Scratch's fault. Feels like mine.
AUDIENCE: [INAUDIBLE]
DAVID MALAN: Yeah, the problem is the moment I click that green flag,
Scratch asks the question, is the cat touching the mouse pointer?
And obviously it's not because the cursor was like up there a moment ago
It's fine if I move the cursor down there, but too late.
The answer was no or false or zero, however you want to think about it,
go back up here, click the green flag, and now nothing's happened yet,
Oh.
So now.
[MEOW]
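The bug and its fix can be simulated without a mouse. In this Python sketch, the list `frames` is an invented stand-in for what Scratch would observe moment by moment: `True` means the pointer is touching the cat at that instant.

```python
from typing import List

# The pointer starts away from the cat and touches it twice later on.
frames = [False, False, True, False, True]


def check_once(frames: List[bool]) -> int:
    """Ask the question a single time, the instant the program starts."""
    meows = 0
    if frames[0]:
        meows += 1
    return meows


def check_forever(frames: List[bool]) -> int:
    """Keep asking the question; the for loop stands in for 'forever'."""
    meows = 0
    for touching in frames:
        if touching:
            meows += 1
    return meows


print(check_once(frames))     # 0: the answer was already "no" at the start
print(check_forever(frames))  # 2: each later touch is noticed
```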
So now we have this idea of taking these different ideas, these different puzzle
going to get a little inception thing here where the camera is picking up
Now that we have a non video backdrop, I'm going to say this.
going to say when the video motion is greater than some arbitrary
measurement of motion, I'm going to go ahead and play sound meow until done.
[MEOW]
OK.
[MEOW]
AUDIENCE: Aw.
[MEOW]
[LAUGHING]
(LAUGHING) Right?
It's completely creepy, but I'm not like exceeding the threshold--
[MEOW]
[MEOW TWICE]
[MEOW]
[LAUGHING]
I thought we'd play one of your former classmates' projects here up on stage.
Who's that?
Yeah.
Come on down.
AUDIENCE: Sahar.
[MEOW]
[APPLAUSE]
[MEOW]
It's going to use the camera focusing on your head, which will have
AUDIENCE: Yeah.
So let's line up your head with this red rectangle, if you could,
we'll do beginner.
[MUSIC PLAYING]
Sahar.
Give it a moment.
[DINGING]
[DING]
There we go, one point.
[DING]
One point.
[DINGING]
Nice.
15 seconds to go.
There we go.
Oh yeah.
One point.
[LAUGHING]
[DINGING]
Six seconds.
AUDIENCE: Oh no.
Quick!
[DINGING]
Thank you.
[APPLAUSE]
some new costumes or artwork, you can really bring programs to life.
But at the end of the day, the only puzzle pieces really involved
were ones like the ones I just dragged and dropped and a few more,
So the student probably created a few different sprites, not a single cat,
So you can imagine taking what looks like a pretty impressive project
to solve yourself, but just think about what are the basic building blocks?
I can click on it and drag and as I get closer and closer to the trash can notice
OSCAR THE GROUCH: (SINGING) If you really want to see something trashy--
OSCAR THE GROUCH: (SINGING) I have here a sneaker that's tattered and worn--
and just let them fall into the trash themselves if I want to.
Presumably there's some kind of variable that's keeping track of this score.
DAVID MALAN: OK, let's see what the last chorus here is.
Right?
I kind of had a vision of what I wanted.
of the trashcan, and some other stuff, but wow that's a lot
and then drag and drop it into the stage as a costume and boom, that's
version one.
frankly might be the easiest to pluck off next and the trash can.
for instance, the trash can version here that looks a little something now
like this.
That is to say, I just wanted to implement this idea of the can opening
So here, when I run this program by clicking Play, notice what happens.
If I went in now and added the lamp post and compose the program together,
Right?
let me implement one of the pieces of trash, not the shoe and the newspaper
all at once.
And again, all of these examples will be available on the course's website
and I also created, with Carter's help, a second sprite, this one a floor.
Now without seeing the code yet, just hearing that description,
why might I have wanted the second sprite and this black line for a floor
Yeah?
DAVID MALAN: Yeah, you don't want the first sprite to start at the top,
Or it would seem to maybe eat up more and more of the computer's memory
and you can't pull it back out and you can't fix the program.
up, down, left, right, I picked a random x location, either here or over here,
negative 240 to positive 240, and then a y value of 180, which is the top.
It's kind of lame pretty quickly if the trash always falls from the same spot.
Here's a little bit of randomness, like most any game would have,
So now if I click the green flag, you'll see that it just falls,
does stop when it touches the black line because notice what we did here.
I'm forever asking the question if the distance of the sprite, the trash,
So move it down 3 pixels, down 3 pixels, until the distance to the floor
but this felt like a nice, clean way that logically, just
OK, now I got some trash falling, I got a trash can that opens and closes,
I have a lamp post, now I'm a good three steps into the program.
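The falling-trash step can be sketched as a short simulation in Python. The starting coordinates and the 3-pixel step come from the description above; the floor's position (`FLOOR_Y`) and the simple "stop once at or below the floor" test are assumptions, since the exact distance check is not spelled out.

```python
import random
from typing import Tuple

TOP_Y = 180      # top of the stage, per the coordinates above
FLOOR_Y = -180   # assumed position of the floor sprite
STEP = 3         # pixels moved per iteration


def drop_trash(rng: random.Random) -> Tuple[int, int]:
    """Pick a random column at the top, then fall until reaching the floor."""
    x = rng.randint(-240, 240)  # anywhere from the left edge to the right edge
    y = TOP_Y
    while y > FLOOR_Y:          # stands in for the forever loop with an if
        y -= STEP               # move down 3 pixels
    return x, y


x, y = drop_trash(random.Random(0))
print("landed at", x, y)
```

Passing in a seeded `random.Random` makes a run repeatable, which helps when checking that the trash really does stop at the floor.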
like the dragging of the trash, let me go ahead and open up this version 2.
want the human to move it up, down, left, right and the human's
the green flag is clicked, we're forever asking this question or really
that you and I take for granted every day on Macs and PCs of clicking
For every icon, is the mouse down and is the icon touching the mouse?
but I bet I could just start to use just one single program.
into the can, Oscar popped out and told us the current score.
So this is the kind of thing that if you looked at first glance, like,
I have no idea how I would have implemented this
Oscar, the other sprite that we've now added to the program,
we'll see that orange is a variable, like an x or y but with a better name,
Well, let me show you what it's doing and then we can infer backwards.
All right, it's falling, I'm clicking and dragging it, I'm moving it over,
Even though the human perceives this as like a lot of trash falling
want the game to have that many levels, probably doing something wrong.
Reuse the code that you wrote, reuse the sprites that you wrote,
and that would give you not just correctness, but also a better design.
It's not a very fun game yet, but here's a little Harvard
And left, left, left, left, left, left, left, left, left, left, left, left,
AUDIENCE: [INAUDIBLE]
DAVID MALAN: Perfect, yeah.
There's probably some question being asked, if touching the black line,
is just literally a vertical black line we're probably asking a question like,
And if so, we just ignore the left or the right arrow at that point.
So that works.
OK, sure.
Let's go on.
AUDIENCE: [INAUDIBLE]
AUDIENCE: Presumably it's continually looking for you to hit the arrow keys
It's continually, forever listening for the arrow keys up, down, left, right,
If I zoom out here and take a look at the code that implements this,
especially if you, for instance, were poking around online as for problem set
until you start to wrap your mind around what's going on.
This is that program with the two black lines and the Harvard shield going up,
Go up.
Go down.
if I'm touching the left wall, change my x value by 1, sort of move away from it
a little bit.
Just in case it slightly went over, we keep the crest within those two walls.
to this game?
Well, let me go ahead to maybe this one here where the adversary in this game
This is like a maze and you're trying to get the Harvard shield from the bottom
Hypothesize.
Yeah, in back.
touching the left wall or the right wall, we somehow have it bounce.
And indeed we'll see there's a puzzle piece that can do exactly
and then it's forever doing this: if touching the left wall
want another adversary, a more advanced adversary down the road for instance,
we want the other sprite to not just bounce back and forth,
DAVID MALAN: Yeah, forever point at the location of the Harvard shield
Notice it's sort of twitching back and forth because it goes one
We haven't finished the game yet, but if we see inside, we'll see exactly that.
which is what we called the Harvard crest sprite, move one step.
Green flag.
It caught up to me.
Jesus, OK, so that's how you might then make your levels progressively
AUDIENCE: Celeste.
AUDIENCE: Celeste.
[APPLAUSE]
AUDIENCE: Celeste.
AUDIENCE: Celeste.
AUDIENCE: Yes.
The goal at hand is to initially move the Harvard crest to the sprite all
the way on the right so that you catch up to him in this case,
So if you're up for it, you can use just these arrow keys up, down, left, right.
[MUSIC PLAYING]
Feeling ready?
AUDIENCE: Yep.
AUDIENCE: Oh!
Keep going.
You got it like that and you know you want to dance.
So move out of your seat and get a fly girl and catch this beat.
[LAUGHING]
Hold on.
Pump a little bit and let them know what's going on like that, like that.
[APPLAUSE]
Give me a song or rhythm, making them sweat that's what give them.
[CHEERING]
They know.
Oh!
Yes!
Oh.
One more.
Yes!
[CHEERING]
This is it.
For a winner.
[CHEERING]
[APPLAUSE]
Congratulations.
[MUSIC PLAYING]
So this is CS50.
And this is week 1, the one in which you learn a new language, which
played with this graphical language known as Scratch before, which itself
But it's a language that underlies so many of today's more modern languages,
with a more traditional language like this, there's just so much distraction.
Last week I described all of the syntax, all of the weird punctuation
that you see in this, like the hash symbol, these angled brackets,
Well, today we're not going to reveal what all of those little particulars
mean.
But by next week, will this no longer look like the proverbial Greek
But to do that, we'll explore some of the very same topics as last week.
So recall that, via Scratch-- and presumably via problem set 0--
questions, loops, which let you do things again and again, variables,
So if you were comfortable on the heels of problem set 0 and last week,
realize that all of these topics are going to remain with us.
So really, today is just about acquiring all the more of a mental model for how
you translate those ideas into, presumably, a very cryptic new syntax--
But you need to be with these computer languages all the more precise
and ultimately we'll see to it that your code is successful along a few other lines
as well.
So if you think about the last time you kind of wandered around not really
might not have been that long ago, entering Harvard Yard for the very
where everything was, how Harvard or Yale, or anything else for that matter,
worked.
You sort of got by day to day by just focusing on those things that matter.
and try to wave our hands, so to speak, at details that, yeah, eventually
And along the way this term, we'll provide you with tools and techniques
so you don't have to just sit there sort of endlessly trying an input,
will help facilitate you answering that question for yourself, is my code
and you're probably not going to feel 100% comfortable with the first week,
the first weeks, is just how well designed your code is.
And we spend all these years in middle school, high school, presumably,
writing papers and other documents, getting grades and feedback on them
to wrap their mind around what your code is doing and, indeed, to be confident
if it is correct.
the next time you look at that code-- might have no idea
getting your algorithms efficient, getting your code nice and clean,
capitalization and the like-- the sort of way you write an essay
a few of these characteristics that are pretty easily taught and remembered.
But you just have to start to get in the habit of writing code in a certain way.
Yeah?
AUDIENCE: [INAUDIBLE].
need to hit your keyboard keys this many times with this cryptic syntax just
you can distill this same logic into literally one line of code.
there's nothing after this week and, really, next week that you shouldn't
Well, we're in the habit of typically writing things with, like, Microsoft
And yeah, I could open up Word or Google Docs or Pages or the like
But the problem, per last week, is that computers only understand or speak
AUDIENCE: [INAUDIBLE].
or the like.
Why?
Well, they come with features like bold facing and italics
and sort of fluffy, aesthetic stuff that has no functional impact on what
And it's a simple program-- here, for instance, a very popular one
created in advance before class a very simple empty file called "hello.c."
Why?
It's not .docx, which would mean in this file is a Microsoft Word document,
This is .c, which means in this file is going to be text in the language called
C. This number 1 here is just an automatic line number that's going to help
Well, let me go ahead and type out exactly the same code.
one of these curly braces and then a sibling there that closes the same.
And then I'm going to type not print, but printf, then "Hello, world," backslash n,
And I dare say this was essentially the very first program I wrote some
25 years ago.
In this environment that we're using and that many programmers-- dare
say most programmers-- use, you don't have immediately a nice, pretty icon
to double-click on.
you're going to want to type commands because it's just faster than pointing
Well, how do I go about doing this, and how am I going to get from this
so-called code--
to this, these zeros and ones that we'll now start calling machine code.
The zeros and ones from last week can be used not only
A compiler.
is source code.
can download onto their own Mac or PC and be on their way with whatever
By the end of the semester, we're going to get you out of the cloud,
so to speak, as best we can and get you onto your own Mac or PC,
so that after this class, especially if it's the only CS class you ever take,
you feel like you can continue programming in any number of languages,
to interact with a computer, wherever it may be-- on your lap, in your pocket,
This has its own operating system for me, its own hard drive,
that you might not have ever used or seen, but it's very popular,
called Linux.
Odds are almost all of us in this room are using Mac OS or Windows right now,
but we're all going to start using an operating system called Linux, which
line interface, but are used not just for programming, but for serving
And it's, indeed, a familiar and very powerful interface, as we'll see.
I can type, make hello, at this dollar sign prompt, go ahead and hit Enter,
if you don't see anything go wrong, that means everything went right.
before long.
./hello means run the program called "hello" in this current folder.
So ./hello, and then Enter, and voila, now I'm actually not just programming,
And this might look a little different based on your own configuration.
Even the color scheme I'm using might ultimately look different from yours,
And what's now worth noting is that now things are getting a little more
Like on the left-hand side, you have a GUI, a Graphical User Interface.
But on the bottom here, again, you have a CLI, Command Line Interface.
so it's the command line one with which we'll spend some time.
I could just hover up here, like in any software, and I could right-click,
Voila, it disappears.
you can forever just type ls for list and hit Enter.
But the inventors of this, this operating system and its predecessors,
Why?
Yeah?
AUDIENCE: [INAUDIBLE].
run it, and then maybe make other changes and try to rerun it,
I'm going to see some kind of error because I just deleted the file.
Now if I type ls, I'll see not one but two files again, and one of them
But now suppose I change it to "Hello, CS50," like I did years ago.
Let me go ahead and save the file with Command-S or Control-S. Down here now,
Huh.
Yeah?
AUDIENCE: [INAUDIBLE].
So sort of newbie mistake, you're going to make this mistake and many others
before long.
And we'll come back to all the crazy syntax I typed before long.
Yeah?
AUDIENCE: [INAUDIBLE].
So it keeps changing the hello program and the hello file, and that's it.
AUDIENCE: [INAUDIBLE].
If I open up that directory, you'll see that there's just the one.
We have to then convert my words to the computer's, the zeros and ones.
Yeah, in front.
AUDIENCE: [INAUDIBLE].
Permission denied.
to the people who designed the operating system, but it's a little cryptic.
but you do have permission to read or write it-- that is, change it.
AUDIENCE: [INAUDIBLE].
It's a program that knows how to find and use the compiler on the system
we left off last time in the context of Scratch and inputs and outputs.
Functions, again, are those actions and verbs like say, or ask, or the like.
We'll see, in all of the languages we're going to see this term,
This is what Hello, World looked like last week in the form of one function.
that it's kind of evoking that same idea with the parentheses.
The F stands for formatted, but we'll see what that means in a moment.
Whenever you have multiple words like this, this is known as a string
as we'll see.
Semicolon.
Now, what does this really fit into in terms of the mental model?
One type of output from a function can be something called a side effect.
like something appearing on the screen or a sound playing from your computer.
And indeed, last week we saw this in the context of passing in something
But sometimes, recall last week, we had functions like the ask block that
It was stored, instead, in that special variable that was called answer.
Because some functions have not side effects but return values.
They hand you back an output that you can use and reuse,
unlike the side effect, which, again displays and that's it.
or a sentence in programming.
and I'm even going to deliberately put a space there just to kind of move
the cursor a little bit over so that the human isn't typing literally
the analog here for get_string is that it, too, returns a value.
But it does so by copying the value on the right into the thing on the left.
You also have to tell the computer in advance what type of value
it is storing.
And there's even more than that we'll see today and beyond.
And this is partly an answer to the question that came up one or more times
last week, which was how does a computer distinguish this pattern of zeros
And I just claimed last week that it totally depends on the program.
on what the human programmer said the type of the value is.
which kind of figures out what you mean, with C and a lot of languages
Yeah.
AUDIENCE: [INAUDIBLE]
DAVID J. MALAN: And we still need the stupid semicolon.
a few weeks from now, until you start to notice this and recognize it
Yeah, question.
Good question.
it's not going to be nice and blissfully quiet and just give me another prompt.
looking error message until we get the muscle memory for reading it.
Other questions.
And it's going to take time to recognize and develop this muscle memory.
Everything I've typed here, except for the W at the moment, is lowercase.
and is hard to see at first, especially when a little S doesn't look that
let's go ahead and consider how we can go about implementing this now in code.
Rerun Make on Hello with the original version with the backslash n.
So let's try.
Still compile.
Yeah, the dollar sign, my so-called prompt, stayed on the same line.
Why?
Notice that the cursor will move to the next line in my terminal window.
But it'd be kind of stupid if when you run a program in this world,
weirdly spaced in the middle of the terminal with the dollar sign,
And notice that it's not acceptable or correct to do this, to hit enter there.
Let me go ahead and save that, though, and see what happens.
And this is where, again, you'll start to develop the instincts for just
on it's really just meant to be correct and precise with its errors.
your double quotes just have to be on the same line just because.
But the best way around it is to use this so-called escape sequence.
I could.
I deliberately want the human to type their name on the same line
just because.
And let me just go with this initially with the backslash n, semicolon.
So here, if I look at the very first line of output after the dollar sign--
So, I didn't.
But if I now retroactively say, all right, what does standard I/O
do for us up here.
were a few examples with the camera and with the speech to-- the text to voice.
was an extension for doing text to voice or for using your camera.
And so if you want to use any functions related to standard input and output,
like text from a keyboard, you have to include standard I/O dot
Get string, it turns out, is a function that CS50 wrote some time ago.
but the only way you can use that is to load the extension,
And we'll come back in time, like, why is it .h, why is it a hash symbol.
Now it worked.
So all those crazy error messages were resolved by just one fix,
of errors.
Then we changed it to hello and then the answer that the human typed in.
You tell the computer inside of your double quotes that you want to have
Then outside of your quotes, you just add a comma and then you type
for you.
./hello.
David.
It formats its input for you by using these placeholders for things like
and even zoom in here, how many inputs is printf taking as a function?
A moment ago, I'll admit that it was taking one input, "hello, world,"
quote unquote.
2.
And it's implied by this comma here, which is separating the first one,
AUDIENCE: [INAUDIBLE]
all of the somewhat pretty colors that have been popping up on the screen
here--
that's because a text editor like VS Code syntax highlights for you.
If your text editor understands the language that you're programming in--
C, in this case--
here right now, but that's just how it displays on the screen.
The %s is in blue.
And so it's just using different colors to make different things on the screen
But if I go back there and put it back, now everything's back in place.
And that's true for these curly braces over here on the left and the right.
If I put my cursor there, you can see that these things correspond
to one another.
And you can even see it, though it's a little subtle--
see these four dots here and these four dots here?
That's my indentation.
Any time I hit the Tab key, this too can help you make sure--
Phew.
Yeah.
%s is one.
You can have multiple i's, multiple s's, and even other symbols too.
printf can take many more arguments than just these two.
No.
and it's got to be done outside the context of printf in this case.
Yeah, in back.
AUDIENCE: [INAUDIBLE]
It's automatically done for you in our version of VS Code in the cloud.
If, ultimately, you program on your own Mac or PC, either initially or later
Yeah.
AUDIENCE: [INAUDIBLE]
DAVID J. MALAN: String is the type of the variable or, more properly,
AUDIENCE: [INAUDIBLE]
No.
let me go ahead and zoom out, save this, do make hello again.
And so, actually, let's go down this rabbit hole for just a moment.
Yeah?
agree that this seems harder to read because I start reading here,
So, yeah, it just feels like it was nicer to read top to bottom,
I would say.
Your thoughts?
AUDIENCE: [INAUDIBLE]
And so this is useful if I only want to print out the person's name once.
But in this case, eh, if you can make a reasonable argument one
So let's frame this one last example in the context of the same process
synonymous.
in this case.
If we look then at what we did last time in the world of Scratch last week,
the input was what's your name, the function was ask,
And now let's take a look at this block, which is honestly a more user-friendly
Last week we said say, then join, then hello and answer.
But the interesting takeaway there was not how to say hello anything.
like the green join, could become the input to another function,
which was with that whole program, which had the include
and it had int main(void) and all of this other cryptic syntax.
This Scratch piece last week was kind of like the go-to
You could listen for clicks or other things, not just the green flag.
But this was probably the most popular place to start a program in Scratch.
So just like last week, if you were in the habit of dragging and dropping
and then you can put all of your code inside of those curly braces.
tends to use these curly braces, one of them opened, the other one closed.
main.
Just have the one green flag clicked and then say hello, world.
In C, though, you technically can't just put int main(void) printf hello, world.
Because, again, you need to tell the compiler to load the library--
code that someone else wrote-- so that the compiler knows what printf even is.
But long story short, it's like a menu of all of the available functions.
Question.
AUDIENCE: [INAUDIBLE]
Yeah, the library would be standard I/O. The library would be CS50.
Indeed.
Other questions.
Yeah.
AUDIENCE: [INAUDIBLE]
take away, and we'll see why we've been using get_string and string.
Yeah.
AUDIENCE: [INAUDIBLE]
Early on, you will have to use whatever is prescribed by the specification.
Long story short, you referred, I think, a moment ago to another function
Long story short, in C, it's pretty easy and possible to get input from a user.
that gives you pretty much ultimate control over your computer's hardware.
Let me just give you a quick tour of some of the other placeholders and data
types that students will start seeing as we assemble more interesting programs.
of commands with which you'll get familiar over the next few weeks
We've only seen two of these so far, ls for list, rm for remove.
feel too foreign when you see them on screen or online in a problem set.
and re-open the little GUI on the left-hand side, the so-called Explorer,
I can confirm as much with my command line interface by typing what command?
But now let me go ahead and start doing the same thing from the command line.
it will delete everything you previously typed just to clean things up.
AUDIENCE: [INAUDIBLE]
It's not doing anything because there's nothing in there, but that's fine.
But suppose again I want to get more comfortable with my command line.
just to keep me sane so that I don't forget what folder I'm actually in.
I could go up here, I could click the little plus icon, and use the GUI.
Or I can just type code mario.c.
Voila.
I'm not going to write any code in here yet, but I am going to save the file.
Because, again, it's not providing you with any new information.
nothing you can't do at the command line that you could do with the GUI.
the Back button or something like that or just close it and start all over.
So if I hit Enter now, notice I'm going to close the Mario folder,
And one last little trick of the trade, if I'm in PSet1/Mario like I
was a moment ago, and you're just tired of all the navigation,
Recall a bit ago, though, that I was running hello as this, ./hello.
That allows you to wean yourself off of a GUI, Graphical User Interface,
Well, what about those other types, now back in the world of C?
Those commands were not C. Those are just commands specific to a command line
Back in the world of C now, we've seen strings, which are words.
a fancy way of saying a real number, something with a decimal point in it--
And if you want even more numbers after the decimal point that
that it will rely on to know what is this pattern of zeros and ones.
Is it a number, a letter?
These are the types of data types that provide exactly those hints.
What are the functions that come in the menu that is the CS50 library?
We talked about standard I/O, and that's just one function so far, printf.
The CS50 library exists largely for the first few weeks
of the class to make our lives easier when you just want to get user input.
So if you want to get a string, like a word or words from the human,
If you want to get an integer from the user, you're going to use get_int.
When you want to get any of those other data types, for the most part,
was a lot of math and calculations, so there's a lot of operators like these.
just for getting the remainder, when you divide one number by another.
There are other features in the world of C, like variables, as we've seen.
you literally write the word counter, or whatever you want it to be called.
You then use the assignment operator, a.k.a. the equals sign,
and you assign it whatever its initial value should be here on the right.
So, again, the 0 is going to get copied from right to left into the variable
Yeah, in front.
AUDIENCE: Semicolon.
Again.
AUDIENCE: A data type.
with a semicolon.
And this is where, again, it's important to note, the equal sign,
counter plus 1, whatever that is, then you update the value of counter
of a fancy way of saying the same thing with fewer words or fewer characters
on the screen.
This also adds 1, or whatever number you type over here,
And there's one other form of syntactic sugar we're going to start seeing too,
can do it by default with just plus plus or minus minus adding or subtracting 1.
Yeah.
AUDIENCE: [INAUDIBLE]
DAVID J. MALAN: Ah, so when you are changing a variable that already
has been created, as we did with the code that looked like this,
you no longer need to remind the computer what the data type is.
Many different ways I can create new files, but I want to create something
called a calculator.
I'm going to go ahead from memory and do the int main void--
more on that next week, why it's int, why it's void, and so forth.
And then let me just ask the user for whatever their x value is.
And then lastly, let me go ahead and just print out the sum of x and y.
AUDIENCE: [INAUDIBLE]
And now if I want to add x and y, for instance, a super-simple calculator,
But, again, if you focus on the basics, printf takes one input first--
No error messages.
Now I do ./calculator.
Voila.
Now I have the makings of a calculator.
still equals 2, and let me claim that it will work for other values as well--
OK, so this one is arguably better because I've now got a reusable
Counterthoughts?
AUDIENCE: [INAUDIBLE]
I did the same thing with get_string, that, yeah, maybe kind of crossed
a line because get_string and the what's your name inside of it,
it was just so much longer.
But x + y, eh, it's not that hard to wrap our mind around what's
So, again, these are the kinds of thoughts that hopefully you'll
Why?
to help my comprehension.
But that would be one other direction we could have taken things.
expressed with common variable names like x and y, totally fine here.
What if I want to annotate this program and remind myself what it is it does?
With a slash slash, two forward slashes, you can write a note to yourself,
Now, in this case, I'm not sure these comments are really
Because in the time it took me to write and eventually read these comments,
that, honestly, you might forget the next day, the next week,
the next month-- might be useful to have these notes to self that
Well, let me go ahead and rerun this again in this current version,
make calculator.
And here, too, you might think I'm typing crazy fast--
not really.
supports autocomplete.
you don't have to finish writing calculator, you can just hit Tab,
The other thing you can do is if you hit Up and keep going up,
you'll scroll through your entire history of commands.
by hitting Up quickly rather than retyping the same darn thing again
and again.
to make programming and interacting with the command line interface even faster.
All right, let me go ahead and just make sure it's compiled in the current form.
And, clearly, running 1 plus 1 was not the most robust testing of my code
here.
Yeah.
AUDIENCE: [INAUDIBLE]
So it turns out with these data types-- we've been talking about string and int
and also float and char and those other things-- they all use a specific,
and, most importantly, finite number of bits to represent them.
Newer computers use more bits, older computers tended to use fewer bits.
That's a lot.
This is 64 light bulbs on the stage and could count even higher.
An int is only using half of these, or we have two integers here on the stage.
Now, if you think back to last week, we talked about 8 bits at one point.
And if you have 8 bits, 8 zeros and ones, you can count as high as 256--
And if you want to support both positive and negative numbers, that technically
That's still 4 billion, give or take, but it's only half as many
I have to make one more change, which is to the data type itself.
long integer.
And then let me change the format code per the little cheat sheet we had up
with a long--
64 bits, which is as long as this stage now, that's still a finite value.
of fundamental limitations.
AUDIENCE: [INAUDIBLE]
Yes.
Good question.
All right, so how about we spice things up with maybe not just addition here,
and then these Boolean expressions here in green, maybe saying something
It's much cleaner from left to right than it was with printf and join.
that's effectively the role that these curly braces are playing.
that needs to be asked and answered to decide whether or not to do this thing.
And if we add in the printf's, it now looks quite like the same,
but it adds, of course, the word else and then a couple of more curly braces.
For best practice, though, do so anyway, because it makes super clear to you
that you intend for just that one or more line of code to execute.
and C. Scratch uses an equals sign for equality, to compare two values.
And if we add in the printf's, it looks a little something now like this.
But could someone make a claim that this is not, again, well-designed?
Exactly.
We need the else, at least, but we don't need the last if.
of your program running-- a blink of the eye-- by only asking two questions
now you're only going to ask two questions, so 2/3 as many questions,
it to be well-designed as well.
And let's do something with points, like points on my own very first CS50
problem set.
include stdio.h.
Then I'll ask a question in English like, how many points did you lose,
And then once I have this answer, let's now ask some questions of it.
I'm going to go ahead and print out you lost more points than me, backslash n.
else if-- wait a minute, else seems to be sufficient logically here.
Run points.
Even better.
And so forth.
So, again, we have the ability to express in C now a pretty basic idea
from last week in reality, which is this notion of conditionals and asking
questions.
There's something subtle here, though, that's maybe not super well-designed
There's a bit of redundancy unrelated to the if and the else and the else.
But is there something I typed twice just to ask, perhaps, for the obvious?
Exactly, I've hard-coded, so to speak, manually typed out the number 2--
which for better or for worse, is all the program can do.
But this is an example too of a magic number in the sense
like, wait, where did that 2 come from, and why is it in two places?
It feels like we are setting the stage for just a higher probability
Because the longer this code gets, suppose I'm comparing against 2 points
elsewhere--
2, 3, 4, 5 places--
It's correct.
and you're going to miss one of the 2's, you're going to change it to a 3,
This tells the compiler to make sure that even you later in your code cannot
And another convention in C and other languages, when you have a constant,
Yeah.
AUDIENCE: Why do you not use semicolons after line 9 and line 13?
DAVID J. MALAN: Yeah, why do you not use a semicolon in lines 9, 13?
Just because.
This is the way the language was designed.
For now, assume that semicolons usually finish your thought after a function.
That's not 100% reliable of a heuristic, but it'll get you most of the way
there.
Left hand was not talking to the right hand when some of these languages
were designed.
could I write a very simple program that does something basic like,
And let me go ahead and include cs50.h, include stdio.h, int main void--
But, for now, I'm going to go ahead and get a number n from the user
So here, I've sort of taken a bite out of the problem, if you will.
So if, question marks now, let me go ahead and fill in the blanks here.
Nice.
a remainder of 0 or 1.
And that's nice because the alternative would seem to be doing something stupid
your code would be infinitely long if you had to ask all possible questions.
It does numerator divided by denominator and returns not the result of that
What is another new piece of syntax, apparently, besides the percent sign?
Yeah.
Yeah.
AUDIENCE: [INAUDIBLE]
And then some number of minutes or days later, people are like, damn,
And if you think this is a little weird, in some languages, like JavaScript,
So make parity-- and, again, parity is just the name of my file, parity.c.
Let me go ahead now and let me start copying and pasting some of this code
I'm going to go ahead and say, how about get_char do you agree--
whatever the question might be-- and I want the human to type y or n
So how about if c == 'y', then let me go ahead and, inside of my curly braces,
else if c == 'n', n for no, then let me go ahead and print out not agreed,
And I'm just going to ignore the user if they don't cooperate
All right, let me go ahead now and compile this code, make agree, ./agree.
Yes.
How about my caps lock key is on or I'm just really yelling, capital Y?
It ignores me.
AUDIENCE: [INAUDIBLE]
Or you know what, even more simplistic based only on what we've seen before--
if you will, let me just copy and paste some of this code.
Because you're implying that I could just say something like OR c == 'Y'
The catch is, you can't use the word OR in C. It's actually two vertical bars.
Certainly, what the human typed can't both be lowercase and uppercase.
ourselves knowing a little bit about ASCII or Unicode from last week.
But, yes, that would be an alternative, but more on that a different time.
Other questions?
AUDIENCE: [INAUDIBLE]
even though that's kind of how you might think about it.
You have to ask a complete question using the equality sign twice
in this case.
Previously, we used double quotes for anything that looked like text.
Yeah.
or N-O. That's not supported at the moment-- more on that another time.
If it's a string--
But if you don't mind, let me kick the can a couple of weeks
The most pleasant way to do this would indeed be to do something like this.
if the user does something weird, like they capitalize just the Y?
it will vary.
but just stdio.h because I only want printf for this demo.
And then if I want the cat to meow three times, like it did last week,
Save it.
make meow, ./meow.
Voila.
It ran.
It compiled OK.
doing something wrong or, at best, just lazy of you, in this case.
there's not a forever keyword in C, so this one's a little weird to look at.
But that should rub you the wrong way, because like, why 2 versus 1?
I could also put the number 1 for true and the number 0 for false,
you kind of have to whip out that toolkit of all of the basic building
build a little machine in software that does something some number of times?
And just intuitively, even if you've never seen C code or any code
Yeah.
So I need code like I showed earlier, like counter equals counter plus 1.
You can't just say what you mean, like you could in Scratch.
This is correct, but here are some conventions that are popular.
Just call it i.
Do you recall?
Yeah.
Yeah, that syntactic sugar that does nothing new,
to make sure that i hasn't gotten so big that it's greater than 3.
But if not, it then does this again and it does this again.
We started last week with all the light bulbs off, which was 0.
On most keyboards, there's no symbol for less than or equal to or greater than
or equal to 4.
initialize it to 0 for something like this, and just generally count up,
The first thing before the semicolons initializes your variable, int i = 0.
And the last thing is going to be what you do after each loop, which
Then it might go ahead and increment i and check the condition again.
So, again, this does not read quite the same simple fashion top to bottom.
does the exact same thing as what we saw a moment ago in this while loop format.
a for loop once comfortable, but just because is really the answer there.
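To make the comparison concrete, here is a sketch of the same counting loop written both ways. The function names and the returned counts are assumptions added only so the two versions can be compared; the lecture's versions simply print.

```c
#include <stdio.h>

// while version: initialize, check, act, then update by hand.
int meow_while(int n)
{
    int i = 0;
    while (i < n)
    {
        printf("meow\n");
        i++;
    }
    return i; // number of lines printed when n is not negative
}

// for version: the same three pieces, gathered on one line.
int meow_for(int n)
{
    int printed = 0;
    for (int i = 0; i < n; i++)
    {
        printf("meow\n");
        printed++;
    }
    return printed;
}
```

Both functions print the same lines for the same n, which is the sense in which the for loop adds nothing new beyond a more compact layout.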
All right, any questions, then, on loops as we've translated them to C?
Yeah.
AUDIENCE: [INAUDIBLE]
technically means it's only going to exist in these four lines of code.
of the loop.
Good question.
Let me go ahead and now re-implement meowing with a for loop, for instance.
Then inside my curly braces, let me go ahead and print out with printf, meow,
So I did it pretty quickly just because I've long acquired the muscle memory.
Run ./meow.
but we'll explain over time what each of these keywords is doing.
because the authors of C did not create a function called meow decades ago--
And I'm going to explicitly say no by writing the special word void.
But for now, I'm just going to say that meow is the name of the function,
it takes no inputs--
meow's purpose in life is just to have side effects, visual side effects
And here's where too, if you really don't like the curly braces,
Darn.
Something stupid.
AUDIENCE: [INAUDIBLE]
OK, fixed.
My mistake.
But recall what I did in Scratch, kind of out of sight, out of mind.
And just to make a point, let me just highlight this and move it
And now, just by moving that function, I've created all these lines of errors.
but it says meow.c in bold-- which is the name of the file where the bug is--
Let's see.
Sorry.
This is the error we're now looking at, line 7.
I was looking at the old error message from earlier before I fixed the 0.
meow.c line 7.
All right, apparently, C does not know what the meow function is.
defined it yet.
which we generally use here, it's one of the more recent versions.
Can you infer from the mere fact that I just moved meow to the bottom
why is that?
AUDIENCE: [INAUDIBLE]
And if it does not know what meow is when you first try to use it,
So the solution is, quite simply, don't do that, just leave it where it was.
But you can imagine this getting a little annoying over time, if only
You can put functions in different orders with main at the top so long
as you-- and this is perhaps the only time copy paste is appropriate--
so long as you leave a little breadcrumb for the compiler
But the semicolon means I'm not going to deal with this yet.
So let me go ahead and make meow one more time, ./meow, still working OK.
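A minimal sketch of that breadcrumb, under a couple of assumptions: meow_three_times is a hypothetical caller standing in for main, and meow returns a count here (rather than nothing) only so the result can be checked.

```c
#include <stdio.h>

// Prototype: the first line of the function plus a semicolon.
// It tells the compiler that meow exists and will be defined later.
int meow(int n);

// This caller appears above meow's definition, which is fine
// because the prototype has already been seen.
int meow_three_times(void)
{
    return meow(3);
}

// The actual definition, lower down in the file.
int meow(int n)
{
    int printed = 0;
    for (int i = 0; i < n; i++)
    {
        printf("meow\n");
        printed++;
    }
    return printed;
}
```

Deleting the prototype line would bring back the kind of error shown earlier, since the compiler reads the file from top to bottom.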
And let me make one final enhancement to this meow program here.
So now if I tighten up main here, now I can actually do something really cool
Voila.
And, heck, what you could do now-- and this is just a tease about a feature
We, the staff, wrote a function called get_string, get_int, and so forth,
in cs50.h.
you are sort of secretly telling the compiler at the very top of your code
Why?
And stdio.h is the same lines of code for things like printf.
It's just a way of telling the computer in advance what functions to expect.
Correct.
Indeed, int main void is a little weird, because what would the input domain be?
but also returns some value, maybe an int, maybe a float, maybe something
else altogether?
Well, it turns out, in C, we can do that as well.
Let me go ahead and include our usual cs50.h followed by stdio.h at the top.
Let's go ahead and get a float from the user asking them
Then, next, let's go ahead and declare a second variable-- also a float--
by 0.85.
Of course, if we're taking off 15%, we multiply the regular price by 0.85.
places--
followed by a newline.
just does this arithmetic for us, simple though it may be.
It returns a value, namely, a float that represents what the sale price is.
And it's going to take one input, like the price that we want to discount.
I'm going to say float sale equals whatever that price is times 0.85.
returning with this keyword return, I actually don't even need that variable.
And I can actually just go ahead and get rid of that variable
altogether and immediately return whatever the arithmetic
result is of taking the price input, the argument that's being passed in,
times 0.85.
So very simple function that simply does the discounting for me.
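Written out, a function along those lines might look like the sketch below. It follows the description above: one float in, one float out, with no intermediate variable.

```c
// Takes a regular price and returns the price after 15% off.
float discount(float price)
{
    return price * 0.85;
}
```

A caller would store the result in its own variable, for example float sale = discount(regular);, and then print it.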
in our toolkit called discount that lets me discount the regular price
I'm still going to print out sale the variable in which I'm
return so that I can hand back that value, just like get_string hands
Let me go ahead now and recompile this code with make discount.
Now, it turns out that functions don't have to take just 0 or 1 argument
as input.
and take in as input to the discount function, not just the price
thereby allowing us to support not just 15% off but any number of percentage
points off.
Well, let me go up here and declare an int, say, and call it percent_off.
But I need to tell the computer that it is taking now two arguments,
And I'm now going to use that percentage in a slightly familiar way.
85.
And then I need to divide that by 100 in order now
to give myself 0.85 times the price that was passed in.
But if I go ahead now and save this, run, make discount one last time,
I haven't just done the math on the original variable that's being passed.
And we'll encounter this again before long, but this notion of scope
or is accessible.
the value explicitly so that ultimately I'm handing back the discounted price.
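A sketch of the two-argument version, assuming the parameter names price and percent_off as described above:

```c
// Returns price reduced by percent_off percent.
// With percent_off equal to 15, this computes price * 85 / 100.
float discount(float price, int percent_off)
{
    return price * (100 - percent_off) / 100;
}
```

Because price is a float, the multiplication and the division by 100 are done as floating-point arithmetic, so nothing after the decimal point is thrown away.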
All right.
look like this where there's some coins in the sky hidden behind these question
marks.
Like, not actual colors or fanciness, that feels like too much too soon--
and a newline.
just means graphics but really just implemented with your keyboard.
And if I make mario and do ./mario, it's not nearly as engaging visually
as this, but it's the beginning of this kind of map for a game.
I could do something like for int i gets 0, i less than 4, i plus plus.
And then inside here, I could just print out one of them at a time.
AUDIENCE: [INAUDIBLE]
So, logically, just like in Scratch, put it at the end of the loop, so something
out here.
And just print out, for instance, only, quote unquote, new line.
not repeating myself multiple times, I'm doing this again and again.
Let me go ahead and ask the user how many question marks or coins to print.
The catch here is that there's another type of loop that's helpful for this,
A do while loop just inverts the logic so that you can actually
And then I'm going to do, literally, the following with the keyword do.
n equals get_int-- and I'm going to ask the user for the width,
whoops.
a do while loop is helpful when you want to do something no matter what first
and then check some condition or some Boolean expression to see if maybe,
I'm going to break out of this loop, and I've got myself
And I can now use this, for instance, here, change the 4 to an n
And the difference here with the do while is if something like this
So you have to do something first, then check, and break out of the loop
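Here is a rough sketch of that do-first, check-afterward shape. To keep it self-contained, it does not call get_int; instead the attempts array is a hypothetical stand-in for what a user might type, one try after another.

```c
// Returns the first value in attempts that is at least 1.
// If none qualifies, the last attempt is returned.
// Assumes length is at least 1.
int first_positive(const int attempts[], int length)
{
    int i = 0;
    int n;
    do
    {
        // This body always runs at least once, before any check.
        n = attempts[i];
        i++;
    }
    while (n < 1 && i < length);
    return n;
}
```

With real input, the line inside the loop would be the prompt to the user, and the condition after while would decide whether to ask again.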
more like this in the same game, where you're underground as Mario,
And it's like, made of bricks, so I'll use maybe hash symbols this time.
So I'm just going to ask for the size of this square of bricks.
Run mario of 3.
and then maybe put the newline here, after the loop.
All right, so let's do make mario, ./mario, and type in 3 and huh.
How do I do this?
Yeah.
If you use one loop conceptually to kind of count the rows from top
Let me get rid of this line and get rid of this line for now.
Let me go ahead and print out just one of these things at a time.
And let me save and let me run this.
Make mario 3.
OK, three, that's clearly wrong, but I see nine things there on the screen.
So we're close.
What's the one fix I need now to move the old school typewriter head
And let me add some comments now to help everyone visualize what I've done.
For each row, for each column, how about print a brick--
are a little taller than they are wide, but that's just a font detail here.
Now I've done something that's quite more akin to something like this.
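The nested loops just described can be sketched as follows. The function name print_grid and the returned brick count are assumptions added for illustration.

```c
#include <stdio.h>

// Prints an n-by-n square of bricks and returns how many were printed.
int print_grid(int n)
{
    int bricks = 0;

    // For each row
    for (int i = 0; i < n; i++)
    {
        // For each column
        for (int j = 0; j < n; j++)
        {
            // Print a brick
            printf("#");
            bricks++;
        }

        // Move to the next row
        printf("\n");
    }
    return bricks;
}
```

The newline sits inside the outer loop but outside the inner one, so it is printed once per row rather than once per brick.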
All right, so let me pause here and see if there are any questions.
Yeah.
No.
But ask that same question again in a few weeks when we get to Python,
and the answer will be yes.
Other questions.
Yeah.
And if none of them are applicable, you write the word void.
But for now, just take on faith that you need to do that with main.
Yeah.
AUDIENCE: [INAUDIBLE]
let me go ahead and get an int from the user for the size of this thing.
which is a fancy way of saying a real number with a decimal point in it.
And down here, I'm going to go ahead and use %f for float.
so divide x by y.
2 divided by 3 is 0.66667.
It turns out that in C, you can modify the behavior of these format
You can more succinctly say 0.2 before the f and after the percent.
This is the kind of thing that's hard to remember, but you Google it,
0.67.
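As a small sketch of that format code, the helper below formats a quotient with two digits after the decimal point. The name format_quotient is hypothetical, and snprintf is used in place of printf only so that the resulting text lands in a buffer where it can be inspected.

```c
#include <stdio.h>

// Writes x / y into buf using %.2f, meaning two decimal places.
void format_quotient(char *buf, size_t size, float x, float y)
{
    snprintf(buf, size, "%.2f", x / y);
}
```

In a program that just prints, the equivalent would be printf("%.2f\n", x / y);.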
Pretty sure it's supposed to be a 0.6 with a line over it, right?
Integers can overflow if you're trying to use more bits than you actually
You sort of change them all to ones, and then you're out of bits, so to speak.
In the world, which is to say a computer with finite memory cannot possibly
of permutations of 32 or 64 bits.
floating-point imprecision.
there are solutions to this problem that just give you more digits.
And instead of using floats for x and y, again, you say integer, so int x and y.
make calculator, ./calculator, and let's do, say, 2 for the numerator,
And it's not 0.666, and it's not even rounding oddly.
So why is that?
Only the integral part to the left of the decimal point does.
Everything at and beyond the decimal point itself gets thrown away,
in just the 0 from what should have been 0.666666 and so forth.
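A short sketch of the truncation, plus one common way around it when the inputs must stay ints: converting one operand to a float before dividing. The function names are assumptions for illustration.

```c
// Integer division: the fractional part is discarded, so 2 / 3 is 0.
int int_quotient(int x, int y)
{
    return x / y;
}

// Casting x to float first makes the whole division floating point.
float float_quotient(int x, int y)
{
    return (float) x / y;
}
```

The cast applies to x before the division happens, which is why the second version keeps the digits after the decimal point.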
Well, certainly, we could just use floats from the get-go, as I did.
But if, by nature of your program, you only have access to integers--
or maybe even longs, for which the same problem would occur--
Suppose that we think back to last week when we had three bits,
and we counted from 0 to 7, 0, 1, 2, 3, 4, 5, 6, 7.
If you don't, though, the next number after this is technically 1000.
But if you don't have space for or hardware for that fourth bit,
Why?
presumably, were going to update their clocks from 1999 to the year 2000.
Why?
The problem, of course, on January 1st of 2000 is that 99 rolls over to 100.
But if you don't have room for another digit it's just 00.
And if your code assumes a prefix of 19, well, we just went from the year
The next time the world might end though, is on January 19, 2038.
AUDIENCE: [INAUDIBLE]
So it turns out that the way computers generally keep track of time
is they count the total number of seconds since the epoch, which
Why?
or negative 2 billion--
carrying the 1, carrying the 1, 1 second after 2 billion seconds, give or take,
whereby if that first bit is negative, that represents that the rest of it
been negative 2 billion seconds since January 1, 1970, which is going to make
More bits.
just kick the can further and just give yourself more bits.
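To put a number on it, here is a sketch that checks whether a count of seconds still fits in a signed 32-bit integer. The helper name is an assumption; the limits come from the standard header stdint.h.

```c
#include <stdbool.h>
#include <stdint.h>

// Returns true if seconds can be stored in a signed 32-bit integer.
// INT32_MAX is 2147483647, roughly the 2 billion mentioned above.
bool fits_in_int32(int64_t seconds)
{
    return seconds >= INT32_MIN && seconds <= INT32_MAX;
}
```

One second past INT32_MAX no longer fits in 32 bits, but a 64-bit counter holds it with room to spare, which is the "more bits" fix.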
In fact, let me go ahead and write a real quick program here called pennies.
And I'm going to include stdio.h, int main void as my starting point.
going to ask the user for some amount of dollars, so a dollar amount,
Then I'm going to simply convert that amount to pennies by doing, say, how
And then I'm going to go ahead and print out that the number of pennies is %i--
All right, so if I didn't make any mistakes here, let me make pennies,
./pennies.
That's 99 pennies.
Huh.
There's that imprecision issue.
Now, not a big deal if the cashier gives you one penny less than you're owed,
if I print it out using the %f with a 0.50 or whatever to see more decimal
points--
And it turns out that it's in a library called the math library.
And you would know this by looking at online documentation and the like,
And if I now make pennies again and do ./pennies, I can now do $4.20.
And, voila.
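The sketch below contrasts truncating with rounding when converting dollars to pennies. One assumption to flag: instead of the math library's round function used in the lecture, it adds 0.5 before truncating, a swapped-in technique that works for non-negative amounts and avoids needing the math library.

```c
// Truncates: whatever follows the decimal point is simply dropped.
int truncated_pennies(float dollars)
{
    return dollars * 100;
}

// Rounds to the nearest penny for non-negative amounts.
int rounded_pennies(float dollars)
{
    return (int) (dollars * 100 + 0.5f);
}
```

Since 4.20 is stored as a float slightly below 4.20, multiplying by 100 lands just under 420, and truncation then loses a penny.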
And in a class like this, the goal is not just to teach you programming
but to really teach you what's going on underneath the hood, so to speak,
so that you are not on the failing end of some program having some bug.
too.
Why?
Because this Boeing airplane software was using a 32-bit integer counting up
And, unfortunately, after 248 days of the airplane being continuously on--
uncommon to make every dollar count, keeping the planes up and running
sort of operating system style, well, have you rebooted your plane?
And that was indeed the fix until they rolled out an actual software patch.
And the more hardware we carry around and the more we as a society
[MUSIC PLAYING]
Now that you have some programming experience under your belts,
in this more arcane language called C.
Among our goals today is to help you understand exactly what you have
Wrestling with your first programs in C, so that you have more of a bottom
But at the end of the day, computers, your Mac, your PC,
What's the format into which we need to get any program that we write, just
to recap?
AUDIENCE: [INAUDIBLE]
Right?
And up until now, we've been using this command called make,
Make hello looks in your current directory or folder for a file called
hello.c, implicitly, and then it compiles that into a file called hello,
But make is this utility that comes on a lot of systems that makes it easier
And, so here, for instance, in VS Code, is that very first program again,
And we'll see in a moment why we've been automating the process with make.
And if I do ls for list, you'll see there is not a file called hello.
And this is, just, the default file name for a program
after the name of a program that just influences its behavior in some way.
you can actually pass a -o, for output, command line argument, that
And then you go ahead and type the name of the file that you actually
Now we still have the old a.out, because I didn't delete it yet.
I could, of course, resort to using the Explorer, on the left hand side.
anyone recall?
AUDIENCE: rm.
to do the second version we ever did, which was to also include cs50.h,
so that we have access to functions like get_string, and the like.
question mark.
And now, let me go ahead and say hello to that name with our %s placeholder,
comma, name.
that very easily compiled with make hello, but notice the difference now.
clang -o hello, just so I get a better name for the program, hello.c, Enter.
And a new error pops up that some of you might have encountered on your own.
So it's a bit arcane here, and there's this mention of a cryptic-looking path
Even though we might not have seen this language last time in class,
Now, your first instinct might be, well maybe I forgot cs50.h, but of course,
I didn't.
But it turns out, make is doing something else for us, all this time.
Just putting cs50.h, or any header file at the top of your code,
for that matter, just teaches the compiler that a function will exist.
It, sort of, asks the compiler to-- it asks the compiler
But this error here, some kind of linker command, relates to the fact
finding the 0s and 1s that cs50 compiled long ago for you.
That authors of this operating system compiled for you, long ago,
the actual machine code that someone else wrote and then compiled.
a file called hello, and you want to compile a file called hello.c,
And now if I run ./hello, it works as it did last week, just like that.
But honestly, this is just going to get really tedious, really quickly.
If you wanted to use the math library, like, to use that round function,
That is the program that converts from source code to machine code.
But we'll continue to use make because it just automates that process.
cryptic the more sophisticated and more featureful your programs get.
And make, again, is just a tool that makes all that happen.
Yeah, in front.
AUDIENCE: Can you explain again what the -lcs50-- just why you put that?
AUDIENCE: [INAUDIBLE].
And other libraries, even though they might come with the language C,
Other questions?
Yeah?
AUDIENCE: [INAUDIBLE]
recall was clang -o hello, then the name of the file, then -lcs50.
And this is where these commands do get and stay fairly arcane.
that you'll start to remember, oh what are the other commands that you--
what are the command line arguments you can provide to programs?
Technically, when you run make hello, the program is called make,
typed at the prompt, that tells make what you want to make.
So to come back to the first question about what actually is happening there,
how much automation is going on, so that when you run these commands,
it's not magic.
And this is not four distinct things that you need to memorize and remember
as to how we're getting from source code, like C, into 0s and 1s.
It turns out, that when you compile, quote-unquote, "your code," technically
Just humans decided, let's just call the whole process compiling.
If we look at our source code, version 2 that uses the cs50 library
and therefore get_string, notice that we have these include lines at the top.
Fancy way of saying they're handled special versus the rest of your code.
What was the one salient thing that I said was in cs50.h and therefore,
but recall that any time we've made our own functions,
just to teach the compiler that this function doesn't exist, yet,
that you all are accessing via VS Code, there's a line that looks like this.
A prototype for the get_string function that says the name of the functions
get_string, not surprisingly, has a return value and it returns a string.
you can just trust that cs50 figured out what it is.
Yeah?
AUDIENCE: Printf.
because, as you might have noticed, printf can take one argument, just
It's not quite as simple a prototype as get_string, but more on that
another time.
notices, oh, here is hash include, oh, here's another hash include.
And it, essentially, finds those files on the hard drive, cs50.h, stdio.h,
OK.
But this is about as low level as you can get to what a computer really
I couldn't tell you what this is doing unless I thought it through carefully
feeds into the brains of the computer, the CPU, the central processing unit.
that understands this instruction, and this one, and this one, and this one.
Which finds and replaces the hash includes symbols, among others,
Indeed.
this is why you can't take a program that was sold for a Windows computer
Because the commands, the instructions that those two products understand,
Now Microsoft, or any company, could generally write code in one language,
It's twice as much work and sometimes you get into some incompatibilities,
You can now use the same code and support even different platforms,
or systems, if you'd want.
All right.
Assembly, assembling.
happening for you every time you run make or, in turn, clang,
for you from your source code, is turned into 0s and 1s.
when you compile your code, you convert it to source code-- from source code
to machine code.
All right.
So that's assembling.
and then plugging it into printf, I'm using three different people's code,
if you will.
hello.c, cs50.c, and by that logic, what might the other be?
Yeah?
AUDIENCE: stdio?
And that's a bit of a white lie, because that's such a big, fancy library
that there's actually multiple files that compose it, but the same idea,
I get those 0s and 1s that end up taking hello.c and turning it, effectively,
into 0s and 1s that are combined with cs50.c, followed by stdio.c as well.
Here might be the 0s and 1s for my code, the two lines of code that I wrote.
Here might be the 0s and 1s for what cs50 wrote some years ago in cs50.c.
Here might be the 0s and 1s that someone wrote for standard I/O decades ago.
essentially stitches them together into one single file called hello,
That last step is what combines all of these different programmers' 0s and 1s.
that are happening that humans have developed over the years,
over the decades, that break down this big problem of source code going
Questions?
Or confusion?
Yeah?
AUDIENCE: Can you explain again what a.out signifies?
a.out is just the conventional, default file name for any program
It stands for assembler output, and assembler might now sound familiar
Yeah?
AUDIENCE: [INAUDIBLE]
but they are there, just in case you might want them.
Yeah?
AUDIENCE: [INAUDIBLE]
DAVID MALAN: Does it matter what order we're telling the computer to run?
It's going to-- make is going to take care of automating that process for us.
All right.
more this week because, invariably, this past week you ran against--
Though your programs are going to get more featureful, more sophisticated,
You don't have to stare at your code, or shake your fist at your code.
So what are some of the techniques and tools that folks use?
If you've ever heard this tale, some 50 plus years ago, in 1947.
at the time, a very sophisticated system known as the Harvard Mark II computer,
for instance, when we were trying to print out various aspects of Mario,
Let me switch back over to VS Code here, and I'm going to run--
write a program.
All right, saving the file, doing make, buggy, Enter, it compiles.
But some of you have probably seen the logical error already,
this picture, which was 3 bricks high, I seem to have 4 bricks instead.
don't have to find an actual bug, we can use a tool to find one that we already
And the simplest way to do this now, and years from now,
Let me do this.
Right?
This is not the program I want to write, it's the program I'm temporarily
And now, once you've figured it out, oh, so this should probably be less than 3,
Now, I can delete my temporary print statement, rerun make buggy, ./buggy.
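A sketch of that debugging workflow, assuming the bug was an off-by-one in the loop condition as the discussion suggests. The function names and returned counts are assumptions for illustration.

```c
#include <stdio.h>

// Buggy: i <= height runs one time too many.
// The first printf is the temporary statement added to inspect i.
int print_column_buggy(int height)
{
    int printed = 0;
    for (int i = 0; i <= height; i++)
    {
        printf("i is %i\n", i); // temporary, remove once the bug is found
        printf("#\n");
        printed++;
    }
    return printed;
}

// Fixed: i < height, with the temporary printf removed.
int print_column_fixed(int height)
{
    int printed = 0;
    for (int i = 0; i < height; i++)
    {
        printf("#\n");
        printed++;
    }
    return printed;
}
```

Seeing i printed as 0, 1, 2, and 3 makes it plain that the loop body runs four times rather than three.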
just to poke around and see what's inside the computer's memory.
known as a debugger.
And the way you run this debugger is you say debug50, space, and then
Because every program we've written thus far, runs from start to finish,
will let me walk through my code one step at a time, one second at a time,
And in just a moment, you'll see that a new panel opens on the left hand side.
Let me zoom out a little bit here so we can see more on the screen at once.
And sometimes, you'll see in VS Code that debug console opens up,
which looks very cryptic, just go back to terminal window if that happens.
Because the terminal window is where you can still interact with your code.
notice that we have the same program as before, but highlighted in yellow
is line 5.
The little red dot means break here, pause execution here.
says Step Over, there's another that's going to say Step Into,
And I'm going to do this, and you'll see that the yellow highlight
But the most powerful thing here, notice, is that top left here.
going on that will make more sense over time, but at the top
But look in the terminal window, one of the hashes has printed.
It's still 0 because the yellow highlighted line hasn't yet executed.
But the moment I click Step Over, it's going to execute line 5.
Now, notice at top left, i has become 1, and nothing has printed, yet,
I can see my variables changing, I can see output changing on the screen,
and I can just think about should that have just happened.
I can pause and give thought to what's actually
going on without trying to race the computer and figure it all out at once.
know what this particular problem is, and that brings me back
to step through your code step-by-step, at your own pace to figure out
Printf is great, but it gets annoying if you have to constantly add print this,
print this, print this, print this, recompile, rerun it, oh wait a minute,
Questions on this debugger, which you'll see all the more hands-on over time?
Questions on debugger?
Yeah?
We'll see this before long, but those other buttons that I glossed over,
step into and step out of, actually let you step into specific functions
stepping over the entire execution of that function, I could step into it
working on that has multiple functions, you can set a breakpoint in main,
if you want, or you can set it inside of one of your additional functions
to focus your attention only on that.
or a roommate who actually wants to hear you talk about code, of all things,
is that simply by looking at your code and talking it through, OK, on line 3,
finds you having the proverbial light bulb go off over your head,
And this is really just a proxy for any other human, teaching fellow, teacher
or friend, colleague.
One of these little, rubber ducks and consider using it, for real, any time
Yeah?
AUDIENCE: [INAUDIBLE]
DAVID MALAN: What's the difference between Step Over and Step Into?
At the moment, the only one that's applicable to the code I just wrote
If, though, I had other functions that I had written in this program,
maybe lower down in the file, I could step into those function calls
And we're going to write one other thing that's buggy, as well.
But I'm going to assume, for the sake of discussion, that it does.
Then, I'm just going to print out, with %i and a new line,
whatever the human typed in.
So suppose, in this case, that I declare a function called get_negative_int.
Its return type, so to speak, should be int, because, as its name suggests,
Assign n the value of get_int, asking the user for a negative integer using
whatever the user has typed in, so long as they cooperated and gave me
but let me compile this program after copying the prototype up to the top,
Uh-huh.
No.
How about 0?
All right.
I could use my printf technique and say something explicit like n is %i,
On ./buggy.
It's going to highlight the line that I set the breakpoint on.
yet.
But let me, now, Step Over this line that's highlighted in yellow,
and you'll see that I'm being prompted.
All right.
My problem doesn't seem to be in main, per se, maybe it's down here.
So that's fine.
But this time, instead of just stepping over that line, let's step into it.
Now, I can step through these lines of code, again and again.
Hopefully, once I've solved the issue, I can exit the debugger, fix my code,
So Step Over just goes over the line, but executes it,
And when we come back, we'll take a look at that computer's memory
we've been talking about.
All right.
Up until now, both by way of week 1 and problem set 1, for the most part,
we've just translated from Scratch into C all of these basic building blocks,
But there are features in C that we've already stumbled across,
like data types, the types of variables, that don't exist in Scratch,
So to summarize the types we saw last week, recall this little list here.
We had ints, and floats, and longs, and doubles, and chars,
there's also bools and also string, which we've seen a few times.
But today, let's actually start to formalize what these things are,
and actually what your Mac and PC are doing when you manipulate bits
And see if we can't put more tools into your toolkit, so to speak,
programs in C.
And this is why ASCII, which uses 1 byte, or technically, only 7 bits early
But the things that we call long, it's, literally, twice as long,
8 bytes or 64 bits.
So is a double.
Let's consider the fact that, even though we don't have to care, exactly,
how this kind of thing is made, if this is, like, 1 gigabyte of memory,
Then, maybe, way down here in the bottom right corner is byte number 1 billion.
and just think of memory as taking up-- or, rather, just think of data
So, for instance, if you were to store a char in a computer's memory, which
If you were to store something like an integer that uses 4 bytes, well,
If you were to store a long or a double, you might, actually, need 8 bytes.
how much memory a given variable of some data type would take up.
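One way to see these sizes on a particular machine is C's sizeof operator, as in the sketch below. The helper name print_sizes is an assumption, and the exact numbers can differ between systems; the sizes discussed above are the common case.

```c
#include <stdio.h>

// Prints how many bytes each type occupies on the current system.
// %zu is the format code for the value that sizeof produces.
void print_sizes(void)
{
    printf("char:   %zu\n", sizeof(char));
    printf("int:    %zu\n", sizeof(int));
    printf("float:  %zu\n", sizeof(float));
    printf("long:   %zu\n", sizeof(long));
    printf("double: %zu\n", sizeof(double));
}
```

Only the size of a char is fixed by the language, at 1 byte; the others are decided by the compiler and the system it targets.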
Well, from here, let's abstract away from all of the hardware
Or, really, like a canvas that we can paint any types of data
At the end of the day, all of this data is just going to be 0s and 1s.
Suppose that your three scores were these, 72, 73, not too bad, and 33,
Let's write a program that does this kind of averaging for us.
Now, I'm going to use printf to print out the average of those things,
But I'm going to print out %f, and I'm going to do score 1, plus score 2,
if I'm curious what my average grade is in the class with these three
assessments.
Format specifies type double, but the argument has type int, well,
Yeah?
AUDIENCE: So the computer is doing the math, but they basically [INAUDIBLE]
but, indeed, what's happening here is I'm adding three ints together,
is, recall that C when it performs math, treats all of these things as integers.
without throwing away the remainder, everything after the decimal point,
to a float.
Because it turns out, long story short, in C, so long as one of the values
So let me, now, recompile this code with make scores, Enter.
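A sketch of the difference, using hypothetical function names. Dividing by 3 keeps everything an int, while dividing by 3.0 makes the division floating point.

```c
// All operands are ints, so the remainder is thrown away.
int truncated_average(int score1, int score2, int score3)
{
    return (score1 + score2 + score3) / 3;
}

// 3.0 is a floating-point value, so the division keeps the fraction.
float average(int score1, int score2, int score3)
{
    return (score1 + score2 + score3) / 3.0;
}
```

With the scores 72, 73, and 33, the sum is 178, so the first version gives 59 and the second gives roughly 59.33.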
All right.
What is, kind of, bad about it, or if we maintain this program longer term,
Yeah?
AUDIENCE: [INAUDIBLE]
DAVID MALAN: Yeah, so in this case, I have hard coded my three scores.
Yeah?
haven't stored the average in some variable, which in this program, not
that's a problem.
maybe I have 4, or 5, or 6.
Even though the variables, yes, have different names; score 1, score 2,
score 3.
AUDIENCE: [INAUDIBLE]
32 bits.
I'm just keeping the pictures clean for today, from the top-left on down.
really being stored in the computer's memory, are patterns of 0s and 1s.
But there might be a better way to store, not just three of these things,
An array is another type of data that allows you to store multiple values
So an array can let you create memory for one int, or two, or three,
all using the same variable name, the same one name.
So for instance, if, for one program, I only need three integers,
Why?
The square brackets tell the computer how many ints you want.
In this case, 3.
I can say, scores bracket 0 equals 72, scores bracket 1 equals 73,
is going to be 33.
is going to be an int.
Now, down here, this code needs to change because I no longer have
three variables, score 1, 2, and 3.
I'm going to, here, then, do scores bracket 0, plus scores bracket 1,
But notice, I'm using the same variable name, every time.
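Put together, the array version might look like this sketch. Wrapping it in a function named average_from_array is an assumption made so the result can be returned and checked.

```c
// Stores three scores in one array and returns their average.
float average_from_array(void)
{
    // Room for 3 ints under a single name
    int scores[3];

    // Locations are numbered from 0, not from 1
    scores[0] = 72;
    scores[1] = 73;
    scores[2] = 33;

    return (scores[0] + scores[1] + scores[2]) / 3.0;
}
```

The last valid location is scores[2], since an array of size 3 is numbered 0 through 2.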
And again, I'm using this new square bracket notation to, quote-unquote,
index into the array to get at the first int, the second int, and the third,
Now, this program, still not really solving all the problems we describe,
I still can only store three scores, but we'll come back
But for now, we're just introducing a new syntax and a new feature,
Let's, then, use get_int to ask the user for another score.
Let's use get_int to ask the user for a third score,
And, now, if I go ahead and save this program, recompile scores, huh.
AUDIENCE: cs50.h
That was not intentional, so still making mistakes all these years later.
OK.
So maybe, this year was better and I got a 100, and a 99, and a 98, and there,
my average is 99.0000.
But now, I've introduced another, sort of, symptom of bad programming.
There's this expression in programming, too, called code smell, where like--
And there's something off here in that I could do better with this code.
Does anyone see an opportunity to improve the design of this code, here,
if my goal, still, is to get three scores from the user but [SNIFF SNIFF]
Yeah?
That way you don't have to copy and paste all of those scores.
And so this program, ultimately, is going to work, pretty much, the same.
Make scores, ./scores, and 100, 99, 98, and we're back to the same answer.
The fact that I have indeed, this magic number three, that really
Let me say something like, int n gets get_int, how many scores question mark,
Ask the human how many tests have there been this semester?
Then, you can type in each of those scores
And then you'll get the average of one test, two test, three--
If you have score 1, score 2, score 3, dot, dot, dot, score 99,
that you could collapse into one variable that has 99 locations.
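Once the scores live in one array, a single loop can handle any number of them. The sketch below assumes a helper named average that is told how many scores there are, since an array in C does not carry its own length.

```c
// Returns the average of the first length values in scores.
// Assumes length is at least 1.
float average(int length, const int scores[])
{
    int sum = 0;
    for (int i = 0; i < length; i++)
    {
        sum += scores[i];
    }
    return sum / (float) length;
}
```

The same function works for 3 scores or 99, which is what removes both the copy and paste and the magic number.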
AUDIENCE: [INAUDIBLE]
And that list we had on the screen, earlier, is not all of them.
you could, technically, use char, in some form or other data types as well.
is going to get a test score that's 2 billion, or more, because int is just,
kind of, the go-to.
Yeah?
Could it-- when you're doing a cash problem on the problem set--
The problem with the scenario I created a moment ago was printf was involved.
And I was telling printf to use a %f, but I was giving printf the result
I'm guessing in the scenario you're describing, for something like cash,
All right.
What, then, might we do that's more interesting than just storing numbers
in memory?
As opposed to just storing 72, 73, 33 or 100, 99, 98, at these given locations,
because again, an array gives you one variable name, but multiple locations,
I could, of course, store those in three variables, like c1, c2, c3.
And let's, for the sake of discussion, let's whip this up real quickly.
int main(void).
And then, inside of main, I'm going to, simply, create three variables.
And we've, generally, dealt with strings, which was easier last week.
But %c, %c, %c, will let me print out three chars, and like c1, c2, and c3.
But let's poke around at what's going on underneath the hood, here.
And let me add some spaces so there are gaps between each of them.
Any guesses?
Yeah?
hi, because it should be, hopefully, the old friends, 72, 73, and 33.
There we go.
take whatever variable comes after this, c1, c2, or c3 and convert it to an int.
The effect is going to be no different, make hi, and then rerunning whoops--
then running ./hi still works the same, but now I'm explicitly converting chars
to ints.
And we can do this all day long, chars to ints, floats to ints,
ints to floats.
just to demonstrate that we can, indeed, see the values underneath the hood.
All right.
and then let's do a simple printf with %s, printing out s's there.
Oh, and let me include the CS50 library here, more on that before long.
Like, I've just done the same thing multiple, different ways.
have three different variables, c1, c2, c3, representing H-I exclamation point,
or you can just treat them all together like this h, i, exclamation point.
But it turns out that strings are actually
Yeah?
DAVID MALAN: Yeah, a string might be, and indeed is, just
an array of characters.
It's the same 72, 73, 33, but now, I'm sort of, hopefully,
can index into them using this new square bracket notation.
think of a string as a type, I'm just going to use one big box of size 3.
But let me ask this question now, if this, at the end of the day,
and the ability, like a canvas to draw 0s and 1s, or numbers, or characters,
is what your Mac, and PC, and phone ultimately reduced to.
B-Y-E. And then the next thing I type might go here, here, here and so forth.
But then how does the computer know if, potentially, B-Y-E exclamation point
is right after H-I exclamation point where one string ends and the next one
begins?
Right?
and for those who don't know, what does that mean?
and where the next one begins, we just need some special symbol.
0, 0, 0, 0, 0, 0, 0, 0.
like you could do by doing out the math or doing the conversion,
like we've done in code, you would see for storing hi, 72, 73, 33,
but then 1 extra byte that's sort of invisibly there, but that is all 0s.
And now I've just written it as the decimal number 0.
Whatever the length of the string is, plus 1 for this special sentinel value
If humans, at the end of the day, just have this canvas of memory,
a lot easier with ints, it's even easier with floats, why?
This is a bit dangerous, but I'm going to start looking at memory locations
There it is.
let's get rid of just this one word and let's have two.
for instance, just this common convention with bye exclamation point.
And let me also print out with %s, whoops, printf, print out t, as well.
As I was missing.
Make hi.
Make hi.
OK, voila.
And it's wrapping around, but that's just an artist's rendition, here.
how many characters there are so you know how many characters are
going to be there.
we would indeed have to know in advance how many chars you want for a given
string that you're storing, how, then, does something like get string work,
It turns out, two weeks from now we'll see that get string
And it's going to grow or shrink the array automatically for you.
Other questions?
Yes.
But I claim there's really no other way to distinguish the end of one string
from the start of another, unless we make some sort of notation in memory.
All we have, at the end of the day, inside of a computer, are bits.
Yeah?
AUDIENCE: How does our memory device know to enter a line when you type
how does the computer know to move to a next line when you have a \n?
And you can see that, for instance, on the ASCII chart from the other day.
If I had put a \n in my code here, right after the exclamation point here
and here, that would actually shift everything in memory because we would
need to make room for a \n here and another one over here.
Other questions?
too as 72, 73, 33, if we are to write those numbers in the string,
and convert them into binary how would the computer know what's 72
and what's 8?
So if, at the end of the day, all we're storing is these numbers,
And I simplified this story in week 0 saying that Photoshop interprets them
and float.
a data type via which you can represent a color as a triple of numbers,
Yeah?
AUDIENCE: It seems easy enough to just add a nul thing at the end of the word,
Why could we not just make all data types variable in size?
that if you know every int is 4 bytes, you can very quickly,
and we'll see this next week, jump from integer to another,
Whereas, if you had variable length numbers, you would have to,
kind of, follow, follow, follow, looking for the end of it.
we'll see in just a moment how you can actually write code
because it's the number 0 in the context of Excel using some memory,
means that we can reliably distinguish one variable from another in memory.
our own code that manipulates these things that are lower level.
So let me do this.
And let's use this basic idea to figure out what the length of a string
Let me include both the CS50 header and the standard I/O header,
give myself int main(void) again here, and inside of main, do this.
Let me prompt the user for a string s and I'll ask them
this, while name bracket i does not equal that special nul character.
So I typed it on the slide as N-U-L, but you don't write N-U-L in code,
While name bracket i does not equal the nul character, I'm going to go ahead
And then down here I'm going to print out the value of i
Fortunately no errors.
./length and let me type in something like H-I, exclamation point, Enter.
And I get 3.
And I get 4.
as we go.
So let's increment i.
Then, let's come back around to line 9 and let's ask the question again.
Now i equals 1.
Fast-forward to the end of the story, once I get to the end of the string,
So what we seem to have here with some low level C code, just this while loop,
is a program that figures out the length of a given string that's been typed in.
for the sake of discussion for a moment, that I can call a function now called
string length.
and then I'll go ahead and print out, just as before with %i,
the length of that string.
and what should I have this function return as its return type?
Yeah?
AUDIENCE: Int.
And I don't want to print it at the end, this would be a side effect.
Yeah?
AUDIENCE: Return i.
Because now, my main function can use the return value stored in length
That works.
It turns out that we can get rid of my own custom string length function here.
This is a function that comes with C, albeit in the string.h header file,
So, for instance, let me go ahead and search for a string up at the top here.
You'll see that there's documentation for our own get string function,
that you know what its inputs are, if any, and its outputs are, if any.
Then down below you might see a description, which in this case,
is pretty straightforward.
and you might even see an example, like this one that we've whipped up here.
here, and we'll link to these in the problem sets moving forward,
are pretty much the place to start when you want to figure out
Is there a function that might help me solve some problem set problems
so that I don't have to really get into the weeds of doing all
But again the point of our having just done this together
At the end of the day, this is all that's inside of your computer
is 0s and 1s.
Yeah.
We, the staff, have configured make to do all of that for you automatically.
But the onus is on you for the prototypes and the header files.
Other questions on these representations or techniques?
Yeah?
If you were to have a string with actual spaces in it that is multiple words,
Which is just a random website that's my go-to for the first 127 characters
of ASCII.
And if you look here, it's a little non-obvious, but S-P is space.
If a computer were to store a space, it would actually store the decimal number
32, or technically, the pattern of 0s and 1s that represent the number 32.
Yeah?
be accompanied by nuls?
because every other data type we've talked about thus far
If we think back to last week, we did end the week with a couple of problems.
Integer overflow, because 4 bytes, heck, even 8 bytes is sometimes not enough.
But they will then start to manage that memory for you
and what they're really probably doing is just grabbing a whole bunch of bytes
And when we come back, we'll flesh out a few more details.
All right.
Let's start to take more of these library functions out for a spin.
So we're not relying only on the built ins that we saw last week.
And then, let me print out say, output , and all I want to do is print back out
Now, the simplest way to do this, of course, is going to be like last week,
Well, in pseudo code, or in English what's the gist of how we could solve
this problem, printing out the string s on the screen without using %s?
Yeah?
Well, for int i, gets 0 is kind of the go-to starting point for most loops,
i is less than--
that new line, just to move the cursor down to the next line
make string, Enter, so far so good, string and let me type in something
Let me do it once more with bye, Enter, and that works, too.
two spaces here and one space here just because I, literally,
wanted these things to line up properly, and input is shorter than output.
Yeah, in back?
like 3 if it's a H-I exclamation point and 0 is less than 3, so that's fine,
I call strlen of s.
Compare 3 against 1.
It's still 3.
3.
3.
Don't ask the same question multiple times when you can remember the answer.
So how could I remember the answer to this question and ask it just once?
Let me see.
Yeah, back there?
That's been our answer most any time we want to keep something around.
Well, I could do something like this, int, maybe, length equals strlen of s.
Let me fix this to be comparing against length, and this is now OK.
Turns out that for loops let you declare multiple variables at once,
But heck, while I'm being succinct I'm just going to use n for number.
But now, hereafter, all of my condition checks are just, i less than n,
Let me write a quick program here that capitalizes the beginning of--
Up here I'll use my new friends, cs50.h, and standard I/O, and string.h.
is let's ask the user for a string s using get string asking them
So that it-- just so I can see what the uppercase version thereof is.
We've looked at this last week notice that a-- capital A is 65,
Right?
subtract 32 and boom, you have capital A. So there's some arithmetic involved.
But now that we know that strings are just arrays,
ask the question, if the current character in the array during this loop
But we know from week 0 lowercase a is 97, lowercase z is, what is it, 1?
AUDIENCE: 132.
AUDIENCE: 132?
And so that would allow us to answer the question is the current letter
lowercase?
And at the very end of this, I'm going to print out a new line just
But this loop here, which I borrowed from our code previously,
All right, and let me zoom out here for just a second.
And sorry, I misspoke 122, which is what you might have said.
So 122 is little z.
Well.
lowercase.
Well let me check, is lower, now I see the actual man page for this function.
So it returns non-zero.
And down here, let me get rid of this cryptic expression, which
was kind of painful to come up with, and just ask this, is-lower s bracket i?
Just, because.
if you put a value like a function call like this, that returns 0,
of the function and its arguments, and not compare it against anything.
Because we could do something like this, well if it's not equal to 0, then
it must be lowercase.
it's lowercase.
But a more succinct way to do that is just a bit more like English.
AUDIENCE: [INAUDIBLE]
OK.
AUDIENCE: [INAUDIBLE]
So this is great, but some of you might have spotted a better solution
to this problem.
Yeah?
AUDIENCE: To-upper.
so I don't have to get into the weeds of negative 32, plus 32.
There we go.
let's use a function that someone else wrote, and just say to-upper, s bracket
i.
So if I rerun make uppercase, and then do, slowly, ./uppercase, type in hi,
by going back to its man page, or manual page, what you'll see
thereof.
If it's not lowercase, it's already uppercase, it's punctuation,
Yeah?
Unfortunately, no.
we'll see later this term, you can say, give me everything.
But that, actually, tends not to be best practice because it can slow down
Yeah?
Yes.
No, we do not have access to a function that at least comes with C or comes
with CS50's library that will just force the whole thing to uppercase.
So stay tuned for another language that will let us do exactly that.
to where we began today where we were talking about those command line
arguments.
How is it that maybe you and I can start to write programs that
and just asking that you take on faith that it's just the way you do things.
(Void) What that (void) means, for all of the programs I have written thus far
and you have written thus far, is that none of our programs
It turns out that main is the way you can specify that your program does,
you want the human to be able to say something, like hello, David
You can use command line arguments, words after the program name
to be an integer that stores how many words the human typed at the prompt.
going to be an array of all of the words that the human typed at the prompt.
and it's going to tell us how many words there are in an int called argc.
The int, as the return type here, we'll come back to in just a moment.
oops.
That is not the right name of a program, let's start that over.
let's actually say int, argc, string, argv, open bracket, close bracket.
Let's write a very simple program that just says, hello, David, hello, Carter,
But not using get string, let's instead have the human just
type their name at the prompt, just like rm, just like clang, just like make,
No additional prompts.
I need a placeholder, so let me put %s here and then put that here.
So no get string.
I'm literally typing an extra word, my own name at the prompt, Enter.
So logically, how do I print out hello, David, or hello so-and-so and not
Yeah?
Huh.
Hello, nul.
Yeah?
But it turns out, in a couple of weeks, we'll start really poking around memory
But let's now make sure the human has typed in the right number of words.
So let's say this, if argc equals 2, that is the name of the program
and one more word after that, go ahead and trust that in argv 1,
Else, let's go ahead and default here to something simple and basic,
like, well, if we don't get a name from the user, just say hello, world,
like always.
This time the human, even if they screw up, they don't give us a name
or they give us too many names, we're just going to say hello, world,
OK.
I would need to alter my logic to support more than just two words
to actually give myself a way of taking user input when they run the program.
If you had to use get string every time you compile your code,
You type make, then you might get a prompt, what would you like to make?
Then you type in hello, or cash, or something else, then you hit Enter,
if you support command line arguments, then you can use these little tricks.
Like, scrolling up and down in your history with your arrow keys.
You can just type commands more quickly because you can do it all at once.
Yeah?
if you were to use argv, and you were to put integers inside of it,
Why?
If we therefore go to the chart here, that might make you wonder, well,
then how do you distinguish numbers from letters in the context of something
And it's a little silly that we have numbers representing other numbers.
All right, one final example to tease apart what this int is
So an exit.c.
We're going to introduce something that is generally known as an exit status.
It turns out that main has one more feature we haven't leveraged.
So I might give them an error message like, missing command line argument \n.
1 is fine as a go-to.
We don't need to get into the weeds of having many different exit statuses,
so to speak.
But if you return 1, that is a clue to the system, the Mac, the PC, the cloud
Why?
Because 1 is not 0.
If everything works fine, like, let's go ahead and print out hello comma %s like
And so I return 0.
The only new thing here logically, is that for the first time ever,
because main has always been defined by us as taking an int as a return value.
If you've never once used the return keyword, which you probably
the programmer, something went wrong, you can abort programs early.
You can exit out of them by returning some other value, besides 0, from main.
It might say error code 123, or negative 49, or something like that.
What you're generally seeing, are these exit statuses, these return
they are unnecessarily showing you, the user, what the error code is.
Yeah?
AUDIENCE: [INAUDIBLE] You know how if you have get string or get int,
at the command line like you could with get string and get int.
until they give you an int, or a float, or the like with command line
arguments, no.
Good question.
Yeah?
to also start returning a value for main when something goes right
this for some actual problems that we'll be solving in the coming days
to when you were a kid the readability of some text or some book,
What might the grade level be for a book that has words like this?
Maybe, when you were a kid or if you have siblings still reading
these things, what might the grade level of this thing be?
Any guesses?
Yeah?
Why is that?
were proud to say that they were perfectly normal, thank you very much.
And, onward.
AUDIENCE: Third.
AUDIENCE: What?
And whether or not we can debate exactly what age you were when you read this,
What makes this text assume an older audience, a more mature audience,
Yeah?
AUDIENCE: [INAUDIBLE]
exactly, the grade level of some actual text-- there's the third--
and you're in class, and you're passing a note from one person to another,
Your input, though, when you have a message you want to send securely,
The ciphers typically take, though, not one input, but two.
If, for instance, your cipher is as simple as A becomes B,
You and the recipient both have to agree, presumably, before class,
and the key is 1, the ciphertext using this simple rotational algorithm,
what algorithm are they using today, or what key are they using today,
Such that now B should become A, and C should become B, and A should become Z.
Well if we spread all the letters out, and we start from left to right,
[APPLAUSE]
[MUSIC PLAYING]
And even as we've gotten much more into the minutiae of programming
is all the more cryptic looking, recall that at the end of the day,
like, everything we've been doing ultimately fits into this model.
And of course, last week we really went into the weeds of like how
This is what?
AUDIENCE: RAM.
and then, maybe way down here again, something like 2 billion
And so as we did that, we started to explore how we could use this canvas
to create kind of our own information, our own inputs and outputs, not
[AUDIO OUT]
How might someone else define an array in now more familiar terms?
Yeah.
more interesting ways to use this same primitive canvas to stitch together
things that are sort of two directional even that have some kind of shape
to them.
But for now, all we've talked about is arrays and just using these things
algorithms.
But we have to keep in mind, even though every time we've looked at an array
thus far, certainly on the board like this, you as a human certainly
the whole thing with a bird's eye view and seeing where all of those numbers
are.
like zero, odds are your eyes would go right to where it is,
But the catch is, with a computer that has this memory, even though you,
have this bird's eye view of all of the data in all these locations.
maybe look here, maybe look here, and so forth in order to find something.
that very methodically goes from left to right or right to left or something
And just remember that the conventions we've had since last week
To be zero indexed just means that the data type starts counting from zero.
So this is location 0, 1, 2, 3, 4, 5, 6.
And notice even though there are seven total doors here,
0 would always be at the left, and n minus 1 would always be at the right.
All right, so let's revisit the problem that we started the whole term off
Anytime you take out your phone, you're searching for a friend's contact.
Any time you pull up a browser, you're googling for this or that.
So let's consider how the Googles, the Apples, the Microsofts of the world
Maybe it's a bunch of closed doors like this out of which we want
You can imagine taking this one step further and trying to find
But for now, let's just take one bite out of the problem.
But before we go there and start talking about ways to do that-- that is,
algorithms.
Let's consider how we might lay the foundation of, like,
goes without saying that any code you write, any algorithm you implement,
Otherwise, what's the point if it doesn't give you the right answers?
And in your own words, what do we mean when we say a program is better
I like that.
Other thoughts?
Yeah.
AUDIENCE: Efficiency.
AUDIENCE: [INAUDIBLE]
And as our programs get bigger and more sophisticated and just longer,
not just by yourself but with someone else, getting the design
And the way we might talk about the efficiency of algorithms, just how fast
That is to say, when they're running, how much time do they take?
because presumably fewer steps, to your point, is better than more steps.
or a piece of code, for that matter, in terms of what's called big O notation.
And this generally means that the running time of some algorithm
is on the order of such and such, where such and such, we'll see,
to convey the idea of just how fast or how slow some algorithm or code is
So you might recall then from week zero, I even introduced this picture
At the time, we just used this to compare those phone book algorithms.
Recall that this red straight line was the first algorithm,
The yellow line that's still straight differed how if you recall?
DAVID J. MALAN: Two pages at a time, which was almost correct so long as we
potentially double back a page if maybe we go a little too far in the phone
book.
This last algorithm, though, was the so-called divide and conquer
the first time, another 250, another 125 versus just 1 or 2 pages at a time.
there, though I didn't use that expression at the time, running times.
n divided by 2.
when you start to zoom out and if I increase my y-axis and x-axis,
that that third algorithm was on the order of-- that is, big O of-- log n.
because it's a smaller mathematical detail that is also just in some sense
a for loop at this point in any of your code and that for loop iterated n times
was something else that you wanted to do n times, you wrote code
if you will.
But the common ones we'll discuss and you will see in your own code
But for now, these are sort of the most familiar ones that we'll soon see.
And then lastly, last one here, big theta, is used by a computer scientist
when you have a case where both the upper bound on an algorithm's
You can then describe it in one breath as being in theta of such and such
And we'll now introduce over time examples of how we might actually
Any questions?
the slower your algorithm is because the more time or more steps that it takes.
would be this last one here either in big O notation or even theta,
That means it literally takes constant time, one step, or maybe 10 steps,
That's the best because even as the phone book gets bigger,
even as the data set you're searching gets larger and larger,
then it doesn't matter how big the data set actually gets.
Questions as well on these notations-- yep, thank you for the pointing.
AUDIENCE: [INAUDIBLE]
seven integers, seven integers that we might actually want to search for.
Yes, OK.
Come on down.
So come on down, and I'll get things ready for you in advance here.
AUDIENCE: [? Nomira. ?]
Come on over.
And the goal, quite simply, is, given this array of memory
as input, to return, true or false, is the number I care about actually there?
All right, and maybe just step aside so the audience can see.
AUDIENCE: [INAUDIBLE]
Let's just move from left to right, sort of searching our way.
Oh, 6, not 0.
All right, also not working out so well yet, but that's OK.
Next door.
2, 7-- no.
Oh.
AUDIENCE: No.
AUDIENCE: Yes.
But if it was like, the zero is in the middle, it wouldn't have been.
DAVID J. MALAN: Yeah, and so here is where the right way to do things
But the catch is if I asked her to find another number, like the number 8,
And so in the general case, going left to right or, heck, right to left
about the order of these numbers-- and indeed, they seem to be fairly random.
Linear search is about as good as you can do when you don't know anything
But it is correct.
If zero were among those doors, she absolutely would have found it
So let's now try to translate what we did into what we might call again
For each door, from left to right, if the number is behind the door,
return true.
Else, at the very end of the program, you would return false by default.
an accurate translation.
First of all, normally, when we've seen ifs, we might see an if else.
And yet down here, return false is aligned with the for.
Why did I not indent the return false, or put another way,
why did I not do if number is behind door, return true, else return false?
Why would that version of this code have been problematic?
Way in back.
AUDIENCE: [INAUDIBLE]
Yeah, in front?
And it would have been as though if all of these doors were still closed,
she opens this up and says, nope, this is not zero, return false.
the number I'm looking for is, in fact, not actually there.
We've been writing code using n and loops and the like.
So let's take this higher level pseudo code and now just kind of
Let me propose that we think about this version of the same algorithm
This is a way of just saying in pseudo code, give myself a variable called i.
And recall n minus 1 is not one shy of the end of the array; it is the last index, because we
started counting at 0.
So now let's consider what the running time of this algorithm is.
This outermost loop here for i from 0 to n minus 1, that line of code
AUDIENCE: [INAUDIBLE]
So I might just make a note to myself this loop is going to operate n times.
All right, maybe it's two steps, but it's a constant number of steps.
is what are you doing again and again and again because that's obviously
But looping, that's going to add up over time because the more doors there are,
the bigger n is going to be and the more steps that's going to take,
How many steps is this algorithm on the order of given n doors or n integers?
Yeah?
AUDIENCE: [INAUDIBLE]
AUDIENCE: O n.
Why?
So given this menu of possible running times for lower bounds on an algorithm,
AUDIENCE: Omega 1.
AUDIENCE: [INAUDIBLE]
Maybe it's two steps if you have to unlock the door and open it,
because in the best case, she might just get the number right from the get go.
But in the worst case, we need to talk about the upper bound, which
about best cases and worst cases or lower bounds and upper bounds.
AUDIENCE: [INAUDIBLE]
DAVID J. MALAN: OK, no, because you only take out the theta notation
when those two bounds, upper and lower, happen
And don't look at what I'm doing because I'm going to--
And actually, if you could stay right there before coming up,
AUDIENCE: [INAUDIBLE]
I'm Rave.
Yeah!
AUDIENCE: Of course.
Nice to meet.
Come on over.
AUDIENCE: OK.
DAVID J. MALAN: Unbeknownst to you, I now took numbers behind the doors,
AUDIENCE: OK.
DAVID J. MALAN: Given that, and given perhaps what we talked about in week zero
with the phone book, where might you propose we begin the story this time?
AUDIENCE: OK.
So--
AUDIENCE: OK.
AUDIENCE: OK.
just like I did, I opened to the right half of the phone book.
AUDIENCE: Yes.
Yeah.
was given the assumption that these numbers are sorted from small
on the left to large on the right, she was able to apply that same divide
and conquer algorithm from week zero which we're now going to give a name--
binary search.
went a little too far, then by going to the left half, which,
the number six in this case that we were actually searching for.
If I had used different numbers but still sorted them from left to right,
AUDIENCE: [INAUDIBLE]
or, heck, they could even be in reverse order, so long as it's consistent,
the decisions that Rave was making-- if greater than, else, if less than--
If the number is behind the middle door, which is where Rave began,
Else, if there are no doors-- and we'll see in a moment why I put
this up top just to keep things clean.
we might have whittled down the problem from seven doors to three doors
So it's not to say that maybe I don't give Rave any doors to begin with.
if she runs out of lockers to ask those questions of-- or a few weeks ago,
And so if I want to express the middle door, I could just, in pseudo code,
to figure out what the middle door is, but that's easy enough to do.
So again, this is a more pedantic way of taking what's a pretty intuitive idea--
then Rave would have wanted to search the middle door plus 1--
through n minus 1.
Well, in the worst case, how many steps total might Rave's binary search
how many times could she go left or go right before finding herself with one
or no doors left?
AUDIENCE: Log n.
And even if you're not feeling wholly comfortable with your logarithm
any time we talk about some algorithm that's dividing and conquering
of times you can divide n by 2 until you bottom out at just a single door
So log n.
So how might we describe a lower bound on the running time of binary search?
Yeah.
AUDIENCE: 1.
So here too, we see that in some cases binary search and linear search, eh,
Well, odds are, you're going to generally care about big O notation.
depending on the nature of the data that you're going to actually be given here.
All right, let me pause and see if there are any questions.
And if I can generalize it, how do you guarantee that you can do this
But Rave was given an advantage when she came up here in that the doors were
already sorted.
either a small data set or, heck, something Google sized with millions,
got millions, billions of web pages, should they just go with linear search
searching for one thing in life, then that's probably a waste of time
to sort it and then search it because you're just adding to the process.
worry about that whereby it might make sense to sort it and then search?
Yeah.
AUDIENCE: [INAUDIBLE] you can go and use the other values as a way
you have more than just one user who's searching for more than just one
and sort the whole thing because every subsequent request thereafter
I could just go to sleep for eight hours and let it analyze this really big data
set.
Why?
Because I was the only user, and I only needed to run these queries once.
reasonable until I woke up eight hours later and my code was incorrect.
And now I had to spend another eight hours rerunning it after fixing it.
These are all resources we'll start to talk about because it really
Yeah.
DAVID J. MALAN: When analyzing running time, does the sorting step count?
it took to find that number six, I should have added the time
So we've seen a couple of searches-- linear search and binary search, which,
But let's actually translate at least one of those now to some code
using this building block from last week where we can actually
So cs50.h.
I'll include stdio.h so that we can get input and print output if we want.
And now I'm going to go ahead and give myself int main void.
And this is the same list that we saw with [? Nomira ?] a bit
want an array of certain values and you know therefore how many of those values
you want, you can actually do this little trick using curly braces.
You can say, don't worry about how big this is.
But this has the effect of giving me an array called numbers inside
How many?
The compiler can infer it from whatever is inside these curly braces.
respectively.
what would have otherwise been like eight separate lines of code.
And you can do this in a bunch of ways, but I'm going to do it like this.
a value for main when all is well, I'm going to return 0 by convention
Right, I don't think I want an else here per our pseudo code earlier.
[INTERPOSING VOICES]
But if you go through the whole thing, through the whole loop,
at the very end, you probably just want to conclude not found backslash n
And if something goes wrong, like you didn't find what you're looking for,
you might return something other than 0, like positive 1, maybe positive 2,
And just as a little check, let's search for something that's definitely
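The linear search just described can be sketched as a small C function. This is a sketch under assumptions: the function name search is hypothetical, and it returns 0 or 1 the same way main would, so that main could simply return its result.

```c
#include <stdio.h>

// Linear search: print "Found" and return 0 if target is in the array,
// otherwise print "Not found" and return 1 (mirroring main's exit status).
int search(const int numbers[], int n, int target)
{
    for (int i = 0; i < n; i++)
    {
        if (numbers[i] == target)
        {
            printf("Found\n");
            return 0;
        }
    }
    // Only reached if the loop looked at every element without a match
    printf("Not found\n");
    return 1;
}
```

In a full program, main might declare an array with the curly brace trick, for example int numbers[] = {20, 500, 10, 5, 100, 1, 50}; (placeholder values), and then return search(numbers, 7, 50);.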
And maybe this time, I'll store a bunch of names and not just integers
And I'm going to go ahead and for now include a new friend
Int main void because I'm not going to bother with any command line
Let me just let the compiler figure out how many names there are.
And using curly braces, I'll do Bill and then Charlie and then Fred and then
George and then Ginny and then Percy and then Ron if there's the pattern there.
And inside of the loop, let's this time check for the string in question,
Let me go ahead and say if names bracket i equals quote unquote Ron, then inside
And I'm going to take your advice from the get go this time
and, at the end of the loop, print out not found because if I get this far,
So I'm just going to go ahead and return 1 after printing not found.
I mean, that's kind of a mouthful, and the first time you see it,
with the equality checking here, with equal equals and Ron.
and a float and a bool are that are sort of built into the language.
You can't just use equals equals to compare two strings.
And if we go to string.h--
In string.h you can perhaps infer what function will probably take
Yeah.
AUDIENCE: Strcmp?
that, OK, I need to use the CS50 header file and string.h, as I already have.
sensitively.
if two strings come in this order or in this order or they're the same,
but an int gives you like 4 billion even though we just need the 3.
numbers that a computer uses to figure out whether something comes before it
I use str compare, or strcmp, names bracket i comma, quote unquote, Ron.
But for strings, it turns out we need to use a more powerful function.
Why?
And so whereas you can use equals equals for single characters,
There's a loop needed, and that's why it comes with the string library.
But it doesn't just work out of the box with equals equals alone.
That would literally be comparing two things, not two arrays of things.
So let me go ahead and fix one bug that I just realized I made.
Now it compiles.
And just as a sanity check, let's check someone outside the family.
If I had not fixed what I claimed was a mistake earlier and I did this--
So if I were to say this, if str compare of names bracket i and Hermione, that's
implicitly like saying this does not equal 0, or it means sort of is true,
but you don't want to check for true because, again, we're
that it explicitly checks for the return value that means they're the same.
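Here is a minimal sketch of that comparison in C. It assumes plain const char * in place of CS50's string type so that it compiles without cs50.h, and the function name find_name is hypothetical.

```c
#include <stdio.h>
#include <string.h>

// Search an array of strings for target; return 0 if found, 1 if not.
int find_name(const char *names[], int n, const char *target)
{
    for (int i = 0; i < n; i++)
    {
        // strcmp returns 0 when the two strings are identical, so compare
        // explicitly against 0 rather than treating the result as true/false
        if (strcmp(names[i], target) == 0)
        {
            printf("Found\n");
            return 0;
        }
    }
    printf("Not found\n");
    return 1;
}
```

Writing if (strcmp(names[i], target)) without the == 0 would do the opposite of what you want, because any nonzero return value, meaning the strings differ, counts as true.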
And yeah.
Follow up?
AUDIENCE: [INAUDIBLE]
DAVID J. MALAN: Yes, you might not have seen this yet,
and I think you can make a reasonable case for it, sort of hides that detail.
We'll come back to little syntactic tricks like that before long.
Two hands?
No?
a little more like a phone book that has both names and numbers and not
We could now have two arrays-- one called names, one called numbers.
that really looks more like a string even though we call it a phone number.
Probably don't want to use an int lest we throw away those kinds of details.
So let me switch back to VS Code here, and let's do one more program, this one
And then inside of my program, I'm going to give myself two arrays--
Then for numbers, I'm again going to use a string array specifying with
Why mine?
If you want to have a little fun with programming, feel free to text
Let's go ahead and actually search for my own name and number here.
So let me do.
There's two of us this time-- so i less than 2 and then i plus plus as before.
and I'm going to use str compare to find my name in this case.
And I'm going to say if strcmp of names bracket i equals quote unquote David
then just as before, I'm going to go ahead and print something out.
But this time, I'm going to make the program more useful
Now I'm implementing a phone book, like the contacts app on iOS or Android.
And then down here if we get all the way through the loop
and David's not there for some reason, I'm going to print as before not found
So let me go ahead and compile this with make phonebook, dot slash phonebook,
and it seems to have found the number.
It's kind of stupid because I've just made a phone book or a contacts
That, of course, could come later using get string or something else.
But is it well-designed?
We started to use them last week, but are we using them well this week?
For instance, if I screw up the actual number of names in the names array
such that it's now more or less than is in the numbers array or vice versa,
that any time I use names bracket i that it lines up with numbers bracket i.
is getting much, much longer, the odds that you or your colleagues
remember that you're sort of just trusting that names and numbers line up
Someone's not going to realize that, and just, the code is going to break.
And you're going to start outputting the wrong numbers for names, which
that you're not just trusting that these two independent variables, names
A new feature today that we'll introduce is generally known as a data structure.
that go beyond the built in ints and floats and even strings?
You can make, for instance, a person data type or a candidate data
And maybe that array is our only array with two things
in it, two persons in it.
But typedef is a new keyword today that defines a new data type.
This is the C keyword that lets you create your own data
Struct is another related keyword that tells the compiler that this isn't just
a simple data type, like an int or a float renamed or something like that.
It actually is a structure.
It's got some dimensions to it, like two things in it or three things in it
The last word down here is the name that you want to give your data type,
the compiler clang will know that a person is composed of a name that's
And you don't have to worry about having multiple arrays now.
You can just have an array of people moving forward.
And why don't we enhance the phone book code a little bit
structure that has a name inside of it and that has a number inside of it.
And the name of this new structure again is going to be called person.
Inside of my code now, let me go ahead and delete this old stuff temporarily.
I'm going to more pedantically spell out what I want in this array of size 2
And the dot is admittedly one new piece of syntax today too.
and access the variable called name and give it this value Carter.
I can go into people bracket 0 dot number and give that the same thing
And then lastly, people bracket 1 dot number equals quote unquote plus
1-949-468-2750.
I'm still, for the sake of discussion, going to iterate 2 times from i
Let me see.
Why?
Bracket i is the i-th person that we're iterating over in the current loop--
Then dot is our new syntax for going inside of a data structure
So it's a little more verbose, but now arguably this is a better program
because now these people are full fledged data types unto themselves.
I'm going to fix this one last remnant of the previous version.
Now why?
that this is a short program and given that this kind of made things
now laying the foundation for just a better design because you really
with numbers because every person's name and number is, so to speak,
And thus, we have a person that encapsulates two other data types, name
and number.
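The person structure and the search over an array of people can be sketched like this. It is a sketch under assumptions: const char * stands in for CS50's string type, and the helper name lookup is hypothetical.

```c
#include <stdio.h>
#include <string.h>

// A person bundles a name and a number, so the two can never drift apart
// the way two separate arrays could.
typedef struct
{
    const char *name;
    const char *number;
}
person;

// Return the number stored for name, or NULL if nobody by that name exists.
const char *lookup(const person people[], int n, const char *name)
{
    for (int i = 0; i < n; i++)
    {
        // people[i] is the i-th person; the dot goes inside the structure
        if (strcmp(people[i].name, name) == 0)
        {
            return people[i].number;
        }
    }
    return NULL;
}
```

A caller would declare person people[2]; and fill it in one field at a time, for instance people[1].name = "David"; and people[1].number = "+1-949-468-2750";, before calling lookup(people, 2, "David").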
of the cool stuff we've talked about and you use every day.
What is an image?
maybe you have three values, three variables-- one called red,
And then you could name the thing not person but pixel.
And now you could store in C three different colors-- some amount of red,
some green, some blue-- and collectively treat it as the color of a pixel.
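As a rough sketch of that idea, a pixel type might look like the following. The use of uint8_t (one byte, 0 through 255, per color) and the helper make_pixel are assumptions for illustration, not something defined in the lecture.

```c
#include <stdint.h>

// A pixel groups three color amounts into one value.
typedef struct
{
    uint8_t red;
    uint8_t green;
    uint8_t blue;
}
pixel;

// Build a pixel from some amount of red, green, and blue (each 0 to 255).
pixel make_pixel(uint8_t red, uint8_t green, uint8_t blue)
{
    pixel p = {red, green, blue};
    return p;
}
```

For example, make_pixel(255, 0, 0) would describe a pixel that is all red with no green or blue.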
And you could imagine doing something similar perhaps for video or music.
Music, you might have three variables-- one for the musical note,
And you can imagine coming up with your own data type for music as well.
But we now have the way in code to express most any type of data
the purposes for which are to use arrays but use them more responsibly
for implementing cooler and cooler stuff per our week zero discussion?
Yeah.
DAVID J. MALAN: What's the difference between this and an object in an object
oriented language?
Languages like Java and C++ and others which you might have heard
of, programmed yourself, had friends program in, are object oriented
languages. In those languages, they have things called classes or objects, which
are interrelated.
Objects can also store functions, and you can kind of sort of do this in C.
In languages like Java and C++, you have objects that store data and functions
together.
So we'll see this issue in a few weeks, but let me wave my hands at it for now.
Yeah.
And most recently, you might have seen me mention the bug in iOS and Mac OS
called big int, which allows you to express even bigger numbers.
How?
And you somehow allow yourself to store more and more bits
So in short, yes.
We now have the ability to do most anything we want in the language
Other questions?
AUDIENCE: [INAUDIBLE]
DAVID J. MALAN: Could you define a name and a number in the same line?
Sort of.
Over here.
have to define anything you're going to use or declare anything you're going
Otherwise, the compiler would not know what I mean by person when I first
So it has to come first, or it has to be put into something like a header file
Yeah.
AUDIENCE: [INAUDIBLE]
come back to this later in the term when we talk about SQL, a database language,
and so forth.
You could not store any of that syntax or that punctuation inside of an int.
and so forth.
And Outlook at the time lets you export all of your contacts as a CSV file--
file with all of my friends and family and their numbers inside of it.
Unfortunately, I opened that same CSV file with Excel, I think, at the time
And I must have instinctively hit, like, Command or Control-S to save it.
And Excel at least has this habit of sort of reformatting your data.
But long story short, I then imported my mildly saved CSV file into Gmail.
And now 10 plus years later, I'm still occasionally finding friends and family
And two, that too is also kind of relying on the honor system.
It would be all too easy to omit some of the square brackets in the two
dimensional array.
Two dimensional arrays just means arrays of arrays, as you might infer.
solve one of the original problems by actually sorting the information we're
given in advance and considering, per our discussion earlier, just how
going to be one or more algorithms that actually gets this job done.
Well, just to vary things a bit more, I think we have a chance here
OK, so this is actually quite convenient that you're all quite close.
Come on down.
Come on down.
So they're in wardrobe right now just getting their very own Harvard T-shirts
with a Jersey number on it, which will then represent an element of our array.
As we have these numbers up here on the screen, these numbers too are unsorted.
How would you go about sorting these eight numbers on the screen?
You would look to the number to the right or to the left of it,
And if it's out of order, you would just start to swap things.
you can achieve the end result of getting the whole thing sorted.
AUDIENCE: [INAUDIBLE]
So to recap there, find the smallest one first and put it at the beginning,
And then presumably, you could do that again and again and again.
Come on over.
Feel free to distance as much as you'd like and scooch a little this way
if you could.
5, 2, 7, 4--
AUDIENCE: [INAUDIBLE]
3.
Go [INAUDIBLE].
I'm [INAUDIBLE].
Go [INAUDIBLE].
Go [INAUDIBLE]
Go Winthrop.
OK, yes.
5, 2, 7, 4, 1, 6, 3, 0?
Sort yourselves.
OK, yes.
All right, so admittedly, there's kind of a lot going on because each of you,
except number four, are doing something in parallel all at the same time.
at a time, so can a computer only move one number at a time-- sort of opening
So let's try this more methodically based on the two audience suggestions.
And even though I as the human can obviously see all the numbers
and I just kind of have the intuition for how to fix this,
So let me see.
2.
I'm going to forget about 5 and only now remember 2 as the now smallest
element.
7, nope-- I'm going to ignore that because it's not smaller than the 2
I have in mind.
AUDIENCE: Celeste.
And could you all shift and make room for Celeste?
And even though it happened pretty quickly, that's like seven steps
What's that?
AUDIENCE: Swapping.
So if you want to go back to where you were, one step over, number 5,
it doesn't really matter where 5 goes until we get him into the right place.
I can ignore Celeste and make this a seven step problem and not eight
Why?
Yeah.
Other thoughts?
Yeah.
So maybe there's a [INAUDIBLE] You just don't know what kind of data
And honestly, I only stipulated earlier that I'm using one variable in my mind.
I could use two and remember the two smallest elements I've seen.
But then I'm going to start to use a lot of space in addition to time.
So if I've stipulated that I only have one variable to solve this problem,
6?
Nope.
3?
Nope.
5?
Nope.
AUDIENCE: Hannah.
2?
But on average, I could get lucky too and just pop number 2
into the right place.
I can now ignore Hannah and Celeste, making the problem size 6 instead of 8.
7 is the smallest.
2 is the smallest.
So we now have 4, 7, 6, 3, 5.
Now we have 6, 7, 5.
And my God, that felt so much slower than the first approach,
But two, we were doing one thing at a time whereas the first time, you guys
was recommended by just fixing small problems and see where this gets us.
to 7 to 6 just yet.
AUDIENCE: Yes.
Yeah.
And conversely, if you prefer, Celeste is one step closer to the beginning.
Now worst case, Celeste is going to have to move one step on each iteration.
Let me see.
are sort of bubbling up, if you will, to the end of the list.
4 and 5, good.
5 and 3, swap.
5 and 0, swap.
2 and 1, 2 and 4.
We need the puppets back, but you can keep the shirts.
here, we can't now formalize a little bit what we did on both passes here.
I claim that the first algorithm our volunteers kindly acted out
And as the name implied, we selected the smallest element again and again
putting Celeste into the right place, and then continuing with everyone else.
For i from 0 to n minus 1.
The left end is 0, the right end is n minus 1 where in this case,
n happened to be eight people.
So that's 0 through 7.
I found the smallest number between numbers bracket i and numbers bracket
n minus 1.
So that's how we got Celeste from over here all the way over there.
ignoring Celeste this time because she was already in the correct location.
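That pseudo code can be sketched in C as follows. The function name selection_sort and the variable names are hypothetical; the logic is the find-the-smallest-and-swap idea just acted out.

```c
// Selection sort: on each pass, find the smallest number between
// numbers[i] and numbers[n - 1] and swap it into position i.
void selection_sort(int numbers[], int n)
{
    // Stopping at n - 2 is enough: once n - 1 elements are in place,
    // the one left over must be the largest.
    for (int i = 0; i < n - 1; i++)
    {
        // Remember where the smallest element seen so far lives
        int smallest = i;
        for (int j = i + 1; j < n; j++)
        {
            if (numbers[j] < numbers[smallest])
            {
                smallest = j;
            }
        }

        // Swap the smallest element with whoever is standing at position i
        int tmp = numbers[i];
        numbers[i] = numbers[smallest];
        numbers[smallest] = tmp;
    }
}
```

Notice that the inner loop always walks the rest of the array, whether or not the numbers happen to be in order already.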
where the left most again is always 0, the right most is always n minus 1,
or equivalently, the second to last is n minus 2, the third to last
Big O of what?
It's probably more than n, right, because I went through the list
Let me propose we think about it this way with just a bit of formula, say.
But once the list was swapped into the right place, then
to look through.
Then after that, it's n minus 2 plus n minus 3 plus n minus 4 plus dot dot
And it's obvious that I only have one human left to consider.
and just say dot dot dot plus 1 for the final step.
has a little cheat sheet at the end that shows these kinds of recurrences.
That's just what that recurrence, that series, actually adds up to.
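Written out, and assuming the count starts at n minus 1 comparisons for the first pass as described above, the series and its closed form look like this:

```latex
(n-1) + (n-2) + (n-3) + \dots + 1 \;=\; \frac{n(n-1)}{2} \;=\; \frac{n^2}{2} - \frac{n}{2}
```

With our eight volunteers, that is 7 + 6 + 5 + 4 + 3 + 2 + 1 = 28, which matches 8 times 7 divided by 2.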
And here's where we're starting to get annoyingly into the weeds.
or a billion web pages in Google search engine, honestly, which of these terms
AUDIENCE: n squared.
Let's just wave our hands at this because at the end of the day,
as n gets really large, the dominant factor is indeed that first one.
Even the divided by 2, as I claimed earlier with our two phone book examples, where
looked the same when n is large enough, let's just call this on the order of n
squared.
That's an oversimplification.
If we really added it up, it's actually this many steps-- n
order term to get a sense of what the algorithm feels like, if you will,
In the best case, how many steps does selection sort take?
Like, what does it mean to be the best case or the worst case
Like, what could you imagine meaning the best possible scenario when you're
Yeah.
I can't really imagine a better scenario than I have to sort some numbers,
But I only know she needs to be here once I've looked at all eight people.
And then I would have realized, well, that was a waste of time.
I can leave Celeste be.
I would have ignored her position because we've solved one problem.
I would have done the same thing now for seven people, then six people.
So every time I walk through, I'm not doing much useful work.
don't know until I do the work that the people were in the right order.
So this would seem to imply that the omega notation, the best case
scenario, even, a lower bound on the running time would be what, then?
AUDIENCE: [INAUDIBLE]
AUDIENCE: N squared.
in fact, because the code I'm giving myself doesn't leverage or benefit
So in this case, yes, I would claim that the omega notation for selection sort
That's the first algorithm we've had the chance to describe that in,
Can we do better?
Well, there's a reason that I guided us to doing the second algorithm second.
saying the big values are bubbling their way up to the right
differently.
Why?
Well, if I'm comparing two things, left hand and right hand,
would be bad.
I'm looking at the last two elements, not the last element
and then pass the boundary.
So this pseudo code, then, allows me to say compare every one again and again
AUDIENCE: [INAUDIBLE]
Each time through bubble sort, she only moved one step.
she needs to move n minus 1 steps to get 0 all the way to where it needs to be.
And so this inner loop, if you will, where we're iterating using i,
But it doesn't fix all of the problems until we do that same logic again
Well, one way to see it is to just literally look at the pseudo code.
The inner loop, the for loop, also iterates n minus 1 times.
Why?
And if that's hard to think about, that's the same thing as 1 to n minus 1
if you just add 1 to both ends of the formula.
is running by how many times the inner loop is running, which gives me
The--
AUDIENCE: N squared.
plus 1 is not that big a thing, they're such drops in the bucket when
And if we consider now the lower bound on bubble sort's running time,
What might you claim is the running time of bubble sort in the best case?
And the best case, I claim, is when the numbers are already sorted.
AUDIENCE: [INAUDIBLE]
It's just going to blindly go back and forth n minus 1 times again and again,
where if I compare two people and I don't swap them, compare two people,
don't swap them, and I go all the way through the list comparing
to do that same process again because if the humans have not moved,
Inside of that same pseudo code, what if I say, hey, if no swaps, quit?
One of the loops has gone through per the indentation here.
one other variable that's plus plusing as I go, keeping track of how many
swaps--
I'm not going to make any swaps the next time around either.
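Here is a minimal sketch of bubble sort with that "if no swaps, quit" check. Returning the number of passes is a hypothetical addition, included only so the early exit is observable.

```c
// Bubble sort with an early exit. Returns how many passes were made
// (the return value is hypothetical, for illustration only).
int bubble_sort(int numbers[], int n)
{
    int passes = 0;

    // At most n - 1 passes are ever needed
    for (int pass = 0; pass < n - 1; pass++)
    {
        int swaps = 0;
        passes++;

        // Compare each adjacent pair, i from 0 to n - 2
        for (int i = 0; i < n - 1; i++)
        {
            if (numbers[i] > numbers[i + 1])
            {
                int tmp = numbers[i];
                numbers[i] = numbers[i + 1];
                numbers[i + 1] = tmp;
                swaps++;
            }
        }

        // A full pass with no swaps means everything is already in order
        if (swaps == 0)
        {
            break;
        }
    }
    return passes;
}
```

On a list that is already sorted, this makes a single pass of n minus 1 comparisons and stops, whereas a list with the smallest number at the far right still needs all n minus 1 passes.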
when the list is already sorted, the omega notation for bubble sort
Yeah.
AUDIENCE: [INAUDIBLE] to optimize the running time for all cases possible?
If the running time of selection sort and bubble sort are both in big O
Yeah.
And that's an advantage because we're not comparing things ever so precisely.
Just like I plotted with the green and yellow and red chart,
don't have that much data because they're going to be pretty fast.
and as you might in the real world, that the Googles of the world,
a sense of how these things actually work and look at a faster rate
Short bars mean small numbers, tall bars mean big numbers.
we just have bars that are small or tall based on the magnitude of the number.
And that's like me walking left and right, left and right,
So that's why the problem is getting smaller and smaller and smaller
over time.
But you can notice now visually, look at how many times
to be frowned upon if avoidable because I'm touching the same elements again
and again.
Let me re-randomize the thing, and let me now click Bubble Sort at the top.
And as you might infer, there's other sorting algorithms out there,
It's two pink bars going through again and again comparing
And you'll see that the largest numbers are indeed bubbling the way up
to the right, but the smaller numbers, like our number 0 was,
Here's a comparable.
And it's going to take a while to get all the way to the left.
And here too, notice how many times the same bars
are becoming pink, how many times the algorithm is retracing and retracing
its steps.
Why?
And each time we do that, we're stepping through practically the whole array.
And now granted, I could speed this up even further if I really wanted to,
but my God, this is only, what, like 50 or 60 elements, something like that?
This is slow.
and this is why I sort of secretly sorted the numbers for Rave in advance
Well, to save the best algorithm for last, let's take a shorter five minute
break here.
All right.
and better than bubble sort and ideally not just marginally
Just like in week zero, that third and final divide and conquer algorithm
the reality that we were doing a huge number of comparisons again and again.
And you kind of saw that in the vertical bars that were going pink as everything
A little bit ago, I proposed this pseudo code for the binary search algorithm.
even though I didn't call it out at the time, it's kind of cyclically defined.
and yet it seems a little unfair that I'm using the verb search
and I'm using my own algorithm to search the left half or the right half,
is that when I search the left half or search the right half, yes,
is going to whittle the problem down and down and down until hopefully,
And we haven't seen this yet in C, and we haven't seen this really in Scratch.
And this is a very procedural approach, if you will, because lines 8 and 11
But really, what that's doing in the binary search algorithm for the phone
book is it's just telling me to search the left half or search the right half.
But that's equivalent to just telling myself go search the left half,
search the right half, the key thing being the left
half and the right half are smaller than the original problem.
It would be a bug if I just said search the phone book, search the phone book,
version than the one you've dabbled with-- this sort of pyramid,
And let's throw away the parts that aren't that interesting
and just consider how we might, up until now, implement this in C code,
Let me go over here, and let me create a file called-- how about iteration.c?
And the goal at hand is to implement in C a little program that just prints out
Well, let me go ahead, and in main, let me first ask the user for--
So I'm going to define a function called draw that takes as input an int.
I'm going to go ahead and print out a left aligned pyramid like this from top
to bottom.
The salient features here are that this is a pyramid, at least in this example,
of height four.
And now in height four, the first row has one brick.
1.
For int j, for instance, common to use j after i if you have a nested loop,
So why i plus 1?
Well, again, when i equals 0, that's the first row, and I want one brick.
I'm going to save the new line for about here instead.
might to mine after all this practice, but this is something reminiscent
Seems to be correct, and let's assume it's going to work for other inputs
as well.
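The iterative draw function described here can be sketched like this. The hash symbol used for each brick is an assumption; the nested loops follow the i and j structure just described.

```c
#include <stdio.h>

// Print a left-aligned pyramid of height n, top to bottom, using loops.
void draw(int n)
{
    // One iteration per row
    for (int i = 0; i < n; i++)
    {
        // Row i (counting from 0) gets i + 1 bricks
        for (int j = 0; j < i + 1; j++)
        {
            printf("#");
        }
        // Save the new line until the whole row has been printed
        printf("\n");
    }
}
```

Calling draw(4) from main prints four rows, with one, two, three, and then four bricks.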
Like, I literally have a function called draw that does this thing.
But I can think about implementing draw in a somewhat different way that's
kind of clever.
with one brick, row two with two bricks, it kind of comes together
Well, if I were to ask you the question, what does a pyramid of height 4
And then hopefully, this process ends, and it does because notice,
So you're not going to have this sort of silly back and forth with me
infinitely many times because when we finally get to the base case,
no pun intended.
So there's a way to draw a line in the sand and say, stop, no more arguments.
Let me go ahead and create one final file here called recursion.c
And then inside of main, I'm going to do the exact same thing-- int
height equals get int, asking the user for height.
And then I'm going to go ahead and call draw passing in height.
I even am going to make my prototype the same-- void draw int n semicolon.
I'm going to be kind of a wise ass here and say, well, just
done.
AUDIENCE: [INAUDIBLE]
And then down here, I'm going to print out a new line at the very end.
boiled down the implementation of draw into printing a row after just
my draw function, notice, is always going to call the draw function forever
in some sense.
just to ask a simple question and then just bail out of the function
if n equals 0.
Don't do anything.
All right, let me go ahead and compile this code-- make recursion.
and voila.
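A minimal sketch of the recursive version follows. The hash symbol for a brick is an assumption, and the base case is written as n <= 0 rather than n equals 0 so that a negative height also stops the recursion.

```c
#include <stdio.h>

// Print a left-aligned pyramid of height n by first drawing a pyramid
// of height n - 1 and then adding one more row of n bricks beneath it.
void draw(int n)
{
    // Base case: nothing to draw, so bail out instead of recursing forever
    if (n <= 0)
    {
        return;
    }

    // Recursive case: draw the smaller pyramid first
    draw(n - 1);

    // Then print the bottom row of n bricks
    for (int i = 0; i < n; i++)
    {
        printf("#");
    }
    printf("\n");
}
```

Each call hands a strictly smaller number to the next call, which is why the process is guaranteed to reach the base case.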
If only because some of you have run into this issue accidentally already,
let me get rid of the base case here, and let me recompile the code.
Make recursion.
When I run this program, still works for 4, still works for 0.
Like, this means I have somehow touched memory that I shouldn't have.
probably not your own pset context, by just making sure we don't even
In fact, I'll act it out myself with just these numbers on the shelf here
rather than humans because recursion in general takes a little bit of effort
So here's the pseudo code I propose for this algorithm called merge sort.
In the spirit of recursion, this sorting algorithm
literally calls itself by using the verb sort in its pseudo code.
It sort of obnoxiously says, well, if you want to sort all of these things,
Well, if I just asked you to sort something and you just tell me,
that thing, what was the point of asking you in the first place?
We have the numbers 2-- and I'll call it out if you're at a bad angle--
Notice that the left half at the moment, 2, 4, 5, 7, is already sorted,
going to turn over most of the numbers except for the first numbers in each
of these halves.
the leftmost element of each half-- that is, the one on the left here
Well, if I look at 2 and I look at 0, which one should presumably come first?
the beginning of this list and the new beginning of this list.
Of course, it's 2.
It's of course 3.
Now I'm going to compare the 4 against the beginning and end,
it turns out, of the second list--
4, of course.
5, of course.
It's, of course, 6.
sorted the whole thing by having merged together the two halves of the list.
We haven't done the guts of the algorithm yet-- sort the left half
But I'm just deliberately putting them in this random order, 5, 2, 7, 4.
1, 6, 3, 0.
I could use selection sort and just go back and forth and back and forth.
I could use bubble sort and just compare pairs, pairs, pairs.
5, 2, 7, 4.
So here we go.
I have a list of size 4.
So here we go.
The 5 is sorted.
I just sorted the left half of the left half of the left half.
So I'm done.
But what's the third and final step of this phase of the algorithm?
We started to sort the left half of the left half of the left half, then
Right half.
Done.
Done.
In total, I've now sorted the left half of the original thing.
And if you want, I'll do the same thing when I merged last time
to be clear what I'm comparing.
Done.
Done.
What do I do?
Done.
Done.
What do I do?
Merge them together.
Now where are we in the stor-- oh my God, where are we in the story?
Merge.
Of course, the 0.
The 1.
The 3.
Merge.
and a 0.
The 4.
The 5.
The 6.
OK.
I wasn't going back and forth, back and forth in front of the shelf
I was deliberately only ever merging the smallest elements in each list.
And how many times did we divide, divide, divide in half the list?
Well, we started with all of the elements here,
With the human examples, we just had the humans, and that's it, and only eight
of them.
Merge sort actually requires that you have some spare space, an empty array
But if I really wanted and if I didn't have this shelf or this shelf,
honestly, I could have just gone back and forth between the two shelves.
that the total running time, if you can perhaps infer from that math, is what?
we saw in week zero and again today that log n is smaller than n.
That's a good thing.
So it's sort of lower on this little cheat sheet that I've been drawing,
turns out it's not quite as good as bubble sort with omega of n,
where you can just sort of abort if you realize, wait a minute,
Merge sort, you actually have to do that work to get to the finish line anyway.
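For reference, merge sort can be sketched in C as follows. This is one possible arrangement, not the lecture's own code: the helper names merge and sort_range are hypothetical, and the caller is assumed to supply a spare array of the same size, playing the role of the extra shelf.

```c
#include <string.h>

// Merge two adjacent sorted ranges, [left, middle) and [middle, right),
// by repeatedly taking the smaller of the two front elements.
static void merge(int numbers[], int spare[], int left, int middle, int right)
{
    int i = left;
    int j = middle;
    int k = left;

    while (i < middle && j < right)
    {
        spare[k++] = (numbers[i] <= numbers[j]) ? numbers[i++] : numbers[j++];
    }
    // One half has run out; copy whatever remains of the other
    while (i < middle)
    {
        spare[k++] = numbers[i++];
    }
    while (j < right)
    {
        spare[k++] = numbers[j++];
    }
    // Move the merged result back from the spare shelf
    memcpy(&numbers[left], &spare[left], (right - left) * sizeof(int));
}

// Sort the range [left, right) recursively.
static void sort_range(int numbers[], int spare[], int left, int right)
{
    // Base case: a range of zero or one numbers is already sorted
    if (right - left < 2)
    {
        return;
    }
    int middle = left + (right - left) / 2;
    sort_range(numbers, spare, left, middle);   // sort left half
    sort_range(numbers, spare, middle, right);  // sort right half
    merge(numbers, spare, left, middle, right); // merge sorted halves
}

// Sort n numbers; spare must have room for at least n ints.
void merge_sort(int numbers[], int spare[], int n)
{
    sort_range(numbers, spare, 0, n);
}
```

The spare array is the price of the speed: the algorithm trades extra space for fewer comparisons, and it does the same dividing and merging even when the input is already sorted.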
And we'll see with this example what merge sort looks like
Let me zoom out so you can actually see the height here.
Let me go ahead and randomize this again and run merge sort.
There we go.
Now you can see the second array and where the values are going temporarily.
And even though this one looks way more cryptic visualization-wise,
and consider that moving forward as we write more and more code,
And let's see what these algorithms might look or sound like here.
[MUSIC PLAYING]
DAVID J. MALAN: Well, this is CS50, and already this is week four,
back to back to back that really lay things out left to right, top
You've seen this approach of just using memory in some way to lay things out,
So for instance, here is a photo taken of last week's front row, for instance,
if we start to zoom in and zoom in and zoom in, because it seems like most
any TV show like CSI, or whatever, or any movie that
For instance, let's zoom in on this puppet's eye here and let's
We're using pixels-- these dots on the screen as rows and columns--
then at the end of the day, you can only store a finite amount of information.
At least I don't really see in this grid here any glint of a license plate
But consider after all that this doesn't need to be even as high resolution,
you can imagine just doing something silly with Post-It notes, like this.
we'll explore in the coming week-- you could make this fun smiley face
Or yellow and purple, or vice versa, just to make something come to life.
Now in practice, recall we talked about storing not just a zero or one,
but maybe an R, a G, and a B value-- like 24 bits, or three bytes in total--
But for fun, if today you want to tackle something passively in the background,
look a little something like this, which we've organized in rows and columns.
see if you can't make something a little creative and then email it to Carter
and we'll exhibit some of the best or favorites on the website thereafter.
but you're probably generally familiar with Photoshop as a program for editing
Now the R, the G, and the B values went way up from 0 to 255, 255, 255.
Here is red, and it turns out that red is a whole lot of red, 255,
Or, a.k.a.
FF0000.
0000FF.
Now some of you, again, might have seen this notation before,
these zeros and these F's and all of the numbers and letters in between,
But first, recall that with RGB we previously did the following.
So here we have 72, 73, 33, which in the context of an email or text, of course,
said what--
when you combine that much red that much green that much blue.
And it's maybe somehow equated with 255, at least per that Photoshop screenshot.
a lot of blue--
Here was binary-- in the world of binary you had just two digits, zero and one.
In our world's decimal system, of course, you have zero through nine.
in the context of images and also files just because it's a convention
and there's some conveniences to it.
It's not one zero for 10, or 1 1 for eleven-- all 16 of these values,
Base 16, just does the math from week zero and really,
Instead of powers of two or powers of 10, which we saw for binary and decimal
But now let's just consider how we would represent familiar numbers.
If you've got two hexadecimal digits for which these hashes are
Why?
16 times zero plus one times zero is the number you and I know as zero.
This would be two, three, four, five, six, seven, eight, nine--
So, apparently A, so 0A, 0B, which is now 10, or 11, or 12, 13, 14, 15.
you add one more, nine wraps around to zero, or in this case,
Something, something.
FF, I heard.
So how high can you count in hexadecimal if you've got just two of these digits?
Well, it's the same math as always.
16 times F, a.k.a.
that gives us 240 plus 15 in decimal, the result of which, of course, now
is 255.
So this hexadecimal system-- you may have seen in the world of web pages,
and if you haven't we'll get to that in this class in a few weeks,
has this shorthand notation of counting as high as 255 but just calling it FF.
Now it's marginal, but that's like 50% savings of how many digits
and that difference is going to get magnified the bigger our numbers get.
Let me stipulate for now, you're going to get more and more savings
in terms of just how many symbols you need on the screen to represent
All right, let me pause here just to see if there's any questions thus far
on what we've called hexadecimal, which again, just gives us zero through nine
we're not really going to see other notations besides this moving forward.
Yeah.
Does hexadecimal require more storage or less storage than the decimal system?
But inside of the computer, at the end of the day, you're still storing bits.
because all they're representing at the end of the day is zeros and ones.
are useful to memorize, like 255 is as high as you can count with eight bits
if you start at zero, because two to the eighth is 256, but if you start at zero
So in binary, recall if you have eight bits, all of which were ones,
the left half of all these bits, and the second F in this case
It's not quite eight, but units of four, and that's not bad.
Which is why-- if you use two digits like I have thus far,
00 or FF or anything in between--
One hex digit for the first four bits, one hex digit for the second.
you can again think of each byte as having a number associated with it--
Here's byte zero, one, two, three, four, five, six, seven, 15, 16
But it turns out in the world of memory, and thus today, programming, people
But instead, one zero, one one, one two, one three-- this
is not 10, 11, 12, 13, because I claim I'm in the context of hexadecimal now.
Yeah.
that is the hexadecimal number one zero, which recall we said earlier,
it's really just the system for how you think about these things.
in a bunch of contexts.
When you write code, you might even write code using some hexadecimal
All right, so with that said, any questions now on this building block?
Nothing so far?
All right.
and I'm going to go ahead and whip up a program that very simply assigns
but today, keep in mind that that variable n and that value 50
and it turns out today we'll introduce a bit more syntax so you can actually
So nothing very interesting but I'll use %i backslash n and then comma n
Looks like as expected, it simply prints out the number 50, like this.
But let's consider then, what this code is doing underneath the hood
Yeah.
Maybe it ends up over here just because there's other stuff being
don't really care where it ends up, just that it ends up somewhere.
let me again remind that this is 32 zeros and ones representing that 50--
that's a label of sorts for it-- but at the end of the day that 50 is
care what it is, I just want an address for the sake of discussion.
So way over here off screen might be byte zero, way down here is byte 0x123.
This is different.
and you write &n, C is going to figure out for you what is the address of that
And it's going to give you a number, otherwise known as the address of that.
even though yes, it's a number like 0x123, you have to tell C in advance
that you want to store not an int per se, but the address of an int.
And the star just tells the computer, this is not an integer per se,
As always with the equal sign, you copy from right to left.
in a pointer, and the way to declare a pointer is to specify the type of value
whose address you're storing, and then use the star to indicate that this is
and then down here, I'm going to print out not n this time, but p--
the variable p.
And then even though yes, it's just a number and therefore I could use %i
for integers, there's actually a special format code in printf for printing
as 0x123.
print not just n, but the address of n and achieve the same thing.
And notice if I keep running the program, it's actually moving around.
Maybe it's actually randomizing it so it's not always at the same location.
but this happens to be at that moment in time where that value is in memory,
Yeah?
Short answer is yes, and this is both the power and the danger of C,
and we're going to do this today and make a few deliberate mistakes,
because with this power of going to or getting the address of any variable,
around at all of the computer's memory, even at things that I didn't put there.
and in our operating systems that do hedge against this a little bit.
there's just so many things that can go wrong using this language called C,
and odds are you did it most recently by going too far in an array.
AUDIENCE: [INAUDIBLE] pointer star p, but then we used p later in the code.
Is it called star p or p?
is unfortunately one other use for the star operator, and that's as follows.
If I want to print out the integer via %i, that is at that address,
I can actually use the star here, which technically contradicts what I just
But if you say star p now, you're not redeclaring the variable
So I'm hearing 50, and that's true because if you figure out the address
All right, any questions now on this syntax-- and I will concede,
It's just too confusing, honestly, but with practice comes comfort.
Yeah.
AUDIENCE: [INAUDIBLE]
is on you at the moment to know what you are getting the address of.
Is it a string?
Is it a char?
Is it a bool?
Is it an int?
Notice in line eight, I didn't tell the computer, other than the %i,
what kind of address I'm going to, but I did already in line six.
and that way it will print out all four bytes at that address,
not just part of it, and not more than those four bytes.
Good question.
Yes.
but yes, you can use star star, and then things get--
I'm sorry.
Was there?
Other questions?
Yeah.
This is not the common use case, just printing out the address--
So let's actually just now depict what was going on inside of the computer's
now let me plop into the memory n, which is storing in this program
0x123, and technically there's not an x there, there's not a zero there,
But again, that's week zero-- don't care about binary day-to-day.
it turns out that p is always going to take up eight squares on the board,
Yeah, thoughts?
Maybe it's allocating eight bytes because it doesn't know the type.
Other thoughts?
AUDIENCE: Maybe the first four for the actual number and the last four
heck-- even our phones have a lot more memory than they did years ago.
been 32 bits, or even only eight bits way back in the day.
It's considered 32 bits, because that was the norm for some time.
and I keep saying it's 2 billion if you do negative, but in the world of memory
because for a very long time that was the maximum amount of memory
Why?
And with 32 bits, depending on whether you allow for negatives or not,
but you know what-- your Mac, your PC, your phone
could not have had five gigabytes of memory, or 5 billion bytes of memory.
You certainly couldn't have had what computers nowadays come with,
16 gigabytes of memory.
Why?
can't count that high, which means if I drew a picture of all of the memory we
would run out of numbers to describe them, which means most of my memory
Generally, no.
you just draw an arrow from the pointer to the value in question,
because neither you nor I probably care about the specifics of 0x whatever.
There's your pointer-- it's literally an arrow, and we can see this.
are not that dissimilar to what we've done for hundreds of years
Well, you store in a computer's memory values like the number 50,
see OK, the value inside of this mailbox is not a number like 50,
Oh yes, please.
Sure.
AUDIENCE: Anfoo.
AUDIENCE: Anfoo.
OK, come on up to the edge of the stage there and just to be clear--
Thank you.
All right.
All right.
Thank you.
because it can get very abstract, very cryptic quickly when we're
talking about addresses and memory and drawing it like these little squares.
But if you think about just walking into a post office or an apartment
not just int but int star, that tells the computer how
Yeah, in back.
Once more?
So this is OK, and I can't draw it quite quickly enough on the board here,
but this would be like creating another four bytes somewhere in memory, maybe
because the assignment operator from right to left copies one value
to another.
OK, so that is week one style use of assignment operators before pointers.
And to repeat for the camera, if I create a second variable like this,
that this gives me another rectangle, the value of which is also 50,
A good question.
Well, recall that we talked quite a bit last week about strings, and just
to recap in layperson's terms, what is this string as you now understand it?
What's a string?
An array of characters.
paragraphs-- as strings.
The CS50 library implemented in the form of the header file cs50.h--
Let's consider what this looks like now in the computer's memory.
I don't care about all the other bytes, let's just focus on these,
just means eight zero bits to demarcate the end of that string
you can then very cleverly use that square bracket notation
Now, I don't care about 123 per se, but even though this is hexadecimal,
Even in hex, if you just add one when you start at 0x123,
laying out the word hi in memory like that, well, what exactly is s?
Where is s?
and showed you where H-I exclamation point null actually are,
what happened to s?
I actually don't really care about these addresses, so let's abstract that away.
to a character.
Specifically, the first character in s.
Last week we had this clever way of demarcating the end of a string.
Well, it turns out that strings are represented in the computer's memory
that's literally all you need to figure out where a string begins and ends.
In terms of this picture, if I've started with this line of code here,
it turns out all this time since week 1, that the word string has just
of the first character in a string, then probably a string is just a char star
Now, the string might have three letters like it did, or four, or even a hundred
Let me go back to my code here, and let's get rid of this integer stuff,
Let me add in the CS50 library, so we'll include cs50.h for now.
So this again is week 1 style stuff where I'm just printing a string.
No pointers yet.
Let me first of all, get rid of the CS50 library for a moment
is to say char, a space, then the star, and then immediately thereafter
So %s is a thing that comes with printf because the word string is programmer
even though this is not purposeful other than for the sake of discussion.
But if s is this-- let me go back and give myself the CS50 library.
and let me store the first character in the string there, which is
And then just for kicks, let me go ahead and do char star--
is the same thing as s bracket I, then I'm saying, what's the address of c,
Instead of just printing p, let me go ahead and print out maybe s itself.
Why are we seeing different addresses, namely this address 402004 for s,
Any thoughts?
Now for the first time all we're doing is actually just poking around
All right, so now let me go ahead and print out the value of this pointer,
let me print out not only what p is, but also what s itself originally is.
Because if I claim that everyone from last week should be comfortable with
Thank you.
Now this isn't to say that we would jump through these hoops
all the time with this much syntax, but this is just to do proof by example
but the key thing is it's the address of the first character in the string,
AUDIENCE: [INAUDIBLE]
I'm passing it s.
Previously, when we used %s, printf knew to print not just the first character
of s, but h, i, exclamation point, and then stop when it hits the backslash
zero.
here and let me just print out a few things to make the same point.
I'm going to print out not just s like I did here, but let's go ahead
So let me print out the address of the first character, the second character,
In my diagram a moment ago, my addresses were arbitrarily 0x123, 124, 125, 126.
ampersand does and what the star does, is I'm just playing around.
Yeah.
AUDIENCE: [INAUDIBLE]
There's actually only going to be one copy of the word "hi" and exclamation
All right, so a couple final details then, on what's been going on here.
As of this week, we can now start writing this code because char star
But it turns out you've seen a way of inventing your own data types.
We played around last time with data structures, or the struct keyword in C,
and briefly the typedef keyword, which defines a type for you.
Now even though the syntax is a little different today because of the star
But the star, the char star, is just too much in week 1.
Yeah.
AUDIENCE: [INAUDIBLE]
If that is-- is that why when you compare two strings as I briefly
accidentally would have compared two addresses in memory, not the strings
at those addresses.
All right, well, before we give ourselves maybe a 10 minute break here,
in an interesting way.
if you will, and actually really use our memory in the most versatile way
whittle it down to just a hi initially, and see what's going on again, here
that you've been using for some time are actually a little special.
which means the compiler will do all the work of figuring out
or even stars explicitly, but the star at least has been there because again,
1 style here for a moment, and let's go ahead and print out a few characters.
So I'm going to use %c this time, and I'm going to print out s bracket zero
and then I'm going to print out s bracket one and s bracket two,
I could print out one more location, and let me go ahead and recompile,
make address ./address and there is, it would seem, the backslash zero.
I'm not seeing zero because I didn't type literally the zero char in ASCII,
it's apparently all eight zero bits, but they are there
and let me go ahead and get rid of the CS50 library and get rid of,
therefore, the word string because again, henceforth it's just char star.
And now, let's just focus on the hi rather than even worry about that.
So I'm going to recompile one last time and now I have h-i exclamation point.
Well, it turns out that the array notation we used last week
but we can see more explicitly today what the square brackets for a string
is actually doing.
zero, but I want to print out whatever the first character of s is.
the computer will see that there's a backslash n at the end of it.
s, because recall that star is the dereference operator when you don't
repeat the word char, you don't repeat the word int--
know that the h comes first, then the i right after it,
So this now is equivalent to what we did last week using square bracket
notation, but now I'm reimplementing that same idea with this lower level
I think the more common syntax would be what we did last week--
Why?
All right, let me pause here, see if there's any questions on that one.
and let me give myself an array of numbers like I did last week.
I can do like 4, 6, 8, 2, 7, 5, 0.
So it turns out I can print each of these numbers in the familiar way.
and let me just do some quick copy/paste just to print the first three of these.
and instead of printing numbers bracket zero like I might have last week,
so asterisk numbers.
4, 6, 8, 2, 7, 5, 0--
is an int.
Why?
Notice that I did not do plus 4, plus 8, plus 12, plus 16, plus 20.
The compiler is smart enough to know that if you add 1 to this pointer,
after a number that are back to back to back but not one byte apart,
Which is only to say plus 1, plus 2, plus 3 works no matter the data type.
Why?
Because the compiler knows what type of data you're talking about.
and I claim that the compiler's smart enough to realize that oh,
if I have double quote hi, that means it's an array of h-i exclamation point,
It turns out that you can actually treat arrays as though the name of the array
Notice that strictly speaking on line five, there's no pointers going on.
that just generally allows you to treat one as though it is the other,
Are any questions then on this before we start to solve some bigger problems?
Yeah.
AUDIENCE: [INAUDIBLE]
If you go beyond the end of an array, you might get a segmentation fault.
It often depends on how far off the end of the array you actually go.
if you just poke a little too far, but if you go way too far
But we'll give you a tool today actually for detecting and solving
exactly that kind of situation.
Let me go ahead and create a program called compare.c, and in this program
not so much for string but so that I can actually use get_int still,
which is way easier than the way we'll see that C normally lets you get input.
not worrying about command line arguments today, and let me go ahead
and get an int i using get_int, and ask the human for the value of i,
then let me give myself an int j, ask the user for another int, calling it j,
and then let me go ahead and kind of naively, but to your point earlier,
and print out something like "same," backslash n, else let's go ahead
and print out "different" if they are not, in fact, the same.
So that would seem to be a program that compares the value of two integers.
let's move away from integers and let's actually change these things to char--
to strings.
If you've used s for string already you can use t for the next one, at least
I'm going to compare the two, just like I did for ints, which worked great.
oh, sorry.
Let me run it again with hi, exclamation point and hi, exclamation point.
Yeah.
Yeah, this is where it's now useful to know that string has been
in t, respectively.
Why?
Because s might end up here in memory and t might end up here in memory.
That doesn't happen because we did not design get_string that way.
They might look the same to the human but to the computer
and trusting that we put a backslash zero at the end of whatever the human
typed in, and that's enough now for printf, for strlen, for you
but there are functions that can solve this comparison for us.
So you might have seen this error before and you might have ignored most of it,
Include string.h which, despite its name, does not create a data type
Now let's type in hi, exclamation point and even the same thing again.
Yeah.
Yeah.
AUDIENCE: [INAUDIBLE]
And indeed, not that it's returning same all the time.
Let me go ahead now and just reveal more pictorially what's going on.
Let's get rid of the string comparison and let's just print these things out.
The simple way to print this out would be with %s and again, %s is special--
printf knows--
I'm using char star, but I'm still using printf and %s in the same way.
Let me go ahead and run compare now, and if I type hi and hi,
So they look the same, but here now we have the syntax today
So make compare, ./compare, and now let's type in hi, and once more,
in hexadecimal.
claimed earlier that on modern systems, pointers are generally eight bytes
Yeah.
defined by the address of its first char and that address of its first char
What's been then copied from right to left using that assignment operator
Now technically, we don't really need to care about where these addresses are.
What ended up in t?
0x456, presumably.
for the same chars left and right, and if it doesn't notice any differences,
And that's very similar, recall, to how we implemented string length ourselves
last week.
str compare is probably a little similar in spirit, looping from left to right
and why it is that we use str compare and not equals equals?
Yeah.
Yes.
So we won't do that today, but I could actually use the ampersand operator
on s or on t.
There's star and there's star star, but yes, that is a thing
and it's very often useful in the context of two dimensional arrays,
which we haven't really talked about, but that is a feature of the language,
too.
Good question.
So let's include the CS50 library just so we have a way of getting user input.
And then in here, let's get a string from the user and just
And heck, we can actually just call this char star if we want,
or string, since we're using the CS50 library.
using a single assignment operator and then let's check something like this.
I might not remember this offhand, but it was in another header file
Now at the very last line of the program let's just print out what both s and t
are by simply printing out %s for each of them, and t is %s also, not %t,
Oh.
There we go. ./copy, and let's now type in, for instance,
Because notice that I got s from the user, so that checks out.
at least in my code--
So C is really literal.
If you create another variable called t and you assign it the value of s,
Nowhere did I tell the computer to give me a copy of a h-i exclamation point
and I call it a string, a.k.a. char star, maybe it again ends up here.
But when I copy s into t by doing t equals s semicolon,
that literally just copies s into t, which puts the value 0x123 there.
So if we now abstract away all these numbers and just think about a picture
Two different pointers but storing the same address, which means
And so if you follow the t breadcrumb and capitalize the first letter,
Thoughts?
The catch with str copy is that you have to tell it in advance not only
what the source string is-- the one you want to copy--
into which you can copy the string, and here's one thing we haven't seen yet,
And for this, we're going to introduce something called dynamic memory
allocation.
And this is the last and most powerful feature perhaps, today,
whereby we're going to introduce two functions, malloc and free, where
It's a function that takes a number as input-- how many bytes of memory
do you want the operating system to find for you somewhere in that big grid?
to you the address of the first byte of contiguous memory back to back to back,
and then you can do anything you want with that chunk of memory.
When you're done using a chunk of memory that malloc has given you,
you can say free it, and that means you hand it back to the operating system
and then the operating system can use it for something else later.
If your Mac or your PC has ever been in the habit of starting to get really,
If the program has a bug and never actually frees any of that memory,
and honestly, humans are not very good at handling corner cases like that.
in concert and free memory once you are done with it.
So let me go ahead and do this in code and solve this problem properly.
So to do that, let's make this super clear that we're dealing with pointer,
Good!
Four!
Because I need the h, the i, the exclamation point, and additional space
point or any other three letter word or phrase, but to do this dynamically
it returns the length of the string you see, plus 1 also takes into account
So for int i equals 0, i is less than the string length of s, i plus plus.
I think this for loop will actually handle the process, then,
Or I could get rid of all of that and take your suggestion, which
is to use str copy, which takes as its first argument the destination
So copy from right to left in this case, too, that's going to do all of that
than s, and then I can print them both out to see that one has not changed
What is-- even if you don't know quite how to solve it,
And I could look this up in the manual, or I know it off the top of my head,
All right, let me just clear this away and do make copy one more time.
Now I'm good. ./copy, Enter, All right. s, I'm going to type in hi, lowercase.
Yeah.
[INAUDIBLE]
was-- the computer remembers how many bytes it gave me and it will go free
I should do free t.
then I should just return 1 or something to say that there was a problem.
If I'm doing t bracket zero, this is assuming that there is a letter there.
I can return zero, thereby signifying that indeed, this thing was successful.
But you did not call malloc for s, so you should not call free for s.
plus 1-- the string length is the literal length of the string as a human
but I know now as of last week and this week what a string technically is
that lesson learned so that I actually give str copy enough room for that
And here's just an annoying thing when we called the backslash zero N-U-L last
So long story short, you never really write N-U-L, I've just said it
You will start writing N-U-L-L when you want to check whether or not a pointer
is valid or not.
If malloc fails and there's just not enough memory left inside
AUDIENCE: [INAUDIBLE]
and then str copy fills it with h-i exclamation point backslash zero.
0x456, 457, 458, 459.
I just get a chunk of memory that is now mine to use as I see fit.
I then assign t to that return value, which points t at the first address.
What str copy eventually did for me was it copied the h over,
the i over, the exclamation point over, and the backslash zero.
And if I didn't want to use str copy or I forgot that it existed, my for loop
Any questions?
Yeah.
AUDIENCE: [INAUDIBLE]
So then I would have been left with a picture that looked like this a few
would now be pointing over here and so I wouldn't have fundamentally solved
Yeah.
AUDIENCE: [INAUDIBLE]
Not necessarily.
malloc's giving me enough memory to make a copy, str copy is doing the copy.
and you could use str copy on that, and there's other use cases for str copy.
you use malloc and then str copy, or your own homegrown loop.
Yeah.
AUDIENCE: [INAUDIBLE]
AUDIENCE: [INAUDIBLE]
If I had a--
str copy, per its documentation, will copy the whole string
It's therefore up to you to pass str copy a long enough chunk of memory
whereby str copy would just still blindly copy one, two, three,
four bytes, but technically it should have only touched three of those.
You do not yet have access to the fourth one, or the rights to it,
Yeah.
Not just using printf, not just using the built-in debugger, but another tool
here as well.
Let me include stdio.h at the top and let me include stdlib.h at the top
Why?
So I'm going to go ahead and do malloc of three, but I don't want three bytes.
so the better way to do this would be 3 times whatever the size is of an int.
And this is just an operator you can use any time if you just want to find out
The right hand side here gives me a chunk of memory for three integers.
I need a pointer.
I can go into maybe the first location and assign it the number 72
Second location, the number 73, and the third location, maybe the number 33.
as zero indexed.
and use pointer arithmetic, but this is a little more user friendly.
Line eight should have been x bracket 1, and then line nine
Make memory, and you'll see here that it compiles OK. ./memory,
and I'm going to go ahead and expand my terminal window for a moment
and I'm going to run not just ./memory, but a program called Valgrind on it: valgrind ./memory.
So again, this takes some practice to get used to, the nomenclature here,
that not only did I screw up, but I screwed up related to memory
It's not going to necessarily tell you exactly how to fix it,
So let me go ahead then and change this to zero, one, and two, perhaps, here.
buggy.
The blocks of memory are lost in the sense that I malloc'd them--
Once I'm done with this memory I just need to free it at the end.
it still runs fine so all the while I might have thought, incorrectly,
my code is correct.
And even though it's still a little cryptic, there's no other error here
and in fact, it's pretty explicit-- error summary, zero errors from zero
So even though this is one of the most arcane tools we'll use,
it's also one of the most powerful because it can see things
that you, the human, might not, and maybe even that the debugger might not.
And we'll guide you after today with actually using this, too.
Then let me go ahead and do for int i equals zero, i less than 3,
That's it.
This code, pretty sure is going to compile and it's going to run,
I've forgotten a step even though the code that's written is not so wrong.
Yeah?
doesn't mind.
When you, the programmer, do not initialize your code's variables to have
there's a bit of work that happens even before your code runs in the computer,
and all of a sudden users, maybe people on the internet in the context of web
or remnants.
So this is to say again, you have this great power now to manipulate memory,
but also now you have this great hacking ability to poke around
the contents of memory, and this is exactly what hackers sometimes do when
No?
All right, let's go ahead and take a quick five minute break
We are back.
First, just a little programmer humor from XKCD, which hopefully now
And what we'll also do next is take a look at a short two minute video that
it's another if you actually mistake a garbage value for a valid pointer,
because garbage values are just zeros and ones somewhere-- numbers, that is.
Oh, goody!
BINKY: OK, this code allocates two pointers which can point to integers.
SPEAKER 1: OK.
Well, I see the two pointers, but they don't seem to be pointing to anything.
The things they point to are called pointees, and setting them up
is a separate step.
I knew that.
So make it do something.
That great.
There it goes.
Hey, try using it to store the number 13 through the other pointer, y.
BINKY: OK.
I'll just go over here to y and get the number 13 set up,
whoa!
And because the pointers are sharing that one pointee, they both see the 13.
Whatever.
BINKY: But--
These first couple of lines were not bad, and notice that in Stanford's code
That's fine.
and not assign them a value initially so long as you eventually do.
which means go to the memory location in x and store the number 42 there.
Yeah.
which means go to some random address that you have no control over,
and boom--
that might cause what we've seen in the past, perhaps as a segmentation fault.
that if you don't quite have the eye for it yet, Valgrind, that new tool,
could help you find as well.
But it's just another example of again, the sort of upside and downside
All right.
We had all of our volunteers come up, we had to swap a lot of things
we wanted to swap some values like int A and int B, for instance, here.
Void because I'm not going to return a value, but I have a function
called swap.
being asked to help with your-- oh, I'll go with the friend, pointing.
no?
Come on down.
That backfired.
Come on over.
AUDIENCE: Marina.
So here we have for Marina two glasses of liquid, orange and purple,
it's just to swap two values, as though these two glasses represented
AUDIENCE: [INAUDIBLE]
with how you would do this without having an extra cup, so good foresight
here.
Let me go ahead and we do have a temporary variable, if you will.
So if I hand you this, how would you now solve this problem?
Oh.
Well, OK.
OK.
Sure, go ahead.
So how would you swap these now, using this temporary variable?
OK, good.
then you copied the purple into where the orange was,
But a round of applause if we could, and thank you for doing that so well.
have to kind of stay put, even though we're physically lifting them,
and you just have a temporary variable into which you copy A,
me whip up something really quickly here with, how about include stdio.h,
int main(void).
Then let me call a swap function that we'll invent in just a moment.
Swap x and y. And then let me print out again x is %i, y is %i backslash n,
just to print out again what they are, because presumably I should see 1,
And I'll just copy/paste that up here, and now let's go ahead and run this.
x is now 1, y is 2, x is 1, y is 2.
So there seems to be a bit of a bug here, but why might this be?
This code does not in fact work, even though it obviously works in reality.
Yeah?
and in fact what happens when you call a function like this on line 11,
by value, so to speak.
Just because you use the same names in one function as you do elsewhere,
But indeed, swap is going to get copies of this x and y, and in this context,
just to make super clear that they're indeed different, albeit copies,
And let me do that same thing at the beginning of this function before it
Initially, x is 1, y is 2, A is 1, B is 2, A is 2, B is 1,
There's something about using one function versus another that's actually
The fact that I'm passing in copies of these values is creating this problem.
Well again, inside of your computer's memory there are these little chips,
It's not just random, where it just puts stuff wherever is available,
And you have control over a lot of it, but the computer uses some of it
for itself.
and consider that within your computer's memory, what a computer will typically
So when you compile a program and then you run it with ./whatever,
or on a Mac or PC you double click on it, the computer first--
the operating system first-- loads all of your program's zeros and ones, a.k.a.
machine code, into just one big chunk of memory at the top, so to speak.
you have created in your program that are outside of main and outside
of any functions.
Then there's this chunk of memory that's generally known as the heap--
and then there's this other chunk of memory called the stack.
And it turns out that up until this week you were using the stack heavily.
Any time you use local variables in a function they end up on the stack.
Any time you use malloc, that memory ends up on the heap.
like a problem waiting to happen because if you use more and more and more
like two things barreling down the tracks at one another-- this does not
end well.
If you've ever heard the phrase stack overflow, or use the website,
Or if you use malloc a lot and keep calling malloc, malloc, malloc,
and never really, or rarely calling free, you just use more and more memory
and eventually these two things might overflow each other, at which point
And so here we have a whole bunch of wooden blocks and each of these squares
We can visualize it like this-- when you run ./swap or any program for that
and so I'm just going to label this bottom row of memory as main.
And what were the two variables I had in main called in this code?
Yeah.
x and y.
So let me just call this x, and I'm just going to write the number 1 in this box
here.
And then I had my other variable y, and I'm going to put the number 2 there.
What happens when main calls swap like it does in this code here?
that are passed in, so I'm going to call this tmp, tmp over here.
tmp initially gets the value of A. All right, the value of a was 1,
OK, A equals B. So that is assigned from the right to the left of the B
swap doesn't actually do anything with the result, and the problem in C
The A or the B?
main is using this chunk of memory at the bottom in the so-called stack,
What happens when the function returns, whether it's void or not?
are still there in the computer's memory but they no longer belong to us
of why there's other stuff in memory even though you didn't put it there,
necessarily.
What if we pass in not x and y, but the address of x and the address of y,
Well, if I kind of rewind in this story and I go back here, I still have tmp,
although I'm going to delete its value to begin with, I still have B
with this new and improved version, and let's see what happens.
What goes in B?
Well, I'm going to put 0x456, and then what am I going to do?
Which is to say what came as naturally as the real world here for Marina
You can pass in values but you get copies of those values.
If you want one function to affect the value of a variable somewhere else,
How do I now call swap if swap is expecting an int star and an int star?
That is, the address of an int and the address of another int.
Yeah.
AUDIENCE: [INAUDIBLE]
and as soon as we have an address, just like when I held up the fuzzy finger--
the foamy finger-- I can point at that address, I can go to that address
and actually get a value from the mailbox or put a value into the mailbox
if I even want.
Thank you.
Thank you.
And to illustrate this, let me go ahead and open up one other file here.
and without using the CS50 library at all for strings or for any of those
get functions.
Let me just print out what the value of x is, even though it's going to be a--
or rather, ask the user for the value by asking them for x.
And I'm going to use a function called scanf that's going to scan
in an integer using %i, and I'm going to store whatever the human types
in at this location.
And then I'm going to go ahead and, just so we can see what happened,
I'm going to print out with %i whatever the human typed in as follows.
So the curiosity today is this new line. scanf is another function in stdio.h,
want to scan in, that is, read from the human's keyboard--
and I'm telling it where to put whatever the human typed in.
I can't just say x, because we run into the same darn problem as with swap.
but that's the cryptic syntax we would have had to show you in week 1.
Oh my God.
Strike two.
OK.
Make scanf.
There we go.
OK, so scanf--
I'm going to type in a number like 50 and it just prints it back out.
The problem, though, is when you start to get into strings, things
Then let me go ahead and just prompt the user for a string, using just printf.
Then let me go ahead and use scanf, ask them for a string this time with %s,
Then let me go ahead and print out whatever the human typed
So here, line five is the same thing as string s, but we've taken back
scanf will also read from the human's keyboard a string and store it at s.
if I do make scanf--
Let me now run scanf of this version, Enter, and let me type in something
What if I type in not just something like hello, which also doesn't work.
Enter.
It's still not working, but what's the essence of why this isn't working,
Yeah.
So let me give them four characters, and let me go ahead and do make scanf--
whoops.
nope.
Dammit.
That gives me malloc, now I'm going to recompile this with clang,
now I'm going to rerun it, and now I'm going to type in my first thing, hi.
And let me get a little aggressive now and type in hello, which is too long.
Sort of.
I could actually just say char star four and give myself
there we go.
you would have had to use the scanf thing-- not a huge deal
But if we hadn't given you GetString you would have had to do stuff like this,
If the human types in five letters, six letters, 100 letters-- this code,
like with the Hello input, will probably just crash, which is bad.
that we allocate using malloc as many bytes as you physically type in,
The moment you type in h-e-l-l-o, we're laying the tracks as we go and we keep
allocating more and more memory so that we theoretically will never crash with
You're going to see a function called fopen, which stands for file open,
and it takes two arguments-- the name of a file to open like a CSV
that you might manipulate in Excel or Google Spreadsheets or the like-- comma
separated values, and then something like A for append, R for read,
Or fwrite-- file write-- which now that you will begin to understand pointers,
text files, images, other things-- but also write them out.
In fact for instance, just as a teaser here, JPEGs will be one of the things
that every JPEG in the world starts with these three bytes, written
there we go.
So here we have the notion of a byte we're going to create for ourselves.
which reads from a file some number of bytes-- for instance, three bytes.
bracket 1 equals 0xD8 and bytes bracket 2 equals 0xFF, all three of those
bytes I just claimed represent a JPEG, you'll see an output like this.
Let me do make jpeg, and let me run jpeg on a file which is available online
Let me open it up for us, called lecture.jpeg, and here, for instance,
is that same photo with which we began class, namely implemented as a JPEG.
we might take images and actually run them through a program that
called BMP, which essentially lays out all of its pixels from left to right,
looks like this, which is just a whole bunch more values in it,
to give it some old school feel, or a reflection like this to invert it,
And let me show you this file here, helpers.c, in which there is a function
But the ones we give you for the piece that won't already be implemented,
have a loop like this that iterates over all of the pixels in an image from top
But why?
And if you don't happen to have in your dorm one of these secret decoder
getting rid of the green in the world and the blue in the world--
you can actually-- I'm actually probably the only one who
But if using my code written here in helpers.c I get rid of all the blue
in the picture and I get rid of all the green in the picture,
[MUSIC PLAYING]
This is CS50.
In fact, in just a few days' time, what has looked like this
and so forth.
But a lot of the low-level plumbing that you might have been wrestling with,
And indeed, we're going to start leveraging libraries, all the more code
And on top of all that, will you be able to make even better, grander, more
At the end of the day, it's just zeros and ones, or bytes, really.
And how you interconnect them, how you represent information on them.
Back-to-back, to back.
And then, today focus on what more generally are called data structures.
1, 2 and 3.
And suppose, whatever the context, you need to now add a fourth number
to this array.
Left to right.
SPEAKER 1: Sorry?
Over there?
SPEAKER 1: Yeah.
So, I mean, it feels like if there's some ordering
And if we fill that in, ideally, we'd want to just plop the number 4 here.
Like, if your program has not just a few integers in this array,
It could be that your computer has plopped the H-E-L-L-O W-O-R-L-D right
Why?
Or maybe just hard coded a string in your code for "Hello, world."
Now I think you might claim, well, let's just overwrite the H.
Because those garbage values are junk that we don't care about anymore.
So we could certainly reuse those.
OK.
6 or more integers.
And 1, 2, 3 ended up back-to-back with some other data that we care about.
Let's go ahead and assume that we'll abstract away everything else.
The 2 over.
The 3 over.
All right.
So problem solved.
And this is something we're going to start thinking about all the more.
What's the downside of having solved this problem in this way?
Yeah.
Where you have not just a few, but maybe a few hundred, few thousand,
And it would feel like you're wasting all of this time moving stuff around.
In fact, if we put this now into the context of our Big O notation
from a few weeks back, what might the running time now of Search
be for an array?
Yeah.
AUDIENCE: Big O of n.
SPEAKER 1: Big O of n.
AUDIENCE: [INAUDIBLE].
SPEAKER 1: OK.
Yeah.
What would the Big O notation be for Searching an array in this case,
AUDIENCE: Big O of n.
AUDIENCE: Log n.
Because we could use per week zero binary search on an array like this,
Big O of n.
Why?
you have to copy over all the darn existing numbers into the new one.
But we can throw away the plus 1 because of the math we did in the past.
that could start to add up, and add up, and add up.
And this is why computer programs, and websites, and mobile apps
could be slow.
Because you might just get lucky, and boom the number
you're looking for is right there in the middle, if using binary search.
And insert 2.
If there's enough room, and we didn't have to move all of those numbers--
If we do get lucky, it might just take the one, or constant number, of steps.
Then inside of my code here, I'm going to go ahead and give myself
All right.
for int i gets 0.
i less than 3, i++.
Now let's start to practice some of what we're preaching with this new syntax.
And let me zoom out a little bit to give ourselves some more space.
Let me give myself a list that's of type int* equal the return value of malloc
enough memory for that very first picture we drew on the board.
I'm going to use malloc and get memory from the so-called "heap", as we
Instead of using the stack by just doing the previous version where I said,
int list 3.
That is to say this line of code from the first version is in some sense
And that's important because it was only on the heap and via this new function
That you can actually ask for more memory, and even give it back.
Recall that there's this relationship between chunks of memory and arrays.
And arrays are really just doing pointer arithmetic for you,
So if I've asked myself here, in line 5, for enough memory for 3 integers,
and find the first location, the second, and the third.
All right.
So let's go on.
Now, obviously, I could just rewind and like fix the program.
reason.
But let me ask first, what's really wrong, first, with this code?
The goal at hand is to start with the array of size 3 with the 1, 2, 3.
So at the moment, in line 17, I've asked the computer for a chunk of 4 integers.
Yeah.
SPEAKER 1: Yeah.
I don't necessarily know where this is going to end up in memory.
And so, yes, even though I'm putting the number 4 there,
If you think of the picture that I drew earlier, the line of code
Point at this chunk of memory, at which point I've forgotten if you will,
So the right way to do something like this, would be a little more involved.
So that I can now ask the computer for a completely different chunk of memory
of size 4.
That would seem to have the effect of copying all of the memory from one
to the other.
Again, I'm hard coding the numbers for the sake of discussion.
for int i gets 0.
Let me go ahead and print each of these elements out with %i using list [i].
And then, I'm going to return 0 just to signify that all is successful.
Time passes.
Just as a safety check, I make sure that tmp doesn't equal null.
going to copy all the values from the old list into the new list.
And then, I'm going to add my new number at the end of that list.
And then, now that I'm done playing around with this temporary variable,
AUDIENCE: Library.
SPEAKER 1: Yeah.
A library.
There we go.
And we could see this, even if not just with our own eyes or intuition.
If I do something like valgrind of ./list,
But, notice that I have definitely lost some number of bytes here.
And indeed, I think what I need to do is, before I clobber the value of list
So if I now do make list and do ./list, the output is still the same.
So better.
And after I'm done senselessly just printing this thing out,
So this is perhaps the best output you can see from a tool like Valgrind.
All right.
OK.
And so, when I finally freed the list, that was the same thing as freeing tmp.
In fact, if I wanted to, I could say free tmp here and it would be the same.
Because at this point in the story, I should be freeing the actual list, not
Yeah.
in version 1, when I said int list [3], that was an array of fixed size.
This version now is still dealing with arrays, but I'm flexing my muscles
So we haven't even now solved this, even better in a sense, with linked lists.
Yeah.
AUDIENCE: How are you able to free list and then still make list?
And tmp is what was given the return value of malloc, the second time.
freeing the chunk of memory that begins at the address currently in list.
Totally reasonable to then touch that memory, and eventually free it later.
Because you're not freeing the variable per se,
Good distinction.
All right.
And there's another function in stdlib that's called realloc, for re-allocate.
just so we can keep track of what's been going on this whole time.
Time passes.
And we'll post this code online after, too, which tells a more explicit story.
So it turns out that we can reduce some of the labor involved with this.
Not so much with the printing here, but with this copying.
that can actually handle the resizing of an array for you, as follows.
I'm going to scroll up to where I previously
And I'm instead going to say this, resize old array to be of size 4.
And pass in not just the size of memory we want this time,
All right.
And someone's instinct was to just plop the 4 right at the end of the list.
And boom, it will just grow the array for you in the computer's memory.
If, though, it realizes, sorry, there's already a string like "Hello, world"
And then realloc will return to you, the address of that new chunk of memory.
And it will handle the process of freeing the old chunk for you.
So realloc just condenses, a lot of what we just did, into a single function.
And our apps, and our phones, and our computers are getting really slow,
Which lets you say a person has a name and a number, or something like that.
You can encapsulate multiple pieces of data inside of just one using struct.
What did we use the Dot Notation for now, a couple of times?
SPEAKER 1: Perfect.
you can actually start to now use your computer's memory almost any way
you want.
syntactic sugar.
We could just copy the whole existing array to a new location, add the 4,
know the buzz phrase we're looking for from past experience, hang in there.
Yeah.
Like an arrow, that says after the 3, oh I don't know, go down over here
At the end of the day, I as the programmer, just care about the data--
1, 2, 3, 4, and so forth.
I don't care how it's stored when I'm writing the code,
All right.
but there's plenty of other options for where this thing can go.
The point being, I don't know what is, or really care about,
But the catch is, now that we're not using an array,
we can't just naively assume that you just add 1 to an index and boom,
you're at the next number.
Now you have to leave these little breadcrumbs, or use the arrow notation,
All right.
OK, yeah.
So let me, you put the pointers right next to these numbers.
So let me at least plan ahead, so that when I ask the computer like malloc,
recall from last week, for some memory, I don't just ask it now
malloc for enough space for the number and a pointer to another such number.
Almost any time in CS, when you start using more space, you can save time.
But I'm going to use this second chunk of memory to refer to the next number.
And I'm going to use this chunk of memory to refer to the next,
But this is the equivalent of drawing an arrow from one to the other.
AUDIENCE: 0x789.
So 0x789, indeed.
And you can't do that with the hands because I can't count that fast.
So 0x789 should go here because that's like a little breadcrumb to the next.
Because at the end of the day, it's got to use its 64 bits in some way.
AUDIENCE: 0.
If any of you have accidentally lost control over your code space
because you had an infinite loop, this would seem a very easy way
AUDIENCE: Null.
decided that if you store the address 0, that's not a valid address.
remember where did the list start so that you can detect cycles.
All right.
But these addresses, who really cares at the end of the day
And indeed, this is how most anyone would draw this on a whiteboard
any questions on this idea of creating a linked list in memory by just storing,
Any questions?
No?
All right.
Oh, yeah.
Over here.
So I abstracted it away.
AUDIENCE: [INAUDIBLE].
How does the computer identify useful data from used data?
The other type of memory you use, not just from the heap.
from the heap, which was drawn at the top of the picture, pointing down.
There's also stack memory, which is where all of your local variables go.
And that was drawn in the picture as working its way up.
keep track of which values are valid or not inside of the stack.
All right.
Good question.
How could we implement this idea of, let's call these things nodes.
Whenever you have some data structure that encapsulates information, node,
We use typedef, which defines the name person to be our new data
type representing that whole structure.
do we think?
Yeah?
AUDIENCE: [? Data. ?]
Let me propose that, ideally, we would say something like node* next.
Next just means what comes after me is the notion I'm using it at.
is that you can temporarily name this whole thing up here, struct node.
And then, down here inside of the data structure, you say struct node*.
teaching the compiler, from the first line, that here comes
Down here, you're shortening the name of this whole thing to just node.
Why?
It's just a little more convenient than having to write struct everywhere.
But you do have to write struct node* inside of the data structure.
All right.
Yeah, question.
to another [INAUDIBLE].
SPEAKER 1: Why is the next variable a struct node* pointer and not an int
in turn, needs to know that this chunk of memory is not just an int.
It is a whole node.
Good question.
Yeah.
Here, too, it looks like we're using twice as much memory, also.
And to my comment earlier, it's even more than twice as much memory
because these pointers are 8 bytes, and not just 4 bytes like a typical integer
is.
So you weren't consuming long-term, more memory than you might need.
And even though I might still have to follow all of these arrows, which
not going to have to be asking for more memory, freeing more memory.
And certain operations in the computer, anything involving asking for or giving
But we'll see in a bit just what some of those trade offs actually are.
All right.
let's start to now build up a linked list with some actual code.
All right.
an equal sign.
Why?
for me.
AUDIENCE: Null.
AUDIENCE: To null.
Invalid, yes.
even before you've inserted any numbers into the thing itself.
All right.
So after that, how can we go about adding something to this linked list?
Just because it's nice and clean, and this represents an empty linked list.
Well, if I want to add the number 1 to this linked list, what could I do?
Let's ask malloc for enough space for the size of a node.
And this gets to your question earlier, like, what is it I'm manipulating here?
I don't just need space for an int and I don't just need space for a pointer.
So size of node figures out and does the arithmetic for me.
Because just like last week when I asked malloc for enough space for an int
and I stored it in an int* pointer.
All right.
to null.
It's garbage values because malloc does not magically initialize memory for me.
But malloc alone just says, sure, use this chunk of memory.
Well, suppose I want to insert the number 1 and then, leave it at that.
And this is where you have to think back to some of these basics.
But does someone want to take a stab at translating this inside line of code
How might you explain what that inner line of code is doing? (*n).number equals 1.
Nope?
Yeah.
AUDIENCE: [INAUDIBLE].
SPEAKER 1: Perfect.
haven't needed to do before because we haven't dealt with pointers and data
All right.
It's ugly.
Thankfully, there's that syntactic sugar that simplifies this line of code
to just this.
In this case, to 1.
This just looks more like the artist's renditions we've been talking about.
And how most CS people would think about pointers as really just being arrows
in some form.
All right.
The picture now, after setting number to 1, looks a little something like this.
to say if n does not equal null, then set n's next field to null.
and then update the next field that you find there to equal null.
this to null if we're going to keep adding more and more numbers to it.
If the goal, ultimately, was to insert the number 1 into my linked list,
Yeah.
SPEAKER 1: Yes.
I just needed this to get it back from malloc and clean things up, initially.
list equals n.
n is the address of the beginning and, it turns out, the end of our linked list.
initialize the 2 fields inside of it, update the linked list, and boom,
And we'll see before long it all in context with some larger code.
SPEAKER 1: Yes.
All right.
that allows us to build up a linked list now using code similar in spirit
to before.
I'm going to go ahead now and delete the entirety of this old version that
And now, inside of my main function, I'm going to go ahead and first do this.
But I'm also now going to have to take the additional step of defining
And I'm going to call this whole thing, more succinctly, node,
Now as an aside, for those of you wondering what the difference really
Not use typedef and not use the word node alone.
Struct node.
as a node.
again, lets us invent our own word that's even more succinct.
And this just has the effect now of calling this whole thing
node without the need, subsequently, to keep saying struct all over the place.
Just FYI.
All right.
So now that this thing exists in main, let's go ahead and do this.
So I'm going to assume we can keep going with some of the logic here.
If n does not equal null, and that is it's a valid memory address,
n->number equals 1.
constructed what was that first picture, which looks like this.
This is the corresponding code via which we built up this node in memory.
So this time, I'm just going to say n equals malloc and the size of a node.
So if n equals equals null, then let's just quit out of this altogether.
So I think it suffices to free what is now called list, way at the top.
All right.
Now, if all was well, though, let's go ahead and say n->number equals 2.
And store the address of what was n, which a moment ago looked like this.
And I'm just throwing away, in the picture, the temporary variable.
All right.
Let me go down here and say, add a number to list, n equals malloc.
Size of node.
All right.
And let me get rid of the highlighting just so it's a little more visible.
And again, it's not that I'm freeing those variables per se.
Can I go to you?
That's OK.
Which node?
Why?
You should never touch memory that you have already freed.
And so, the fact that I did in this order, very bad.
And then, literally one line later, you're saying, wait a minute.
But that's the kind of thing one needs to be careful about when
And that has the effect now of building up in the computer's memory,
Very manually.
Very pedantically.
But, for now, we're doing it just to play around with the syntax.
when it's you, who are stitching together the data structure in memory.
been trusting that all of the bytes in the array are back, to back, to back.
to just figure out, oh, well if you want [0], that's at the beginning.
Because even though you might want to go to the first element in the linked
list, or the second, or the third, you can't just jump to those arithmetically
So with linked lists, you can't use this square bracket notation anymore
because one node might be here, over here, over here, over here.
And this might look scary at first, but it's just an application
I'm going to keep doing this, so long as tmp does not equal null.
But when I print something here with printf, I can still use %i.
But what I want to print out is the number in this temporary variable.
And what update do you want to make on every iteration of the loop?
Then, I'm asking the question, does tmp not equal null?
So what do I do?
That, then, has the result of being checked against this conditional.
Then one last time, I update tmp to be whatever tmp is in the next field.
is that tmp, follow the arrow in number, and I print that out.
So, again, admittedly much more cryptic than our familiar int i equals 0,
and so forth.
Yes.
AUDIENCE: How does it happen that you're always printing out the numbers.
How is it that I'm actually printing numbers and not printing out
addresses instead.
The compiler knows that a node has a number field and a next field
Doesn't matter where specifically in the rectangle I'm pointing per se.
And the fact that I, then, use tmp arrow number means, OK,
So you're literally pointing at the number field and not the next field.
Good question.
Yeah.
SPEAKER 1: OK.
You can only reach line 49, if n does not equal null.
So that's safe.
I was only doing that freeing, if I knew on line 45 that I'm out of here
Good question.
And, yeah.
Is tmp [INAUDIBLE].
You should only free addresses that were returned to you by malloc.
While the list itself is not null, so while there's a list to be freed.
2 slightly in the middle of the picture, then it is safe for me on line 61,
Now I'm going to say, all right, once I freed the first node in the list,
If you think about this picture, tmp is initially pointing at not the list,
Totally safe and reasonable to free now the list itself a.k.a.
That has the effect of just throwing away the number 1 node,
telling the computer you can reuse that memory for you.
The last line of code I wrote updated list to point at the number
more on that later, is an opportunity to play around with just this syntax.
At the very end of this thing, I'm going to return 0 as though all is well.
All right.
And again, we'll walk through this again in the coming week's spec.
Yeah.
AUDIENCE: Can you explain the while loop [INAUDIBLE] starts in other ways?
SPEAKER 1: Sure.
Can we explain this while loop here for freeing the list.
Here is list.
ago where that just meant free the whole darn list,
you now have taken over control over the computer's memory with a linked list,
The computer knew how to free the whole array because you
You are now mallocing the linked list one node at a time.
And the operating system does not keep track of for you
the value of the list variable, which is just this first node here.
Then my last line of code, which I'll flip back to in a second, updates
this new syntax of star notation, and the arrow notation, and the like,
It's just that someone else, the authors of Python for instance,
All right.
Absolutely.
Anything you can do with a while loop you can do with a for-loop,
But for-loops and while loops behave the same in this case.
SPEAKER 1: Sure.
Other questions?
All right, well let's just vary things a little bit here.
Indeed, we'll try to save some of that for problem set 5's exploration.
But instead, let's imagine that we want to create a list here of our own.
Come on up.
AUDIENCE: Pedro.
[AUDIENCE CLAPPING]
But you are a null pointer so just point sort of at the ground,
All right.
might look a little something like this for consistency with our past pictures.
Now suppose that we want to go ahead and malloc, oh, how about the number 2.
OK.
All right.
[AUDIENCE CLAPPING]
OK.
AUDIENCE: Caleb.
AUDIENCE: Caleb.
SPEAKER 1: Halen?
AUDIENCE: Caleb.
SPEAKER 1: Caleb.
Caleb, sorry.
All right.
And come on, let's say that there was room for Caleb like, right there.
That's perfect.
So now if we want to insert Caleb and the number 2 into this linked list,
So far, so good.
All is well.
Let's insert one more, if anyone really wants another foam finger.
Come on down.
AUDIENCE: Hannah.
SPEAKER 1: Hannah.
[AUDIENCE CLAPPING]
All right.
And Hannah, how about Hannah, suppose you ended up over there
All right.
So what should we now do, if the goal is to keep these things sorted?
How about?
AUDIENCE: No.
SPEAKER 1: No.
All right.
OK.
I would, it's just for you for now, so point at the ground representing null.
OK.
So, again demonstrating the fact that, unlike in past weeks where
AUDIENCE: Jonathan.
SPEAKER 1: Jonathan.
[AUDIENCE CLAPPING]
OK.
All right.
OK.
So pretty straightforward.
And here, we'll use a chance to, without the weeds of code,
Yes.
Yeah.
Come on down.
AUDIENCE: Lauren.
SPEAKER 1: Lauren.
OK.
[AUDIENCE CLAPPING]
OK.
And that was perfect that Pedro presumed to point immediately at Lauren.
Why?
You literally just orphaned all of these folks, all of these chunks of memory.
Why?
Because if Pedro was our only variable pointing at that chunk of memory,
I have no idea how to get back to Caleb, or Hannah, or anyone else on stage.
So that's good.
Why?
Good.
And if we had just done this line of code in red here, list equals n.
But if we think through it logically and do this, as Lauren did for us, instead,
we've now updated the list to look a little something more like this.
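A minimal sketch of that ordering, written in Python rather than the lecture's C. The names `Node`, `prepend`, and `to_list` are hypothetical, chosen only for illustration; the point is that the new node is aimed at the old head before the list variable is moved, so nothing gets orphaned.

```python
class Node:
    """Hypothetical stand-in for the C node struct shown in lecture."""
    def __init__(self, number):
        self.number = number
        self.next = None  # None plays the role of NULL


def prepend(head, number):
    n = Node(number)
    n.next = head  # first: the new node points at the current head
    return n       # then: the caller updates its list variable to n


def to_list(head):
    """Walk the chain and collect the numbers, for easy printing."""
    values = []
    while head is not None:
        values.append(head.number)
        head = head.next
    return values


head = None
for number in (5, 4, 2, 1):
    head = prepend(head, number)
print(to_list(head))
```

Doing `head = n` without first setting `n.next` would leave the old nodes unreachable, which is the orphaning described above.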
Yeah.
All right.
[AUDIENCE CLAPPING]
All right.
AUDIENCE: Miriam.
AUDIENCE: Miriam.
SPEAKER 1: Miriam.
All right.
If you want to go maybe in the middle of the stage in a random memory location.
So here, too, the goal is to maintain sorted order.
So let's ask the audience, who or what number should point at whom first here?
And if we do orphan memory, this is what's called, again per last week,
a memory leak.
Your Mac, your PC, your phone can start to slow down
if you keep asking for memory but never give it back or lose track of it.
Or what number?
Say again.
AUDIENCE: 3 to 4.
SPEAKER 1: Perfect.
Why?
AUDIENCE: 2 to 3.
SPEAKER 1: 2 to 3.
So, 2 to 3.
So Caleb, I think it's now safe for you to decouple.
We need the numbers back, but you can keep the foam fingers.
Thank you.
[AUDIENCE CLAPPING]
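Here is one way the "3 to 4 first, then 2 to 3" ordering might look in code. This is a sketch in Python, not the C used in the course, and `Node`, `insert_sorted`, and `to_list` are assumed names.

```python
class Node:
    def __init__(self, number, next=None):
        self.number = number
        self.next = next


def insert_sorted(head, number):
    n = Node(number)
    # Empty list, or a new smallest value: n becomes the new head.
    if head is None or number < head.number:
        n.next = head
        return n
    # Walk to the last node whose number is smaller than the new one.
    prev = head
    while prev.next is not None and prev.next.number < number:
        prev = prev.next
    n.next = prev.next  # e.g. 3 points at 4 first...
    prev.next = n       # ...and only then does 2 point at 3
    return head


def to_list(head):
    values = []
    while head is not None:
        values.append(head.number)
        head = head.next
    return values


head = None
for number in (2, 4, 5, 1, 3):
    head = insert_sorted(head, number)
print(to_list(head))
```

Swapping the two assignments near the end would overwrite the only pointer to 4 before the new node has copied it.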
So this is only to say that when you start looking at the code this week
But the ideas, again, really do bubble up to these higher-level descriptions.
And they just assume that, yeah, if we went back and looked
at our textbooks or class notes, we could figure out how to implement this.
Even so, this week we will get some practice with the actual code.
What might be now the running time of operations like searching and inserting
into a linked list?
as it's an array.
But as soon as we have a linked list, these arrows, like our volunteers,
You pretty much have to follow all of these breadcrumbs again and again.
Even though I keep drawing all these pictures with all of the numbers
exposed.
spot where the 1 is, where the 2 is, where the 3 is, the computer, again,
just like with our lockers and arrays, can only see one location at a time.
And the key thing with a linked list is that the only address
But without Pedro, we would have lost some of, or all of, the linked list.
And only once you hit null can you conclude, yep, it was there.
can only see the number 1, or the number 2, or the number 3, or the number 4,
Well, in the worst case, the number you might be looking for
And so, obviously, you're going to have to search all of the n elements.
can only figure that out by starting at the beginning and going there.
It's gone.
Linked lists, by design, only remember the next node in the list.
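To make the linear cost concrete, here is a small Python sketch (the lecture's own code is in C). `Node` and `search` are hypothetical names, and the step counter is added purely for illustration.

```python
class Node:
    def __init__(self, number, next=None):
        self.number = number
        self.next = next


def search(head, target):
    """Return (found, steps), following next pointers from the head."""
    steps = 0
    ptr = head
    while ptr is not None:
        steps += 1
        if ptr.number == target:
            return True, steps
        ptr = ptr.next
    return False, steps


# Build 1 -> 2 -> 3 -> 4 by prepending in reverse order.
head = None
for number in (4, 3, 2, 1):
    head = Node(number, head)

print(search(head, 4))
print(search(head, 9))
```

There is no way to jump to the middle: every search starts at the head and visits nodes one at a time, so the worst case grows with n.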
All right.
Someone else.
Someone else.
Yeah.
AUDIENCE: N squared.
SPEAKER 1: Say again?
AUDIENCE: N squared.
SPEAKER 1: N squared.
And I think we can stay under that, but not a bad thought.
Yeah.
AUDIENCE: Is it n?
SPEAKER 1: OK.
I could just keep inserting into the beginning, into the beginning,
the number of steps required to insert something between the first element
And now these are the kinds of decisions that we'll start to leave to you.
All right.
So that's a win.
But they still require Big O of n time to find the end of it,
if we care about order.
We're using at least twice as much memory for the darn pointer.
So can we do better?
Here's where we can now accelerate the story by just stipulating that, hey,
using pointers.
If you've ever seen or drawn a family tree with grandparents, and parents,
Notice this.
And recall what was nice about an array if, one, it's sorted.
that's what?
0, 1, 2, 3.
simple arithmetic, I can very quickly, with a single line of code or math,
find for you the middle of the left half, of the left half,
And then, we went left or right again, implied by this color scheme here.
Because log of n was much better than n, certainly for large data sets, right.
And I preserve the color scheme, just so it's obvious what came where.
What do these things look like now?
So still integers.
Whereby, every node has not one pointer now, but as many as 2.
So all the same jargon you might use in the real world,
But this is interesting because I think we could build this now, this data
How?
And give ourselves a pointer called left and another one called right.
So same idea as before, but now we just make sure we think of these things
as pointing this way and this way, not just this way.
It would seem, just like Pedro was the beginning of our linked list,
You can retain and remember this entire tree just by pointing at the root node,
ultimately.
Well, if I look at the root node and the number I'm looking for is less than.
Notice this.
But it also works recursively for every sub tree, or branch of this tree.
And so, now, how many steps does it take to find in the worst case
So it seems 2, literally.
Log base 2 is the number of times you can divide something in half, and half,
Which means that even in the worst case, the number you're looking for may be
Doesn't matter.
It's going to take log base 2 of n steps, or log of n steps,
It's a tree.
But we've gained back binary search, which is pretty compelling, right.
But what price have we paid to retain binary search in this new world.
Yeah.
Not just 1.
SPEAKER 1: Exactly.
Every node now needs not one number, but 2, 3 pieces of data.
And you start using more space, you can speed up time.
Let me go ahead here and let me just open a program I wrote here in advance.
All right.
And as before, I've played around and I've inserted the numbers manually.
Here is my definition of a node for a binary search tree, copied and pasted
And then, here is me initializing this node to contain the number 2, first.
And then, initializing the tree itself to be equal to that particular node.
So at this point in the story, there's just one rectangle on the screen
All right.
point.
I'm going to not bother with my, let me do this, free memory here.
Just to be safe.
Do I want to do this?
I'm going to initialize the children of this node to null and null.
is my root node, the single rectangle I described a moment ago that currently
All right.
Except for the fact that I'm updating the tree's right child
But it takes a pointer to a root element as its sole argument, node* root.
AUDIENCE: Recursion.
SPEAKER 1: Yeah.
weeks ago.
So here is this leap of faith where I say, print my left tree, or my left sub
And because we have this base case that makes sure that if the root is null,
You're not going to call yourself again, and again, and again,
If you wanted to print the tree in reverse order, you could do that.
Then, yourself.
But you can also do it, even with this 2-dimensional structure.
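A rough Python sketch of that recursive traversal, assuming the three-node tree from the demo (2 at the root, 1 on the left, 3 on the right). The lecture's version is written in C; `Node`, `in_order`, and `reverse_order` are hypothetical names.

```python
class Node:
    def __init__(self, number):
        self.number = number
        self.left = None   # None stands in for NULL
        self.right = None


def in_order(root, out):
    if root is None:       # base case: nothing to print
        return
    in_order(root.left, out)     # left subtree first
    out.append(root.number)      # then yourself
    in_order(root.right, out)    # then the right subtree


def reverse_order(root, out):
    if root is None:
        return
    reverse_order(root.right, out)
    out.append(root.number)
    reverse_order(root.left, out)


tree = Node(2)
tree.left = Node(1)
tree.right = Node(3)

ascending, descending = [], []
in_order(tree, ascending)
reverse_order(tree, descending)
print(ascending, descending)
```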
Order doesn't matter in quite the same way, but it does still matter.
Well, if the root of the tree is null, there's obviously nothing to do.
Just return.
Otherwise, go ahead and free your left child and all of its descendants.
And again, free literally just frees the address in that variable.
Why did I free the left child and the right child
AUDIENCE: [INAUDIBLE].
SPEAKER 1: Exactly.
If you free yourself first, if I had done incorrectly this line higher up,
you're not allowed to touch the left child tree or the right child tree.
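Python has no free, since memory is reclaimed automatically, so the following is only a sketch of the visiting order, not of real deallocation. The hypothetical `release_order` function records the order in which a C version would free each node: children first, the node itself last.

```python
class Node:
    def __init__(self, number):
        self.number = number
        self.left = None
        self.right = None


def release_order(root, released):
    if root is None:
        return
    release_order(root.left, released)   # left child and its descendants
    release_order(root.right, released)  # right child and its descendants
    released.append(root.number)         # only now would C call free(root)


tree = Node(2)
tree.left = Node(1)
tree.right = Node(3)

released = []
release_order(tree, released)
print(released)
```

If the root were released first in C, reading its left and right pointers afterward would touch memory the program no longer owns.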
For instance.
here might be the prototype for a search function for a binary search tree.
You give me the root of a tree, and you give me a number I'm looking for,
and I can pretty easily now return true if it's in there or false if it's not.
How?
Return false.
Else if, the number you're looking for is less than the tree's own number,
AUDIENCE: Left.
that you're kicking the can and let yourself figure it out
Else if, the number you're looking for is greater than the tree's own number,
Yeah.
AUDIENCE: The number.
So else if, the number I'm looking for equals the tree's own number,
AUDIENCE: Else.
SPEAKER 1: Exactly.
An else suffices.
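Translated loosely into Python (the lecture's version is C), that chain of conditions might look like the sketch below. `Node` and `search` are assumed names, and the seven-node tree is made up for the example.

```python
class Node:
    def __init__(self, number, left=None, right=None):
        self.number = number
        self.left = left
        self.right = right


def search(tree, number):
    if tree is None:                  # base case: ran off the tree
        return False
    elif number < tree.number:        # smaller: go left
        return search(tree.left, number)
    elif number > tree.number:        # larger: go right
        return search(tree.right, number)
    else:                             # neither: must be equal
        return True


tree = Node(4,
            Node(2, Node(1), Node(3)),
            Node(6, Node(5), Node(7)))

print(search(tree, 5), search(tree, 8))
```

Each recursive call discards half of a balanced tree, which is where the log n running time comes from.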
So here, too, more so than the Mario example a few weeks ago,
with these kinds of data structures that have this structure to them
All right.
Yeah.
[INAUDIBLE]
It is the case that true is, it's not well defined what they are.
To 0 and 1, essentially.
When you're using true and false, you should compare them to each other.
upon having code return from functions prematurely, you could invert the logic
that there was explicitly a base case that I could point to on the screen.
All right.
Just to be clear.
where the root is greater than its left child and smaller than its right child.
You agree?
I agree.
[INTERPOSING VOICES]
OK.
Is there any example of a left child that is greater than its parent?
Or is there any example of a right child that's smaller than its parent?
Because if we follow the same logic as before, going left or going right,
add some logic that tells you when you've got to pivot the thing,
and rotate it, and snip off the root, and fix things in this way.
you might accidentally create a crazy, long and stringy binary search
Both insert and search could actually devolve into, instead of big O of log n,
literally, big O of n.
It's a higher level thing you might explore down the road.
It can devolve into something that you might not have intended.
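To see that devolution, here is a Python sketch with a deliberately naive insert that never rebalances. `Node`, `insert`, `height`, and `build` are hypothetical helpers; the lecture does not show this code.

```python
class Node:
    def __init__(self, number):
        self.number = number
        self.left = None
        self.right = None


def insert(root, number):
    """Naive binary search tree insert with no rebalancing."""
    if root is None:
        return Node(number)
    if number < root.number:
        root.left = insert(root.left, number)
    else:
        root.right = insert(root.right, number)
    return root


def height(root):
    if root is None:
        return 0
    return 1 + max(height(root.left), height(root.right))


def build(numbers):
    root = None
    for number in numbers:
        root = insert(root, number)
    return root


bushy = build([4, 2, 6, 1, 3, 5, 7])    # a friendly insertion order
stringy = build([1, 2, 3, 4, 5, 6, 7])  # already sorted input

print(height(bushy), height(stringy))
```

The same seven numbers give a height of 3 in one order and 7 in the other; the second tree is effectively a linked list, so searching it takes on the order of n steps.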
All right.
See you in 5.
All right.
So we are back.
Where if we take for granted that, even though you haven't had an opportunity
to play with these techniques yet, you have the ability now in code
Can we start to get some of the best of both worlds by way of, for instance,
something called a hash table.
But it allows you, a hash table, to jump to any of these locations randomly.
That is instantly.
0 through 25.
So, for instance, suppose that we want to insert a value, one name
So Albus starting with A. Albus might go at the very beginning of this list.
All right.
Starting with Z, so it goes all the way at the end of this data
Z.
And then, maybe a third name like Hermione, and that goes at location H
A, or Z, or H, in this case.
Let's fast forward and assume we put a whole bunch of other names--
But if you're thinking of names you don't yet see it on the screen,
Yeah.
Maybe, if we want to insert Harry next, do we maybe cheat and put him
at location I?
And it just feels like the situation could very quickly devolve.
even though I'm drawing the rectangles a little differently from before?
AUDIENCE: An array.
SPEAKER 1: Yeah.
But, honestly, arrays are such a pain with the allocating, and reallocating,
and so forth.
Where the name is where the number used to be, even though I'm drawing it
But it looks like the array is 26 pointers, some of which are null,
that is empty.
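A small Python sketch of that picture: an array of 26 slots, each either empty or the head of a linked list of names. The lecture's implementation is in C; the names `Node`, `first_letter_index`, `insert`, `lookup`, and `chain` are assumptions, and the sketch assumes every name starts with a letter A through Z.

```python
class Node:
    def __init__(self, name, next=None):
        self.name = name
        self.next = next


def first_letter_index(name):
    return ord(name[0].upper()) - ord("A")  # A -> 0, ..., Z -> 25


def insert(table, name):
    i = first_letter_index(name)
    table[i] = Node(name, table[i])  # prepend to that bucket's chain


def lookup(table, name):
    ptr = table[first_letter_index(name)]
    while ptr is not None:           # walk only this one chain
        if ptr.name == name:
            return True
        ptr = ptr.next
    return False


def chain(table, i):
    names, ptr = [], table[i]
    while ptr is not None:
        names.append(ptr.name)
        ptr = ptr.next
    return names


table = [None] * 26  # 26 pointers, all empty to start
for name in ("Albus", "Hermione", "Harry", "Hagrid"):
    insert(table, name)

print(chain(table, 7))
```

Collisions do not overwrite anything; names sharing a first letter simply extend that bucket's chain.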
And in theory, this gives you the best of both worlds, right.
You get to jump immediately to the location you want to put someone.
But, if you run into this perverse situation where there's someone
And in fact, if we fast forward and put a whole bunch of familiar names in,
We're trying to balance these trade offs a little bit in the middle here.
And it's big enough to fit the longest word in the alphabet plus 1.
in the story.
Longest word plus 1 should be sufficient to store any name in the story here.
And then, what else does each of these nodes have?
So it's H-A and H-E. But wait, no, then Harry and Hagrid still collide.
But how do we decide where someone goes in a hash table in this way?
Or 0 through 16,000.
it's going to just tell you where to put that input at a specific location.
So, for instance, Albus, according to the story thus far, gave me back 0
as output.
It's just looking at the ASCII value, it seems, of the first letter
in their name.
So like doing some math to get back a number between 0 and 25.
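The arithmetic can be sketched in a couple of lines of Python. `hash_name` is a hypothetical name, and the Z example (Zacharias) is an assumption, since this part of the transcript only says the name starts with Z.

```python
def hash_name(name):
    # ord() gives the character's code; "A" is 65 in ASCII,
    # so subtracting it maps A..Z onto 0..25.
    return ord(name[0].upper()) - ord("A")


for name in ("Albus", "Hermione", "Zacharias"):
    print(name, hash_name(name))
```

Uppercasing first means "albus" and "Albus" land in the same bucket.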
And how might we, then, resolve the problem further and use
Well we've got a whole bunch of sorting algorithms from the past.
So here's diamonds.
3, 4.
And if you keep going through the cards, here's seven of hearts, hearts bucket.
8's bucket.
But at the end of it, you have hashed all of the cards
Taking as input some card, some name, and producing as output some location.
some of these chains would get longer, and longer and longer.
Which means that instead of getting someone's name quickly,
If the problem, fundamentally, is that the first letter is just too darn
Not just the first letter but maybe the first 2 letters.
to something more extreme like maybe H-A, H-B, H-C, H-D, H-F, and so forth.
And in this case here, anyone know how many buckets we just
increased to, if we now look at not just A through Z but AA through ZZ, roughly?
AUDIENCE: 26 squared.
SPEAKER 1: Yeah.
OK, good.
in constant time.
But 3 is constant, no matter how many other names are in here, it would seem.
AUDIENCE: Memory.
SPEAKER 1: Memory.
So significantly more.
We're now up to 17,576 buckets, which itself isn't that big a deal, right.
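The bucket counts follow from simple powers of 26. The sketch below, with the hypothetical helpers `bucket_count` and `bucket_index`, assumes names contain only letters and are at least as long as the number of letters being hashed.

```python
def bucket_count(letters):
    return 26 ** letters  # 26 choices for each letter position


def bucket_index(name, letters):
    """Treat the first few letters as a base-26 number."""
    index = 0
    for ch in name[:letters].upper():
        index = index * 26 + (ord(ch) - ord("A"))
    return index


print(bucket_count(1), bucket_count(2), bucket_count(3))
print(bucket_index("Harry", 2) == bucket_index("Hagrid", 2))
print(bucket_index("Harry", 3) == bucket_index("Hagrid", 3))
```

With two letters, Harry and Hagrid still collide on H-A; the third letter is what finally separates them, at the cost of many more, mostly empty, buckets.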
of someone whose name started with H-E-Q, for instance, in the Harry
Potter universe.
Why?
If you have some crazy perverse case where everyone in the universe
if you insert it without any mind for keeping it balanced, it just devolves.
Big O of n.
But if what we're really caring about is real humans using our software,
And so, there's this tension too between systems, types of CS,
improving this speed by a factor of 26 in this case, let alone 576 or more,
And that's typically some other resource like giving up more space.
All right.
So for instance, if you wanted to store the names of the Harry Potter universe,
not in a hash table, not in a linked list, not in a tree, but in a trie.
What you would do is hash on every letter in the person's name one
at a time.
then the second letter, then the third, and you do the following.
So that it's clear that the person's name is not H-A, or H-A-G, or H-A-G-R,
just to indicate there's like some other Boolean value that just says, yes.
And if I continue this logic, here's how I might insert someone like Harry.
And what's interesting about the design here is that some of these names
You're using the same nodes for names like H-A-G and H-A-R
And we, therefore, might implement this thing using code like this.
A name stops in this node or it's just a path to the rest of the person's name.
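As a sketch of that node layout in Python rather than C: each node holds a Boolean saying whether a name ends there, plus 26 child slots, one per letter. `TrieNode`, `insert`, and `contains` are assumed names, and the sketch assumes names consist only of the letters A through Z.

```python
class TrieNode:
    def __init__(self):
        self.is_name = False          # does a name stop at this node?
        self.children = [None] * 26   # one slot per letter


def letter_index(ch):
    return ord(ch.upper()) - ord("A")


def insert(root, name):
    node = root
    for ch in name:
        i = letter_index(ch)
        if node.children[i] is None:
            node.children[i] = TrieNode()
        node = node.children[i]
    node.is_name = True


def contains(root, name):
    node = root
    for ch in name:                   # one step per letter in the name
        node = node.children[letter_index(ch)]
        if node is None:
            return False
    return node.is_name


root = TrieNode()
for name in ("Hagrid", "Harry", "Hermione"):
    insert(root, name)

print(contains(root, "Harry"), contains(root, "Hag"))
```

The number of steps in `contains` depends on the length of the name, not on how many names are stored. Hagrid and Harry share the nodes for H and A, and "Hag" is rejected because its last node is only a path, not the end of a name.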
if there were 3 million, it would still take me how many steps to search
for Hermione?
And if you assume that there's a maximum limit on the length of names
That's constant.
would be big O of 1.
we've looked at, with a trie, the amount of time it takes you to find one person
I don't think there was a Daniel or Danielle in the Harry Potter universe
that I could think of.
But at the end of the day, it only takes a finite number of steps
We've gone through this whole story for weeks now of like, linear time.
And now constant time, what's the price paid for a data structure like this?
AUDIENCE: Memory.
SPEAKER 1: Memory.
In what sense?
SPEAKER 1: Exactly.
Granted there's only 3 names, but most of those boxes, most of those pointers,
It's not really possible to get truly the best of both worlds.
for the device you're writing software for, how much memory it has,
are 2-dimensional.
to use in a program, or even our human world, are things called queues.
So that whoever's first in line gets their food first and gets out first.
And the interesting thing here is that how do you implement a queue?
Well in the human world, you would just have literally physical space
Same in a computer.
All right.
If you use an array to store all of the documents that need to be printed.
oh, you can send like a megabyte worth of documents to this printer at once.
or so forth.
But at that point, maybe they should just use a linked list.
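For a quick feel of first in, first out, here is a sketch using Python's standard `collections.deque`, which grows as needed. The document names are made up for the example.

```python
from collections import deque

queue = deque()
queue.append("essay.pdf")    # enqueue: join the back of the line
queue.append("photo.png")
queue.append("slides.pdf")

printed = []
while queue:
    printed.append(queue.popleft())  # dequeue: serve the front of the line

print(printed)
```

The order coming out matches the order going in, which is the fairness property a print queue or a line at a store needs.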
You go to the store and you start having to line up outside and down the road.
And like, for a really busy store, they run out of space so they make do.
If you've ever gone to the dining hall and picked up like a Harvard or Yale
tray, you're typically picking up the last tray that was just cleaned,
Why?
And then you, the human, are probably taking the most recently cleaned one.
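The tray behavior, last in, first out, can be sketched with a plain Python list used as a stack. The tray labels are hypothetical.

```python
stack = []
stack.append("tray 1")  # push: cleaned first, ends up at the bottom
stack.append("tray 2")
stack.append("tray 3")  # cleaned last, sits on top

taken = [stack.pop(), stack.pop(), stack.pop()]  # pop: always from the top
print(taken)
```

The most recently added item always leaves first, so the first tray cleaned is the last one anyone picks up.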
but would really be bad in the world of Tasty Burger lining up for food if LIFO
And in fact, to paint this picture, we have a couple of minute video here.
Let's go ahead and dim the lights for just a minute or 2 here.
[VIDEO PLAYING]
When it came to making friends Jack did not have the knack.
I sure do.
And Jack showed Lou the box, where he kept all his shirts, and his pants,
and his socks.
Then he said, now Jack, at the end of the day, put your clothes on the left
your clothes from the right, from the end of the line.
SPEAKER 1: So just to help you realize that these things are everywhere.
[AUDIENCE CLAPPING]
And a dictionary, just like in our human world, has keys and values.
This just has letters of the alphabet and salads as their value.
Because they, too, are using only finite space, finite storage.
Yeah.
SPEAKER 1: Yeah.
And then, maybe, they kind of overflow into the E's or the F's.
is going to come by, and just eyeball it, and figure it out anyway.
AUDIENCE: F15.
AUDIENCE: F.
SPEAKER 1: F15.
AUDIENCE: S.
AUDIENCE: F5.
SPEAKER 1: F5.
What address?
AUDIENCE: F12.
SPEAKER 1: F12.
Big finale.
F12, if you'd like to stand up holding a 0 and null, which means that was CS50.
[AUDIENCE CLAPPING]
All right.
[MUSIC PLAYING]
DAVID J. MALAN: All right, this is CS50, and this is already week 6.
And this is the week in which you learn yet another language.
and in the coming weeks from C, where we've spent the past several weeks, now
to Python.
The goal ultimately is to teach you all how to teach yourselves new languages,
All you had to do was drag and drop these puzzle pieces.
But there were still functions and conditionals and loops and variables
when you try to compile your code, and it just doesn't work.
been focusing on functions and loops and variables, conditionals, and really
we're using, transitioning from C now to Python, this now being the equivalent
And you won't have nearly as much practice with Python as you did with C.
But that's because so many of the ideas are still going to be with us.
I want to do a loop.
if you continue programming and learn some other language after the class,
if in 5-10 years, there's a new, more popular language that you pick up,
All right, so let's do a few quick comparisons, left and right, of what
and a bit of a cryptic mess the first week, you had the printf,
in Python, that same statement is going to look a little something like this.
Yeah.
AUDIENCE: Now print, instead of printf would be, something like that.
is that, with print, you get the new line for free.
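A tiny sketch of that difference. The function name `say_hello` is just a wrapper for illustration; `end` is a real keyword argument of Python's built-in `print`.

```python
def say_hello():
    print("hello, world")      # a newline is added automatically
    print("hello, ", end="")   # end="" suppresses that newline
    print("world")             # so this continues on the same line


say_hello()
```

Both halves of the function produce the same line of output, one with the default line ending and one by overriding it.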
We had multiple functions like this, that not only said something
like this, whereby that first line declares a variable called answer,
and then the same double quotes and parentheses and semicolon.
Then we had this format code in C that allowed us, with %s,
But let's see if we can't just infer from this example what
AUDIENCE: Type.
from context.
and this week, which comes from a Python version of the CS50 library.
But we'll also start to take off those training wheels, so that you'll
As before, no semicolon, but the rest of the syntax is pretty much the same
here.
saw the join block in Scratch, or concatenation was the term of art
there.
And it literally starts with the letter F, which admittedly looks, I think,
a little weird.
forgot the curly braces, but maybe still had the F there?
DAVID J. MALAN: Yeah, it would literally print Hello, comma answer, because it's
So the curly braces just kind of allow you to plug things in.
This just kind of makes your code a little tighter, a little more succinct.
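A short sketch of the cases discussed, with the variable names chosen only for illustration:

```python
answer = "David"

with_f = f"hello, {answer}"       # braces plus the f prefix: value plugged in
without_f = "hello, {answer}"     # no f prefix: the braces are printed literally
without_braces = f"hello, answer" # f prefix but no braces: just the word "answer"
concatenated = "hello, " + answer # the + operator joins two strings

print(with_f)
print(without_f)
print(without_braces)
print(concatenated)
```

Only the first and last forms actually use the variable's value.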
In C it looked like this, where you specify the type, the name,
You don't need to mention the type, just like before with string.
If you want a variable, just write it and set it equal to some value.
There was the slightly less verbose way, where you could say, oops, sorry.
is actually going to be almost the same, you just throw away the semicolon.
And the mathematics are ultimately the same, copying from right to left,
In Python, you can similarly do the same thing, just no need for the semicolon.
were a big fan of counter plus plus, that doesn't exist in Python,
asking a silly question like is x less than y, and if so, just say as much.
with the parentheses, the curly braces, the semicolon, and all of that.
And if someone wants to call out some of the obvious changes here,
what has been simplified now in Python for a conditional, it would seem?
AUDIENCE: Braces.
should be executed below it, until you start to un-indent and start writing
on the left hand side of the window, yeah, it might compile and run.
So Python is going to force you to start indenting properly now, if that's been,
In Python, it's going to now look like this, almost the same,
And there's one other difference that's now again visible here,
Yeah.
if you want to combine thoughts and do this and that, or this or that.
It's just if, and this is the curiosity, elif x greater than y.
So it's not else if, it's literally one keyword, elif, and the colons
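Put together, the conditional might look like this sketch. Wrapping it in a hypothetical `compare` function is my own choice, made so the three branches are easy to try.

```python
def compare(x, y):
    if x < y:                      # no parentheses needed, colon required
        return "x is less than y"
    elif x > y:                    # elif, not else if
        return "x is greater than y"
    else:
        return "x is equal to y"   # indentation alone marks each block


print(compare(1, 2))
print(compare(2, 1))
print(compare(2, 2))
```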
Yeah.
though, does it matter if there's this in between thing like that, but
and why.
AUDIENCE: So like the left-hand side and like the right side spaces?
whereby you do have spaces to the left and right of binary operators,
Not only within CS50 have we had a style guide on the course's website,
for instance, that just dictates how you should write your code so that it looks
and there's an actual standard whereby you don't have to adhere to it,
but generally speaking, in the real world, someone would reprimand you,
In C, the closest we could get was doing something while true, because true
never changes.
you repeat something a finite number of times, like meowing three times.
There's this very mechanical way, where you initialize a variable like i
to zero.
You increment i using this syntax, or the longer, more verbose syntax,
You just don't bother saying what type of variable you want.
Python will infer from the fact that there's a 0 right there.
You can't do the i plus plus, but you can do this other technique,
Instead, there is a for loop, but it's meant to read a little more
So lists in Python are more like linked lists than they are arrays.
So this just means for i in the following list of three values.
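Side by side, the two styles of loop might be sketched like this. The function names are hypothetical wrappers added for the example.

```python
def meow_while():
    i = 0                 # no type needed; Python infers an int from the 0
    while i < 3:
        print("meow")
        i += 1            # there is no i++ in Python


def meow_for():
    for i in [0, 1, 2]:   # i takes each value in the list in turn
        print("meow")


meow_while()
meow_for()
```

Both versions meow three times; the for loop simply hands the counting over to the list.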
At what point does this approach perhaps get bad, or bad design?
Yeah, in back.
AUDIENCE: If you don't know how many times, last time, you know,
DAVID J. MALAN: Sure, if you don't know how many times you
like that, of 0, 1, 2.
Other thoughts?
like comma 3, comma 4, comma 5, comma dot dot dot, comma 99, comma 100.
called range, that essentially magically gives you back a range of values
Question.
So not 0, which is the implied default, but something larger than that.
Yes, so it turns out the range function takes multiple arguments, not just one
but maybe two or even three, that allows you to customize this behavior.
every two values, for like evens or odds, you could do that as well,
that will become your authoritative source for answers like that.
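A quick sketch of `range` with one, two, and three arguments. Wrapping each call in `list(...)` is only there to make the values visible; the variable names are made up.

```python
first_three = list(range(3))        # stop only: starts at 0, stops before 3
one_to_three = list(range(1, 4))    # start and stop
evens = list(range(0, 10, 2))       # start, stop, and a step of 2

print(first_three)
print(one_to_three)
print(evens)

for i in range(3):                  # the usual way to repeat three times
    print("meow")
```

The stop value is always excluded, which is why `range(3)` gives 0, 1, 2.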
Well, in the world of C, recall that we had a whole bunch of built-in data
types, like these here, bool and char and double and float, and so forth,
because the backslash 0, the support for %s in printf, that's all native,
a synonym for a typedef for char star, which is part of the language natively.
Still going to have bools, we're going to have floats, and ints,
and we're going to have strings, but we're going to call them strs.
Ints and floats, meanwhile, don't need the corresponding longs and doubles,
But there are libraries, code that other people have written, as we briefly
So there's other data types, too, in Python, which we'll see actually
which allow you to store keys and values, much like our hash tables
from last time, and then sets in the mathematical sense, where they filter
out duplicates for you, and you can just put a whole bunch of numbers,
a whole bunch of words or whatnot, and the language, via this data type,
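A small sketch of those two types, using made-up data:

```python
# dict: keys mapped to values (hypothetical word lengths)
lengths = {"cat": 3, "horse": 5}
lengths["dog"] = 3                 # add a new key and value
print(lengths["horse"])            # look a value up by its key

# set: duplicates are dropped automatically
words = ["cat", "dog", "cat", "horse", "dog"]
unique = set(words)
print(len(words), len(unique))
print("cat" in unique)             # membership test
```

Neither type requires writing a hash function or a linked list by hand; the language handles that behind the scenes.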
Now there's going to be a few functions we give you this week and beyond,
training wheels that we're then going to very quickly take off,
just because, as we'll see today, they just simplify the process of getting
just when you're trying to get Hello, World, or something similar, to work.
And we'll give you functions, not like, not as long as this list in C,
And if you want to import CS50's functions, you just say import cs50.
Or, if you want to be more precise, and not just import the whole thing, which
could be slow, if you've got a really big library with a lot of functionality
in it, you can be more precise and say from cs50 import get_float.
From cs50 import get_int, from cs50 import get_string,
that other people wrote, so that we're no longer reinventing the wheel.
We're not making our own linked lists, our own trees, our own dictionaries.
But let me pause here to see if there's any questions on syntax or primitives
or otherwise.
DAVID J. MALAN: Sorry, someone coughed when you said something operators.
has tended to not give you multiple ways to do the same thing syntactically.
And I'll see if I can dig in and post something online, to follow up on that.
let me go ahead and consider exactly how we're going to write code.
We create a file called like hello.c, and then, step one, make hello, step 2,
./hello.
Or, if you think back to week two, when we sort of peeled back
you could more verbosely type out the name of the actual compiler,
clang in our case, command-line arguments like -o hello,
And then you can specify what libraries you want to link in.
odds are you've realized that, any time you want to make a change to your code,
or make a change to your code and try and test your code again,
you're constantly doing those two steps.
The file name is going to change, but that might go without saying.
And so Python, it turns out, is the name, not only of the language
we're going to start using, it's also the name of a program on a Mac, a PC,
assuming it's been pre-installed, that interprets the language for you.
not compiled.
And by that, I mean you get to skip, from the programmer's perspective,
There is no manual step in the world of Python, typically, of writing your code
and then compiling it to zeros and ones, and then running the zeros and ones.
into the illusion of one, whereby you, instead, are able to just run the code,
And the way we do that is via this old process, input and output.
You could run this algorithm, but you're going to have to do some googling,
And that's going to be one of the subtleties with the world of Python.
Yes, it's a feature that you can just run the code without having
that demonstrate how Python is not only easier for many people
you have to write, and also it comes with so many darn libraries,
you can just do so much more without having to write the code yourself.
to this image from problem set 4, which is the Weeks Bridge down by the Charles
and it's even higher res if we looked at the original version of the photo.
And blur was probably among the more challenging of the ones,
you had to take into account what's above, what's below, to the left,
to the right.
And if you ultimately got it, it was probably a great sense of satisfaction.
Let me go ahead and run a program, or write a program, called blur.py here.
and say, let me open the current version of this image, which
is called bridge.bmp.
That's it.
In the world of Python, we're going to start making use of the dot operator
there was a toupper function that takes as input an argument that's a char.
And you can pass in any char you want, and it will uppercase it for you
the ability just to uppercase any char by treating the char, or the string,
And then, after, dot save does what you might think.
So instead of using fopen and fwrite, you just say dot save,
I'm going to go ahead and do Python of blur.py, nope, sorry, wrong place.
There we go.
Let me rewind.
All right, I've gone ahead and moved this file, blur.py,
left to right.
And, you know what, just to make clear what's really happening,
Let's make a box that's not just one pixel around, but 10.
Let me go ahead and open out.bmp and show you first the before,
encapsulated it all into a single library, that you can then use instead.
And in edges.py, I'm again going to import from the Pillow library
then I'm going to go ahead and run a filter on that, called image, whoops,
I'm now going to run Python of edges.py, after, sorry, user error.
And before we had this, and now, especially if what will look familiar
And then you kept track of just how much time and memory it took.
And we're going to see now the syntax for defining functions in Python.
I'm going to go ahead, then, and say words gets this, give me a dictionary,
check, which was the first function you might have implemented.
You still specify the name of the function and any arguments thereto.
go ahead and return true, else go ahead and return false, done,
That was the heavy lift, where you had to load the big file into memory.
How, now, do I get at the current word, and then strip off the new line,
Well, let me go ahead and get a word from the current line,
but strip off, from the right end of the string, the new line, which
Then let me go ahead and add to my dictionary, or hash table, that word,
done.
And then let me go ahead and return true, because all was well.
This did not take any arguments, it just returns the size of the hash table
or dictionary in Python.
And then lastly, gone from the world of Python is malloc and free.
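Pieced together from the description above, the whole spell checker might look roughly like this. It is a reconstruction under assumptions, not the exact file shown in lecture: it uses a set for `words` (the tweak mentioned shortly after) and a `with` block to open the file.

```python
words = set()  # every dictionary word, stored in a built-in hash-based set


def check(word):
    """Return True if the word is in the dictionary, ignoring case."""
    return word.lower() in words


def load(dictionary):
    """Read one word per line from the file at the given path."""
    with open(dictionary) as file:
        for line in file:
            words.add(line.rstrip())  # strip the trailing newline
    return True


def size():
    """Return the number of words loaded."""
    return len(words)


def unload():
    """Nothing to free by hand; Python manages the memory."""
    return True
```

A caller would run `load` once with the path to a dictionary file and then call `check` for each word in a text.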
But the implication now is that, what are you getting for free,
in a language like Python?
It's just someone else in the world wrote that code for you.
So I'm going to make one tweak: set, recall, was another data type in Python.
to me in uppercase or capitalized.
And actually I'm going to go ahead and split my terminal window into two.
And on the right, I'm going to go into a version that I essentially just wrote.
But it's also available online, if you want to play along afterward.
And then over here, let me get ready to run Python of speller
on texts/holmes.txt, too.
But what's the trade-off going to be, and what kinds of design decisions
Let's go to the group here, which of these programs is the better one?
more comfortable for the programmer, but C is better for the user.
but C is maybe better for the computer, because it's much faster to run.
Other opinions?
Yeah.
set or something huge, then that time is going to really build up on,
it might be worth it to put in the upfront effort and just load it into C,
so the process continually will run faster over a longer period of time.
They're going to have to deal with memory management and the like.
But if and when it works correctly, it's going to be much faster, it would seem.
and build on top of it, in order to prioritize the human time instead.
Other thoughts?
Python is four times slower than C. Like that's not the right takeaway.
that I've kind of dramatized just how big the difference is.
as it is here, because, indeed, among the features you can turn on in Python
is to save some intermediate results.
But that doesn't mean it has to do that every darn time you run the program.
As you propose, you can save, or cache, C-A-C-H-E, the results of that process.
So that the second time and the third time are actually notably faster.
And, in fact, Python itself, the interpreter, the most popular version
And so the faster the code runs, and the better it's going to be,
and get real work done, and your time is just as, if not more, valuable
you know what, Python is among the most popular languages as well.
all of that low-level stuff, because the whole point of using newer,
modern languages is to use abstractions that other people have created for you.
or the equivalent version that I used, which in this case was a set.
or is there some, I'd imagine that with the audience that can happen,
but it feels like if you can just come up with a Python compiler,
that would read it top to bottom, left to right, converting it to, on the fly,
something the computer understands, but historically that's not been the case.
compiling the code, technically not into 0's and 1's, technically
into something called byte code, which is this intermediate step that
just doesn't take as much time as it would to recompile the whole thing.
Why?
Well, honestly, for you and I, the programmer, it's just much easier to,
one, run the code and not worry about the stupid second step
Why?
And ultimately, too, you might want all of the fancy features that
you can enable these features, as opposed to shying away from them here.
which is not to impugn C. It's just that those other languages tend to be better
fits for the amount of time I have to allocate, and the types of problems
All right, let's go ahead and take a five minute break here.
And when we come back, we'll start writing some programs from scratch.
All right.
So let's go ahead and start writing some code from the beginning
and then we'll build our way up to more sophisticated examples in Python.
What I've done in advance today is I've downloaded some of the code
that looks like two columns, that splits my code editor into two places,
All right, now I'm going to go ahead and write the corresponding Python
So again, I'm not going to play any further with the C code.
Whoops, I'm going to go ahead and drag that over to the left here.
So from CS50, I'm going to import the function called get_string for now.
like the first example on the slide, Hello, comma space plus answer.
David.
But it's worth calling attention to the fact that I've also simplified further.
In C you have to have that to kick-start the entire process of actually running
your code.
And in fact, if you were missing main, as you might have experienced
All right, there are a few other ways we could say Hello, world.
So I could put this whole thing in quotes, I could use this f prefix.
now let me run Python of Hello.py, type in my name, and there we go.
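As a rough sketch of the two ways of building that greeting, here is the plus version next to the f prefix version. The helper name greet is made up for illustration, and the name is passed in directly rather than typed at a prompt, so this runs without any input.

```python
# Hypothetical helper: takes the name as an argument instead of prompting for it.
def greet(answer):
    # Concatenation with +, as on the slide.
    plus_version = "hello, " + answer
    # The f prefix tells Python to substitute {answer} inside the quotes.
    f_version = f"hello, {answer}"
    # Both approaches build the same string.
    assert plus_version == f_version
    return f_version


print(greet("David"))
```

Without the f, the curly braces and the word answer would be printed literally.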
I mean, it depends.
Well, let me go ahead here and let's get rid of the training wheel
altogether, actually.
with what you saw in C. Python of Hello.py, what's your name, David.
All right, any questions, before we now pivot to revisiting other examples
Let's say Calculator0.c, which was one of the first examples we did involving
All right, let me go dive into a translation of this code into Python.
I'm going to go ahead now and get an Int from the user.
and then down here, I'm going to go ahead and say print of x plus y.
So this is already a bit new.
Recall, the C version required that I use this format string, as well
as printf itself.
If all you want to do is print out a value, like x plus y, just print it.
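A minimal sketch of that idea, assuming two hardcoded values in place of what the user would type:

```python
# Values are hardcoded here as an assumption; the lecture reads them from the user.
x = 1
y = 2

# No format string and no %i needed: print figures out how to display the sum.
print(x + y)
```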
But notice what does not work now, whereas it did work
here.
Recall that when we talked about the stack and the heap,
the stack, like a stack of trays, was all of the functions that
We had main, we had swap, then swap went away, and then main finished, recall.
So here's a trace back of all of the functions or code that got executed.
But even though it's a little cryptic, we can perhaps infer from the output
here, NameError, so something related to the name of something: name get_int
is not defined.
You can isolate variables and function names to their own namespace,
you have to say that the getInt you want is inside the CS50 library.
So just like with the image blurring, and the image edges
before, where I had to specify image dot and image filter dot, similarly here,
If you're using a whole bunch of functions, just import the whole thing.
If you're only using maybe one or two, import them line by line.
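Here is a small sketch of the two import styles. It uses Python's built-in math library as a stand-in for the CS50 library, which is an assumption made only so the example runs anywhere.

```python
# Style 1: import the whole library; every name must be prefixed with "math."
import math

# Style 2: import one specific function into your own namespace.
from math import sqrt

print(math.sqrt(16))  # namespaced call
print(sqrt(16))       # direct call, no prefix needed
```

The prefixed style avoids collisions if two libraries happen to define a function with the same name.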
too, as quickly as we introduced it, though for problem set 6
Suppose I get rid of this, and I just use the input function,
Huh.
Yeah.
AUDIENCE: Say you have a number of strings that don't have Ints,
so you would part with them and say, printing one, two, better.
earlier.
Notice that I'm not doing parenthesis Int close parenthesis before the value.
So, again, it's written for sort of a programmer, more than sort
Invalid literal, a literal is just something someone typed for Int, which
is the function name, with base 10.
They are things that can go wrong when your Python code is running,
that aren't necessarily going to be detected until you run your code.
So in Python, and in JavaScript, and in Java, and other more modern languages,
here, even though we won't have to use this much just yet.
Then let me go ahead and say, again, that is not an Int exclamation point.
And then I'm going to exit from there to, otherwise I'll
It's actually doing that try and except for you, because suffice it to say,
But underneath the hood, they're essentially doing this, try except,
In fact, the best way to do this is to say except if there's a value error,
And again, let's not get too into the weeds here with this feature.
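A minimal sketch of try and except around a conversion to int. The function name to_int and the choice to return None on failure are assumptions for illustration, not the CS50 library's actual code.

```python
# Hypothetical helper: converts text to an int, or reports the problem.
def to_int(text):
    try:
        # int() raises ValueError if the text is not a valid base-10 integer.
        return int(text)
    except ValueError:
        # Catch only the specific error we expect, rather than every error.
        print("That is not an int!")
        return None


print(to_int("42"))
print(to_int("cat"))
```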
which actually did division of numbers, not just the addition herein.
So let me go ahead and close the C version, and let's focus only on Python
So from CS50, import get_int, that will deal with any errors for me.
x gets get_int, ask the user for an Int x, y equals get_int,
Still no need for a format string, I can just print out the variable's value.
AUDIENCE: Zero?
DAVID J. MALAN: It will give you an integer back, and, unfortunately, 0.1,
Most people, especially new programmers, when dividing one value by another,
But what about another problem we had with the world of floats before,
as follows.
And I'm going to go ahead and format, not just z, because this is essentially
But if I use this syntax in Python, which we won't have to use often,
Unfortunately, it is.
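To make both points concrete, here is a short sketch with 1 and 3 hardcoded as assumed inputs. It shows that / gives a float in Python and that the float is still imprecise when printed to 50 decimal places.

```python
# Assumed inputs; in the lecture these come from the user.
x = 1
y = 3

z = x / y       # true division: the result is a float, not truncated to 0
print(z)

print(x // y)   # floor division, which behaves like C's integer division

# Format z to 50 decimal places; the trailing digits are not all 3s,
# because a float has only a finite number of bits.
print(f"{z:.50f}")
```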
And you can see as much here, the f-string, the format string,
All right, before we pivot away from a mere calculator, any questions
Yeah.
This is a comment.
Good question.
All right, let's go ahead and make something else here, how about.
And let me go ahead on the other side and create a file called Points.py.
This was a program, recall, that asked the user how many points they
whether they lost fewer points than me, because I lost two,
if you recall the photo, more points than me, or the same points as me.
Let me go ahead and zoom out so we can see a bit more of this.
And let me now, on the top right here, go about implementing this in Python.
the user, how many points did you lose, question mark.
Then let's go ahead and say, if points less than two, which was my value,
Else let's go ahead and handle the final scenario, which is you
Before I run this, does anyone want to point out a mistake I've already made?
Yeah.
So let me change this to elif, and now cross my fingers, Python of Points.py,
If you only lost one point, you lost fewer points than me.
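The same three-way comparison might be sketched like this. The function name compare and the mine parameter are made up for illustration, and the messages are returned rather than printed so they are easy to check.

```python
# Hypothetical helper: compares the user's lost points against mine (2).
def compare(points, mine=2):
    if points < mine:
        return "You lost fewer points than me."
    elif points > mine:  # elif, not "else if"
        return "You lost more points than me."
    else:
        return "You lost the same number of points as me."


print(compare(1))
```

Note the colons, the indentation, and the absence of parentheses and curly braces compared with C.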
Here was the code in C that we used to determine the parity of a number.
Well, let's go ahead and from CS50, import getInt, then let's go ahead
and get a number like n from the user, using getInt, and ask them for n.
Else let's go ahead and print out Odd, but before I run this,
anyone want to instinctively, even though we've not talked about this,
Again, so even though some of the stuff is changing, some of the same ideas
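A quick sketch of the parity check, with the number passed to a hypothetical function instead of typed by the user:

```python
# Hypothetical helper: reports whether n is even or odd.
def parity(n):
    # The remainder operator % works in Python just as it does in C.
    if n % 2 == 0:
        return "Even"
    else:
        return "Odd"


print(parity(50))
```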
and a little representative of tools that actually ask the user questions.
And so let's go over here and let's do from CS50, import get--
as we saw before.
I'm going to get a string from the user that asks them this, getString,
quote unquote, "Do you agree," like a little checkbox or interactive prompt,
where you have to say yes or no, you want to agree to the following terms,
or whatnot.
let's go ahead and print out agreed, just like in C, elif S equals
And you can already see, perhaps, one of the differences here, too.
you just literally use the English word or, instead of the two vertical bars.
but Yes or big Yes or little yes or big Y, lowercase e, capital S, right?
Yeah.
Why don't we take that same idea and ask a similar question.
or heck, let me add to the list now, yes, or maybe all capital YES.
but this is still better than the alternative, with all the or's.
But let's leave this alone, and let me just go into here
and I won't do as, let's just not worry about the weird capitalizations
there, for now.
Notice it did not say agreed, and it did not say not agreed.
Big Yes, that works, big Y, little e, big S, that also works.
So we've now handled, in one fell swoop, a whole bunch more logic.
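Putting those pieces together, a sketch of the agreement check could look like the following. The function name agreement is an assumption, and the answer is passed in rather than read from a prompt.

```python
# Hypothetical helper: interprets a yes/no answer regardless of capitalization.
def agreement(s):
    # Force the answer to lowercase once, so "YES", "Yes", and "yEs" all match.
    s = s.lower()
    if s in ["y", "yes"]:
        return "Agreed."
    elif s in ["n", "no"]:
        return "Not agreed."
    # Anything else matches neither list, so nothing is reported.
    return None


print(agreement("YES"))
```

Checking membership with in replaces a long chain of or comparisons.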
Yep.
I could, first of all, get rid of this lower, and get rid of this lower,
and then above this, maybe I could do something like this, S equal--
I could do that.
you can chain these functions together, like this, and do dot this,
And eventually you want to stop, because it's going to become crazy long.
Yeah, question.
AUDIENCE: Could you add on like a for uppercase as well, for like upper,
and then cover all the functions where it's lowercase, for all the functions
where it's uppercase as well, or could you not just do this again.
Yeah.
like I did weeks ago, when I was trying to copy S and call it T.
So if that wasn't clear, all of that pain, if you will, all of that power,
Why?
Why say something three times when you can say it just once?
And, like I generally should not, let me copy, paste it three times,
both in Scratch and in C. Let me focus now entirely on the Python version
here.
Let's now go ahead and define, using the def keyword, which we saw briefly
Let me now go ahead and run Python of Meow.py Enter, huh, one
Yeah.
Why?
And even though, yes, it's obvious that it begins on line four, logically,
just put your own code in main, so that, one, you can leave it up top, and two,
So let me define a function called main that has that same loop,
Nothing.
Yeah?
But if you want to call that main function, you have to do it.
because only once the interpreter gets to the bottom of the file,
So it's just obvious to you, and a TF, or any reader in the future,
But it also ensures that main is not called until everything else, main
Just apply some logic, as to, like, all right, what could explain this symptom.
If I now go and run this, Python of Meow.py, now we're back in business.
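The working arrangement can be sketched as follows: main stays at the top, meow is defined below it, and main is only called on the very last line, after the interpreter has seen both definitions.

```python
def main():
    for i in range(3):
        meow()


def meow():
    print("meow")


# Nothing runs until this line. By now both main and meow are defined,
# so the call inside main can find meow.
main()
```

A common variant, assumed here rather than shown in the lecture, wraps that last call in `if __name__ == "__main__":` so the file can also be imported as a library without running main.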
unquote "better" way of doing this, that solves different problems that we
something that looks like this, where you actually have a weird conditional
That's functionally the same thing, but it solves problems with libraries,
All right, let's make one change to this, just to show how it's done.
In C, the last version of meow also took command line argument, sorry, also
And I figure out how many times by just, like, putting in number 3
let's call it n, and then use that, as by doing this, for i in range of n,
and we don't bother specifying the type of our arguments or our variables.
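A sketch of that last version, where meow takes the number of times as an argument:

```python
def main():
    meow(3)


# n has no declared type; Python works it out from the value passed in.
def meow(n):
    # range(n) yields 0, 1, ..., n - 1, so the body runs n times.
    for i in range(n):
        print("meow")


main()
```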
All right, let me run this one final time, Python of Meow.py,
Any questions?
Yeah.
But that, too, is part of the mindset with this particular language.
[SIREN]
say it louder.
AUDIENCE: There has only been one green line printed at a time.
we had like the question marks for the coins and the vertical bars.
AUDIENCE: If strings are immutable, and every time you like make a copy.
Any time you seem to be modifying it, as with the lower function,
But you don't have to deal with it Python's doing that for you.
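A tiny sketch of what immutability means in practice, with an assumed example string:

```python
s = "YES"

# lower() does not change s; it hands back a new, lowercased copy.
t = s.lower()

print(s)  # the original is untouched
print(t)  # the copy is lowercase
```

There is no malloc or free here; Python allocates the copy and reclaims memory on its own.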
AUDIENCE: So you don't free anything.
Yeah.
AUDIENCE: Each up for the variable, you put it before the name, use of the body
Well, if there isn't a main function in Python, how do you define those words?
actually, don't use the word global, that's a special word in Python--
value that a computer scientist would typically use, that is now global.
All right.
So let's go ahead and do this.
To come back to the question about the print command, let me go ahead
But recall that, in Python, in Mario, we wanted to first do something like this.
And we just want to print like three hashes to represent those three blocks.
print, oh, sorry, for i in the range of 3, go ahead and print out quote unquote
"hash."
So let's do that.
Let me go up here and let me go and say from CS50, import getInt,
getInt the height of the column of bricks that you want to do.
And then, let's go ahead and print out n hashes instead of three.
OK, one, two, three, four, five, that seems to work, too.
But also recall that it's not going to work if the user types in something
weird, like, oh, sorry, it is going to work if the user types in something
And then, if they didn't cooperate, prompt them again, prompt them again.
Yeah.
That was useful, because it's almost the same as a while loop.
which makes sense with user input, because what are you
Deliberately induce an infinite loop, while True, with capital T for true.
And then do what you got to do, like get an Int from a user,
asking them for the height of this thing.
And then, if that is what you want, like a number greater than zero, go ahead
So this is how, in Python, you could recreate the idea of a do while loop.
Then, if you get the answer you want, you break out of it,
What if, though, I wanted to get rid of, how about ultimately
And let me give myself a function called get height that takes no arguments.
Break, and then down here, you could return, down here,
I claim, in Python.
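Here is one way that function might look. As assumptions for the sketch, it converts the input with int rather than the CS50 library, and it takes a hypothetical read parameter so that canned answers can be fed in without a keyboard.

```python
# Hypothetical signature: read defaults to input, but can be swapped for testing.
def get_height(read=input):
    # Deliberately loop forever, then break out once the answer is acceptable.
    while True:
        n = int(read("Height: "))
        if n > 0:
            break
    # n was created inside the loop, but it is still in scope down here.
    return n


# Simulate a user who types -1, then 0, then 4.
answers = iter(["-1", "0", "4"])
print(get_height(lambda prompt: next(answers)))
```

This is the usual way to recreate the idea of C's do while loop in Python.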
Yeah.
[INAUDIBLE]
DAVID J. MALAN: So similar, it's not quite that we're using it first.
We've addressed that before, but on line 9, we're assigning n a value, it seems.
means as soon as you get out of that loop, like further down in the program,
makes clear that n is still inside of this loop, between lines 8 through 11.
The moment you create a variable in Python, for better or for worse,
All right, any questions then on this, before we now run this and then get
OK, so let me go ahead and get the height from the user.
And then let's use that height value, instead of something hardcoded there.
Python of Mario.py.
Yeah.
And if there's a problem, you can handle it in any way you see fit.
Previously, I handled it by just yelling at the user that that's not an Int.
function.
That would work too, but it's sort of an unnecessary extra line.
This is not sufficient, because that does not change the value.
You can see mention of module, that just means your file, main, which
So let's try to do, let's try to do this literally, except if there's an error.
I'm going to go in here, and I'm going to say, try to do the following.
Whoops, try to do the following, except if there's a value error, value error,
But the difference this time is because I'm in a loop, the user
and because I'm still in that loop, and because the program hasn't crashed,
because I've caught, so to speak, the value error, using this line of code
If I type in, though, 2, I get my two hashes, because that's, indeed, an Int.
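The training-wheels-off version can be sketched like this. The read parameter is again a hypothetical convenience for feeding in canned answers, and the wording of the error message is an assumption.

```python
# Hypothetical signature: read defaults to input, but can be swapped for testing.
def get_height(read=input):
    while True:
        try:
            n = int(read("Height: "))
            if n > 0:
                return n
        except ValueError:
            # The error is caught, so the program does not crash;
            # the loop simply comes around and prompts again.
            print("That's not an integer!")


# Simulate a user who types cat, then -1, then 2.
answers = iter(["cat", "-1", "2"])
height = get_height(lambda prompt: next(answers))
for i in range(height):
    print("#")
```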
to show you what's involved with getting rid of those training wheels.
Yeah.
Let's just do a very simple program, to create this idea, for i in range of 4
this time, because there are four of these things in the sky.
Odds are you know this is not going to end well, because these are unfortunately,
takes in multiple arguments, not just the thing you want to print,
Some arguments are positional, which is the fancy way of saying it's
And that's what we did all the time in C. Something comma, something
comma, something, we did it in printf all the time,
where you just separate them by commas, to give one or two or three or more
arguments.
the default new line character, and now run Mario again, now I get all four
But, really, it's not nothing, because you get the new line for free.
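A short sketch of overriding that default line ending with the named argument end. The function name print_row is made up for illustration.

```python
# Hypothetical helper: prints n question marks on one line.
def print_row(n):
    for i in range(n):
        # end="" replaces the default end="\n", so the cursor stays on this line.
        print("?", end="")
    # One empty print at the end moves the cursor to the next line.
    print()


print_row(4)
```

Because end is a named argument, it can be passed without caring about its position among print's other arguments.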
have what I intended in the first place, which was a little something that
because the print function might take 5, 10, even 20 different arguments.
All right, any questions, then, on this, and the overriding of new line.
but logically expected, like this, by just changing the line ending, too.
because that was kind of three times longer than it needs to be.
would have otherwise taken multiple lines in C, fewer, but still multiple
Let's do one last Mario example, which looked a little something like this.
So two dimensions now, not just vertical, not just horizontal, but now both.
I could now print out a hash symbol, well, let's see what this does.
What do I need to fix and where here, to make this look like this?
Any instincts?
AUDIENCE: Why don't we create a line and then we'll skip it.
I like that instinct, right, print 3, new line, print 3, new line.
OK, it's more visible, what I'm doing, but still wrong.
Yeah.
So let me do end equals quote unquote, and now, together, your solutions might
All it's doing is, whoops, all it's doing is automating that process.
But we're now implementing this same two dimensional structure here.
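The two-dimensional version might be sketched with nested loops, combining both fixes: no newline after each hash, and one newline after each row. The function name print_grid is an assumption.

```python
# Hypothetical helper: prints a size-by-size square of hashes.
def print_grid(size):
    for i in range(size):        # one pass per row
        for j in range(size):    # one hash per column
            print("#", end="")   # stay on the same line
        print()                  # end the row


print_grid(3)
```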
Yeah.
AUDIENCE: Is there any practical reason why when we write n, n is, I mean,
Why?
around the less than or equal signs, I did say add it.
Here it's actually clearer and recommended
Good observation.
All right, let's do, how about, another five minute break.
Let's do that.
And then we're going to dive into some more sophisticated problems,
and then ultimately build with some audio and visual examples, as well.
And recall that week 1 was like our most syntax-heavy week.
It was when we were first learning how to program in C. But after week 1,
And we'll do that again here, condensing some of those first early weeks
and way more time-consuming to do in C, even more so than the speller example.
but you want to see it officially, you can go to the Python documentation,
docs.python.org here.
terribly user-friendly.
you're interested in, to find your way to the appropriate page on Python.org,
But googling, of course, things like how to implement problem set 6 in CS50,
But moving forward, and really with programming in general, like Google
more popular or more commonly used than these other ones, is third on the list.
But it's purple, because I clicked it a moment ago, when looking for it.
to look like.
And if we keep looking, you'll see mention of Lstrip, which is left strip.
I used its analog, Rstrip before, right strip, which allows you to remove,
that is strip, from the end of a string, something like white space,
And this is just testament to just how rich the language itself is.
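As a brief sketch of those string functions on an assumed example string, repr is used so the leftover whitespace is visible in the output:

```python
s = "  hello, world \n"

print(repr(s.lstrip()))  # strips whitespace from the left only
print(repr(s.rstrip()))  # strips whitespace (including the newline) from the right
print(repr(s.strip()))   # strips both ends
```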
And what we'll do today and this week in problem set 6 is really
But you won't know all of Python, just like you won't know all of C.
And, honestly, you won't know all of any of these languages on your own,
and even then, there's more libraries than one might even retain themselves.
And let me whip up, say, a recreation of our scores example from week two,
called scores, sorry, let me give myself a variable in Python called scores.
are the same ones we've used before, 72, 73, 33, in this context
And it turns out I can do, well, how did I sum these before?
I probably had a for loop to add one, then I knew how long they were.
and it just does the sum for you, with a for loop or whatever
Len gives you the length of the list, how many things are in it.
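With those two functions, the average becomes a one-liner. A minimal sketch using the three scores mentioned:

```python
scores = [72, 73, 33]

# sum adds up the list; len counts its elements. No loop required.
average = sum(scores) / len(scores)

print(f"Average: {average}")
```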
Now let me go ahead and print out, using print, the word average, and then,
and make it a little more interesting, and actually get input from the user
because I don't want to deal with all the exceptions and the loops.
well, that's pretty stupid, because you can't add things to it.
But you and I are not dealing with all the pointers underneath the hood.
So now, let's go ahead and get a whole bunch of scores from the user.
So for i in range of 3, let's go ahead and grab a score from the user,
And then let's go ahead and append, to the scores list, that particular score.
Just like in C, I could tighten this up and do something like this as well.
It's more clear, to me, at least, that what I'm doing here, getting the score
and then appending it to the list.
Just so you've seen it, Python does have some neat syntactic tricks,
The catch is, you need to put the one score I'm
But it's necessary, so that this thing and this thing are both lists.
So now maybe it's a little more clear that scores and brackets score
or joined together.
So two different ways, not sure one is better than the other.
This way is pretty common, but .append is also quite reasonable as well.
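Side by side, the two ways of growing a list look like this. The scores are hardcoded as an assumption, in place of values typed by the user.

```python
# Way 1: append each score to the end of the list.
scores = []
for score in [72, 73, 33]:
    scores.append(score)

# Way 2: concatenate. The single score goes in square brackets
# so that both sides of the + are lists.
others = []
for score in [72, 73, 33]:
    others = others + [score]

print(scores)
print(others)
```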
Let me get a string from the user, asking them for a before string.
And then let me go ahead and say, after, just to demonstrate some changes,
and print them out in uppercase, one way to do that would be this.
For c in the before string, go ahead and print out c.uppercase, sorry, c.upper,
but don't end the line yet, because I want to keep these all on the same line
So what am I doing?
How?
I then just print out some fluffy text that says after colon,
and I get rid of the line ending, just so I can kind of line these up.
you don't have to do like Int i equals 0 and i less than this,
you could just say, for c in the string in question, for c in before.
Based on what we've seen thus far, like from our agreement example,
And so let me go ahead and just tweak my print statement a little bit.
Let me just go ahead and print out the after variable here, after creating it.
I'm calling upper on the whole string, not one character at a time.
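Both approaches can be sketched together, with the before string hardcoded as an assumption:

```python
before = "hello"

# Version 1: uppercase one character at a time, keeping them on one line.
print("After:  ", end="")
for c in before:
    print(c.upper(), end="")
print()

# Version 2: call upper on the whole string at once.
after = before.upper()
print(f"After:  {after}")
```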
Why?
and what this kind of function, upper, represents, with its docs.
No?
Oh.
AUDIENCE: Could you write, very close to variable string, and then print upper,
No, I don't.
You want to resist the temptation of having like a long line of code that's
inside the curly braces, because it's just going to be harder to read.
All right, how about command line arguments, which was one thing
we introduced in week two also, so that we could actually have the ability
So we could actually take input from the user at the command line,
So if you want access to the argument vector, argv, you import it.
Let's write a program that just requires that the user types in two, a word,
So if the length of argv equals 2, let's go ahead and print out, how about,
Hello comma argv bracket 1 close quote, else if they don't type two words
total at the prompt, let's just say the default's, like we did weeks ago,
Hello, world.
So the only thing that's new here is we're importing argv from sys,
and we're using this fancy f-string format, which kind of to your point,
In this case, it's a list called argv, and we're getting bracket 1 from it.
the word Python does not appear in the argv list, just to be clear.
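A sketch of that program follows. The decision is wrapped in a hypothetical function named greeting so that it can be tried with made-up argument lists as well as the real sys.argv.

```python
import sys


# Hypothetical helper: argv is a list like ["greet.py", "David"].
def greeting(argv):
    if len(argv) == 2:
        # argv[0] is the file name; argv[1] is the first word after it.
        return f"hello, {argv[1]}"
    else:
        return "hello, world"


print(greeting(sys.argv))
```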
If I do, sorry, if I do Foo and bar, those words all print out.
And Foo and bar or baz are like a mathematician's x and y and z
for computer scientists, when you just need some placeholder words.
It reads a little more like English, and a for loop is just much more concise,
allows you to iterate very quickly when you want something like that.
Suppose I only wanted the real words that the human typed
Or, this is what's kind of neat about Python, too, let me undo that.
And let me just take a slice of the array of the list instead.
If, for whatever reason, you want to ignore the last element,
even if at first glance, you might not need them for typical things.
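Slices can be sketched on a stand-in list. The list below is hardcoded as an assumption; in a real run it would be sys.argv.

```python
# Stand-in for sys.argv after running: python argv.py foo bar baz
argv = ["argv.py", "foo", "bar", "baz"]

# Start at index 1 and go to the end: skips the file name.
for arg in argv[1:]:
    print(arg)

# Start at the beginning and stop before the last element.
print(argv[:-1])
```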
So in one last program here, let's do Exit.py, just to do one more mechanic,
Let's make sure the user gives me one command line argument.
So if the length of argv does not equal 2 in total, then let's go ahead
We can exit.
Turns out the better way to do this is with sys.exit, because I can then exit
So if I, for instance, just run the program like this, oops, I screwed up.
So I have exactly two command line arguments, the file name and my name,
And let's just change our syntax, kind of like I proposed for CS50,
that we want to import from a library, this is one way to avoid collision.
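A sketch of exit statuses in that style. As an assumption made so the example can run to completion, main returns the status instead of exiting directly, and the error message wording is invented.

```python
import sys


# Hypothetical structure: main returns the status that the program should exit with.
def main(argv):
    if len(argv) != 2:
        print("Missing command-line argument")
        return 1  # non-zero signals failure
    print(f"hello, {argv[1]}")
    return 0      # zero signals success


# Try it with a made-up argument list.
status = main(["exit.py", "David"])
print(status)

# In a real script, the last line would instead be:
#     sys.exit(main(sys.argv))
```

Writing sys.exit and sys.argv with the prefix makes it clear which library each name comes from.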
All right, only to demonstrate how we can implement that same idea.
on a list of numbers.
And then let's just exit successfully, with 0, else, if we get down here,
Here is your loop, that's doing all of the checking for you.
And those were Bill and Charlie and Fred and George and Ginny,
Now just ask the question, if Ron is in names, then let's go ahead
So, again, this just does linear search for us by default, Python of Names.py,
we found Ron, because, indeed, he's there, and at the end of the list.
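The search itself can be sketched in a couple of lines. The list here is abbreviated to the names mentioned, and the function name search is made up for illustration.

```python
# Abbreviated list of names, with Ron at the end.
names = ["Bill", "Charlie", "Fred", "George", "Ginny", "Ron"]


# Hypothetical helper: "in" performs the linear search for us.
def search(name):
    if name in names:
        return "Found"
    return "Not found"


print(search("Ron"))
```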
Because we first had two arrays, one with names, one with numbers.
You can do this in Python, using objects and things called classes.
because just like in P set 5, you can associate keys with values, using
and a dictionary again is just keys and values, words and definitions.
But if I know what I want to put in it by default, let's put Carter in there,
with a number of plus 1-617-495-1000, just like last time, and put myself,
we have fixed the code that underlies that little Easter egg.
Spoiler ahead.
There is some new syntax here in Python, not just the curly braces,
but the colons, and the quotes on the left and the right.
when we look at CSS and HTML and web programming, keys and values
because it's just a really useful way of associating one thing with another.
So let's write a program that gets a string from the user and asks them
whose number they would like to look up.
Then, let's go ahead and say, if that name is in the people dictionary,
to that specific name, within there, using an f-string for the whole thing.
Linear search and dictionary lookups will just happen automatically for you
So I'm using square brackets, because here's the interesting thing in Python,
And then let's just, or, sorry, let's say, number equals people bracket name.
But that can actually be a string, like a word the human has typed.
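A sketch of the dictionary-based phone book. Carter's number is the one mentioned; the second entry is an invented placeholder, and the function name lookup is an assumption.

```python
# Keys on the left of each colon, values on the right.
people = {
    "Carter": "+1-617-495-1000",
    "Emma": "+1-555-555-0100",  # hypothetical entry for illustration only
}


# Hypothetical helper: looks a name up and reports the number.
def lookup(name):
    if name in people:
        # Square brackets index into the dictionary with a string, not a number.
        number = people[name]
        return f"Number: {number}"
    return "Not found"


print(lookup("Carter"))
```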
It's this table, that you can look up in one column the name,
Phonebook.py.
What's going on?
Print found.
I am confused.
[KEYS CLICKING]
AUDIENCE: I don't.
[LAUGHTER]
Say again?
SPEAKER 47: When you found the test results, it was doing both commands.
One sec.
[KEYS CLICKING]
Whoa, OK.
So.
[LAUGHTER]
[APPLAUSE]
Thanks.
All right.
Thus far, these simple phone book examples throw the information away.
comes with a library that just handles CSV files for you.
Write just blows it away if it exists, append adds to the bottom of it.
Now let me go ahead and get a couple of values from the user.
Then let me getString again, and ask the user for their number.
because I'm just printing a list to that particular row in the file.
if you will, where the comma represents the separation between your columns.
You can instead say, with the opening of a file called Phonebook.csv
in append mode, calling the thing file, go ahead and do all of these lines
here.
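Here is a sketch of appending a row to a CSV file using with. As assumptions, it writes into a temporary folder rather than the current directory, and the helper names add_contact and read_rows are made up.

```python
import csv
import os
import tempfile


# Hypothetical helper: appends one name/number row to the CSV at path.
def add_contact(path, name, number):
    # "a" is append mode. The with block closes the file automatically.
    with open(path, "a", newline="") as file:
        writer = csv.writer(file)
        writer.writerow([name, number])


# Hypothetical helper: reads every row back as a list of lists.
def read_rows(path):
    with open(path, newline="") as file:
        return list(csv.reader(file))


# Use a temporary folder so the sketch does not leave files lying around.
path = os.path.join(tempfile.mkdtemp(), "phonebook.csv")
add_contact(path, "Carter", "+1-617-495-1000")
print(read_rows(path))
```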
And it's used in a few different ways, but one of the ways it's used
This just has the effect of ensuring that you, the programmer,
might have complained at you, if you had a file that, you didn't close a file,
you might have had a memory leak as a result. The with keyword
Let's do this.
here, or online, go to this URL here, where you'll find a Google form.
And just to show that these CSVs are actually kind of omnipresent,
And via this form, will we then have the ability to export,
here, where we have 200 plus responses to a simple question of the form, what
But you can do the same thing with Office 365 as well.
And I'm going to shorten its name, just so it's a little easier to read.
And then we can see, in the file, that there's two columns, timestamp column
when people filled out the form, with someone very early in class.
And the second value, after each comma, is the name of the house.
the fact that Google's GUI or graphical user interface, can do this for me.
will be initialized to 0.
The values are initially zero, but I'm going to use this,
to this form.
With opening Hogwarts.csv, in read mode, not append, I don't want to change it.
that is using the reader function in the CSV library, by opening that file.
I'm going to go ahead and ignore the first line of the file,
because, recall, that the first line is just timestamp and house.
It figures out where the comma is, and, for every row in the file,
that has the effect of iterating over every line of the file,
else.
And notice that I'm using the name of the house to index into my dictionary,
And count will be the result of indexing into houses, for that given house.
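The counting logic can be sketched against a few made-up rows. The in-memory data below is a hypothetical stand-in for the exported file, with placeholder timestamps, and the exact header names are assumptions.

```python
import csv
import io

# Hypothetical sample data standing in for the exported CSV file.
data = io.StringIO(
    "Timestamp,House\n"
    "t1,Gryffindor\n"
    "t2,Slytherin\n"
    "t3,Gryffindor\n"
    "t4,Ravenclaw\n"
)

# Every count starts at 0.
houses = {"Gryffindor": 0, "Hufflepuff": 0, "Ravenclaw": 0, "Slytherin": 0}

reader = csv.reader(data)
next(reader)  # skip the header row
for row in reader:
    house = row[1]       # second column holds the house
    houses[house] += 1   # use the house name to index into the dictionary

for house in houses:
    count = houses[house]
    print(f"{house}: {count}")
```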
And let me close my quote.
And that's just my now way of code, and this is, oh,
And one of the reasons that Python is so popular for data science and analytics,
more generally, is that it's actually really easy to manipulate data, and run
Capital D, capital R, this means I can throw away this next thing,
still returns to me every row from the file, one after the other,
It gives me a dictionary.
but it's way more resilient, especially if I'm using Google Spreadsheets,
and I'm moving the columns around or doing something like that,
But I now don't have to worry about where those individual columns are.
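The same count with a DictReader might look like this. The sample rows are again invented, and the columns are deliberately swapped to show that the order no longer matters; the header name House is an assumption.

```python
import csv
import io

# Hypothetical sample data, with the columns in the opposite order.
data = io.StringIO(
    "House,Timestamp\n"
    "Gryffindor,t1\n"
    "Slytherin,t2\n"
    "Gryffindor,t3\n"
)

houses = {"Gryffindor": 0, "Hufflepuff": 0, "Ravenclaw": 0, "Slytherin": 0}

# DictReader consumes the header row itself, so there is no next(reader),
# and each row is a dictionary keyed by column name.
reader = csv.DictReader(data)
for row in reader:
    house = row["House"]
    houses[house] += 1

print(houses)
```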
we'll do in P set 6.
because they require audio that the browser won't necessarily support.
with speech.
for initialize.
And then I'm going to go ahead and tell this engine to run and wait,
I'm using another popular program that we used in CS50 back in my day,
Let's ask the user for their name, like what's your name question mark.
And then, let's use the little f-string, and say, not Hello, world,
David.
And I'm going to go into a folder called Detect, whoops, a folder called
Faces.py.
Sorry, Faces.
and then you can use the line of code like this.
and you get back a list of all of the faces in the image.
And then down here, a for loop, that iterates over all of those
face locations.
I figure out the top, right, bottom, and left corners of those locations.
But per the documentation, this just draws a nice little box
So let me go ahead and zoom out here, and run this now on Office.jpeg.
All right, it's analyzing, analyzing, and you can see in the sidebar here,
And here is every face that my, what, 10 lines of Python code
maybe without a mask, that has two eyes, a nose, and a mouth,
faces here.
Recognize.py, which is taking two files as input, that image and the image
Why?
but it looks similar enough to the person, that it all worked out OK.
online, too.
Hello, world.
So this first version of the program is just using some relatively simple, if
elif elif, and it's just asking for input, forcing it to lowercase.
Now let's do a cooler version, using a library, just by looking at the effect.
Python of Listen1.py.
Hello, world.
Huh.
Hello, world.
The third version of this program that actually analyzes the words that are
said.
And let me give you the URL of like a lecture video on YouTube, or something
And I'm going to use the os.system function to open QR.png automatically.
you can see the result of my barcode, that's just been dynamically generated.
[UPROAR]
[APPLAUSE]
[MUSIC PLAYING]
[MUSIC PLAYING]
for having created this very festive scenery, and all past ones as well.
Well, with that said, problem set 6 certainly added some challenges,
But hopefully you've begun to appreciate that with Python, just a lot more stuff
is easier to do.
You get more out of the box with the language itself.
So that by term's end, and perhaps even for your final project,
In Python, we've played around with the CSV, comma-separated values library.
And that's a huge limitation, because pretty much all of the examples
But today, we'll start to focus all the more on storing things on disk,
that actually begin to grow, and grow, and grow their data sets,
as might happen if you get more and more users, for instance, on a website.
To play, then, with this new capability of being able to write files,
And that Google Form is going to ask you in just a moment for really
And what are one or more genres into which your TV show falls?
that I'll be able to see as the person who created this form, which
so that we can actually play around with the data that's come in.
If you need that URL again here, if you're just tuning in,
All right.
Good.
So I'm going to go up to the File menu, if you've never done this before.
But more simply, and the one we'll start to play with here,
is comma-separated values.
So CSV files we used this past week, why are they useful?
Now that you've played with them or used them in past real world,
what's the utility of a CSV file versus something like Excel, for instance?
Any instincts?
Yeah?
A simple text file with ASCII or Unicode text is probably pretty small.
I like that.
Other thoughts?
like commas you can represent the idea of columns using new lines,
or installed it.
And let me go ahead here, and in just a moment, let me just simplify the name.
And give me just one moment here, and you'll see that, indeed,
is that you can just drag and drop a file, for instance, into your Explorer.
but sometimes it's like crime, comma, drama, or action, comma, crime, comma,
drama.
Yeah?
AUDIENCE: [INAUDIBLE]
So what Google has done, what Microsoft does, what Apple does
have all of this data with which we can play in the form of what we'll
Let me go ahead, then, and open up, for instance, just my Terminal window.
And let's go ahead and iteratively start simple by just opening up this file
So you might recall that we can do this by doing something like import csv
Then, I can go ahead and do something like with open, the name of the file
called file.
It deals with the process of opening it, reading it, and giving you
back something that you can just iterate over, like with a for loop
I do want to skip the first row, and recall that I can do this.
Next, reader, is this little trick that just says, ignore the first row.
But this means now that I've skipped that first row.
For row in the reader, let's go ahead and print out the title
of the show each of you typed in.
How do I get at the title of the show each of you typed in?
Yeah?
AUDIENCE: [INAUDIBLE]
the second column, 0 index, that is, the one in the middle with the title.
but it's a quick and dirty way to figure out, all right, what's my data
look like?
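A minimal sketch of that first version, assuming the file has three columns in the order Timestamp, title, genres. The sample rows and the `read_titles` helper are made up for illustration; in the lecture the data comes from `favorites.csv` on disk.

```python
import csv
import io

# Hypothetical sample standing in for favorites.csv.
SAMPLE = """Timestamp,title,genres
10/19/2020 13:00:00,The Office,Comedy
10/19/2020 13:00:05,How I Met Your Mother,Comedy
"""

def read_titles(file):
    reader = csv.reader(file)
    next(reader)  # skip the header row
    # row[1] is the second column (0-indexed), which holds the title
    return [row[1] for row in reader]

titles = read_titles(io.StringIO(SAMPLE))
for title in titles:
    print(title)
```

With a real file, the same function could be called as `read_titles(file)` inside `with open("favorites.csv", "r") as file:`.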
And you'll see now a purely textual list of all of the shows
Yeah?
AUDIENCE: User errors [INAUDIBLE].
or just stylistic differences that give the appearance that one show
No big deal.
But this is just a tiny example of where data in the real world
It was just someone not caring as much to capitalize it, and that's fine.
and let's actually do something a little more user friendly for me.
Instead of a reader, recall that there was this dictionary reader that's
And it means I can type in dictionary reader here, passing in the same file.
But now, when I iterate over this reader variable, what is each row?
Yeah?
AUDIENCE: [INAUDIBLE]
just way more obvious to me, the author of this code, what it is I'm
getting at.
Was it 0?
Was it 1?
Was it 2?
And God forbid someone changes the data by just dragging and dropping
Now the effect of this change isn't going to be really any different.
But I've now not made any assumptions as to where each of the columns
actually is numerically.
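A small sketch of the same idea with `csv.DictReader`, assuming a header named `title`. The sample data and the `read_titles` helper are hypothetical; the second sample deliberately reorders the columns to show that the code no longer cares where the title sits.

```python
import csv
import io

# Hypothetical samples; the second has its columns in a different order.
SAMPLE = "Timestamp,title,genres\nt1,The Office,Comedy\nt2,Friends,Comedy\n"
REORDERED = "genres,title,Timestamp\nComedy,The Office,t1\nComedy,Friends,t2\n"

def read_titles(file):
    # Each row is a dictionary keyed by the names in the header row.
    reader = csv.DictReader(file)
    return [row["title"] for row in reader]

titles = read_titles(io.StringIO(SAMPLE))
titles_reordered = read_titles(io.StringIO(REORDERED))
print(titles)
```

Note that `DictReader` consumes the header row itself, so there is no need for `next(reader)` here.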
All right.
Because there's a lot of commonality among some of the shows here, so let's
If I'm reading a CSV file top to bottom, what intuitively might be the logic
It's not going to be quite as simple as a simple function that does it for me.
I'm going to have to build this.
AUDIENCE: [INAUDIBLE]
I could use a list and I could add each title to the list,
And actually, let me invert the logic so I'm doing something proactively.
then, go ahead and do something like titles.append the current row's title.
And then, what can I do at the very end, after I'm all
So it's two loops now, and we can come back to the quality of that design.
Let me increase the size of my Terminal window so we can focus just on this,
Any thoughts?
And for those of you who might have accidentally or instinctively hit
But let me go ahead and strip off, from the left and the right implicitly,
any whitespace.
If you read the documentation for the strip function, it does just that.
And remember, what's handy about Python is you can chain some of these function
So now, I'm going to just check whether this specific title is in titles.
So let me go ahead and run Python of favorites.py again and hit Enter.
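The list-based approach can be sketched roughly as follows. The input list and the `unique_titles` helper are hypothetical stand-ins for the rows read from the CSV; the chained `strip().upper()` is the canonicalization described above.

```python
def unique_titles(raw_titles):
    titles = []
    for raw in raw_titles:
        # Chain two string methods: trim whitespace, then force uppercase.
        title = raw.strip().upper()
        # Only append titles that have not been seen yet.
        if title not in titles:
            titles.append(title)
    return titles

titles = unique_titles(["The Office", " the office ", "Friends"])
for title in titles:
    print(title)
```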
All right.
And now, it's a little more overwhelming to look at because it's not sorted yet
Now, before we dive in further and clean this up further than this,
and this would just modestly allow me to refine my code here, such
Marginally better design if you know that a set exists because you're just
We've now gone ahead and fixed the problem of case sensitivity.
Let's go ahead now and sort these things by the titles themselves.
you all inputted them, but filtering out duplicates as we go, let me go ahead
and use another function in Python you might not have seen,
And now you can really see how many of these shows start with the word "the"
or do not.
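A sketch of the set-plus-`sorted` version, again with a made-up list in place of the CSV rows. A set discards duplicates on its own, so the membership check disappears, and `sorted` returns the titles in alphabetical order.

```python
raw_titles = ["The Office", "Friends", " the office", "Avatar", "FRIENDS "]

titles = set()
for raw in raw_titles:
    # Adding a value that is already present has no effect.
    titles.add(raw.strip().upper())

# sorted returns a new list; the set itself has no order.
ordered = sorted(titles)
for title in ordered:
    print(title)
```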
But now you can really see some of the differences in people's inputs.
So far, so good.
But a few of you decided to stylize Avatar in three different ways here.
And I think if we keep going we'll see further and further variances that we
by now iterating over my data, and keeping track of how many of you
We're going to ignore the problems like Brooklyn 99 and the Avatar.
For each title, I want to keep around how many times I've seen it before.
I'm throwing away the total number of times I see these shows.
Elaborate on that.
AUDIENCE: [INAUDIBLE]
store keys and values, that is, associate something with something
else.
Because they just let you remember stuff in some kind of structured way.
the values could be the number of times I've seen each of those titles.
maybe this is the title that I've seen, and this is the count over here.
allow me to store this data, and then maybe do some simple arithmetic
to figure out which is the most popular.
So let's do this.
two curly braces that are empty gives me an empty dictionary automatically.
the syntax for that, like before, is titles, bracket, and then
and then eventually I'll come back and finish my second loop
But for now, let's just keep track of the total counts.
Huh.
And wow, we didn't even get past the second row in the file
I'm blindly indexing into the dictionary using a key, How I Met Your Mother,
because the key you're trying to use just doesn't exist yet.
We're close.
We got half of the problem solved, but I'm not handling the obvious, now,
Yeah?
AUDIENCE: Counter.
And recall, this is just shorthand notation for the same thing as in C,
title plus 1.
Whoops, typo.
Don't do that.
That's the same thing as this but it's a little more succinct
Else, if it's logically not the case that the current title is in the titles
AUDIENCE: Zero.
I just have to put some value there so that the key itself is also there.
All right.
I can, as a quick check, let me go ahead and just run the code
I could, say, print out the title, and then, maybe, let's do something like--
Huh.
Huh.
None of you said a whole lot of TV shows, it seems.
Yeah?
AUDIENCE: [INAUDIBLE]
obviously, I'm seeing this title for the very first time.
And then I could get rid of the else, and now blindly index
I don't know.
This one's a little nicer, maybe because it's one line fewer.
have to make sure the key exists before we presume to actually increment it.
There we go.
So otherwise, everyone would have liked this show once, and no matter
Let me run Python of favorites.py, and now we see more reasonable counts.
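The counting logic, including the fix for the missing key, might look like this. The `count_titles` helper and the sample list are illustrative assumptions, not the lecture's exact file.

```python
def count_titles(raw_titles):
    titles = {}  # empty dictionary: title -> number of times seen
    for raw in raw_titles:
        title = raw.strip().upper()
        # Without this check, titles[title] += 1 raises KeyError
        # the first time a title appears.
        if title not in titles:
            titles[title] = 0
        titles[title] += 1
    return titles

counts = count_titles(["The Office", "Friends", "the office "])
for title in counts:
    print(title, counts[title])
```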
But I bet if we sort these things we can start to see a little more detail.
So let's reintroduce the sorted function as I did before, but no other changes.
And given a title, it's going to return the value of that title.
where you can pass in, crazy enough, the name of a function
that it will use in order to determine what it should sort by, by the key,
around so that they can be called for you later on by someone else.
But if you provide them with a function called get value, or anything else, now
to determine, OK, if you don't want to sort by the key of the dictionary, what
So let me go ahead now and rerun this after increasing my Terminal, Python
of favorites.py, Enter.
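A sketch of sorting by count rather than by title. The dictionary contents are made up, and `reverse=True` is an assumption added here so that the most popular title comes first.

```python
# Hypothetical counts, as produced by the counting loop.
titles = {"FRIENDS": 2, "THE OFFICE": 3, "AVATAR": 1}

def get_value(title):
    # Given a title (a key), return its count (the value).
    return titles[title]

# Pass the function itself, without parentheses; sorted calls it
# once per title to decide the ordering.
ranked = sorted(titles, key=get_value, reverse=True)
for title in ranked:
    print(title, titles[title])
```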
typed in, albeit forced to uppercase and with any whitespace thrown out.
But it took me a lot of work just to get simple answers out of it.
for today, ultimately, is, how can we just make this easier?
it's not like a library function that you want to keep around--
Well, it's kind of stupid that I invented this name on line 13.
If it's only being used in one place, why bother giving it a name at all?
When you use this special lambda keyword that says, hey Python,
it then says, Python, this anonymous function will take one parameter.
It throws away the parentheses, and it throws away the return keyword just
And then the function you're passing it to, like sorted, will use it as before.
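The same sort with an anonymous function can be sketched as below. The counts are hypothetical, and `reverse=True` is again an assumption so the largest count comes first.

```python
titles = {"FRIENDS": 2, "THE OFFICE": 3, "AVATAR": 1}

# lambda title: titles[title] is a one-parameter function with no name,
# no parentheses around the parameter, and no return keyword.
ranked = sorted(titles, key=lambda title: titles[title], reverse=True)
for title in ranked:
    print(title, titles[title])
```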
The goal here has been to write a Python program that just starts
Yeah?
AUDIENCE: [INAUDIBLE]
DAVID J. MALAN: Could you use the lambda if it's just returning immediately?
AUDIENCE: [INAUDIBLE]
DAVID J. MALAN: Good question.
Yes, but you would just ultimately return the value in question.
All right.
Office was clearly popping out of the code here quite a bit.
So let me go ahead and throw most of this code away, up until this point
And let me go ahead, and I don't even want the global variable here.
So counter equals 0.
If title equals, equals The Office, I could then go ahead and say,
number of people who like The Office is, whatever this value is.
But let's go ahead now and deliberately muddy the data a bit.
All of you were very nice in that you typed in The Office.
And many people might just write Office, you could imagine.
would have if we had even more and more submissions over time.
Now let's go ahead and rerun this program, no changes to the code.
The data is now as I mutated it to have a couple Offices, and many The Offices.
How could I change my Python code to now count both of those situations?
Any thoughts?
Yeah?
DAVID J. MALAN: Yeah, so I could just ask two questions like that.
I don't have to worry about spaces because I at least threw that all away.
So I like that.
Avatar had three different permutations, and there were some others
OK.
[APPLAUSE]
So The V Office.
could have like 26 conditions if someone said The A Office, or The B Office,
right?
But then there's surely going to be other typos that are possible.
But it turns out we got lucky and now this is actually the accurate count.
Let me show another way that just adds another tool to our toolkit.
It turns out that there's this feature in many programming languages, Python
But it's going to be really useful, actually, maybe toward final projects,
in web programming, any time you want to clean up data or validate data.
you might know that there's toggles like this in Google's world,
So here's an example in Google Forms how you can validate users' input.
But a feature most of you have probably never noticed, or cared about, or used,
And I could actually reimplement that same idea by doing something like this.
I can say, let the user type in anything represented by .star, then an at sign,
something else.
And I think I want the same thing here 1 or more times, 1 or more times.
I'm sure, pretty cryptic, there's this mini language built into Python,
and JavaScript, and Java, and other languages that allows you to express
Let me put up, for instance, a summary of what it is you can do.
And here's just a quick summary of some of the available symbols.
characters.
It can be B or nothing.
Change that to a plus and you now express one or more characters.
Caret symbol means start matching at the beginning of the user's input.
Dollar sign means stop matching at the end of the user's input.
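A few of those symbols can be tried out in Python with the `re` module. This is a rough sketch: the two helper functions and the patterns are illustrative, and the email check is deliberately crude (something, an at sign, something).

```python
import re

def looks_like_email(text):
    # ^ anchors at the start, .+ means one or more of any character,
    # @ is a literal at sign, and $ anchors at the end.
    return re.search(r"^.+@.+$", text) is not None

def matches_office(text):
    # (THE )? makes the leading "THE " optional: zero or one occurrence.
    return re.search(r"^(THE )?OFFICE$", text) is not None

print(looks_like_email("user@example.com"))
print(looks_like_email("@"))
print(matches_office("THE OFFICE"))
print(matches_office("THE V OFFICE"))
```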
But let me go over here and actually tackle this Office problem.
Let me go ahead and import a new library called the regular expression library,
import re.
Let's just search for Office, quote, unquote, in the current title.
has a function called search that takes as its first argument a pattern,
So it's sort of looking for a needle in this haystack, from left to right.
Let me go ahead now and run this version of the program, Enter.
And now I screwed up because I forgot my colon, but that's old stuff.
Enter.
Huh.
Yeah?
AUDIENCE: [INAUDIBLE]
I'm adding in some parentheses just like in math, just to add another symbol
And this is saying start matching at the beginning of the user string.
Check if the beginning of the string is Office, or the beginning of the string
is The Office.
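A sketch of the counting loop with that pattern. The input list and `count_office` helper are hypothetical, and the trailing `$` is an assumption added here so that a title such as "Office Space" would not be counted.

```python
import re

def count_office(raw_titles):
    counter = 0
    for raw in raw_titles:
        title = raw.strip().upper()
        # Match only if the whole title is OFFICE or THE OFFICE.
        if re.search("^(OFFICE|THE OFFICE)$", title):
            counter += 1
    return counter

count = count_office(["The Office", "Office", "The V Office", "Office Space"])
print(f"Number of people who like The Office: {count}")
```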
And now we're down to 15, which used to be our correct answer,
to solve problems.
This is one that you've actually probably glanced at but never used
And we're just scratching the surface of what's actually possible with this.
But let's now do one final example just using some Python code here.
a little more general purpose that allows me to search for any given title
at the beginning of this program, and first ask the user for the title
And then whatever they type in, let's go ahead and strip whitespace
So again, the only difference is I'm asking the human for some input
this time.
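The more general version might be sketched as a function that takes the title to look for. The `count_title` name and sample list are made up; in a script, the first argument could come from `input("Title: ")`.

```python
def count_title(wanted, raw_titles):
    # Canonicalize the search term the same way as the data.
    wanted = wanted.strip().upper()
    counter = 0
    for raw in raw_titles:
        if raw.strip().upper() == wanted:
            counter += 1
    return counter

favorites = ["The Office", "THE OFFICE ", "Office", "Friends"]
print(count_title(" the office ", favorites))
```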
I could type in the office all lowercase even, and now we're down to 13.
13, why?
Because I'm the one that went in and removed those The keywords a bit ago.
and can solve these kinds of problems, we had to write almost 20 lines of code
Anything here?
No?
All right.
So we are back.
This isn't to say that you shouldn't use Python to do the kinds of things
And in fact, it might be super common if you're getting a lot of messy input
And maybe the best way to do that is to write a program so that step-by-step
like we did with The Office, for instance, again and again,
and reuse that code, especially if more and more submissions are
coming through.
that sometimes there are different, if not better tools for the same job.
and the week after that, synthesizing a whole lot of these languages
you might decide what the trade-offs are between using this tool, or this tool,
Again, just a very simple file, flat in that there's no hierarchy to it.
It's just rows and columns.
You can read data from it, and you can have multiple sheets, a.k.a.,
are meant to be reused really by humans with their mouse and their keyboard,
And indeed, most any mobile app or web app today that you or someone else
it only does four things fundamentally, known by this silly acronym, CRUD.
That's it.
There's a few more keywords that exist in this language called SQL
The ability to create or insert data is the C. The ability to select data
is the R, or read.
in SQL that, at the end of the day, just allow you to create, read, and update
you can create using this language called SQL, in your very
own database, a brand new table.
In the world of programming, though, if you want to create the analogue of that
like a sheet, that has a name, and then in parentheses has one or more columns.
and Numbers does allow you to format or present data in different ways,
it's not strongly typed data like it is, for instance, when we were using C.
Why?
you give the database about your data, the more performant it can be,
the faster it can help you get at and store that data.
Now how can I go about converting, for instance, some real data,
There's something called MySQL that's been very popular for years.
If you download an app that stores data like your own contacts,
Because it's fairly lightweight, but you can still store hundreds,
thousands, even tens of thousands of pieces of data
We're going to go ahead and run SQLite3 with a file called favorites.db.
Once I'm inside of the program, now I'm going to go ahead and enter CSV Mode.
into a table, that is, a sheet, if you will, called favorites as well.
Now I'm going to hit Enter and I'm going to go ahead and exit the program
the CSV file, the Python file from before, and now favorites.db.
But if I did this right, all of the data you all typed into the CSV file
has now been loaded into a proper database where I can now use
So let's go ahead again and run SQLite3 of favorites.db, which now exists.
Now no thought was put into the design of this data at the moment
give more thought to the data types and the columns that we have.
called favorites.
timestamp, title, and genres, which were inferred, obviously, from the CSV.
Again, once we're more comfortable we'll create our own tables,
All right.
Well, if I wanted to, for instance, start playing around with data therein,
Let me find the right one here-- one of which would be select.
opening the file, creating a reader or a DictReader, iterating over every row,
and it's simulating what it looks like if it were more graphical by creating
So here's perhaps a better tool for the job once you have the data.
might allow you now to get more work done more quickly
Someone else has figured out how to select data like this.
Well, let me go ahead and pull up, in a moment, just a little bit
Select columns from table then, is the generic form of what I just did.
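To try that form without the `sqlite3` command-line program, here is a sketch using Python's built-in `sqlite3` module and an in-memory database. The table layout mirrors the imported CSV, but the rows are made up for illustration.

```python
import sqlite3

# An in-memory database stands in for favorites.db.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE favorites (Timestamp TEXT, title TEXT, genres TEXT)")
db.executemany(
    "INSERT INTO favorites VALUES (?, ?, ?)",
    [("t1", "The Office", "Comedy"), ("t2", "Friends", "Comedy, Drama")],
)

# SELECT columns FROM table
rows = db.execute("SELECT title, genres FROM favorites").fetchall()
for title, genres in rows:
    print(title, "|", genres)
```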
Suppose I wanted to get two things, like the genres that each of you inputted.
Thank you.
And now, OK, we're going to have to clean some of these up.
There it is.
Let's go back to the titles, though, and perhaps start playing around
Enter.
And you can see here that we have just the distinct titles,
So there's a trade-off.
One of the things I was doing in Python was forcing everything to uppercase
And that's actually going to get rid of some of those values as well.
And again, I did it all in one simple line that was fast.
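A sketch of that one-line query, run from Python against a small in-memory table. The rows are hypothetical; `DISTINCT` and `UPPER` are standard SQLite features.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE favorites (Timestamp TEXT, title TEXT, genres TEXT)")
db.executemany(
    "INSERT INTO favorites VALUES (?, ?, ?)",
    [
        ("t1", "The Office", "Comedy"),
        ("t2", "the office", "Comedy"),
        ("t3", "Friends", "Comedy"),
    ],
)

# Uppercase every title, then keep only one copy of each.
rows = db.execute("SELECT DISTINCT(UPPER(title)) FROM favorites").fetchall()
distinct_titles = {row[0] for row in rows}
print(distinct_titles)
```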
We can also qualify our selections by saying where some condition is true.
you can have the same in SQL as well, where I can filter my data where
Order by, limit, and group by are other commands I can execute, too.
How about, let me just get, oh, I don't know, all of the titles from favorites
That might be one thing that's helpful to see if you just care about some
How about, select all of the titles from favorites, where the title itself
Those are the two rows, recall, that I mutated by getting rid of the word The.
I can select the title from favorites where the title is like, quote,
unquote, "Office."
But I can add, a bit weirdly, percent signs to the left and the right.
So this will just grab any title that contains O-F-F-I-C-E in it in that
order.
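The difference between an exact match and a `LIKE` with percent signs can be sketched as follows, using made-up rows. In SQLite, `LIKE` ignores case for ASCII letters by default, and `%` matches zero or more characters.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE favorites (Timestamp TEXT, title TEXT, genres TEXT)")
db.executemany(
    "INSERT INTO favorites VALUES (?, ?, ?)",
    [
        ("t1", "The Office", "Comedy"),
        ("t2", "office", "Comedy"),
        ("t3", "The V Office", "Comedy"),
        ("t4", "Friends", "Comedy"),
    ],
)

# = requires the whole title to match exactly.
exact = db.execute(
    "SELECT title FROM favorites WHERE title = 'The Office'"
).fetchall()

# LIKE with % on both sides matches any title containing "office".
fuzzy = db.execute(
    "SELECT title FROM favorites WHERE title LIKE '%office%'"
).fetchall()

print(len(exact), len(fuzzy))
```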
And now I get all 16, it would seem, of those results, again.
For instance, I've never really been a fan of Friends.
where title like, quote, unquote, Friends with the percent signs?
OK, you and me, delete from favorites, where title like Friends, Enter.
[APPLAUSE]
Yes, you could technically write Python code that not only reads the CSV file,
But with SQL, you can update the data in real time.
equal to The Office, where title equals quote, unquote, "The V Office"
semicolon?
These were kind of long, and I don't really agree with all of that.
And now, if I select genres again, same query, now we've canonicalized that.
But I have at least cleaned up my data, which is, again, the U in CRUD.
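The update and delete steps can be sketched together against a throwaway in-memory table. The rows are hypothetical; the point is that each statement changes every row its `WHERE` clause matches, so a missing or overly broad `WHERE` affects the whole table.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE favorites (Timestamp TEXT, title TEXT, genres TEXT)")
db.executemany(
    "INSERT INTO favorites VALUES (?, ?, ?)",
    [
        ("t1", "The Office", "Comedy"),
        ("t2", "The V Office", "Comedy"),
        ("t3", "Friends", "Comedy"),
    ],
)

# U in CRUD: fix the typo in place.
db.execute("UPDATE favorites SET title = 'The Office' WHERE title = 'The V Office'")

# D in CRUD: remove every row whose title contains "friends".
db.execute("DELETE FROM favorites WHERE title LIKE '%friends%'")

remaining = [row[0] for row in db.execute("SELECT title FROM favorites")]
print(remaining)
```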
Worse, beware of using drop, whereby you can drop an entire table.
manipulate our data much more rapidly and with single thoughts.
because it allows you to really dive into data quickly, and ask
You can do this with much larger data sets as we soon will, too.
Questions here?
All right.
This is OK.
It gets the job done, and frankly, everything the user typed in
was arguably text, including the timestamp, which is the date and time.
and I'm lowercasing all of the column names and the table names.
I think, the code when you're co-mingling your names for columns
but again, the SQL specific keywords don't quite jump out as much.
So here is where--
oh.
The Office about action, adventure, drama, fantasy, thriller, and war.
AUDIENCE: [INAUDIBLE]
sorry.
Select genres from favorites, that was the result I was getting.
It's much messier, but that's because some of these are quite long.
All right.
Well, I don't love the design of the genres table for a couple of reasons.
Comedy, drama.
How about let's search for the-- oops, let me copy paste comedy, comma, drama.
OK, so The Office, in this case, was considered comedy and drama, Billions,
So the catch here is that, because I have all of these genres implemented
it's actually really hard and messy to get at any show, all of the shows
I'm going to get are this one, whatever that show is, this one, whatever
Why?
Why am I missing?
Yeah?
AUDIENCE: [INAUDIBLE]
So I have to search for these commas, so this gets messy quickly, right?
AUDIENCE: [INAUDIBLE]
that's going to give me all of them, so long as the word comedy is in there.
But let me go ahead and just open the form from earlier.
Let me see if I can open this real quick before I toggle over.
were all of those radio buttons asking for the specific genres
And if I open this, let me full screen here and now open the original form.
are that worrisome except for a corner case is jumping out at me.
DAVID J. MALAN: Yeah, music and musical are deliberately on the list here.
I could probably hack something together with-- maybe add some commas in there,
or complicate things using some other weirder character than commas alone.
that I wrote in advance that's going to use Python to open up the CSV file,
iterate over all of the rows, and load the data into two tables this time,
Version 8, OK.
as we'll soon see, a CS50 library, not for the sake of get_string,
library itself.
But inside of the CS50 library we'll see there is a special function called
SQL that gives you the ability using this weird URL-like looking thing,
Come on.
Come on.
Sorry.
OK.
Let me just skim this code real quick to see where we've gone wrong.
[INAUDIBLE] reader.
All right.
All those times we encourage you to use print, this is me actually using print.
All right.
So in a moment--
there we go.
Once I've run the program I can do .schema to look inside of it.
And here's what the two tables in this database are going to look like.
I've created a table called shows, this time to represent all of the TV shows
But now I'm going to start taking out for a spin some
And besides there being text, it turns out there's a data type called integer.
This is a database constraint that allows you to ensure that none of you
can't have a favorite TV show.
If you submit the form, you have to have typed in a title for it
Meanwhile, the second table my code has created for me, as we'll soon see,
If I have two tables now, not just one, they can somehow
an ID and a title.
Every title you gave me, I'm going to assign a unique value.
And the result of this, to pop back to the Terminal here, is, let's do this.
and you'll see that I've given, indeed, all of the shows you all typed
in unique identifiers.
to uppercase.
So there's going to be some duplicates here because I didn't
typed How I Met Your Mother, all the way down to input number 158.
Meanwhile, if I do select star from genres, which is now a table, not just
Let me go all the way to the top and you'll see two columns, one of which
had to take Google's messy output where everything was separated by commas.
I had to tear away the commas and then put each genre into this table
by itself.
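A rough sketch of that kind of loader, using the standard `sqlite3` module instead of the CS50 library. The table and column names (`shows`, `genres`, `show_id`) follow the description above, but the exact schema, the sample CSV, and the `", "` separator are assumptions.

```python
import csv
import io
import sqlite3

# Hypothetical sample; Google's export separates genres with ", ".
SAMPLE = 'Timestamp,title,genres\nt1,The Office,"Comedy, Drama"\nt2,Friends,Comedy\n'

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE shows (id INTEGER, title TEXT NOT NULL, PRIMARY KEY(id))")
db.execute(
    "CREATE TABLE genres (show_id INTEGER, genre TEXT NOT NULL, "
    "FOREIGN KEY(show_id) REFERENCES shows(id))"
)

for row in csv.DictReader(io.StringIO(SAMPLE)):
    title = row["title"].strip().upper()
    # Insert the show; SQLite assigns the next unique id automatically.
    cursor = db.execute("INSERT INTO shows (title) VALUES (?)", (title,))
    show_id = cursor.lastrowid
    # Tear the genres apart at the commas: one row per genre.
    for genre in row["genres"].split(", "):
        db.execute(
            "INSERT INTO genres (show_id, genre) VALUES (?, ?)", (show_id, genre)
        )

genre_rows = db.execute("SELECT show_id, genre FROM genres").fetchall()
print(genre_rows)
```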
Even though I've doubled the number of tables from one to two,
Yeah.
Oh, just because we had the conversation before about the commas.
AUDIENCE: [INAUDIBLE]
DAVID J. MALAN: Exactly.
We've cleaned up the data by giving every genre, every word in the genres
Whoever typed in How I Met Your Mother, they only associated one genre with it.
And so you can see now that we've associated the data with what we
A one-to-many relationship, whereby for every one show in the shows table,
it can now have many genres associated with it, each of which
How I Met Your Mother, The Sopranos was the second input there.
It would seem that now that I've created the data in this way,
I could ideally somehow search the data, but a little more correctly.
All of the comedies, no matter whether the person checked just the comedy
Yeah?
Let me do select show ID from genres, where the genre in a given row
Because literally, that column now is singular words, like comedy, or drama,
or the like.
You can actually compose one SQL question from multiple ones.
So let's do this.
Select show ID from genres, where the specific genre is, quote, unquote,
"comedy."
the titles for by selecting title from shows where the ID of the show
checked one box for comedy, two boxes, or all of the boxes.
Now we have a whole list of the same titles that are now sorted.
And what was the keyword with which I could filter out duplicates?
Yeah, distinct.
Same query, but let's select only the distinct titles from that whole query.
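The nested query can be sketched like this, with a tiny made-up data set. The inner query produces the show IDs tagged as comedy; the outer query turns those IDs into distinct, sorted titles.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE shows (id INTEGER, title TEXT NOT NULL, PRIMARY KEY(id))")
db.execute("CREATE TABLE genres (show_id INTEGER, genre TEXT NOT NULL)")
db.executemany(
    "INSERT INTO shows (id, title) VALUES (?, ?)",
    [(1, "THE OFFICE"), (2, "THE OFFICE"), (3, "THE SOPRANOS")],
)
db.executemany(
    "INSERT INTO genres (show_id, genre) VALUES (?, ?)",
    [(1, "Comedy"), (2, "Comedy"), (2, "Drama"), (3, "Drama")],
)

query = """
    SELECT DISTINCT(title) FROM shows
    WHERE id IN (SELECT show_id FROM genres WHERE genre = 'Comedy')
    ORDER BY title
"""
comedies = [row[0] for row in db.execute(query)]
print(comedies)
```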
don't just start at the beginning and type out my whole thought,
taking baby steps in order to get to the answer you actually care about,
fix because I re-imported the data after accidentally changing everyone's genre,
But now it's better designed, because we have it split across these two tables.
Yeah?
AUDIENCE: [INAUDIBLE]
And you're probably updating it, maybe deleting it, adding to it,
and so forth.
For instance, the one command I did not show earlier
No?
All right, well, just the one person that checked that box, so you and me.
AUDIENCE: [INAUDIBLE]
DAVID J. MALAN: Say again?
AUDIENCE: [INAUDIBLE]
a show ID, like this, and then, the name of the genre,
and a genre name, the values 159, and, quote, unquote, "comedy" semicolon,
Enter.
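A sketch of that insert, run from Python against an empty in-memory `genres` table. The column names `show_id` and `genre` are assumed from the description above.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE genres (show_id INTEGER, genre TEXT NOT NULL)")

# INSERT INTO table (columns) VALUES (values)
db.execute("INSERT INTO genres (show_id, genre) VALUES (?, ?)", (159, "Comedy"))

rows = db.execute("SELECT show_id, genre FROM genres").fetchall()
print(rows)
```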
And now, if I scroll back in my history and execute that really big query
Let's do update.
No?
Enter.
And now, if I execute that really big query, now Seinfeld is,
indeed, considered a comedy.
Well, thus far we've been doing all of this pretty manually.
and you, for instance, log in, odds are you're typing a username and password,
clicking Submit.
When you buy something on Amazon.com and you click Check Out,
and then maybe using a for loop of some sort, in Python or another language.
It's doing a whole bunch of SQL inserts to store in their database what
it is you bought.
step-by-step, line-by-line.
But when I want to get at some data I can actually talk to a SQL database.
And let me go ahead and throw away some of what we did earlier and really
that deal with SQL and Python are more complicated than they need to be.
Let's use that URI, which is a fancy way of saying something
that looks like a URL, but that actually opens up a database on disk, that is,
Let's now ask the user for a title by prompting them for a, quote, unquote,
And let's strip off any whitespace just so that the data is not messy.
I'm going to go ahead now and write a line of code that uses Python
I'm using the original that we imported from your own data,
table, where the title the user typed in is like this question mark.
And using a comma outside of this first string, using CS50's execute
then any arguments I want to plug into the question marks therein.
So this is going to select the count of people from the favorites table
where the title they typed in is like whatever the user has just now typed
in.
the documentation.
I'm going to go ahead and grab the first row from those rows.
And then I'm going to go ahead and print out that row's first value.
that are coming back, especially if they are the result of functions like this.
That means I can now say get back the counter key inside of this dictionary.
file that you and I created earlier by importing your CSV into SQLite.
I'm now just asking the user for a title they want to search for.
So this line of code just gives me the first and only row.
And then, this goes inside of that row, which it turns out is a dictionary,
and gives me the key counter and the value it corresponds to.
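The lecture uses the CS50 library's `SQL` object here. As a sketch of the same pattern with only the standard library, the code below uses `sqlite3` with `sqlite3.Row` so that a row can be indexed by column name. The in-memory table, its rows, and the `count_title` helper are assumptions for illustration.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.row_factory = sqlite3.Row  # rows can be indexed by column name
db.execute("CREATE TABLE favorites (Timestamp TEXT, title TEXT, genres TEXT)")
db.executemany(
    "INSERT INTO favorites VALUES (?, ?, ?)",
    [
        ("t1", "The Office", "Comedy"),
        ("t2", "THE OFFICE", "Comedy"),
        ("t3", "Friends", "Comedy"),
    ],
)

def count_title(title):
    # The ? is a placeholder; the value is passed separately, not pasted in.
    rows = db.execute(
        "SELECT COUNT(*) AS counter FROM favorites WHERE title LIKE ?",
        (title.strip(),),
    ).fetchall()
    row = rows[0]          # the first and only row
    return row["counter"]  # the single column, named by AS counter

print(count_title("the office"))
```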
So I've just recreated the same data set that you all
Select, count star from favorites, where title like, and let's
I technically get back a miniature table containing one column and one row.
contains one column, as we'll see, the key for which is counter.
I'm going to get out of SQLite3 and I'm going to run Python of favorites.py.
Enter.
I'm going to type in The Office and cross my fingers, and there's that 12.
Why is it 12?
I had deleted two of the Thes, so we're back at the original data set.
And it's just one line of SQL now, using the best of both worlds.
All right, any questions on what we've just done here or how any of this
works?
Yeah?
AUDIENCE: [INAUDIBLE]
DAVID J. MALAN: When does this function return more than one row?
AUDIENCE: Yeah.
of the ways you all typed in The Office by selecting the title this time.
Let's do it manually.
I get back all of these different rows, and we didn't even notice this one.
And that for loop now should iterate, what, 10 or more times,
Whoops.
Row title.
Whereas had I used the equal sign I would get back only the same ones
capitalized correctly.
arise when actually now using SQL and skating toward a world in which we're
using SQL for mobile apps, web apps, and generally speaking,
Give me just a moment to switch screens over to what we have for you today,
So InternetMovieDatabase.com is a website
where you can search for TV shows, and movies, and actors,
IMDb wonderfully makes their data set available as not CSV files,
And so what we did is, before class we downloaded those TSV files.
for you in SQLite that has multiple tables and multiple columns.
So let's go and wrap our minds around what's actually in this data set.
I'm going to go ahead and copy the file, which we've named shows.db.
And I'm going to go ahead and increase my Terminal and do SQLite3 of shows.db.
Whenever playing around with a SQLite database for the first time,
of what's in there.
and also problem set 7, where we'll look at the movie side of things
And notice we've just added whitespace by hitting Enter a bunch of times
types that are worth being aware of when it comes to creating tables themselves.
is for just raw 0s and 1s, like for files or things like that.
we decided that every show would be given an ID, which is just an integer.
What now is with these primary keys that we mentioned earlier, too?
A primary key is the column that uniquely identifies all of the data.
gave each of your submissions a unique ID so that even if two or more of you
But they don't come from us, they come from IMDb.com.
And so a genre has a show ID, and a genre just like our database.
cross referencing the original shows table, if shows have a primary key
called ID, and those same numbers appear in the genres table
under the column called show ID, by definition, show ID is a foreign key.
You have multiple tables with some column in common, numbers typically.
And those numbers allow you to line the two tables up in such a way
just like we did with our smaller data set a moment ago.
Notice that the IMDb database we've created for you has a stars table,
And we have decided in IMDb's world that every person in the TV show world
will have a unique identifier that's a number, a name that's text, a birth
date, which is numeric, and then, again, specifying that ID
Well, it turns out that TV stars and writers are both types of people.
So using this relational database, notice the road we're going down.
And then, notice these two tables are almost the same.
allows us via this middleman table, if you will, to link people with TV shows.
Similarly, the writers table allows us to connect shows with people, too,
There's a lot of people in the TV show business, not just actors and writers,
from people, where name equals Steve Carell, for instance, sticking
with comedies.
And you can see it here again flying across the screen.
Select star from shows where title equals, quote, unquote, "The Office."
Turns out there's been a lot of The Offices out in the world.
All right, so now we've got back just the ID of The Office
Enter.
It found it pretty fast, but it looks like it took how much real time?
Let me do this.
Enter.
But now watch, if I select star from shows searching for The Office again,
Literally, as you just saw, these tables are crazy long or tall right now,
it was literally doing linear search, top to bottom, looking at as many as,
like this, which I just did, creating an index on the title column of the shows
hey, I know I'm going to search on this column in this table a lot.
Maybe it's using a trie or a hash table, some fancier two-dimensional data
structure is generally going to lift the data up creating right maybe a tree
structure.
Different use of the letter B, but it looks a little something like the trees
we've seen.
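One way to see the effect of an index without timing anything is to ask SQLite for its query plan before and after. This is a sketch with a made-up `shows` table; the index name `title_index` and the `plan` helper are assumptions, and the exact wording of the plan text varies between SQLite versions.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE shows (id INTEGER, title TEXT NOT NULL, PRIMARY KEY(id))")
db.executemany(
    "INSERT INTO shows (title) VALUES (?)",
    [(f"Show {i}",) for i in range(1000)],
)

def plan(query):
    # The fourth column of each EXPLAIN QUERY PLAN row is a text description.
    return " ".join(row[3] for row in db.execute("EXPLAIN QUERY PLAN " + query))

query = "SELECT * FROM shows WHERE title = 'The Office'"

before = plan(query)  # a scan: every row is examined
db.execute("CREATE INDEX title_index ON shows (title)")
after = plan(query)   # a search that uses title_index

print(before)
print(after)
```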
And the upside of that is that if your data is stored in this tree,
And the reason it took half a second, a third of a second to build the index
and then I searched for it in the other table using select in a nested query,
Let's go ahead now and, for instance, find all of Steve Carell's TV shows.
And this is called a join table, in the sense that using two integer columns--
And so if you're savvy enough with SQL, you can do what I did with my hands
So let me do this.
Well, if I select star from people, where name equals Steve Carell,
So this gives me back his name, his ID, and his birth year.
Why?
Because in order to get back his shows, I need to link person ID with show ID.
I've just gotten, from the people table, Steve Carell's ID.
I bet by transitivity I could now use his person ID, his ID,
And then once I've got all of his show IDs, I can take it one step further
So the answer is actually English words and not just random, seemingly,
integers.
Let me, again, get Steve Carell's ID number, but not star.
So again, I'm building up my answer in reverse and taking these baby steps.
that have some connection with that person ID in the stars table.
Select title from the shows table, where the ID of the show
came back from the stars table searching for Steve Carell's person ID.
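Those baby steps can be stacked into one nested query. This is a sketch under assumptions: the tables are tiny, the second show title and the second person are invented, and the column names person_id and show_id are guesses at how "person ID" and "show ID" are spelled in the actual database.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT, birth INTEGER);
    CREATE TABLE shows (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE stars (show_id INTEGER, person_id INTEGER);
    -- Made-up IDs and rows, just enough to exercise the query.
    INSERT INTO people VALUES (1, 'Steve Carell', 1962), (2, 'Someone Else', 1970);
    INSERT INTO shows VALUES (10, 'The Office'), (11, 'Made-Up Comedy'), (12, 'Another Show');
    INSERT INTO stars VALUES (10, 1), (11, 1), (12, 2);
""")

# Innermost query: the person's ID. Middle query: their show IDs.
# Outermost query: the titles of those shows.
rows = db.execute(
    """
    SELECT title FROM shows WHERE id IN
        (SELECT show_id FROM stars WHERE person_id =
            (SELECT id FROM people WHERE name = ?))
    ORDER BY title
    """,
    ("Steve Carell",),
)
titles = [row[0] for row in rows]
print(titles)
```

Reading it from the inside out mirrors the order in which the answer was built up.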
And if I want to tidy this up further, I can use the same tricks as before.
Create an index called person index, and I'm going to do this on the stars table
Enter.
Let's create another index called show index on the stars table.
Why?
Takes a moment.
Now let's create one last one, another index called name index,
but I could call these things anything I want, on the people table.
Why?
All right.
Now if you've ever used, for instance, the my.harvard course shopping tool
here on campus, or Yale's analogue, you might wonder, why is the thing so slow?
This could be one of the reasons why large data sets with thousands of rows,
and things are spinning and spinning, what might be among the problems?
Well, it could absolutely just be bad algorithms and bad code that you wrote.
All right, let's point out just a couple of final syntactic things,
want to connect Steve Carell to his show IDs and to their titles,
Select title from the people table, joined with the stars table on
people.ID equals stars.personID.
So what am I doing?
New syntax.
And again, this is not something you'll have to memorize or ingrain right away.
But just so you've seen other approaches, select title from people
join stars.
This is an explicit way to say, take the people table in one hand, the stars
Join them so that the people, the ID column in the people table lines up
That's saying, go further and join the stars table with the shows table,
But now I can just say, where name equals, quote, unquote, "Steve Carell."
And I can still add in my order by title to get back the result.
And if I do this a little more neatly, let me type this out a little
differently.
Let me type this out by adding a new line-- ah, I can't do that here.
If you know in advance that you want to do something with all three tables,
you can just enumerate them, one table name after the other.
And then you can say where people.ID equals stars.personID.
And stars.showID equals shows.ID, and lastly, name equals Steve Carell.
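Both spellings of the join, the explicit JOIN ... ON form and the form that just lists all three tables and links them in the WHERE clause, can be compared side by side. This is a sketch with invented rows, and the column names person_id and show_id are assumed.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE shows (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE stars (show_id INTEGER, person_id INTEGER);
    -- Made-up rows for illustration only.
    INSERT INTO people VALUES (1, 'Steve Carell'), (2, 'Someone Else');
    INSERT INTO shows VALUES (10, 'The Office'), (11, 'Made-Up Comedy'), (12, 'Another Show');
    INSERT INTO stars VALUES (10, 1), (11, 1), (12, 2);
""")

# Explicit joins: line up people with stars, then stars with shows.
explicit = """
    SELECT title FROM people
    JOIN stars ON people.id = stars.person_id
    JOIN shows ON stars.show_id = shows.id
    WHERE name = ? ORDER BY title
"""

# Implicit join: enumerate the tables, then say how keys match up.
implicit = """
    SELECT title FROM people, stars, shows
    WHERE people.id = stars.person_id
    AND stars.show_id = shows.id
    AND name = ? ORDER BY title
"""

explicit_titles = [row[0] for row in db.execute(explicit, ("Steve Carell",))]
implicit_titles = [row[0] for row in db.execute(implicit, ("Steve Carell",))]
print(explicit_titles)
print(implicit_titles)
```

On this toy data the two queries return the same titles; which one reads better is largely a matter of taste.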
In short, you specify that you want to select data from all three
of these tables.
And then you tell the database how to combine foreign keys with primary keys,
Oops.
All right.
All right.
But this is only to say that, even as we make the design of the data
over here, some of it over here so as to avoid duplication of data, weird hacks
like putting commas in the data, we can still get back all of the answers
Memory, so space.
because you're going to waste way more space than you might actually need.
is part of the process of designing and just getting better at these things.
and they continue to in the real world with people using SQL databases.
something technical about SQL databases, and websites being hacked in some form,
by not quite appreciating how it is the data is getting into your application.
Why?
But if I were the owner of the website trying to see if I've made any mistake,
Dangerous how?
Because single quote is used for quoting things in SQL, as we've seen--
But let's now imagine what the code underneath the hood
Suppose that they are using something like CS50's own execute function,
and they've got some SQL typed into the website that
Well, when the user types their username password, hits Enter,
All right.
Suppose that you were not using a third-party library like ours
and you were just manually constructing your SQL queries like this.
You've gotten into the habit of using curly braces and plugging in values.
and the user has typed in single quotes to their input, what
Where are we going with this if you're just blindly plugging user input
Yeah?
AUDIENCE: [INAUDIBLE]
Worst case, they could insert what is actually SQL code into your database
as follows.
Or you better hope that they don't type a single quote as well.
Because what if their single quote finishes your single quote instead,
Well, select star from users, where username equals [email protected], end
quote.
So length of rows will equal 1, and so presumably the rest of the pseudocode
or whatever it is.
Why?
were taught, which was just to use curly braces to plug in,
in f-strings, values.
But if you don't understand how the user's input is going to be used,
and if you don't distrust your users fundamentally, for every good person
some adversary who just wants to try to find fault in your code or hack
because the user can type something that happens to be or look like SQL,
and trick your database into doing something it didn't intend to,
Maybe the user types a semicolon, then the word drop, or the word update.
means you can do anything you want with the data set, either constructively,
or worse, destructively.
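Here is a small sketch of the attack and of the usual defense, using Python's sqlite3 module rather than any particular website's code. The users table, the account, and the function names are all invented for the example. The unsafe version plugs input into an f-string; the safe version uses question-mark placeholders so the library escapes the input.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE users (username TEXT, password TEXT)")
db.execute("INSERT INTO users VALUES ('alice@example.com', 'secret')")


def login_unsafe(username, password):
    # DON'T do this: user input is pasted straight into the SQL text.
    query = (
        f"SELECT * FROM users WHERE username = '{username}' "
        f"AND password = '{password}'"
    )
    return len(db.execute(query).fetchall()) == 1


def login_safe(username, password):
    # Placeholders keep the input as data; quotes in it are harmless.
    rows = db.execute(
        "SELECT * FROM users WHERE username = ? AND password = ?",
        (username, password),
    ).fetchall()
    return len(rows) == 1


# The single quote finishes the string early, and -- comments out
# the password check that follows.
attack = "alice@example.com'--"

print(login_unsafe(attack, "wrong guess"))  # logs in without the password
print(login_safe(attack, "wrong guess"))    # rejected
```

The only difference between the two functions is who builds the final query: you, by hand, or the library on your behalf.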
And now, just a quick, little cartoon that should now make sense.
The word drop, table, students, and doing some of the same technique.
because it's the mom realizing, oh, her son's doing a SQL injection
Less funny when you explain it, but once you notice the syntax, that's all this
is an allusion to.
All right.
to the world of proper databases and away from CSV files alone.
and honestly, even using CSV files if you have multiple users.
in almost every program we've written that it's just me using my code.
But the world gets interesting if you start putting your code on phones,
on websites, such that now you might have two users literally trying
And this is a problem in computing in general, not just with SQL, not just
with Python, really just any time you have shared data,
Like, a couple?
Oh, OK.
Wow.
or like any social media site, there's some equivalent of a like button
Your code has to update the database, and then do it again and again,
even if multiple people are perhaps right now clicking on that same egg.
And unfortunately, bad things can happen if two people try to do something
So here's some more code, half pseudocode, half Python code here,
as follows.
Suppose that what happens when you, literally, right now, maybe click
All right.
Why?
Because I need to know how many likes it already has if I want to add one to it
I need to select the data, then I need to update the data here.
All right.
likes, whatever comes back in the first row from the likes column.
but a common way of getting back first row and the column called
likes therein.
Then I do this.
equal to this value, where the ID of the post equals this value.
I'm checking the value of the likes, and maybe it's 10.
clicked on that egg at roughly the same time, or literally, the same time.
Why is that?
and the Instagrams of the world have thousands of physical servers nowadays.
And if your code is running on a server that multiple people have access to,
Then this line of code for me, then this line of code for you,
then this line of code for me, then this line of code for you.
do a little bit of work for me, a little bit of work for you,
and back and forth, and back and forth, equitably on the server.
Same order top to bottom, but other things might happen in between.
So suppose that the number of likes at the very beginning was 10.
And suppose that Carter and I both click on that egg at roughly the same time.
Suppose, then, that the computer takes a break from dealing with my request,
Then Carter's code is executed, updating the same row in the database
to 11, unfortunately.
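The lost like can be reproduced deterministically by interleaving the two requests by hand. This sketch assumes a hypothetical posts table with a likes column and uses sqlite3 in a single thread, so the "two users" are just two sets of variables; a real server would interleave them unpredictably.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, likes INTEGER)")
db.execute("INSERT INTO posts VALUES (1, 10)")


def read_likes(post_id):
    row = db.execute("SELECT likes FROM posts WHERE id = ?", (post_id,)).fetchone()
    return row[0]


def write_likes(post_id, likes):
    db.execute("UPDATE posts SET likes = ? WHERE id = ?", (likes, post_id))


# Both requests read before either one writes.
david_sees = read_likes(1)    # 10
carter_sees = read_likes(1)   # also 10

# Each adds one to the value it saw and writes that back.
write_likes(1, david_sees + 1)    # row now says 11
write_likes(1, carter_sees + 1)   # row says 11 again; one like is lost

final_likes = read_likes(1)
print(final_likes)
```

Two clicks happened, but the count only went up by one, because the second write was based on stale information.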
actually act out, is something that was taught to me years ago in an operating
systems class, whereby the most similar analogue in the real world would be
And one of you and your roommates comes home, opens the fridge, and realizes,
oh, we're out of milk, was how the story went in my day.
So you close the refrigerator, and you walk across the street, go to CVS,
So long story short, you both eventually get home, open the door, and damn it,
How would you fix this or avoid this problem in the real world?
to happen when it's happening across multiple users just for fairness' sake,
You need to make sure that all three of these lines of code
execute for me, and then for Carter, and then for you
And for years, when social media was first getting off the ground,
And we won't get into the weeds of how you might use these things,
called locks, which I use that word deliberately with the fridge.
Software locks can allow you to protect a variable so no one else can
look at it until you're done with it.
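As a rough illustration of a software lock, here is a sketch using Python's threading module instead of a database. The likes variable and the like function are made up; the point is only that the read and the write happen together, with no one else allowed in between.

```python
import threading

likes = 10
lock = threading.Lock()


def like():
    global likes
    # Only one thread at a time may hold the lock, so the read and
    # the write below can't be interleaved with someone else's.
    with lock:
        current = likes       # check the fridge
        likes = current + 1   # restock it before anyone else looks


# Simulate 1,000 people clicking at roughly the same time.
threads = [threading.Thread(target=like) for _ in range(1000)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

print(likes)
```

The price is that each request may wait briefly for the lock, which is why the locked section is kept as short as possible.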
that you're going to box someone out or make Carter's request a little slower.
Why?
together in between these transactions so that you get in and you get out.
And you go to CVS and you get back really fast so as to not
The original goal was just to solve problems using a different language
The week after, bringing Python and SQL back into the mix.
And over the next few weeks, the goal is to make sure you're understanding
and comfortable with what each of these things is good and bad for.
[MUSIC PLAYING]
So even as we explored variables and loops and conditionals and all of that,
of that away, when we introduced C, and a terminal window, and a command line,
because now, all of your programs became very textual, very keyboard-based,
and gone was the mouse, the animations, the menus, and so forth.
This will, to this week and next week, combine elements of the back-end server
stuff that we've been doing for the past several weeks,
on your own Mac, your own PC, your own phone, that's going
And increasingly, even the mobile apps that you all are using
called HTML, CSS, and JavaScript, which we'll focus on here today.
But before we do that, let's provide a foundation on which these apps can run,
because indeed, we'll start to look underneath the hood of how the internet
itself works, albeit quickly, so that we have kind of a mental model for where
all of this code is running, how you can troubleshoot issues, and how,
It's this utility nowadays, that we all rather take for granted.
There's a whole lot of servers out there, that are somehow interconnected,
Indeed, Harvard has its own network and Yale has its own network,
do you get the interconnected network that is the internet as we now know it?
And the intent, originally, was just to interconnect a few universities here
whether back in 1970, or now in 2021, each of these dots you can think of as,
And so there's all these servers here on campus at Harvard, on Yale's campus,
you have your own routers out there, whose purpose in life
You, in your home, probably have just one cable coming in or going out.
But certainly, if you're a place like Harvard or Yale or Comcast or the like,
your phone, your desktop, actually get data from point A to point B?
when it boots up at the beginning of the day, what the local router is, what
server.
So here we have some of our TFs and TAs and CAs present and past.
[MUSIC PLAYING]
[APPLAUSE]
Here we have just, physically, what it was the staff were passing around.
and she wanted to send it to Brian on the West Coast, top left hand corner.
She could go up, down, in her case, and then each of those subsequent routers
could go up, down, left, or right, until it finally reaches Brian.
But they do so by taking an input, and in the form of input is this envelope.
because all of these routers and, in turn, all of our Macs and PCs
like Carter might extend his hand, thereby interacting with me, based
But that, too, is just a set of rules that we all follow and adhere to,
And TCP and IP are two such protocols that standardize this as follows.
if she wants to send an email to Brian, is put the email in a virtual envelope,
so to speak.
But on the outside of that virtual envelope, put Brian's unique address.
And then she's going to put her own source address in the top left hand
corner, just like you, the sender, would put your own source
Eight bits or one byte, which is to say, we can extrapolate from there,
of humans in the world these days, all of whom have, many of whom
have multiple devices, certainly in places like this, where you have
a laptop, and a phone, and you have other internet of things-type devices,
for computers, so we can at least handle all of the additional devices we now
have today.
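The arithmetic behind those address sizes is quick to check. This sketch uses Python's ipaddress module; the address 192.0.2.1 is a reserved documentation address chosen for the example, not one from lecture.

```python
import ipaddress

# An IPv4 address is four numbers of 8 bits each, so 32 bits total.
ipv4_count = 2 ** 32

# IPv6 addresses use 128 bits instead.
ipv6_count = 2 ** 128

# 192.0.2.1 is reserved for documentation and examples.
address = ipaddress.ip_address("192.0.2.1")
octets = list(address.packed)   # the four bytes, each 0 through 255

print(f"{ipv4_count:,} possible IPv4 addresses")
print(f"{ipv6_count:,} possible IPv6 addresses")
print(octets)
```

Roughly 4 billion IPv4 addresses is fewer than the number of people on the planet, never mind their devices, which is the motivation for the larger format.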
address, that is Phyllis's IP address, so that this packet can go from point A
But on the internet, you presumably know that there's not just email servers.
There's web servers, there's chat servers, video servers, game servers.
how does he know it's an email, versus a web page, versus a Skype call,
One, the type of service whose data is in this envelope, that is,
And I'm going to go ahead and write down a colon, and the word port, P-O-R-T.
And I'm going to write that in the source address, too, colon and port.
that represents what kind of service is being sent from point A to point B,
not even in the context of email, but in the context of the web.
when that request is actually encrypted, using that thing you probably
These are the kinds of things you Google if you ultimately care about.
and then a number, which is only to say these numbers are omnipresent.
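The address-colon-port convention is easy to take apart in code. This is a sketch only: the describe function is hypothetical, the address is a documentation placeholder, and the table lists just the two web ports, 80 for HTTP and 443 for HTTPS.

```python
# A tiny, deliberately incomplete table of well-known port numbers.
WELL_KNOWN_PORTS = {80: "HTTP", 443: "HTTPS"}


def describe(endpoint):
    """Split 'address:port' and guess the service from the port."""
    address, _, port_text = endpoint.rpartition(":")
    port = int(port_text)
    return address, port, WELL_KNOWN_PORTS.get(port, "unknown")


print(describe("192.0.2.1:80"))
print(describe("192.0.2.1:443"))
print(describe("192.0.2.1:50000"))
```

The receiving computer does essentially this lookup to decide which program should be handed the contents of the envelope.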
But that's all it takes, ultimately, for Phyllis to get this message to Brian.
so to speak.
a little bit of time to someone else, so that eventually, Phyllis' entire cat
and then use, not just a single envelope, but maybe a second, a third,
Yeah.
So I'm going to write one more thing in like the memo line of the envelope
here.
in these packets.
If Brian receives envelopes like these, with numbers like these in the memo
Yeah, in back.
So short answer, exactly, yes, TCP, because of this simple little integer
Why?
TCP, if it is the protocol being used to transmit data from point A to point B,
And just as a taste of why you might ever not want to guarantee delivery,
maybe you're watching like a streaming video, like a sports event online.
to buffer and buffer and buffer, just because you have a slow connection,
And then you're going to be the only one in the world watching
the game that ended 20 minutes ago, when everyone else is sort of up to speed.
that, even if the person on the other end sounds a little crappy, at least
because that would really slow down that sort of human interaction.
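The numbers in the memo line can be sketched in a few lines. This follows the lecture's simplification of numbering whole envelopes; real TCP counts bytes rather than envelopes, and the function names here are invented.

```python
def missing_packets(received, total):
    """Given (number, data) pairs, list the numbers that never arrived."""
    seen = {number for number, _ in received}
    return [n for n in range(1, total + 1) if n not in seen]


def reassemble(received):
    """Put the pieces back in order, however they arrived."""
    return b"".join(data for _, data in sorted(received))


# Envelopes 1, 2, and 4 of 4 arrive, and out of order.
arrived = [(2, b"is a "), (4, b"cat"), (1, b"this ")]
print(missing_packets(arrived, 4))   # the receiver asks for these again

# Once number 3 is resent, the message can be rebuilt.
arrived.append((3, b"grumpy "))
print(reassemble(arrived))
```

A protocol that doesn't guarantee delivery would simply skip the resend step and carry on with whatever arrived.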
and standardizes numbers that every computer, your own included, gets,
can be used, between point A and point B. All right, this is great,
These days it's, like, I don't know most of the phone numbers
And, indeed, when you visit a website, what do you type in?
on the internet, one more acronym for today, called DNS, domain name system.
And pretty much every network on the internet, Harvard's, Yale's, Comcast's,
Someone else did, your campus, your job, your internet service provider.
But there is some server connected somehow to the network you're on,
via wires or wirelessly, that just has a really big table in its memory,
a hash table, that has at least two columns of keys and values
respectively.
to IP addresses.
to IP addresses.
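That big table of keys and values is, conceptually, just a dictionary. The sketch below is a toy: the names and addresses are reserved example values, and the resolve function is made up. For a real lookup, Python's socket.gethostbyname asks the actual DNS servers your network is configured to use.

```python
# Keys are domain names, values are IP addresses. These entries are
# placeholders from ranges reserved for documentation.
DNS_TABLE = {
    "www.example.com": "192.0.2.10",
    "www.example.org": "192.0.2.20",
}


def resolve(name):
    """Look up a name; domain names are case-insensitive."""
    return DNS_TABLE.get(name.lower())


print(resolve("www.example.com"))
print(resolve("WWW.EXAMPLE.ORG"))
print(resolve("nowhere.example"))   # None: this server doesn't know
```

When a local server doesn't know the answer, the real system asks another server further up, which this toy leaves out.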
Well, let's go ahead and poke around, for instance, at a couple of URLs here.
Let's see what we can actually do now with these basic primitives.
And so, just to go one level deeper, now that we have these packets that
put something specific inside of them, not just an email and a bunch of text,
but something called HTTP, which stands for hypertext transfer protocol.
in the form of URLs, so much so that you probably don't even type it nowadays.
talk about here, that just standardizes how web browsers and web
servers inter-communicate.
from left to right, right to left, top to bottom, that gets data from point A
to point B. You can do anything you want on top of that internet nowadays,
email and web and video and chat and gaming, and all of that.
you can just assume you can do really interesting things with that, too,
More on that later, but the HTTP is what standardizes what kinds of messages
to request information from a server, and how to respond from the server
Now these days Safari, and even Chrome to some extent, and other browsers,
are in the habit of trying to hide more and more of these details
but this whole thing is what we meant by fully qualified domain name.
know that dot com means commercial, although anyone can buy it these days.
Some of them are a bit restricted, dot mil is just for the US military,
whose country code is .io, and you see other two letter top level domains that
refer to countries.
That specifies how the server uses this URL to get data from point A
And if any of you have dabbled with HTML or made your own website,
GET means put any user input in the URL, POST means hide it,
so that things you're searching for, credit card numbers you're typing in,
but rather they're somehow provided elsewhere, deeper into that envelope.
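The difference between the two verbs shows up in where the user's input lands in the message. This sketch builds both kinds of request as plain text; the host, the path, the parameter q, and the two helper functions are assumptions for illustration.

```python
from urllib.parse import urlencode


def build_get(host, path, params):
    """GET: the user's input rides along in the URL itself."""
    return f"GET {path}?{urlencode(params)} HTTP/1.1\r\nHost: {host}\r\n\r\n"


def build_post(host, path, params):
    """POST: the same input goes deeper in the envelope, after the headers."""
    body = urlencode(params)
    return (
        f"POST {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Content-Type: application/x-www-form-urlencoded\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
        f"{body}"
    )


get_request = build_get("www.example.com", "/search", {"q": "cats"})
post_request = build_post("www.example.com", "/search", {"q": "cats"})
print(get_request)
print(post_request)
```

Note that POST only keeps the input out of the URL; keeping it secret from anyone watching the wire is the job of HTTPS.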
and I'm connected to the cloud, and in that cloud is some server that I
that request is going to reply with what we'll typically call a response.
And just like in a restaurant, where you might request something to eat,
One request, one response, for each such web page we request.
All right, so what's inside these envelopes, and what do we actually see?
Well, this arrow, this line I just drew from left to right,
The verb GET, the URL, or rather the path that you want to get,
And the envelope contains some mention of the host that was typed in,
the fully qualified domain name.
This is because single servers can actually host many different websites.
nowadays, you don't get your own personal server, most likely.
know to send it to your web page or my web page or some other customer
altogether.
Hopefully, then, when your browser requests this web page from the server,
and then literally a short phrase like OK, which means exactly that, like, OK,
Text/HTML means here comes some HTML, which is just a text language.
there are these different content types, otherwise known as MIME types,
But in general, what you see here is a familiar pattern, keys and values.
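Since the headers really are keys and values, they drop neatly into a dictionary. This is a sketch of a parser for the top of a response, assuming a short hand-written response; the function name parse_head is made up.

```python
# A hand-written response, like the one inside the envelope.
RESPONSE = (
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: text/html\r\n"
    "\r\n"
)


def parse_head(text):
    """Split a response into its status code, phrase, and headers."""
    head = text.split("\r\n\r\n", 1)[0]
    status_line, *header_lines = head.split("\r\n")
    _version, code, phrase = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        key, _, value = line.partition(":")
        # Header names are case-insensitive, so normalize them.
        headers[key.strip().lower()] = value.strip()
    return int(code), phrase, headers


code, phrase, headers = parse_head(RESPONSE)
print(code, phrase)
print(headers)
```

A browser does the same thing, then uses the content type to decide how to display what follows.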
though you can do this kind of thing with most any browser today.
But it turns out we can poke around at what my browser is actually doing.
I'm going to start to use incognito mode this time, not because I
developer tools, which is something that all of you have, if you use Chrome.
I don't really care what's new, so I'm going to close the bottom thing there.
And I'm going to hover over the Network tab for just a moment.
110 other things you need, 112 other things you need to get.
So my computer went back and forth, requesting even more content for me.
Why?
And I'm going to click on this row, under the Network tab.
To an average person using the web, they needn't care about this,
and then other stuff that, for now, it's not that interesting for us.
But let's look at the response that came back from the server.
I'm going to scroll up now and see response headers, view source.
It is not OK.
Maybe the marketing people want you to be at www instead of just Harvard.edu.
Why?
And all this other stuff is sort of uninteresting for our purposes,
And that's why, in my browser, all of this happened in like a split second,
going to allow me to play with websites and just see those headers,
Let me go ahead and run, for instance, curl -I -X GET, which is just the command
Now, by way of how curl works, I'm just seeing the headers.
And you see exactly the same thing, 301 moved permanently.
Let's go to the location, with HTTPS and the www and hit Enter.
AUDIENCE: Migrate?
if I were using a real browser, the actual content of the web page.
Looks like Harvard's version of HTTP is even newer than the one I'm using.
It's using HTTP version 2, which is fine.
like Harvard.edu, when this file does not exist, something completely random,
What do you see now, that's perhaps familiar, in the real world?
Yeah.
something like OK, or moved permanently, what I've just gotten back, quite
that you'll start to see over time, as you start to program for the web.
200 is OK.
you might see one of these codes, indicating that you just
500 is bad.
We're going to have typos, logical errors, and this is on the horizon,
just like segfaults were in the world of C, but solvable with the right skills.
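Python's standard library knows these codes by number, which makes for a handy cheat sheet. This sketch looks up the four codes mentioned here; the family function is a made-up helper that groups codes by their first digit.

```python
from http import HTTPStatus


def family(code):
    """Group a status code by its first digit."""
    if 200 <= code < 300:
        return "success"
    if 300 <= code < 400:
        return "redirect"
    if 400 <= code < 500:
        return "client error"   # you, or your browser, made a mistake
    if 500 <= code < 600:
        return "server error"   # the code on the server is at fault
    return "other"


for code in (200, 301, 404, 500):
    print(code, HTTPStatus(code).phrase, "-", family(code))
```

The 400s generally mean the request was at fault, and the 500s mean the server was, which will matter once you're the one writing the server.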
Has anyone, we can get away with this here, less so in New Haven,
[LAUGHTER]
OK, so--
[APPLAUSE]
Someone out there has been paying for the domain name,
The person who bought that domain name and somehow configured
DNS to point to their web server, the IP address of their web server,
And when we do this, you'll see that, oh, they don't need us to be secure,
recently, apparently.
So--
[APPLAUSE]
And, honestly, hands down, one of the best pranks ever done in this rivalry
was by Yale to Harvard.
[VIDEO PLAYBACK]
[MUSIC PLAYING]
[CHEERING]
- Go Harvard!
- Go Harvard!
- Let's go Harvard.
- Where does?
- Hah-hah.
- Because they don't have it.
- OK.
- Garbage.
- I know, but--
- Sometimes.
- Yeah.
[CHEERING]
- OK.
Full timer.
- Oh, no.
My bad.
- Yeah.
[CHEERING]
Go, Harvard.
[APPLAUSE]
[BEEP]
- You suck.
You suck.
You suck.
You suck.
You suck.
You suck.
You suck.
You suck.
You suck.
You suck.
You suck.
You suck.
You suck.
You suck.
You suck.
[SCREAMING]
[SCREAMING]
- Oh.
- Harvard sucks.
Harvard sucks.
Harvard sucks.
Harvard sucks.
Harvard sucks.
Harvard sucks.
Harvard sucks.
[END PLAYBACK]
SPEAKER 1: All right, so thanks to our friends at Yale for that one.
Let's go ahead here and consider, in just a moment, what further is deeper
have the ability to get data from, oh, OK, YouTube autoplay again.
Let's consider for just a moment
that we now have this ability to get data from point A to point B.
In fact, we don't yet have the language with which the web pages themselves
is that there are these constructs via which you can express conditionals.
HTML and CSS aren't so much about logic as they are about structure,
to both C and Python, but that's going to allow us to make these web pages not
just static, things that you look at, but interactive applications as well.
And then next week again, in week 9, will we reintroduce Python and SQL,
tie all of this together, so that you can actually have a browser or a phone
the experience that you and I now take for granted for most any app or website
today.
I'm going to go ahead and create a file quite simply called, Hello.html.
And I'm going to go ahead and bang this out real quick.
But then we'll more slowly step through what the constructs are herein.
and then notice I'm going to do open bracket slash html close bracket.
So you'll see that there's this symmetry to much of what I'm going to type,
VS Code is automatically generating the end of my thought for me, if you will.
And then down here, I'm going to create the body of this web page
and say something like Hello, body.
And let me specify at the very top, that all of this is really in English,
want that web page to also live in the cloud, that is, on the internet.
that you might use in a browser, by default that website is using probably
is that you and I can just pick port numbers to use and run our own web
by just running our own web server via this command, http-server,
in my terminal window.
and 443 is already used, you can run your own server
here I see a so-called directory listing of the web server I'm running.
I only see the file that I've created in my current directory, called
Hello.html.
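The whole round trip, a file on disk, a web server, and a request for that file, can be sketched in one short program. Python's built-in http.server module stands in here for the http-server command, and the page is a plausible version of the file being typed, not a verbatim copy. The server picks any free port on the local machine.

```python
import functools
import http.client
import http.server
import pathlib
import tempfile
import threading

HELLO = """<!DOCTYPE html>

<html lang="en">
    <head>
        <title>Hello, title</title>
    </head>
    <body>
        Hello, body
    </body>
</html>
"""


def serve_and_fetch(html):
    """Save the page, serve its folder, and request it like a browser would."""
    with tempfile.TemporaryDirectory() as folder:
        pathlib.Path(folder, "hello.html").write_text(html)
        handler = functools.partial(
            http.server.SimpleHTTPRequestHandler, directory=folder
        )
        # Port 0 means "any free port"; 80 and 443 are usually taken.
        server = http.server.ThreadingHTTPServer(("127.0.0.1", 0), handler)
        threading.Thread(target=server.serve_forever, daemon=True).start()
        try:
            port = server.server_address[1]
            connection = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
            connection.request("GET", "/hello.html")
            response = connection.getresponse()
            result = (
                response.status,
                response.getheader("Content-Type"),
                response.read().decode(),
            )
            connection.close()
            return result
        finally:
            server.shutdown()
            server.server_close()


status, content_type, body = serve_and_fetch(HELLO)
print(status, content_type)
print(body)
```

The status and content type that come back are the same kinds of headers seen earlier with curl, just from a server running on your own machine.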
And now you can see just in this single browser window, in my own URL
which just says to the internet, hey, listen for requests from web browsers,
And this means I can develop a website using a web-based tool, like this one
All right, so now let's consider what it is I actually just typed out.
HTML is characterized really by just two features, two vocab words, tags
and attributes.
Most of what I just typed were tags, but there was at least one attribute
already.
Here's the same source code that I typed out in HTML, from top to bottom.
Let's consider what this is.
It's the only one that starts with an open bracket, a less than sign,
says to the browser, you are about to see a file written in HTML version 5.
That line of code has changed over time, over the years.
You use the same tag name, you use the same angled brackets.
Not the tags, but the words, like Hello, title and Hello, body,
So when you close a tag, you close the name of it with the slash
to keep track of this stuff, like a two column table, with keys and values.
The head tag says, hey, browser, here comes the head of the page.
The body is like 99% of the user's experience, the big rectangular window.
The head is really just the address bar and other such stuff at top,
like the title that we saw a moment ago.
Just to introduce the vernacular, then, the HTML tag, otherwise known
as an element, has two children, the head child and the body child,
So you can use the same kind of family tree terminology that we used,
If we look at the head tag, how many children does it seem to have?
ignore all the white space, the spaces or tabs or new line characters,
And an element is the terminology that includes the start tag and the end tag,
And the title element has one child, which is just pure text,
If we jump then to the body, which is the other child of the HTML tag,
it too has one child, which is just another chunk of text, a text node,
What's nice about this indentation, even though the browser technically
And this is where we connect, like week 5 and now week 8, here
Which is only to say that when your browser, Chrome, Safari, whatever,
and sees the contents that have come back from the server,
And it's the kind of thing that you look up when you need to.
This one is being super simple, but it's just tags and attributes.
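The family tree a browser builds can be approximated with Python's html.parser module. This is a sketch, not how a browser is actually implemented: the TreePrinter class is invented, and it only handles pages where every tag is properly closed, like this one.

```python
from html.parser import HTMLParser

PAGE = """<!DOCTYPE html>
<html lang="en">
    <head>
        <title>Hello, title</title>
    </head>
    <body>
        Hello, body
    </body>
</html>
"""


class TreePrinter(HTMLParser):
    """Record each element and text node, indented by its depth in the tree."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.lines = []

    def handle_starttag(self, tag, attrs):
        attributes = "".join(f' {key}="{value}"' for key, value in attrs)
        self.lines.append("    " * self.depth + f"<{tag}{attributes}>")
        self.depth += 1   # whatever comes next is a child of this tag

    def handle_endtag(self, tag):
        self.depth -= 1   # back up to the parent

    def handle_data(self, data):
        text = data.strip()   # ignore whitespace between tags
        if text:
            self.lines.append("    " * self.depth + repr(text))


printer = TreePrinter()
printer.feed(PAGE)
print("\n".join(printer.lines))
```

The output shows html with two children, head and body, and a text node at the bottom of each branch.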
SPEAKER 1: If we put the hello tag around body, that's a good question.
whoops, sometimes you don't want it to finish your thought for you.
Give me a second.
There's my Hello.html.
copy paste my sample HTML into this box, and click Check.
which was to show me nothing at least, rather than the incorrect information.
But if I revert that change, and let me undo what we just did,
let me copy my original code back into this text box, and click Check,
And I'm just going to do a bunch of copy/paste just to start things off,
so I'm not constantly typing all this darn stuff again and again,
So let me go to some random website here and grab lorem ipsum text,
this is placeholder text, kind of looks like Latin, but technically isn't.
Here, though, I have a handy way of just getting three long paragraphs
Let me reload this page, and you'll see two files have now appeared,
Yeah.
So that's interesting, but it's just a little hint as to how pedantic HTML is.
And each of these tags tells the browser to start doing something,
and then maybe stop doing something, like, hey, browser, here comes my HTML.
between the HTML and the browser, doing literally what it says.
I'm going to keep things neat, even though the browser won't care,
Let me indent that, and then let me add it to the end of my page here.
So again, a little tedious, but now I have three paragraphs of text that say,
this just gives you headings, like in a book, different chapters or sections
So now that I've added an H1 tag, and the word one, H2 tag, the word
two, H3 tag and the word three, let's go back to the browser,
Yeah.
But you can kind of see that H1 is apparently big and bold,
as you might use for chapters, sections, subsections, and so forth, all right?
What's a common thing, too, well, let me go to VS Code again, let me go ahead
So in List.html, suppose I want to have a list of things, foo, bar, and baz,
There's List.html, and, hopefully, I'll see foo, bar, and baz, one
why we're seeing foo, bar, and baz on the same line, and not
But notice the hierarchy, open UL, open LI, close LI, open LI,
Let me go back to my browser, reload the same page, List.html, and voila,
If you don't want an unordered list, but an ordered list, what tag should I use?
AUDIENCE: OL.
Let me reload the page, and now it's going to automatically number for me.
might add some things in the middle, the beginning, or the end.
not just paragraphs, not just lists, but what about tabular data?
some financial data you want to present, a phone book that you want to present.
So thead is the name of that tag, and tables can have tbodys, table bodies.
finish your thought, and then go back and fill in what's in between.
So, for instance, like left column name, right column number.
Let's create a table heading with the TH tag, and let's say name here.
Let's do table data, which is synonymous with like the cell of the table,
and then let's grab Carter's number from our past demo, 617-495-1000.
Then let's put me into the mix, and I'll go ahead and copy paste here,
But we'll see that there's a lot of shared structure with HTML.
Let me go ahead and do mine, 949-468-2750, and now save this page.
But you can see that there's two columns, name and number.
Because it's a table heading, TH, the browser made it boldfaced for me.
In there, in the table, are two rows below that, Carter and David.
But with HTML alone, I'm really focusing on the structure alone.
But for now, this is how you might lay out tabular data.
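Because the rows all share the same structure, the copy/paste can be handed to a loop. This sketch generates the same kind of table from a list in Python; the table function is made up, and html.escape is there in case a name ever contains a character like an angle bracket.

```python
from html import escape


def table(headings, rows):
    """Build an HTML table with a thead of th cells and a tbody of td cells."""
    head = "".join(f"<th>{escape(heading)}</th>" for heading in headings)
    body = "".join(
        "<tr>" + "".join(f"<td>{escape(cell)}</td>" for cell in row) + "</tr>"
        for row in rows
    )
    return (
        f"<table><thead><tr>{head}</tr></thead>"
        f"<tbody>{body}</tbody></table>"
    )


phone_book = [
    ("Carter", "617-495-1000"),
    ("David", "949-468-2750"),
]
markup = table(["Name", "Number"], phone_book)
print(markup)
```

Adding a third person is now one more tuple in the list rather than another round of copy and paste.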
All right, let me pause here just to see if there's any questions.
But, again, the goal right now is just to kind of throw at you
some basic building blocks, that, again, can be easily looked up in a reference.
For the stylization of these things, beyond the basics, like big and bold,
is full of, which is like photographs and images and the like.
Let me go ahead and create a new file called Image.html, and let me go ahead
And then, in the body of this page, let's go ahead and put an image.
going to have a start tag and an end tag, because that's kind of illogical.
Like, how can you start an image and then eventually finish it?
So let's now go back to my open browser tab, and let's look in the directory.
a really big picture of Memorial Hall, the building we're currently in.
Suffice it to say I should probably fix this and maybe make it only so wide.
But to do that, we're going to probably want to use this other language, CSS.
And let me go ahead and change this now to be a file called Video.html.
And let's go ahead and now introduce another tag, a video tag,
open bracket video, and then let me go ahead and close that tag proactively.
And then inside of the video tag, you can say the source of the video
Most browsers, to prevent ads, don't autoplay videos, if they have sound.
And let me set the width of this thing to be like, oh, 1280 pixels wide.
So I know this just from having looked up the syntax for this tag.
And this is actually a video that was just on Harvard's website yesterday,
This is the video that was on Harvard.edu last night, same photo.
Now there's some artifacts here, like there's a white border around the top.
But again, we'll come back to a language that can allow us to do exactly that.
If you've ever poked around with, if you have your own YouTube account,
an iframe.
happens to be a YouTube video, there's a certain URL format you need to follow,
videos, in the course's website, or the video player does literally this.
there's iFrame.html.
But it does seem to embed a tiny little video there for you to play with later,
if you'd like.
So we could change the width, change the height, get rid of that margin,
and so forth.
But an iframe is a way of embedding someone else's web page in your web
the more of an interactive experience for them on, say, your site.
All right, well, the web is, of course, known for things like links.
And if we want to create a web page that actually links from itself somewhere
else, let's go ahead and do this, something very simple like visit
Harvard.edu period.
Now, in like Facebook, Instagram, a lot of websites nowadays, if you just type
detects something that looks like a URL, and turns it into a proper link.
But instinctively, even if you've never written HTML before, what should
Yeah.
SPEAKER 1: Yeah, so I want to surround the URL with some kind of link text.
And you wouldn't necessarily know this until someone told you,
or you looked it up, but the tag for creating a link is somewhat weirdly
And then I can still say Harvard.edu, and make that what the human sees.
But the place they're going to go should be a full URL protocol and all,
and if I hover over it but don't click, you'll see that, in most browsers,
on this link.
P-H-I-S-H-I-N-G, whereby you can create clearly a web page, or, heck,
even an email using HTML, that tells the user they're going to go one place,
And so having the instinct to look in the bottom left hand corner,
I can just do HREF equals quote unquote and the name of a file,
you'll see in the bottom left hand corner a very long URL.
There are special tags we can use to tell the browser to modify its display,
I'm going to copy/paste some starting point here, call this title Responsive.
And let me go ahead and just grab, let me grab some of that lorem ipsum text
from before, just so that we have a sizable paragraph to play with here.
And I'm just going to paste this into the body of this page.
But here's another trick you can do, using Chrome or Edge or other browsers
these days.
was kind of interesting, because we could see what the underlying network
traffic is.
I can turn my laptop into what looks like a mobile device by clicking this.
I'm going to click the dot dot dot menu over here, and just move the dock.
Here's what that same website might look like on an iPhone X. You know,
And let me go into the head of the page, and for the first time,
It's essentially the body of the page, but only the part the human
is currently seeing.
These are sort of magical statements that you just have to know
assume that the width of the page is the same thing as the width of the device.
And now, it's not very effective on this screen, if I were showing you this on,
is there--
Let's do this.
There we go.
All right, let me pause here to see if there's any questions, because that
things you Google and figure out over time, just to build up your vocabulary.
Some do not.
Yeah.
Version.
It's HTML with CSS, with JavaScript, both of which we'll get a taste of here
today.
work on Android devices and iPhones and Macs and PCs, and the like.
It is very expensive.
and make an iOS app, not to mention make them look and behave
even for mobile apps and web apps, has been increasingly compelling,
All right, so let's go ahead and now do something that's finally interactive.
there might be a question mark, and then some keys and values.
here, and let me search for something the internet is filled with, cats.
Enter, notice now that my URL changed from google.com
So let's just delete it for now, and leave it with the essence of that URL.
If I zoom out here, years ago you would get pictures of cats.
All right, this didn't used to happen when we searched for cats.
But anyhow, the point is that the URL changed to include the user's input.
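As a rough sketch of what the browser does when a form gets submitted with GET,
here's a bit of Python that builds that same kind of URL from the user's input.
The function name build_search_url is made up for illustration; urlencode comes
with Python's standard library.

```python
from urllib.parse import urlencode


def build_search_url(query):
    """Return a Google-style search URL with the user's input as q."""
    # urlencode turns a dict into key=value pairs joined by ampersands,
    # escaping any characters that aren't safe to put in a URL
    return "https://www.google.com/search?" + urlencode({"q": query})


print(build_search_url("cats"))
```

Searching for two words, like black cats, would put a plus sign in place of the
space, since spaces can't appear in a URL as-is.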
They don't manually create the URLs, like I sort of just did.
But when you fill out a form on the web and you hit Enter,
a credit card information, because you don't want the next person to sit down
and what that means underneath the hood is that your browser is just
And hopefully what comes back is a page full of search results, including cats.
And what's interesting here now is, if I go back to VS Code on my own computer,
and let me go ahead and create a file called, how about Search.html.
And in the body of this page, I'm going to introduce a form tag.
And the types of inputs are going to be text, and the type of the input
is going to be submit.
which, if you roll back to the late '90s when Larry and Sergey of Google fame
created Google.com, Q represented query, the query that the human's typing in.
Stupidly, it's lowercase in HTML, even though what's in the envelope is indeed
uppercase, by convention.
is to send the data to Google's slash search path, using the GET method.
It's going to send an input called Q, whenever I click this Submit button.
Nothing seems to have changed yet, but, if I search for, let me zoom out,
If I zoom out and search for cats now and click Submit,
But notice that the URL is parameterized, with those key value
Right now, it's not ideal that like the human has to move their cursor
If I want some explanatory text, I can put placeholder text like quote unquote
"query."
I can search for dogs now, and you didn't see any autocomplete at all.
I've implemented the front end of Google.com, just not the back end.
going to need like a really big database, maybe something like SQL.
We're going to need some code that like searches the database for dogs or cats,
or anything else.
Or any question, then, about forms, these URL parameters, before we now
with JavaScript.
Anything at all?
No?
Let's go ahead now and introduce to the mix one other language, as follows.
as though I'm making a home page for the very first time.
like my name, John Harvard, for instance, or John Harvard's home page.
Then in the middle of the page, I'm going to have some text like,
All right, so it's like a web page with three different structural areas,
and create three quick paragraphs, a first paragraph for John Harvard.
Inside the middle, I'm going to say something like welcome to my home page
exclamation point!
It's a very simple, very underwhelming web page that has three main sections.
They're sort of like areas of the page, divisions, like the header is up here.
paragraphs of text.
of the page, which is a very commonly used tag in HTML, which just has
There's some other tags in HTML, known as semantic tags, that literally
for screen readers, for search engines, because now, a screen reader, a search
On Main, I'm going to add a style attribute and say font size
And then on the footer, I'm going to say style equals font size
It'd be nice if the world standardized how you write key value pairs,
because we've now seen equal signs and arrows and colons and semicolons,
and all this.
The semicolon just separates one key value pair from another.
Just like in the URL, the ampersand did, in the context of HTTP.
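To make that separator idea concrete, here's a minimal sketch in Python that
splits the value of a style attribute into its key value pairs. The function
name parse_style is hypothetical, and this is deliberately naive: a real CSS
parser has to handle far more than this.

```python
def parse_style(style):
    """Split a style attribute's value into a dict of CSS properties."""
    properties = {}
    # Semicolons separate one key value pair from the next
    for pair in style.split(";"):
        # A colon separates each key from its value
        if ":" in pair:
            key, value = pair.split(":", 1)
            properties[key.strip()] = value.strip()
    return properties


print(parse_style("font-size: large; text-align: center;"))
```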
So as of now, you can use the CSS language inside of the quote marks
is going to get messy quickly, certainly for large web pages, the size
But it's indeed centered, and it's indeed large, medium, and small text.
It turns out there are numeric codes, with this weird syntax, that
allow you to specify symbols that exist in Macs and PCs and phones,
It is correct, if my goal was small, medium, and large, bottom up, what
Yeah.
AUDIENCE: Same
like copy/paste, or typing the exact same thing again and again.
whereby children inherit the properties, the key value pairs of their parents
or ancestors.
I could get rid of the semicolon, too, but I'll leave it for now.
And let me add all of that style to the parent element, the body,
so that it sort of cascades down to the header, the main, and the footer tags
as well.
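One way to picture that cascading is as a lookup that checks the element first
and then falls back to its parent. Here's a toy model of that in Python using
ChainMap from the standard library. The dictionaries and the computed_style
function are made up for illustration, and real browsers only inherit certain
properties, so treat this as an analogy and not how a browser is implemented.

```python
from collections import ChainMap

# Properties set on the parent element and on two of its children
body = {"text-align": "center"}
header = {"font-size": "large"}
footer = {"font-size": "small"}


def computed_style(own, parent):
    """Look up properties on the element first, then on its parent."""
    return ChainMap(own, parent)


header_style = computed_style(header, body)
print(header_style["font-size"])   # set on the header itself
print(header_style["text-align"])  # inherited from the body
```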
I can now reload the page, and voila, now it's over there.
Well, it's not that elegant that it's all just in line with my HTML.
you co-mingle your HTML and your CSS, especially since some of you
and the content and the data, and you might have a horrible sense of design
Wouldn't it be nice if you could work on the HTML, they could work on the CSS.
with my HTML.
Let me instead move it into the head of the page, in a style tag,
instead of an attribute.
are attributes that have the same names as tags, and vice versa.
Here's a slightly different syntax for expressing the same key value pairs.
Then, if I want to apply some properties to the main section of the page,
can assign some properties like font-size small, and then text-align
center semicolon.
Yeah.
I like that.
Yeah, get rid of the text-align center in three different places, which
Yeah.
And so, just to make clear what we've been doing here,
these are all, again, CSS properties, these key value pairs.
What we've been doing thus far are what we're going to call type selectors,
I bet, after today, once I start creating other files for my home page,
And I might want to have large text or medium text or small text.
So let me do this.
and let me actually get rid of-- let me grab all of the CSS,
copy it to my clipboard.
Let me get rid of the style tag here, and create a new file called Home.css,
and let me just save all of that same text in a separate file ending in .css,
So ideally we would have used the link tag for links in web pages,
We're linking this file to this other one, so that they work together,
using this hyper-reference, Home.css, the relationship
because I can now use those same classes in my second page that I might make,
including one line of code, instead of copying and pasting all of that style
impressed by my centered class, and my large and medium and small classes,
I could bundle this up, let other people on the internet download it,
and I have my own library, my own CSS library, that other people can use.
if I already did it for you, stupid and small as this one is.
Use classes where you can, use external stylesheets where you can,
but don't use the style attribute where we began, which while explicit,
you're selecting all of the tags in the page, that have that particular class,
There's so much more that you can actually do with HTML and CSS together.
Let me go ahead and open up a few examples that I did here in advance.
Give me one second to grab the source eight directory for today's lectures,
Yeah?
Why?
Who knows, it's just a stylistic thing at the beginning of the chapter.
One, I can obviously go into VS Code and show you the code.
But, now, that we're using Chrome and we're using these developer tools,
let me turn off the mobile feature, and let me move the dock back
What's nice about the Elements tab is you can see a pretty printed
for you, so that you can now henceforth learn from, look at, the source
code, the HTML source code, of any web page on the internet.
Notice that my own web page here, it's not that interesting.
have some way of referring to the very first paragraph in this file.
If I look in the head of the page, and the style tag here,
And what this is telling the browser, whatever element has the first ID,
And that's why the first paragraph, and only the first paragraph,
is actually stylized.
If I want to change the color of that first paragraph to green, for instance,
I can do color colon green.
If I want to change it to red, that would be, let's see, RGB FF 00 00,
Like, if you're a web designer trying to make a website for the first time,
before you open up your editor and you start making changes
The italicized ones here at the bottom mean user agent stylesheet.
That means this is what Google makes all paragraphs look like by default.
I changed it to blue.
making a request to the cloud, the server in the cloud and the response
coming back, the browser, your Mac, your PC, your phone,
has a copy of all the HTML and CSS, so you can change it here,
to just open and close the tags, to dive in deeper and deeper.
And notice, if I hover over this LI, notice Stanford's using a list,
So now, if I close developer tools, now it's gone from Stanford's website.
But it's a wonderfully powerful way to, one, just iterate quickly, and try
And we'd have to read more to learn how this works, list style type none,
there we go.
if I turn that off, now Stanford's links all look like this.
programmatically.
Let me look in the head and the style of the page now.
So you can apply CSS to a very specific child, namely first child.
There's also syntax for last child, if just the first one
and here we have a very simple page that just says visit Harvard.
to be a little different.
has changed the color to be red, and the text decoration, which is a new thing,
but it's another CSS property, to none.
But maybe, watch this, maybe the link comes, the line comes back
Notice that I have stylization, and I put my curly braces on the same line
you can change the text decoration to be back to the default, underline.
So, again, just little ways of playing around with the aesthetics of the page,
But it's just another way of scoping your properties to specific tags.
Let's look at version 3 of this here, which adds Yale to the mix.
just to, again, emphasize, you can do this so many different ways.
If I want to get rid of the IDs, I can do this a slightly different way.
based on an attribute.
whose HREF value happens to equal this URL, and make it red.
Now, this might not be ideal, because if there's something after the slash,
Star equals means, change any anchor tag whose HREF contains anywhere in it
Harvard.edu to red, and do the same thing for Yale, based on star equals.
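The difference between equals and star equals is roughly the difference between
an exact comparison and a substring check. Here's a small Python sketch of that
distinction. The list of links and both function names are assumptions made up
for this example, not anything from the page itself.

```python
# Hypothetical HREF values that might appear on a page
links = [
    "https://www.harvard.edu/",
    "https://www.harvard.edu/admissions",
    "https://www.yale.edu/",
]


def matches_equals(href, value):
    """Like a[href="value"]: the attribute must equal value exactly."""
    return href == value


def matches_contains(href, value):
    """Like a[href*="value"]: value may appear anywhere in the attribute."""
    return value in href


exact = [h for h in links if matches_equals(h, "https://www.harvard.edu/")]
anywhere = [h for h in links if matches_contains(h, "harvard.edu")]
print(exact)
print(anywhere)
```

The exact version misses the admissions link because of what comes after the
slash, whereas the contains version catches both Harvard links.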
And, again, we could do this all day long, with diminishing returns,
working for a company doing the same, you might have internal conventions
For instance, the company might say, always use classes, don't use IDs.
They start with something off the shelf, a framework, typically a free and open
And there's so much documentation here, but let me just go to things like,
It just gives you, out of the box, the CSS with which you
If you've ever seen on CS50's website a yellow warning alert like this,
But we're using classes called alert and another class called alert warning.
colors and padding and margin and like other aesthetics with,
Role equals alert, just makes clear to like a screen reader that this
Under Getting Started, there is a link tag you copy/paste into your own.
So let me do this.
So in Table.html, we had code like this.
which I used earlier for my own CSS file, the HREF of which
If I want to use some JavaScript code, I can also copy this script tag.
bold, but centered, and then Carter and David were on the left,
It's fine.
It's not that pretty, but it'd be nice if it were a little prettier than that.
if I want to have a colorful table, like I could figure all of this stuff
But I care about making a phone book, not about reinventing these wheels.
and watch with a simple reload, what my now Table.html file looks like.
Might not be what you want, but, my God, with like two lines of code,
user-friendly websites than you might otherwise be able to make on your own,
certainly quickly.
It's got an about link, a store link, Gmail images, these weird dots,
but there's a big text box in the middle, and then two buttons,
so that we have access to all of their classes that are reusable now.
Well, just like Stanford's site had like its NAV navigation bar, using a UL,
but they changed it from being a bulleted list to being left to right,
with Bootstrap that says, make your web page fluid, that is,
Now, just like in Stanford's site, let's create an unordered list that has maybe
an LI, called with a class of NAV item, and then in here, whoops, in here,
Then I'm going to close my LI tag here, and I want to do one other thing,
says to add a class to your links, called like NAV link, and text dark,
to make it dark, like black or dark gray, instead of the default blue.
if you want to create a pretty menu like this, where your links are
NAV, and then again, this class NAV item, Bootstrap told me to,
NAV link text dark, Bootstrap told me to.
Let me go back to my page here, reload, and OK, still kind of ugly.
But at least the About link is in the top left hand corner,
And for Google Images, and I'm going to paste this, whoops,
how to maybe nudge one of these over, to start right aligning it.
But one way is if I want Gmail to move all the way over and push everything
else, I can say that add some margin to the Gmail list item, margin start auto.
And now, if I reload the page again, now, voila, Gmail and images
are over to the right.
Let me go ahead and add the big blue button to sign in.
So here with sign in, let me go ahead and, over in my same NAV, yeah,
so let's go ahead and do one more LI, class equals NAV item.
Turns out there is a class that can turn a link into a button, if you say BTN,
whereas the real google.com has a little bit of margin around it?
I even figured out how to get dots that look pretty similar to Google's.
And if we view source, you can see how I kind of finished this code.
and I go into this div, you'll see that here's an image tag for happy cat.
And I added some classes there to make it fluid, and width 25% of the screen.
If I go into the form tag, this is the same form tag as before.
And so in the end result, if I want to go ahead and search now for birds,
like the shade of blue that Bootstrap chose, or the gray button,
And then if you really don't like what the library is doing,
then use your own skills and understanding of HTML and CSS
to refine things a bit further.
But still, after all of that, all of these examples we've done thus far
give you a sense of what we can next do, next week onward, with JavaScript.
is that when you have written C code and Python code thus far,
And when you run the code, it's running in the cloud on the server.
recall that, when a browser gets the page containing this code,
it's going to get a copy of the HTML, the CSS, and the JavaScript code.
You don't specify the type, but you do use the keyword let,
and there's a few others as well, that say let counter equal 0 semicolon.
asking if x less than y, it looks pretty much like C. The parentheses are,
unfortunately, back.
The curly braces here are back, if you have multiple statements in particular.
But, syntactically, it's pretty much the same as it was for if, for if else,
The only difference, really, is using the word let here, instead of INT.
The browser will figure out what type you mean from context.
Think about most any website, that's at all interesting today, that you use.
If you're sitting in front of Gmail on a laptop or desktop with the browser tab
The point, though, is, you don't have to hit Command R or Control
to the existing DOM, document object model, which is the fancy term for tree
in memory that represents HTML, so that the web page can continue
If you click and drag and drag and drag, your browser did not
But when you click and drag, it's going to get some more tiles up there,
some more images, some more images, as you keep dragging, using
I just added a form to, because it'd be nice if this page didn't just
say Hello, title, Hello, body, it said, Hello, David, Hello, Carter, Hello,
I've got a form that I borrowed from some of our earlier code,
and that form has an input whose ID is name, that also has a submit button.
Suppose that, when this form is submitted, I want to greet the user.
and I can say on submit, call the function called greet, close quotes.
When the user clicks submit, normally forms get submitted to the server.
I want to just submit the form to the browsers, keep on the same page,
In the head of my page, I'm going to add a script tag, wherein the language is
for those of you who took APCS with Java, just a similarly named language,
Initially I'm going to keep it simple, using a built-in function called alert,
It literally says the whole code space URL of the web page
It's really just meant for simple interactions like this, for now.
it'd be nice if, just like in CSS, I can go grab the value of that text box,
using code.
It gives me one of these rectangles from the DOM, the document object model.
called document, that lets you just do stuff with the document, the web page
itself.
and grab the actual text that the human typed in.
I can go ahead and say hello, quote unquote "Hello," plus name.
reload the page, to get the latest version of the code, type in David,
that was not going to scale well, once you have lots and lots of properties.
you don't want to just put your code inside of this on submit handler.
Let me move the script tag, actually, just below the form,
but still inside the body, so that the script tag exists only
Just like in Python, your code is read top to bottom, left to right.
And pretty much any user interface is governed by events, especially phones.
On phones, you have touches, and you have drags, and you have long press,
you have key down, key up, as you're moving your hands up and down
on the keyboard.
And we had the two puppets sort of talking to one another via Events.
In the world of web programming, game programming, any human physical device
And you write code that listens for these events happening.
And when that happens, I want to call the Greet function, like this.
Then I'm adding an event listener, specifically for the Submit event.
I want to tell the browser to call Greet, when it hears this Submit event.
All right, but let's now make this slightly better designed.
where I was like, why are we creating a special function called get value
AUDIENCE: Lambda.
But using now just these four lines of code, I can do this.
I can tell the browser to add an event listener for the Submit event.
And then when it hears that, call this function that has no name.
But you can think of this as just being, run these two lines of code,
when the form is submitted.
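For anyone more comfortable in Python at this point, here's a loose analogy of
that pattern: register a function with no name, a lambda, to be called when an
event happens. The Form class and its methods are a toy stand-in invented for
this sketch; they are not the browser's actual API.

```python
class Form:
    """Toy stand-in for a form element; not a real browser API."""

    def __init__(self):
        self.listeners = {}

    def add_event_listener(self, event, callback):
        # Remember which functions to call when this event happens
        self.listeners.setdefault(event, []).append(callback)

    def dispatch(self, event, data):
        # Call every function that was registered for this event
        return [callback(data) for callback in self.listeners.get(event, [])]


form = Form()
# A lambda is Python's way of writing a function that has no name
form.add_event_listener("submit", lambda name: "hello, " + name)
print(form.dispatch("submit", "David"))
```

Nothing runs at the moment the listener is added. The function is only called
later, when the event is actually dispatched.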
And you would only know this from being told it or reading the documentation.
So here's a very simple example, that has just three buttons in it, one red,
I can add an event listener, this time not for submit, but for click.
but you can go into the body of the page, its style property,
like background-color.
of a tag that was removed from HTML, because in the late '90s, early 2000s,
I will admit, my very first web page probably used both of these tags.
How?
I wrote some code in this example, that waits every 500 milliseconds to change
here's the three variants of bananas that appear in that file, and so forth.
it's just updating the DOM, the tree in the computer's memory,
And for one final example, this is how programs like DoorDash and Google Maps
You have built into browsers today some fancy APIs, application programming
interfaces, whereby you can ask for information about the user's device.
It's taking a moment, because sometimes these things take a little while
to analyze.
and as a final flourish today, for what you can do with a little bit of HTML
JavaScript for your logic, which we'll tie in again next week, let me go ahead
We're not on that street, but there, oh, there it is, actually.
[MUSIC PLAYING]
we're going to add back into the mix, Python and SQL.
And with that, do we have the ability to program for the web.
And even though this isn't the only user interface out
but it's also, increasingly, the way that mobile apps are written as well.
potentially.
at least for the next some number of years, on HTML, CSS, and JavaScript
coupled with other languages like Python and SQL on the so-called backend.
But we need an additional tool today, and we've sort of outgrown HTTP server.
JavaScript files, maybe images, maybe video files, but just static content.
It has no ability to really interact with the user beyond simple clicks.
You can create a web form and serve it visually using HTTP server,
but if the human types in input into a form and clicks Submit, unless you
it's not actually going to go anywhere because this server can't actually
comes with Python that allows us to not only serve web pages
And recall that all that input is going to come ultimately from the URL,
So here's the canonical URL we talked about last week for random website
like www.example.com.
And I've highlighted the slash to connote the root of the web server,
set.
might have multiple slashes and multiple folders and some folders and files.
a better generic description of what these things are because it turns out
And just make sure that when the user visits that, you give them
If they visit something else, you give them a different web page.
And if you want to get input from the user, just like Google does,
like q=cats, you can add a question mark at the end of this route.
The key, or the HTTP parameter name that you want to define for yourself,
and then equals some value that, presumably, the human typed in.
and then more key equals value pairs ampersand, repeat, repeat, repeat.
The catch, though, is that using the tools that we had last week alone,
we don't really have the ability to parse, that is, to analyze and extract
You could have appended question mark q equals cats or anything else
The server is not going to bother even looking in that for you.
And in fact, we're going to use a web server implemented in Python, instead
look for any key value pairs after the question mark
Recall that a dictionary in Python, a dict object, is just key value pairs.
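To give a sense of what that parsing involves, here's a minimal sketch using
only Python's standard library, turning whatever comes after the question mark
into a dict. The function name get_args and the lang parameter are assumptions
for illustration; a web framework would do this step for you.

```python
from urllib.parse import parse_qs, urlsplit


def get_args(url):
    """Return the key value pairs after the question mark as a dict."""
    query = urlsplit(url).query
    # parse_qs maps each key to a list of values, since a key can repeat;
    # keep just the first value of each to keep things simple
    return {key: values[0] for key, values in parse_qs(query).items()}


print(get_args("https://www.example.com/search?q=cats&lang=en"))
```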
this past week to make your home pages prettier and nicely laid out,
Why?
Well, you're using libraries, code that someone else wrote, like all the CSS,
maybe some of the JavaScript that the Bootstrap people wrote for you.
But it's also a framework in the sense that you have to go all in.
have to lay out your divs or your spans or your table tags
it's just going to make stuff like that easier for us.
If we have any libraries that we want to use, the convention in the Python world
means any files you create that are not ever going to change,
there are just different conventions out there for creating applications.
So something that, initially, is not all that dynamic, pretty static, in fact.
From Flask, import Flask, with a capital F second and a lowercase f first.
And then below that, I'm going to say, go ahead and do this.
So we've seen this a few weeks back when we played around with Python
For now, just know that __name__ refers to the name of the current file.
And so this line here, simple as it is, tells Python, hey, Python,
Flask is a function that just figures out, then, how to do the rest.
The last thing I'm going to do for this very simple web application is this.
And so I'm going to tell it to define a route for, quote unquote, "slash."
This is slightly new syntax, and it's really the only weirdness
For our purposes, just know that on line six this says,
The next two lines, seven and eight, say, hey Python,
And the only thing you should ever do is return render template of quote unquote
"index.html."
what is in index.html?
And I'm going to do a very simple web page, doc type HTML.
I'll then do a head tag, I'll do a meta tag, the name of which is viewport.
That is, it just grows and shrinks to fit the size of the device.
The initial scale for which is going to be one, and the width of which
But then lastly, I'm going to add in my title, which will just
whoops, body.
there we go.
The body of this page, rather, will just be hello comma world.
right now, because I don't have any other files that I want to serve up.
I'm going to go requirements.txt and just say make sure the system has
All right, but that's the only thing we can add in there for now.
All right, so now I have two files, app.py, and I have index.html.
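To make the route idea in app.py a bit more concrete, here's a toy version in
plain Python of what it means to associate a path like slash with a function.
This is not Flask's implementation: the routes dict and the route and handle
functions are invented for this sketch, and the function just returns a string
where the real app would render index.html.

```python
# Maps each path to the function that should run when it's visited
routes = {}


def route(path):
    """Toy version of a route decorator; Flask's real one does much more."""
    def decorator(function):
        routes[path] = function
        return function
    return decorator


@route("/")
def index():
    return "hello, world"


def handle(path):
    """Call whichever function was registered for this path."""
    if path in routes:
        return routes[path]()
    return "404 Not Found"


print(handle("/"))
```

The line starting with an at sign is what registers index for slash, so visiting
any path that was never registered falls through to the not found case.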
It would not get executed because HTTP server is just for static content.
But today, I'm going to run a different command called Flask run.
yet, comes with a program called Flask, takes command line arguments like
the word run, and when I do that, you'll see somewhat similar output to last
just to make sure I'm allowed to access that particular port, let me zoom in.
It doesn't bother showing you a slash, even though it's implicitly there.
But let me do something explicit like my name equals, quote unquote, "David."
and get whatever the value of the parameter called name is.
This is the actual variable that I want to get the value from.
is to do two curly braces and then put the name of the variable
This is the plan to make a web page that has all of this code literally,
but there's this placeholder with two curly braces here and here
that says go ahead and plug in the value of the name variable right there.
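Here's a rough idea of what plugging in a value means, as a toy in Python that
swaps each pair of curly braces for a variable's value. Flask relies on a
templating library called Jinja for this, which is far more capable and also
escapes HTML for safety; the render function below is only an illustration
under that assumption, not how Jinja works internally.

```python
import re


def render(template, **variables):
    """Replace each {{ name }} placeholder with the matching variable."""
    def substitute(match):
        # Unknown variables become empty strings in this toy version
        return str(variables.get(match.group(1), ""))
    return re.sub(r"\{\{\s*(\w+)\s*\}\}", substitute, template)


print(render("hello, {{ name }}", name="David"))
```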
to the end of the URL with a question mark, it still said hello, world.
Let me go back to my hello tab and click reload so it grabs the page anew
I can play around now and I can change the URL up here to, for instance,
Carter.
So the new pieces here are, in Python, we have some code here
And the only thing we have to do that is call this function request.args.get.
Yeah, in back.
The short answer is just because that is where key value pairs must go.
the convention, standardized by the HTTP protocol, is to put them in the URL
Other questions?
Yeah.
AUDIENCE: Can you go over again why the left and right in the [INAUDIBLE]?
DAVID: Sure.
I now, though, have to go into my index file and say name of person--
So what typically people do is they just use the same name as the variable
itself, even though it looks admittedly stupid, but it has two different roles.
the name of the variable you plan to use in the template, the thing on the right
There we go.
only ever going to see Emma no matter whose name is in the URL.
That's all.
If, in order to get a greeting for the day, you, the user,
What is the more normal mechanism for getting input from the user
DAVID: OK, so we did make something in order to get the input from the user.
And specifically, what was the tag or the terminology we used last week?
AUDIENCE: [INAUDIBLE].
Oh, no.
But yeah.
AUDIENCE: Is it input?
DAVID: So the input tag, inside of the form tag.
have a very simple index.html file that, by default, is going to simply ask
the user's name, this is the page I'm going to use to actually get input
The method I'm going to use for now is going to be, quote unquote, "get."
And I'm going to turn off autocomplete like we did last week.
I'm going to turn on auto focus, so it puts the cursor in the text box for me.
I'm going to give the name of this input the name, name.
Not to be too confusing, but I'm asking the human for their name.
So it makes sense that the name of the input should be, quote unquote, "name."
Then I'm just going to give myself, like last week, a submit button.
There we go.
and let me add, just like we did last week for it, Google.
and I'm going to have the user submit this form to a second route,
Greet feels like a nice operative word, so /greet is where the user will be
Let me reload this tab so that I get the very latest HTML and, indeed,
Hypotheses.
Yeah?
Yeah?
AUDIENCE: 404?
DAVID: 404?
when I click.
standardizing--
And all three of you are right, in effect, 404 not found.
And then, inside of-- under this, let me define another function.
and I want to pass, into it, the name that the human just typed in.
All right, so now if I go up and reload the page, what might happen now?
If I go ahead and hit reload or resubmit the form, what might happen now?
Any instincts?
Now it's worse, and this is the 500 error, internal server
error that I promised next week we will all encounter accidentally, ultimately.
Because it's an internal error, this means something's wrong with your code.
So the route was actually found because it's not a 404 this time.
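These status codes are standardized, and Python's standard library knows their
names too. Here's a quick sketch for looking them up; the function name
describe is made up for illustration.

```python
from http import HTTPStatus


def describe(code):
    """Return a status line like the ones a server sends back."""
    status = HTTPStatus(code)
    return f"{status.value} {status.phrase}"


print(describe(404))
print(describe(500))
```

Codes in the 400s generally mean the request was at fault, like asking for a
route that doesn't exist, whereas codes in the 500s mean the server's own code
went wrong.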
But if we go into VS Code here and we look at the console, the terminal
Do I want to do this?
Oh, standby.
Come on.
There we go.
Come on.
Here, though, is the status code that the server returned, 500.
Well, here's where we get these annoying pretty cryptic Python messages
And again, representative of how you might diagnose problems like these,
let me go into my terminal window.
Inside of this, I'll have the head tag, inside of here, I'll have the meta.
The content of which is initial scale equals one, width equals device width.
And then here, in the body, I'm going to have hello comma name.
So I could have kept around the old version of this, but I just recreated,
You have to run Flask wherever app.py is, not in your templates directory.
and I get index.html's form, now I type in David and click Submit,
And now we have a full-fledged web app that has two different routes,
slash and /greet, the latter of which takes input like this and then,
I could require that the user have input on the previous page,
But there's another mechanism I can use that I'll just show you.
Oh, interesting.
Oh.
Suppose I just get rid of name altogether like this and hit Enter.
would be to go in here, in my HTML, and say that the name field is required.
you should never, ever, ever rely on client side safety checks like this.
Because we know, from last week, that a curious programmer can go to inspect,
Not a big deal with a silly little greeting application like this.
provide input that is necessary for the correct operation of the site,
you don't want to trust that the HTML is not altered by some adversary.
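As a minimal sketch of the server re-checking input regardless of what the HTML marked as required, the following uses only Python's standard library rather than Flask. The function name `validate_name` and the sample request bodies are assumptions for illustration, not code from the lecture.

```python
from typing import Optional
from urllib.parse import parse_qs


def validate_name(body: str) -> Optional[str]:
    """Return the submitted name, or None if it is missing or blank.

    `body` is a URL-encoded form submission such as "name=David".
    """
    # keep_blank_values=True so an empty field shows up as "" instead of vanishing
    fields = parse_qs(body, keep_blank_values=True)
    name = fields.get("name", [""])[0].strip()
    return name or None


# A browser that honors `required` would send something like this:
print(validate_name("name=David"))
# Someone who deleted `required` in the inspector can still send these:
print(validate_name("name="))
print(validate_name(""))
```

The check runs on the server, so editing the page in the browser cannot skip it.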
Yeah.
[INAUDIBLE]
DAVID: Sorry?
AUDIENCE: [INAUDIBLE]
DAVID: No.
What you should really do is something we're going to do with another example
only going to handle one of the scenarios that I was worried about.
my second template.
What's bad or dumb about this design of these two templates alone?
And there's a reason, too, that I bored us by typing it out that second time.
Yeah?
And little things did change along the way, like the title
And God forbid we have a third template, a fourth template, a hundredth template
you're going to have to change it now in two, three, a hundred different places
instead.
And let me go ahead, and per your answer, copy all of those
I don't want to give every page the same title, maybe, but for now that's OK.
But in the body of the page, what I'm going to do here is just
The other template syntax we saw before was the two curly braces.
There's this other syntax with Flask that allows you to, say, a single curly
brace, a percent sign, and then some functionality like this defining
a block.
nothing between the close curly and the open curly brace here.
Let me now go into my index.html, which is where I borrowed most of that code
The only thing that's really different in this page, title aside, is the form.
Index.html, first line now says, hey, Flask, this file extends layout.html,
put a default value there just in case some page does not have a body block.
I'm going to cut this content and get rid of everything else.
and then I'm going to have my body block here simply be this one line of code.
And then I'm going to go ahead and end that block here.
and these now curly braces with percent signs, is an example of Jinja syntax,
we're going to use these other people's syntax called Jinja syntax.
of code.
So Flask is using this syntax, but other libraries and other languages
that's already implicit in the fact that we have the extends keyword.
And just as a little check, let me view the source of the page
It's not quite pretty printed in the same way, but that's fine.
That's fine.
And we'll see almost the same thing with what's plugged in there.
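The idea of factoring shared markup into one layout can be sketched with Python's standard library alone. This is only an analogy: it uses `string.Template`, not Jinja, and the names `LAYOUT` and `render_page` and the markup itself are assumptions for illustration.

```python
from string import Template

# One shared layout, playing the role of layout.html; $title and $body are
# the slots that each page fills in, loosely like Jinja blocks.
LAYOUT = Template(
    "<!DOCTYPE html>\n"
    '<html lang="en">\n'
    "<head><title>$title</title></head>\n"
    "<body>$body</body>\n"
    "</html>"
)


def render_page(body: str, title: str = "hello") -> str:
    """Plug a page-specific body into the shared layout."""
    return LAYOUT.substitute(title=title, body=body)


index_page = render_page('<form action="/greet"><input name="name"></form>')
greet_page = render_page("hello, David")
print(greet_page)
```

A change to the head of the page now happens in one place, and every page picks it up.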
Yeah?
AUDIENCE: Is what we did just better for design or for memory [INAUDIBLE]?
Both.
And as you saw with home page, often, in the head of your page,
you might want to include some CSS files like Bootstrap or something else.
you would literally have to go into three, four, a hundred different files
Flask is probably doing that, but not in the mode we're using it.
All right, so let me ask a question, not just in terms of the code design.
Why is this maybe not the best design for users, how I've implemented this?
Yeah?
DAVID: Yeah.
Not a big deal if it's your name, but if it's your password, your credit
You just don't want to expose yourself or your users to that kind of risk.
And in my form, I can just change the method from GET to POST.
Because now, if I go ahead and run Flask again after making that change,
and I now reload the form to make sure I have the latest version.
Why is that?
And if I now restart Flask, so Flask run, Enter, and I go back to this URL.
Type David and click Submit now, now I should see hello, world.
Notice that I'm at the greet route, but there's no mention of name
It's a simple change, but whereas GET puts things in the URL, POST does not.
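To see where the input travels in each case, here is a small sketch that builds (but never sends) one GET request and one POST request with Python's standard library. The address `http://localhost:5000` is an assumption standing in for wherever the app happens to be running.

```python
from urllib.parse import urlencode
from urllib.request import Request

form = {"name": "David"}
encoded = urlencode(form)  # the key-value pair as text: name=David

# GET: the key-value pair rides along in the URL itself.
get_request = Request("http://localhost:5000/greet?" + encoded, method="GET")

# POST: the URL stays clean; the same pair travels in the request body.
post_request = Request(
    "http://localhost:5000/greet", data=encoded.encode(), method="POST"
)

print(get_request.full_url)   # the name is visible in the URL
print(post_request.full_url)  # no name here
print(post_request.data)      # the name is in the body instead
```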
Any thoughts?
Right, because it's obnoxious to be putting any information in URLs
DAVID: Yeah.
I mean, if you get rid of GET requests and put nothing in the URL,
Because none of the information is there for storage, so you can't just
And there's this other symptom that you can see here.
look different in Safari and Firefox and Edge and Chrome here, confirm form.
args
So your browser might remember what your inputs were and that's great,
om/search?q=what+time+is+it.
Because it's going to grab the information, the key value pair,
from the URL, send it to Google server, and it's just going to work.
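A short sketch of what "grabbing the key-value pair from the URL" amounts to, using Python's standard library. The full URL below is an assumption based on the link discussed here.

```python
from urllib.parse import parse_qs, urlsplit

# Assumed full form of the search link under discussion.
url = "https://www.google.com/search?q=what+time+is+it"

parts = urlsplit(url)
params = parse_qs(parts.query)

print(parts.path)   # /search
print(params["q"])  # ['what time is it']
```

Because everything the server needs is in the URL, the link can be pasted into an email and still work.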
wanted people to very quickly be able to check what is the current time.
And so I can sort of automate the process of creating a Google search for you,
but that you induce when you click that link.
If Google did not support GET, they only supported this, the best I could do
I would have had to add to my email, by the way, type in the words
So there, too, we might have design when it comes to the low level code,
but also the design when it comes to the user experience, or UX,
All right, any questions, then, on this, our first web application?
Super simple, just gets someone's name and prints it back out.
And when we come back, we'll add to this some first year intramural sports.
But there was a subtle bug or change here that we didn't call out earlier.
I did type David into the form and I did click Submit,
Intuitively, that if I'm seeing hello, world, that's the default value
But if you focus on, really, the first principles of last week,
And let me give you one other mental model, now, for what it is we're doing.
that all implement the same paradigm, the same way of thinking
And here's a very simple diagram that represents the process that you
And actually, this is more than we've been implementing thus far.
business logic that makes all of the decisions, decides what to render,
Those things are dumb, they pretty much just say plop some values here.
And in your view is where your HTML and your Jinja code, your Jinja templating,
the curly braces, the curly braces with percent signs, usually is.
The model is where you keep actual data, typically long term.
where you have one of these-- each of these components communicating with one
What we're teaching today, this week, is not really specific to Python.
It's not really specific to Flask, even though we're using Flask.
It really is a very common paradigm that you
and create a new folder altogether after closing these files here.
Neither of those two courses taught web programming back in the day.
And I learned a little something about CSV files, and I sort of read enough--
can't even say googled enough, because Google didn't come out
Read enough online to figure out how to make a web application so that students
and then walk it across the yard to Wigglesworth Hall, one of the dorms,
1996, 1997.
because we did not have the features that JavaScript and CSS nowadays have.
So it was really just HTML, and it was really just controller code written,
And then let's go ahead and define a route for slash for instance first.
I'm going to define a function called index.
just the function that will get called for this particular route.
called index.html.
So I just need to say block body, and then in here, I can just say to do
Let me reload.
To do.
maybe got a dropdown menu for all of the sports for which you can register.
how about an H1 tag that just says register so the user knows what it is
just because it's not really necessary to put this kind of information
in the URL.
In here, let me go ahead and create, how about an input with autocomplete
going to ask the student for their name using placeholder text of quote
unquote "name."
But if you've not seen this yet, let's create a select menu,
sports for the fall, which are basketball, and another option
I haven't implemented my route yet, but this feels like a good time
the server that we'll do for you for problem set nine,
Let me reload my index route and OK, it's not that pretty.
Let me create an empty option up here that, technically, this option is not
But it's just going to have a word I want the human to see,
so I'm actually going to disable this option and make it selected by default.
And there's different ways to do this, this is just one way of creating,
essentially, a--
whoops, option.
Submit button.
Getting better.
Recall that we can change some of these HTTP-- these HTML attributes.
All right, so now we really have the beginnings of the user interface
that I created some years ago to let people actually register for the sport.
So let's go, now, and create maybe the other route that we might need.
let's do a little bit of error checking which I promised we'd come back to.
And then let's go ahead, and in the greet function, let's go ahead
Basketball, the other one was soccer, and the last was ultimate frisbee.
Getting a little long, but notice what I'm-- the question I'm asking.
or ultimate frisbee, which I've defined as a Python list, then let's go ahead
And that's just going to be some error message inside of that file.
So let me extend layout.html and in the block body, you are not registered.
I'll just yell at them like that so that they know something went wrong.
And then let me create one other file called success.html, that
And I'm just going to say for now, even though they're not
Unintentional.
Anyone?
Ironically, the function could be greet, because that actually doesn't matter.
But to keep ourselves sane, let's use the one and the same words there.
oh my God.
don't have any need for a GET in this context, I can just do POST.
Register.
You are not registered.
Fine, I'm going to go ahead and be David with ultimate frisbee register.
Huh.
OK.
How to debug something like this, which is my third and final unintended,
unforced error?
is, all right, if my form thinks that it's missing a name or a sport,
I've now given a second input in the form of the select menu.
But what seems to be missing here that I'm assuming exists here?
just to make sure it's all defaults again, type in my name and type
And there.
So I can emphasize--
Go back to the basics, go back to what HTTP and what HTML forms are all about,
There's only a finite number of ways I could have screwed that up.
Yeah?
Previously, it was just the reality that I had this user input dropdown menu,
So if there's no name, there's no key to send, even if the human types a value.
It would be like nothing equals ultimate frisbee, and that just doesn't work.
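The bug can be mimicked in a few lines. This sketch imitates, in simplified form, how a browser serializes a form: any control without a `name` attribute is left out. The helper `serialize_form` is hypothetical and is not part of any library.

```python
from urllib.parse import urlencode


def serialize_form(controls):
    """Mimic a browser building a submission: controls without a name are skipped.

    `controls` is a list of (name, value) pairs; name is None when the HTML
    element has no name attribute.
    """
    return urlencode([(name, value) for name, value in controls if name])


# The select menu has no name attribute, so its value never leaves the browser.
buggy = serialize_form([("name", "David"), (None, "Ultimate Frisbee")])
# Adding name="sport" to the select gives the value a key to travel under.
fixed = serialize_form([("name", "David"), ("sport", "Ultimate Frisbee")])

print(buggy)  # name=David
print(fixed)  # name=David&sport=Ultimate+Frisbee
```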
Let me rerun Flask down here and reload the form itself
View Developer Tools, and then let me watch the Network tab, which recall,
And we also played around with Curl, which let us see the HTTP requests.
done if I still wasn't seeing the error and was really embarrassed on stage.
I would have typed in my name as before, I would have chosen ultimate frisbee.
And just like we did last week, I would go down to the request down here.
I'm just not sending the sport, even if the human typed it in.
Like good programmers, web developers are using these kinds of tools
Yeah.
in HTML, once you have to fix the template, how do you do that?
DAVID: So how would you edit CSS if you have these templates?
Just to give you a teaser for this, and you'll do this in the problem set,
but we'll give you some distribution code to automate this process.
CSS rel equals style sheet, that's one of the techniques we showed last week.
The only difference today, using Flask, is that all of your static files,
by convention, should go in your static folder.
And in here, I could say background color, say FF0000 to make it red.
DAVID: If you want to change one page and not the other in terms of CSS?
AUDIENCE: Yes.
In that case, you might want to have different CSS files for each page
if they're different.
You could use different classes in one template than you did in the other.
I'm going to go ahead and just remove the static folder just so as
And let's go ahead and just play around with a different user interface
mechanism.
Before we even get into the checkboxes, there's one subtle bad design here.
Notice that I've hardcoded basketball, soccer, and ultimate frisbee here.
And if you recall, in app.py, I also enumerated all three of those here.
And any time you see copy paste or the equivalent thereof,
meant to be constant even though Python does not have constants, per se.
going to hint at the power of templating and Jinja, in this case here.
Let me go ahead and get rid of all three of these hard coded options
and let me show you some slightly different syntax for sport, in sports.
So you have a start and an end to your block without indentation mattering.
And now I still have a sport dropdown and all of those sports
another sport, for instance, that gets added, like say football,
And if I reload the form now and look in the dropdown, boom,
now, in this template, I'm using Jinja's for loop syntax, which
and for.
Iterating over something with a for loop lets you generate more and more HTML.
When you visit your inbox and you see all of this big table of emails,
and are just outputting table row after table row or div after div dynamically.
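What the template's for loop does can be sketched in plain Python. This is not Jinja itself; the function `render_options` is an assumption for illustration, and the sports list mirrors the one from the lecture.

```python
from html import escape

SPORTS = ["Basketball", "Soccer", "Ultimate Frisbee"]


def render_options(sports):
    """Build one <option> tag per sport, much as the template's loop does."""
    lines = []
    for sport in sports:
        safe = escape(sport)  # guard against characters like < or & in a name
        lines.append(f'<option value="{safe}">{safe}</option>')
    return "\n".join(lines)


print(render_options(SPORTS))
# Adding a sport in one place is enough; the markup follows automatically.
print(render_options(SPORTS + ["Football"]))
```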
Instead of a select menu, I'm going to go ahead and do something like this.
For each of these sports let me go ahead and output, not an option,
the name for which is quote unquote "sport," the type of which
but it's going to allow users to sign up for multiple sports at once now,
it would seem.
If I view the page's source, this is, again, the power of templating.
I didn't have to type out four inputs, I got them now automatically.
And these things all have the same name, but that's OK.
It turns out with Flask, if it sees multiple values for the same name,
it's going to hand them back to you as a list if you use the right function.
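The same idea can be seen with Python's standard library, which also collects repeated names into a list. The request body below is an assumed example of what a browser might send; in Flask, the analogous call is `request.form.getlist("sport")`.

```python
from urllib.parse import parse_qs

# What a browser might send when two checkboxes named "sport" are ticked.
body = "name=David&sport=Soccer&sport=Ultimate+Frisbee"

fields = parse_qs(body)
print(fields["name"])   # ['David']
print(fields["sport"])  # ['Soccer', 'Ultimate Frisbee']
```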
All right, but suppose we don't want users registering for multiple sports.
And because I've given each of these inputs the same name, quote unquote,
The browser knows all four of these things are types of sports,
therefore I'm only going to let you select one of these things.
And that's simply because they all have the same name.
Again, if I view page source, notice all of them, name equals sport,
name equals sport, name equals sport, but what differs is the value
All right.
that I made in advance that's going to now start saving the information.
of where this website was, which actually allowed the proctors to see,
and let me go into what I call version three of this in the code for today.
Here, again, is where dictionaries are just such a useful data structure.
Why?
assuming a model where you can only register for one sport for now.
But notice I've started to make the user interface more expressive.
I'm telling the user, apparently, with a message what they did wrong.
Well how?
here's a new file I created here, that adorably is apparently going to have
a grumpy cat as part of the error message, but notice what I've done.
In my block body I've got an H1 tag that just says error, big and bold.
And then just for fun, I have a picture of a grumpy cat connoting
If there's no such sport, that is the human did not check any of the boxes,
Else, if the sport they did type in is not in my sports global variable,
using the name the human typed in as the key and assigning it a value of sport.
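The validation and the dictionary described here can be sketched without Flask. The function returns an error message instead of rendering a template, and the exact messages other than "Missing name" are assumptions.

```python
SPORTS = ["Basketball", "Soccer", "Ultimate Frisbee"]
REGISTRANTS = {}  # maps each name to one sport; lives only in memory


def register(name, sport):
    """Validate the inputs; return an error message, or None on success."""
    if not name:
        return "Missing name"
    if not sport:
        return "Missing sport"
    if sport not in SPORTS:
        return "Invalid sport"
    # The name is the key and the sport is the value.
    REGISTRANTS[name] = sport
    return None


print(register("", "Soccer"))           # Missing name
print(register("David", "Volleyball"))  # Invalid sport
print(register("David", "Ultimate Frisbee"))
print(register("Carter", "Basketball"))
print(REGISTRANTS)
```

Because the sport is checked against the server's own list, editing the page to add a sport like volleyball does not get it registered.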
So let's go down this rabbit hole. Let me go into templates/registrants.html.
This has a table head that just says name sport for two columns.
Then it has a table body where in, using this for loop in Jinja syntax,
output a table row, start tag, and end tag, inside of which,
It essentially is Python syntax, albeit with these curly braces and the percent
sign.
Register.
Register.
Missing name.
I'll type my name, sure, but let me go into the body tag down here.
volleyball.
Enter.
server side.
Clicking register.
Register.
Now we see two rows in this table, David, ultimate frisbee, Carter,
basketball.
Yeah.
DAVID: Yeah.
RAM is volatile.
It's thrown away when you lose power or stop the program.
Let me close these tabs and let me open up app.py now in version four.
Let's do .schema.
I have a table called registrants, which has one, two, three columns.
An ID column that's an integer, a name column that's text but cannot be null,
I'm getting the user's inputted name, the user's inputted sport,
So I kept it simple.
makes it a little easier to execute SQL queries and we're executing this.
What two values, the name and the sport, that came from that HTML form.
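Here is a minimal sketch of saving a registrant to a database with Python's built-in `sqlite3` module rather than the library used in the lecture. The `NOT NULL` on the sport column and the in-memory database are assumptions made to keep the example self-contained.

```python
import sqlite3

# ":memory:" keeps this sketch self-contained; a real app would pass a file
# path so the rows survive a restart.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE registrants ("
    "id INTEGER, name TEXT NOT NULL, sport TEXT NOT NULL, PRIMARY KEY(id))"
)


def register(name, sport):
    # The ? placeholders keep user input separate from the SQL text.
    db.execute("INSERT INTO registrants (name, sport) VALUES (?, ?)", (name, sport))
    db.commit()


register("David", "Ultimate Frisbee")
register("Carter", "Basketball")

rows = db.execute("SELECT id, name, sport FROM registrants ORDER BY id").fetchall()
print(rows)
```

The id is never supplied by the form; the database assigns it, which is what later makes each row individually addressable.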
And then lastly, and this is a new function that we're calling out
and all these other sites we played around with last week
were all implemented by redirecting the user from one place to another.
It handles the HTTP 301 or 302 or 307 code, whatever the appropriate one is.
It does that for me.
from registrants.
Instead of just having two columns with the person's name and sport,
run Flask, and actually see what this example looks like now.
All right.
OK.
I have a page at the route called /registrants that has a table with two
Why?
Because if I view the page source, notice that it's not the prettiest UI.
For every row in this table, I'm also going to be outputting a form just
But before we see how that works, let me go ahead and register Carter,
for instance.
Again, register.
And whereas, previously, when I executed this there were zero people, now
But how do we go about linking a web page with Python code with a database?
Up until now, everything's been with forms and also with URLs.
Any time you're curious how a website works, let me go to the Network tab.
What is it?
If I go into app.py and I look at my deregister route now, the last of them,
I first go into the form, and I get the ID that was submitted, hopefully.
If there was, in fact, an ID, and the form wasn't somehow empty,
and then I plug in that number, deleting Carter and only Carter.
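A small sketch of deleting by id rather than by name, again with the built-in `sqlite3` module. The sample rows, including a second Carter, are made up to show why the id matters.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE registrants ("
    "id INTEGER PRIMARY KEY, name TEXT NOT NULL, sport TEXT NOT NULL)"
)
db.executemany(
    "INSERT INTO registrants (name, sport) VALUES (?, ?)",
    [("David", "Ultimate Frisbee"), ("Carter", "Basketball"), ("Carter", "Soccer")],
)


def deregister(registrant_id):
    """Delete exactly one row, identified by its unique id."""
    if registrant_id:  # ignore an empty or missing id
        db.execute("DELETE FROM registrants WHERE id = ?", (registrant_id,))
        db.commit()


deregister(2)  # removes the first Carter only
remaining = db.execute("SELECT id, name FROM registrants ORDER BY id").fetchall()
print(remaining)  # [(1, 'David'), (3, 'Carter')]
```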
And I'm not using his name, because what if we have two people named Carter,
but suppose I emailed her this URL, /deregister?ID=3, and I said, hey,
Yeah?
AUDIENCE: Deregistering?
Why?
and the URL contains her ID just because I'm being malicious,
And in this case, it's enough to delete the user and boom,
be plugged into emails sent via Slack messages, text messages, or the like.
that they shouldn't have, because the website was using GET alone.
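One way to picture the fix is a handler that refuses to change anything unless the request is a POST. This is a simplified sketch, not Flask code: the function, the status codes it returns, and the sample data are assumptions. It closes the clickable-link hole described above and is not meant as a complete defense.

```python
REGISTRANTS = {3: "a classmate"}  # hypothetical id-to-name data


def deregister(method, registrant_id):
    """Return an HTTP-style status code; only POST may delete."""
    if method != "POST":
        return 405  # Method Not Allowed: merely following a link changes nothing
    REGISTRANTS.pop(registrant_id, None)
    return 302  # redirect back to the list of registrants


print(deregister("GET", 3))  # 405
print(REGISTRANTS)           # {3: 'a classmate'}
```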
Yeah.
Which file?
AUDIENCE: It's in [INAUDIBLE] scroll up.
[INAUDIBLE]
AUDIENCE: Yeah.
DAVID: OK.
AUDIENCE: [INAUDIBLE].
OK, sorry.
That's all.
here just to show what I was actually doing too, back in the day,
and let me go ahead and open up, say, app.py this time.
And this is some code that I wrote in advance.
And it looks a little scary at first glance, but I've done the following.
comes with Flask that is automatically created when you create the app up here
want to send email as, the default password I want to use to send email,
the port number, the TCP port, that we talked about last week.
So for security purposes, I didn't want to hard code my own Gmail username
Register.
And if I did this correctly, not only is John Harvard, on his screen,
seeing you are registered, but when he checks his email on this other screen,
Horrifying.
I just tried submitting again, so I just did another you are registered.
AUDIENCE: [INAUDIBLE]
not sure we want to show spam here on the internet that every one of us gets.
Oh, maybe.
Oh!
Thank you.
OK.
All right, so you are registered is the email that I sent out,
So let's just take a quick look at how that code might work.
But besides setting these things, let's look at the register route down here.
which takes a list of emails that should get the confirmation email.
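The shape of such a message can be sketched with Python's standard `email` package instead of the Flask mail extension used in the lecture. Nothing is sent here. The environment variable name `MAIL_DEFAULT_SENDER` and the example addresses are assumptions.

```python
import os
from email.message import EmailMessage

# Read the sender from the environment instead of hard-coding it.
# MAIL_DEFAULT_SENDER is an assumed variable name; the fallback is a placeholder.
sender = os.environ.get("MAIL_DEFAULT_SENDER", "sender@example.com")


def build_confirmation(recipients):
    """Create (but do not send) a confirmation email for a list of addresses."""
    message = EmailMessage()
    message["Subject"] = "You are registered!"
    message["From"] = sender
    message["To"] = ", ".join(recipients)
    message.set_content("You are registered!")
    return message


message = build_confirmation(["registrant@example.com"])
print(message["To"])
print(message["Subject"])
```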
So let's scroll back up to see what message and what mail actually is.
You simply configure your current app with Mail support, capital M here.
Capital Mail, capital Message, so that I had the ability to create a message
So such a simple thing whether you want to confirm things for users,
All right.
So what other pieces might actually remain for us? Let me flip over here.
but it'll be one of our final flourishes today, is the notion of a session.
from all of the basics we talked about today and last week,
and a session is the technical term for what you and I know as a shopping cart.
When you go to amazon.com and you start adding things to your shopping cart,
Heck, if you close your browser and come back the next day,
they're typically still in your shopping cart, which is great for Amazon
They don't want you to have to start from scratch the next day.
even if it's not an e-commerce thing but it has usernames and passwords,
Typically, you log in once, and then for the next hour, day, week, year,
that you might know as, and worry about, called cookies.
Let's go ahead and take one more five minute break here.
All right.
remembers information.
a stateless protocol.
You can unplug from the internet, you can turn off your Wi-Fi,
And yet we somehow want to make sure that the next time you
and we can actually do this using the building blocks we've seen thus far.
So concretely, here's a form you might see occasionally, but pretty rarely,
And I say rarely because most of you don't log into Gmail frequently,
you just stay logged in, pretty much endlessly, in your browser.
want to add friction to using their tool and making you log in every darn day.
including some of CS50's own, that makes you log in every time.
So once you do fill out this form, how does Google subsequently
know that you are you, and when you reload the page even
how do they know that you're still David or Carter or Emma or someone else?
that, last week we didn't care about, this week, we now do.
and they track you in some way, and that's both a blessing and a curse.
Without cookies, you could not implement things like shopping carts and log-ins
like tracking you on every website and serving you ads more effectively and so
forth.
So you can think of it like a file that a server is planting on your computer.
they plop a cookie on your computer like this with some session
And when you then visit another page on gmail.com or any other website,
you send the opposite header, not set cookie, but just cookie colon,
they very often take like a little stamp and say, OK, now you can come and go.
And then for you, efficiency-wise, if you come back later in the day
Now, unlike this hand stamp, which can be easily copied or transferred
these cookies are really big, seemingly random values, letters and numbers.
is just going to guess your cookie value and pretend to be you. It's just
very low probability, statistically.
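The exchange of headers can be sketched with Python's standard library. The cookie name `session` and the 32-byte length are assumptions for illustration.

```python
import secrets
from http.cookies import SimpleCookie

# Server side: mint a long random value and plant it with a Set-Cookie header.
session_id = secrets.token_hex(32)  # 64 hex characters, impractical to guess
outgoing = SimpleCookie()
outgoing["session"] = session_id
set_cookie_header = outgoing.output()  # "Set-Cookie: session=<value>"

# Browser side: on later requests, the same value comes back in a Cookie header.
cookie_header = f"session={session_id}"

# Server side again: parse the header and recognize the returning visitor.
incoming = SimpleCookie()
incoming.load(cookie_header)
print(set_cookie_header.startswith("Set-Cookie: session="))  # True
print(incoming["session"].value == session_id)               # True
```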
But this is all it boils down to is this agreement between browser and server
and implement sessions, I'm going to have to import from Flask support
for sessions.
So in fact, whenever you use sessions as you will for problem set nine,
Let's see what this application actually does before we dissect the code.
Let me go over to my terminal window, run Flask run, and then let me go ahead
Come on.
There we go.
So somehow, my code knows, hey, if you're not logged in, you're going
to /login instead.
Chrome is sort of annoyingly hiding it, but this is the same thing as just
a single slash.
Log out.
What's cool is notice if I reload the page, it still knows that.
If I create a second tab and go to the same URL, it still knows that.
I could even--
What's in index.html?
at templates/index.html.
So we haven't seen this yet, but it's more Jinja stuff, which
then literally say you are logged in as curly braces session bracket name.
And then notice this, I've got a simple HTML link to log out via /logout.
then it apparently says you are not logged in and it leads me to an HTML
link to /login and then endif.
Recall that HTML and CSS don't really care about indentation,
and these that handle that whole process of stamping every user's hand
visit the exact same URL, all of you would be logged out by default.
here and my login directory, notice the Flask session directory I mentioned.
And if I CD into that and type ls, notice that I had two tabs open,
is your name.
My login route supports both GET and POST, so I could play around if I want.
Why?
Because that's how I'm going to design the HTML form in a second.
literally, slash login, by default, when you visit a URL like that,
And this is a nice way of having one route but for two different types
of operations or views.
When I'm just there visiting /login via URL, it shows me the form.
But if I submit the form, then this logic, these three lines, kick in,
and this just avoids my having to have both an index route and a greet route,
for instance.
I can just have one route that handles both GET and POST.
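The branching can be sketched as a plain function, independent of Flask. The return strings stand in for rendering a template or redirecting, and the function signature is an assumption for illustration.

```python
def login(method, form, session):
    """One handler, two behaviors, chosen by the HTTP method."""
    if method == "POST":
        # The form was submitted: remember the name, then send the user home.
        session["name"] = form.get("name")
        return "redirect:/"
    # A plain visit via URL is a GET: just show the form.
    return "render:login.html"


session = {}
print(login("GET", {}, session))                  # render:login.html
print(login("POST", {"name": "David"}, session))  # redirect:/
print(session)                                    # {'name': 'David'}
```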
None, which is Python's version of null, essentially, and then redirect the user
back to slash.
And so I'll tell the user instead, you are not logged in.
So like it's--
And we skip the password name step for that, more on that in problem set nine,
but this is how every website out there remembers that you're logged in.
as you use in Python lines like this and lines like this,
Flask takes care of stamping the virtual hand of all of your users
and whenever Flask sees the same cookie coming back from a user,
so that your code is now unique to that user and their name.
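Conceptually, the server keeps one private dictionary per cookie value. The sketch below is a simplified model of that bookkeeping, not how Flask is actually implemented; `SESSIONS` and `get_session` are hypothetical names.

```python
import secrets

SESSIONS = {}  # cookie value -> that visitor's private dictionary


def get_session(cookie_value=None):
    """Return (cookie_value, session dict), stamping new visitors with a fresh value."""
    if cookie_value not in SESSIONS:
        cookie_value = secrets.token_hex(32)
        SESSIONS[cookie_value] = {}
    return cookie_value, SESSIONS[cookie_value]


# Two browsers visit the same URL and get two different stamps.
david_cookie, david_session = get_session()
carter_cookie, carter_session = get_session()
david_session["name"] = "David"
carter_session["name"] = "Carter"

# When David's cookie comes back, the server finds David's data, not Carter's.
_, returning = get_session(david_cookie)
print(returning["name"])  # David
```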
that'll show how we might use these, now, for shopping carts.
But each of these books has a button via which I can add it to my cart.
uses a for loop in Jinja to iterate over a whole bunch of books, apparently,
So that's interesting.
Let's go ahead and open up app.py, because that must be-- excuse me,
And I'm going to pass that list of books into my books.html template, which
It's a book-- it's a table called books with two columns, ID and title.
There are the seven books, each of which has a unique ID.
is just a form.
you to specify a value without the human being able easily to change it.
type equals hidden will put the value in the form but not reveal it to the user.
So that's how I'm saying that the ID of this book is one,
the ID of this book is two, the ID of this book is three, and so forth.
to /cart using POST and that would seem to be what adds things to cart.
Here's my cart.
All right, let's go back and let's add the book number two.
But long story short, these lines here do ensure that the cart exists.
It makes sure that there's a cart key in the session, global variable,
Why?
But if the user visits this route via POST and the user did provide an ID,
they didn't muck with the form in any way and try to hack into the website,
they gave me a valid ID, then I'm going to use this syntax.
So I'm going to add the ID to the list and return the user to cart.
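The cart logic can be sketched with an ordinary dictionary standing in for the per-user session. The function name `add_to_cart` is an assumption; ids are strings here because form values arrive as text.

```python
def add_to_cart(session, book_id):
    """Append a book id to this visitor's cart, creating the cart if needed."""
    if "cart" not in session:
        session["cart"] = []
    if book_id:  # skip a missing or empty id
        session["cart"].append(book_id)
    return session["cart"]


my_session = {}
add_to_cart(my_session, "1")
add_to_cart(my_session, "2")
print(my_session)  # {'cart': ['1', '2']}
```

Since each visitor has a separate session, each one gets a separate cart list.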
I am storing, in the cart, the books that I myself have added to my cart.
which is how this website knows that it's me adding these books to my cart
Indeed, if all of us visited the same long URL and I made it public
and allowed that, then we would all have our own illusions
All right.
Yeah.
this maybe only appeals to the geek in us, but having clean URLs
is actually a thing.
it's nice if the URLs are nice and succinct and canonical, if you will.
only, and not in multiple routes, one for GET, one for POST.
So what this code here means is that this route, this function,
And the simplest way to do that is just to check this value here.
If it's a POST, that must mean, because I created the web form that uses POST,
Then, I just want to show the user the contents of the cart
So it's just one way of avoiding having two routes for two different HTTP
verbs.
You can combine them so long as you have a check like this.
to necessarily master how you yourself would write the Python code, the SQL
code, the JavaScript code, but just to give you a mental model for how
you at least have the bare bones of a mental model for how
Even though our focus, generally, has been more on Python and SQL
Let me go ahead and open up an example called shows, version zero of this.
And let me go into my URL here and see what this application looks
like by default. This has just a simple query text box with a search box.
Let's take a look at the HTML that just got sent to my browser.
It's going to use a q parameter, just like Google it seems, and submit it.
So this actually looks like the Google form we did last week.
Enter.
What I've gone ahead and done is I've grabbed all of the titles of TV shows
and I loaded them into this demo so that you can search
but then it would have to be a show called cat or called dog as opposed
And then, I'm passing all of those shows to a template called search.html.
sorry, search.html.
I threw away all the other stuff like ratings and actors and everyone else
looks like this, which is a huge number of li tags, one for each cat
All right, so these days, though, we're in the habit of seeing autocomplete.
And you start typing something and you don't have to hit Submit,
you don't have to click a button, you don't have to go to a new page.
And it's almost the same thing, but watch the behavior change a little bit.
I can start it again and do dog, but notice how instantaneous it was.
And notice my URL never changed, there's no /search route,
But if I look at the source code here, notice that in the source code,
there's just an empty UL by default but there is some fancy JavaScript code.
which you used this past week, quote unquote "input," all right,
Then it's adding an event listener to that input for the input event.
I then have a function, no worries about this async function for now.
that let's just focus on the ideas and not the syntax.
slash search question mark q equals whatever the value of that input is.
When I get back a response, I want to get the text of that response
and store it in a variable called shows.
like await and await here, but for now, just focus on what came back.
A response came back from the server, I'm getting the text from it,
and I'm changing its inner HTML to be equal to the shows that
Let me go to the Network tab and let's just sniff the traffic going
I'm going to search for C. Notice that immediately triggered an HTTP request
to /search?q=c.
So I didn't even finish my cat thought, but notice what came back.
A bunch of response headers, but let's actually click on the raw response.
This is literally the response from the server, just a whole bunch of li tags.
Just li tags.
whoops, sorry.
slash search q equals c, we are just going to get back this stuff,
Once it gets back that response from the server, it's using these lines of code
Because if you've got a hundred shows or more, you're sending all of these tags
unnecessarily.
Why don't I just create those when I'm ready to create them?
where client and server keep talking to one another, Google Maps does this,
Let's actually use a format called JSON, JavaScript Object Notation, which
The user interface is exactly the same, and it still works exactly the same.
or a little worse.
did we do?
Yes, sort of recall that you can now have keys and values in JavaScript
I get back its ID and its title, its ID and its title, its ID and its title.
You get back very raw textual data in this format, JSON format,
and then you can write code that actually programmatically turns
that JSON data into any language you want, for instance, HTML.
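The round trip can be sketched end to end in Python. In the lecture the second half happens in JavaScript in the browser; here Python plays both roles. The show titles are made up, and the function names are assumptions.

```python
import json
from html import escape

# Made-up rows standing in for the id and title columns in the database.
SHOWS = [
    {"id": 1, "title": "Cat Example One"},
    {"id": 2, "title": "Dog Example"},
    {"id": 3, "title": "Cat & Mouse Example"},
]


def search(q):
    """Server side: return matching shows as a JSON string."""
    matches = [show for show in SHOWS if q.lower() in show["title"].lower()]
    return json.dumps(matches)


def to_list_items(json_text):
    """Client side: turn the raw JSON into <li> tags, escaping each title."""
    shows = json.loads(json_text)
    return "".join(f"<li>{escape(show['title'])}</li>" for show in shows)


print(search("cat"))
print(to_list_items(search("cat")))
```

The server sends only the data, and whoever receives it decides how to present it.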
I fetch slash search q equals whatever that input was, C or C-A or C-A-T.
I'm calling this other function that comes with JavaScript these days,
It turns it into a dictionary for me, or really a list of dictionaries for me,
and stores it in a variable called shows.
And this is where you start to see the convergence of HTML with JavaScript.
that I just got back from the server, that big chunk of JSON data.
Then let me dynamically add to this variable, an li tag, the actual title,
let me update the UL's innerHTML to be the HTML I just created on the fly.
because you won't need to use this unless you start playing
containing all of the open brackets, the li tags, the closed brackets,
but we're just grabbing the raw data from the server.
The data you're going to get back from that API is not
that you and I see and take for granted every day,
HTML and CSS are used to present the data, your so-called view.
Python might be used to send or get the data on the backend server.
but the whole point of problem set nine is to tie everything together,