Week 7: Monday - CS50 2011 - Harvard University
Skills:
PM Basics50%
Key Takeaways
Introduces file I/O, linked lists, stacks, queues, Valgrind, hash tables, trees, binary search trees, and tries in the context of CS50
Full Transcript
alright welcome back to cs50 this is the start of week 7 so it was actually around this time last year that I went into a bit of a rant when I was in the subway station in Harvard Square and there was this nice lady who was clearly foreign had never been to our subway system before and all she wanted to do was get from Harvard Square to park street pretty reasonable goal pretty straightforward route and so she walked up to this device here blurry only in as much as I'm bad at photography but this is these machines if you never use them before that they're very high-tech you can touch the screens and interact with them and actually buy yourself a train ticket unfortunately the sheer number of steps involved in buying something as simple as a one-way ticket to park street involve some seven plus steps so you walk up as I did with this woman who would turn to me or really anyone around for some help because she could not make sense of our American machines here you walk up you see this screen and you've got three huge buttons and the one you ultimately want is that one up there to purchase new Charlie ticket value tickets and passes press here now as an aside and from the perspective of user interface what is a Charlie ticket right we're already assuming that some outsider knows what we were talking about when we say Charlie ticket but we can forgive that and move on so now I'm asked what I would like to purchase this one is admittedly pretty straightforward so you click on the left to actually take the subway and then you're asked for your fare here and so that 12 is fairly straightforward and then you get to this point which is a curious one in that going downtown costs neither five dollars ten dollars nor twenty dollars so you have to click other amount if you just want that one way past which these days i think is like dollar fifty two dollars right so clearly an upsell attempt here and i think if we do the math even five dollars is not the cost of a round trip i think it's probably four dollars or so so we clicked other amounts this woman and I and then you get to this and God forbid you want to spend some triple digit amount of money just to get to park street so now you're on this screen and you input your number and finally you get to this screen which says you're buying an SV adult whatever the heck that is turns out it's single value but who cares I've an amount to pay four dollars now I can click credit card debit card and long straw worry short dear God it took us like five minutes just to buy a one-way ticket to park street now rewind to yesteryear just when I was here and we didn't have these fancy touchscreens we instead had these devices where you put your dollars in and you get a tea token a metallic token drop it in the slot and bam you're on your way like this is not an improvement I would wager and yet this is representative this kind of screen of a horrible user interface and sort of technology gone wrong and unfortunately technology and by extension computer science too often get a bad rap for being beyond the scope of most mortals because we're so often presented with crap like this right interfaces that are not optimized for the common case which I daresay is to get on subway and go somewhere not to go through these seven some odd steps so you'll find in problem set 5 s form this week to go online tomorrow as usual you'll be asked to keep an eye out this week or to think about some piece of hardware or software whether in subway stations and your own life on campus here at Harvard that could be better and we're going to off ask you for your simply real world's human instinct as to how this software or hardware device could be better in terms of its design now a recurring theme when we did this exercise for the last time last year is that towards a lot of Harvard centric things that were pointed out I'm sure some of you can think of various websites of which you are all too fond here within harvard.edu that could use some tinkering so do know that we will actually group some of your feedback by Harvard and non Harvard specific feedback so that we as a course will escalate your critiques your informed judgments of various tools so that hopefully collectively as University we can rise up and present to each other interfaces that are not quite like this those of you who are not freshmen but lived through webmail from college don harvard edu in years past will know that that system is no more replaced now with gmail and i dare say that in particular was a recurring theme in last year's survey and look that problem is gone so perhaps we as a class can similarly chip away this year so with that rant aside so it was a sad week over the past few days where we of course as a world lost Steve Jobs and to be honest i would say in the context of these BTW a system and one of the most amazing things Apple really has done is take technologies and make them simpler right Mac OS itself is not fundamentally any simpler i would say than windows these days but these phones that we have these ipads we have like even technophiles have fallen in love with these things because they just make things easier and we would be remiss in not pointing out another gentleman who unfortunately passed away last week named dennis ritchie this is the author of the c language with which you have been all too acquainted this semester and he was also one of the co contributors to the UNIX operating system which then gave rise to things like Linux and the like which we've been using in this course so there's a really nice of vitoria in the new york times if you google his name dennis ritchie but know that it's to this man to whom we know we owe most everything we've been talking about on this class so with that said what's on the horizon ahead so problem set 5 introduces not only the world of multimedia and imagery but also that of forensics and part of forensics will involve understanding how things like images work now it turns out that even though we look at images all day long on facebook or just your own computer in the form of JPEGs and gifs and pngs well you can actually represent images as you'll see in this week's pset fairly simply right if we have a trivial image that we want to represent like a smiley face here well you probably know before cs50 that an image is just a bunch of dots called pixels it's a grid horizontally and vertically of dots like this and so in the simplest form if all we wanted to do is implement a black-and-white smiley face you know we could kind of do this using weak ones material we can just auto arbitrarily say that the bit one will represent White's the bit 0 will represent black and so if we create a file using some kind of program or editor whereby we say 11 0 0 0 0 1 1 we could get the top row aka scan line of this Smiley's head there now of course images today are much more glamorous than these black and white images but all we do to met to represent things like JPEGs on facebook profiles is just more than a single bit per dot rather we typically use like 24 bits per dot to represent most any color that a human could possibly see and so you'll see that the problem set walks you through the progression from bit apps BMPs which this thing here is to JPEGs which you then need to recover as part of the forensic challenge but we also introduce some C and programming specific ideas so we talked briefly in past weeks about struck and this table here is kind of overwhelming but what this table here represents is what you can find at the start of any bitmap file BMP is a file format like jiff and jpg it's what the smiley face there was but it turns out that it's not sufficient to just put zeros and ones in a file and say okay this is a BMP rather file formats like the bitmap file format needs what will generally call metadata needs some hints so that programs like Photoshop Andres Trudeau for photo viewer or whatever you're using to open up graphics know a little something about how to interpret those bits right if you just hand a program zeros and ones it's completely out of context it's not clear if this is a word document a text file and image so this metadata as well as the file extension the last three letters of a file name like BMP or jpg typically provide this information now not all of this is all that interesting but you'll see at the top that you'll actually you'll see in the problem set that we do inquire about things like bi size and width and height so information as to well are all of these zeros and ones that you see this wide are they this tall and so forth all of that information is hidden in here and as an aside the reason for these new data types like word and D word and bite these are actually typically used in the Windows world when doing windows programming Microsoft simply has some of its own data types that there is a mapping to linux and unix data types like char and float and so forth so you'll see in b mph one of the files we give you this week that we just create synonyms for these various things forward and double word and long and bytes so realize that we have all the expressiveness now and see to do this and those numbers if not clear on the left just show you how far into the struct at what bite position this particular field is but without this information you're kind of screwed right unless you understand something about the format of a file it's pretty much impossible to read it unless you have these hints and similarly for the recovery part of this problem set where you have to recover a whole bunch of JPEGs that I accidentally deleted just knowing a little something about this header information this one again is bmps it's not JPEGs but we tell you a little bit about the JPEG format so that you can similarly recover like an investigator might a whole bunch of images and now in a hacker edition for those have interested this week realize that this is most likely the last of our hacker editions because we now kind of converge and I'll get onto the same page realize this one's actually a little more accessible than usual what you'll do is much the same problem set but when challenged to resize images bmps to make them bigger the Packer edition will also challenge you to make them smaller and making an image smaller is actually non-obvious where is to make something twice as big you can imagine conceptually just double all the pixels make everything twice as white ice is tall and then you can just do that for any factor but if you have to shrink an image and for instance you have a pixel well what do you do with that pixel pixels a pixel you can't shrink it so instead you have to throw information away if you want to shrink an image and so one of the challenges of the hacker edition is to decide which pixels should you throw away or if something's black and something's white should you maybe throw one of them away and just make one remaining pixel gray or something to that effect so there's a lot of opportunities there for design and there will also be opportunities as usual for office hours but / 2 weeks ago its announcement we're going to head across the river just this week to join the ilab for one of its inaugural events where we'll have the same iPad and the same approach to office hours but upon arrival you'll see a nice sexy Lobby like this they'll presumably be people by then behind the glass walls there and you'll pretty much be welcome to just spread out wherever you can certainly bring friends from other classes the goal is simply to hang out as a class get questions answered by the TFS ncas and eat pizza and soda which will be supplied gratefully by the ilab all right so quiz they're all so what we'll do i'll be there tonight if you have questions or want to walk through your quiz zero you can reach out to me directly we'll make one of the categories quiz zero questions tonight on q cs50.net and also throughout the week if you want to walk through your quiz with one of the TFS or CAS realize that we will make ourselves available for that and lastly if you're on that fence and you're here today at lecture but you're sort of planning to head to the Registrar by 5pm to withdraw from cs50 belize do catch me after class um for conversation if you're on that fence because realize that we are now pretty much more than halfway through the courses problem sets in pretty much halfway through the course so we would hate to lose you at this point all right now for something technical so one of the hardest things in C of late is dealing with things like memory and pointers and malloc and free and increasingly will this be an opportunity to really screw up your programs and make programs that crash and seg fault and cord them but thankfully just as we have tools to write programs we also have tools to fix or to diagnose or debug programs we've used or we've said you should use gdb thus far the new debugger but we also introduced this week a nice tool called valgrind whose purpose in life is to analyze your code and figure out if you've screwed up with regard to memory management did you accidentally leaked memory did you accidentally go beyond the boundaries of summer-rae now the tool is imperfect it can't necessarily perfectly test your code because of course any time you run programs you give them input and valgrind is not going to be able to try your program on an infinite supply of input so it's going to do its best guess but you'll find that it can help us find some things quite nicely let me go ahead and open up a file called memory see that's among our source code for today this is a pretty stupid program that's deliberately buggy and that it does the following so let's start from the bottom here is my main function takes no command line arguments apparently calls a function f and then return 0 so that looks ok but then let's stroll upward to the F function it returns void it takes no arguments in but it does in this first line call malloc and just as a check sanity check how many bytes get my locked in this highlighted line so 40 right if we assume an int is four bytes then four times 10 is of course 40 and then what just to be technical or restoring in X then address the address of that chunk of 40 bytes so that's sort of from two weeks back and then on the next line unfortunately I do something stupid why is this flawed this second line of code x bracket 10 gets 0 yeah perfect so the maximum index for this erase should be 9 from 0 through 9 because I've only allocated 10-inch so of course this is an off by one error or I'm overreaching the boundaries of my array in the worst case this program might crash and seg fault and core dump but not always and that's one of the frustrating things often if you just go ever so slightly outside of your memory bounds your memories balance the program might not actually crash because the operation operation systems being a bit generous and is allowing you if unofficially and in a dangerous way to go slightly beyond your arrays bound so in other words this is kind of hard to catch some time certainly just by visual inspection so let's instead try this let me open a turn on terminal window let me go ahead then and compile my code with make memory and now if I run this program it actually seems to be fine right no core dump no segmentation fault so at first glance all is ok but now instead let me run this new program valgrind and let me go ahead and run it on memory I still need the dot slash to say chest this program here and then hit enter unfortunately whoever first wrote valgrind decided that let's make its output a crazy mess but if you read between the lines you can actually see some juicy information so let me scroll back up to the very start of this and you'll see here's where we started this is where I ran the command and now we have some copyright information and stuff like that but here's where the helpful information is invalid right of size for at such and such an address now these pointers these addresses on the left hand side are generally not all that useful unless we really get low level with our debugging but it does mean that you did something wrong at that address specifically at line 23 of what file alright so memory dot C right so the output is actually quite similar to gdb and so if i go back to line 23 which is this guy here sure enough Algren has noticed that i wrote 4 bytes erroneously there now it's still up to me the human to determine where is my mistake and of course we already identified the mistake but the program found it even though when I ran this program nothing appeared to be wrong now lastly this here says this actually being dresses 0 by it's actually that's okay let me scroll down here and now if we go down here leak summary definitely lost 40 bytes in one block now this might be a much bigger program right now it's pretty simple it's only a few lines so I can actually run valgrind with more switches and notice it's telling me to rerun with hyphen hyphen leak hyphen check equals full to see details of leak to memory and in fact if I go back to my slide here what you should generally run even though it's a little tedious to type is exactly this valgrind dash V so I'm going to try this valgrind dash V dash V almost always means verbose for a Linux program some of these switches take hyphen hyphen not single hyphens the convention just FYI is if it's a single letter switch it's usually a single hyphen if it's a longer word like leak then you would actually use hyphen hyphen but that's just a human convention so leak check equals full and then I have to say a dot out in general but in the concrete case it's the program called memory so let me hit enter now and oh my god so much more with flew across the screen there but let's scroll up scroll up and here we go now because I ran it with more verbose output and I frankly knew where to look you would kind of have to take a few seconds to sift through this but notice 40 bytes in one blocks are definitely lost in Los record 1 of 1 I don't quite know what that means but it looks like the bug is in a function called malloc right all right so probably not right so you've called malloc so let's at least chase this bug back a little further in the stack and what was the previous function called ok it's probably me who screwed up so is somewhere in memory dot C line 22 a function called F was called and somewhere in there 40 bytes were lost of course in a program this small it's pretty obvious to the reader what I did wrong which is what how did I lose these 40 bytes I didn't call free right I should have called free at some point and of course this would be kind of a silly exercise to just do it now because this is again kind of a bogus program but that would be the correct fix just hopefully in the real world if I'm writing some real program there would be some useful stuff going on in between those lines but this one is just completely wrong going into bracket 10 is just incorrect so in short unglamorous program but a useful tool so henceforth any time you write any program whether for pset 5 or onward that uses malik that uses arrays in any way do run your program through that tool take a few seconds to sift through the output and it might find a bug frankly before your TF does any questions all right so now we have the opportunity to start to implement with these same fundamentals from the first half of the semester some real-world constructs so that we can begin to solve more interesting more challenging problems with more sophisticated tools than just an array or variable so this is a photograph from my favorite house Mathur and this is of course a stack of trays well it turns out in computer science a stack is a data structure it's a data structure that allows you to actually store data and potentially a more useful way then say a simple array does the idea of a stack just like in the real world is that if you have a stack of trays there's some upsides and some downsides one upside of a stack is that if I put something down like a tray and then I put something else down like a tray then I put something else down like a tray well it's very easy and very fast to get the most recent thing from that stack you can just pop it off the top of the stack so to speak unfortunately a stack of trays or a stack of paper here is not the most efficient system if you instead are for instance queuing up for something you really want right it would kind of suck if you got to the apple store at like 5am before or eat got even earlier to get your fancy new iphone three or four or whatever and the staff and blue shirt starts popping people off the cue from the back right that is a bad a data structure a bad design when it comes to selling iphones you would really piss off the people at the front of the line so instead in computer science there's an alternative data structure that's similar in spirit it's kind of a list of some sort but it's called technically a q-and-a q is first in first out so the acronym is FIFO fi fo first in first out just means if you get there first in the morning you're going to be the first one allowed to buy that product by contrast a stack is the opposite but we'll see utility of both of them in fact those of you familiar with HTML know at prevalence in HTML is the idea of balancing tags having an open tag and a close tag and the notion of validating HTML involves checking dude you close all of the tags that you opened well you can actually use it turns out a stack data structure to answer questions like that if you're the w3c validator now for those for whom most of those words made no sense not to fear in just a couple of weeks we'll be diving into the web programming portion of the course but to begin to now use these data structures and solve more interesting problems we need to be able to chain structures together one of the biggest problems of an array is that is even though it's super simple to use once you get the hang of it and that you can go to any elements in it what can you not do easily with an array sorry you can't extend it right you have to decide in advance if you want 10 bytes or hundred bytes or whatever it is that you're allocating you have to decide in advance and what happens if you decide oh I really need one additional element or twice as many elements what do you have to do if you're using arrays to store your data what's that yeah you pretty much have to make a new one copy the old data into the new one and then free the old one now we haven't done that ourselves really but we've been using something that does what function that we've been using since week one does exactly that when it runs out of space it allocates more copies old into new and then frees the old get string exactly remember getstring is a perfect example because we have no idea how long of a sentence or paragraph some students ever going to type when we sit down at the start of the semester to implement get string so we have to be dynamic we have to enable the function to grow the storage space that's using and we've seen as of two weeks ago that the matter the technique for that involves using malloc and it turns out some other related functions like realloc and the like but it boils down to allocating and reallocating memory so what if there were a data structure that allows you to have some fixed amount of memory first but then grow it super easily because the problem with using the array trick is what if your string is this big and you need one more bite or twice as many bytes this is a lot of work right you have to ask the operating system for more memory so temporarily you're doubling your memory footprint then you have to copy as in a 4 or a while loop all of the old memory into the new memory then you have to free that memory that's a lot of steps if the user has maybe hit one extra character so what's an alternative you can avoid resizing arrays in this manner by just allocating maybe two gigabytes of memory any time the user it might type something in well just choose a huge upper bound 2 gigabytes but of course what's the obvious problem here right if you just allocate an excessive amount of memory your program at some point is just going to stop working or it's going to slow the computer to a crawl because it appears to be using way more memory than it should and if you only have a couple gigabytes of RAM and your silly little program is using it when you're actually trying to do useful work with like a browser or word processor you know everything will just slow to a crawl as we've discussed so suppose there were a structure that looked a little something more like this enter into the picture something called a linked list which is pretty much what the name suggests it is a linked list of what are generally called nodes what's a node for now it's just a chunk of memory and we can implement these little chunks by way of that see construct known as I struct so what have we use destruct before thus far what comes to mind sorry Sudoku so we had a struct in sodoku to represent things like the board and the y X location we had something for students a couple of weeks ago where we wrapped together a student's name and ID number and house so anytime you want to kind of clump together some related data fields and call it something new like a student structure you can do that so these squares in just a moment are going to represent what we'll generically call a node and each of these squares in this picture here stores a number apparently an int a 917 or 22 of 26 and a 34 and this is equivalent for the moment to an array of size five so I seen uh five numbers nine 17 22 26 34 so in effect I could implement this very easily with week one with week 2 stuff with an array of size five but I propose that because I've now drawn the picture in this way using arrows and not by having these numbers back to back to back as is the case for a raise it's going to be super easy conceptually to insert more numbers into this to append them to the end to insert them into the beginning and there too is an advantage what was really hard in an array remember when we had a whole bunch of people up here on the stage when we were doing sorting and we had to move someone who was over there all the way over here in the worst case we had to ask everyone can you move down can you move down so just to make one swap in the worst case with an array we had to do like n minus 1 steps of work but suppose the picture were like this and suppose that in between each number was literally a gap and there's some concept of an arrow saying the next numbers over there the next number is over there well maybe we could just have squeezed that student and their number into wherever we wanted in this list so we just need a way of linking nodes together in this way so odds are these arrows represent what so pointers right this is how we've been generically drawing pointers on the blackboard and so really underneath the hood there's some memory addresses and hexadecimal notation and all of that but conceptually this is a linked list it is something like an array but instead of having contiguous memory your nodes your integers can be stored all over your computer's memory which is great because then you don't need to find a huge contiguous block it can come all fragmented like from the entirety of your RAM and it looks like we'll be able to more easily insert numbers into these lists because we can just temporarily move up one of those arrows slot someone else and another number in there and then fix what the arrows are pointing at so let's just take a look at what this might look like I'm going to go ahead and open a file called let's say list 1 dot H and I'll propose this as the container for one of these things called nodes so we've seen this in text before even though we haven't written it ourselves all that much typedef does what if you kind of want to be a student you just say well it defines a type well what is that it makes a synonym one type is a synonym for something else so let's see what this means here so we see the keyword struct and we see the word node so gia it's being friendly and highlighting all of the sea specific words so typedef instruct our see keywords node is just something I made up it could be called foo but by convention we call these things nodes and then what composes a node what is in one of those rectangles well two things one is an int and I can arbitrarily call it n but I could call it anything I want but then the curious feature is the second thing the bottom half of this picture here so for instance here's n the number nine this fields here which I'm apparently going to call next what is that well if it's an arrow that suggests pictorially that it's an appoint or how do we write a pointer well with a star somehow but what is this a pointer to well it's a pointer to the same kind of structure so we say struct node star and then we choose an arbitrary word for this arrow and i'll call it next just to imply that it's always pointing to the next object and then down here i have the word node again which feels a little redundant and it is in some sense so at the end of this statement is the word node that relates to the typedef so what this word and this word together mean is henceforth call this whole thing more succinctly node but remember that c is not so generous it doesn't look ahead in your code and so we kind of have a problem because if i want to create this pointer inside of one of these structures to another such structure does the word node exist at this point in time not really write the word node is literally declared here well wait a minute you also said node here but this is again just because I need temporarily a name so that i can refer to myself inside of this structure so long story short struct node creates this thing and then the combination of typedef and node here rename it more succinctly from struct node to just note why why this complexity frankly it's just tedious typing struct node struct node struct node all over the place so now we can just call it node thanks to type def but if that's a little unclear syntactically that's fine for now just take for granted that it makes one of these rectangles possible how do we now start to manipulate these things well let's rewind just a moment to an example that we already looked at in the past recall this thing from a couple of weeks ago this was a structure that we use to define a students and there's one sink tactic difference what is fundamentally different about this is there's no word here to the right of struct why do I not need to say typedef struct student using the logic from a moment ago exactly and let me tweak the word I don't mention student inside of the struct so I don't need a temporary word there but I could put this here but again just as a matter of style because i'm not using struct students anywhere i might as well make it more succinct i do need to say struct and it's in red because it is an important key word here but I don't need to say students up here unless I'm going to refer to a student internally and this hopefully makes a little bit of intuitive sense right if i'm a student object there should fundamentally in the real world and in programming be no notion of next inside of a student object right I shouldn't as a human have some kind of data remembrance of someone to the left of me so again we don't need a pointer there but we do have two pointers for name and house so how did we use this well recall students struck swansea which we looked at briefly a while ago and this was just a simple example where we created a bunch of structs of type student we then got the users ID and the user's name and their house and then we scroll down and just yelled out so and so is in mather if they were in mather so recall this was from a couple of weeks ago and there's only a couple important takeaways for now for now recall that this is how we declare a in array of students and for those with prior programming experience class means like a class of students not a class of objects so student class gives me an array called class of student object of student structures here so what goes on here this was the new syntax if I want to update the ice students ID number i use the dot syntax if i want to update the ice students name I use the dot syntax and if I want to use the I update the ice students house dot syntax that was new then this was kind of old school stuff stir comp for string comparison what was like pairing I was comparing their house against literally the word Mather and double quotes and then this was important here I did need to for good measure call free on house and name why because what sorry because I stored it but how did I get those strings name and house so I used getstring and even though this was kind of some garbage we used to kind of brush under the rug this was kind of a flaw in our previous programs for several weeks which revealed a couple weeks ago they kept stirring all this time has of course been using malloc and so now any time you use malloc whether directly or indirectly you must free that memory otherwise valgrind this tool I showed a moment ago that would actually catch if you ran valgrind in fact on most of your prior programs that use getstring from early in the semester almost all of them would be buggy in that they would all have some memory leak all right but that's okay that was deliberate and those were the so called training wheels that are now off so let's just do one modification of this to kind of tie this together also to this week's problem set five so this is structured see it's almost identical but I thought it would be kind of neat to stop writing programs that take an input and then produce output and then lose that output forever it would kind of be nice if our program for instance could store permanently the names and ids and names of all the students we type in right if we want to approximate the idea of a database a server that stores information long-term what would be kind of nice if I could use these same ideas now and start saving my work to disk just like we take for granted in Microsoft Word or the like the ability to save let's do that ourselves so here is exactly the same code give me an array of students trucks populate them using get int and get string and get string but down here go ahead and say whoever's in mather just to be cute but now let's save these students to disk which just means save a file so how might we do this so we could do this in any number of ways Microsoft back in the day decided that a dot docx format do see is going to store your essay in your bold facing in your table of contents in a certain way similarly we as programmers here have to decide how are we going to store something like a student ID name and house perhaps for multiple people in a file so you could actually do this in a bunch of ways but we decided totally arbitrarily that we're just going to store these IDs names and house one per line in a text file just so we can see it but this is of course a very unsophisticated approach but it will allow us to write files to disc so let's see this the very first line of interest is this thing here so f open you might guess is file open we've not had to use this ourselves unless you already dove into problem set 5 but as the name suggests it opens a file what's the name of that file going to be arbitrarily i called it database it could be called foo but again i chose database and then just take a guess what does quote unquote w mean so it means right so it means right this file as opposed to reading it so you can actually use the same f open function to read files in but for today we'll just do a with w as an aside you might see in various books or tutorials people saying WB for right binary or right ascii realize that on the appliance and on most modern systems it suffice is just to say r or w don't need to worry about the be just FYI so why do i check for this if FP which is just generic notation for file pointer if it equals null when might something go wrong related to writing a file to disk why might fo pin' just conjecture return null sorry so database is missing maybe that file doesn't exist in this case that's going to be okay because what's nice about f open is if you're writing a file and it doesn't exist it will create it for you but a good thought that would be the answer if we were trying to read it back in certainly yeah yeah so if your disk is full right if your your hard drive is just overflowing with other files and you just don't have enough space for any new file f open could return null other maybe more technical problem if you've done something stupid or accidentally like try the permissions on some folder or you're in some folder on the system that doesn't belong to you it's someone else is my documents folder for instance well fo pin might return null in that case too so ensure if you lack the ability to write a file you're just going to get back no but what is it you're going to get back well FP is of type file and it is in all caps just because someone decided to make it that way and it turns out a file structure fi le in all caps is some kind of structure not unlike conceptually a student structure inside of which is a bunch of stuff and that stuff is generally hidden from us but you can think of a file kind of like a cassette tape if you even remember these things a cassette tape of course has a whole bunch of tape inside and when you play a song in a cassette tape with these two things turn and the tape physically moves a file on a computer system obviously has no tape but conceptually you read and write it in the same way when you open a file with F open it's like taking out a cassette tape and starting at the beginning of that tape and as you start to read from the cassette tape or read from the file it's like advancing the reading head of the cursor so to speak or the file position indicator so to speak on that file and if you write to a file similarly that happens now this is relevant because in problem set 5 one of the challenges is to enlarge a bitmap file so you have to take a pixel and double it or triple it and so forth so realize that one of the challenges that arises in problem set 5 is well if a file is being read from start to end just like a cassette tape well what if you need to kind of rewind the tape and go back and reprint those same pixels rial put those same pixels well there's functions like f seek file seek which can help you do this there's every whined although that's typically overkill but there's other ways altogether as Tommy discussed in the walkthrough involving arrays and just remembering some of those bits but in short for now this is a pointer to a file structure and it's not clear to us just yet what's actually in that file I accidentally quit there so let me go back two strokes to see and we'll finish this story so how do you write to this file well if I get as far as here inside of this if condition the file opened okay and I'm good to go to write so how do I write to a file it's actually pretty familiar so you don't use F you don't use printf use F printf for file printf the first argument is apparently a pointer to the thing that you want to write to and what are the second and third arguments apparently it's pretty much identical to printf right it's the format string what do you want to write and what do you want to substitute for any place holders so apparently i'm going to write a integer followed by a new line a string new line string new line and i'm going to plug in each of these students ids names and houses then i'm going to close my file with f closes file close then i'm going to go back and free the student's name and house but one question i don't free the students ID number why i don't free ID it's not a string it's an int and we did not allocate that integer with Malik it was just there inside of the struct recall so let's see what the end results here so this is struck stu dot c let me go ahead and make structs to ok it seemed to compile ok let me go ahead and run this now so run structs to enter students ID let's say Matt Kirkland students ID say Rob Kirkland students ID three David Mather so it still says David isn't Mather but then nothing but if i do LS now there's a whole bunch of code from today as well as this file database so let me go ahead and run g edit on database enter and voila we've just created this file again so primitive this format right this is not what a word document looks it looks like inside but it is our own proprietary file format just as bitmap and jpg have their own file formats and ours too is in just ascii text which might not be sufficient if we want to store something like an ID photo but at least we now have one way of expressing ourselves persistently by writing these things to disk and it remains then to use these same fundamentals to start linking things together in an interesting way let's go ahead and take our 5-minute break now and then we'll return to that alright so we are back linked lists are much more exciting when we actually make a mess of them with actual humans so if we could indulge or eight people willing to volunteer on center stage have to be comfy on camera ok let me decide this randomly who's going to come up ok here we go how about you two three four five six I'm liking the side of the room seven eight come on up are you lost your chance all right come on up and we are going to hand you the same numbers we see here on the screen so you'll be number nine and if you want to go in our usual spot over there welcome 17 welcome 22 welcome Swansea oh okay 26 will fix the slide to make this consistent with reality 34 you can be first you can be pointer you can be predecessor pointer very dramatic alright so slights bug oh and this is an image too isn't it may I see 20 who did I get wrong 29 okay debugging this fixed okay so that's 26 okay so here we have a linked list and let's ignore for the moment how we got to this point already let's just assume that we are with fast forward 10 minutes mentally we already have the ability to create linked lists and here is the state of our world at this moment and actually let's pop out if you don't mind pointer and predecessor pointer if you could be here in the heap for just a moment so first this is good and now this is the point at which we need to really mimic this picture if you could somewhat awkwardly with your left hand point at the next person except for 34 where I would point down with your left hand okay so now we have implemented a linked list with humans so what's going on here so first of all each of these what we still have to point each of these structures has two things inside of the struct we have an integer like the number 9 and then we also have a left arm known as next which is the pointer so each of these humans represents a struct at this point with those two fields for a total of little check how many bytes does each person represent how many bytes eights right so four bytes for the int and four bytes for the pointer so each of these rectangles is eight bytes on a system like the appliance now first is a little special so first what's your name lisa is special in that she does not have an integer she is only a pointer and so the reason for her being drawn is a smaller rectangle up there is that she's only four bytes but Lisa is super important because without Lisa we're going to forget completely where this whole linked list is and without Lisa we would have a whole huge memory leak if we allocated all of these nodes but didn't remember where the first of those nodes is so for now each Lisa special she's a pointer and Lisa really is our linked list so we've had a raise before and we've had names of arrays which are really just the addresses as we've seen and so Lisa is sort of the equivalent of that the only piece of data we have to have to have to keep around in memory to remember where the entirety of our linked list is is the first person a pointer to the start of the list and now all of these guys are fundamentally the same they're incomplete structures except for 34 who apparently is pointing at the ground because that's the end of the list so that's probably going to represent what when pointing down so no we need a special value to say mm-hmm there's no one else to the left so when pointing down this represents storing the null pointer all zeros in that pointer field okay so the first challenge at hand it will make use of you don't technically have to keep standing there like that will use you in a moment okay so let's go ahead and try inserting at the end so we have the we need the number 55 and actually let me change you let me steal pointer can you take on the role 55 so what's your name Lucas Lucas Malik Lucas right so now we have another eight bytes inside of which we can put the number 55 and now let's figure out how to insert Lucas into this list and actually mature naming Joseph ok and I'll take that's so Joseph if you want to kind of take charge here and explain just in real human terms how do we insert 55 into our linked list if you want to go ahead and make the insertion sure and let me give you this if this is on just to explain what it is you're doing here okay so i guess you take them to the beginning of the list and you know this is where the list okay you go down through each pointer to find the end of the list and then end up here okay good so we found the so-called tail of the list and then what did you do then you insert them there okay so let's push a little harder on what it means to insert into the list what you you put him there but so you take the pointer of the last element in the list and put okay excellent and where should his left hand be pointing to the ground okay good so in this is actually quite important just because you've Malik a node a structure put in the number 55 you also have to change what the default next value is because just as with local variables if you just Malik a node and then put 55 there what's going to be inside of the other four bytes called next it's just some garbage value and that's really bad in the context of pointers because if you follow a garbage value as a pointer you're going to end up in some crazy place in memory and the odds are that you're going to seg fault or cordon so that was really good so now we have 55 at the right position 34 is pointing at 55 let's now try to do a little something else with this list suppose we instead now wanted to insert the number five so i need to malloc one more volunteer please okay come on up Malik number five what's your name Emmy okay so Emmys bin Malik tier we're going to need you again with the mic to handle our insertion at the head of the list and consider now what is fundamentally different about insertion at the head of the list and the reason of course that we're doing head and tail is because we're trying to keep the list sorted which we've tried to do with arrays in the past all right go ahead oh and talk as close as you can okay so you have heard some place in memory okay and what the first person to her and her to point to the next person okay good and now even though as humans they did have to kind of shuffle left and right to make room for each other in memory because each of these nodes is at some arbitrary location in memory as determined by notice what did not have to happen was 19 17 22 26 34 and 55 did not have to move right if they did have to move the running time of insur would not be constant time it instead would have been what Big O of n right in the worst case if we had to move everyone over by one spot in memory which we would have had to do in an array it's Big O of n but instead insertion is big-oh of one constant time because all we have to do is go right to the start of the list and we know where Lisa is because she again is this special pointer that we always keep around and even though we had to do a couple things like point her at five and five at nine well that's still two three steps it's a finite number its constant so that was a really fast operation by contrast what was insertion at the tail of the list the same thing well big ol of big o of well you had to go okay we had to go up there every pointers a big o of n exactly so big ol event so let's see now what happens if we make this a little trickier I have one more node so I need to malloc one last volunteer number 20 and so come on up and so we've mapped another note and take it away he belongs in the middle and explain for the audience if you could how we find his location because it's not as simple as the tailor the head this time okay well in the middle as in well show in sorted order to ordered order yes and you can walk in front if it's if you need to see the actual numbers okay so what's your name you aren't expecting it to be this involved were you in this demo ok but so you go to first pointer follow it to the next value compare the value good then if it's greater than the value in the first element ok you point to the second one compare those good just please just watch where you're walking ok okay and good and so now how many pointers need to be updated oh so however many whatever element that is good okay well actually let's do that can we pop ansel out for just a moment in 17 if you could keep pointing to 22 so just super deliberately now how what needs to be updated in the way of left hands to insert ansel in the right location yes the point to Hansel okay that's good went to her good bug so you just brok
Original Description
File I/O. Linked lists. Stacks. Queues. Valgrind. Hash tables. Trees. Binary search trees. Tries.
Watch on YouTube ↗
(saves to browser)
Sign in to unlock AI tutor explanation · ⚡30
Playlist
Uploads from CS50 · CS50 · 52 of 60
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
▶
53
54
55
56
57
58
59
60
Hello, World: Hadi Partovi
CS50
Content Distribution and Archival in a Digital Age
CS50
CS50 2014 - Week 1
CS50
CS50 2014 - Week 3
CS50
CS50 2014 - Week 0, continued
CS50
CS50 2014 - Week 4
CS50
Week 3, continued
CS50
Quiz 0 Review
CS50
CS50 2014 - Week 3, continued
CS50
CS50 2014 - Week 7
CS50
CS50 2014 - Week 7, continued
CS50
Breaking Through The (Google) Glass Ceiling by Christopher Bartholomew
CS50
Introduction to Amazon Web Services by Leo Zhadanovsky
CS50
CS50 2014 - Week 9
CS50
How to Build Innovative Technologies by Abby Fichtner
CS50
Light Your World (with Hue Bulbs) by Dan Bradley
CS50
Building Dynamic Web Apps with Laravel by Eric Ouyang
CS50
CS50 2014 - CS50 Lecture by Steve Ballmer
CS50
CS50 2014 - Week 10
CS50
This is CS50 with Steve Ballmer?
CS50
Meteor: a better way to build apps by Roger Zurawicki
CS50
Data Analysis in R by Dustin Tran
CS50
Data Visualization and D3 by David Chouinard
CS50
CS50 2014 - Week 6
CS50
Build Tomorrow's Library by Jeffrey Licht
CS50
CS50 2014 - Week 9, continued
CS50
Essential Scale-Out Computing by James Cuff
CS50
iOS App Development with Swift by Dan Armendariz
CS50
Sam Clark Leads Yale Students on Tour to CS50 at Harvard
CS50
3D Modeling and Manufacture by Ansel Duff
CS50
CS50 2014 - Week 5, continued
CS50
hello, world
CS50
CS50 2014 - Deep Thoughts - Hash Table
CS50
CS50 2014 - Deep Thoughts - Binary Tree
CS50
CS50 2014 - Deep Thoughts - Scratch
CS50
CS50 2014 - Deep Thoughts - MySQL
CS50
LaunchCode Visits CS50
CS50
CS50 Live, Episode 100
CS50
CS50 Field Trip to Google
CS50
This is CS50 AP
CS50
Week 4: Monday - CS50 2011 - Harvard University
CS50
Week 2: Wednesday - CS50 2011 - Harvard University
CS50
Week 1: Wednesday - CS50 2011 - Harvard University
CS50
Week 11: Monday - CS50 2011 - Harvard University
CS50
Week 3: Wednesday - CS50 2011 - Harvard University
CS50
Week 12: Monday - CS50 2011 - Harvard University
CS50
Week 1: Friday - CS50 2011 - Harvard University
CS50
Week 3: Monday - CS50 2011 - Harvard University
CS50
Week 10: Wednesday - CS50 2011 - Harvard University
CS50
Week 2: Monday - CS50 2011 - Harvard University
CS50
Week 9: Monday - CS50 2011 - Harvard University
CS50
Week 7: Monday - CS50 2011 - Harvard University
CS50
Week 5: Monday - CS50 2011 - Harvard University
CS50
Week 5: Wednesday - CS50 2011 - Harvard University
CS50
Week 7: Wednesday - CS50 2011 - Harvard University
CS50
Week 8: Monday - CS50 2011 - Harvard University
CS50
Week 9: Wednesday - CS50 2011 - Harvard University
CS50
Week 8: Wednesday - CS50 2011 - Harvard University
CS50
Week 10: Monday - CS50 2011 - Harvard University
CS50
Week 2: Wednesday - CS50 2010 - Harvard University
CS50
More on: PM Basics
View skill →Related AI Lessons
⚡
⚡
⚡
⚡
Bloom Filters, Explained Properly
Dev.to · Daksh Gargas
Prefix Sums: The Preprocessing Trick That Makes Range Queries Instant
Medium · Programming
I Thought I Was Ready for the Interview — Then One Simple Math Question Destroyed Me
Medium · Programming
Week 2(Day 10): LeetCode Two Pointers(slow & fast): Remove Duplicates from Sorted Array (Brute…
Medium · Python
🎓
Tutor Explanation
DeepCamp AI