Saturday, Jan. 23 to Friday, Jan. 29
Today’s theme is removing the training wheels: the things about C that were hidden for the last three weeks by conveniences provided in cs50.h.
#FF is 255, which is the same as 11111111 in binary. One convenient way of thinking about hexadecimal (or hex) is that each hex digit stands for four binary bits at a time. Hex values for RGB colors are an example of where you have already seen hex used.
Introducing the & operator (read as “address of”) and the * operator (read as “dereference”). So if you have declared int n; then &n takes the address of n, and *&n turns around and dereferences it, giving you the value that is held at n. Of course, in a program you’d never take the address of a variable and turn right around and dereference it. You’d just write n.
Also, everyone acknowledges that the notation is a little bizarre, especially since * has already been used for multiplication in C and && has been used for logical and, and you’ll see that & has another use (search the web for “bitwise and in C” if you want to know what the other use is). Also, int * is how you declare the type of a variable that is a pointer to an int, as in int *p = &n; I could apologize more for the arcane notation, but I’d encourage you not to resist it, and to admit that if you think like a compiler must think, it makes sense. You can read int *p as “*p has type int.”
Malan continues explaining the preceding with more examples.
Malan reveals that the actual type of a string is char *, and that string is a shorthand (defined in cs50.h) which looks like typedef char *string; So s[0] is the same as *s, s[1] is the same as *(s+1), etc. This is all that the [] notation does!
Malan writes a program that compares two strings, and short-circuits the process if they are the very same string in memory by first comparing their pointers. Only if the pointers are different does his program bother with comparing the characters. Then he uses strcmp() to do the same thing.
Then he gets to Jesse’s question, which arose in his caesar.c solution. Writing char *s = "Hi"; char *t = s; doesn’t give you a new string! It gives you a second pointer to the same bytes. See the screen at 1:04:50.
Introducing malloc() and writing a program that copies the bytes into the newly allocated memory. malloc() is declared in stdlib.h. Malan notes that malloc() can fail (in the extremely rare condition that the system is completely out of memory to give to your program).
Introducing NULL, not to be confused with \0 (also known as nul). NULL is the pointer that points to 0x0. That’s not quite the same thing as the ASCII character that is all zeros. Heck, they are both 0, so why not? Well, a pointer on a 64-bit computer is eight bytes, but an ASCII character is only one byte.
Introducing free(), which should usually be used in conjunction with malloc() whenever your program no longer needs the memory that was allocated.
Introducing valgrind.
Introducing the sizeof operator.
Pointer fun with Binky.
b = a + b; a = a - b; a = -a; ?
Showing where the local variables in swap live. This is called the stack. Every time you call a function, space for the local variables you use in that function is added to the stack. Describing the stack and the heap helps explain why, when main calls swap, the original values are not changed: swap only receives copies. The solution is to pass pointers to the variables you want swapped in to swap().
Stack overflow. Using a recursive implementation of draw() to illustrate a stack overflow.
Buffer overflow. Using scanf() to implement your own get_string(). He only hinted at the method. It involves guessing a reasonable amount of memory and, if it isn’t enough, noticing that you are running out and “re-allocating” more memory.
Opening a file in append mode. Creating a csv file. Opening an image file. Opening in read and write mode.
Although I had never heard of valgrind (the program that helps you detect memory problems), such programs are pretty common. Under the hood, a large part of what they do is replace malloc() and free() with functions that have more diagnostics.
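To make the idea of "malloc() and free() with more diagnostics" concrete, here is a toy sketch. It is not how valgrind is implemented; the names tracked_malloc, tracked_free and tracked_live_blocks are made up for illustration, and all they do is count how many allocations have not been freed yet.

```c
#include <stdlib.h>

// Number of blocks handed out by tracked_malloc() and not yet freed.
// (Hypothetical bookkeeping for this sketch only.)
static long live_blocks = 0;

// Wraps malloc() and counts every successful allocation.
void *tracked_malloc(size_t size)
{
    void *p = malloc(size);
    if (p != NULL)
    {
        live_blocks++;
    }
    return p;
}

// Wraps free() and counts every release of a non-NULL pointer.
void tracked_free(void *p)
{
    if (p != NULL)
    {
        live_blocks--;
    }
    free(p);
}

// A non-zero result at the end of a program would indicate a leak.
long tracked_live_blocks(void)
{
    return live_blocks;
}
```

If a program calls tracked_malloc() twice and tracked_free() once, tracked_live_blocks() reports one block still outstanding, which is the kind of leak a real tool would tell you about in far more detail.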
Malan is oversimplifying how memory is reclaimed on modern operating systems. However, the habits he is asking you to build are very good ones. You should always be thinking about when memory you allocate is no longer needed, and then put a corresponding free() at that point in the code.
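As a small illustration of that habit, here is a sketch of copying a string into newly allocated memory, in the spirit of the lecture's copy program. It is my own version, not Malan's code, and the function name copy_string is hypothetical.

```c
#include <stdlib.h>
#include <string.h>

// Returns a newly allocated copy of s, or NULL if malloc() fails.
// The caller owns the result and must pass it to free() when done.
char *copy_string(const char *s)
{
    size_t n = strlen(s);

    // One extra byte for the terminating \0.
    char *t = malloc(n + 1);
    if (t == NULL)
    {
        return NULL;
    }

    // Copy the characters, including the \0 at index n.
    for (size_t i = 0; i <= n; i++)
    {
        t[i] = s[i];
    }
    return t;
}
```

A caller would write char *t = copy_string(s); check t against NULL, use it, and then call free(t) at the point where the copy is no longer needed. Unlike char *t = s; the copy lives at a different address, so changing t[0] leaves s alone.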
Determining when memory is no longer needed and freeing it is such a complex subject that it is a subject of its own, known as “garbage collection.” C is not a garbage-collected language. You, the programmer, are the garbage collector.
Java is a garbage-collected language. It is easy to write badly performing programs that allocate unreasonable amounts of memory in either Java or C. However, in C you are in so much control that you can write very high-performing programs. In Java, even if you write very good code, you give up some of that performance and predictability because the Java virtual machine decides when garbage collection runs.
Personally, I prefer reference-counted languages that free resources as soon as their reference count goes to zero (rather than waiting for the system to do garbage collection). In practice, this gives the programmer nearly the control over memory that C has, along with most of the convenience of Java. It also causes programs to fail fast rather than fail after some arbitrarily timed garbage collection sweep.
I leave “reference counting” and “automatic reference counting” as things for you to learn a little more about, but only if you are first feeling very confident about how malloc() and free() work.
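If you do feel confident, here is a rough sketch of what manual reference counting can look like when built on malloc() and free(). Everything here (the shared_string type and the shared_new, shared_retain and shared_release functions) is invented for illustration; real reference-counted languages do this bookkeeping for you.

```c
#include <stdlib.h>
#include <string.h>

// A heap-allocated string plus a count of how many owners it has.
typedef struct
{
    int count;
    char *text;
} shared_string;

// Creates a shared string with one owner. Returns NULL if malloc() fails.
shared_string *shared_new(const char *text)
{
    shared_string *s = malloc(sizeof(shared_string));
    if (s == NULL)
    {
        return NULL;
    }
    s->text = malloc(strlen(text) + 1);
    if (s->text == NULL)
    {
        free(s);
        return NULL;
    }
    strcpy(s->text, text);
    s->count = 1;
    return s;
}

// Registers one more owner of the same memory.
shared_string *shared_retain(shared_string *s)
{
    s->count++;
    return s;
}

// Drops one owner. Frees the memory as soon as the count reaches zero.
// Returns 1 if the memory was freed, 0 otherwise.
int shared_release(shared_string *s)
{
    s->count--;
    if (s->count == 0)
    {
        free(s->text);
        free(s);
        return 1;
    }
    return 0;
}
```

The point of the design is that free() happens at a predictable moment, the instant the last owner calls shared_release(), rather than whenever a collector gets around to it.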
The first of this week’s two problems is filter, and it has two versions. Let’s do the easier version of filter. All that has to be done is to implement four filters in the file helpers.c. You really do need to watch the five short walk-through videos to know exactly what they want your filters to do.
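To give a feel for what a filter looks like in code, here is a sketch of a grayscale conversion. This assumes grayscale is one of the four filters, which the notes above do not actually say, and it uses a made-up pixel struct rather than the types from the course's distribution code. Treat it as an illustration of looping over a two-dimensional array of pixels, not as the expected solution.

```c
#include <stdint.h>

// Hypothetical stand-in for one pixel: one byte per color channel.
typedef struct
{
    uint8_t blue;
    uint8_t green;
    uint8_t red;
} pixel;

// Sets every channel of every pixel to the rounded average of its three channels.
void grayscale_sketch(int height, int width, pixel image[height][width])
{
    for (int i = 0; i < height; i++)
    {
        for (int j = 0; j < width; j++)
        {
            int sum = image[i][j].blue + image[i][j].green + image[i][j].red;

            // Integer arithmetic that rounds sum / 3 to the nearest whole number.
            uint8_t average = (uint8_t) ((sum + 1) / 3);

            image[i][j].blue = average;
            image[i][j].green = average;
            image[i][j].red = average;
        }
    }
}
```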
Some 5x blowups of the four filters applied to a portion of stadium.bmp:
On to recover. The idea here is that there are JPEG files that can be found within card.raw by looking for the three and a half bytes of header that JPEG files always begin with and that are unlikely to occur randomly.
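Here is a sketch of what checking for those three and a half bytes can look like. The byte values (0xff, 0xd8, 0xff, then a byte whose upper four bits are 1110) are the ones I understand the problem description to give; the function name looks_like_jpeg_start is my own. Note that this is the other use of & mentioned in the lecture notes: bitwise and.

```c
#include <stdbool.h>
#include <stdint.h>

// Returns true if the first four bytes of block match the JPEG signature.
// Only the upper half of the fourth byte is checked, hence "3 1/2 bytes".
bool looks_like_jpeg_start(const uint8_t block[4])
{
    return block[0] == 0xff &&
           block[1] == 0xd8 &&
           block[2] == 0xff &&
           (block[3] & 0xf0) == 0xe0; // keep the top four bits, ignore the rest
}
```

A recovery program would read card.raw one block at a time and call a check like this on the start of each block to decide whether a new file begins there.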
Here is my recover.c output:
Satisfyingly, it recovers exactly 50 files, which is exactly what we were told to expect from card.raw.