Harvard CS50 - Lecture 0: Binary, Algorithms & Scratch - Computer Science

Better Everything
14 min readMar 12, 2023

--

Harvard University offers a Computer Science course that is also freely available online: CS50 Introduction to Computer Science. It has online video lectures that anyone can watch to learn about the main concepts of computer science. There also are problem sets for you to solve to further practice what is taught. And you can even get a CS50 certificate.

These are my Lecture Notes for Harvard CS50 Lecture 0.
These are my Lecture Notes for Harvard CS50 Lecture 0. Source: own creation.

I decided to not only watch the lectures but also write and share lecture notes. These are my lecture notes for Lecture 0, which is about Bits, Representation, Algorithms & the programming language Scratch.

What is Computer Science?

Computer Science is generally about problem solving. You can take a problem from any field of interest to you and break it down step-by-step.

Computational thinking, which is thinking like a computer, is about that step-by-step breakdown of problems. These steps can then be expressed in code, which is computer language. That way you can let computers solve the problems for you.

So computer Science involves programming but it is not just that. More generally, computer science is the study of information: How to represent and process information?

This is a scheme of how to solve a problem with computational thinking:

Computational thinking is about methodically solving a problem by studying the available input and coming up with a step-by-step approach to get to the the desired output.
Computational thinking is about methodically solving a problem by studying the available input and coming up with a step-by-step approach to get to the the desired output. Source: own creation — idea from lecture slides.

The scheme splits the question How to represent and process information? into 2 questions:

  1. How to represent the input- and output information?
    Representation is about methods to express things like numbers, letters, colors, images, videos and music with a computer.
  2. How to code the algorithm to go from input to output?
    An algorithm consists of step-by-step instructions for solving a problem, it should be correct and ideally also quick.

Both questions are important aspects of computer science and are explored in the next 2 sections.

The binary system, bits, bytes and representation

How is information represented in computers?

Computers have huge amounts of switches. These switches are known as transistors and can be switched on or off.

When a transistor is on it represents a 1 and when it is off it represents a 0.
This section contains an overview of how collections of 0s and 1s can represent things like numbers, letters, colors, images, videos and music.

How are numbers represented in computers?

Most people use the decimal system to represent numbers. Decimal refers to ten and in the decimal system there are ten digits: 0, 1, 2, 3, 4, 5, 6, 7, 8 & 9.

Although a computer works with only 2 digits: 0 & 1, it can still be used to represent any other number.

Suppose we have a number with 3 digits: 1, 0 & 1 (101).

If it is a decimal number we know it as 101. We add 1*100 + 0*10 + 1*1 to get to the number 101. Here base-10 is used:

The rightmost digit (1) is multiplied with 10⁰ = 1.
The digit to its left (0) is multiplied with 10¹ = 10.
And the next digit (1) is multiplied with 10² = 100.

Since computers have only 2 digits, it uses base-2 instead of base-10. The system that uses base-2 is called binary system. And in the binary system 101 represents the decimal number 5. We add 1*4 + 0*2 + 1*1 to get 5.

Base-2 explains where the numbers 4, 2 and 1 come from:

The rightmost digit (1) is multiplied with 2⁰ = 1.
The digit to its left (0) is multiplied with = 2.
And the next digit (1) is multiplied with = 4.

A bit is a binary digit, binary refers to the two possible values: either 0 or 1. And since computers work with switches that are either off (0) or on (1), computers represent data with bits.

With one bit you can only represent 0 or 1.

With 2 bits you can also represent 2 and 3:
Binary 10 is decimal 2.
Binary 11 is decimal 3.

With another extra bit you can also represent 4, 5, 6 and 7:
Binary 100 is decimal 4.
Binary 101 is decimal 5.
Binary 110 is decimal 6.
Binary 111 is decimal 7.

So the more bits are available, the more values can be represented.

Bits are typically grouped in groups of 8. A group of 8 bits is known as a byte. A byte can have binary values between 00000000 (decimal 0) and 11111111 (decimal 255), which are 256 values.

How is text represented in computers?

Texts are series of characters, but if computers can only store 0’s and 1’s, how can text be represented by computers?

The answer is by mapping numbers to characters, that means that each number represents a different character. An example of such a mapping is the American Standard Code for Information Interchange (ASCII). This is just one of the many, though an important one, ways that people have agreed upon to let numbers represent characters.

As an example, in ASCII, 65 (01000001) is mapped to the capital letter A and 66 (01000010) is mapped tot the capital letter B.

With 3 bytes (24 bits) you can for instance make: 01001000 01001001 00100001, which is 72 73 33. These values can be translated with the ASCII chart to: HI!.

When only using 1 byte you can define 256 different characters. That is not enough to include all characters from all languages.

Another numbers-to-characters mapping is Unicode. This uses 4 bytes (32 bits) per character so that you can represent more than 4 billion different characters, since there are 4,294,967,296 unique ways to combine 32 bits.

How are colors represented in computers?

A common way to define a color is by specifying the mix of red, green and blue. A so-called RGB-value consists of the amounts of these 3 colors. By mixing those colors in different proportions you can make any other color.

The amounts of red, green and blue are given on a range from 0 to 255, so each of those 3 values requires 1 byte (8 bits) and together a RGB-value requires 3 bytes.

The amount of red, blue and green is often expressed with a hexadecimal digit pair.

Apart from the digits 0 to 9, a hexadecimal digit can be A, B, C, D, E and F, with:
A being 10
B being 11
C being 12
D being 13
E being 14
F being 15.

So in total there are 16 possible digit values. And the hexadecimal system uses base-16.

3 examples of hexadecimal values:

  • 0F which is 15 in the decimal system.
    0*16¹ + 15*16⁰ = 0*16 + 15*1 = 0+15 = 15
  • A2 which is 162 in the decimal system.
    10*16¹ + 2*16⁰ = 10*16 + 2*1 = 160+2 = 162
  • FF which is 255 in the decimal system.
    15*16¹ + 15*16⁰ = 15*16 + 15*1 = 240+15 = 255

A RGB-value is noted with its 3 hexadecimal values of red, green and blue put together. For example: #0FA2FF has a red value of 15, a green value of 162 and a blue value of 255. Mixed together you get this color:

The color with hexadecimal RGB value: #0FA2FF.
The color with hexadecimal RGB value: #0FA2FF. Source: own creation.

But since computers only work with 1s and 0s the color would be physically represented by 3 sets of 8 switches being switched on and off in this pattern (15, 162, 255 or 0F, A2, FF):

00001111 10100010 11111111

How are images represented in computers?

An image is a collection of pixels, tiny squares that become more noticable when you zoom in. All of these pixels together form an image like a mosaic.

Each pixel has its own color that can be represented by 3 bytes like we just saw. So an image can be represented by putting together each pixel’s 3-byte-long RGB values.

How are videos represented in computers?

Videos can be formed by quickly displaying one image, also called frame, after the other. Common framerates for videos are 24 or 30 frames per second. The more frames per second, the smoother the video.

How is music represented in computers?

Music can also be represented by numerical values. For instance:

  • the frequency of a tone
  • the volume of a tone
  • the duration of a tone

Before watching this lecture I wrote a post about how bits and bytes are used to store numbers, letters, colors and images. That post’s structure and contents are quite similar to the first part of this lecture, so it might help you better understand representation:

Algorithms

Abstraction is a term that refers to the simplification of something to focus on it at a high level instead of focussing on all lower level implementation details.

An algorithm is a set of step-by-step instructions for solving a problem. In our previous problem solving scheme an algorithm is the process to go from an input to an output.

Searching for an item in a sorted list

An example of a problem that can be solved with a computer is looking for a name in a sorted list of names, like a phonebook or contacts list. The input includes the name and the sorted list of names. And the output should be the contact details corresponding to the name or a notification that the name is just not in the list.

Brute force algorithm

One algorithm to find a name in a sorted list is starting at the beginning of the list and checking for each item whether it matches the name you are looking for. You stop when you find the name or when you get to a name alphabetically greater than the name you are looking for.

This is a correct algorithm because it does what it is supposed to do. However, it is not that fast or efficient.

The algorithm is an example of a brute force algorithm, instead of coming up with an efficient approach, every possibility is checked.

A faster approach

If you take the example of a (alphabetically sorted) phonebook, a faster algorithm would be to skip a page in each step. This way the algorithm takes half the amount of steps to find a name.

But if you skip a page in each step, you might skip over the name you are looking for which would make the algorithm incorrect. To keep the algorithm correct you should go back 1 page when you see that the page you are on has names that should be after the name you are looking for.

Divide and conquer algorithm

An even more efficient approach would be using a divide and conquer algorithm. In this algorithm the list is divided into 2 parts by splitting it in the middle.

If the middle item is the name you are looking for, you can stop searching.

If the middle item is alphabetically greater than the name you are looking for, you repeat the process with the first/left half of the list.

If the middle item is alphabetically smaller than the name you are looking for, you repeat the process with the second/right half of the list.

Essentially you are halving the input list in each step.

Comparing the speed of algorithms — Time Complexity and Big O Notation

Time Complexity is about studying how much the runtime of an algorithm increases as the size of the input increases.

When comparing the speed or efficiency of different algorithms, typically the worst case scenario is studied, the situation in which the algorithm takes the longest time.

For example, when searching for an item in a list and starting at the beginning of the list, the worst case scenario would be if the item was at the end of the list.

Time complexity is often noted using Big O notation. This axpromiately describes the runtime of an algorithm as a function of input size n.

  • The runtime of the brute force algorithm grows linearly as the input size increases. For each extra item in the input list an extra step must be taken. The Big O notation of an algorithm with linear time complexity is O(n).
  • The time complexity of the algorithm in which we skip a page in each step is noted with O(n/2). For each extra 2 items, 1 extra step must be taken.
  • The time complexity of the divide and conquer algorithm is noted with O(log n). Each time the input size doubles, only 1 extra step is required.

The time complexity gets more important as the input size increases.
When going from an input size from 2000 to 4000, 2000 extra steps are needed in the brute force approach, but only 1 extra step is needed with the divide and conquer algorithm.

A sketch of the time complexities of 3 different algorithms.
A sketch of the time complexities of 3 different algorithms. Source: own creation — idea from lecture slides.

To learn more about time complexity & big O notation you can read this short article of mine:

Writing pseudocode and the main building blocks of code

We can write the divide and conquer algorithm in pseudocode. This is not real code that the computer understands but it looks a little bit like it.

With pseudocode the algorithm can be written down like a rough sketch. And we don’t have to worry about the syntax of real programming languages.

A syntax of a language consists of the rules of how lines of code must be formed and the meanings behind symbols and words.

Here is the pseudocode for applying the divide and conquer algorithm to search a name in a phone book.

1   pick up phone book
2 open on the middle
3 look at page
4 if person on page
5 call person
6 else if person is earlier in book
7 open to the middle of left half of book
8 go back to line 3
9 else if person later in book
10 open to the middle of right half of book
11 go back to line 3
12 else
13 quit

In the pseudocode we can identify some main building blocks of code that are available in almost any programming language:

  • Functions — Functions are defined once and consist of one or more lines of code to perform a certain task. After defining it, a function can be called. That means that the code that is inside of the function is executed.
    Verbs have to do with actions and functions are about performing tasks so for the most verbs in the pseudocode: pick up, open, look & call, a function could be implemented.
  • Conditionals — On line 4, 6, 9 and 12 there are the words: if, else if and else. Combined with boolean expressions, which are either true or false, they form conditionals. Code either will or will not be run based on the boolean expression.
    We also see indentation on line 5, 7, 8, 10, 11 & 13, there is whitespace before the line of code. In the context of conditionals, indented code is only run when the above expression evaluates to true.
  • Loops — on line 8 & 11, we see go back. When the algorithm goes back to a previous step, lines are repeated and so a loop has been created.
    In the example, the input size keeps getting smaller. Eventually either the list will be empty or the name will be found. In both cases the loop stops.

These ideas are not that complex, but to apply them in a programming language you need to get familiar with the language’s syntax.

After writing code, it will be converted into patterns of 0s and 1s just like how data is represented in computers.

The programming language Scratch

Scratch is a graphical programming language in which you can combine blocks to create programs without worrying about syntax like in textual programming languages.

On the website scratch.mit.edu you can make Scratch programs. The interface has several parts.

On the left there are blocks divided into categories. For example, under the category Events there is a block when <green flag> clicked.

These blocks can be dragged into the middle part. If we connect the say Hello! block from category Looks and click the green flag to start the program, a speech bubble with Hello! will appear by the cat on the right.

The cat is a sprite, which is like a character. In the bottom right there is an overview to manage sprites. Each sprite can be given its own code.

Above the sprite overview there is a 2 dimensional coordinate system in which the sprites are ‘living’. The middle of the system has coordinate x=0, y=0. The x-range goes from -240 to 240 and the y-range goes from -180 to 180.

Clicking the green flag starts the execution of the code because scratch is waiting (listening) for this event to happen.

The say block is a function, it takes text like Hello! as input, it also has a side effect, which is something that happens when the code is run: a speech bubble appears.

You can also create your own blocks, this is the equivalent of creating a function. You define the block once, by giving it a name and adding its contents and then you basically have a reusable block. Then instead of adding all the different blocks again and again, you only have to reuse 1 block. Another advantage of using functions is that the code is easier to maintain, if you want to make a change you only have to do so in one place.

Within the Control category there is a forever block, which will keep repeating the blocks inside it ‘forever’.

The Control category also offers conditional blocks like if <> then, these blocks take boolean expressions, in Scratch these are blocks from the Sensing category.

The Sensing category also has a ask <> and wait block that makes it possible for the user to enter input, like answering a question. The function has the side effect that it returns a return variable. A return variable is a variable that becomes available after a function is called and that holds a value.

Variables can store values like in algebra, this can be a text like a name but also a number like a score. These variables can be used by other blocks and they can also be changed. For instance, adding points to a score.

When creating a project you can probably break it down into steps and code the steps one by one which makes the creation more manageable.

And when creating code you might accidentally create a bug. A bug is a mistake in code that needs to be solved in order for the program to work. The process of solving these bugs is called debugging and can be challenging but also fulfilling.

Thank you for reading!

I hope my post was useful.

To learn more about programming, follow this page and check out my E-books:

The next lecture is about the programming language C:

--

--

Better Everything

📖 My E-Books: amazon.com/author/better-everything ✅Programming, Data & Business ✅Automation & Optimization ✅Knowledge & Inspiration