Diagnostic Dojo Part II — Power & Memory
Because this is a blog, and bloggers like to get fancy, let me start by pontificating about what data is: Data is a form of communication. In its purest form, data is a representation of an idea that, when deemed appropriate, I want to communicate to someone else in a mutually understood format.
But, in reality, I’m no Iggy Azalea. I’m just a student trying to understand how my computer works and how I can talk to it in terms that it will understand. So what is data?
“Abstractly, information can be thought of as the resolution of uncertainty” -Wikipedia
“‘Information’ is thought of as a set of possible messages, where the goal is to send these messages over a noisy channel, and then to have the receiver reconstruct the message with low probability of error, in spite of the channel noise.” -Claude Shannon, A Mathematical Theory of Communication
Claude Shannon did theorize this, but I also needed a quotation other than from ‘Wikipedia’ to maintain a semblance of credibility.
Throwback Tuesday (Yea, it’s a thing…)
Data is a way of representing information that’s stored in my head so that I can share it with you, you can store it in your head and you can do something with it if you need to.
Anthropologically speaking, this is important. Why? Consider for a moment one of the most fundamental conditions of human existence — conflict.
Suppose we are villagers. We are in our village. I am in a watchtower and I have been assigned the role of Guard. We value foresight in our village, so we have assigned multiple villagers the role of Guard and placed them at all four corners of our square village to help keep us safe. My job is to survey the area around our village and determine whether it is “safe” or “unsafe”. Today, all is well, just a herd of animals moving toward the village. Wait… Why are those animals standing on two feet? Why are they carrying spears? Uh oh… Why are they carrying the banner of our rival village bearing the message, “Destroy Matt’s Village”?
The sensory systems of my body respond to this visual input, relay their feedback to my brain, where my brain concludes — This is UNSAFE! Okay, I know this is unsafe. But how do I signal this to the other guards? They won’t hear me if I shout — they are too far away. How do I let them know that this situation is “unsafe”?
Thankfully, we thought that this may happen eventually (because our village makes super delicious cinnamon buns and our rival village lives downwind). So we thought to use a fire signal:
“Enemy (Unsafe)” = Fire
“No Enemy (Safety)” = No fire
I light the signal fire. The other guards, closer to the village, see the signal fire indicating that our “unsafe” state outside the village has been observed. They signal the villagers, we raise our army, and we thwart our rivals (but send them home with a few boxes of delicious cinnamon buns because we feel bad).
This is a simple depiction of the beginnings of information theory — the quantification of information. In our example above, we effectively communicated that the state of our village had changed from “safe” to “unsafe”, despite the fact that I was separated by such a distance from the other guards that they couldn’t hear my voice.
“Safe” and “unsafe” only represent presence or absence of the safety state. It is absolute and not flexible. Is it conceivable that we can have varying degrees of “safe”? More importantly, is it conceivable that we can have varying degrees of “unsafe”? Can an army be large? Can an army be small? Can an army have various types of weapons allocated across multiple divisions with specialized functions? Do the full resources of our village have to be deployed every time we need to respond to a threat, regardless of its size?
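To make those questions concrete, here is a minimal sketch, in Python, of how a second signal fire would answer them: two fires give us two bits, and two bits give us four distinct messages instead of two. The threat levels below are invented for illustration, not part of any real protocol.

```python
# Hypothetical extension of the one-fire signal: two fires = two bits,
# which can distinguish four states instead of just safe/unsafe.
THREAT_LEVELS = {
    (0, 0): "safe",
    (0, 1): "small raiding party",
    (1, 0): "full army",
    (1, 1): "full army with siege weapons",
}

def signal(fire_one: int, fire_two: int) -> str:
    """Decode two fire signals (lit = 1, unlit = 0) into a threat level."""
    return THREAT_LEVELS[(fire_one, fire_two)]

print(signal(0, 0))  # safe
print(signal(1, 1))  # full army with siege weapons
```

Each additional fire doubles the number of distinguishable messages, which is exactly the idea behind measuring information in bits.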
BITS & BYTES
We can see how the information, or data, that we wish to convey can very quickly become complex. Fast forward hundreds of years and enter Claude Shannon with information theory. How do we quantify all of this information and communicate that information to others when we are separated by a large distance? A computer sounds like a useful tool. But how do we store information from the world of atoms into the world of our computers?
Shannon too thought of the example above, and the various complexities that could arise concerning the state of our village. What if we had a system that accounted for the true or false nature of a state? But during Shannon’s time, we didn’t have languages like Ruby or Python where we could simply type “true” or “false” into Atom or Sublime (those tools were still a long way off).
Shannon showed how a computer could store the “true” or “false” state of a given piece of information by assigning it a value in the binary number system. These pieces of information, or data, were stored as binary digits, or bits. A group of eight bits makes up a byte. As computers advanced, encoding schemes were defined that map pieces of data (i.e. numbers, letters, ASCII text, audio, video) onto the bits comprising those bytes, so that this information can be stored in our computers.
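As a quick sketch of that encoding idea, here is how a single ASCII character maps to one byte of eight bits, assuming the standard ASCII/UTF-8 encoding:

```python
# One ASCII character fits in one byte: eight binary digits.
letter = "A"
code = ord(letter)          # ASCII code point: 65
bits = format(code, "08b")  # the eight bits: '01000001'
print(code, bits)

# A string is just a sequence of such bytes under an ASCII encoding.
print(list("Hi".encode("ascii")))  # [72, 105]
```

The same principle, with more elaborate encodings, covers everything from text to audio to video: it is all patterns of bits in the end.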
CPU vs. RAM
We see the Central Processing Unit (CPU) and Random Access Memory (RAM) appear all the time when we read about our computers. But what do they mean? Remember, our computers would not be nearly the powerful tools that they are without the ability to compute, store and manage information. Claude Shannon gave us a means to encode the information of the world around us into a configuration of components and circuits that would allow our computers to do this.
Our CPU, in a sense, is the brain of our computer — it is where all of our calculations and actions on the information that we have stored on our computer take place. RAM is our computer’s memory.
Internally, CPUs are made from very fine wires that, laid end to end, would stretch a few miles. Even electricity incurs a small delay traveling a distance like that, which is why, past a certain point, more GHz won’t help you much: you need to store the data in a memory somewhere for later retrieval.
Think of the processor as a row of ants, carrying all sorts of stuff to eat towards their hillock. Now at some point, the hillock won’t accommodate the volume of the stuff brought, so the ants start to break the stuff apart and put it in the store room.
The store room is your RAM. The latency (how long the current takes to get from point A to point B) is the speed of an ant, and the GHz figure is how often ants can deposit food (food = data) in the store room. You can imagine that a high frequency paired with a long latency is not good.
The processor core itself is the Queen Ant, who gives the orders to her workers. A worker can also go to the store room and fetch some stored data. This data can be partly digested, or already digested into a usable product (the result of a few operations).
32-Bit vs. 64-Bit
Knowing more about CPUs, we have likely seen multiple instances of downloading software that is either “32-bit” or “64-bit” — what does that mean?
The number of bits in a processor refers to the size of the data types it handles and the width of its registers. The key difference: a 32-bit processor can only address a limited amount of RAM (at most 4 GB), while a 64-bit processor can utilize much more.
A 64-bit processor is capable of representing 2^64 values, including memory addresses, which means it’s able to address over four billion times as much physical memory as a 32-bit processor!
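The arithmetic behind that claim is easy to check: 2^32 is a bit over four billion, and 2^64 is exactly 2^32 times larger.

```python
# The address-space arithmetic behind the 32-bit vs. 64-bit difference.
addresses_32 = 2 ** 32  # 4,294,967,296 distinct addresses (~4 GB, byte-addressable)
addresses_64 = 2 ** 64  # 18,446,744,073,709,551,616 distinct addresses

print(addresses_32)                  # 4294967296
print(addresses_64 // addresses_32)  # 4294967296 -> "over four billion times" as many
```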
Cache, Cache, Cache
Now that we are armed with a bit (no pun intended) more knowledge about our computer’s memory and the difference between that and our CPU, we can discuss another term that we often see when discussing computer memory — cache.
A cache is a small, fast computer memory for keeping copies of data from a larger, slower memory. This can be broken down into three types:
A browser cache is the temporary memory generally used by browsers. It stores a site’s pages, images, and logos on the client machine to reduce bandwidth usage, server load, and perceived lag; when you visit the site a second time, the already-stored data is reloaded and the page loads faster.
A web cache stores copies of documents passing through it; subsequent requests may be satisfied from the cache if certain conditions are met. Google’s cache link in its search results provides a way of retrieving information from websites that have recently gone down, and of retrieving data more quickly than by clicking the direct link.
In the CPU, a cache is a smaller, faster memory that stores copies of data from the most frequently used main-memory locations, reducing the average time to access memory.
There is quite a bit more to this topic, with many more interesting questions that need answering. There are two particular questions that I will leave you with as food for thought:
- If I have decided to store information in my computer, and that information is encoded as a distinct set of bits, how do I prevent someone from reading the information on my computer if the set of bits is known?
- What if the information I am trying to store and work with exceeds the memory capacity of my computer?