What is the first thing that comes to your mind when you hear the word ‘blob’? Without any context, it sure sounds like an obscure, mystical creature you are playing against in a game of Minecraft. I would bet we have all seen some horrifying depiction of a blob on the web somewhere. However, in the world of data, blobs are extremely powerful and have transformed the way data is represented and transmitted.
Anything that represents information about an entity is data. Our name, address, cell phone number, genetic information and family tree all represent information about us, so all of these fields can be called data. As we begin to think about using information collected from entities, we must also think carefully about representing it accurately. Consider, for example, a mobile app which stores the first name (data), last name (data) and phone number (data) of people (entities) I know. Every morning, this app goes through its list of people and sends each of them a greeting on their phone. The success of this app depends on each data field containing what I expect it to contain. I expect the first name and last name to be strings, but I want the phone number to be an integer. What if my app could not store data correctly and some of the names were stored as numbers instead of strings? That would break the logic and the functionality of the app. This gives you a very basic idea of why data representation matters. As we scale up to bigger applications with complex architectures, data efficiency becomes crucial, and accurate data representation is the first step towards it.
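To make the greeting app a little more concrete, here is a minimal sketch in Python. The names `Contact`, `greeting` and `try_make_contact` are made up for this illustration, and the phone number is kept as an integer only because that is the choice described above. The idea is simply that each field is checked when a contact is created, so a name stored as a number is caught before it can break the greeting logic.

```python
from dataclasses import dataclass


@dataclass
class Contact:
    """One person in the greeting app (hypothetical structure for illustration)."""
    first_name: str
    last_name: str
    phone_number: int

    def __post_init__(self):
        # Dataclass annotations are not enforced at runtime, so check them here.
        if not isinstance(self.first_name, str):
            raise TypeError("first_name must be a string")
        if not isinstance(self.last_name, str):
            raise TypeError("last_name must be a string")
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(self.phone_number, bool) or not isinstance(self.phone_number, int):
            raise TypeError("phone_number must be an integer")


def greeting(contact: Contact) -> str:
    """Build the morning message; this only works if first_name is text."""
    return "Good morning, " + contact.first_name + "!"


def try_make_contact(first_name, last_name, phone_number):
    """Return a Contact, or None when a field has the wrong type."""
    try:
        return Contact(first_name, last_name, phone_number)
    except TypeError:
        return None
```

With this in place, `try_make_contact(42, "Lovelace", 5550100)` returns `None` instead of quietly storing a number where a name should be.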
Where do blobs fit in when we talk about data and representation? Before we get into that, let’s look at what exactly a blob is. Blob is an acronym for Binary Large Object. Rearranging the terms for a better understanding, a blob is nothing but a large binary object. Unlike an integer, which represents a number, or a string, which represents a sequence of characters, a blob is a stream of raw binary data which represents a “large” object (entity). We understand what ‘Binary’ and ‘Large’ each mean individually. But what do they mean in the context of an object?
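A small Python sketch may help to see the difference between these three kinds of values. Python has no type called "blob"; its built-in `bytes` type is used here as a stand-in for raw binary data, and the helper `looks_like_text` is a made-up name for this illustration. The eight bytes in `raw` are the signature that every PNG image file starts with.

```python
number = 42      # an integer represents a number
text = "blob"    # a string represents a sequence of characters

# Raw binary data: the 8-byte signature found at the start of a PNG file.
raw = bytes([0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A])

# A string can always be turned into bytes and back again.
encoded = text.encode("utf-8")


def looks_like_text(data: bytes) -> bool:
    """Return True if the bytes can be decoded as UTF-8 text."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Raw binary data is usually inspected as hexadecimal rather than as characters.
raw_as_hex = raw.hex()
```

Here `looks_like_text(encoded)` is true while `looks_like_text(raw)` is false: the raw bytes are not characters at all, they only make sense to a program that knows they belong to an image.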
Going back to data: anything can represent information about entities, and therefore anything could be data. Text, numbers, images, audio: everything. We already have ways of representing and working with numerical data (integers, floats) and textual data (characters and lists of characters), but how do we represent images? Or audio files? We should be able to store them in a way which accurately represents each of these types and lends itself easily to further analysis.
Multimedia files (audio, video, GIFs) are produced in large quantities these days, thanks to all the cool video streaming platforms and smart home devices. Since our machines cannot yet understand multimedia files as they are, the files need to be converted into a machine-understandable format before they can be processed and/or transmitted between systems. Stay with me, we are finally about to get to the end of understanding the fuss around blobs. What solution came along to help store huge “blobs” of data in the form of multimedia files? You guessed it! BLOBs!
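To get a feel for what "audio in binary form" means, here is a rough sketch that builds a very short sound entirely in memory and returns it as raw bytes. It uses Python's standard `wave` module; the function name `make_tone_blob` and the chosen tone, duration and sample rate are assumptions made up for this example, not something a real recording app would necessarily use.

```python
import io
import math
import struct
import wave


def make_tone_blob(frequency_hz=440.0, duration_s=0.1, sample_rate=8000) -> bytes:
    """Return a short sine tone as the raw bytes of a WAV file."""
    n_samples = round(duration_s * sample_rate)
    frames = bytearray()
    for i in range(n_samples):
        # One 16-bit sample of a sine wave at the requested frequency.
        sample = int(32767 * math.sin(2 * math.pi * frequency_hz * i / sample_rate))
        frames += struct.pack("<h", sample)  # signed 16-bit, little-endian

    # Write the WAV file into an in-memory buffer instead of onto disk.
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(1)            # mono
        wav_file.setsampwidth(2)            # 2 bytes per sample
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(bytes(frames))
    return buffer.getvalue()


blob = make_tone_blob()
```

The value in `blob` is just a sequence of bytes. It starts with the characters `RIFF`, which mark a WAV file, and the rest is the sound itself written as numbers.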
So blobs represent multimedia files in a binary format. Representing and storing images and audio files has become really easy with blobs. An application which lets users upload an image and get a likability score for it, or a simple speech-to-text service, will transfer data from the front-end (which the user interacts with) to the back-end (where the magic actually happens) in the form of a blob (or multiple blobs). In fact, for applications which continuously stream data from the front-end to the back-end (an interactive speech-to-text service, for example), the audio is converted into not one but several blobs. These constituent blobs make up the entire blob, which is nothing but the entire audio file in binary form.
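The idea of several small blobs adding up to one big blob can be sketched in a few lines of Python. The function names `split_into_blobs` and `join_blobs` are made up for this illustration, and the fixed chunk size is an assumption to keep things simple: a real streaming app would more likely cut the audio by time than by a fixed number of bytes.

```python
from typing import Iterable, List


def split_into_blobs(data: bytes, chunk_size: int) -> List[bytes]:
    """Cut one large blob into consecutive smaller blobs of at most chunk_size bytes."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def join_blobs(chunks: Iterable[bytes]) -> bytes:
    """Put the smaller blobs back together, in order, into the full blob."""
    return b"".join(chunks)


# Stand-in for a recorded audio file: 1024 arbitrary bytes.
audio = bytes(range(256)) * 4

chunks = split_into_blobs(audio, 100)   # what the front-end would send, piece by piece
rebuilt = join_blobs(chunks)            # what the back-end would reassemble
```

As long as the pieces arrive in order, `rebuilt` is byte-for-byte identical to `audio`, which is why the back-end can treat the joined chunks as the complete file.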
In the next part of this story, we will look at what blobs look like in real applications, how blobs are transferred between the front-end and the back-end, and how to make use of the data that a blob represents. We will look at a web app where the user can record audio and get synonyms for the nouns and adjectives in the speech. Here, data transfer between the front-end and the back-end happens through blobs. More fun things to come. Stay tuned, folks!