Behind Your Searches

Winston Jian
Published in CISS AL Big Data
Dec 15, 2021
Streaming Big Data Analytics (Patni, 2020)

Have you ever set out to order something on an online platform like Amazon or Taobao and found yourself falling down a rabbit hole? Or bought a novel and noticed that books of the same genre keep showing up afterward?

These are big data applications that surface on various platforms and cater to our convenience at an individual level. However, big data has more significant implications for society that are worth analyzing. The co-directors of the MIT Initiative on the Digital Economy, Andrew McAfee and Erik Brynjolfsson, wrote, “As the tools and philosophies of big data spread, they will change long-standing ideas about the value of experience, the nature of expertise, and the practice of management.”

At the heart of big data lie five V’s. There is volume, the colossal amount of data that big data systems draw on to perform analytics. There is velocity, the humanly impossible speed at which that data is processed. There is variety, the extensive range of formats that big data systems analyze, including structured numeric data in traditional databases, unstructured data from emails, videos, audio, and financial transactions, and semi-structured data. There is veracity, the quality of the data, which determines how accurate big data analytics can be.

The last, and perhaps the most important, is variability, the unpredictability of societal trends that gives companies an incentive to understand how societies are changing. The variability factor in big data has revolutionized how we study societal concerns. Where such insight used to be available only through censuses or surveys, big data presents an alternative: search queries.

Twenty years ago, our sense of what was relevant to our day-to-day lives was confined to the literature we read, the people we knew, and the media around us. However, as search engines collect queries from millions of users worldwide into an extensive database, that database becomes an enormous repository of the questions people ask and the statements they try to make.

It’s shocking to see that when one types “Why am I” into Google, the top three suggestions are “why am I so tired,” “why am I always so tired,” and “why am I peeing so much,” as shown in Figure 1. Now, you might wonder if those were my past searches. Fortunately, yet unfortunately, no. These are search queries that millions of individuals worldwide have performed.

Figure 1: Screenshot of Google Search Queries

Maybe it was never a concern to you. And perhaps no one previously realized that this was a societal concern. But because these queries appeared among the top suggestions in Google as of September 2021, they indicate that tiredness and the ability to focus deserve a close look. Even though the issue is less relevant to those who are always energetic, it reveals an essential facet of the way many people think, reason, and pose questions about the world around them.

So how does big data play a role in this? The first step is data acquisition. Many data sources produce staggering amounts of raw data, accumulating petabytes per day, and software filters this data and generates metadata that describes it. After that step comes information extraction, in which “big data processes pull the required information from the underlying sources and express it in a structured form suitable for analysis” (Labrinidis and Jagadish, 2017).
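To make the extraction step concrete, here is a minimal Python sketch. It assumes a hypothetical raw log format of one `timestamp<TAB>query` pair per line; the `QueryRecord` type, the `extract_records` function, and the sample lines are all invented for illustration and do not describe how any real search engine stores its data.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, Iterator


@dataclass(frozen=True)
class QueryRecord:
    """One search query in a structured form suitable for analysis."""
    timestamp: datetime
    query: str


def extract_records(raw_lines: Iterable[str]) -> Iterator[QueryRecord]:
    """Filter raw log lines and yield structured records.

    Assumed (hypothetical) line format: "<ISO timestamp>\t<query text>".
    """
    for line in raw_lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) != 2:
            continue  # drop lines that do not match the assumed format
        stamp, query = parts
        try:
            timestamp = datetime.fromisoformat(stamp)
        except ValueError:
            continue  # drop lines whose timestamp cannot be parsed
        # Normalize case and whitespace so identical queries compare equal.
        query = " ".join(query.lower().split())
        if query:
            yield QueryRecord(timestamp, query)


# Invented sample data: two usable lines and two that get filtered out.
raw_log = [
    "2021-09-01T08:15:00\tWhy am I so tired",
    "corrupted line without a tab",
    "2021-09-01T08:16:30\t  why am I   SO tired ",
    "not-a-date\twhy am i peeing so much",
]
records = list(extract_records(raw_log))
for record in records:
    print(record.timestamp.isoformat(), record.query)
```

Only the two well-formed lines survive, and both normalize to the same query text, which is what makes later counting possible.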

Through data acquisition and information extraction, big data reveals societal trends that represent collective concerns and statements. The search queries don’t define the population, but they show a sign, or a pattern, within the general masses. They also capture the variability of mass populations, which may appear unpredictable. Perhaps we don’t understand why people search about their sleepiness, but big data’s embrace of variability has revealed trends that were once inaccessible. This variability, combined with the unprecedented volume, velocity, variety, and veracity of the data, fuels the analytics that help data scientists interpret societal trends.
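The kind of pattern described above can be sketched as a simple frequency count over already-normalized queries. This is only an illustration: the `top_queries` function and the sample counts are made up, and real search suggestions depend on far more than raw frequency.

```python
from collections import Counter
from typing import Iterable, List


def top_queries(queries: Iterable[str], prefix: str, k: int = 3) -> List[str]:
    """Return the k most frequent queries that start with the given prefix."""
    counts = Counter(q for q in queries if q.startswith(prefix))
    return [query for query, _ in counts.most_common(k)]


# Invented sample; the counts are not real search statistics.
sample = (
    ["why am i so tired"] * 3
    + ["why am i always so tired"] * 2
    + ["why am i peeing so much"]
    + ["how to sleep better"] * 2
)

suggestions = top_queries(sample, "why am i")
print(suggestions)
```

The count says nothing about why people are tired, only that many of them are asking, which is the “what” that the analytics surface.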

Big data has revolutionized the nature of information and how we perceive it. Its contribution to society has demonstrated that we don’t always need to start with the “why.” Instead, determining the “what” is just as crucial. In pursuing the “what,” we may gain invaluable insights into the “why.”

References

Labrinidis, Alexandros and H. V. Jagadish. “Challenges and Opportunities with Big Data.” vol. 10, 2017, pp. 2032–33, doi:10.14778/3055540.

McAfee, Andrew and Erik Brynjolfsson. “Big Data: The Management Revolution.” Harvard Business Review, October 2012, pp. 1–9, http://tarjomefa.com/wp-content/uploads/2017/04/6539-English-TarjomeFa-1.pdf.

Sagiroglu, Seref and Duygu Sinanc. “Big Data: A Systematic Review.” Advances in Intelligent Systems and Computing, vol. 558, 2018, pp. 501–06, doi:10.1007/978-3-319-54978-1_64.
