Photo by ev on Unsplash

A Complete Guide On

Introduction to Data Visualization

Data Analyst Career Guide / Data Visualization Fundamentals

Manvendra Bhadauria
Published in CodeX
7 min read · Aug 20, 2021


Photo by Bogdan Karlenko on Unsplash

I believe that, in the not-too-distant future, data will be the new oxygen that powers the globe, and that we should recognize its importance and enter this field before the Big Data revolution arrives. Data science is a vast and still largely unexplored subject with numerous opportunities. In this article, I discuss data visualization, which accounts for a significant share of a Data Analyst’s work and matters in other roles too. I hope this article serves as a kick-start for anyone interested in learning about data visualization or building a career as a Data Analyst.

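The article covers the following topics, in order:

  • What is data visualization & its types?
  • Why data visualization & its importance?
  • Anscombe’s Quartet example
  • Different ways of data visualization
  • Facts and sources for kickstarting your journey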

What is data visualization & its types?

Photo by Chris Liverani on Unsplash

Data visualization refers to a set of techniques for communicating data insights visually, using charts, graphs, and maps to portray data. It’s the art of presenting data in such a way that non-technical people can understand it, allowing data scientists and analysts to communicate with their end users.

Broadly, data visualization falls into two categories:

  • Exploratory Analysis: This is done during data analysis to find insights. It is useful when you have a large amount of data and aren’t sure what’s in it. Visualizations created for this purpose do not need to be perfect. Simply put, you’re looking for patterns.
  • Explanatory Analysis: This is done after you have found an insight. It is suitable when you already know what the data says and are trying to convey that idea to someone else. Visualizations created for this purpose must be accompanied by a narrative that leads the reader to an answer to the question being asked.

Why data visualization & its importance?

Photo by NASA on Unsplash

We use visual components like charts, graphs, and maps to assess large amounts of data and make decisions based on it. Data visualization tools make trends, anomalies, and patterns in data visible and understandable. Because humans absorb significantly more information through vision than through the other senses, visualization helps us handle more complex information and retain facts better. In layman’s terms, the graphical methods that data visualization provides make it easy for anyone, including non-technical people, to understand what the data indicates.

Importance of Data Visualization :

  • Helps you grasp information immediately.
  • Lets users readily interpret huge datasets by simplifying them.
  • Reveals trends that aren’t immediately obvious.
  • Shows which areas require improvement or attention.
  • Provides new insights into the data.
  • Makes outliers quick to detect.

Anscombe’s Quartet Example :

Photo by Fernando Reyes on Unsplash

This idea can be demonstrated using Anscombe’s quartet. It consists of four datasets with nearly identical simple descriptive statistics but very different distributions and very different appearances when graphed. Each dataset has two columns and eleven rows, and we calculate the sum, average, and standard deviation for each column of each table. The four tables that make up Anscombe’s quartet are shown below.

Image By Author

As the tables above show, the X fields for datasets I, II, and III have the same values, while the Y fields differ from dataset to dataset. Note that even though the values are different, the SUM, AVG, and STDEV come out nearly the same for every dataset. So, do you think these points will look the same when plotted? Let’s see what happens when we plot them.
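If you want to check the summary statistics yourself, here is a minimal Python sketch using only the standard library. The numbers are the standard published values of Anscombe’s quartet rather than values copied from the tables above, and the names ANSCOMBE and summarize are my own choices for illustration. The standard deviation is the sample standard deviation, which is what Excel’s STDEV computes.

```python
from statistics import mean, stdev

# Anscombe's quartet, using the standard published values (Anscombe, 1973).
# Datasets I, II and III share the same X column.
X_SHARED = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
ANSCOMBE = {
    "I": (X_SHARED, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X_SHARED, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X_SHARED, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def summarize(values):
    """Return (sum, mean, sample standard deviation), each rounded to 2 decimals."""
    return round(sum(values), 2), round(mean(values), 2), round(stdev(values), 2)


# Print the three statistics for the X and Y column of every dataset.
for name, (xs, ys) in ANSCOMBE.items():
    print(f"Dataset {name}: X={summarize(xs)}  Y={summarize(ys)}")
```

Every X column sums to 99 with a mean of 9, and every Y column sums to roughly 82.5 with a mean of about 7.50 and a standard deviation of about 2.03, so the numbers alone give no hint that the datasets differ.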

Image: Anscombe.svg by Schutz, modified by Avenue

So, even though the datasets have nearly identical SUM, AVG, and STDEV values, they look very different when plotted: the scatter plots above reveal four different kinds of relationship between the X and Y points. This example demonstrates the value of data visualization and how it helps uncover hidden relationships in data.
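To reproduce this effect without any plotting library, here is a rough sketch that draws each dataset as a character-cell scatter plot in the terminal. The helper name text_scatter, the grid size, and the axis ranges are my own assumptions for illustration, and the data are the standard published values of the quartet. For real work you would reach for a proper plotting library such as Matplotlib.

```python
# Anscombe's quartet, using the standard published values (Anscombe, 1973).
X_SHARED = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
ANSCOMBE = {
    "I": (X_SHARED, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X_SHARED, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X_SHARED, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def text_scatter(xs, ys, width=40, height=12, x_max=20.0, y_max=14.0):
    """Return the rows of a rough text scatter plot; both axes start at 0."""
    grid = [[" "] * width for _ in range(height)]
    for x, y in zip(xs, ys):
        # Scale each point to a cell; points that share a cell overwrite each other.
        col = round(x / x_max * (width - 1))
        row = round(y / y_max * (height - 1))
        grid[height - 1 - row][col] = "*"  # grid row 0 is the top line of the plot
    return ["|" + "".join(cells) for cells in grid] + ["+" + "-" * width]


# Draw the four datasets one after another.
for name, (xs, ys) in ANSCOMBE.items():
    print(f"Dataset {name}")
    print("\n".join(text_scatter(xs, ys)))
    print()
```

Even at this crude resolution the four shapes are clearly different: dataset II bends into a curve, dataset III is a straight line with a single outlier, and dataset IV is a vertical stack at x = 8 with one distant point at x = 19.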

Different ways of Data Visualization :

Photo by Luana Azevedo on Unsplash

As technology has advanced, the tools available for data visualization have evolved rapidly. Data visualization software now covers everything from interactive charting libraries to 3D graphs and even real-time data interpretation. In this section, we go through the most widely used ways of visualizing data.

  • Python / R : Python and R have rich ecosystems of plotting libraries, which is why they are widely used for data visualization in data science. Libraries such as Matplotlib, Seaborn, and Plotly can be used to plot polished graphics for data analysis, and they are generally well optimized for working with large datasets.
  • Excel : Microsoft Excel was first released in 1985, and it has since become popular among Data Analysts thanks to its many functions and easy-to-use graphical interface. It requires no programming knowledge, but the downside is that it is memory-intensive and cannot handle very large datasets.
  • Tableau & Power BI : Data Analysts use Tableau and Power BI, two popular analytics and business intelligence products made by Salesforce and Microsoft, respectively, for data visualization and analysis. They are the market leaders and compete head-to-head for a large customer base. Both tools are very dynamic and offer drag-and-drop functionality, which makes them simple to use, and both are capable of handling big datasets.
  • Google Charts : Google Charts is powered by HTML5 and SVG. It is a powerful, easy-to-use, and interactive data visualization tool for browsers and mobile devices. It is designed to work on Android and iOS and offers comprehensive cross-browser compatibility, including earlier versions of Internet Explorer, and it includes a large gallery of charts that you can easily customize to meet your needs.
  • QlikView : A Tableau competitor that provides comprehensive business intelligence, analytics, and enterprise reporting features in addition to data visualization. Qlik Technologies Inc. is the company behind it, and it has recently attracted a large number of new customers, making it a serious competitor to Tableau.
  • Infogram : A web-based data visualization and infographics application that lets users create and share digital charts, infographics, and maps. It offers a user-friendly WYSIWYG (What You See Is What You Get) editor that converts data into shareable infographics. It also lets you connect your visualizations and infographics to live data in real time.

Many other tools are in common use, but these are the most well-known ones, so I’ve listed them here. I’ve also attached the Gartner Magic Quadrant for reference, to give a broader perspective on these tools and their usability.

Reference Image by Gartner Inc

Facts and Sources for kickstarting your Journey :

Photo by Mel Poole on Unsplash

Did you know that, according to Indeed.com, the average salary for data analysts in the United States is currently $70,033 per year? By comparison, according to Google, the average income of a US citizen is $31,133 (2019). I hope this catches your attention and makes you curious to learn more about data visualization and the Data Analyst profession.

Below I have mentioned some of the must-have skills to become a Data Analyst :

  • Critical Thinking
  • Microsoft Excel / Tableau / Power BI
  • Structured Query Language (SQL)
  • R or Python (I will recommend Python)
  • Data Visualization & Presentation Skills

Some free resources I recommend for starting your journey to becoming a Data Analyst: W3Schools for SQL, and the many full, free courses on YouTube for Tableau, Power BI, or Excel. The same goes for Python or R.

Overview: That was it from my side on the topic “Introduction to Data Visualization”. I hope you learned something interesting and want to kick-start your learning journey to become a Data Analyst. All the skills mentioned above are essential, and it will take some time to master them.

Editor’s Note: First and foremost, thank you for sticking with the article until the end. Happy Learning.
