Descriptive Statistics — Part 1 — Types of Data

Diogo Rezende
4 min read · Mar 1, 2024


Hello, everyone. Welcome to the first article in this series on descriptive statistics. This series is designed to cover the basic principles of descriptive statistics and provide resources for your analyses. We will start with a brief introduction to statistics and data types. In future articles, we will delve into new topics of descriptive statistics, and at the end of the series, we will conduct an exploratory data analysis together to apply and consolidate the knowledge gained throughout this journey.

Summary:

  • What is Statistics?
  • Descriptive statistics vs Inferential statistics
  • Nominal qualitative variable
  • Ordinal qualitative variable
  • Discrete quantitative variable
  • Continuous quantitative variable
  • Boolean variable

What is Statistics?

Statistics, at its core, is the science of collecting, analyzing, interpreting, and presenting data. It is present in various fields such as medicine, economics, engineering, and computing. Through statistics, we can transform raw data into information and insights for evidence-based decision-making. To use it well, it helps to understand its two main branches: descriptive statistics and inferential statistics.

Descriptive statistics is the branch of statistics that allows us to summarize and organize data in a way that is easily understandable, presenting patterns and trends through measures of central tendency, dispersion, and graphs.
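As a minimal sketch of what "measures of central tendency and dispersion" look like in practice, the snippet below summarizes a small sample with Python's standard `statistics` module. The data values and variable names are made up for illustration; they do not come from any real system.

```python
import statistics

# Hypothetical sample: response times in milliseconds (made-up values).
response_times_ms = [120, 135, 128, 410, 131, 126, 133]

# Measures of central tendency.
mean_ms = statistics.mean(response_times_ms)
median_ms = statistics.median(response_times_ms)

# Measures of dispersion.
stdev_ms = statistics.stdev(response_times_ms)  # sample standard deviation
value_range = max(response_times_ms) - min(response_times_ms)

print(f"mean={mean_ms:.1f}  median={median_ms}  stdev={stdev_ms:.1f}  range={value_range}")
```

Note how the single large value (410) pulls the mean well above the median. Comparing the two is a quick way to spot skewed data.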

Inferential statistics, on the other hand, enters the realm of uncertainty. It goes beyond describing the collected data: its tools and methods allow us to draw conclusions about a whole population, and to make predictions or decisions, based on a sample of that population.

In this series, we will focus on descriptive statistics, diving into its techniques and importance for a wide range of applications. Let’s discover how this powerful tool can help us better understand data.

This first article will address data types and present some examples of each.

Data are representations of information or variables, collected by observation, measurement, or research, and used as a basis for analysis. We can divide data into two categories: qualitative (or categorical) and quantitative (or numerical). Each category has two subdivisions: qualitative data can be nominal or ordinal, while quantitative data can be discrete or continuous.

Nominal qualitative variable: in this type of variable, the categories have no ordering or hierarchy. Example: patients’ blood types in a hospital (A, B, AB, O), which are crucial for ensuring compatibility and safety in blood transfusions.
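Because nominal categories have no order, the natural summaries are counts, proportions, and the most frequent category (the mode). The sketch below illustrates this with a made-up list of blood types, using only the standard library; the data and names are hypothetical.

```python
from collections import Counter

# Hypothetical blood types recorded for nine patients (made-up data).
blood_types = ["A", "O", "B", "O", "AB", "A", "O", "A", "O"]

# Absolute frequency of each category.
counts = Counter(blood_types)

# Relative frequency (proportion) of each category.
relative = {blood_type: n / len(blood_types) for blood_type, n in counts.items()}

# The mode is the only "central" measure that makes sense for nominal data.
mode_type, mode_count = counts.most_common(1)[0]

print(counts)
print(f"most frequent type: {mode_type} ({mode_count} patients)")
```

Sorting these categories or averaging them would be meaningless, which is exactly what distinguishes nominal from ordinal data.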

Ordinal qualitative variable: in this type of variable, the categories have an order or hierarchy, although the distance between them is not defined. Example: satisfaction surveys about products or services, with categories such as “satisfied,” “neutral,” or “dissatisfied.” Such information provides valuable input for innovation and improvements.
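Since ordinal categories can be ranked, they support order-based summaries such as the median, which nominal data cannot. The following sketch assumes a three-level scale and made-up survey answers; the `LEVELS` list and all names are illustrative choices, not a standard API.

```python
# The order of the categories must be stated explicitly:
# alphabetical order would not reflect the real hierarchy.
LEVELS = ["dissatisfied", "neutral", "satisfied"]
rank = {level: position for position, level in enumerate(LEVELS)}

# Hypothetical survey answers (made-up data, odd number of responses).
answers = [
    "satisfied", "neutral", "dissatisfied", "satisfied",
    "satisfied", "neutral", "dissatisfied",
]

# Sort by rank in the hierarchy rather than alphabetically.
ordered = sorted(answers, key=lambda answer: rank[answer])

# With an odd number of responses, the median is the middle element.
median_answer = ordered[len(ordered) // 2]

print(ordered)
print(f"median answer: {median_answer}")
```

The ranks only encode position in the hierarchy. Treating them as real distances (for example, averaging them) is an extra assumption that should be justified case by case.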

Discrete quantitative variable: in this type of variable, the possible values form a countable set, usually the whole numbers that result from counting (0, 1, 2, ...), with no values in between. Example: the number of students enrolled in a school or university. This information can be used to plan resources and manage physical space.

Continuous quantitative variable: in this type of variable, the values belong to the set of real numbers, so the variable can take any value within an interval, usually as the result of a measurement. Example: the response time of a computational system (in milliseconds). This information helps developers and software engineers evaluate performance and optimize the system to improve the user experience.
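One practical consequence of the discrete/continuous distinction is how frequencies are tabulated: discrete values can be counted one by one, while continuous values are usually grouped into intervals first. The sketch below shows both with made-up data; the 10 ms bin width and the `bin_label` helper are arbitrary choices for illustration.

```python
from collections import Counter

# Discrete: number of students enrolled in each class (made-up counts).
enrolled = [28, 30, 28, 31, 30, 28]
enrolled_freq = Counter(enrolled)  # each distinct value gets its own count

# Continuous: response times in milliseconds (made-up measurements).
response_ms = [12.4, 18.9, 25.1, 13.7, 31.6, 22.0, 19.5]

BIN_WIDTH = 10.0  # arbitrary interval width chosen for this example


def bin_label(value: float) -> str:
    """Return the half-open interval [lower, upper) that contains value."""
    lower = int(value // BIN_WIDTH) * BIN_WIDTH
    return f"[{lower:g}, {lower + BIN_WIDTH:g})"


# Repeated exact values are rare in continuous data, so we count per interval.
binned = Counter(bin_label(x) for x in response_ms)

print(enrolled_freq)
print(binned)
```

Counting each exact response time would give a table in which almost every value appears once, which is why the grouping step is needed.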

For each type of variable, there are appropriate techniques to summarize the information. However, we will see that techniques used in one case can be adapted for others.

To conclude, one remark about qualitative variables is worth making. In some situations, we can assign numerical values to the qualities or attributes (or classes) of a qualitative variable and then analyze it as if it were quantitative, provided the results of the procedure remain interpretable.

There is a type of qualitative variable for which this quantification is very useful: the so-called boolean (or binary) variable, whose only possible values are 0 and 1. Suppose we work in a hospital and need to record whether a person has a certain disease. A person diagnosed with the disease receives the value 1; a person diagnosed as not having it receives the value 0.
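The 0/1 coding is useful because simple arithmetic on it stays interpretable: the sum is the number of people with the disease, and the mean is the proportion of people with it. A minimal sketch with made-up diagnoses for ten hypothetical patients:

```python
# 1 = diagnosed with the disease, 0 = diagnosed as not having it (made-up data).
has_disease = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]

# The sum of a 0/1 variable counts the positive cases.
n_positive = sum(has_disease)

# The mean of a 0/1 variable is the proportion of positive cases.
proportion = n_positive / len(has_disease)

print(f"{n_positive} of {len(has_disease)} patients ({proportion:.0%}) have the disease")
```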

References:

Bussab, W., Morettin, P. — Estatística Básica — 10.ed — Editora Saraiva.

Morettin, P., Singer, J. — Estatística e Ciência de Dados — 1.ed — LTC.

BIAGGI, Renata. EBA — Estatística do Básico ao Avançado. Available at: https://www.renatabiaggi.com/eba
