Data and Statistics

Data (Statistics for Business and Economics)

Vimaljeet Singh
5 min read · Jul 14, 2024

DATA
Data are the facts and figures collected, analyzed, and summarized for presentation and interpretation.

DATA SET
All the data collected in a particular study are referred to as the data set for the study.

ELEMENTS
Entities on which data are collected.

VARIABLE
Characteristic of interest for the element.

Measurements collected on each variable for every element in a study provide the data.

OBSERVATION
The set of measurements obtained for a particular element.
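
To make these terms concrete, here is a minimal Python sketch of a tiny data set. The company names, exchanges, and prices are invented for illustration and do not come from the textbook.

```python
# Hypothetical data set: each dict holds the measurements for one element.
data_set = [
    {"company": "Alpha Co", "exchange": "NYSE", "share_price": 42.5},
    {"company": "Beta Inc", "exchange": "NASDAQ", "share_price": 17.0},
    {"company": "Gamma Ltd", "exchange": "NYSE", "share_price": 88.25},
]

# Elements: the entities on which data are collected.
elements = [row["company"] for row in data_set]

# Variables: the characteristics of interest for each element.
variables = [key for key in data_set[0] if key != "company"]

# Observation: the set of measurements for one particular element.
observation = data_set[1]

# Total number of data values = number of elements x number of variables.
n_values = len(elements) * len(variables)
```

Here there are 3 elements and 2 variables, so the data set contains 6 data values.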

Image taken from Statistics for Business and Economics (Anderson, Sweeney, Williams)

SCALES OF MEASUREMENT

NOMINAL SCALE

When the data for a variable consist of labels or names used to identify an attribute of the element, the scale of measurement is considered a nominal scale. The data points fall into distinct categories without any quantitative significance or order. Nominal data can be numeric or non-numeric.

Examples:
Eye Color: Blue, brown, green, hazel.
Employee ID Numbers: Assigned numeric IDs for payroll and administrative purposes. These IDs are used for identification rather than measurement.
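
As a rough sketch of what nominal data allow, the Python snippet below only tests labels for equality and counts them. The sample eye colors are made up for illustration.

```python
from collections import Counter

# Hypothetical sample of a nominal variable.
eye_colors = ["brown", "blue", "brown", "green", "hazel", "brown", "blue"]

# Counting how often each label occurs is a valid operation on nominal data.
counts = Counter(eye_colors)

# The mode (most frequent label) is a meaningful summary.
mode_color, mode_count = counts.most_common(1)[0]

# Equality checks are meaningful; "greater than" comparisons are not,
# because the labels carry no order.
same_color = eye_colors[0] == eye_colors[2]
```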

ORDINAL SCALE

Ordinal data have the properties of nominal data and, in addition, the order or rank of the data is meaningful. The categories can be logically ordered along some underlying continuum, even though the differences between categories may not be uniform or measurable. Ordinal data can be numeric or non-numeric.

Examples:
Income Level: Low income, middle income, high income.
Performance Ratings: Poor, below average, average, above average, excellent.
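
One way to work with ordinal data in code is to attach a rank to each label, as in this small Python sketch. The list of reviews is an invented example; the rating labels are the ones listed above.

```python
# The agreed order of the categories, from lowest to highest.
ratings_order = ["poor", "below average", "average", "above average", "excellent"]

# Map each label to its position so the labels can be compared and sorted.
rank = {label: position for position, label in enumerate(ratings_order)}

# Hypothetical performance reviews.
reviews = ["average", "excellent", "poor", "above average", "average"]

# Sorting by rank is meaningful for ordinal data.
sorted_reviews = sorted(reviews, key=rank.__getitem__)

# The median (middle value of an odd-sized sorted sample) is also meaningful.
median_review = sorted_reviews[len(sorted_reviews) // 2]

# The gap between ranks is NOT a measurement: "excellent" minus "average"
# has no fixed unit, so averages of the ranks should be treated with caution.
```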

Image taken from lbelzile.github.io

INTERVAL SCALE

Interval data have the properties of ordinal data and, in addition, the interval between values is expressed in terms of a fixed unit of measure. Addition and subtraction are meaningful, so differences between values are interpretable. However, the zero point on an interval scale is arbitrary rather than a true zero, so ratios of measurements are not meaningful (40°C is not twice as hot as 20°C). Interval data are always numeric.

Examples:
Temperature: The difference between 20°C and 30°C is the same as between 30°C and 40°C, but 0°C does not represent the absence of temperature.
Standardized Test Scores: Scores like SAT or GRE scores, where the difference between a score of 500 and 600 is the same as between 600 and 700, but a score of 0 does not indicate “no ability.”
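
A quick Python check shows why ratios fail on an interval scale. The sketch converts the same temperatures from Celsius to Fahrenheit: equal differences stay equal, but the ratio changes with the unit, so it cannot be a property of the temperatures themselves.

```python
def c_to_f(celsius):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32

# Equal differences in Celsius...
diff_c_low = 30 - 20
diff_c_high = 40 - 30

# ...remain equal differences in Fahrenheit.
diff_f_low = c_to_f(30) - c_to_f(20)
diff_f_high = c_to_f(40) - c_to_f(30)

# The ratio, however, depends on the unit chosen.
ratio_c = 40 / 20                  # 2.0 in Celsius
ratio_f = c_to_f(40) / c_to_f(20)  # 104 / 68 in Fahrenheit, about 1.53
```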

RATIO SCALE

Ratio data have all the properties of interval data and, in addition, the ratio of two values is meaningful. The full range of arithmetic operations applies: addition, subtraction, multiplication, and division. This scale requires a true zero: a value of zero indicates that nothing exists for the variable at that point. Ratio data are always numeric.

Examples:
Height: A height of 180 cm is twice a height of 90 cm, and 0 cm represents the absence of height.
Money: A balance of $200 is twice as much as $100, and 0 dollars represents having no money.
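
By contrast with an interval scale, a ratio survives a change of unit. The Python sketch below converts the heights from the example to inches; the only assumption added is the standard conversion factor of 2.54 cm per inch.

```python
import math

CM_PER_INCH = 2.54

tall_cm, short_cm = 180, 90

# Ratio measured in centimetres.
ratio_cm = tall_cm / short_cm

# The same ratio measured in inches.
ratio_in = (tall_cm / CM_PER_INCH) / (short_cm / CM_PER_INCH)

# Because zero means "no height" in both units, the ratio is unchanged.
ratios_agree = math.isclose(ratio_cm, ratio_in)
```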

Image created by author

CATEGORICAL DATA

Data that can be grouped by specific categories. Categorical data use either the nominal or ordinal scale of measurement.

QUANTITATIVE DATA

Data that uses numeric values to indicate how much or how many. Quantitative data are obtained using either the interval or ratio scale of measurement.

CATEGORICAL VARIABLE

  • A categorical variable is a variable with categorical data.
  • The statistical analysis appropriate for a particular variable depends upon whether the variable is categorical or quantitative. If the variable is categorical, the statistical analysis is limited.
  • We can summarize categorical data by counting the number of observations in each category or by computing the proportion of the observations in each category. However, even when the categorical data are identified by a numerical code, arithmetic operations such as addition, subtraction, multiplication, and division do not provide meaningful results.
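
The points above can be sketched in a few lines of Python. The numeric coding (1 = lemon, 2 = strawberry, 3 = mango) and the sample responses are hypothetical.

```python
from collections import Counter

# Hypothetical numeric codes for a categorical variable.
labels = {1: "lemon", 2: "strawberry", 3: "mango"}
coded_responses = [1, 3, 1, 2, 1, 3, 2, 1]

# Meaningful summary 1: the number of observations in each category.
counts = Counter(labels[code] for code in coded_responses)

# Meaningful summary 2: the proportion of observations in each category.
n = len(coded_responses)
proportions = {flavor: count / n for flavor, count in counts.items()}

# Not meaningful: the "average code" is a number that matches no flavor.
mean_code = sum(coded_responses) / n  # 1.75, which is neither lemon nor strawberry
```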
Image taken from Quora

QUANTITATIVE VARIABLE

  • A quantitative variable is a variable with quantitative data.
  • Arithmetic operations provide meaningful results for quantitative variables.
  • Quantitative data may be added and then divided by the number of observations to compute the average value. This average is usually meaningful and easily interpreted. In general, more alternatives for statistical analysis are possible when data are quantitative.
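
As a small illustration, the average described above can be computed by hand or with Python's standard `statistics` module. The daily sales figures are invented.

```python
import statistics

# Hypothetical quantitative data: cups sold on five days.
cups_sold = [12, 15, 9, 20, 14]

# Add the values, then divide by the number of observations.
mean_by_hand = sum(cups_sold) / len(cups_sold)

# The standard library offers the same summary plus several others.
mean_cups = statistics.mean(cups_sold)
median_cups = statistics.median(cups_sold)
value_range = max(cups_sold) - min(cups_sold)
```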

CROSS-SECTIONAL DATA

  • Data collected at the same or approximately the same point in time.
  • Cross-sectional data is like taking a snapshot or picture of a group of things at the same time.
Image from Scribbr
  • Example: Imagine you have a lemonade stand, and you want to know what different people think about your lemonade flavors right now. So, you ask all the customers who visit your stand today what their favorite flavor is: some say lemon, some say strawberry, and others say mango.
  • Explanation: In this case, you’re collecting cross-sectional data because you’re gathering information from different customers all at the same time (today). You’re not interested in how their preferences might change over time; you just want to know what flavors are popular right now.
  • Uses: cross-sectional data give a quick understanding without having to wait or observe over time, and they make comparison easy because different groups or things can be compared all at once.
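
Continuing the lemonade example, here is a minimal Python sketch of a cross-sectional comparison. The customer groups ("adult", "child") and the answers are assumptions added for illustration; every answer is collected on the same day.

```python
from collections import defaultdict

# Hypothetical snapshot: (customer group, favorite flavor), all gathered today.
answers_today = [
    ("adult", "lemon"), ("child", "strawberry"), ("adult", "mango"),
    ("child", "strawberry"), ("adult", "lemon"), ("child", "mango"),
]

# Tally favorite flavors separately for each group.
by_group = defaultdict(lambda: defaultdict(int))
for group, flavor in answers_today:
    by_group[group][flavor] += 1

# Compare the groups at this single point in time.
top_flavor = {group: max(tally, key=tally.get) for group, tally in by_group.items()}
```

There is no time dimension here: the result describes today's preferences only.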

TIME SERIES DATA

  • Data collected over several time periods.
  • Time series data is like watching how something changes over time.
Image from Scribbr
  • Example: Now, instead of asking customers their favorite flavor just once, you decide to keep track of how many cups of each flavor you sell every day for a month. You write down the number of lemon, strawberry, and mango cups sold each day.
  • Explanation: Here, you’re collecting time series data because you’re interested in seeing how the sales of each flavor change over time (every day for a month). By keeping track of this information, you can see if some flavors become more popular as the days go by or if there are certain days when sales are higher or lower.
  • Uses: time series data help with spotting trends, since you can see whether something is getting better, worse, or staying the same over time, and with predicting the future, since past changes inform guesses about what might happen next.
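
A rough Python sketch of the same idea: one week of hypothetical lemon sales, the day-to-day change, and a 3-day moving average. Using the last moving average as a guess for the next day is a deliberately naive assumption for illustration, not a method taken from the textbook.

```python
# Hypothetical time series: lemon cups sold on days 1 through 7.
lemon_cups = [10, 12, 11, 15, 16, 18, 17]

# Day-over-day change shows whether sales rose or fell.
daily_change = [later - earlier for earlier, later in zip(lemon_cups, lemon_cups[1:])]

# A 3-day moving average smooths out daily noise to reveal the trend.
window = 3
moving_avg = [
    sum(lemon_cups[i:i + window]) / window
    for i in range(len(lemon_cups) - window + 1)
]

# Naive guess for day 8: the most recent moving average.
naive_forecast = moving_avg[-1]
```

The moving average rises from 11.0 to 17.0 over the week, which suggests an upward trend in this made-up data.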

Credits: Statistics for Business and Economics 11e by Anderson, Sweeney and Williams

