Statistics: Classification of Data
Statistics is a branch of mathematics that is widely regarded as a prerequisite for a deeper understanding of machine learning. In almost every area of human activity, the scientific method has proven effective for solving problems and improving performance, and it relies heavily on statistical methods. Statistical methods are also commonly used in business practice, e.g. to forecast demand for goods and services or to determine the most efficient method of operation. Actuaries use statistical methods to assess risk levels and set premium rates for the insurance and pension industries. Statistics distinguishes several types of data.
Data can be described as a collection of values, either qualitative or quantitative. The values can be numbers, measurements, scale readings, ranks, or purely qualitative labels. Data points can be classified into four broad categories: Nominal, Ordinal, Interval, and Ratio.
Nominal data represent qualitative information without order. The values act as labels that classify the data. For example, gender could be recorded as Men, Women, or Transgender. It is important to note the absence of order: no category ranks above another. In an exam, gender doesn’t matter; the person who puts in more effort will ace the examination. Nominal data is also called unordered categorical data.
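Because nominal values are only labels, the summaries that make sense are counts and the most frequent label (the mode). The following is a minimal sketch using Python's standard library; the sample list is made up for illustration and is not data from this article.

```python
from collections import Counter

# Hypothetical nominal observations: labels with no inherent order.
genders = ["Women", "Men", "Women", "Transgender", "Men", "Women"]

# Counting how often each label occurs is meaningful for nominal data.
counts = Counter(genders)

# The mode (most frequent label) is also meaningful.
mode_label, mode_count = counts.most_common(1)[0]

print(counts)
print(mode_label, mode_count)

# Sorting these labels would only give alphabetical order, which says
# nothing about the categories themselves, and a mean is undefined.
```

Counting and taking the mode never require comparing one label against another, which is why they remain valid when the data has no order.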
Ordinal data represent qualitative information with order: the measurement classifications are different and can be ranked, although the size of the gap between ranks is not defined. An example is the grading system of A, B, C, D. When nominal data has some measure of rank built in, it is referred to as ordinal data. A letter grade of A in the exam ranks higher than a grade of B. Ordinal data is also known as ordered categorical data.
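With ordinal data, ranking becomes meaningful, so sorting and the median can be used in addition to counts. The sketch below assumes an explicit rank mapping for the letter grades; the name GRADE_RANK, the numeric ranks, and the sample grades are illustrative choices, not part of any standard.

```python
# Assumed ranking for illustration: a higher number means a better grade.
# Only the order of these numbers matters, not their size.
GRADE_RANK = {"D": 1, "C": 2, "B": 3, "A": 4}

# Hypothetical exam results.
grades = ["B", "A", "D", "C", "A", "B", "B"]

# Sorting by rank is meaningful for ordinal data.
sorted_grades = sorted(grades, key=GRADE_RANK.__getitem__)

# The median is the middle element of the sorted list
# (the list has an odd length here, so the middle is unambiguous).
median_grade = sorted_grades[len(sorted_grades) // 2]

print(sorted_grades)
print(median_grade)

# Averaging the rank numbers would not be justified: nothing says the
# gap between A and B equals the gap between C and D.
```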
Interval data is measured along a numerical scale that has equal distances between adjacent values. These distances are called “intervals”. There is no true zero in interval data, which distinguishes it from ratio data. For example, in an examination, the difference between 90 and 100 is the same as the difference between 80 and 90. In other words, if the difference between two values is meaningful but a value of zero does not mean the quantity is absent, the data is classified as interval data. Temperature in degrees Celsius is a common example: 0 °C does not mean there is no temperature. Quasi-interval data is a special case that falls between ordinal and interval. An example is an opinion poll with options ranging from Strongly Disagree to Strongly Agree.
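One way to see why a missing true zero matters is to change units. The sketch below uses temperature in Celsius and Fahrenheit, a standard interval-scale example that is not taken from the exam example in the text. The helper c_to_f and the sample temperatures are illustrative. Ratios of raw values change with the unit, while comparisons of differences do not.

```python
def c_to_f(celsius: float) -> float:
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32

# Three hypothetical temperature readings in Celsius.
low_c, mid_c, high_c = 10.0, 15.0, 20.0
low_f, mid_f, high_f = c_to_f(low_c), c_to_f(mid_c), c_to_f(high_c)

# Ratios of raw values depend on the unit, so "20 degrees is twice as
# hot as 10 degrees" is not a meaningful statement.
ratio_c = high_c / low_c
ratio_f = high_f / low_f

# Comparing differences is meaningful: the two gaps are equal in both units.
diff_ratio_c = (high_c - mid_c) / (mid_c - low_c)
diff_ratio_f = (high_f - mid_f) / (mid_f - low_f)

print(ratio_c, ratio_f)
print(diff_ratio_c, diff_ratio_f)
```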
Ratio data has equal intervals and a ‘true’ zero point. It has all the properties of interval data plus a clearly defined true zero. Unlike on an interval scale, a zero on a ratio scale means a total absence of the variable being measured. For example, weight, height, and price are all ratio variables. A package of 100 grams is twice the weight of a package of 50 grams. Similarly, a height of 5 ft is 5 times a height of 1 ft.
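Because a ratio scale has a true zero, statements such as "twice as heavy" hold regardless of the unit. The sketch below checks this for the 100 gram and 50 gram packages by converting to ounces; the helper grams_to_ounces is an illustrative name, and the conversion factor is the avoirdupois ounce.

```python
import math

# One avoirdupois ounce in grams.
GRAMS_PER_OUNCE = 28.349523125

def grams_to_ounces(grams: float) -> float:
    """Convert a weight in grams to ounces."""
    return grams / GRAMS_PER_OUNCE

light_g, heavy_g = 50.0, 100.0

# Ratio of the two weights in grams.
ratio_g = heavy_g / light_g

# Ratio of the same two weights after converting to ounces.
ratio_oz = grams_to_ounces(heavy_g) / grams_to_ounces(light_g)

# Zero means "no weight" in either unit, so the ratio is preserved.
same_ratio = math.isclose(ratio_g, ratio_oz)

print(ratio_g, ratio_oz, same_ratio)
```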