Statistics for Data Analysts: Part 1

Suraj Gusain
Feb 3, 2023


What are Statistics?

A statistic is a piece of data from a portion of a population. It’s the opposite of a parameter, which is data from a census that surveys everyone.

Think of it like this: If you have a bit of information, it’s a statistic. If you look at part of a data set, it’s also a statistic. If you know something about 10% of people, that’s a statistic too. Parameters are all the information. And all the information is rarely known, which is why we need stats!
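To make the difference concrete, here is a minimal sketch in Python. The population is simulated (1,000 made-up heights), and the names `population`, `parameter_mean`, and `statistic_mean` are only illustrative. It computes the same quantity twice: once from everyone (a parameter) and once from a 10% sample (a statistic).

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: heights in cm of 1,000 people (simulated values).
population = [random.gauss(170, 10) for _ in range(1000)]

# Parameter: computed from the whole population, as a census would allow.
parameter_mean = statistics.mean(population)

# Statistic: computed from a 10% sample drawn without replacement.
sample = random.sample(population, k=100)
statistic_mean = statistics.mean(sample)

print(f"Parameter (population mean): {parameter_mean:.2f}")
print(f"Statistic (sample mean):     {statistic_mean:.2f}")
```

The two numbers will usually be close but not identical. In practice only the sample value is available, which is why the statistic is used as an estimate of the parameter.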

Statistics describes data. It is also about analyzing that data and producing meaningful information from it.

Statistics is the branch of mathematics concerned with the collection, analysis, interpretation, presentation, and organization of data. Statistics provides methods for making inferences and predictions about a population based on a sample of the population.

Examples of applications of statistics include:

  1. Surveys and polls: Collecting and analyzing data from a sample of people to make inferences about the opinions, behaviors, or characteristics of a larger population.
  2. Medical trials: Testing the efficacy of a new drug by collecting and analyzing data from a sample of patients to make inferences about its effects on a larger population.
  3. Quality control: Monitoring and improving the quality of a manufacturing process by collecting and analyzing data about product characteristics to detect and correct deviations from desired standards.
  4. Sports: Analyzing data about player performance and team results to improve strategies and make predictions about future performance.
  5. Election forecasting: Using data from polls and previous elections to make predictions about the outcome of future elections.

What are Descriptive Statistics?

Descriptive statistics are a set of methods and techniques used to summarize, describe, and represent a set of data. The goal of descriptive statistics is to provide a concise summary of the main features of a dataset and to describe the distribution of the data.

Examples of descriptive statistics include:

  1. Measures of central tendency: Mean, median, and mode, which describe the central value or typical value of a dataset.
  2. Measures of variability: Range, variance, and standard deviation, which describe how much the data values differ from each other.
  3. Frequency distributions: Histograms, bar charts, and pie charts, which show how the data values are distributed and how often each value occurs.
  4. Box plots: A graph that summarizes the distribution of a dataset by displaying the median, quartiles, and outliers.
  5. Summary statistics: A table or a set of numbers that summarizes the main features of a dataset, such as the number of observations, the mean, and the standard deviation.

These descriptive statistics provide a simple and effective way to summarize, visualize, and describe the main features of a dataset, which can be useful for exploratory data analysis and for communicating the results of an analysis to others.
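The measures listed above can be computed with Python's standard library. This is a small sketch on a made-up list of exam scores (the data and variable names are assumptions for illustration, not from a real dataset).

```python
import statistics
from collections import Counter

# Hypothetical exam scores, made up for illustration.
scores = [55, 60, 60, 65, 70, 75, 80, 85, 90, 100]

# Measures of central tendency
mean = statistics.mean(scores)
median = statistics.median(scores)
mode = statistics.mode(scores)

# Measures of variability
data_range = max(scores) - min(scores)
variance = statistics.variance(scores)  # sample variance (divides by n - 1)
std_dev = statistics.stdev(scores)      # square root of the sample variance

# Frequency distribution: how often each value occurs
frequency = Counter(scores)

# Quartiles, the values a box plot is built from
q1, q2, q3 = statistics.quantiles(scores, n=4)

print("n:", len(scores))
print("mean:", mean, "median:", median, "mode:", mode)
print("range:", data_range, "variance:", round(variance, 2), "std dev:", round(std_dev, 2))
print("quartiles:", q1, q2, q3)
print("frequency:", dict(frequency))
```

Note that `statistics.variance` and `statistics.stdev` treat the data as a sample. If the list were the whole population, `statistics.pvariance` and `statistics.pstdev` would be the matching functions.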

What are Inferential Statistics?

Inferential statistics is a branch of statistics that uses a sample of data to make inferences or predictions about a population. The goal of inferential statistics is to estimate the parameters of a population and to make predictions about future observations based on a sample of the population.

Inferential statistics involves using statistical models and hypothesis testing to make generalizations about a population based on a sample. The sample data is used to estimate population parameters, such as the mean or the standard deviation, and to test hypotheses about the population.

Examples of inferential statistics include:

  1. Hypothesis testing: A procedure for testing claims or hypotheses about a population based on a sample of data.
  2. Confidence intervals: An estimate of the range of values that is likely to contain the true population parameter with a specified level of confidence.
  3. Regression analysis: A statistical method for modeling the relationship between a dependent variable and one or more independent variables.
  4. Analysis of variance (ANOVA): A statistical method for comparing the means of two or more groups and testing the hypothesis that they are equal.
  5. Prediction: Using a statistical model to make predictions about future observations based on past data.

Inferential statistics provides a powerful tool for making predictions and drawing conclusions about populations based on sample data, which can be useful for making decisions and for solving real-world problems in fields such as business, science, and technology.
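As one example of inference, here is a rough sketch of a 95% confidence interval for a population mean. The sample is simulated, and the sketch assumes the sample is large enough for the normal approximation to be reasonable; for small samples a t-distribution would usually be the better choice.

```python
import math
import random
import statistics

random.seed(7)  # fixed seed so the sketch is repeatable

# Hypothetical sample of 200 measurements (simulated values).
sample = [random.gauss(50, 8) for _ in range(200)]

n = len(sample)
sample_mean = statistics.mean(sample)
standard_error = statistics.stdev(sample) / math.sqrt(n)

# z value for 95% confidence (about 1.96), from the standard normal distribution.
z = statistics.NormalDist().inv_cdf(0.975)

lower = sample_mean - z * standard_error
upper = sample_mean + z * standard_error

print(f"Sample mean: {sample_mean:.2f}")
print(f"95% confidence interval: ({lower:.2f}, {upper:.2f})")
```

The interval is a statement about the unknown population mean, built only from sample data. A larger sample shrinks the standard error and so narrows the interval.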

Population and Sample in Statistics, with Examples

Population: In statistics, a population is the complete set of individuals or objects that share some characteristic of interest. It is the entire group that you want to make inferences about.

Example: The population of students in a particular school, all employees in a company, all people living in a country, etc.

Sample: A sample is a smaller portion of the population that is selected to represent the entire population. It is used to draw inferences about the population based on the characteristics of the sample.

Example: A sample of 100 students from a school of 1000 students, a sample of 50 employees from a company of 1000 employees, a sample of 1000 people from a country of 100 million people, etc.

Variables in Statistics, with Examples

Variable: In statistics, a variable is a characteristic or attribute that can take on different values for different individuals or objects. A variable can be categorical (it takes one of a limited set of categories, such as gender or eye color) or continuous (it can take on any value within a range, such as height or income).

Examples:

Categorical variables: Gender (male or female), Eye color (brown, blue, green), etc.

Continuous variables: Height (in inches or cm), Income (in dollars), Age (in years), etc.
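The kind of variable decides which summary makes sense. In this small sketch (the records and field names are made up for illustration), the categorical variable is summarized by counting categories, while the continuous variable is summarized with an average.

```python
import statistics
from collections import Counter

# Hypothetical records, made up for illustration.
people = [
    {"eye_color": "brown", "height_cm": 172.5},
    {"eye_color": "blue", "height_cm": 160.0},
    {"eye_color": "brown", "height_cm": 181.0},
    {"eye_color": "green", "height_cm": 168.5},
    {"eye_color": "brown", "height_cm": 175.0},
]

eye_colors = [person["eye_color"] for person in people]
heights = [person["height_cm"] for person in people]

# Categorical variable: count how many individuals fall in each category.
eye_color_counts = Counter(eye_colors)

# Continuous variable: arithmetic such as averaging is meaningful.
mean_height = statistics.mean(heights)

print("Eye color counts:", dict(eye_color_counts))
print("Mean height (cm):", round(mean_height, 1))
```

An average eye color would be meaningless, which is the practical reason to identify the variable type before picking a method.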

What is an Experiment in Statistics?

Experiment: In statistics, an experiment is a study in which the values of one or more variables are manipulated and the effect on a response variable is observed. An experiment is usually designed to determine cause-and-effect relationships between variables.

Examples:

  1. A medical experiment in which a new drug is tested on a group of patients to determine its efficacy in treating a certain condition.
  2. A psychological experiment in which the effects of different study techniques on exam scores are observed.
  3. An agricultural experiment in which the growth of crops is studied under different levels of water and fertilizer.
  4. A marketing experiment in which the response of customers to different packaging designs is observed.
  5. An educational experiment in which the effect of a new teaching method on student achievement is studied.
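A common way to manipulate a variable is to assign participants at random to a treatment group and a control group, then compare the response. The sketch below simulates that setup. Everything in it is hypothetical: the participant IDs, the helper `simulated_score`, and the 5-point treatment effect are assumptions built into the simulation, not real results.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

# 20 hypothetical participant IDs, randomly split into two equal groups.
participants = list(range(1, 21))
random.shuffle(participants)
treatment = participants[:10]  # e.g. taught with the new method
control = participants[10:]    # e.g. taught with the usual method


def simulated_score(in_treatment: bool) -> float:
    """Return a simulated exam score; the +5 effect is an assumption."""
    base = random.gauss(70, 5)
    return base + (5 if in_treatment else 0)


# Response variable observed for each group.
treatment_scores = [simulated_score(True) for _ in treatment]
control_scores = [simulated_score(False) for _ in control]

difference = statistics.mean(treatment_scores) - statistics.mean(control_scores)
print(f"Difference in mean score (treatment - control): {difference:.2f}")
```

Random assignment is what supports a cause-and-effect reading: on average, the groups differ only in the variable that was manipulated.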

Types of Data in Statistics

There are two main types of data in statistics:

  1. Categorical Data: Also known as qualitative data, this type of data consists of categories or names. Categorical data can be further divided into two types: nominal data and ordinal data. Nominal data have no inherent order or ranking, such as gender (male or female) or eye color (brown, blue, green, etc.). Ordinal data have an inherent order or ranking, such as educational level (high school, bachelor’s, master’s, etc.).
  2. Numerical Data: Also known as quantitative data, this type of data consists of numerical values that can be measured or counted and have meaningful differences between them. Numerical data can be further divided into two types: continuous data and discrete data. Continuous data can take on any value within a range, such as height or income. Discrete data can only take on separate, countable values, such as the number of children in a family.

A single dataset often contains both types of data. For example, an observational study may record numerical and categorical variables for the same individuals.
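Here is a short sketch of how the four subtypes behave differently in code. The values and the helper `education_rank` are made up for illustration. Nominal values can only be compared for equality, ordinal values need an explicit order, discrete values are whole-number counts, and continuous values can fall anywhere in a range.

```python
# Nominal: categories with no order; only equality checks are meaningful.
eye_colors = ["brown", "blue", "green", "brown"]
brown_count = eye_colors.count("brown")

# Ordinal: categories with an order, which has to be spelled out explicitly.
EDUCATION_ORDER = ["high school", "bachelor's", "master's"]


def education_rank(level: str) -> int:
    """Return the position of a level in the assumed ordering."""
    return EDUCATION_ORDER.index(level)


levels = ["master's", "high school", "bachelor's"]
sorted_levels = sorted(levels, key=education_rank)

# Discrete: separate, countable values (number of children in a family).
children_counts = [0, 2, 1, 3]

# Continuous: any value within a range (height in cm).
heights_cm = [172.5, 160.2, 181.0]

print("Brown eyes:", brown_count)
print("Education levels in order:", sorted_levels)
print("Total children:", sum(children_counts))
print("Tallest (cm):", max(heights_cm))
```

Sorting the education levels alphabetically would give the wrong order, which is why the ranking is defined by hand.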

Thank you for taking the time to read the article. I appreciate your attention and feedback.

If you found this article helpful, please consider sharing it with others who might benefit from it. Your support in spreading the word is greatly appreciated.

