Introduction to Mathematical Statistics — Probability and Distributions (1)

Day 1 notes from “Introduction to Mathematical Statistics, 8th Edition” by Hogg et al., as part of my Data Science learning documentation.

Ahmad Yusuf Albadri
Python’s Gurus
3 min read · Jun 17, 2024


Photo by Naser Tamimi on Unsplash

Experimentation is the standard procedure for investigation.

Examples of such investigations include:

  • In medical research, researchers try to understand the effect of drugs.
  • Economists try to understand the impact of different prices on the demand for a certain commodity.
  • Agronomists may wish to study how a chemical fertilizer affects the yield of a cereal grain.

We can gain meaningful information to answer such questions by performing experiments.

  • Each experiment terminates with an outcome. However, the outcome of an experiment cannot be predicted with certainty before it is performed.
  • If the collection of every possible outcome can be described before the experiment starts, and the experiment can be repeated under the same conditions, then it is called a random experiment.
  • The collection of every possible outcome of a random experiment is called the sample space.

A simple example of a random experiment, example 1:

In the experiment of tossing a coin, the outcome might be Head (H) or Tail (T). If we assume that the coin may be repeatedly tossed under the same conditions, then the toss of this coin is an example of a random experiment with a sample space:

C = {H, T}
Sample space “C” of the random experiment of tossing a coin: Head or Tail
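
As a rough illustration, the coin-toss experiment can be sketched in Python. The names `sample_space` and `toss` are my own choices for this sketch, not from the book, and the coin is assumed to be fair:

```python
import random

# Sample space of a single coin toss: Head (H) or Tail (T)
sample_space = ("H", "T")

def toss(rng):
    """Perform the random experiment once and return its outcome."""
    # Assumption: both outcomes are equally likely (a fair coin)
    return rng.choice(sample_space)

rng = random.Random(0)  # fixed seed only so that the run is repeatable
outcomes = [toss(rng) for _ in range(10)]
print(outcomes)
```

Every run of `toss` ends with exactly one element of the sample space, but we cannot say which one before the toss is made.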
  • Often we are interested in the chances of certain subsets of the sample space occurring. Such a subset is called an event.
  • If the experiment results in an element of an event “A”, then we say the event “A” has occurred.
  • From example 1, we may be interested in the chances of getting Head. Mathematically, we can write it as follows:
A = {H}
“A” is the event of getting Head from a toss of a coin
  • To estimate this chance, we observe how many times Head occurs when we toss the coin N times. This value can be represented mathematically as:
f / N, where f is the number of tosses, out of N, that show Head
This is the relative frequency of observing Head in N repeated tosses of a coin
  • A relative frequency is usually quite erratic for small values of N, as you can discover by tossing a coin. But as N increases, experience indicates that the relative frequency tends to stabilize. We therefore associate with the event “A” a number, say “p”, that is equal or approximately equal to the value about which the relative frequency seems to stabilize. This “p” is usually called the probability of the event “A”.
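
To make the idea of an event and its relative frequency concrete, here is a minimal sketch. The helper `relative_frequency` and the variable names are hypothetical, introduced only for this illustration, and a fair coin is assumed:

```python
import random

sample_space = {"H", "T"}
event_a = {"H"}  # event A: the toss shows Head

def relative_frequency(event, n_tosses, rng):
    """Return f / N: the fraction of n_tosses whose outcome lies in the event."""
    outcomes = sorted(sample_space)  # a list, so that rng.choice can index it
    # Assumption: each outcome is equally likely (a fair coin)
    hits = sum(rng.choice(outcomes) in event for _ in range(n_tosses))
    return hits / n_tosses

rng = random.Random(42)  # fixed seed only so that the run is repeatable
for n in (10, 100, 10_000):
    print(n, relative_frequency(event_a, n, rng))
```

With only 10 tosses the printed value can land far from 0.5, while with 10,000 tosses it usually sits very close to it.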

Represent “p” in Python as N increases

Let’s visualize the relative frequency as a representation of “p” from example 1 as follows:

# Import libraries
import numpy as np
import matplotlib.pyplot as plt

# For each N, simulate N tosses of a fair coin (1 = Head, with probability 0.5)
# and record the relative frequency of Head
N = list(range(10, 1000))
p = [np.mean(np.random.binomial(1, 0.5, n)) for n in N]

# Plot the relative frequency against N, with the true probability as a reference line
plt.scatter(x=N, y=p, alpha=0.5, edgecolors='black', facecolors='none')
plt.axhline(y=0.5, color='red', linestyle='--', linewidth=2, label='p = 0.5')
plt.xlabel('N (number of tosses)')
plt.ylabel('Relative frequency of Head')
plt.legend()
plt.show()
Simulated relative frequency of Head for a fair coin, whose probability of showing Head is 0.5

As the graph shows, the relative frequency of Head is scattered widely for small N and settles around 0.5 as N increases. That value, 0.5, is the probability “p” of showing Head.
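
The simulation above draws a fresh set of tosses for every N. A variation, which is my own sketch rather than something from the book, is to follow one single sequence of tosses and track the running relative frequency after each toss. The seed and the number of tosses below are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(0)  # seed chosen arbitrarily, for repeatability
n_tosses = 10_000

# One sequence of tosses: 1 = Head, 0 = Tail (a fair coin is assumed)
tosses = rng.integers(0, 2, size=n_tosses)

# Running relative frequency of Head after 1, 2, ..., n_tosses tosses
running_freq = np.cumsum(tosses) / np.arange(1, n_tosses + 1)

for n in (10, 100, 1_000, 10_000):
    print(n, running_freq[n - 1])
```

This mirrors how one would toss a real coin: the same growing record of tosses is used throughout, so the stabilization of the relative frequency can be followed along a single path.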

This is part of my 100-day Data Science learning journey. Follow me for more updates on my learning.

You can learn from what I learned too!

Check out my plan: https://medium.com/pythons-gurus/a-journey-to-learn-data-science-100-days-plan-cfce919f6f6e

