Background Concepts

Blas MOLA-YUDEGO

Introduction to Statistical Thinking

What is randomness? What is variability? How do we deal with uncertainty? This introduction to the course deals with the philosophical approach to statistical thinking. The main tools related to statistics will be discussed, including the concept of variability and its associated metrics (standard deviation, variance, standard error), as well as the concept of a distribution. The normal distribution will be presented, along with its main properties and how to use it for statistical inference. The normal distribution is the basis of several tests and statistical tools, but there are of course many other common distributions that must also be understood. This topic will conclude with ideas related to populations and sampling, and how to approach statistical inference, preparing the ground for the next topic.

Objectives

  • To get familiar with thinking probabilistically
  • To introduce statistics as a discipline
  • To describe the main concepts and tools related to variability and distributions
  • To introduce statistical inference and standard errors

Topics

Discovering Statistics. Blas Mola (2025) [PDF]. A journey of mind discipline. Statistics is not a catalogue of tricks but a habit of mind so that evidence, fairly weighed, can change what we believe. From Bayes’ quiet prior beliefs and Laplace’s celestial posteriors to Gauss’s least-squares symmetry, our craft learned to turn uncertainty into guidance. Pearson mapped moments and correlation, Fisher forged likelihood, sufficiency, and design, Neyman and Pearson split risk with tests that target power. Today, amid the rush of machine learning and large language models, the charge remains the same: frame causal questions, validate with the right samples, expose assumptions, and advance knowledge.

Materials

Datasets and exercises

  • Excel for practice [excel]
  • Excel from the lectures [excel]
  • Exercise on standard errors [PDF]

Videos and tutorials

  • The normal distribution [youtube]
  • How to produce a normal distribution in R? [video]

Tasks

What is a probability distribution?

What are the trade-offs between certainty and precision? And between precision and cost/affordability? How are they all related?

Exercise

We suggest you try the following tasks to practice the concepts explained in the lectures:

  1. Create a large sequence of numbers following a normal distribution with a defined mean (µ) and standard deviation (σ) using Excel or R, and explore its histogram and properties. That will be your population.
  2. Select samples from the sequence (e.g. N=10), and explore their statistics (mean and standard deviation). Do they match your expectations?
  3. Increase the sample size (N=5, 10, 50, 100): at what point can you successfully infer some properties of the population?
  4. Produce 20 samples (N=5), and get the mean and the standard error (SE) from each of them: how often does the population mean (µ) fall within the interval sample mean +/- 2 x SE?
  5. Do the same for the standard deviation (σ).
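
As a rough guide to tasks 1 to 4, here is a minimal sketch in Python, using only the standard library. It is an illustration under stated assumptions, not the course's reference solution: the values µ=100 and σ=10 are borrowed from the Excel example, and the population size, the seed and the helper names `sample_summary` and `coverage` are arbitrary choices made for this sketch. The same steps can be followed in Excel or R.

```python
import random
import statistics

# Assumed values for illustration (same as the Excel example: mean=100, sd=10)
MU, SIGMA = 100.0, 10.0
rng = random.Random(42)  # fixed seed so the run is repeatable

# Task 1: a large normally distributed sequence acts as the population
population = [rng.gauss(MU, SIGMA) for _ in range(100_000)]

def sample_summary(n):
    """Draw n values without replacement; return (mean, sd, standard error)."""
    sample = rng.sample(population, n)
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)   # sample standard deviation (n - 1 denominator)
    se = sd / n ** 0.5              # standard error of the mean
    return mean, sd, se

# Tasks 2-3: see how the estimates behave as the sample size grows
for n in (5, 10, 50, 100):
    mean, sd, se = sample_summary(n)
    print(f"N={n:>3}  mean={mean:7.2f}  sd={sd:6.2f}  SE={se:5.2f}")

def coverage(n_samples, n):
    """Task 4: share of samples whose interval mean +/- 2*SE contains MU."""
    hits = 0
    for _ in range(n_samples):
        mean, _, se = sample_summary(n)
        if abs(mean - MU) <= 2 * se:
            hits += 1
    return hits / n_samples

print("Share of 20 samples (N=5) covering the mean:", coverage(20, 5))
```

Re-running with a different seed, or calling `coverage` with more samples and larger N, shows how much the answer to task 4 varies from one run to the next.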

How to do it?

In Excel:
Generate uniform random numbers between 0 and 1 in Excel: =RAND()
Generate numbers following a normal distribution with mean=100 and std dev=10: =NORMINV(RAND(),100,10)
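
The NORMINV(RAND(),100,10) formula works by taking a uniform random number between 0 and 1 and passing it through the inverse of the normal cumulative distribution. For readers who want to see that idea spelled out, here is a small sketch in Python (standard library only). The function name `norminv_rand` and the seed are made up for this illustration. In R, the built-in `rnorm` function draws normal values directly.

```python
import random
import statistics
from statistics import NormalDist

def norminv_rand(mean, sd, rng):
    """Mimic =NORMINV(RAND(), mean, sd): uniform draw -> inverse normal CDF."""
    u = rng.random()            # uniform value in [0, 1)
    while u == 0.0:             # the inverse CDF is undefined at exactly 0; redraw
        u = rng.random()
    return NormalDist(mean, sd).inv_cdf(u)

rng = random.Random(1)          # fixed seed so the run is repeatable
values = [norminv_rand(100, 10, rng) for _ in range(10_000)]

# The sample mean and standard deviation should land close to 100 and 10
print("mean:", round(statistics.mean(values), 2))
print("sd:  ", round(statistics.stdev(values), 2))
```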

In R:
Check here.

For more instructions, google (as I do)!