Lecture 1.2 - Characteristics of distributions

Author

Professor MacDonald

Published

March 19, 2026

Characteristics of distributions

Distribution of common quantities

Many natural phenomena have distribution characteristics that are relatively easy to guess:

  • What is the distribution of length of rivers in the U.S.?
  • What is the distribution of width of flower sepals?
  • What is the distribution of life expectancy across countries in 2007?

Features to guess:

  • Shape
  • Center
  • Spread
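The three features above can each be given a rough numeric counterpart. The following is a minimal sketch, not a prescribed method: the `describe` helper and the sample values are hypothetical, chosen only to illustrate a right-skewed quantity. It uses the mean and median for center, the standard deviation for spread, and a moment-based skewness for shape.

```python
import statistics

def describe(values):
    """Return rough numeric summaries of center, spread, and shape."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    sd = statistics.stdev(values)  # sample standard deviation (spread)
    # Moment-based skewness: positive values indicate a longer right tail.
    n = len(values)
    third_moment = sum((v - mean) ** 3 for v in values) / n
    skew = third_moment / statistics.pstdev(values) ** 3
    return {"mean": mean, "median": median, "sd": sd, "skew": skew}

# Hypothetical sample: many small values and a few large ones.
sample = [1, 2, 2, 3, 3, 3, 4, 5, 9, 20]
summary = describe(sample)
print(summary)
```

In a right-skewed sample like this one, the mean sits above the median, which is one reason to guess the shape before choosing a measure of center.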

Graphs of common quantities

Length of rivers in the U.S.

Flower sepal width

Life expectancy in 2007

Data generating process

Data is what we record.

Data is a function of:

Data point = underlying process + random variation + measurement error

Example: flower size.
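The decomposition above can be sketched as a small simulation. All numbers here are assumptions made up for illustration (a typical width of 3.0 cm, flower-to-flower variation with standard deviation 0.4, and ruler error with standard deviation 0.05), and both noise terms are assumed to be normally distributed.

```python
import random
import statistics

random.seed(1)  # make the sketch reproducible

# Hypothetical values, not taken from real flower data.
TRUE_MEAN_WIDTH = 3.0   # underlying process: typical width in cm
NATURAL_SD = 0.4        # random variation from flower to flower
MEASUREMENT_SD = 0.05   # measurement error from the ruler

def simulate_width():
    """One data point = underlying process + random variation + measurement error."""
    random_variation = random.gauss(0, NATURAL_SD)
    measurement_error = random.gauss(0, MEASUREMENT_SD)
    return TRUE_MEAN_WIDTH + random_variation + measurement_error

widths = [simulate_width() for _ in range(1000)]
center = statistics.mean(widths)
spread = statistics.stdev(widths)
print(round(center, 2), round(spread, 2))
```

Under these assumptions the center is set by the underlying process, while the spread comes from both noise terms combined.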

Truncated data

Data is not generated for values above or below specific limits.

For example, all age data is truncated at zero.
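A lower limit like this changes the shape of a distribution. The sketch below assumes a hypothetical process that would be roughly normal (mean 2, standard deviation 3) if negative values were possible, and simply never produces values below zero. The `draw_truncated` helper and its parameters are illustrative only.

```python
import random
import statistics

random.seed(2)  # make the sketch reproducible

LOWER_LIMIT = 0.0  # values below this limit are never generated

def draw_truncated(mean, sd, lower=LOWER_LIMIT):
    """Draw from a normal distribution, redrawing until the value is >= lower."""
    while True:
        value = random.gauss(mean, sd)
        if value >= lower:
            return value

# Hypothetical ages in years for something short-lived.
ages = [draw_truncated(mean=2.0, sd=3.0) for _ in range(1000)]
print(min(ages) >= LOWER_LIMIT, round(statistics.mean(ages), 2))
```

Because the left tail is cut off, the observed mean ends up above the mean of the untruncated process, and the distribution is no longer symmetric.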

Activity

Data generating process - advanced: height activity

What do you expect the shape, center, and spread of class height to be? Why? Write down your guesses with your partner.

Height distribution

Closing thoughts

  • Many distributions can be guessed in advance based on the data generating process
  • You should have at least a guess as to what the distribution is before starting your exploratory data analysis
  • Think carefully about what your variable is actually measuring
  • Characteristics of distributions are summaries of the data, and summaries almost always obscure some features of the data
  • Don’t mislead your readers!!
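To illustrate how summaries can obscure features, here is a minimal sketch with two made-up datasets that share the same center and spread but have different shapes. The values are constructed by hand for this purpose and do not come from the lecture's data.

```python
import math
import statistics

r = math.sqrt(2)

# Two clusters, with nothing in the middle.
two_clusters = [-1.0, -1.0, 1.0, 1.0] * 25
# One peak at zero, with a few values further out on each side.
one_peak = [-r, 0.0, 0.0, r] * 25

for name, data in [("two_clusters", two_clusters), ("one_peak", one_peak)]:
    mean = statistics.mean(data)
    sd = statistics.pstdev(data)  # population standard deviation
    print(name, round(mean, 3), round(sd, 3))
```

Both datasets report a mean of 0 and a standard deviation of about 1, so only a plot or a closer look at the values reveals that one has two clusters and the other a single peak.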