Lecture 1.2 - Characteristics of distributions

Author

Professor MacDonald

Published

March 19, 2025

Characteristics of distributions

Distribution of common quantities

Many phenomena in nature have distribution characteristics that are relatively easy to guess

  • What is the distribution of length of rivers in the U.S.?
  • What is the distribution of width of flower sepals?
  • What is the distribution of life expectancy across countries in 2007?

Features to guess:

  • Shape
  • Center
  • Spread

Graphs of common quantities

Length of rivers in the U.S.

Flower sepal width

Life expectancy in 2007

Distribution of our data

Let’s now collect some data about our class:

  • Information about handedness
  • Information about heights

Guessing the shape of our data

Take a guess at what each question’s distribution characteristics will be:

  • Shape
    • Skew
    • Modes
  • Center
    • Mean
    • Median
  • Spread
    • Range
    • IQR
    • Standard deviation
  • Also think carefully about the three different calculations of handedness: how do they differ? Discuss with your partner.
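To make the checklist above concrete, here is a minimal Python sketch that computes each characteristic for a single variable. The `describe` function and the `heights_cm` values are made up for illustration and are not the class data. The skew is a simple moment-based estimate, and the IQR uses the "inclusive" quartile method, so other software may report slightly different numbers.

```python
import statistics


def describe(values):
    """Return shape, center, and spread summaries for a list of numbers."""
    n = len(values)
    mean = statistics.mean(values)
    pop_sd = statistics.pstdev(values)  # population SD, used only for skew

    # Moment-based skewness: positive means a longer right tail.
    skew = sum((x - mean) ** 3 for x in values) / n / pop_sd ** 3

    # Quartiles; the "inclusive" method is one of several conventions.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")

    return {
        "skew": skew,
        "modes": statistics.multimode(values),
        "mean": mean,
        "median": statistics.median(values),
        "range": (min(values), max(values)),
        "iqr": q3 - q1,
        "sd": statistics.stdev(values),  # sample SD (divides by n - 1)
    }


# Hypothetical heights in centimeters, NOT the class data.
heights_cm = [152, 160, 163, 163, 168, 171, 175, 180, 188]
summary = describe(heights_cm)

for name, value in summary.items():
    print(f"{name}: {value}")
```

Comparing the mean with the median is a quick check on your skew guess: in this made-up sample the mean sits slightly above the median, which matches the positive skew value.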

Height summary statistics

  • Shape
    • Skew:
    • Modes: 163
  • Center
    • Mean 170.1818182
    • Median 169.5
  • Spread
    • Range 152, 188
    • IQR 10
    • Standard deviation 8.7757044

Height graph

Handedness l-r summary statistics

  • Shape
    • Skew:
    • Modes: -12
  • Center
    • Mean -12.12
    • Median -12
  • Spread
    • Range -16, -8
    • IQR 3
    • Standard deviation 2.2233608
  • What does this measure?

Handedness l-r graph

Handedness l+r summary statistics

  • Shape
    • Skew:
    • Modes: 20
  • Center
    • Mean 18.64
    • Median 20
  • Spread
    • Range 10, 20
    • IQR 2
    • Standard deviation 2.3430749
  • What does this measure?

Handedness l+r graph

Handedness (left - right) / (left + right) graph
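The three handedness calculations can be compared side by side with a small sketch. It assumes each student has a left score and a right score, which is an assumption about how the survey was tallied. The function name and the sample pairs are hypothetical.

```python
def handedness_measures(left, right):
    """Return (difference, total, ratio) for one student's scores.

    `left` and `right` are assumed to be non-negative scores for each hand.
    """
    difference = left - right  # sign gives direction, size depends on total
    total = left + right       # overall amount, says nothing about direction
    # The ratio rescales the difference to lie between -1 and 1.
    ratio = difference / total if total else float("nan")
    return difference, total, ratio


# Hypothetical (left, right) pairs, NOT the class data.
students = [(4, 16), (2, 8), (1, 17)]

for left, right in students:
    difference, total, ratio = handedness_measures(left, right)
    print(f"l={left:2d} r={right:2d}  l-r={difference:4d}  l+r={total:3d}  ratio={ratio:.2f}")
```

In this made-up sample the first two students have the same ratio but different differences, so the three calculations do not rank students the same way. That is worth keeping in mind when deciding what each graph measures.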

Closing thoughts

  • Many distributions can be guessed in advance based on the data-generating process
  • You should have at least a guess as to what the distribution is before starting your exploratory data analysis
  • Think carefully about what your variable is actually measuring
  • Characteristics of distributions are summaries of the data, and summaries almost always obscure some features of the data
  • Don’t mislead your readers!!
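As a small illustration of how summaries can obscure features, the sketch below builds two made-up samples that share the same mean, median, and standard deviation but have different shapes. The data are invented for this purpose and the variable names are hypothetical.

```python
import statistics

# Two invented samples of ten values each.
two_clusters = [1, 1, 1, 1, 1, 5, 5, 5, 5, 5]   # two separate peaks
one_peak = [-1, 1, 3, 3, 3, 3, 3, 3, 5, 7]      # single peak with tails

for name, data in [("two_clusters", two_clusters), ("one_peak", one_peak)]:
    print(
        name,
        "mean:", statistics.mean(data),
        "median:", statistics.median(data),
        "sd:", round(statistics.stdev(data), 3),
        "modes:", statistics.multimode(data),
    )
```

Center and spread agree for both samples, and only the modes (or a graph) show that one sample has two peaks and the other has one.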