STAT 7 - Statistical Methods for the Biological, Environmental & Health Sciences
10 Mar 2026
By the end of today’s lecture, you will be able to:
Last time we didn’t finish categorical data summarization, so we will start with that today :)
What happens when we don’t have numbers to summarize, but instead, we have categories?
| Researcher_ID | Ecosystem_Studied | Research_Focus | Conservation_Challenge |
|---|---|---|---|
| 1 | Coral Reefs | Coral Bleaching | Overfishing |
| 2 | Open Ocean | Marine Biodiversity | Climate Change |
| 3 | Estuaries | Carbon Sequestration | Coastal Development |
| 4 | Estuaries | Nutrient Cycling | Pollution |
| 5 | Estuaries | Hydrothermal Vents | Pollution |
| 6 | Kelp Forests | Species Interactions | Ocean Acidification |
| 7 | Seagrass Meadows | Habitat Restoration | Agricultural Runoff |
| 8 | Rocky Intertidal | Invasive Species | Climate Change |
| 9 | Salt Marshes | Erosion Control | Sea Level Rise |
| 10 | Open Ocean | Fisheries Management | Overfishing |
How do we summarize categorical data?
| Conservation Challenge | Absolute Frequency | Relative Frequency |
|---|---|---|
| Pollution | 2 | 0.20 |
| Overfishing | 2 | 0.20 |
| Climate Change | 2 | 0.20 |
| Coastal Development | 1 | 0.10 |
| Ocean Acidification | 1 | 0.10 |
| Agricultural Runoff | 1 | 0.10 |
| Sea Level Rise | 1 | 0.10 |
| TOTAL | 10 | 1.00 |
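The frequency table above can be reproduced by counting how often each category appears and dividing by the number of observations. This is a minimal Python sketch using only the standard library; the names `challenges`, `frequency_table`, and `table` are illustrative and not part of the course materials.

```python
from collections import Counter

# Conservation_Challenge column, copied row by row from the researcher table
challenges = [
    "Overfishing", "Climate Change", "Coastal Development", "Pollution",
    "Pollution", "Ocean Acidification", "Agricultural Runoff",
    "Climate Change", "Sea Level Rise", "Overfishing",
]


def frequency_table(values):
    """Return (category, absolute frequency, relative frequency) rows."""
    counts = Counter(values)
    n = len(values)
    # most_common() orders categories from most to least frequent
    return [(category, k, k / n) for category, k in counts.most_common()]


table = frequency_table(challenges)
for category, absolute, relative in table:
    print(f"{category:<22}{absolute:>3}{relative:>7.2f}")

# The relative frequencies of all categories should add up to 1
total_relative = sum(relative for _, _, relative in table)
print(f"{'TOTAL':<22}{len(challenges):>3}{total_relative:>7.2f}")
```

The same function works for any categorical column, for example the Ecosystem_Studied values.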
Your Task (5 minutes)
Create a frequency table for Ecosystem_Studied using the data provided.
Calculate:
Can you also create a cross-table between Ecosystem Studied and Conservation Challenge?
Ecosystem Studied × Conservation Challenge
| | Estuaries | Open Ocean | Other | Total |
|---|---|---|---|---|
| Pollution | 2 | 0 | 0 | 2 |
| Climate Change | 0 | 1 | 1 | 2 |
| Overfishing | 0 | 1 | 1 | 2 |
| Other | 1 | 0 | 3 | 4 |
| Total | 3 | 2 | 5 | 10 |
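A cross-table like the one above counts how many observations fall into each combination of two categorical variables. The sketch below is one possible way to build it in Python, assuming the same grouping of less frequent categories into "Other"; the names `pairs`, `collapse`, and `cells` are illustrative.

```python
from collections import Counter

# (Ecosystem_Studied, Conservation_Challenge) pairs from the researcher table
pairs = [
    ("Coral Reefs", "Overfishing"),
    ("Open Ocean", "Climate Change"),
    ("Estuaries", "Coastal Development"),
    ("Estuaries", "Pollution"),
    ("Estuaries", "Pollution"),
    ("Kelp Forests", "Ocean Acidification"),
    ("Seagrass Meadows", "Agricultural Runoff"),
    ("Rocky Intertidal", "Climate Change"),
    ("Salt Marshes", "Sea Level Rise"),
    ("Open Ocean", "Overfishing"),
]

# Categories kept as their own row or column; everything else becomes "Other"
KEEP_ECOSYSTEMS = {"Estuaries", "Open Ocean"}
KEEP_CHALLENGES = {"Pollution", "Climate Change", "Overfishing"}


def collapse(value, keep):
    """Keep the label if it is in `keep`, otherwise group it under 'Other'."""
    return value if value in keep else "Other"


# Count each (challenge, ecosystem) combination after grouping
cells = Counter(
    (collapse(challenge, KEEP_CHALLENGES), collapse(ecosystem, KEEP_ECOSYSTEMS))
    for ecosystem, challenge in pairs
)

rows = ["Pollution", "Climate Change", "Overfishing", "Other"]
cols = ["Estuaries", "Open Ocean", "Other"]
for row in rows:
    counts = [cells[(row, col)] for col in cols]
    print(f"{row:<16}", counts, "row total:", sum(counts))
```

Row totals and column totals (the margins) are the one-variable frequency tables of each variable.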
Note
Palmer Archipelago (Antarctica) penguin data were collected and made available by Dr. Kristen Gorman and the Palmer Station, Antarctica LTER.
See more info here.
Variables:
- species: penguin species (Chinstrap, Adélie, or Gentoo)
- culmen_length_mm: culmen length (mm)
- culmen_depth_mm: culmen depth (mm)
- flipper_length_mm: flipper length (mm)
- body_mass_g: body mass (g)
- sex: penguin sex (female or male)
- island: island name (Dream, Torgersen, or Biscoe)
Poll Question
Classify each variable as:
Data: Google Sheets Link
Poll: PollEv.com/slugstats

Where to start?
Quantitative (Numerical):
- culmen_length_mm
- culmen_depth_mm
- flipper_length_mm
- body_mass_g
Qualitative (Categorical):
- species
- island
- sex
How can we summarize this dataset?
| Statistic | Culmen Length (mm) | Culmen Depth (mm) | Flipper Length (mm) | Body Mass (g) |
|---|---|---|---|---|
| Min | 32.10 | 13.10 | 172.00 | 2700.00 |
| Q1 | 39.23 | 15.60 | 190.00 | 3550.00 |
| Median | 44.45 | 17.30 | 197.00 | 4050.00 |
| Mean | 43.92 | 17.15 | 200.92 | 4201.75 |
| Q3 | 48.50 | 18.70 | 213.00 | 4750.00 |
| Max | 59.60 | 21.50 | 231.00 | 6300.00 |
| SD | 5.46 | 1.97 | 14.06 | 801.95 |
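Each row of the summary table corresponds to one standard computation. The sketch below shows those computations with Python's `statistics` module on a short list of HYPOTHETICAL flipper lengths invented for illustration; it does not use the penguin data. Software packages use slightly different quartile conventions, so the quartiles from this sketch are not guaranteed to match the convention behind the table above.

```python
import statistics

# HYPOTHETICAL flipper lengths (mm), invented for illustration only
flipper_mm = [181, 186, 190, 193, 195, 197, 210, 215]

# n=4 splits the data into quarters and returns the three cut points.
# method="inclusive" treats the smallest value as the 0th percentile
# and the largest value as the 100th percentile.
q1, median, q3 = statistics.quantiles(flipper_mm, n=4, method="inclusive")

summary = {
    "Min": min(flipper_mm),
    "Q1": q1,
    "Median": median,
    "Mean": statistics.mean(flipper_mm),
    "Q3": q3,
    "Max": max(flipper_mm),
    "SD": statistics.stdev(flipper_mm),  # sample SD (divides by n - 1)
}

for name, value in summary.items():
    print(f"{name:<7}{value:>8.2f}")
```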
Species
| Species | Count |
|---|---|
| Adelie | 152 |
| Chinstrap | 68 |
| Gentoo | 124 |
| TOTAL | 344 |
Island
| Island | Count |
|---|---|
| Biscoe | 168 |
| Dream | 124 |
| Torgersen | 52 |
| TOTAL | 344 |
Sex
| Sex | Count |
|---|---|
| Female | 165 |
| Male | 169 |
| TOTAL | 334 |
What to check:
Your Task
For each histogram, describe:
1. Shape (symmetric, skewed left, skewed right)
2. Center (approximately)
3. Spread (range, typical deviation)
4. Any unusual features

Question: Should we directly compare the different distributions? Why or why not?
The Box:
The Whiskers:
The Mean:
- Often shown as + (in some software)
What patterns do you notice?
How many species are present, and how many individuals belong to each species?
Discussion
What are the limitations of pie charts? When might a bar plot be better?
Break Time! ☕ 5-minute break
Stretch, grab water, chat with neighbors!
We’ll resume with types of variables and data collection.
What patterns emerge when we add species information?
Correlation coefficient (r) measures the strength and direction of a linear relationship:
\[r_{xy} = \frac{\sum(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum(x_i - \bar{x})^2 \sum(y_i - \bar{y})^2}}\]
Interactive practice: https://istats.shinyapps.io/guesscorr/
Warning
Correlation only measures linear relationships!
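A small constructed example makes the warning concrete. In the sketch below, y is completely determined by x (y = x²), yet the correlation is zero because the relationship is not linear. The function name `pearson_r` and the values are illustrative; the function follows the formula for r given above.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Correlation coefficient r, computed term by term from the formula."""
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Constructed example: a perfect but curved (quadratic) relationship
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x ** 2 for x in xs]

r_curve = pearson_r(xs, ys)
print("r for y = x^2:", r_curve)

# For comparison: a perfect straight-line relationship
r_line = pearson_r(xs, [2 * x + 1 for x in xs])
print("r for y = 2x + 1:", r_line)
```

The positive and negative deviations cancel in the numerator for the curved case, which is why a scatterplot should always accompany r.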
Example Data
| id | x | y |
|---|---|---|
| 1 | 25 | 60 |
| 2 | 48 | 123 |
| 3 | 39 | 96 |
| 4 | 34 | 83 |
| 5 | 16 | 42 |
Average x = 30.7, Average y = 74.3
Calculate the correlation coefficient!
What correlation do you expect?
Calculated correlation: r ≈ 0.87
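The calculation can be traced step by step in code. The sketch below applies the formula for r to only the five rows shown in the Example Data table. Those five rows average 32.4 for x and 80.8 for y, which differs from the stated averages of 30.7 and 74.3, so the table presumably shows only part of the full dataset. For the same reason, the result for these five rows (very close to 1, roughly 0.997) is not expected to match r ≈ 0.87.

```python
from math import sqrt

# Only the five (x, y) rows shown in the Example Data table
x = [25, 48, 39, 34, 16]
y = [60, 123, 96, 83, 42]

# Step 1: averages
x_bar = sum(x) / len(x)
y_bar = sum(y) / len(y)

# Step 2: deviations of each observation from its average
dx = [xi - x_bar for xi in x]
dy = [yi - y_bar for yi in y]

# Step 3: numerator is the sum of products of paired deviations
numerator = sum(a * b for a, b in zip(dx, dy))

# Step 4: denominator is the square root of the product of the
# two sums of squared deviations
denominator = sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))

r = numerator / denominator
print("mean x:", round(x_bar, 1))
print("mean y:", round(y_bar, 1))
print("r for the five rows shown:", round(r, 3))
```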
Your Turn (Pairs)
Visit: https://datavizcatalogue.com/
Find NEW plots to represent:
Share what you found using screenshots in Ed Discussion!
Rate your confidence (1-25 ⭐s) on Ed Discussion:
Can you now:
If your stars add up to more than 16, you’re ready to move forward! 🎉
If not, review Chapter 1 from the textbook and come to office hours.
Summary: Post your self-assessment in Ed Discussion. Please reply to the poll only; there is no need to leave comments.
Attendance: Did you complete at least one attendance activity? If not, see me now!
Complete:
Next class:
Readings:
Great work today!
See you next class! 📊✨
Questions? Catch me after class or on Ed Discussion
STAT 7 – Winter 2026