STAT 7 - Statistical Methods for the Biological, Environmental & Health Sciences
10 Mar 2026
By the end of today’s lecture, you will be able to:

Key Question: Does early peanut consumption prevent allergies?
At 5 years of age:
Main analysis: 530 children with an earlier negative skin test
Experimental Study
Based on three principles:
Compares responses across treatment levels
✅ Can establish causation
Observational Study
Researchers simply observe:
Participants may differ in ways that influence response
⚠️ Can only establish correlation
1. Control
When selecting participants, researchers work to control for extraneous variables and choose a representative sample
2. Randomization
Randomly assigning patients to treatment groups balances the groups, on average, with respect to both controlled and uncontrolled variables
3. Replication
Larger studies estimate effects more precisely, and larger samples are more likely to be representative of the population (though only when the sampling method is sound)
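As a rough illustration of the randomization principle, the sketch below shuffles a list of participant IDs and splits it into two equal groups. The IDs, the group labels, and the seed are hypothetical and only loosely mirror the 530 children in the main analysis; this is not the assignment procedure of the actual study.

```python
import random

# Fixed seed (an arbitrary choice) so the sketch gives the same split every run.
rng = random.Random(2026)

# Hypothetical participant IDs 1..530; real studies would use enrollment records.
participant_ids = list(range(1, 531))

# Shuffle a copy so the original enrollment order is preserved.
shuffled = participant_ids[:]
rng.shuffle(shuffled)

# Split the shuffled list in half: chance alone decides who lands in which group.
half = len(shuffled) // 2
groups = {
    "treatment": sorted(shuffled[:half]),  # hypothetical label
    "control": sorted(shuffled[half:]),    # hypothetical label
}

for name, members in groups.items():
    print(name, len(members), "participants, first five IDs:", members[:5])
```

Because the split depends only on the shuffle, neither the researchers nor the participants influence group membership.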
Your Task (3 minutes)
Imagine you have to design a similar study (same objectives) but with an observational design instead of an experimental one.
Would it work? Why or why not?
Discussion: What would be different? What problems might arise?
Poll: PollEv.com/slugstats
Most commonly used random sampling techniques:
Note that there are many other sampling methods (systematic, convenience, etc.) but these are less commonly used in formal studies.
Randomly select cases from the population
Strata are made up of similar observations
Clusters are usually NOT homogeneous
Combination approach
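The mechanics of these sampling methods can be sketched in a few lines of Python. Everything in the example is an assumption made for illustration: the population of 60 monitoring sites, the zone and block labels, the sample sizes, and the helper function names.

```python
import random
from collections import defaultdict

rng = random.Random(7)  # arbitrary fixed seed for reproducibility

# Hypothetical population: 60 sites, each tagged with a made-up zone and block.
zones = ["zone_A", "zone_B", "zone_C"]
population = [{"site": i, "zone": zones[i % 3], "block": i // 5} for i in range(60)]

def simple_random_sample(units, k):
    """Every unit has the same chance of being selected."""
    return rng.sample(units, k)

def stratified_sample(units, key, k_per_stratum):
    """Group units into strata, then sample randomly within each stratum."""
    strata = defaultdict(list)
    for unit in units:
        strata[unit[key]].append(unit)
    picked = []
    for members in strata.values():
        picked.extend(rng.sample(members, k_per_stratum))
    return picked

def cluster_sample(units, key, n_clusters):
    """Randomly pick whole clusters and keep every unit inside them."""
    cluster_ids = sorted({unit[key] for unit in units})
    chosen = set(rng.sample(cluster_ids, n_clusters))
    return [unit for unit in units if unit[key] in chosen]

def multistage_sample(units, key, n_clusters, k_per_cluster):
    """Randomly pick clusters, then sample randomly within each chosen cluster."""
    return stratified_sample(cluster_sample(units, key, n_clusters), key, k_per_cluster)

srs = simple_random_sample(population, 12)
strat = stratified_sample(population, "zone", 4)
clus = cluster_sample(population, "block", 3)
multi = multistage_sample(population, "block", 3, 2)
print(len(srs), len(strat), len(clus), len(multi))
```

Note that the multistage helper reuses the stratified step on the chosen clusters, which is one simple way to express "sample clusters, then sample within them".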
Scenario
You are an environmental scientist assessing air pollution (PM2.5 levels) in a metropolitan area with diverse zones:
Your Task
Which sampling method would you choose?
A. Random Sampling across entire region
B. Stratified Sampling (proportional to each zone)
C. Cluster Sampling (select neighborhoods, measure all locations)
Poll: PollEv.com/slugstats

Break Time! ☕ 5-minute break
Stretch, grab water, chat with neighbors!
We’ll resume with types of variables and data collection.
You are an environmental scientist studying water quality in a local river. You’ve collected data on nitrate concentrations (mg/L) from 10 sampling points:
3.2, 4.8, 4.8, 6.5, 7.0, 8.2, 8.2, 9.1, 9.1, 10.3
Question: How do we summarize this data?
Mean: Average of all values \[\text{Mean} = \bar{x} = \frac{\sum x_i}{n}\]
Median: Middle value when data is ordered
Mode: Most frequently occurring value(s)
Standard Deviation (SD): Typical distance of the observations from the mean \[s = \sqrt{\frac{\sum(x_i - \bar{x})^2}{n-1}}\]
Range: Maximum - Minimum
Interquartile Range (IQR): Q3 - Q1 (middle 50% of data)
Data: 3.2, 4.8, 4.8, 6.5, 7.0, 8.2, 8.2, 9.1, 9.1, 10.3
\[\text{Mean} = \frac{3.2 + 4.8 + 4.8 + 6.5 + 7.0 + 8.2 + 8.2 + 9.1 + 9.1 + 10.3}{10} = \frac{71.2}{10} = 7.12\]
\[\text{Variance} = \frac{\sum(x_i - \bar{x})^2}{n-1} = \frac{46.82}{9} = 5.20\]
\[\text{SD} = \sqrt{5.20} = 2.28 \text{ mg/L}\]
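The hand calculation above can be checked with Python's standard `statistics` module. This is a minimal sketch: the variable names are illustrative, and the quartiles use the module's default method, which is an assumption because textbooks and software follow several different quartile conventions.

```python
import statistics

# Nitrate concentrations (mg/L) for the first water sample.
nitrate = [3.2, 4.8, 4.8, 6.5, 7.0, 8.2, 8.2, 9.1, 9.1, 10.3]

mean = statistics.mean(nitrate)
median = statistics.median(nitrate)
modes = statistics.multimode(nitrate)     # returns every value tied for most frequent
variance = statistics.variance(nitrate)   # sample variance, divides by n - 1
sd = statistics.stdev(nitrate)            # sample SD, square root of the variance
data_range = max(nitrate) - min(nitrate)

# Quartiles with the module's default ("exclusive") method; other conventions
# can give slightly different Q1 and Q3 for small samples.
q1, q2, q3 = statistics.quantiles(nitrate, n=4)
iqr = q3 - q1

print(f"mean     = {mean:.2f} mg/L")
print(f"median   = {median:.2f} mg/L")
print(f"mode(s)  = {modes}")
print(f"variance = {variance:.2f}")
print(f"SD       = {sd:.2f} mg/L")
print(f"range    = {data_range:.2f} mg/L")
print(f"IQR      = {iqr:.2f} mg/L")
```

The mean and SD should agree with the worked values of 7.12 and 2.28 mg/L. The data set has three modes because 4.8, 8.2, and 9.1 each appear twice.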
Your Turn! (10 minutes)
Calculate the summary statistics for this second water sample:
0.5, 1.0, 3.0, 5.0, 7.0, 7.2, 9.0, 10.0, 12.0, 18.0
Calculate:
Sample 1:
Sample 2:
Discussion Questions
| Researcher_ID | Ecosystem_Studied | Research_Focus | Conservation_Challenge |
|---|---|---|---|
| 1 | Coral Reefs | Coral Bleaching | Overfishing |
| 2 | Open Ocean | Marine Biodiversity | Climate Change |
| 3 | Estuaries | Carbon Sequestration | Coastal Development |
| 4 | Estuaries | Nutrient Cycling | Pollution |
| 5 | Estuaries | Hydrothermal Vents | Pollution |
| 6 | Kelp Forests | Species Interactions | Ocean Acidification |
| 7 | Seagrass Meadows | Habitat Restoration | Agricultural Runoff |
| 8 | Rocky Intertidal | Invasive Species | Climate Change |
| 9 | Salt Marshes | Erosion Control | Sea Level Rise |
| 10 | Open Ocean | Fisheries Management | Overfishing |
How do we summarize categorical data?
| Conservation Challenge | Absolute Frequency | Relative Frequency |
|---|---|---|
| Pollution | 2 | 0.20 |
| Overfishing | 2 | 0.20 |
| Climate Change | 2 | 0.20 |
| Coastal Development | 1 | 0.10 |
| Ocean Acidification | 1 | 0.10 |
| Agricultural Runoff | 1 | 0.10 |
| Sea Level Rise | 1 | 0.10 |
| TOTAL | 10 | 1.00 |
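A frequency table like this one can be built with `collections.Counter`. The sketch below types in the Conservation_Challenge column by hand in researcher order; the variable names are illustrative.

```python
from collections import Counter

# Conservation_Challenge column, in Researcher_ID order 1..10.
challenges = [
    "Overfishing", "Climate Change", "Coastal Development", "Pollution",
    "Pollution", "Ocean Acidification", "Agricultural Runoff",
    "Climate Change", "Sea Level Rise", "Overfishing",
]

counts = Counter(challenges)   # absolute frequency of each category
n = len(challenges)

# One row per category: (category, absolute frequency, relative frequency).
table = [(level, count, count / n) for level, count in counts.most_common()]

for level, count, rel in table:
    print(f"{level:<22}{count:>3}{rel:>7.2f}")
print(f"{'TOTAL':<22}{n:>3}{sum(rel for _, _, rel in table):>7.2f}")
```

Relative frequency is just the count divided by the number of observations, so the relative frequencies always sum to 1.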
Your Task (5 minutes)
Create a frequency table for Ecosystem_Studied using the data provided.
Calculate:
Can you also create a cross-table between Ecosystem Studied and Conservation Challenge?
Ecosystem Studied × Conservation Challenge
| Conservation Challenge | Estuaries | Open Ocean | Other | Total |
|---|---|---|---|---|
| Pollution | 2 | 0 | 0 | 2 |
| Climate Change | 0 | 1 | 1 | 2 |
| Overfishing | 0 | 1 | 1 | 2 |
| Other | 1 | 0 | 3 | 4 |
| Total | 3 | 2 | 5 | 10 |
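The cross-table can be reproduced by counting (challenge, ecosystem) pairs. This sketch assumes the same grouping the table uses, where any category outside the named rows and columns is collapsed into "Other"; the `collapse` helper and variable names are hypothetical.

```python
from collections import Counter

# (Ecosystem_Studied, Conservation_Challenge) for Researcher_ID 1..10.
records = [
    ("Coral Reefs", "Overfishing"), ("Open Ocean", "Climate Change"),
    ("Estuaries", "Coastal Development"), ("Estuaries", "Pollution"),
    ("Estuaries", "Pollution"), ("Kelp Forests", "Ocean Acidification"),
    ("Seagrass Meadows", "Agricultural Runoff"), ("Rocky Intertidal", "Climate Change"),
    ("Salt Marshes", "Sea Level Rise"), ("Open Ocean", "Overfishing"),
]

named_cols = ["Estuaries", "Open Ocean"]
named_rows = ["Pollution", "Climate Change", "Overfishing"]

def collapse(value, named):
    """Keep a named category as-is; lump everything else into 'Other'."""
    return value if value in named else "Other"

# Count each (row, column) cell after collapsing rare categories.
cells = Counter(
    (collapse(challenge, named_rows), collapse(ecosystem, named_cols))
    for ecosystem, challenge in records
)

cols = named_cols + ["Other"]
rows = named_rows + ["Other"]
print(f"{'':<16}" + "".join(f"{c:>12}" for c in cols) + f"{'Total':>8}")
for r in rows:
    row_counts = [cells[(r, c)] for c in cols]
    print(f"{r:<16}" + "".join(f"{v:>12}" for v in row_counts) + f"{sum(row_counts):>8}")
col_totals = [sum(cells[(r, c)] for r in rows) for c in cols]
print(f"{'Total':<16}" + "".join(f"{v:>12}" for v in col_totals) + f"{sum(col_totals):>8}")
```

Each cell counts the researchers who fall in that row and column, so the row totals and the column totals both add up to 10.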
Yes, for learning:
No, for research:
We’ll explore this more next time, but here’s a preview:
Interactive tool: https://istats.shinyapps.io/guesscorr/
Rate your confidence (1-25 ⭐s) on Ed Discussion:
Can you now:
If your stars add up to more than 16, you’re ready to move forward! 🎉
If not, review Chapter 1 from the textbook and come to office hours.
Summary: Post your self-assessment in Ed Discussion. Please reply to the poll only; there is no need to leave comments.
Attendance: Did you complete at least one attendance activity? If not, see me now!
Complete:
Next class:
Readings:
Great work today!
See you next class! 📊✨
Questions? Catch me after class or on Ed Discussion
STAT 7 – Winter 2026