# Install wbstats only if it is not already available (runs once)
if (!requireNamespace("wbstats", quietly = TRUE)) install.packages("wbstats")
library(wbstats)
library(dplyr)
library(ggplot2)
STAT 80B - Data Visualization
Your Name
10 March 2026
This course teaches you to THINK about visualizations, not just use tools.
The software (Tableau/R) is a tool: you learn it through practice, trial and error, and experimentation. Don't just follow instructions. Think about WHY you're making each choice. Break things. Try different approaches. Understanding how each tool works comes from exploring on your own, not from step-by-step tutorials.
By the end of this lab, you will:
Both Tableau and the R package wbstats draw from World Bank open data, a rich longitudinal dataset covering nearly every country from the 1960s to the present, with indicators spanning health, economics, demographics, environment, and technology.
Rather than working with all countries at once, focus your analysis by choosing one of the following:
By Region:
| Region | Examples |
|---|---|
| Latin America & Caribbean | Costa Rica, Brazil, Mexico, Chile, Colombia |
| Sub-Saharan Africa | Kenya, Ghana, Nigeria, Ethiopia, South Africa |
| East Asia & Pacific | China, Japan, South Korea, Indonesia, Vietnam |
| Europe & Central Asia | Germany, France, Spain, Poland, Ukraine |
| South Asia | India, Bangladesh, Pakistan, Sri Lanka, Nepal |
| North America & Oceania | USA, Canada, Australia, New Zealand |
By Theme (use all or many countries, focus on specific indicators):
| Theme | Key Indicators |
|---|---|
| Health | Life expectancy, fertility rate, mortality rates |
| Economic Development | GDP per capita, income groups, poverty |
| Environment | CO₂ emissions per capita, energy use |
| Technology & Connectivity | Internet users (% population), mobile subscriptions |
| Demographics | Population growth, urbanization, age structure |
Write your choice at the top of your submission and stick with it throughout the lab.
In Tableau's World Indicators data source (World Indicators.tds), key fields to know:
- Country / Region: geographic identifier
- Year: date field (treat as continuous for trend lines)
- Life Expectancy at Birth: years
- GDP per Capita: USD
- CO2 Emissions: metric tons per capita
- Fertility Rate: average children per woman
- Internet Users %: share of population
- Population Total

Install and load the wbstats package to pull World Bank data directly, using the setup code at the top of this lab.
Search for indicators by keyword:
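A keyword search might look like the sketch below. It assumes wbstats 1.0 or later, where wb_search() looks through the indicator list bundled with the package and returns columns named indicator_id and indicator. The object name hits is just a placeholder.

```r
library(wbstats)

# Search the indicator list that ships with wbstats (no download needed);
# the pattern is matched against indicator IDs, names, and descriptions
hits <- wb_search("life expectancy")

# Each row is one indicator; indicator_id is the code that wb_data() expects
head(hits[, c("indicator_id", "indicator")], 10)
```

The bundled list can lag behind the live World Bank catalog. If a code you expect is missing, wb_cache() downloads a fresh copy that can be passed to wb_search() through its cache argument.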
Download data for your chosen countries and indicators:
# Example: Health indicators for Latin America
# Adjust countries and indicators to match YOUR chosen region/theme
data <- wb_data(
  indicator = c(
    life_exp   = "SP.DYN.LE00.IN",  # Life expectancy at birth
    gdp_pc     = "NY.GDP.PCAP.CD",  # GDP per capita (current USD)
    fertility  = "SP.DYN.TFRT.IN",  # Fertility rate
    co2        = "EN.ATM.CO2E.PC",  # CO2 emissions per capita
    internet   = "IT.NET.USER.ZS",  # Internet users (% of population)
    population = "SP.POP.TOTL"      # Total population
  ),
  country = c("CRI", "BRA", "MEX", "CHL", "COL"),  # ISO3 codes for your countries
  start_date = 1990,
  end_date = 2022
)
# Preview the structure
glimpse(data)
head(data)

| Indicator | Code |
|---|---|
| Life expectancy at birth | SP.DYN.LE00.IN |
| GDP per capita (USD) | NY.GDP.PCAP.CD |
| Fertility rate | SP.DYN.TFRT.IN |
| CO₂ emissions per capita | EN.ATM.CO2E.PC |
| Internet users (%) | IT.NET.USER.ZS |
| Population total | SP.POP.TOTL |
| Urban population (%) | SP.URB.TOTL.IN.ZS |
| Infant mortality rate | SP.DYN.IMRT.IN |
| School enrollment, secondary | SE.SEC.ENRR |
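World Bank series often have gaps for some countries and years, so it can help to check coverage before plotting. The sketch below is one way to do that with dplyr. The toy data frame and its values are made-up placeholders (not World Bank figures) so the example runs without a download. In the lab you would pass the data frame returned by wb_data() instead.

```r
library(dplyr)

# Made-up placeholder values so the sketch runs offline;
# replace `toy` with the result of wb_data()
toy <- data.frame(
  country  = rep(c("Country A", "Country B"), each = 3),
  date     = rep(2019:2021, times = 2),
  life_exp = c(79.4, NA, 77.0, 75.3, 74.0, NA),
  co2      = c(1.6, 1.5, NA, NA, NA, NA)
)

# Count the non-missing years per country for each indicator column
coverage <- toy |>
  group_by(country) |>
  summarise(across(c(life_exp, co2), ~ sum(!is.na(.x))), .groups = "drop")

coverage
```

A count of zero means that indicator has no usable values for that country in the chosen date range, which is worth knowing before building a chart around it.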
Complete at least 3 of these 5 tasks. Focus on quality over quantity. For each task, attempt it in your software of choice and take note of how the experience differs.
Create a line graph showing how one indicator has changed over time for your chosen countries.
Choose a single indicator that tells an interesting story across your selected region or countries.
In Tableau:
- Year to Columns and your chosen measure to Rows
- Country / Region to Color and Region to Label

In R:
# Basic time series: replace with your indicator and countries
ggplot(data, aes(x = date, y = life_exp, color = country)) +
  geom_line(linewidth = 1) +
  labs(
    title = "Life Expectancy Over Time",
    subtitle = "Selected Latin American Countries, 1990–2022",
    x = NULL,
    y = "Life Expectancy at Birth (years)",
    color = NULL
  ) +
  theme_minimal() +
  theme(legend.position = "bottom")

Think about:
Enhance your time series with a trend line to highlight the overall trajectory.
In Tableau:
In R:
# Linear trend with confidence interval
ggplot(data, aes(x = date, y = life_exp, color = country)) +
  geom_point(alpha = 0.3, size = 1) +
  geom_smooth(method = "lm", se = TRUE, linewidth = 1) +
  labs(
    title = "Life Expectancy Trend (Linear)",
    x = NULL,
    y = "Life Expectancy at Birth (years)"
  ) +
  theme_minimal()

# LOESS smoothing for nonlinear patterns: try adjusting span
ggplot(data, aes(x = date, y = life_exp, color = country)) +
  geom_point(alpha = 0.3, size = 1) +
  geom_smooth(method = "loess", span = 0.5, se = TRUE, linewidth = 1) +
  labs(
    title = "Life Expectancy Trend (LOESS)",
    x = NULL,
    y = "Life Expectancy at Birth (years)"
  ) +
  theme_minimal()

Think about:
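A related thing to try is putting a number on the linear trend. Because country is mapped to color, geom_smooth(method = "lm") fits a separate straight line for each country. The sketch below fits the same regression per country and keeps the slope. The toy data frame and the names slopes and slope_per_year are placeholders with made-up values. Swap in the data object from wb_data().

```r
library(dplyr)

# Made-up placeholder values so the sketch runs offline
toy <- data.frame(
  country  = rep(c("Country A", "Country B"), each = 4),
  date     = rep(c(1990, 2000, 2010, 2020), times = 2),
  life_exp = c(70, 72, 74, 76, 60, 63, 66, 69)
)

# Fit life_exp ~ date separately for each country and keep the slope:
# the average change in life expectancy (years) per calendar year
slopes <- toy |>
  group_by(country) |>
  summarise(
    slope_per_year = coef(lm(life_exp ~ date))[["date"]],
    .groups = "drop"
  )

slopes
```

Comparing these slopes with what the LOESS curve shows is one way to judge whether a single straight line is a fair summary of each country's trajectory.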
The classic "Gapminder"-style scatter plot: explore the relationship between two development indicators, ideally for one year.
In Tableau:
- Country / Region to Label (show on hover or always)
- Population Total to Size: does this change what you see?

In R:
# Filter to a single year for a snapshot
data_2019 <- data |> filter(date == 2019)
ggplot(data_2019, aes(x = gdp_pc, y = life_exp, size = population, label = country)) +
  geom_point(alpha = 0.6, color = "steelblue") +
  geom_text(size = 2.5, vjust = -0.8, check_overlap = TRUE) +
  scale_x_log10(labels = scales::label_dollar(suffix = "K", scale = 1/1000)) +
  scale_size_continuous(range = c(2, 14), guide = "none") +
  labs(
    title = "GDP per Capita vs. Life Expectancy (2019)",
    subtitle = "Bubble size = population",
    x = "GDP per Capita (log scale, USD)",
    y = "Life Expectancy at Birth (years)"
  ) +
  theme_minimal()

Think about:
Compare multiple countries across the same time period using two different approaches: overlaid lines and small multiples (facets). Decide which works better for your data.
Approach A: Overlay
In Tableau:
- Country / Region to Color

In R:
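For the overlay, a minimal sketch is to map country to color so that every line shares one panel, much like the basic time series plot. The toy data frame below holds made-up placeholder values so the example runs without a download, and the name p_overlay is arbitrary. Replace toy with the data object from wb_data().

```r
library(ggplot2)

# Made-up placeholder values so the sketch runs offline
toy <- data.frame(
  country  = rep(c("Country A", "Country B", "Country C"), each = 4),
  date     = rep(c(1990, 2000, 2010, 2020), times = 3),
  life_exp = c(70, 72, 74, 75, 65, 68, 71, 72, 60, 63, 67, 69)
)

# Overlay: one panel, one line per country, distinguished by color
p_overlay <- ggplot(toy, aes(x = date, y = life_exp, color = country)) +
  geom_line(linewidth = 1) +
  labs(
    title = "Life Expectancy by Country (Overlay)",
    x = NULL, y = "Life Expectancy (years)", color = NULL
  ) +
  theme_minimal() +
  theme(legend.position = "bottom")

p_overlay
```

With a handful of countries the shared axis makes direct comparison easy. As the number of lines grows, the colors become harder to tell apart, which is where small multiples may read better.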
Approach B: Small Multiples (Facets)
In Tableau:
- Country / Region to Columns or Rows to create panels

In R:
ggplot(data, aes(x = date, y = life_exp)) +
  geom_line(color = "steelblue", linewidth = 1) +
  geom_smooth(method = "lm", se = FALSE, color = "darkred", linetype = "dashed", linewidth = 0.7) +
  facet_wrap(~ country, ncol = 3) +
  labs(
    title = "Life Expectancy by Country (Small Multiples)",
    x = NULL, y = "Life Expectancy (years)"
  ) +
  theme_minimal() +
  theme(strip.text = element_text(face = "bold"))

Think about:
- scales = "fixed" vs. scales = "free_y" in facet_wrap()

Take one of your visualizations from Tasks 1–4 and add annotations to highlight a specific story, event, or pattern in the data.
This could be:
In Tableau:
In R:
library(ggrepel) # For non-overlapping text labels; install if needed
# Identify a country/year to highlight
highlight <- data |> filter(country == "Costa Rica", date == 2020)
ggplot(data, aes(x = date, y = life_exp, color = country)) +
  geom_line(linewidth = 1, alpha = 0.7) +
  # Add a vertical reference line for an event year
  geom_vline(xintercept = 2020, linetype = "dashed", color = "gray40") +
  annotate("text", x = 2020.3, y = min(data$life_exp, na.rm = TRUE) + 1,
           label = "COVID-19\npandemic", hjust = 0, size = 3, color = "gray40") +
  # Highlight a specific point
  geom_point(data = highlight, aes(x = date, y = life_exp),
             color = "red", size = 4, shape = 21, fill = "white", stroke = 2) +
  labs(
    title = "Life Expectancy with Annotation",
    x = NULL, y = "Life Expectancy (years)", color = NULL
  ) +
  theme_minimal()

Think about:
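One option to experiment with here is direct labeling: placing each country's name at the end of its line instead of relying on a legend. The sketch below uses geom_text_repel() from ggrepel to keep the labels from overlapping. The toy data frame holds made-up placeholder values, and the names last_points and p_labeled are arbitrary. Replace toy with the data object from wb_data().

```r
library(dplyr)
library(ggplot2)
library(ggrepel)

# Made-up placeholder values so the sketch runs offline
toy <- data.frame(
  country  = rep(c("Country A", "Country B", "Country C"), each = 4),
  date     = rep(c(1990, 2000, 2010, 2020), times = 3),
  life_exp = c(70, 72, 74, 75, 65, 68, 71, 72, 60, 63, 67, 69)
)

# Keep each country's most recent row to use as the label position
last_points <- toy |>
  group_by(country) |>
  slice_max(date, n = 1) |>
  ungroup()

p_labeled <- ggplot(toy, aes(x = date, y = life_exp, color = country)) +
  geom_line(linewidth = 1) +
  # Labels are nudged right of the last point and repelled vertically only
  geom_text_repel(
    data = last_points, aes(label = country),
    direction = "y", hjust = 0, nudge_x = 1, size = 3
  ) +
  # Leave extra room on the right so the labels are not clipped
  scale_x_continuous(expand = expansion(mult = c(0.05, 0.25))) +
  theme_minimal() +
  theme(legend.position = "none")

p_labeled
```

Whether direct labels beat a legend depends on how crowded the right edge of the chart is, so it is worth trying both.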
Submit ONE well-designed, polished visualization. It can come from either Tableau or R; choose whichever produced the stronger result for your chosen story.
Your submission should include:
- The visualization (exported image or screenshot)
  - ggsave("myplot.png", width = 10, height = 6, dpi = 300)
- Your region/theme choice and which countries/indicators you used
- A brief write-up (3–5 sentences) addressing:
| Component | Weight | What we're looking for |
|---|---|---|
| Technical execution | 30% | Code/Tableau runs, axes labeled, data handled correctly |
| Design quality | 40% | Appropriate chart type, readable layout, thoughtful color and annotation |
| Reflection | 30% | Evidence of experimentation and genuine comparison of the two tools |
Tableau:
R:
- wbstats package documentation
- ggrepel for non-overlapping labels

Inspiration:
Remember: The goal is not to master Tableau or R in one lab. The goal is to practice thinking about how to visualize change over time and relationships between variables, and to start developing your own sense of when each tool serves you better. Learn by doing, trying, breaking, and fixing!