STAT 80B: Data Visualization

Week 1 - Data Types, Aesthetics & Getting Started

01 Dec 2025

Welcome Back! 🎨

Today’s Agenda:

  1. Review: The grammar of graphics and data types
  2. Data types and their visual representations
  3. Understanding aesthetic mappings
  4. Scales: Connecting data values to visual properties
  5. Hands-on: Software installation and your first visualization!

By the end of today: You’ll have created your first visualization and be ready for Lab 1! 🚀

Quick Recap: What Are Aesthetics?

From Tuesday, remember:

Aesthetics = Visual properties that can represent data

  • Position (x, y)
  • Color
  • Size
  • Shape
  • Transparency
  • Line type/width

Quick Recap: Data Types

From Tuesday, remember:

Data Types = Quantitative and Categorical

  • Quantitative Discrete
  • Quantitative Continuous
  • Ordinal
  • Categorical Nominal
  • Categorical Binary

Today we’ll learn: Which aesthetics work best for which types of data!

Aesthetic Mappings

What Is Mapping?

Mapping = Assigning data values to visual properties

Think of it as a translation:

Data value → Visual property

Examples:

  • Temperature data → y-position on a chart
  • Country name → color
  • Number of items → bar height
  • Date → x-position

The Mapping Table

Data Type               | Best Aesthetics                    | Avoid
Continuous Quantitative | Position, Size, Color intensity    | Shape, Discrete colors
Discrete Quantitative   | Position, Size, Color (with care)  | Too many colors
Categorical             | Color, Shape, Position (grouped)   | Size (implies order)
Ordinal                 | Color (sequential), Size, Position | Random colors
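
One way to internalize the table is to write it down as a lookup and test a few design choices against it. The Python sketch below is illustrative only: the names RECOMMENDED and check_mapping are hypothetical, and the entries are a simplified copy of the "Best Aesthetics" column.

```python
# Illustrative only: the "Best Aesthetics" column as a lookup.
# RECOMMENDED and check_mapping are hypothetical names for this sketch.
RECOMMENDED = {
    "continuous": {"position", "size", "color intensity"},
    "discrete": {"position", "size", "color"},
    "categorical": {"color", "shape", "position"},
    "ordinal": {"color", "size", "position"},
}


def check_mapping(data_type, aesthetic):
    """Return True if the table lists this aesthetic as a good fit."""
    return aesthetic in RECOMMENDED[data_type]


print(check_mapping("categorical", "shape"))  # True
print(check_mapping("continuous", "shape"))   # False
```

A lookup like this cannot replace judgment (the "Avoid" column depends on context), but it is a quick sanity check for the Think-Pair-Share designs.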

Position Aesthetic

Most effective aesthetic for showing quantitative data!

Why?

  • Humans are excellent at comparing positions
  • Works for both continuous and discrete data
  • Supports wide range of values

Good for:

  • Time series (x-axis)
  • Comparisons (y-axis)
  • Relationships (scatter plots)

Limitations:

  • Limited to 2-3 dimensions
  • Can be crowded with many data points

Color Aesthetic

Powerful but tricky - can show both categorical and quantitative data!

For Categorical Data

Use distinct, qualitative colors

✅ Red, Blue, Green for different categories
❌ Light blue, medium blue, dark blue (implies order!)

For Quantitative Data

Use sequential or diverging color scales

✅ Light → Dark gradient for “low to high”
✅ Blue → White → Red for “negative to positive”

Size Aesthetic

Best for quantitative data showing magnitude

Common uses:

  • Bubble charts (circle area represents values)
  • Sized text (word clouds)
  • Line width in network diagrams

Important:

  • Size = Area, not radius! (We’ll discuss why later)
  • Can be hard to compare precisely
  • Don’t use size for categorical data (implies ordering)
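
A quick preview of the "area, not radius" point, as a small Python sketch. The function name radius_for_value and the values 10 and 40 are made up for illustration. If a value is 4 times larger, the circle's area should be 4 times larger, which means the radius only doubles.

```python
import math


def radius_for_value(value, area_per_unit=1.0):
    """Radius of a circle whose AREA is proportional to value."""
    return math.sqrt(value * area_per_unit / math.pi)


r_small = radius_for_value(10)
r_big = radius_for_value(40)  # 4x the value

# Area grows with radius squared, so the radius only doubles.
radius_ratio = r_big / r_small
area_ratio = (math.pi * r_big**2) / (math.pi * r_small**2)

print(round(radius_ratio, 2))  # 2.0
print(round(area_ratio, 2))    # 4.0
```

If the value were mapped to the radius instead, the larger circle would have 16 times the area and look far bigger than the data justifies.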

Shape Aesthetic

Best for categorical data with few categories

Strengths:

  • Distinguishable even for colorblind viewers
  • Works in black & white printing
  • Can combine with color for redundancy

Limitations:

  • Hard to distinguish more than ~6 shapes
  • Not all shapes have same visual weight
  • Never use for quantitative data!

Combining Aesthetics: The Power of Redundancy

Multiple aesthetics can show the same variable for emphasis or accessibility!

Example: Visualizing car efficiency by type

  • Car type → Color (Blue = Sedan, Red = SUV)
  • Car type → Shape (Circle = Sedan, Square = SUV)

Benefits:

  • Helps colorblind viewers
  • Reinforces the message
  • Works in black & white

Caution: Don’t overdo it - can become cluttered!
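
To see why redundancy helps, here is a minimal Python sketch of the car example. The dictionaries and the functions encode and to_grayscale are hypothetical names for illustration, and "gray" is just a stand-in for what happens to color in black & white printing.

```python
# Encodings taken from the car example above; names are hypothetical.
COLOR = {"Sedan": "blue", "SUV": "red"}
SHAPE = {"Sedan": "circle", "SUV": "square"}


def encode(car_type):
    """Return both visual channels for one category (redundant coding)."""
    return {"color": COLOR[car_type], "shape": SHAPE[car_type]}


def to_grayscale(encoding):
    """Simulate black & white printing: color is lost, shape survives."""
    return {"color": "gray", "shape": encoding["shape"]}


sedan, suv = encode("Sedan"), encode("SUV")

# Even without color, the two car types remain distinguishable by shape.
print(to_grayscale(sedan) != to_grayscale(suv))  # True
```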

🔄 THINK-PAIR-SHARE #2 (7 minutes)

Scenario: Visualizing Student Performance Data

Variables:

  • Math score (0-100): Continuous
  • Reading score (0-100): Continuous
  • Grade level (9th, 10th, 11th, 12th): Ordinal
  • School type (Public, Private, Charter): Categorical
  • Number of absences: Discrete quantitative

Task: Design TWO different visualizations using different aesthetic mappings. For each:

  1. Choose which variables to include
  2. Decide which aesthetics to use (position, color, size, shape)
  3. Justify your choices

Share Your Designs! 📊

Let’s see different approaches:

  • What variables did you prioritize?
  • Which aesthetics did you choose and why?
  • Did anyone use redundant coding?

Key insight: Same data, many valid visualization approaches! The “best” one depends on what question you’re trying to answer.

Scales: Connecting Data to Visuals

What Are Scales?

Scales define HOW data values map to aesthetic properties

Think of scales as the translation dictionary:

Data Domain → Visual Range

Example: Temperature scale

  • Data domain: 0°F to 100°F
  • Visual range: Bottom of chart to top of chart
  • Scale type: Linear (equal spacing)
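
The temperature example can be sketched in a few lines of Python. This is a simplified illustration, not how any particular plotting library implements scales: the function name linear_scale and the 400-pixel chart height are assumptions.

```python
def linear_scale(value, domain, visual_range):
    """Map a value from the data domain to the visual range, equally spaced."""
    d_min, d_max = domain
    r_min, r_max = visual_range
    fraction = (value - d_min) / (d_max - d_min)
    return r_min + fraction * (r_max - r_min)


# Assumed chart: 400 pixels tall, 0 = bottom of chart, 400 = top of chart.
for temp in (0, 50, 100):
    print(temp, linear_scale(temp, (0, 100), (0, 400)))
# 0 0.0
# 50 200.0
# 100 400.0
```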

Types of Scales

Linear Scales

Equal steps in data = equal steps in visualization

✅ Best for: Most quantitative data
Example: 0, 10, 20, 30… equally spaced

Logarithmic Scales

Equal multiples in data = equal steps in visualization

✅ Best for: Data spanning many orders of magnitude
Example: 1, 10, 100, 1000… equally spaced
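
A logarithmic scale does the same kind of mapping after taking the log of the data. The Python sketch below assumes a hypothetical 300-pixel axis and the name log_scale; it shows that 1, 10, 100, and 1000 land at evenly spaced positions.

```python
import math


def log_scale(value, domain, visual_range):
    """Map a positive value to the visual range using base-10 logs."""
    d_min, d_max = domain
    r_min, r_max = visual_range
    fraction = (math.log10(value) - math.log10(d_min)) / (
        math.log10(d_max) - math.log10(d_min)
    )
    return r_min + fraction * (r_max - r_min)


# Assumed axis: 300 pixels long, data from 1 to 1000.
positions = [round(log_scale(v, (1, 1000), (0, 300)), 1) for v in (1, 10, 100, 1000)]
print(positions)  # [0.0, 100.0, 200.0, 300.0]
```

Note that a log scale only works for positive values, since the log of zero or a negative number is undefined.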

Color Scales

Different types for different data:

Sequential

Light → Dark (for quantitative data)

Example: Population density (low = light blue, high = dark blue)

Diverging

Two colors meeting at middle (for data with meaningful center)

Example: Temperature anomaly (cold = blue, neutral = white, hot = red)

Qualitative

Distinct colors (for categorical data)

Example: Political parties (red, blue, green, etc.)
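
The three color scale types can be sketched in Python as follows. This is a simplified illustration: the RGB values and the names lerp_color, sequential, diverging, and QUALITATIVE are assumptions, and real palettes are designed more carefully than a straight blend in RGB.

```python
def lerp_color(c1, c2, t):
    """Blend two RGB colors; t=0 gives c1, t=1 gives c2."""
    return tuple(round(a + (b - a) * t) for a, b in zip(c1, c2))


# Assumed RGB values, chosen only for illustration.
LIGHT_BLUE, DARK_BLUE = (222, 235, 247), (8, 48, 107)
BLUE, WHITE, RED = (0, 0, 255), (255, 255, 255), (255, 0, 0)


def sequential(t):
    """t in [0, 1]: low = light, high = dark."""
    return lerp_color(LIGHT_BLUE, DARK_BLUE, t)


def diverging(t):
    """t in [-1, 1]: negative = blue, 0 = white (neutral), positive = red."""
    if t < 0:
        return lerp_color(WHITE, BLUE, -t)
    return lerp_color(WHITE, RED, t)


# Qualitative: distinct colors with no implied order.
QUALITATIVE = {"A": "red", "B": "blue", "C": "green"}

print(sequential(0.0), sequential(1.0))
print(diverging(-1.0), diverging(0.0), diverging(1.0))
```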

Scale Limits: What to Show?

Question: Should your scale start at zero?

For Bar Charts: Almost always YES!

  • Bars represent amounts
  • Not starting at zero = misleading proportions

For Line Charts: Depends!

  • ✅ OK to zoom in on small changes (stock prices)
  • ✅ But make it clear you’re not starting at zero

For Scatter Plots: Usually no need for zero

  • Points represent individual values, not amounts
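
A little arithmetic shows why bars should start at zero. In the Python sketch below, the sales values and the function name bar_height_ratio are made up for illustration: the second value is only 10% larger, but with the axis starting at 90 its bar is drawn twice as long.

```python
def bar_height_ratio(a, b, axis_start=0):
    """Ratio of the drawn bar lengths when the axis starts at axis_start."""
    return (b - axis_start) / (a - axis_start)


sales_a, sales_b = 100, 110  # made-up values: b is 10% larger than a

print(bar_height_ratio(sales_a, sales_b))                 # 1.1
print(bar_height_ratio(sales_a, sales_b, axis_start=90))  # 2.0
```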

🧘‍♀️ STRETCH BREAK (5 minutes)

Time to move and recharge! 🤸‍♀️

When we return: Hands-on software tutorial - your first visualization!

Software Tutorial: Getting Started

Your Software Choices

Today we’ll cover two paths:

Path 1: Tableau

Best for:

  • Beginners
  • Interactive dashboards
  • Quick exploration
  • Business analytics

Pros: Intuitive, drag-and-drop
Cons: Less reproducible

Path 2: R/Python + Positron

Best for:

  • Reproducible research
  • Statistical analysis
  • Publication graphics
  • Custom visualizations

Pros: Fully reproducible, powerful
Cons: Steeper learning curve

Choose based on your goals! You can always learn the other later.

Tutorial Structure

  1. Installation (~10 min)
    • Tableau: Download and activate student license
    • R/Python: Set up Positron IDE
  2. Interface Tour (~10 min)
    • Understanding the workspace
    • Connecting to data
    • Basic navigation
  3. First Visualization (~15 min)
    • Load sample data
    • Create a simple chart
    • Customize and export
  4. Lab 1 Guidance (~5 min)
    • What to submit
    • Tips for success

Path 1: Tableau Desktop

Installing Tableau Desktop

Step 1: Get your student license

  1. Go to tableau.com/academic/students
  2. Use your (ucsc.edu?) email
  3. Check email for license key (TBD)
  4. Download Tableau Desktop (not Public)
  5. Install and activate with your key

💡 Tip: This can take 10-15 minutes, start now if you haven’t already!

Tableau Interface Tour

When you open Tableau, you’ll see:

Left: Connect pane (where you load data) - Excel, CSV, databases, web data, etc.

Center: Canvas (where you build visualizations) - Drag and drop fields here

Right: Show Me panel (suggested chart types) - Smart suggestions based on your data

Left (in a worksheet): Data pane - Lists all fields (dimensions and measures)

Data in Tableau: Dimensions vs. Measures

Tableau auto-categorizes your data:

Dimensions (Blue)

  • Qualitative fields
  • Categorical data
  • Dates
  • Text fields

Used to slice and group data

Measures (Green)

  • Quantitative fields
  • Numbers
  • Counts
  • Calculations

Used to aggregate and measure

You can change how Tableau treats fields if needed!

Creating Your First Tableau Viz

We’ll use: Sample - Superstore (built-in dataset)

Step 1: Connect to data

  • Click “Sample - Superstore” on start page

Step 2: Go to Sheet 1

  • Click “Sheet 1” tab at bottom

Step 3: Create a simple bar chart

  • Drag Sub-Category to Rows
  • Drag Sales to Columns

Boom! 💥 You have your first visualization!

Adding Color

Let’s add a second variable:

  • Drag Region to the Color card (in Marks panel)
  • Now bars are colored by region!

What do you notice?

  • Each sub-category is now subdivided by region
  • We’re using stacked bars
  • Color represents categorical data (Region)

Try this: Right-click the chart → Format → experiment with fonts, colors, etc.

Adding Size

Let’s make a bubble chart instead:

  1. Clear your shelf (click X on pills)
  2. Drag Sub-Category to Rows
  3. Drag Sales to Columns
  4. Change mark type (dropdown in Marks) to Circle
  5. Drag Profit to Size
  6. Drag Region to Color

You now have: Position showing sales/category, size showing profit, color showing region! 🎨

Exporting from Tableau

To save your visualization:

Option 1: Save as Workbook

  • File → Save As (.twbx file)
  • Contains data and visualizations

Option 2: Export Image

  • Worksheet → Export → Image (PNG)
  • Good for reports, presentations

Option 3: Copy to Clipboard

  • Worksheet → Copy → Image
  • Paste into documents

Path 2: R/Python with Positron

What Is Positron?

Positron = Modern IDE for data science

  • Built by Posit (makers of RStudio)
  • Works with both R and Python
  • Excellent for visualization
  • Interactive notebooks + scripts

Why Positron?

  • Modern interface
  • Great for learning
  • Smooth transition if you know RStudio
  • Handles both R and Python (flexibility!)

Installing Positron

Step 1: Download Positron

Step 2: Install R or Python (or both!)

Step 3: Install visualization packages

  • R: install.packages("ggplot2")
  • R: install.packages("tidyverse") (includes ggplot2 + more)
  • R: install.packages("palmerpenguins") (the penguins data used in the R examples)
  • Python: pip install matplotlib seaborn plotly

Positron Interface Tour

Left Panel:

  • File explorer
  • Variables/environment
  • Plots/viewer

Center:

  • Code editor
  • Notebooks

Bottom:

  • Console (run code)
  • Terminal
  • Help viewer

Top:

  • Toolbar
  • Run buttons

First Viz in R with ggplot2

Load packages and data:

# Load packages
library(ggplot2)
library(palmerpenguins)

# Load penguins data
data(penguins)
head(penguins)  # Look at the data

Create scatter plot:

ggplot(data = penguins, 
       aes(x = bill_length_mm, y = bill_depth_mm)) +
  geom_point()

The ggplot2 Grammar

Structure: ggplot() + layers

ggplot(data = <DATA>, aes(<MAPPINGS>)) +
  <GEOM_FUNCTION>() +
  <MORE LAYERS>

  • Data: What dataset?
  • Aesthetics (aes()): What mappings? (x, y, color, size…)
  • Geoms: What shapes? (points, lines, bars…)
  • + symbol: Add layers

Adding Aesthetics in ggplot2

Add color by species:

ggplot(data = penguins, 
       aes(x = bill_length_mm, y = bill_depth_mm, 
           color = species)) +
  geom_point(size = 3)

Add size by body mass:

ggplot(data = penguins, 
       aes(x = bill_length_mm, y = bill_depth_mm, 
           color = species,
           size = body_mass_g)) +
  geom_point(alpha = 0.6)

First Viz in Python

Import and load data:

import matplotlib.pyplot as plt
import seaborn as sns

# Load penguins dataset
penguins = sns.load_dataset('penguins')
print(penguins.head())

Create scatter plot:

plt.figure(figsize=(8, 6))
sns.scatterplot(data=penguins, 
                x='bill_length_mm', 
                y='bill_depth_mm',
                hue='species',
                size='body_mass_g')
plt.title('Penguin Bill Dimensions')
plt.show()

Saving Plots in R/Python

R (ggplot2):

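# Save the last plot displayed
ggsave("my_plot.png", width = 8, height = 6)

# Or be explicit: store the plot in a variable, then save it
my_plot <- ggplot(penguins, aes(x = bill_length_mm, y = bill_depth_mm)) +
  geom_point()
ggsave("my_plot.png", plot = my_plot, width = 8, height = 6)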

Python:

# Save current figure
plt.savefig('my_plot.png', dpi=300, bbox_inches='tight')

Lab 1: Getting Started

Lab 1 Requirements

Create 3 visualizations using different aesthetic mappings:

  1. Visualization 1: Use position and color
    • Example: Scatter plot with colored categories
  2. Visualization 2: Use position and size
    • Example: Bubble chart
  3. Visualization 3: Your choice - be creative!
    • Combine 3+ aesthetics
    • Try something different from examples

Dataset for Lab 1

Palmer Penguins Dataset 🐧:

A real dataset from Antarctic research with measurements of 344 penguins!

Variables include:

  • Species (Adelie, Chinstrap, Gentoo)
  • Island location
  • Bill measurements (length and depth)
  • Flipper length and body mass
  • Sex and year

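Where is the data? In R, it comes with the palmerpenguins package (library(palmerpenguins)); in Python, load it with sns.load_dataset('penguins').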

Why penguins? Mixed variable types, clear patterns, and adorable! 🐧

What to Submit

One PDF file containing:

  1. Your name and software used at the top

  2. Three visualizations, each with:

    • The image (full size, readable!)
    • Title describing what it shows
    • 2-3 sentences explaining:
      • What variables you mapped to which aesthetics
      • What pattern or insight it reveals
  3. Brief reflection (1 paragraph):

    • What was easy? What was challenging?
    • Which aesthetic mappings worked well?

Grading Rubric for Lab 1

Component            | Points | What We’re Looking For
Software Working     | 3      | All 3 visualizations present & exported
Different Aesthetics | 3      | Each viz uses different aesthetic combos
Descriptions         | 2      | Clear explanation of mappings
Reflection           | 2      | Thoughtful comments on process

Total: 10 points (remember: top 4 of 5 labs count!)

This is meant to be straightforward! Just show us you can make basic charts.

Lab 1 Tips for Success

  • Start today in class - use tutorial time to ask questions!
  • Don’t overthink it - simple charts are fine
  • Make sure images are readable - don’t shrink them too small
  • Export early and often - test your export workflow
  • Ask questions on Ed - help each other!
  • Due today (in class) - submit on Canvas before you leave!

Remember: The goal is to verify your software works and you understand basic mappings. Quality over complexity!

Common Pitfalls to Avoid

Too complex - Save fancy stuff for later!

Unreadable exports - Check image quality

No descriptions - We need to know your thinking

Using same aesthetics 3 times - Show variety

Waiting until last minute - Technical issues happen!

Live Work Time! 💻

For the rest of class (~30 min):

  1. Install your chosen software (if not done)
  2. Follow along with demo specific to your tool
  3. Start Lab 1 - create your 3 visualizations
  4. Ask questions! - We’re here to help
  5. Submit before you leave (or by end of day)

Breakout by tool:

  • Tableau users: Front of room
  • R/Python users: Back of room

Looking Ahead to Week 2

Tuesday: Color and Coordinate Systems (Wilke Ch 3-4)

  • Why color choices matter
  • Color palettes for different data types
  • Coordinate systems: Cartesian, polar, and more

Thursday: More practice + Lab 2 setup

Due Next Week:

  • Concept Map 1 (Friday): Summarizing data types, aesthetics, scales

Quick Confidence Check ✅

Rate your confidence (1-5) on Ed Discussion:

  1. I can distinguish quantitative, categorical, and ordinal data ⭐⭐⭐⭐⭐
  2. I understand what aesthetic mappings are ⭐⭐⭐⭐⭐
  3. I can create basic visualizations in my chosen software ⭐⭐⭐⭐⭐
  4. I know what to do for Lab 1 ⭐⭐⭐⭐⭐

If you rated anything 3 or below: Great time to ask questions! Stay after class or come to office hours.

Questions? 🤔

Stay after for help with:

  • Software installation issues
  • Lab 1 questions
  • Choosing which tool to use
  • Anything else!

Office hours: Right here after class (3:05-3:40 PM)

Thank you! Happy visualizing! 📊✨

Lab 1 due today!

See you Tuesday for Color & Coordinate Systems!

STAT 80B – Winter 2026