Census data quickly

A quick demo of accessing American Community Survey (ACS) data using the tidycensus package. This post shows how to pull useful variables from the Census Data Profile tables (DP02, DP03, DP04, DP05) for analysis and mapping.

Setting up variable mappings

First, we’ll define a vector of useful variables from the Census Data Profile tables. These tables contain pre-calculated percentages and key demographic/economic indicators that are commonly used in analysis:

dp_vars_expanded <- c(
  # DP02 - Social Characteristics
  "pct_bachelor_degree_plus_adults25plus" = "DP02_0068P",
  "pct_high_school_graduate_adults25plus" = "DP02_0067P",
  "pct_english_only_home_pop5plus" = "DP02_0113P",
  "pct_spanish_home_pop5plus" = "DP02_0116P",
  "pct_broadband_subscription_households" = "DP02_0154P",
  
  # DP03 - Economic Characteristics  
  "per_capita_income" = "DP03_0088",
  "median_household_income" = "DP03_0062",                
  "pct_below_poverty_all_people" = "DP03_0128P",          
  "pct_unemployed_labor_force" = "DP03_0009P",
  "pct_gov_employment_employed16plus" = "DP03_0048P",  # class of worker: government workers
  "pct_snap_benefits_households" = "DP03_0074P",
  
  # DP04 - Housing Characteristics
  "pct_owner_occupied_housing_units" = "DP04_0046P",
  "pct_no_vehicle_households" = "DP04_0058P",  # vehicles available is a DP04 (housing) item
  
  # DP05 - Demographics 
  "pct_under_5_total_pop" = "DP05_0005P",                 
  "pct_under_18_total_pop" = "DP05_0019P",   
  "pct_over_65_total_pop" = "DP05_0024P",
  
  # Race/Ethnicity
  "pct_hispanic_any_race_total_pop" = "DP05_0076P",
  "pct_white_non_hispanic_total_pop" = "DP05_0082P",
  "pct_black_non_hispanic_total_pop" = "DP05_0083P", 
  "pct_asian_non_hispanic_total_pop" = "DP05_0085P",
  "pct_native_american_non_hispanic_total_pop" = "DP05_0084P"
)
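Profile variable codes are easy to mistype and can shift between ACS releases, so a quick sanity check on the vector can catch problems before any API call. The sketch below is only an illustration: check_dp_vars is a hypothetical helper (not part of tidycensus) that checks code format and uniqueness on a small sample vector. The guarded section at the end assumes tidycensus is installed and uses load_variables() with the "acs5/profile" dataset to look up the official labels.

```r
# Hypothetical helper, not part of tidycensus: basic checks on a named
# vector of Data Profile codes.
check_dp_vars <- function(vars) {
  stopifnot(
    is.character(vars),
    !is.null(names(vars)),
    !anyDuplicated(names(vars)),   # friendly names must be unique
    !anyDuplicated(unname(vars))   # each code should appear only once
  )
  # Codes look like DP03_0062 (estimate) or DP03_0062P (percent)
  bad <- vars[!grepl("^DP0[2-5]_[0-9]{4}P?$", vars)]
  if (length(bad) > 0) {
    stop("Unexpected code format: ", paste(bad, collapse = ", "))
  }
  # Count of variables per profile table (DP02, DP03, ...)
  table(sub("_.*$", "", vars))
}

sample_vars <- c(
  "median_household_income" = "DP03_0062",
  "pct_below_poverty_all_people" = "DP03_0128P",
  "pct_owner_occupied_housing_units" = "DP04_0046P"
)
vars_per_table <- check_dp_vars(sample_vars)
print(vars_per_table)

# Optional label lookup; it needs network access, so it only runs interactively.
if (interactive() && requireNamespace("tidycensus", quietly = TRUE)) {
  profile_vars <- tidycensus::load_variables(2023, "acs5/profile")
  print(profile_vars[profile_vars$name %in% sample_vars, c("name", "label")])
}
```

Comparing the returned labels against the friendly names is the most direct way to confirm that each code still means what the name says for the chosen year.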

Pulling data for multiple geographies

Now we’ll use tidycensus::get_acs() to pull data at three geographic levels: the state, state senate districts, and state house districts. We use the ACS 5-year estimates because 1-year estimates are published only for areas with at least 65,000 people, a threshold many legislative districts fall below:

# Pull data for all three geographies
nm_state <- tidycensus::get_acs(
  geography = "state",
  variables = dp_vars_expanded,
  year = 2023,
  survey = "acs5",
  state = "NM"
)

nm_senate <- tidycensus::get_acs(
  geography = "state legislative district (upper chamber)",
  variables = dp_vars_expanded,
  year = 2023,
  survey = "acs5", 
  state = "NM"
)

nm_house <- tidycensus::get_acs(
  geography = "state legislative district (lower chamber)",
  variables = dp_vars_expanded,
  year = 2023,
  survey = "acs5",
  state = "NM"
)
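The three calls above differ only in the geography argument, so they could be collapsed into a loop. The following is a minimal sketch, assuming dplyr is installed. pull_geographies and its fetch parameter are hypothetical names introduced here; fetch defaults to tidycensus::get_acs, and a stub can be passed instead to try the logic offline.

```r
# Display label -> geography string understood by get_acs()
geographies <- c(
  "State" = "state",
  "Senate District" = "state legislative district (upper chamber)",
  "House District" = "state legislative district (lower chamber)"
)

# Hypothetical wrapper: one request per geography, stacked into one data frame
pull_geographies <- function(variables, geographies,
                             fetch = tidycensus::get_acs) {
  results <- lapply(geographies, function(geo) {
    fetch(
      geography = geo,
      variables = variables,
      year = 2023,
      survey = "acs5",
      state = "NM"
    )
  })
  # lapply() keeps the vector names, which become the geo_type column
  dplyr::bind_rows(results, .id = "geo_type")
}

# Real usage (needs network access and, typically, a Census API key):
# all_nm_data <- pull_geographies(dp_vars_expanded, geographies)

# Offline demonstration with a stub that mimics the shape of get_acs() output
fake_fetch <- function(geography, variables, ...) {
  data.frame(
    GEOID = "35",
    NAME = geography,
    variable = names(variables),
    estimate = 1,
    moe = 0.1
  )
}

demo <- pull_geographies(
  c("median_household_income" = "DP03_0062"),
  geographies,
  fetch = fake_fetch
)
print(demo)
```

The explicit calls in the post are easier to read for three geographies; a loop like this mainly pays off when more levels or several states are added.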

# Optional: Combine all into one dataset with geography type
library(dplyr)  # provides bind_rows(), mutate(), and select()

all_nm_data <- bind_rows(
  nm_state |> mutate(geo_type = "State"),
  nm_senate |> mutate(geo_type = "Senate District"), 
  nm_house |> mutate(geo_type = "House District")
)

Exploring the data

Let’s take a look at the combined dataset to see what we’ve pulled. By default get_acs() returns long-format data: one row per geography and variable, with GEOID, NAME, variable, estimate, and moe (margin of error) columns:

all_nm_data |> select(-geo_type) |> head(100) |> 
  DT::datatable(
    rownames = FALSE,
    options = list(
      scrollX = TRUE,
      autoWidth = TRUE,
      columnDefs = list(list(width = "100%", targets = "_all"))
    ),
    width = "100%"
  )
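Because every row carries a margin of error, it can be worth checking reliability before comparing small districts. get_acs() reports margins of error at the 90 percent confidence level by default, so the standard error is roughly moe / 1.645. The sketch below uses made-up numbers shaped like get_acs() output (not real ACS estimates), and the 30 percent cutoff is an arbitrary illustrative threshold rather than an official standard.

```r
library(dplyr)

# Made-up rows in the long format that get_acs() returns
toy_acs <- data.frame(
  GEOID = c("35001", "35002", "35003"),
  variable = "pct_spanish_home_pop5plus",
  estimate = c(40, 10, 2),
  moe = c(4, 3, 2.5)
)

flagged <- toy_acs |>
  mutate(
    se = moe / 1.645,                  # 90% MOE -> standard error
    cv_pct = 100 * se / estimate,      # coefficient of variation, in percent
    low_reliability = cv_pct > 30      # illustrative cutoff (assumption)
  )

print(flagged)
```

Small percentages with wide margins, like the third toy row, are the ones most likely to produce misleading contrasts on a map.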

Quick map

For mapping, we need to pull the data with geometry = TRUE to get the spatial boundaries. We’ll also reproject to NAD83 / New Mexico Central (EPSG:32113), a coordinate reference system designed for the state, so the districts are drawn with less distortion:

nm_house2 <- tidycensus::get_acs(
  geography = "state legislative district (lower chamber)",
  variables = dp_vars_expanded,
  year = 2023,
  survey = "acs5",
  state = "NM",
  geometry = TRUE
)


# Choose a projected CRS suited to NM.
# EPSG:5070 (NAD83 / Conus Albers) works well for US-wide data; for a single
# state, NAD83 / New Mexico Central (EPSG:32113) is a closer fit.
nm_house2 <- sf::st_transform(nm_house2, crs = 32113)

# Simple choropleth example: Spanish spoken at home, by district
library(ggplot2)

ggplot(
  nm_house2 |>
    subset(variable == "pct_spanish_home_pop5plus")
) +
  geom_sf(aes(fill = estimate), color = "grey60", linewidth = 0.2) +
  scale_fill_distiller(palette = "Blues", 
                       direction = 1, 
                       na.value = "grey85") +
  theme_minimal(base_size = 12) +
  labs(
    title = "Spanish Spoken at Home (Population 5+)",
    subtitle = "New Mexico House Districts — ACS 2019–2023",
    fill = "%"
  )
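Before reprojecting, it can help to confirm which coordinate reference system an EPSG code actually refers to. The following is a minimal sketch, assuming sf is installed; the point is an approximate longitude/latitude for Albuquerque, used only for illustration.

```r
library(sf)

# Describe the CRS behind the EPSG code used for the map
nm_crs <- st_crs(32113)
print(nm_crs$Name)

# An approximate point for Albuquerque in longitude/latitude (WGS 84)
abq <- st_sfc(st_point(c(-106.65, 35.08)), crs = 4326)

# Reproject and inspect the projected coordinates
abq_nm <- st_transform(abq, 32113)
print(st_coordinates(abq_nm))
```

Printing the CRS name is a cheap guard against passing a code that belongs to a different state or zone.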

Fin

The Data Profile tables are particularly useful because they provide many commonly used demographic and economic indicators as precomputed percentages, so you don’t have to derive them yourself from the detailed tables.