This vignette walks through the typical end-to-end use of
ACSdownload::get_acs_new(): which call shape to use, how to
keep your laptop healthy when fetching every block group in the country,
and how to turn the resulting cryptic column codes into readable
labels.
The one call you usually want
library(ACSdownload)
bg <- get_acs_new(
  yr = 2024,
  fips = "blockgroup",
  tables = setdiff(ejscreen_acs_tables, c("C16001", "B18101")),
  return_list_not_merged = FALSE,
  cache_dir = tools::R_user_dir("ACSdownload", "cache")
)

What that does:
- yr = 2024 – targets the 2020-2024 vintage (released 2026-01-29). Use acs_endyear_like_ejam() if you want today’s “best guess” of the latest published vintage.
- fips = "blockgroup" – filters every table to SUMLEVEL 150 rows.
- tables = setdiff(...) – the full EJSCREEN list except the two tract-only tables (C16001, B18101), since asking for them at blockgroup resolution produces zero rows.
- return_list_not_merged = FALSE – merges everything on fips and hands you one wide data.table instead of a list of 14.
- cache_dir = ... – persists the downloaded .dat files. The table-based summary files are immutable once a vintage ships, so the second time you run this call it completes in milliseconds.
Why this is faster than the API
There are about 244,000 block groups in the U.S.
Pulling 16 tables nationwide via the Census API would mean tens of
thousands of paginated requests, an API key, and rate limits.
get_acs_new() does 16 large file fetches –
one per table – and lets data.table::fread() parse each one
in a few seconds. On a residential connection, a full block-group pull
typically takes 3-10 minutes of wall-clock time, and close to zero on the
second run with caching.
Why those two tables get dropped
C16001 (detailed languages spoken) and
B18101 (disability) are published at tract resolution only.
EJSCREEN repeats their tract values onto each blockgroup in the tract;
if you need that behavior, fetch them separately:
tracts <- get_acs_new(
  yr = 2024,
  fips = "tract",
  tables = c("C16001", "B18101")
)

and join into your block-group table afterwards on the tract
substring of fips.
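The join described above can be sketched in base R. This is a minimal illustration on toy data, assuming that fips is a character column holding 12-digit block-group GEOIDs in one table and 11-digit tract GEOIDs in the other; the values and the tract_fips helper column are made up for the example, and real get_acs_new() output will have many more columns.

```r
# Toy stand-ins for the two fetched tables (values are invented).
bg <- data.frame(
  fips = c("010010201001", "010010201002", "010010202001"),
  B25034_001 = c(250, 310, 420),
  stringsAsFactors = FALSE
)
tracts <- data.frame(
  fips = c("01001020100", "01001020200"),
  B18101_001 = c(1900, 2100),
  stringsAsFactors = FALSE
)

# A block-group GEOID is the 11-character tract GEOID plus one block-group digit.
bg$tract_fips <- substr(bg$fips, 1, 11)

# Rename the tract key so it does not collide with the block-group fips column.
names(tracts)[names(tracts) == "fips"] <- "tract_fips"

# Left join: every block group keeps its row and inherits its tract's values.
bg_with_tract <- merge(bg, tracts, by = "tract_fips", all.x = TRUE, sort = FALSE)
```

Keeping fips as character matters here: a numeric fips would lose leading zeros and the substring would no longer line up with the tract code.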
Parallel + retry
If the network is misbehaving, get_acs_new() retries
each table on HTTP 429 / 5xx with exponential backoff (default 3
retries). If 16 sequential downloads is too slow, opt into
parallelism:
future::plan(future::multisession, workers = 4)
bg <- get_acs_new(
  yr = 2024,
  fips = "blockgroup",
  tables = setdiff(ejscreen_acs_tables, c("C16001", "B18101")),
  parallel = TRUE,
  cache_dir = tools::R_user_dir("ACSdownload", "cache")
)

You need both future and future.apply installed. The caller is
responsible for setting a future::plan(); without one, the downloads
still run sequentially even with parallel = TRUE.
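For readers unfamiliar with the retry behavior mentioned at the start of this section, here is a generic sketch of retry with exponential backoff. It is not the package's implementation: with_retry() and its arguments are hypothetical names, and for simplicity it retries on any R error rather than only on HTTP 429 / 5xx responses.

```r
# Hypothetical helper, not part of ACSdownload: call fetch() and retry on error,
# doubling the wait between attempts.
with_retry <- function(fetch, max_retries = 3, base_delay = 1, sleep = Sys.sleep) {
  for (attempt in 0:max_retries) {
    result <- tryCatch(fetch(), error = function(e) e)
    if (!inherits(result, "error")) return(result)
    # Out of retries: re-raise the last error.
    if (attempt == max_retries) stop(result)
    # Wait base_delay * 1, 2, 4, ... seconds before the next attempt.
    sleep(base_delay * 2^attempt)
  }
}

# Simulated download that fails twice, then succeeds.
calls <- 0
flaky_fetch <- function() {
  calls <<- calls + 1
  if (calls < 3) stop("HTTP 503")
  "ok"
}
result <- with_retry(flaky_fetch, base_delay = 0)
```

With the default of three retries, a table is attempted at most four times before the error is passed up to the caller.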
Decoding the column names
The columns look like B25034_001,
B25034_002, B25034_M001, … That’s: table code,
then _<NNN> for estimates or
_M<NNN> for margins of error. To turn those into
something readable, use acs_label(). It accepts either estimate or MOE
column names; both map to the same label. It returns NA for codes that
don’t appear in the shipped lookup, which is built from the Census 2022
5-year table shells.
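The naming scheme itself can be unpacked with a regular expression. The sketch below uses only base R and does not call acs_label(); parse_acs_column() is a hypothetical helper written for this illustration, and its table-code pattern is deliberately a little more permissive than the package's validator.

```r
# Hypothetical helper: split ACS column names into table code, kind, and line number.
parse_acs_column <- function(cols) {
  # Table code, underscore, optional "M" for margin of error, three-digit line number.
  pattern <- "^([BC][0-9]{5}[A-Z]*)_(M?)([0-9]{3})$"
  matches <- regmatches(cols, regexec(pattern, cols))
  # Non-matching names (e.g. "fips") yield NA in every parsed field.
  parts <- lapply(matches, function(x) {
    if (length(x) == 4) x[2:4] else rep(NA_character_, 3)
  })
  out <- as.data.frame(do.call(rbind, parts), stringsAsFactors = FALSE)
  names(out) <- c("table", "kind", "line")
  out$kind <- ifelse(out$kind == "M", "moe", "estimate")
  out$line <- as.integer(out$line)
  data.frame(column = cols, out, stringsAsFactors = FALSE)
}

parsed <- parse_acs_column(c("B25034_001", "B25034_002", "B25034_M001", "fips"))
```

Here B25034_001 and B25034_M001 share the same table and line number, which is why an estimate and its margin of error resolve to the same label.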
Picking just what you need
You don’t always want every column from every table.
variables narrows the column selection at fetch time:
just_pre1960 <- get_acs_new(
  yr = 2024,
  fips = "blockgroup",
  tables = "B25034",
  variables = c("B25034_001", "B25034_010", "B25034_011"),
  keep_moe = FALSE
)

- variables is matched in the post-rename form (no _E infix).
- keep_moe = FALSE drops margin-of-error columns.
- keep_annotations = TRUE keeps _EA<nnn> / _MA<nnn> annotation columns (some vintages include them; v3 drops them by default).
When something goes wrong
The first things to check if get_acs_new() errors:
- yr out of range? validate_acs_endyear() rejects years below 2022 or above today’s year + 1.
- Unknown table code? validate_acs_tables() enforces ^[BC][0-9]{5}[A-I]?(PR)?$. Common typo: passing the EJSCREEN-style name (e.g. "pop") instead of the Census code.
- HTTP 404? Means the table doesn’t exist for that vintage. Check url_acs_table(tables = "...", yr = ...) for the table’s data.census.gov landing page.
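The first two checks can be reproduced by hand before making a call. This sketch mirrors the rules stated above in base R rather than calling the package's validators; candidates and year_in_range() are names made up for the illustration.

```r
# The table-code pattern quoted above.
table_pattern <- "^[BC][0-9]{5}[A-I]?(PR)?$"

candidates <- c("B25034", "C16001", "B01001A", "B25034PR", "pop", "b25034")
ok <- grepl(table_pattern, candidates)

# Codes that would be rejected: an EJSCREEN-style name and a lower-case code.
rejected <- candidates[!ok]

# Year rule as stated above: not below 2022, not above the current year + 1.
this_year <- as.integer(format(Sys.Date(), "%Y"))
year_in_range <- function(yr) yr >= 2022 & yr <= this_year + 1
```

The pattern is case-sensitive, so "b25034" fails just as "pop" does.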