
This package is mainly a set of helper functions used by the EJAM package, but it can also be used on its own to download nationwide Census 2020 data. It does not require a Census API key.

It creates block datasets (and some blockgroup tables) for use in the EJAM package. It has basic functions for downloading from the Census Bureau, unzipping, and reading the 2020 Census data for some or all US states, DC, and Puerto Rico into data.table format. It also has some tools for the Island Areas (VI, GU, MP, AS).

It can retain a few key variables like

  • lat and lon of block internal point (similar to a centroid)
  • FIPS codes
  • population count or weight
  • area (or effective radius)
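
As a rough illustration of the last item, an "effective radius" can be read as the radius of a circle with the same area as the block. The sketch below assumes that definition and uses made-up FIPS codes and areas in square meters; the column names are hypothetical and may not match what the package produces.

```r
# Hypothetical blocks with made-up FIPS codes and areas in square meters.
blocks_demo <- data.frame(
  blockfips = c("100010401001000", "100010401001001"),
  area_sqm  = c(250000, 1000000)
)

# Assumed definition: radius of the circle whose area equals the block area.
blocks_demo$effective_radius_m <- sqrt(blocks_demo$area_sqm / pi)

blocks_demo
```

Storing a radius rather than an area can be convenient when comparing block size against a distance, since both are then in length units.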

Installation

# install.packages("remotes")
remotes::install_github("ejanalysis/census2020download")

Quick start

library(census2020download)

# Download / unzip / read / clean a couple of small states' block data.
# (Downloads files from the Census Bureau, so it needs internet access.)
blocks <- census2020_get_data(c("DE", "DC"), cols_to_keep = "all")
dim(blocks)
head(blocks)

# Race/ethnicity subgroup counts that add up to the total block population:
groups <- c("hisp", "nhwa", "nhba", "nhaiana", "nhaa",
            "nhnhpia", "nhotheralone", "nhmulti")
all.equal(blocks$pop, rowSums(blocks[, ..groups]))

# Split the data into the individual data.tables used by EJAM
# (blockwts, blockpoints, blockid2fips, bgid2fips, quaddata):
tables <- census2020_save_datasets(blocks)
names(tables)

# Island Areas (VI, GU, MP, AS) come at blockgroup (not block) resolution:
islands <- census2020_get_data_islandareas()

By default census2020_get_data() downloads to a temporary folder; pass folder = to keep the files, and overwrite = FALSE to reuse files already downloaded there.
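
A minimal sketch of that pattern follows. The folder and overwrite arguments are the ones described above; the cache location chosen with tools::R_user_dir() is only an illustrative assumption (any writable folder would do), and the download step runs only if the package is installed and internet access is available.

```r
# Illustrative location for the downloaded files (an assumption, not a
# package default).
data_dir <- tools::R_user_dir("census2020download", which = "cache")
dir.create(data_dir, recursive = TRUE, showWarnings = FALSE)

if (requireNamespace("census2020download", quietly = TRUE)) {
  # The first call downloads into data_dir. Later calls with
  # overwrite = FALSE reuse the files already downloaded there.
  blocks_de <- census2020download::census2020_get_data(
    "DE", folder = data_dir, overwrite = FALSE
  )
}
```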

For more information see

Key data.table objects created:

  • blockid2fips - data.table with FIPS code to blockid lookup
  • blockpoints - data.table with latitude and longitude of internal points
  • quaddata - data.table with xyz format locations of blocks, used to create the spatial index of blocks in the EJAM package
  • blockwts - data.table with Census 2020 population-based weight (block population as a fraction of parent block group population) and size of block (area or effective radius)
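
How these tables relate to each other can be sketched on toy data. This is not the package's own code: the FIPS codes, coordinates, and populations are made up, and the column names (blockfips, bgfips, blockwt, x, y, z) and the unit-sphere conversion are assumptions for illustration. The one fact it relies on is that a 15-digit block FIPS begins with the 12-digit FIPS of its parent block group.

```r
library(data.table)

# Toy blocks with made-up FIPS codes, coordinates, and populations.
toy <- data.table(
  blockfips = c("100010401001000", "100010401001001", "100010401002000"),
  lat = c(39.16, 39.17, 39.18),
  lon = c(-75.52, -75.53, -75.54),
  pop = c(30L, 70L, 50L)
)

# Integer block id plus a lookup back to FIPS (analogous to blockid2fips).
toy[, blockid := .I]
id2fips <- toy[, .(blockid, blockfips)]

# Internal point coordinates by block id (analogous to blockpoints).
pts <- toy[, .(blockid, lat, lon)]

# The first 12 digits of a block FIPS identify its parent block group.
toy[, bgfips := substr(blockfips, 1, 12)]

# Population weight: block share of parent block group population
# (analogous to blockwts).
toy[, blockwt := pop / sum(pop), by = bgfips]
wts <- toy[, .(blockid, bgfips, blockwt)]

# Convert lat/lon in degrees to x, y, z on a unit sphere
# (an assumed form of the xyz format, analogous to quaddata).
toy[, `:=`(
  x = cos(lat * pi / 180) * cos(lon * pi / 180),
  y = cos(lat * pi / 180) * sin(lon * pi / 180),
  z = sin(lat * pi / 180)
)]
xyz <- toy[, .(blockid, x, y, z)]
```

In this sketch a block group with zero total population would produce NaN weights (0/0), so real data would need a rule for that case.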

Acknowledgements

Claude (Anthropic’s AI assistant) was used extensively in refactoring and updating this package.