Introducing censusr
Here at Transport Foundry we regularly use data from the U.S. Census Bureau to validate our input data and our simulation engine outputs. To make this task easier, we wrote an R package that downloads Census Bureau data directly into a user's R environment. We have published this package, censusr, on CRAN under an open-source license, and we hope others can use it to streamline their analyses. Contributions to the source code are welcome on GitHub.
Use
The package works by sending a list of requested variables and a list of
geographies to the Census Bureau API. The call below requests the number of
households owning 0, 1, 2, 3, or 4 or more vehicles in Wake County, North
Carolina (geoid = 37183). We specify that we want this table from the 2012
5-year ACS estimates.
library(censusr)
call_census_api(
  paste0("B08201_", sprintf("%03d", 2:6), "E"),
  names = paste0("est_", c(0:4)),
  geoids = "37183",
  data_source = "acs", year = 2012, period = 5)
## Source: local data frame [1 x 6]
##
## geoid est_0 est_1 est_2 est_3 est_4
## (chr) (dbl) (dbl) (dbl) (dbl) (dbl)
## 1 37183 15813 111992 149742 47222 16534
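The variable codes in the call above are assembled with base R string functions. As a minimal sketch of what that expression evaluates to (the object names `suffixes`, `variables`, and `col_names` are ours, chosen for illustration, and are not part of censusr):

```r
# Zero-pad the line numbers 2 through 6 to three digits: "002", ..., "006".
suffixes <- sprintf("%03d", 2:6)

# Prepend the table ID and append "E" (estimate) to form each variable code.
variables <- paste0("B08201_", suffixes, "E")

# Matching column names, one per vehicle-ownership category (0 through 4+).
col_names <- paste0("est_", 0:4)

print(variables)
print(col_names)
```

Building the codes this way keeps the variable vector and the names vector the same length, so each requested variable lines up with one output column.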
We can use the allgeos argument to request these variables for all census
tracts within Wake County.
call_census_api(
  paste0("B08201_", sprintf("%03d", 2:6), "E"),
  names = paste0("est_", c(0:4)),
  geoids = "37183", allgeos = "tr",
  data_source = "acs", year = 2012, period = 5)
## Source: local data frame [187 x 6]
##
## geoid est_0 est_1 est_2 est_3 est_4
## (chr) (dbl) (dbl) (dbl) (dbl) (dbl)
## 1 37183050100 248 516 310 37 0
## 2 37183050300 293 826 489 51 19
## 3 37183050400 44 369 328 23 9
## 4 37183050500 181 885 436 87 30
## 5 37183050600 289 600 209 69 19
## 6 37183050700 503 584 218 118 0
## 7 37183050800 359 227 162 74 0
## 8 37183050900 442 249 80 3 0
## 9 37183051000 202 543 329 68 28
## 10 37183051101 149 201 208 54 64
## .. ... ... ... ... ... ...
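The tract geoids in the table above extend the county geoid: a census tract geoid is 11 characters long, made up of a 2-digit state code, a 3-digit county code, and a 6-digit tract code. A small sketch of splitting one apart with base R (the object names are illustrative and not part of censusr):

```r
# One tract geoid taken from the first row of the table above.
tract_geoid <- "37183050100"

# Positions 1-2: state FIPS code (37 = North Carolina).
state_fips <- substr(tract_geoid, 1, 2)

# Positions 3-5: county FIPS code within the state (183 = Wake County).
county_fips <- substr(tract_geoid, 3, 5)

# Positions 6-11: tract code within the county.
tract_code <- substr(tract_geoid, 6, 11)

# The first five characters recover the county geoid passed to geoids.
county_geoid <- substr(tract_geoid, 1, 5)

print(c(state_fips, county_fips, tract_code, county_geoid))
```

Keeping geoids as character strings, as the returned table does, preserves leading zeros in codes such as 050100.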
If we want the margins of error for this table instead of the estimates, we can
change the variable codes to request the M type instead of the E type.
call_census_api(
  paste0("B08201_", sprintf("%03d", 2:6), "M"),
  names = paste0("moe_", c(0:4)),
  geoids = "37183", allgeos = "tr",
  data_source = "acs", year = 2012, period = 5)
## Source: local data frame [187 x 6]
##
## geoid moe_0 moe_1 moe_2 moe_3 moe_4
## (chr) (dbl) (dbl) (dbl) (dbl) (dbl)
## 1 37183050100 87 169 81 52 13
## 2 37183050300 101 163 106 53 22
## 3 37183050400 25 80 62 21 13
## 4 37183050500 75 143 98 54 29
## 5 37183050600 95 112 76 47 14
## 6 37183050700 109 97 82 61 13
## 7 37183050800 94 85 73 58 13
## 8 37183050900 91 78 43 6 13
## 9 37183051000 96 117 99 61 44
## 10 37183051101 78 75 80 83 79
## .. ... ... ... ... ... ...
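Estimates and margins of error are most useful side by side. The sketch below is one way to combine them, using base R and the first three tracts from the two tables above, typed in by hand so that it runs without an API call. The data frame and column names (`estimates`, `margins`, `lower_0`, `upper_0`) are our own for illustration. ACS margins of error are published at the 90 percent confidence level, so the estimate plus or minus the margin gives an approximate 90 percent interval.

```r
# Zero-vehicle household estimates for three tracts (copied from the output above).
estimates <- data.frame(
  geoid = c("37183050100", "37183050300", "37183050400"),
  est_0 = c(248, 293, 44),
  stringsAsFactors = FALSE
)

# Matching margins of error for the same tracts.
margins <- data.frame(
  geoid = c("37183050100", "37183050300", "37183050400"),
  moe_0 = c(87, 101, 25),
  stringsAsFactors = FALSE
)

# Join the two tables on the shared geoid column.
combined <- merge(estimates, margins, by = "geoid")

# Approximate 90 percent interval; household counts cannot fall below zero.
combined$lower_0 <- pmax(combined$est_0 - combined$moe_0, 0)
combined$upper_0 <- combined$est_0 + combined$moe_0

print(combined)
```

With real results, the same merge would apply to the two data frames returned by the calls above, assuming both were requested for the same geographies.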
For a list of variable codes, see the U.S. Census Bureau API page. For a tutorial on how to set up the censusr package with an API key, see the package vignette.