I've noticed a possible issue with the wage data in microdata pulled through tidycensus.
Long story short, I have been working with microdata using the ipumsr and tidycensus packages, and while I can get close to similar results, it looks like some of the variables from tidycensus are rounded, specifically "WAGP" and "PERNP", while their equivalents from IPUMS ("INCWAGE" and "INCEARN") are not.
Is this a bug in tidycensus, or is it user error on my part?
My code is below.
### IPUMSR segment
library(ipumsr)
library(dplyr)  # for %>% and filter()

ipums_extract_test <- define_extract_micro(
  collection = "usa",
  description = "USA extract for API vignette",
  samples = c("us2022c"),
  variables = c("AGE", "STATEFIP", "EMPSTAT", "INCWAGE", "INCEARN", "us2022c_schl")
)

ipums_data <- ipums_extract_test %>%
  submit_extract() %>%
  wait_for_extract() %>%
  download_extract() %>%
  read_ipums_micro()

# Michigan (FIPS 26), age 16+, employed, schooling codes 1-21
ipums_test <- ipums_data %>%
  filter(STATEFIP == 26 & AGE >= 16 & EMPSTAT == 1 & US2022C_SCHL %in% 1:21)
### tidycensus segment
library(tidycensus)

# Michigan, age 16+, employed (ESR 1, 2, 4, 5), schooling codes 1-21
tidy_test <- get_pums(
  year = 2022,
  survey = "acs5",
  state = "MI",
  variables = c("AGEP", "ESR", "WAGP", "PERNP", "SCHL")
) %>%
  filter(AGEP >= 16 & (ESR == 1 | ESR == 2 | ESR == 4 | ESR == 5) & SCHL %in% 1:21)
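One way to make the "rounded" observation concrete is to compare how often the values are exact multiples of 100 in each data set. This is only a minimal sketch: `share_multiple_of` is a hypothetical helper and not a function from tidycensus or ipumsr, and the toy vectors are made-up values used to show how it behaves.

```r
# Hypothetical helper (not part of tidycensus or ipumsr): share of positive,
# non-missing values that are exact multiples of `base`.
share_multiple_of <- function(x, base) {
  x <- as.numeric(x)
  x <- x[!is.na(x) & x > 0]
  if (length(x) == 0) return(NA_real_)
  mean(x %% base == 0)
}

# Made-up toy values, only to show the helper's behaviour.
toy_rounded   <- c(12000, 45000, 300, 0, NA, 7800)
toy_unrounded <- c(12345, 45000, 301, 0, NA, 7812)

share_multiple_of(toy_rounded, 100)
share_multiple_of(toy_unrounded, 100)

# With the objects from the code above, the comparison would be
# (assuming the wage columns convert cleanly with as.numeric()):
# share_multiple_of(tidy_test$WAGP, 100)
# share_multiple_of(ipums_test$INCWAGE, 100)
```

A share close to 1 in one data set and a much lower share in the other would be consistent with one source rounding the values and the other not.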
Thank you.