Enumeration Disparities Caused By The Suspension Of US Census Bureau Field Operations During COVID-19

By Benjamin Livingston, NewsCounts

June 1, 2020

Background

The suspension of US Census Bureau field operations due to the COVID-19 pandemic has created vast disparities in census counts between rural and non-rural areas.

The majority of areas in the United States receive mailed invitations to fill out the census online, by phone, or by mail - a process that requires no visit from a census worker (unless a household fails to respond and requires a Non-Response Follow-Up). This makes it easy for these "Self-Response" areas to participate in the census relatively normally, even with visits from census workers suspended.

However, areas without reliable mail service rely on initial contact through door-to-door visits from census workers, who either enumerate households themselves (referred to as "Update Enumerate (UE)" or "Remote Alaska (RA)"), or leave a packet containing a form and invitation so that households can respond on their own (referred to as "Update Leave (UL)").

Until this visit happens, households in these areas will not receive a direct invitation to participate in the census.

However, households in these rural areas can usually still fill out the census online, even before they receive an invitation. The issue we are analyzing here is the lack of an invitation to participate, not an inability to respond.

A detailed map of these areas can be found here.

Why This Story Matters: The Threat To Rural Areas

Inevitably, the areas that rely on the three kinds of in-person initial contact rather than mail invitations (and the rural states that contain many of them) have fallen far behind in census enumeration due to the pandemic, and there is no telling how much these disparities will narrow when in-person enumeration begins.

Should these disparities linger even after in-person operations begin, they could leave these areas with very low self-response rates, which would hurt their ability to be properly counted.

As in-person operations resume, these rural areas will face added pressure to actively participate in the census, given how far behind they have fallen (through no fault of their own). This could have a lasting impact on these areas' federal and state funding and resources, congressional apportionment, and general representation.

This analysis can help any journalist or researcher display just how wide this gap has become, and how much more important proactive census participation has become for these areas.

Analysis

We will now examine how wide these disparities are by comparing the response rates for tracts that rely heavily on in-person census operations with response rates for tracts that have largely received invitations by mail.

First we will import the data, and then we will visualize the trends.
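The per-state requests in the cell below each need a two-digit, zero-padded state FIPS code. As a minimal sketch of that step in isolation, the helper here builds one request URL using only the standard library. The function name `tract_url` and the placeholder key are hypothetical; the endpoint and query fields are the ones used in this notebook.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.census.gov/data/2020/dec/responserate"


def tract_url(state_fips, key):
    """Build the tract-level response-rate request for one state."""
    params = {
        "get": "GEO_ID,CRRALL",
        "key": key,
        "for": "tract:*",
        # FIPS codes are two digits, so 2 must be sent as "02"
        "in": f"state:{int(state_fips):02d}",
    }
    # leave ':', '*' and ',' unescaped so the URL matches the hand-written form
    return f"{BASE_URL}?{urlencode(params, safe=':*,')}"


print(tract_url(2, "YOUR_KEY"))
```

Formatting the code with `:02d` removes the need for separate branches for one-digit and two-digit states.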

In [1]:
# import packages and establish settings
import pandas as pd
import numpy as np
import requests
import json
import matplotlib.pyplot as plt
import warnings
from matplotlib.ticker import MaxNLocator
from IPython.display import HTML, display
import statsmodels.api as sm
from patsy import dmatrix
import math as m
%matplotlib inline
warnings.filterwarnings('ignore')

# set API key 
key = '2988f01f5e86175bda8beae2b5035e1ccef2d052'

# tested shortcut for obtaining state FIPS codes
url = f"https://api.census.gov/data/2010/dec/responserate?get=GEO_ID,FSRR2010&key={key}&for=state:*"
JSONContent = requests.get(url).json()
states = pd.DataFrame(JSONContent)
states = states.iloc[1:,2]
states = [int(i) for i in states if i !='72']
states = sorted(states)

# data frame to hold tract responses
tract_responses = pd.DataFrame(columns=['GEO_ID','CRRALL'])

# pull tract response data for 2020
for i in states:
    # state FIPS codes are two digits, so zero-pad single-digit codes (e.g. 2 -> "02")
    url = f"https://api.census.gov/data/2020/dec/responserate?get=GEO_ID,CRRALL&key={key}&for=tract:*&in=state:"\
    + str(i).zfill(2)
    try:
        JSONContent = requests.get(url).json()
        temp = pd.DataFrame(JSONContent)
        temp.columns = temp.iloc[0]
        temp = temp.iloc[1:,:]
        tract_responses = pd.concat([tract_responses,temp],sort=True)
    except ValueError:
        # skip states whose response is not valid JSON (JSONDecodeError is a subclass of ValueError)
        pass

# set index and column title for 2020 response rates
tract_responses['CRRALL'] = tract_responses['CRRALL'].astype('float')
tract_responses.index = tract_responses.GEO_ID.str.replace('1400000US','')
tract_responses = tract_responses.drop(columns = 'GEO_ID')
tract_responses.columns = ['response','county','state','tract']

# pull type of enumeration data
tea = pd.read_excel('https://www2.census.gov/geo/maps/DC2020/TEA/TEA_PCT_Housing_Tract.xlsx')
tea.index = tea.TRACT_GEOID
tea.index.name = 'GEO_ID'
tea.index = tea.index.astype('str').str.zfill(11)
tea['inperson'] = tea.PCT_HU_TEA2 + tea.PCT_HU_TEA4 + tea.PCT_HU_TEA6
tea = tea.drop(columns=['TRACT_GEOID','PCT_HU_TEA2','PCT_HU_TEA3','PCT_HU_TEA4','PCT_HU_TEA6'])
tea.columns=['mail','inperson']

# remove type of enumeration entries with no data
tea = tea[np.sum(tea,axis=1) > 99.5]

# count percentage of tracts in our response dataset for which we have enumeration data
temp = pd.merge(tract_responses,tea,how='left',on='GEO_ID')
print(np.round(100 * np.sum(pd.notnull(temp.mail)) / temp.shape[0],1),'% of tracts with response rate data \
also have enumeration strategy logged and will be a part of this analysis',sep='')

# create dataset and sort by inperson
data = pd.merge(tract_responses,tea,how='inner',on='GEO_ID')
data = data.sort_values('inperson')
73.3% of tracts with response rate data also have enumeration strategy logged and will be a part of this analysis

With the data imported, we see that approximately three-quarters of the tracts with response rate data also have an enumeration strategy logged. These are the tracts we use for our analysis; the remaining quarter or so are excluded.
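To make the two filtering steps concrete, here is a small standard-library sketch on made-up numbers. It mirrors the `inperson` sum and the `> 99.5` filter from the cell above, then computes the share of tracts that survive the merge. All tract IDs (state code 99 is deliberately not a real state), key names, and values are hypothetical.

```python
# Hypothetical housing-unit percentages by type of enumeration, per tract.
tea_rows = {
    "99001000100": {"self_response": 100.0, "update_enumerate": 0.0,
                    "remote_alaska": 0.0, "update_leave": 0.0},
    "99001000200": {"self_response": 25.0, "update_enumerate": 0.0,
                    "remote_alaska": 0.0, "update_leave": 75.0},
    "99001000300": {"self_response": 0.0, "update_enumerate": 10.0,
                    "remote_alaska": 90.0, "update_leave": 0.0},
    "99001000400": {"self_response": 0.0, "update_enumerate": 0.0,
                    "remote_alaska": 0.0, "update_leave": 0.0},  # no data logged
}

# Hypothetical response rates; the last tract has no enumeration row at all.
responses = {
    "99001000100": 61.0,
    "99001000200": 20.0,
    "99001000300": 12.0,
    "99001000400": 40.0,
    "99001000500": 55.0,
}


def enumeration_shares(row):
    """Collapse the three in-person strategies into one 'inperson' share."""
    inperson = row["update_enumerate"] + row["remote_alaska"] + row["update_leave"]
    return {"mail": row["self_response"], "inperson": inperson}


# keep only tracts whose shares add up to (roughly) 100%
tea = {}
for geoid, row in tea_rows.items():
    shares = enumeration_shares(row)
    if shares["mail"] + shares["inperson"] > 99.5:
        tea[geoid] = shares

# share of tracts with a response rate that also have an enumeration strategy
matched = sorted(set(responses) & set(tea))
coverage = 100 * len(matched) / len(responses)
print(f"{coverage:.1f}% of tracts with response rate data also have an enumeration strategy")
```

In this toy data, one tract is dropped because its shares sum to zero and another because it has no enumeration row, which are the two ways a tract can fall out of the analysis.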

We start by creating a scatterplot that shows how the percentage of a tract relying on in-person census operations for initial contact relates to response rates, and fitting cubic splines.

Clearly, tracts that rely more heavily on in-person enumeration tend to have vastly lower response rates than areas that mostly received mail invitations.
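For readers who want a quick numeric check without fitting a spline, a cruder alternative is to average response rates within bands of in-person share. The sketch below is not the spline fit used in this notebook; it is a simpler stand-in, and the function name `binned_means` and the sample points are hypothetical.

```python
from statistics import mean


def binned_means(points, width=25):
    """Average response rate within bands of in-person share.

    points: iterable of (inperson_pct, response_pct) pairs.
    width:  band width in percentage points (assumed to divide 100).
    """
    bins = {}
    for inperson, response in points:
        # clamp 100% into the top band so the bands are [0,25), ..., [75,100]
        lower = min(int(inperson // width) * width, 100 - width)
        bins.setdefault(lower, []).append(response)
    return {lower: round(mean(vals), 1) for lower, vals in sorted(bins.items())}


# made-up (in-person %, response rate %) pairs
points = [(0, 62.0), (0, 58.0), (10, 55.0), (40, 41.0),
          (60, 30.0), (100, 12.0), (100, 8.0)]
print(binned_means(points))
```

Each key in the result is the lower edge of a band, so a steadily falling sequence of values corresponds to the downward-sloping curve in the scatterplot.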

In [2]:
# make scatterplot of data and add curves
plt.figure(figsize=(15,8))
uniq = np.linspace(0,100,1001)
new_x = dmatrix("bs(train, df=df, degree=3, include_intercept=True)", {"train": data.inperson,\
        "df":np.max([m.ceil(1/(1-np.mean(data.inperson<1))),4])},return_type='dataframe')
model = sm.GLM(data.response, new_x)
results = model.fit()
plt.plot(data.inperson,results.predict(new_x),c='orange',linewidth=5)
plt.scatter(data.inperson,data.response,alpha=0.7,label='Areas Relying Heavily On In-Person Operations')
plt.title('Census Response Rates Based On Enumeration Strategies',size=20)
plt.xlim(0,100)
plt.ylim(0,100)
plt.xlabel('Amount of Tract Relying On In-Person Census Operations (%)',size=15)
plt.ylabel('Response Rate For Tract (%)',size=15)
plt.show()

Next, we will compare the response rates for two types of tracts:

  • Those that do not rely on in-person census operations for initial contact at all
  • Those that are entirely dependent on in-person census operations for initial contact

In [3]:
# make histograms of data
all_mail = data[data.inperson == 0]
all_inperson = data[data.mail == 0]
fig,ax = plt.subplots(2,1,figsize=(15,9),sharex=True)
plt.xlabel('Response Rate (%)',size=16)
plt.subplots_adjust(top=0.87)
plt.suptitle('Census Response Rate Distribution Based On Enumeration Strategy',size=25)
mai = ax[0]
inp = ax[1]
mai.yaxis.set_major_locator(MaxNLocator(integer=True))
inp.yaxis.set_major_locator(MaxNLocator(integer=True))
mai.set_ylabel('Number of Tracts',size=12)
inp.set_ylabel('Number of Tracts',size=12)
mai.hist(all_mail.response,bins='auto')
inp.hist(all_inperson.response,bins='auto')
mai.set_title('Tracts Not Relying On In-Person Census Operations For Initial Contact',size=18)
inp.set_title('Tracts Relying Exclusively On In-Person Census Operations For Initial Contact',size=18)
plt.show()

Comparing these two types of tracts side by side, the gap in response rates is stark.

These disparities should narrow to some degree once in-person operations ramp up again, but there is no telling by how much.
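One way to put a single number on the gap between the two histograms is the difference in median response rates. The sketch below applies the same `inperson == 0` and `mail == 0` selections to a handful of made-up tracts; the function name `median_gap` and all values are hypothetical.

```python
from statistics import median


def median_gap(tracts):
    """Median response rate of mail-only tracts minus that of in-person-only tracts."""
    all_mail = [t["response"] for t in tracts if t["inperson"] == 0]
    all_inperson = [t["response"] for t in tracts if t["mail"] == 0]
    return median(all_mail) - median(all_inperson)


# made-up tracts: three mail-only, two in-person-only, one mixed (ignored)
tracts = [
    {"mail": 100.0, "inperson": 0.0, "response": 63.0},
    {"mail": 100.0, "inperson": 0.0, "response": 58.0},
    {"mail": 100.0, "inperson": 0.0, "response": 70.0},
    {"mail": 0.0, "inperson": 100.0, "response": 10.0},
    {"mail": 0.0, "inperson": 100.0, "response": 18.0},
    {"mail": 50.0, "inperson": 50.0, "response": 40.0},
]
gap = median_gap(tracts)
print(f"Median gap: {gap:.1f} percentage points")
```

The median is used here rather than the mean because it is less sensitive to a few unusually high or low tracts; that choice is an assumption of this sketch, not something the analysis above prescribes.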

State Case Studies

We will now conduct this same analysis for each of the fifty states, plus the District of Columbia.

Some of these states rely heavily on in-person enumeration (and thus will provide robust data), while some barely depend on it. The former category of state is more interesting to us, but we will show all of the states in alphabetical order for the sake of being thorough.

For each state, we will show these same graphs, and provide a list of any tracts in the state that rely entirely on in-person enumeration. I recommend CUNY's Hard to Count map for visualizing and analyzing individual tracts.
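The tract identifiers listed for each state are 11-digit GEOIDs: a two-digit state code, a three-digit county code, and a six-digit tract code. When looking up an individual tract on a map, it can help to split the identifier into those parts. The helper below is a small sketch of that; the names `TractId` and `parse_tract_geoid` are hypothetical.

```python
from typing import NamedTuple


class TractId(NamedTuple):
    state: str   # 2-digit state FIPS code
    county: str  # 3-digit county FIPS code
    tract: str   # 6-digit tract code


def parse_tract_geoid(geo_id):
    """Split a tract GEOID (with or without the '1400000US' prefix) into parts."""
    # restore a leading zero if the ID was read in as an integer
    digits = str(geo_id).replace("1400000US", "").zfill(11)
    if len(digits) != 11 or not digits.isdigit():
        raise ValueError(f"not a tract GEOID: {geo_id!r}")
    return TractId(digits[:2], digits[2:5], digits[5:])


print(parse_tract_geoid("1400000US02013000100"))
```

The same function accepts the bare 11-digit form printed in the lists below, so an ID can be pasted in directly.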

Keep in mind - we can easily do this analysis for a smaller region within a state or a group of states almost instantly - email me any time at benjamin.livingston@columbia.edu if you'd like to see these results for your region, too.

NewsCounts provides countless data & research resources that make telling these stories easy - just drop us a line and we'll be happy to help.

Note: be sure not to miss our findings and additional information at the end of this guide.

In [4]:
# get names of areas to aid lookup in helper function below
names = pd.read_excel('https://www2.census.gov/programs-surveys/popest/geographies/2017/state-geocodes-v2017.xlsx',\
          skiprows=range(5),index_col=2).iloc[:,2]
names.name = 'name'

# add hyperlinks to each state
html = '<h2>Shortcut to state:</h2><p>'
links = sorted(names[names.index!=0])
for link in links:
    html += "<a href='#" + link + "'>" + link + "<br/></a>" 
html += "</p><h3><a href='#Conclusion'>Skip to Conclusion / Additional Information" + "</a></h3>"
display(HTML(html))

# helper function to plot individual areas
def show_area(fips):
    
    # convert to string
    area = str(fips)
    if len(area) == 1:
        area = area.zfill(2)
        
    # get name of area
    name = names[fips]
    
    # show name of area and add anchor
    display(HTML("<center><h1 id='" + name + "'>" + name +"</h1></center>"))
    
    # trim dataset to include only area in question and sort
    new = data[data.index.str.startswith(area)]
    new = new.sort_values('inperson')
    
    # if none of area relies on in-person enumeration, note and return
    if sum(new.inperson) == 0:
        display(HTML('<h4>' + name + ' does not rely on in-person enumeration</h4>'))
        return
    
    # generate separate datasets for areas with all in-person and no in-person enumeration  
    all_mail = new[new.inperson == 0]
    all_inperson = new[new.mail == 0]
    
    # generate list of tracts that rely completely on in-person enumeration
    tracts = sorted(all_inperson.index)
    
    # make scatterplot of data, and add curves
    plt.figure(figsize=(15,8))
    uniq = np.linspace(0,100,1001)
    new_x = dmatrix("bs(train, df=df, degree=3, include_intercept=True)", {"train": new.inperson,\
            "df":np.max([m.ceil(1/(1-np.mean(new.inperson<1))),4])},return_type='dataframe')
    model = sm.GLM(new.response, new_x)
    results = model.fit()
    plt.plot(new.inperson,results.predict(new_x),c='orange',linewidth=5)
    plt.scatter(new.inperson,new.response,alpha=0.7,label='Areas Relying Heavily On In-Person Operations')
    plt.title('Census Response Rates Based On Enumeration Strategies In ' + name,size=20)
    plt.xlim(0,100)
    plt.ylim(0,100)
    plt.xlabel('Amount of Tract Relying On In-Person Census Operations (%)',size=15)
    plt.ylabel('Response Rate For Tract (%)',size=15)
    plt.show()
    
    if len(tracts) != 0:
        # make histogram of data
        fig,ax = plt.subplots(2,1,figsize=(15,9),sharex=True)
        plt.xlabel('Response Rate (%)',size=16)
        plt.subplots_adjust(top=0.87)
        plt.suptitle('Census Response Rate Distribution Based On Enumeration Strategy In ' + name,size=20)
        mai = ax[0]
        inp = ax[1]
        mai.yaxis.set_major_locator(MaxNLocator(integer=True))
        inp.yaxis.set_major_locator(MaxNLocator(integer=True))
        mai.set_ylabel('Number of Tracts',size=12)
        inp.set_ylabel('Number of Tracts',size=12)
        mai.hist(all_mail.response,bins='auto')
        inp.hist(all_inperson.response,bins='auto')
        mai.set_title('Tracts Not Relying On In-Person Census Operations For Initial Contact',size=18)
        inp.set_title('Tracts Relying Exclusively On In-Person Census Operations For Initial Contact',size=18)
        plt.show()

    # print total number of tracts examined
    display(HTML('<h4>Total number of tracts examined in ' + name + ': ' + str(new.shape[0]) + '</h4>'))
    
    # print FIPS codes of tracts that rely on in-person operations
    if len(tracts) == 0:
        display(HTML('<h4>No tracts in ' + name + ' rely entirely on in-person census operations for initial contact</h4>'))
    else:
        display(HTML('<h4>Tracts in ' + name + \
                ' that are known to rely entirely on in-person census operations for initial contact:</h4>'))
        for tract in tracts:
            print(tract)
    
    # add links and space
    display(HTML("<h4><a href='#State-Case-Studies'> Back to top</a>"+"<br/>" + \
                       "<a href='#Conclusion'>Skip to Conclusion / Additional Information" + "</a></h4>"))
    print('\n')
    
# iterate through all states to display data
states_iter = names[names.index != 0].sort_values()
for place in states_iter.index:
    show_area(place)

Alabama

Total number of tracts examined in Alabama: 930

No tracts in Alabama rely entirely on in-person census operations for initial contact


Alaska

Total number of tracts examined in Alaska: 130

Tracts in Alaska that are known to rely entirely on in-person census operations for initial contact:

02013000100
02016000200
02020002900
02050000200
02070000200
02122000300
02122001200
02122001300
02170000501
02170000502
02170001300
02180000200
02185000100
02188000200
02195000200
02198000100
02198000200
02198940100
02220000100
02240000400
02275000300
02282000100

Arizona

Total number of tracts examined in Arizona: 1296

Tracts in Arizona that are known to rely entirely on in-person census operations for initial contact:

04001942600
04001942700
04001944000
04001944100
04001944201
04001944202
04001944901
04001944902
04001945001
04001945002
04001945100
04005942201
04005942202
04005944900
04005945000
04005945100
04005945200
04007940200
04007940400
04009940500
04012940200
04012940300
04013940700
04013941000
04013941100
04015940400
04017940008
04017940010
04017940011
04017940012
04017940013
04017940014
04017940015
04017940100
04017940301
04017940302
04017942300
04017942400
04017942500
04017963700
04017964201
04019940800
04019940900
04021941200
04025001300
04025002100
04027011403
04027011405

Arkansas

Total number of tracts examined in Arkansas: 546

No tracts in Arkansas rely entirely on in-person census operations for initial contact


California

Total number of tracts examined in California: 6813

Tracts in California that are known to rely entirely on in-person census operations for initial contact:

06007001800
06007001900
06007002000
06023940000
06037599000
06037599100
06041113000
06041122000
06059099506
06061020105
06061020106
06061020107
06061022100
06071009201
06071010803
06071010804
06071011203
06071011204
06071011205
06071011206
06071940100
06083980100

Colorado

Total number of tracts examined in Colorado: 1047

Tracts in Colorado that are known to rely entirely on in-person census operations for initial contact:

08007940400
08037000501
08037000502
08037000600
08067940300
08067940400
08107000200
08113968101

Connecticut

Total number of tracts examined in Connecticut: 770

No tracts in Connecticut rely entirely on in-person census operations for initial contact


Delaware

Delaware does not rely on in-person enumeration

District of Columbia

District of Columbia does not rely on in-person enumeration

Florida

Total number of tracts examined in Florida: 3266

Tracts in Florida that are known to rely entirely on in-person census operations for initial contact:

12005000600
12005000700

Georgia

Total number of tracts examined in Georgia: 1243

No tracts in Georgia rely entirely on in-person census operations for initial contact


Hawaii

Total number of tracts examined in Hawaii: 193

Tracts in Hawaii that are known to rely entirely on in-person census operations for initial contact:

15001021003
15001021011
15001022102
15005031900
15007040104
15007040800
15007040900
15009031700
15009031801

Idaho

Total number of tracts examined in Idaho: 194

Tracts in Idaho that are known to rely entirely on in-person census operations for initial contact:

16005940000
16011940000

Illinois

Total number of tracts examined in Illinois: 2967

No tracts in Illinois rely entirely on in-person census operations for initial contact


Indiana

Total number of tracts examined in Indiana: 1312

No tracts in Indiana rely entirely on in-person census operations for initial contact


Iowa

Total number of tracts examined in Iowa: 764

No tracts in Iowa rely entirely on in-person census operations for initial contact


Kansas

Total number of tracts examined in Kansas: 662

No tracts in Kansas rely entirely on in-person census operations for initial contact


Kentucky

Total number of tracts examined in Kentucky: 911

Tracts in Kentucky that are known to rely entirely on in-person census operations for initial contact:

21095970600
21121930602
21131920200
21131920300
21189930100
21189930200

Louisiana

Total number of tracts examined in Louisiana: 892

Tracts in Louisiana that are known to rely entirely on in-person census operations for initial contact:

22071004800

Maine

Total number of tracts examined in Maine: 306

No tracts in Maine rely entirely on in-person census operations for initial contact


Maryland

Maryland does not rely on in-person enumeration

Massachusetts

Total number of tracts examined in Massachusetts: 1314

No tracts in Massachusetts rely entirely on in-person census operations for initial contact


Michigan

Total number of tracts examined in Michigan: 2446

Tracts in Michigan that are known to rely entirely on in-person census operations for initial contact:

26099982100

Minnesota

Total number of tracts examined in Minnesota: 1170

Tracts in Minnesota that are known to rely entirely on in-person census operations for initial contact:

27021940002
27061940000
27087940100
27087940300

Mississippi

Total number of tracts examined in Mississippi: 452

No tracts in Mississippi rely entirely on in-person census operations for initial contact


Missouri

Total number of tracts examined in Missouri: 1133

No tracts in Missouri rely entirely on in-person census operations for initial contact


Montana

Total number of tracts examined in Montana: 222

Tracts in Montana that are known to rely entirely on in-person census operations for initial contact:

30003940600
30003940700
30005940100
30005940200
30035940200
30035940400
30035980000
30085940001
30085940002
30087940400

Nebraska

Total number of tracts examined in Nebraska: 511

Tracts in Nebraska that are known to rely entirely on in-person census operations for initial contact:

31173940100

Nevada

Total number of tracts examined in Nevada: 595

Tracts in Nevada that are known to rely entirely on in-person census operations for initial contact:

32003005612
32005001700
32005001800
32009950100
32031940200

New Hampshire

Total number of tracts examined in New Hampshire: 233

No tracts in New Hampshire rely entirely on in-person census operations for initial contact


New Jersey

Total number of tracts examined in New Jersey: 1823

Tracts in New Jersey that are known to rely entirely on in-person census operations for initial contact:

34005702101
34025809903
34029729000

New Mexico

Total number of tracts examined in New Mexico: 405

Tracts in New Mexico that are known to rely entirely on in-person census operations for initial contact:

35001940700
35006941500
35006946100
35025000800
35031940500
35031943700
35031943800
35035940000
35039000100
35039940700
35039940800
35039944100
35043940500
35043940600
35043940700
35045942801
35045942802
35045942803
35045942900
35045943000
35045943100
35049940300
35049940500
35049940600
35049940900
35055940000
35061940300

New York

Total number of tracts examined in New York: 4317

Tracts in New York that are known to rely entirely on in-person census operations for initial contact:

36033940000
36067940000
36103180300

North Carolina

Total number of tracts examined in North Carolina: 1712

Tracts in North Carolina that are known to rely entirely on in-person census operations for initial contact:

37133000800

North Dakota

Total number of tracts examined in North Dakota: 178

Tracts in North Dakota that are known to rely entirely on in-person census operations for initial contact:

38053940100
38061940300
38061940400
38079941800
38085940800
38085940900

Ohio

Total number of tracts examined in Ohio: 2601

No tracts in Ohio rely entirely on in-person census operations for initial contact


Oklahoma

Total number of tracts examined in Oklahoma: 865

Tracts in Oklahoma that are known to rely entirely on in-person census operations for initial contact:

40007951700
40025950100
40113940001
40113940003
40113940004
40113940005
40113940006
40113940007
40113940008
40113940009
40113940011

Oregon

Total number of tracts examined in Oregon: 667

Tracts in Oregon that are known to rely entirely on in-person census operations for initial contact:

41031940000

Pennsylvania

Total number of tracts examined in Pennsylvania: 2914

Tracts in Pennsylvania that are known to rely entirely on in-person census operations for initial contact:

42079216501

Rhode Island

Total number of tracts examined in Rhode Island: 218

Tracts in Rhode Island that are known to rely entirely on in-person census operations for initial contact:

44009041500

South Carolina

Total number of tracts examined in South Carolina: 882

Tracts in South Carolina that are known to rely entirely on in-person census operations for initial contact:

45019002004
45019003200

South Dakota

Total number of tracts examined in South Dakota: 196

Tracts in South Dakota that are known to rely entirely on in-person census operations for initial contact:

46007941000
46007941200
46023940200
46023940300
46031941000
46031941100
46041941500
46041941700
46071941200
46085940100
46102940500
46102940800
46102940900
46109940400
46109940800
46121940100
46121940200
46137941600

Tennessee

Total number of tracts examined in Tennessee: 1272

No tracts in Tennessee rely entirely on in-person census operations for initial contact


Texas

Total number of tracts examined in Texas: 3793

Tracts in Texas that are known to rely entirely on in-person census operations for initial contact:

48141010505
48141010506
48215024403
48409010500
48507950100
48507950200

Utah

Total number of tracts examined in Utah: 483

Tracts in Utah that are known to rely entirely on in-person census operations for initial contact:

49013940300
49013940600
49015976300
49037942000
49037942100
49043964203
49047940201

Vermont

Total number of tracts examined in Vermont: 163

No tracts in Vermont rely entirely on in-person census operations for initial contact


Virginia

Total number of tracts examined in Virginia: 1604

No tracts in Virginia rely entirely on in-person census operations for initial contact


Washington

Total number of tracts examined in Washington: 1161

Tracts in Washington that are known to rely entirely on in-person census operations for initial contact:

53019940000
53027940000
53047940100
53047940200
53065941000

West Virginia

Total number of tracts examined in West Virginia: 424

Tracts in West Virginia that are known to rely entirely on in-person census operations for initial contact:

54019020500
54045956102
54045956500
54045956700
54045956800
54047953600
54047953800
54047953900
54047954200
54047954504
54057010100
54059957100
54059957200
54059957300
54059957700
54075960102
54081001002
54083966500
54089000600
54089000700
54089000800
54099020800
54109002800
54109002902
54109003100

Wisconsin

Total number of tracts examined in Wisconsin: 1264

No tracts in Wisconsin rely entirely on in-person census operations for initial contact


Wyoming

Total number of tracts examined in Wyoming: 104

Tracts in Wyoming that are known to rely entirely on in-person census operations for initial contact:

56013940100
56013940201
56013940202
56013940400
56013940500

Conclusion

Rural areas relying on in-person census operations for initial contact have generally had very low response rates. This is true both nationally and for virtually every state where these areas exist en masse.

The census is a major driver for federal/state funding decisions, resource allocation, congressional apportionment, and general representation over the next ten years. If these rural areas (and the states that contain many of them) do not catch up when in-person operations begin, they could be vastly underrepresented.

The stakes have become very large for these areas, and this analysis can help tell that story.

A Vital Addendum

As mentioned earlier, households in rural areas can typically still fill out the census online even before they receive an invitation. In most cases, you do not need to wait for an invitation to respond to the census.

Adapting This Analysis For Your Newsroom

NewsCounts can help you run this analysis for your region, whether it is a smaller region within a state or a collection of states. It is very easy for us to produce these same numbers for just about any level of US geography with a FIPS code, as long as it has a reasonable number of tracts relying on in-person enumeration.

We can do it almost instantaneously - just ask!
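Because tract identifiers begin with their state and county FIPS codes, narrowing the analysis to a custom region comes down to matching prefixes, much as `show_area` does for a single state. The sketch below shows the idea for a region made of one county and one whole state. The function name `tracts_in_region` is hypothetical, and the IDs are taken from the state lists above.

```python
def tracts_in_region(tract_ids, prefixes):
    """Return the tract IDs that start with any of the given FIPS prefixes.

    A 2-digit prefix selects a whole state; a 5-digit prefix selects a county.
    """
    prefixes = tuple(str(p) for p in prefixes)
    return sorted(t for t in tract_ids if t.startswith(prefixes))


tract_ids = [
    "04001942600", "04017940100", "35031940500",
    "35045942900", "49037942000",
]

# one county (04001) plus one whole state (35)
region = tracts_in_region(tract_ids, ["04001", "35"])
print(region)
```

Prefixes must be passed as zero-padded strings, since an integer such as 4001 would lose the leading zero that the tract IDs carry.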

Contact Info & Other Resources

Feel free to email me at benjamin.livingston@columbia.edu any time if you'd like us to do this for your area, or if you have any questions.

The Census Bureau is tracking 2020 response rates and provides a wonderful map with up-to-date data. NewsCounts also provides a beta dashboard and API that allows you to grab the daily response data for yourself.

We have also conducted a couple of other analyses that you may find useful for local census reporting:

Please don't hesitate to reach out with any census reporting-related questions. We recognize that 2020 is a challenging time for journalists, and we're here to make covering this pivotal census easier for you.