Here is a list of the ten most Hispanic counties in New York State from the 2020 US Census

Here is a list of the ten most Hispanic counties in New York State from the 2020 US Census.

County	Percent Hispanic
Bronx	54.7625579396111
Queens	27.7643315385306
Westchester	26.8138904900857
New York	23.7650737700612
Orange	22.3627619545987
Suffolk	21.8202133794694
Rockland	19.6409412140254
Richmond	19.5583634394157
Kings	18.8747087980808
Nassau	18.3715271956635

Here is how you can create this list using pandas. You will need to download the P.L. 94-171 Redistricting Data for the state and the Legacy File Format header records (2020_PLSummaryFile_FieldNames.xlsx), expand the ZIP file, and place the files in the directory described below.
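
Before running the script, it can help to confirm that the files are where the script expects them. This is a minimal sketch, assuming the file naming used in the script's comments (XXgeo2020.pl, XX000012020.pl through XX000032020.pl, and the field names workbook). The helper names expected_pl_files and missing_pl_files are made up for illustration and are not part of pandas or any Census tool.

```python
from pathlib import Path

def expected_pl_files(state):
    """File names the script reads, for a two-letter state code (hypothetical helper)."""
    state = state.lower()
    names = ['2020_PLSummaryFile_FieldNames.xlsx', f'{state}geo2020.pl']
    # segments 1 through 3 of the legacy-format data
    names += [f'{state}0000{n}2020.pl' for n in (1, 2, 3)]
    return names

def missing_pl_files(path, state):
    """Return the expected files that are not present in the directory `path`."""
    folder = Path(path)
    return [name for name in expected_pl_files(state)
            if not (folder / name).is_file()]

# An empty list means everything is in place
print(missing_pl_files('/home/andy/Desktop/2020pl-94-171/', 'ny'))
```

Only the field names workbook, the geoheader, and segment 1 are actually read by the script below, so a missing segment 2 or 3 would not stop it.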

import pandas as pd

# directory holding 2020_PLSummaryFile_FieldNames.xlsx plus the state files
# XXgeo2020.pl and XX000012020.pl through XX000032020.pl
# (XX = two-letter state code)
path='/home/andy/Desktop/2020pl-94-171/'

# state code
state='ny'

# header file, opened with all tabs as a dictionary of DataFrames
field_names=pd.read_excel(path+'2020_PLSummaryFile_FieldNames.xlsx', sheet_name=None)

# load the geoheader; force str dtype to avoid mixed types on certain fields
# and to keep GEOIDs intact, which avoids issues with zero padding
gh=pd.read_csv( path+state+'geo2020.pl',delimiter='|',
               header=None, 
               names=field_names['2020 P.L. Geoheader Fields'].columns,
               index_col='LOGRECNO',
               dtype=str )
               
# load segment 1 of the 2020 P.L. 94-171 data, which holds the race and Hispanic-origin tables
segNum=1
seg=pd.read_csv( path+state+'0000'+str(segNum)+'2020.pl',delimiter='|',
               header=None, 
               names=field_names['2020 P.L. Segment '+str(segNum)+' Fields'].columns,
               index_col='LOGRECNO',
              )
# discard FILEID, STUSAB, CHARITER, CIFSN as duplicative after join
seg=seg.iloc[:,4:]

# join seg to geoheader
seg=gh.join(seg)

# Calculate the share of each New York county's population that is Hispanic,
# using the county summary level, SUMLEV == '050' (see the Census docs)
ql="SUMLEV=='050'"
counties=seg.query(ql)

# Create a DataFrame with the County and Percent Hispanic
# You can get the fields list from 2020_PLSummaryFile_FieldNames.xlsx
# under the 2020 P.L. Segment 1 Definitions tab
# P0020001 = total population, P0020002 = Hispanic or Latino
his=pd.DataFrame({ 'County': counties['BASENAME'],
              'Percent Hispanic': counties['P0020002'] / counties['P0020001'] *100})

# Sort and write the ten most Hispanic counties to a CSV file
his.sort_values(by="Percent Hispanic", ascending=False).head(10).to_csv('/tmp/hispanics.csv')
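
To see the percentage calculation in isolation, here is a minimal sketch on a hand-made table. The counts and place names are invented for illustration and are not Census figures. The column names match the ones used above (P0020001 for total population, P0020002 for Hispanic or Latino).

```python
import pandas as pd

# Invented rows: one state record (SUMLEV '040') and two county records ('050')
toy = pd.DataFrame({
    'SUMLEV':   ['040', '050', '050'],
    'BASENAME': ['Example State', 'Alpha', 'Beta'],
    'P0020001': [1000, 400, 600],   # total population
    'P0020002': [400, 100, 300],    # Hispanic or Latino
})

# Keep only county-level records, as the script does with SUMLEV == '050'
counties = toy.query("SUMLEV == '050'")

pct = pd.DataFrame({
    'County': counties['BASENAME'],
    'Percent Hispanic': (counties['P0020002'] / counties['P0020001'] * 100).round(1),
})

# Highest share first
top = pct.sort_values(by='Percent Hispanic', ascending=False)
print(top.to_string(index=False))
```

Rounding with round(1) is an optional tidy-up; the script above keeps full precision. Passing index=False to to_csv would likewise leave the LOGRECNO index out of the exported file.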
