1 dataset found
  1. a

    Decoding Home Values: The Power of Education vs. Race, Ethnicity, and Gender...

    • chi-phi-nmcdc.opendata.arcgis.com
    Updated Jul 25, 2023
    Cite
    New Mexico Community Data Collaborative (2023). Decoding Home Values: The Power of Education vs. Race, Ethnicity, and Gender [Dataset]. https://chi-phi-nmcdc.opendata.arcgis.com/datasets/decoding-home-values-the-power-of-education-vs-race-ethnicity-and-gender
    Dataset updated: Jul 25, 2023
    Dataset authored and provided by: New Mexico Community Data Collaborative
    Description

    A detailed explanation of how this dataset was put together, including data sources and methodologies, follows below. Please see the "Terms of Use" section below for the Data Dictionary.

    DATA ACQUISITION AND CLEANING PROCESS

    This dataset was built from 5 separate datasets queried during April and May 2023 from the Census Microdata System: https://data.census.gov/mdat/#/

    All datasets include information on Property Value (VALP) by Educational Attainment (SCHL), Gender (SEX), and a specified race or ethnicity (RAC or HISP), and are grouped by Public Use Microdata Areas (PUMAs). PUMAs are geographic areas created by the Census Bureau; they are weighted by land area and population to facilitate data analysis. The data also included totals for the state of New Mexico, so 19 total geographies are represented.

    Datasets were downloaded separately by race and ethnicity because this was the only way to obtain the VALP, SCHL, and SEX variables intersectionally with race or ethnicity data.

    Cleaning each dataset started with recoding the SCHL and HISP variables; details on recoding can be found below. After recoding, each dataset was transposed so that PUMAs were rows and the SCHL, VALP, SEX, and race or ethnicity variables were the columns.

    Median values were calculated in every case where recoding was necessary. As a result, all property values in this dataset reflect median values.

    At times the ACS data downloaded with zeros instead of the 'null' values shown in initial query results. The VALP variable also included a "-1" value to reflect N/A values (details in the variable notes). Both zeros and "-1" values were removed before calculating median values, both to keep the data true to the original query and to generate accurate median values.

    Recoding the SCHL variable resulted in 5 rows for each PUMA, reflecting the different levels of educational attainment in each region. Columns grouped variables by race or ethnicity and gender. Cell values were property values.

    All 5 datasets were joined after recoding and cleaning the data. The original datasets all include 95 rows: 5 Educational Attainment levels for each of the 19 geographies (18 PUMAs plus New Mexico State totals).

    Because 1 row was needed for each PUMA in order to map this data, the data was split by Educational Attainment (SCHL), resulting in 110 columns reflecting median property values for each race or ethnicity by gender and level of educational attainment.

    A short, unique 2 to 5 letter alias was created for each PUMA in anticipation of needing a unique identifier (UID) to join the data with.

    GIS AND MAPPING PROCESS

    A PUMA shapefile was downloaded from the ACS site. The shapefile can be downloaded here: https://tigerweb.geo.census.gov/arcgis/rest/services/TIGERweb/PUMA_TAD_TAZ_UGA_ZCTA/MapServer

    The DBF from the PUMA shapefile was exported to Excel; this shapefile data included geographic information needed for mapping, such as GEOID and PUMACE. The UIDs created for each PUMA were added to the shapefile data; the PUMA shapefile data and ACS data were then joined on UID in JMP.

    The data table was joined to the shapefile in ArcGIS, based on PUMA region (specifically GEOID text).

    The resulting shapefile was exported as a GDB (geodatabase) in order to keep 'Null' values in the data. GDBs are capable of including a rule allowing null values where shapefiles are not. This GDB was uploaded to NMCDC's ArcGIS platform.

    SYSTEMS USED

    MS Excel was used for data cleaning, recoding, and deriving values. Recoding was done directly in the Microdata system when possible, but because the system was in beta at the time of use, some features were not functional at times.

    JMP was used to transpose, join, and split data.

    ArcGIS Desktop was used to create the shapefile uploaded to NMCDC's online platform.

    VARIABLE AND RECODING NOTES

    TIMEFRAME: Data was queried for the 5 year period of 2015 to 2019 because ACS changed its definition for and methods of collecting data on race and ethnicity in 2020. The change resulted in greater aggregation and less granular data on variables from 2020 onward.

    Note: All race data reflects that respondents identified as the specified race alone or in combination with one or more other races.

    VARIABLE: ACS VARIABLE DEFINITION; ACS VARIABLE NOTES; DETAILS OR URL FOR RAW DATA DOWNLOAD

    RACBLK: Black or African American; ACS Query: RACBLK, SCHL, SEX, VALP 2019 5yr

    RACAIAN: American Indian and Alaska Native; ACS Query: RACAIAN, SCHL, SEX, VALP 2019 5yr

    RACASN: Asian; ACS Query: RACASN, SCHL, SEX, VALP 2019 5yr

    RACWHT: White; ACS Query: RACWHT, SCHL, SEX, VALP 2019 5yr

    HISP: Hispanic Origin; ACS Query: HISP ORG, SCHL, SEX, VALP 2019 5yr

    HISP RECODE: 24 original separate variables. The Hispanic Origin (HISP) variable originally included 24 subcategories reflecting Mexican, Central American, South American, Caribbean Latino, and Spanish identities from each Latin American country. 7 recoded variables. These 24 variables were recoded (grouped) into 7 simpler categories for data analysis: Not Spanish/Hispanic/Latino, Mexican, Caribbean Latino, Central American, South American, Spaniard, and All other Spanish/Hispanic/Latino. Not Spanish/Hispanic/Latino was not really used in the final dataset, as the race datasets provided that information.

    SCHL: Educational Attainment; 25 original separate variables. The Educational Attainment (SCHL) variable originally included 25 subcategories reflecting the education levels of adults (over 18) surveyed by the ACS. These include: Kindergarten, Grades 1 through 12 separately, 12th grade with no diploma, High School Diploma, GED or credential, less than 1 year of college, more than 1 year of college with no degree, Associate's Degree, Bachelor's Degree, Master's Degree, Professional Degree, and Doctorate Degree.

    SCHL RECODE: 5 recoded variables. These 25 variables were recoded (grouped) into 5 simpler categories for data analysis: No High School Diploma, High School Diploma or GED, Some College, Bachelor's Degree, and Advanced or Professional Degree.

    SEX: Gender; 2 variables; 1 - Male, 2 - Female

    VALP: Property Value; 1 variable. Values were rounded and top-coded by ACS for anonymity. The "-1" value is defined as N/A (GQ / Vacant lots except 'for sale only' and 'sold, not occupied' / not owned or being bought). This variable reflects the median value of property owned by individuals of each race, ethnicity, gender, and educational attainment category.

    PUMA: Public Use Microdata Area; 18 PUMAs. PUMAs in New Mexico can be viewed here: https://nmcdc.maps.arcgis.com/apps/mapviewer/index.html?webmap=d9fed35f558948ea9051efe9aa529eaf
    Data includes 19 total regions: 18 PUMAs and NM State Totals.

    NOTES AND RESOURCES

    The following resources and documentation were used to navigate the ACS PUMS system and to answer questions about variables:

    Census Microdata API User Guide: https://www.census.gov/data/developers/guidance/microdata-api-user-guide.Additional_Concepts.html#list-tab-1433961450
    Accessing PUMS Data: https://www.census.gov/programs-surveys/acs/microdata/access.html
    How to use PUMS on data.census.gov: https://www.census.gov/programs-surveys/acs/microdata/mdat.html
    2019 PUMS Documentation: https://www.census.gov/programs-surveys/acs/microdata/documentation.2019.html#list-tab-1370939201
    2014 to 2018 ACS PUMS Data Dictionary: https://www2.census.gov/programs-surveys/acs/tech_docs/pums/data_dict/PUMS_Data_Dictionary_2014-2018.pdf
    2019 PUMS Tiger/Line Shapefiles: https://www.census.gov/cgi-bin/geo/shapefiles/index.php?year=2019&layergroup=Public+Use+Microdata+Areas

    Note 1: NMCDC attempted to contact analysts with the ACS system to clarify questions about variables, but did not receive a timely response. Documentation was then consulted.

    Note 2: All relevant documentation was reviewed and seems to imply that all survey questions were answered by adults, age 18 or over. Youth who have inherited property could potentially be reflected in this data.

    Dataset and feature service created in May 2023 by Renee Haley, Data Specialist, NMCDC.
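
The cleaning steps described above (recode SCHL, drop zeros and "-1" values, take medians, then split so each geography is a single row) can be sketched in Python. This is only an illustration and not the authors' workflow, which used Excel and JMP. The record layout, the "ABQ" alias, the column naming pattern, and the exact assignment of original SCHL labels to the 5 recoded categories (for example, placing Associate's Degree under Some College) are all assumptions made for this sketch.

```python
from collections import defaultdict
from statistics import median

# ASSUMPTION: which original SCHL labels fall into which recoded category.
# The source names the 5 categories but not the exact mapping, so this
# partial mapping is illustrative only.
SCHL_RECODE = {
    "12th grade with no diploma": "No High School Diploma",
    "High School Diploma": "High School Diploma or GED",
    "GED or credential": "High School Diploma or GED",
    "Less than 1 year of college": "Some College",
    "Associate's Degree": "Some College",  # assumed placement
    "Bachelor's Degree": "Bachelor's Degree",
    "Master's Degree": "Advanced or Professional Degree",
    "Professional Degree": "Advanced or Professional Degree",
    "Doctorate Degree": "Advanced or Professional Degree",
}

# Zeros (downloaded in place of nulls) and -1 (N/A) are dropped before the median.
INVALID_VALP = {0, -1}


def median_by_group(records):
    """Map (puma, recoded SCHL, sex) to the median of the valid VALP values.

    `records` is an iterable of (puma, schl_label, sex, valp) tuples,
    a hypothetical long-format layout.
    """
    grouped = defaultdict(list)
    for puma, schl_label, sex, valp in records:
        if valp in INVALID_VALP:
            continue  # skip placeholder values so they cannot skew the median
        grouped[(puma, SCHL_RECODE[schl_label], sex)].append(valp)
    return {key: median(values) for key, values in grouped.items()}


def to_wide(medians, race):
    """Split by SCHL so each PUMA is one row with one column per combination."""
    wide = defaultdict(dict)
    for (puma, schl_group, sex), value in medians.items():
        wide[puma][f"{race}_{sex}_{schl_group}"] = value
    return dict(wide)


# Hypothetical sample rows for a single race query (RACWHT) and one PUMA alias.
sample = [
    ("ABQ", "Master's Degree", "Female", 250000),
    ("ABQ", "Doctorate Degree", "Female", 310000),
    ("ABQ", "Professional Degree", "Female", -1),
    ("ABQ", "High School Diploma", "Male", 0),
    ("ABQ", "GED or credential", "Male", 120000),
]

medians = median_by_group(sample)
wide = to_wide(medians, race="RACWHT")
for column, value in sorted(wide["ABQ"].items()):
    print(column, value)
```

In this sketch, a combination whose values are all zero or "-1" simply has no column entry for that PUMA, which corresponds to the null cells that the geodatabase export was chosen to preserve.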


