5 datasets found
  1. Historic US Census - 1940

    • redivis.com
    application/jsonl +7
    Updated Jan 10, 2020
    Cite
    Stanford Center for Population Health Sciences (2020). Historic US Census - 1940 [Dataset]. http://doi.org/10.57761/660g-eq95
    Available download formats: avro, arrow, sas, application/jsonl, spss, parquet, stata, csv
    Dataset updated
    Jan 10, 2020
    Dataset provided by
    Redivis Inc.
    Authors
    Stanford Center for Population Health Sciences
    Time period covered
    Jan 1, 1940 - Dec 31, 1940
    Area covered
    United States
    Description

    Abstract

    The Integrated Public Use Microdata Series (IPUMS) Complete Count Data include more than 650 million individual-level and 7.5 million household-level records. The IPUMS microdata are the result of collaboration between IPUMS and the nation’s two largest genealogical organizations, Ancestry.com and FamilySearch, and provide the largest and richest source of individual-level and household-level data.

    Before Manuscript Submission

    All manuscripts (and other items you'd like to publish) must be submitted to phsdatacore@stanford.edu for approval prior to journal submission.

    We will check your cell sizes and citations.

    For more information about how to cite PHS and PHS datasets, please visit:

    https://phsdocs.developerhub.io/need-help/citing-phs-data-core

    Documentation

    Historic data are scarce and often exist only in aggregate tables. The key advantage of historic US census data is the availability of individual-level and household-level characteristics that researchers can tabulate in ways that benefit their specific research questions. The data contain demographic, economic, migration, and family variables. Within households, it is possible to create relational data because all relations between household members are known. For example, having data on the mother and her children in a household enables researchers to calculate the mother’s age at birth. Another advantage of the Complete Count data is the possibility of following individuals over time using a historical identifier.

    In sum: the historic US census data are a unique source for research on social and economic change and can provide population health researchers with information about social and economic determinants.
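
    The mother's-age-at-birth example mentioned above can be sketched as follows. This is a minimal illustration, not the data provider's method: it assumes each person record carries a within-household person number (PERNUM), a pointer to the mother's person number (MOMLOC, with 0 meaning no mother in the household), and AGE. Those three names follow common IPUMS conventions but are not confirmed for this extract, and the records are made up.

```python
# Illustrative person records for one household (values are made up).
# PERNUM, MOMLOC and AGE are assumed column names, not confirmed for this extract.
persons = [
    {"SERIALP": 1, "PERNUM": 1, "MOMLOC": 0, "AGE": 40},  # no mother in household
    {"SERIALP": 1, "PERNUM": 2, "MOMLOC": 0, "AGE": 35},  # the mother
    {"SERIALP": 1, "PERNUM": 3, "MOMLOC": 2, "AGE": 10},  # child of person 2
    {"SERIALP": 1, "PERNUM": 4, "MOMLOC": 2, "AGE": 4},   # child of person 2
]


def mothers_age_at_birth(persons):
    """Return {(SERIALP, child PERNUM): mother's age when the child was born}."""
    by_key = {(p["SERIALP"], p["PERNUM"]): p for p in persons}
    result = {}
    for child in persons:
        if child["MOMLOC"] == 0:  # assumed to mean "no mother in the household"
            continue
        mother = by_key.get((child["SERIALP"], child["MOMLOC"]))
        if mother is not None:
            # Both ages are measured at the census date, so the difference
            # approximates the mother's age at the child's birth.
            result[(child["SERIALP"], child["PERNUM"])] = mother["AGE"] - child["AGE"]
    return result


ages = mothers_age_at_birth(persons)
```

    Because census ages are whole years, the result is accurate only to within about a year.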

    The historic US 1940 census data were collected in April 1940. Enumerators collected data by traveling to households and counting the residents who regularly slept there. Individuals lacking permanent housing were counted as residents of the place where they were when the data were collected. Household members absent on the day of data collection were either listed with the household with the help of other household members or were scheduled for the last census subdivision.

    Notes

    • We provide IPUMS household and person data separately so that it is convenient to explore the descriptive statistics on each level. To obtain a full dataset, merge the household and person data on the variables SERIAL and SERIALP. To create a longitudinal dataset, merge datasets on the variable HISTID.
    • Households with more than 60 people in the original data were broken up for processing purposes. Every person in a large household is considered to be in their own household. The original large households can be identified using the variable SPLIT40 and reconstructed using the variable SERIAL40; the original count is found in the variable NUMPREC40.
    • Some variables are missing from this data set for specific enumeration districts. The enumeration districts with missing data can be identified using the variable EDMISS. These variables will be added in a future release.
    • Coded variables derived from string variables are still in progress. These variables include: occupation, industry and migration status.
    • Missing observations have been allocated and some inconsistencies have been edited for the following variables: SURSIM, SEX, SCHOOL, RELATE, RACE, OCC1950, MTONGUE, MBPL, FBPL, BPL, MARST, EMPSTAT, CITIZEN, OWNERSHP. The flag variables indicating an allocated observation for the associated variables can be included in your extract by clicking the ‘Select data quality flags’ box on the extract summary page.
    • Most inconsistent information was not edited for this release, so there are observations outside of the universe for many variables. In particular, the variables GQ and GQTYPE have known inconsistencies and will be improved with the next r
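
    The merges described in the notes above can be sketched in plain Python. This is an illustrative sketch only: it assumes SERIAL is the household key in the household file and SERIALP is the matching key in the person file, and the records and field values are made up. The helper names are hypothetical.

```python
# Made-up records; only the key names SERIAL, SERIALP and HISTID come from the notes.
households = [
    {"SERIAL": 100, "OWNERSHP": 1},
    {"SERIAL": 200, "OWNERSHP": 2},
]
persons = [
    {"SERIALP": 100, "HISTID": "A1", "SEX": 1},
    {"SERIALP": 100, "HISTID": "A2", "SEX": 2},
    {"SERIALP": 200, "HISTID": "B1", "SEX": 2},
]


def merge_person_household(persons, households):
    """Attach household fields to each person (many-to-one inner join)."""
    hh_by_serial = {h["SERIAL"]: h for h in households}
    merged = []
    for p in persons:
        hh = hh_by_serial.get(p["SERIALP"])
        if hh is not None:  # persons without a household row are dropped
            merged.append({**hh, **p})
    return merged


def link_by_histid(persons_a, persons_b):
    """Pair records for the same individual across two datasets via HISTID."""
    b_by_id = {p["HISTID"]: p for p in persons_b}
    return [(p, b_by_id[p["HISTID"]]) for p in persons_a if p["HISTID"] in b_by_id]


full = merge_person_household(persons, households)
```

    For data at the scale of the complete count files, the same joins would normally be run in a database or a dataframe library rather than in Python dictionaries.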
  2. Alaskan Population Demographic Information from Decennial and American...

    • knb.ecoinformatics.org
    • search.dataone.org
    • +1more
    Updated Apr 11, 2019
    Cite
    United States Census Bureau; Juliet Bachtel; John Randazzo; Erika Gavenus (2019). Alaskan Population Demographic Information from Decennial and American Community Survey Census Data, 1940-2016 [Dataset]. http://doi.org/10.5063/F10R9MPV
    Dataset updated
    Apr 11, 2019
    Dataset provided by
    Knowledge Network for Biocomplexity
    Authors
    United States Census Bureau; Juliet Bachtel; John Randazzo; Erika Gavenus
    Time period covered
    Jan 1, 1940 - Dec 31, 2015
    Area covered
    Variables measured
    lat, lng, Year, city, ANVSA, Negro, Other, Place, White, Aleut., and 145 more
    Description

    These data comprise Census records relating to the Alaskan people's population demographics for the State of Alaskan Salmon and People (SASAP) Project. Decennial census data were originally extracted from the IPUMS National Historical Geographic Information System website: https://data2.nhgis.org/main (Citation: Steven Manson, Jonathan Schroeder, David Van Riper, and Steven Ruggles. IPUMS National Historical Geographic Information System: Version 12.0 [Database]. Minneapolis: University of Minnesota. 2017. http://doi.org/10.18128/D050.V12.0). A number of relevant tables of basic demographics on age and race, household income and poverty levels, and labor force participation were extracted. These particular variables were selected as part of an effort to understand and potentially quantify various dimensions of well-being in Alaskan communities. The file "censusdata_master.csv" is a consolidation of all 21 other data files in the package. For detailed information on how the datasets vary over different years, view the file "readme.docx" available in this data package. The included .Rmd file is a script that combines the 21 files by year into a single file (censusdata_master.csv). It also cleans up place names (including typographical errors) and uses the USGS place names dataset and the SASAP regions dataset to assign latitude and longitude values and region values to each place in the dataset. Note that some places were not assigned a region or location because they do not fit well into the regional framework. Considerable heterogeneity exists between census surveys each year. While we have attempted to combine these datasets in a way that makes sense, there may be some discrepancies or unexpected values. The RMarkdown document SASAPWebsiteGraphicsCensus.Rmd is used to generate a variety of figures using these data, including the additional file Chignik_population.png. An additional set of 25 figures showing regional trends in population and income metrics is also included.

  3. VANV individual collections, 1940-2018 (inclusive): Dataset

    • dataverse.harvard.edu
    • search.datacite.org
    Updated Mar 28, 2019
    Cite
    Stark, Laura Jeanine Morris, 1975- (2019). VANV individual collections, 1940-2018 (inclusive): Dataset [Dataset]. http://doi.org/10.7910/DVN/WFFS4W
    Dataset updated
    Mar 28, 2019
    Dataset provided by
    Harvard Dataverse
    Authors
    Stark, Laura Jeanine Morris, 1975-
    License

    https://dataverse.harvard.edu/api/datasets/:persistentId/versions/1.0/customlicense?persistentId=doi:10.7910/DVN/WFFS4W

    Time period covered
    Jan 1, 1940 - Dec 31, 2018
    Description

    The Vernacular Archive of Normal Volunteers (VANV), 1940-2018 (inclusive) is a collection of oral histories, associated archival documents, and project records created and collected by Laura Jeanine Morris Stark (born 1975) to explore the lives of the first “normal control” research subjects at the Clinical Center of the United States National Institutes of Health (NIH) in Bethesda, Maryland who were recruited through NIH’s Normal Volunteer Patient Program. Dataset consists of materials from two sources. First, it includes audio recordings and transcripts of oral histories Laura Stark conducted from 2010-2017 with individuals who were involved with the NIH Normal Volunteer Patient Program between 1954 and 2002, along with related personal documents given to Stark by interviewees such as photographs, letters, diaries, news clippings and other memorabilia. Most of the interviewees were former “normal controls” and others were NIH staff members or scientists who did research on the “normal volunteers.” For the interviewees who provided Stark with historical contextual documents, these materials were digitized and the files were combined into one "records" PDF file for each individual interviewee by the Center for the History of Medicine as part of the dataverse deposit process. Original contextual documents remain in the possession of the interviewees. Second, the Dataset includes digital duplicates of materials related to the Normal Volunteer Patient Program compiled by Stark from the special collections of organizations that were the sources of “normal volunteers” for the NIH Clinical Center. Physical copies of the materials remain in the historical collections of organizations, such as universities, churches, civic groups, and labor unions, that signed contracts with NIH to provide healthy people for scientists to research through the Normal Volunteer Patient Program. 
    Records for individual collections are grouped alphabetically by the last name of the interviewee or the name of the organization. Data files include audio files of the oral history interviews, interview transcripts, individual consent and release forms, and related contextual documents supplied by interviewees or organizations. Associated records such as interview questions, the protocol for interview transcription, and template consent, release, and donation forms may be found in the dataset "VANV project records, 2010-2018." Note that the date span (1940-2018) of this dataset reflects the creation dates of original materials that may exist here only as more recently created digital reproductions. For example, items from 1940 are digital scans of letters, photographs, and other documents created in 1940.

  4. Climate Change: Earth Surface Temperature Data

    • kaggle.com
    • redivis.com
    zip
    Updated May 1, 2017
    Cite
    Berkeley Earth (2017). Climate Change: Earth Surface Temperature Data [Dataset]. https://www.kaggle.com/datasets/berkeleyearth/climate-change-earth-surface-temperature-data
    Available download formats: zip (88843537 bytes)
    Dataset updated
    May 1, 2017
    Dataset authored and provided by
    Berkeley Earth (http://berkeleyearth.org/)
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Area covered
    Earth
    Description

    Some say climate change is the biggest threat of our age while others say it’s a myth based on dodgy science. We are turning some of the data over to you so you can form your own view.

    Even more than with other data sets that Kaggle has featured, there’s a huge amount of data cleaning and preparation that goes into putting together a long-time study of climate trends. Early data was collected by technicians using mercury thermometers, where any variation in the visit time impacted measurements. In the 1940s, the construction of airports caused many weather stations to be moved. In the 1980s, there was a move to electronic thermometers that are said to have a cooling bias.

    Given this complexity, there are a range of organizations that collate climate trends data. The three most cited land and ocean temperature data sets are NOAA’s MLOST, NASA’s GISTEMP and the UK’s HadCrut.

    We have repackaged the data from a newer compilation put together by Berkeley Earth, which is affiliated with Lawrence Berkeley National Laboratory. The Berkeley Earth Surface Temperature Study combines 1.6 billion temperature reports from 16 pre-existing archives. It is nicely packaged and allows for slicing into interesting subsets (for example, by country). They publish the source data and the code for the transformations they applied. They also use methods that allow weather observations from shorter time series to be included, meaning fewer observations need to be thrown away.

    In this dataset, we have included several files:

    Global Land and Ocean-and-Land Temperatures (GlobalTemperatures.csv):

    • Date: starts in 1750 for average land temperature and 1850 for max and min land temperatures and global ocean and land temperatures
    • LandAverageTemperature: global average land temperature in Celsius
    • LandAverageTemperatureUncertainty: the 95% confidence interval around the average
    • LandMaxTemperature: global average maximum land temperature in Celsius
    • LandMaxTemperatureUncertainty: the 95% confidence interval around the maximum land temperature
    • LandMinTemperature: global average minimum land temperature in Celsius
    • LandMinTemperatureUncertainty: the 95% confidence interval around the minimum land temperature
    • LandAndOceanAverageTemperature: global average land and ocean temperature in Celsius
    • LandAndOceanAverageTemperatureUncertainty: the 95% confidence interval around the global average land and ocean temperature
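
    As a rough sketch of how the value and uncertainty columns listed above might be used together, the snippet below derives lower and upper bounds for each row. It assumes the uncertainty column is the half-width of the 95% interval and that the date column header is `dt`; neither is confirmed by this description. The two sample rows are made up and are not real measurements.

```python
import csv
import io

# Two made-up rows in the documented column layout (not real measurements).
# The "dt" header for the date column is an assumption.
sample = io.StringIO(
    "dt,LandAverageTemperature,LandAverageTemperatureUncertainty\n"
    "1900-01-01,2.0,0.5\n"
    "1900-02-01,3.0,0.25\n"
)


def confidence_bounds(rows, value_col, uncertainty_col):
    """Yield (date, lower, upper), treating the uncertainty as a half-width."""
    for row in rows:
        if not row[value_col] or not row[uncertainty_col]:
            continue  # skip rows where either field is empty
        value = float(row[value_col])
        half_width = float(row[uncertainty_col])
        yield row["dt"], value - half_width, value + half_width


bounds = list(
    confidence_bounds(
        csv.DictReader(sample),
        "LandAverageTemperature",
        "LandAverageTemperatureUncertainty",
    )
)
```

    To run this against the real file, replace `sample` with an open handle to GlobalTemperatures.csv and check the header names first.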

    Other files include:

    • Global Average Land Temperature by Country (GlobalLandTemperaturesByCountry.csv)
    • Global Average Land Temperature by State (GlobalLandTemperaturesByState.csv)
    • Global Land Temperatures By Major City (GlobalLandTemperaturesByMajorCity.csv)
    • Global Land Temperatures By City (GlobalLandTemperaturesByCity.csv)

    The raw data comes from the Berkeley Earth data page.

  5. Historic Redlining Scores for 2010 and 2020 US Census Tracts

    • openicpsr.org
    spss
    Updated May 25, 2021
    Cite
    Helen C.S. Meier; Bruce C. Mitchell (2021). Historic Redlining Scores for 2010 and 2020 US Census Tracts [Dataset]. http://doi.org/10.3886/E141121V2
    Available download formats: spss
    Dataset updated
    May 25, 2021
    Dataset provided by
    National Community Reinvestment Coalition
    University of Michigan. Institute for Social Research. Survey Research Center
    Authors
    Helen C.S. Meier; Bruce C. Mitchell
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    United States
    Description

    The Home Owners’ Loan Corporation (HOLC) was a U.S. federal agency that graded mortgage investment risk of neighborhoods across the U.S. between 1935 and 1940. HOLC residential security maps standardized neighborhood risk appraisal methods that included race and ethnicity, pioneering the institutional logic of residential “redlining.” The Mapping Inequality Project digitized the HOLC mortgage security risk maps from the 1930s. We overlaid the HOLC maps with 2010 and 2020 census tracts for 142 cities across the U.S. using ArcGIS and determined the proportion of HOLC residential security grades contained within the boundaries. We assigned a numerical value to each HOLC risk category as follows: 1 for “A” grade, 2 for “B” grade, 3 for “C” grade, and 4 for “D” grade. We calculated a historic redlining score from the summed proportion of HOLC residential security grades multiplied by a weighting factor based on area within each census tract. A higher score means greater redlining of the census tract. Continuous historic redlining score, assessing the degree of “redlining,” as well as 4 equal interval divisions of redlining, can be linked to existing data sources by census tract identifier allowing for one form of structural racism in the housing market to be assessed with a variety of outcomes. The 2010 files are set to census 2010 tract boundaries. The 2020 files use the new census 2020 tract boundaries, reflecting the increase in the number of tracts from 12,888 in 2010, to 13,488 in 2020. Use the 2010 HRS with decennial census 2010 or ACS 2010-2019 data. As of publication (10/15/2020) decennial census 2020 data for the P1 (population) and H1 (housing) files are available from census.
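
    The scoring rule described above can be sketched as an area-weighted average of the numeric grade values. The description does not spell out the exact weighting, so this sketch assumes each grade is weighted by its share of the tract's HOLC-graded area. The function name and the input layout are hypothetical.

```python
# Numeric values for HOLC grades, as stated in the description.
GRADE_VALUE = {"A": 1, "B": 2, "C": 3, "D": 4}


def redlining_score(grade_area):
    """Area-weighted mean grade for one tract.

    grade_area maps an HOLC grade to the graded area inside the tract
    (any consistent unit). Returns None when the tract has no graded area.
    Weighting by share of graded area is an assumption of this sketch.
    """
    total = sum(grade_area.values())
    if total == 0:
        return None
    return sum(GRADE_VALUE[g] * area / total for g, area in grade_area.items())


# A tract split evenly between "A" and "D" areas scores midway between 1 and 4.
example = redlining_score({"A": 1.0, "D": 1.0})
```

    Under this assumption the score ranges from 1 (entirely "A") to 4 (entirely "D"), consistent with a higher score meaning greater redlining.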

