This dataset contains Race/Ethnicity codes. It is used to enter patient demographic information.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This Zenodo entry details the methodology for extracting and reconciling ethnicity data from the Clinical Practice Research Datalink (CPRD), incorporating both General Practitioner (GP) and Hospital Episode Statistics (HES) sources. The approach aims to resolve discrepancies between these sources and provide a standardized single ethnicity value per patient, categorized into 6 and 12 levels according to NHS coding guidelines.
Ethnicity data from the CPRD are recorded in multiple formats. This study harmonizes these data to achieve consistent ethnicity classification across patient records, following a hierarchical reconciliation protocol that prioritizes hospital data over GP records.
Ethnicity Levels: Ethnicity data are processed to conform to two levels of granularity: a 6-category and a 12-category classification.
Source Data Mapping:
Algorithm (AIM-CISC):
Unique Patient Identifiers: Each patient is represented once in hospital data, ensuring a single source of truth for hospital-based ethnicities. This simplifies reconciliation with GP data when discrepancies arise.
Instances were noted where multiple Medcodes map back to a single SNOMED code, highlighting the importance of careful data cross-referencing. For example, two different Medcodes represent the New Zealand European ethnicity, which both map back to the identical SNOMED code.
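The hospital-first reconciliation described in this entry can be sketched as follows. This is a minimal illustration, not the AIM-CISC algorithm itself: the code values, the lookup tables, the function names, and the choice to treat "Unknown" as missing are all assumptions made for the example.

```python
from typing import Optional

# Hypothetical lookups. Two Medcodes map to the same SNOMED code, mirroring
# the many-to-one situation noted above. None of these are real codes.
MEDCODE_TO_SNOMED = {"m001": "snomed_A", "m002": "snomed_A", "m003": "snomed_B"}
SNOMED_TO_CATEGORY = {"snomed_A": "White", "snomed_B": "Asian"}


def gp_category(medcode: Optional[str]) -> Optional[str]:
    """Map a GP Medcode to an ethnicity category via its SNOMED code."""
    if medcode is None:
        return None
    snomed = MEDCODE_TO_SNOMED.get(medcode)
    return SNOMED_TO_CATEGORY.get(snomed) if snomed is not None else None


def reconcile(hes_category: Optional[str], gp_medcode: Optional[str]) -> Optional[str]:
    """Return one ethnicity per patient, preferring the hospital (HES) value.

    Assumption: a HES value of None or "Unknown" counts as missing, in which
    case the GP record is used instead.
    """
    if hes_category is not None and hes_category != "Unknown":
        return hes_category
    return gp_category(gp_medcode)
```

With these hypothetical tables, a patient with a usable hospital value keeps it even when the GP record disagrees, and the GP record is consulted only as a fallback.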
This layer shows the population broken down by race and Hispanic origin. Data is from US Census American Community Survey (ACS) 5-year estimates. To see the full list of attributes available in this service, go to the "Data" tab, and choose "Fields" at the top right (in ArcGIS Online). A ‘Null’ entry in the estimate indicates that data for this geographic area cannot be displayed because the number of sample cases is too small (per the U.S. Census).
Vintage: 2018-2022
ACS Table(s): B03002 (Not all lines of this ACS table are available in this feature layer.)
Data downloaded from: Census Bureau's API for American Community Survey
Data Preparation: Data table was downloaded and joined with Zip Code boundaries in the City of Tempe.
Date of Census update: December 15, 2023
National Figures: data.census.gov
Open Government Licence 3.0 http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
Comparing NHS England SNOMED code mapping with how individuals self-identified their ethnicity in Census 2021.
Code lists for ethnicity and the conditions considered in our study.
This dataset includes race/ethnicity of newly Medi-Cal eligible individuals who identified their race/ethnicity as Hispanic, White, Other Asian or Pacific Islander, Black, Chinese, Filipino, Vietnamese, Asian Indian, Korean, Alaskan Native or American Indian, Japanese, Cambodian, Samoan, Laotian, Hawaiian, Guamanian, Amerasian, or Other, by reporting period. The race/ethnicity data is from the Medi-Cal Eligibility Data System (MEDS) and includes eligible individuals without prior Medi-Cal Eligibility. This dataset is part of the public reporting requirements set forth in California Welfare and Institutions Code 14102.5.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This layer shows population broken down by race and Hispanic origin. Data is from US Census American Community Survey (ACS) 5-year estimates. To see the full list of attributes available in this service, go to the "Data" tab, and choose "Fields" at the top right (in ArcGIS Online). A ‘Null’ entry in the estimate indicates that data for this geographic area cannot be displayed because the number of sample cases is too small (per the U.S. Census).
Vintage: 2016-2020
ACS Table(s): B03002 (Not all lines of this ACS table are available in this feature layer.)
Data downloaded from: Census Bureau's API for American Community Survey
Data Preparation: Data table downloaded and joined with Zip Code boundaries in the City of Tempe.
Date of Census update: March 17, 2022
National Figures: data.census.gov
TIGER, TIGER/Line, and Census TIGER are registered trademarks of the Bureau of the Census. The Redistricting Census 2000 TIGER/Line files are an extract of selected geographic and cartographic information from the Census TIGER data base. The geographic coverage for a single TIGER/Line file is a county or statistical equivalent entity, with the coverage area based on January 1, 2000 legal boundaries. A complete set of Redistricting Census 2000 TIGER/Line files includes all counties and statistically equivalent entities in the United States and Puerto Rico. The Redistricting Census 2000 TIGER/Line files will not include files for the Island Areas. The Census TIGER data base represents a seamless national file with no overlaps or gaps between parts. However, each county-based TIGER/Line file is designed to stand alone as an independent data set or the files can be combined to cover the whole Nation. The Redistricting Census 2000 TIGER/Line files consist of line segments representing physical features and governmental and statistical boundaries. The Redistricting Census 2000 TIGER/Line files do NOT contain the ZIP Code Tabulation Areas (ZCTAs) and the address ranges are of approximately the same vintage as those appearing in the 1999 TIGER/Line files. That is, the Census Bureau is producing the Redistricting Census 2000 TIGER/Line files in advance of the computer processing that will ensure that the address ranges in the TIGER/Line files agree with the final Master Address File (MAF) used for tabulating Census 2000. The files contain information distributed over a series of record types for the spatial objects of a county. There are 17 record types, including the basic data record, the shape coordinate points, and geographic codes that can be used with appropriate software to prepare maps.
Other geographic information contained in the files includes attributes such as feature identifiers/census feature class codes (CFCC) used to differentiate feature types, address ranges and ZIP Codes, codes for legal and statistical entities, latitude/longitude coordinates of linear and point features, landmark point features, area landmarks, key geographic features, and area boundaries. The Redistricting Census 2000 TIGER/Line data dictionary contains a complete list of all the fields in the 17 record types.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Ethnic codes as defined by the Health Management Information System.
This dataset provides a detailed breakdown of demographic information for counties across the United States, derived from the U.S. Census Bureau's 2023 American Community Survey (ACS). The data includes population counts by gender, race, and ethnicity, alongside unique identifiers for each county using State and County FIPS codes.
The dataset includes the following columns:
- County: Name of the county.
- State: Name of the state the county belongs to.
- State FIPS Code: Federal Information Processing Standard (FIPS) code for the state.
- County FIPS Code: FIPS code for the county.
- FIPS: Combined State and County FIPS codes, a unique identifier for each county.
- Total Population: Total population in the county.
- Male Population: Number of males in the county.
- Female Population: Number of females in the county.
- Total Race Responses: Total race-related responses recorded in the survey.
- White Alone: Number of individuals identifying as White alone.
- Black or African American Alone: Number of individuals identifying as Black or African American alone.
- Hispanic or Latino: Number of individuals identifying as Hispanic or Latino.
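As an illustration of the combined FIPS column, the sketch below joins a state code and a county code into the usual five-character identifier (two digits for the state, three for the county). The function name is hypothetical, and how this particular dataset stores the codes (text or integers) is an assumption; the zero-padding matters when codes have been read as integers and lost their leading zeros.

```python
def combined_fips(state_fips, county_fips) -> str:
    """Build a 5-character county identifier from state and county FIPS codes.

    Accepts ints or digit strings; pads the state code to 2 digits and the
    county code to 3 digits so leading zeros are preserved.
    """
    return f"{int(state_fips):02d}{int(county_fips):03d}"


# Example: state code 6 and county code 37 become "06037".
print(combined_fips(6, 37))
```

Keeping the result as a string, rather than converting it back to a number, avoids dropping the leading zero again when joining to geographic boundary files.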
NAME field for clarity.
This dataset is highly versatile and suitable for:
- Demographic Analysis: Analyze population distribution by gender, race, and ethnicity.
- Geographic Studies: Use FIPS codes to map counties geographically.
- Data Visualizations: Create visual insights into demographic trends across counties.
Special thanks to the U.S. Census Bureau for making this data publicly available and to the Kaggle community for fostering a collaborative space for data analysis and exploration.
http://www.gnu.org/licenses/old-licenses/gpl-2.0.en.html
The following data set contains information about counties in the United States from 2010 through 2019, obtained from the United States Census Bureau. It covers age distributions, education levels, employment statistics, ethnicity percentages, household information, income, and other miscellaneous statistics. (Values are denoted as -1 if the data is not available.)
| Key | List of... | Comment | Example Value |
|---|---|---|---|
| County | String | County name | "Abbeville County" |
| State | String | State name | "SC" |
| Age.Percent 65 and Older | Float | Estimated percentage of the population aged 65 years or older. Estimates are produced for the United States, states, and counties, as well as for the Commonwealth of Puerto Rico and its municipios (county-equivalents for Puerto Rico). | 22.4 |
| Age.Percent Under 18 Years | Float | Estimated percentage of the population aged under 18 years. Estimates are produced for the United States, states, and counties, as well as for the Commonwealth of Puerto Rico and its municipios (county-equivalents for Puerto Rico). | 19.8 |
| Age.Percent Under 5 Years | Float | Estimated percentage of the population aged under 5 years. Estimates are produced for the United States, states, and counties, as well as for the Commonwealth of Puerto Rico and its municipios (county-equivalents for Puerto Rico). | 4.7 |
| Education.Bachelor's Degree or Higher | Float | Percentage of people who received a bachelor's, master's, professional, or doctorate degree. These data include only persons 25 years old and over. The percentages are obtained by dividing the counts of graduates by the total number of persons 25 years old and over. The data was collected from 2015 to 2019. | 15.6 |
| Education.High School or Higher | Float | Percentage of people whose highest degree was a high school diploma or its equivalent, people who attended college but did not receive a degree, and people who received an associate's, bachelor's, master's, professional, or doctorate degree. These data include only persons 25 years old and over. The percentages are obtained by dividing the counts of graduates by the total number of persons 25 years old and over. The data was collected from 2015 to 2019. | 81.7 |
| Employment.Nonemployer Establishments | Integer | An establishment is a single physical location at which business is conducted or where services or industrial operations are performed. It is not necessarily identical with a company or enterprise, which may consist of one establishment or more. The data was collected in 2018. | 1416 |
| Ethnicities.American Indian and Alaska Native Alone | Float | Estimated percentage of the population having origins in any of the original peoples of North and South America (including Central America) and who maintain tribal affiliation or community attachment. This category includes people who indicate their race as "American Indian or Alaska Native" or report entries such as Navajo, Blackfeet, Inupiat, Yup'ik, or Central American Indian groups or South American Indian groups. | 0.3 |
| Ethnicities.Asian Alone | Float | Estimated percentage of the population having origins in any of the original peoples of the Far East, Southeast Asia, or the Indian subcontinent, including, for example, Cambodia, China, India, Japan, Korea, Malaysia, Pakistan, the Philippine Islands, Thailand, and Vietnam. This includes people who reported detailed Asian responses such as "Asian Indian," "Chinese," "Filipino," "Korean," "Japanese," "Vietnamese," and "Other Asian," or who provide other detailed Asian responses. | 0.4 |
| Ethnicities.Black Alone | Float | Estimated percentage of the population having origins in any of the Black racial groups of Africa. It includes people who indicate their race as "Black or African American," or report entries such as African American, Kenyan, Nigerian, or Haitian. | 27.6 |
| Ethnicities.Hispanic or Latino | Float |
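Because the description above says unavailable values are stored as -1, any analysis needs to drop them before averaging. The following is a rough sketch of one way to do that; the function names and the sample records are made up for illustration and are not part of the dataset.

```python
MISSING_SENTINEL = -1  # per the dataset description: -1 means "not available"


def clean_record(record: dict) -> dict:
    """Return a copy of the record with sentinel values replaced by None."""
    return {
        key: None
        if isinstance(value, (int, float)) and value == MISSING_SENTINEL
        else value
        for key, value in record.items()
    }


def mean_available(records: list, key: str):
    """Average a field over the records where it is present and not None."""
    values = [r[key] for r in records if r.get(key) is not None]
    return sum(values) / len(values) if values else None


# Hypothetical rows using keys from the table above.
rows = [
    {"County": "Abbeville County", "Age.Percent 65 and Older": 22.4},
    {"County": "Example County", "Age.Percent 65 and Older": -1},
]
cleaned = [clean_record(r) for r in rows]
print(mean_available(cleaned, "Age.Percent 65 and Older"))
```

Leaving the -1 values in place would pull percentage averages downward, which is why the replacement happens before any aggregation.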
https://creativecommons.org/publicdomain/zero/1.0/
This directory contains the code and data behind the story Dear Mona, What’s The Most Common Name In America?
The main script file is most-common-name.R
There are four input files:
- state-pop.csv - Total population and Hispanic population by state.
- surnames.csv - Data on surnames from the U.S. Census Bureau, including a breakdown by race/ethnicity.
- aging-curve.csv - Data from the Social Security Administration on the chances that someone born in the decade shown was still alive in 2013: http://www.ssa.gov/oact/NOTES/as120/LifeTables_Tbl_7.html
- adjustments.csv - Taken directly from Lee Hartman's article: http://mypage.siu.edu/lhartman/johnsmith.html.
And five output files:
- adjusted-name-combinations-list.csv - Adjusted estimates for the most common full names.
- adjusted-name-combinations-matrix.csv - The same data from the file adjusted-name-combinations-list.csv but in matrix form. These are the estimates presented in the second (and final) table of the article.
- independent-name-combinations-by-pop.csv - Matrix of estimates for the top 100 most common first names by top 100 most common surnames. These were calculated using independent odds, and displayed in the first table presented in the article.
- new-top-firstNames.csv - Final estimated ranking of top first names.
- new-top-surnames.csv - Final estimated ranking of top surnames.
This is a dataset from FiveThirtyEight hosted on their GitHub. Explore FiveThirtyEight data using Kaggle and all of the data sources available through the FiveThirtyEight organization page!
This dataset is maintained using GitHub's API and Kaggle's API.
This dataset is distributed under the Attribution 4.0 International (CC BY 4.0) license.
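The "independent odds" estimates mentioned in the file list can be illustrated with a small sketch: if first names and surnames were chosen independently, the expected number of people with a given full name is the population times the share with that first name times the share with that surname. This is not the logic of most-common-name.R, which is written in R and applies further adjustments; the function names and all figures below are invented for illustration.

```python
def independent_estimate(p_first: float, p_last: float, population: int) -> float:
    """Expected count of a full name, assuming first and last names are independent."""
    return p_first * p_last * population


def estimate_matrix(first_shares: dict, last_shares: dict, population: int) -> dict:
    """Build a {(first, last): estimate} table for every name combination."""
    return {
        (first, last): independent_estimate(pf, pl, population)
        for first, pf in first_shares.items()
        for last, pl in last_shares.items()
    }


# Hypothetical name shares and population, not taken from the input files.
first_shares = {"James": 0.02, "Mary": 0.015}
last_shares = {"Smith": 0.01, "Johnson": 0.008}
matrix = estimate_matrix(first_shares, last_shares, population=1_000_000)
print(max(matrix, key=matrix.get))
```

The independence assumption ignores the correlation between first names and surnames within ethnic groups, which is presumably why the description distinguishes these estimates from the adjusted ones.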
This layer shows the population broken down by race and Hispanic origin. Data is from US Census American Community Survey (ACS) 5-year estimates. To see the full list of attributes available in this service, go to the "Data" tab, and choose "Fields" at the top right (in ArcGIS Online). A ‘Null’ entry in the estimate indicates that data for this geographic area cannot be displayed because the number of sample cases is too small (per the U.S. Census).
Vintage: 2019-2023
ACS Table(s): B03002 (Not all lines of this ACS table are available in this feature layer.)
Data downloaded from: Census Bureau's API for American Community Survey
Data Preparation: Data table was downloaded and joined with Zip Code boundaries in the City of Tempe.
Date of Census update: December 12, 2024
National Figures: data.census.gov
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
This dataset includes all personal names listed in the Wikipedia category “American people by ethnic or national origin” and all subcategories fitting the pattern “American People of [ ] descent”, in total more than 25,000 individuals. Each individual is represented by a row, with columns indicating binary membership (0/1) in each ethnic/national category.
Ethnicity inference is an essential tool for identifying disparities in public health and social sciences. Existing datasets linking personal names to ethnic or national origin often neglect to recognize multi-ethnic or multi-national identities. Furthermore, existing datasets use coarse classification schemes (e.g. classifying both Indian and Japanese people as “Asian”) that may not be suitable for many research questions. This dataset remedies these problems by including both very fine-grained ethnic/national categories (e.g. Afghan-Jewish) and broader ones (e.g. European). Users can choose the categories that are relevant to their research. Since many Americans on Wikipedia are associated with multiple overlapping or distinct ethnicities/nationalities, these multi-ethnic associations are also reflected in the data.
Data were obtained from the Wikipedia API and reviewed manually to remove stage names, pen names, mononyms, first initials (when full names are available on Wikipedia), nicknames, honorific titles, and pages that correspond to a group or event rather than an individual.
This dataset was designed for use in training classification algorithms, but may also be independently interesting inasmuch as it is a representative sample of Americans who are famous enough to have their own Wikipedia page, along with detailed information on their ethnic/national origins.
DISCLAIMER: Due to the incomplete nature of Wikipedia, data may not properly reflect all ethnic/national associations for any given individual. For example, there is no guarantee that a given Cuban Jewish person will be listed in both the “American People of Cuban descent” and the “American People of Jewish descent” categories.
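The row-per-person layout with 0/1 membership columns described above can be worked with as in the sketch below. The names, the column labels, and the helper functions are placeholders for illustration; the real file's column names may differ.

```python
# Hypothetical rows: one per individual, with 0/1 flags per category.
rows = [
    {"name": "Person A", "Cuban": 1, "Jewish": 1, "European": 0},
    {"name": "Person B", "Cuban": 0, "Jewish": 0, "European": 1},
    {"name": "Person C", "Cuban": 1, "Jewish": 0, "European": 0},
]


def categories_of(row: dict) -> list:
    """List every category flagged 1 for one individual (may be several)."""
    return sorted(key for key, value in row.items() if key != "name" and value == 1)


def select(rows: list, wanted: list) -> list:
    """Names of individuals who belong to at least one of the wanted categories."""
    return [r["name"] for r in rows if any(r.get(c, 0) == 1 for c in wanted)]


print(categories_of(rows[0]))
print(select(rows, ["Cuban"]))
```

Because membership is stored per category rather than as a single label, a person can appear under several categories at once, and users can restrict training data to whichever columns suit their research question.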
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Although the American Community Survey (ACS) produces population, demographic and housing unit estimates, the decennial census is the official source of population totals for April 1st of each decennial year. In between censuses, the Census Bureau's Population Estimates Program produces and disseminates the official estimates of the population for the nation, states, counties, cities, and towns and estimates of housing units and the group quarters population for states and counties.
Information about the American Community Survey (ACS) can be found on the ACS website. Supporting documentation including code lists, subject definitions, data accuracy, and statistical testing, and a full list of ACS tables and table shells (without estimates) can be found on the Technical Documentation section of the ACS website. Sample size and data quality measures (including coverage rates, allocation rates, and response rates) can be found on the American Community Survey website in the Methodology section.
Source: U.S. Census Bureau, 2019-2023 American Community Survey 5-Year Estimates.
ACS data generally reflect the geographic boundaries of legal and statistical areas as of January 1 of the estimate year. For more information, see Geography Boundaries by Year.
Users must consider potential differences in geographic boundaries, questionnaire content or coding, or other methodological issues when comparing ACS data from different years. Statistically significant differences shown in ACS Comparison Profiles, or in data users' own analysis, may be the result of these differences and thus might not necessarily reflect changes to the social, economic, housing, or demographic characteristics being compared. For more information, see Comparing ACS Data.
Data are based on a sample and are subject to sampling variability. The degree of uncertainty for an estimate arising from sampling variability is represented through the use of a margin of error. The value shown here is the 90 percent margin of error. The margin of error can be interpreted roughly as providing a 90 percent probability that the interval defined by the estimate minus the margin of error and the estimate plus the margin of error (the lower and upper confidence bounds) contains the true value. In addition to sampling variability, the ACS estimates are subject to nonsampling error (for a discussion of nonsampling variability, see ACS Technical Documentation). The effect of nonsampling error is not represented in these tables.
Workers include members of the Armed Forces and civilians who were at work last week.
The Hispanic origin and race codes were updated in 2020. For more information on the Hispanic origin and race code changes, please visit the American Community Survey Technical Documentation website.
Estimates of urban and rural populations, housing units, and characteristics reflect boundaries of urban areas defined based on 2020 Census data. As a result, data for urban and rural areas from the ACS do not necessarily reflect the results of ongoing urbanization.
Explanation of Symbols:
"-": The estimate could not be computed because there were an insufficient number of sample observations. For a ratio of medians estimate, one or both of the median estimates falls in the lowest interval or highest interval of an open-ended distribution. For a 5-year median estimate, the margin of error associated with a median was larger than the median itself.
"N": The estimate or margin of error cannot be displayed because there were an insufficient number of sample cases in the selected geographic area.
"(X)": The estimate or margin of error is not applicable or not available.
"median-": The median falls in the lowest interval of an open-ended distribution (for example "2,500-").
"median+": The median falls in the highest interval of an open-ended distribution (for example "250,000+").
"**": The margin of error could not be computed because there were an insufficient number of sample observations.
"***": The margin of error could not be computed because the median falls in the lowest interval or highest interval of an open-ended distribution.
"*****": A margin of error is not appropriate because the corresponding estimate is controlled to an independent population or housing estimate. Effectively, the corresponding estimate has no sampling error and the margin of error may be treated as zero.
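The margin-of-error description above translates directly into a small calculation: the confidence bounds are the estimate minus and plus the published margin of error, and dividing a 90 percent margin of error by 1.645 gives an approximate standard error. The sketch below assumes that factor and uses made-up numbers; the function names are hypothetical.

```python
Z_90 = 1.645  # z-score for a 90 percent confidence level


def confidence_bounds(estimate: float, moe: float) -> tuple:
    """Lower and upper 90 percent bounds: estimate minus/plus the margin of error."""
    return (estimate - moe, estimate + moe)


def standard_error(moe: float) -> float:
    """Approximate standard error implied by a 90 percent margin of error."""
    return moe / Z_90


# Hypothetical estimate of 1,000 people with a margin of error of 150.
low, high = confidence_bounds(1000, 150)
print(low, high)
print(round(standard_error(150), 1))
```

These bounds cover sampling variability only; as the notes above state, nonsampling error is not represented in the published margins.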
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Although the American Community Survey (ACS) produces population, demographic and housing unit estimates, the decennial census is the official source of population totals for April 1st of each decennial year. In between censuses, the Census Bureau's Population Estimates Program produces and disseminates the official estimates of the population for the nation, states, counties, cities, and towns and estimates of housing units for states and counties.
Information about the American Community Survey (ACS) can be found on the ACS website. Supporting documentation including code lists, subject definitions, data accuracy, and statistical testing, and a full list of ACS tables and table shells (without estimates) can be found on the Technical Documentation section of the ACS website. Sample size and data quality measures (including coverage rates, allocation rates, and response rates) can be found on the American Community Survey website in the Methodology section.
Source: U.S. Census Bureau, 2022 American Community Survey 1-Year Estimates.
Data are based on a sample and are subject to sampling variability. The degree of uncertainty for an estimate arising from sampling variability is represented through the use of a margin of error. The value shown here is the 90 percent margin of error. The margin of error can be interpreted roughly as providing a 90 percent probability that the interval defined by the estimate minus the margin of error and the estimate plus the margin of error (the lower and upper confidence bounds) contains the true value. In addition to sampling variability, the ACS estimates are subject to nonsampling error (for a discussion of nonsampling variability, see ACS Technical Documentation). The effect of nonsampling error is not represented in these tables.
The Hispanic origin and race codes were updated in 2020. For more information on the Hispanic origin and race code changes, please visit the American Community Survey Technical Documentation website.
The 2022 American Community Survey (ACS) data generally reflect the March 2020 Office of Management and Budget (OMB) delineations of metropolitan and micropolitan statistical areas. In certain instances the names, codes, and boundaries of the principal cities shown in ACS tables may differ from the OMB delineations due to differences in the effective dates of the geographic entities.
Estimates of urban and rural populations, housing units, and characteristics reflect boundaries of urban areas defined based on 2020 Census data. As a result, data for urban and rural areas from the ACS do not necessarily reflect the results of ongoing urbanization.
Explanation of Symbols:
"-": The estimate could not be computed because there were an insufficient number of sample observations. For a ratio of medians estimate, one or both of the median estimates falls in the lowest interval or highest interval of an open-ended distribution. For a 5-year median estimate, the margin of error associated with a median was larger than the median itself.
"N": The estimate or margin of error cannot be displayed because there were an insufficient number of sample cases in the selected geographic area.
"(X)": The estimate or margin of error is not applicable or not available.
"median-": The median falls in the lowest interval of an open-ended distribution (for example "2,500-").
"median+": The median falls in the highest interval of an open-ended distribution (for example "250,000+").
"**": The margin of error could not be computed because there were an insufficient number of sample observations.
"***": The margin of error could not be computed because the median falls in the lowest interval or highest interval of an open-ended distribution.
"*****": A margin of error is not appropriate because the corresponding estimate is controlled to an independent population or housing estimate. Effectively, the corresponding estimate has no sampling error and the margin of error may be treated as zero.
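When these tables are read as text, the symbols listed above appear where a number would otherwise be. The sketch below shows one possible way to separate numeric estimates from annotation symbols. It assumes estimate cells arrive as strings such as "1,234", "N", or "250,000+"; that representation, the function name, and the shortened annotation wording are assumptions, not part of the ACS documentation.

```python
# Shortened paraphrases of the symbol meanings listed above.
ANNOTATIONS = {
    "-": "estimate could not be computed",
    "N": "too few sample cases to display",
    "(X)": "not applicable or not available",
    "**": "margin of error could not be computed (too few observations)",
    "***": "margin of error could not be computed (open-ended median)",
    "*****": "estimate is controlled; margin of error may be treated as zero",
}


def parse_cell(raw: str) -> tuple:
    """Return (numeric value or None, annotation or None) for one table cell."""
    text = raw.strip()
    if text in ANNOTATIONS:
        return None, ANNOTATIONS[text]
    if text.endswith("+"):
        return float(text[:-1].replace(",", "")), "median in highest open-ended interval"
    if text.endswith("-"):
        return float(text[:-1].replace(",", "")), "median in lowest open-ended interval"
    return float(text.replace(",", "")), None


print(parse_cell("250,000+"))
print(parse_cell("N"))
```

Returning the annotation alongside the value keeps the reason for a missing or bounded estimate available instead of silently converting it to zero.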
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
The EEO Tabulation is sponsored by four Federal agencies consisting of the Equal Employment Opportunity Commission (EEOC), the Employment Litigation Section of the Civil Rights Division at the Department of Justice (DOJ), the Office of Federal Contract Compliance Programs (OFCCP), and the Office of Personnel Management (OPM), and developed in conjunction with the U.S. Census Bureau.
Supporting documentation on code lists and subject definitions can be found on the Equal Employment Opportunity Tabulation website: https://www.census.gov/topics/employment/equal-employment-opportunity-tabulation.html
Source: U.S. Census Bureau, 2014-2018 American Community Survey.
Data are based on a sample and are subject to sampling variability. The degree of uncertainty for an estimate arising from sampling variability is represented through the use of a margin of error. The value shown here is the 90 percent margin of error. The margin of error can be interpreted roughly as providing a 90 percent probability that the interval defined by the estimate minus the margin of error and the estimate plus the margin of error (the lower and upper confidence bounds) contains the true value. In addition to sampling variability, the ACS estimates are subject to nonsampling error (for a discussion of nonsampling variability, see https://www.census.gov/programs-surveys/acs/technical-documentation.html). The effect of nonsampling error is not represented in these tables.
The U.S. Census Bureau collects race data in accordance with guidelines provided by the U.S. Office of Management and Budget (OMB). Except for the total, all race and ethnicity categories are mutually exclusive. "Black" refers to Black or African American; "AIAN" refers to American Indian and Alaska Native; and "NHPI" refers to Native Hawaiian and Other Pacific Islander. "Balance of Not Hispanic or Latino" includes the balance of non-Hispanic individuals who reported multiple races or reported Some Other Race alone. For more information on race and Hispanic origin, see the Subject Definitions at https://www.census.gov/programs-surveys/acs/technical-documentation.html
Race and Hispanic origin are separate concepts on the American Community Survey. "White alone Hispanic or Latino" includes respondents who reported Hispanic or Latino origin and reported race as "White" and no other race. "All other Hispanic or Latino" includes respondents who reported Hispanic or Latino origin and reported a race other than "White," either alone or in combination.
Occupation titles and their 4-digit codes are based on the 2018 Standard Occupational Classification.
The 2014-2018 American Community Survey (ACS) data generally reflect the September 2018 Office of Management and Budget (OMB) delineations of metropolitan and micropolitan statistical areas. In certain instances the names, codes, and boundaries of the principal cities shown in ACS tables may differ from the OMB delineations due to differences in the effective dates of the geographic entities.
Explanation of Symbols:
An "-" entry in the estimate column indicates that either no sample observations or too few sample observations were available to compute an estimate, or a ratio of medians cannot be calculated because one or both of the median estimates falls in the lowest interval or upper interval of an open-ended distribution, or the margin of error associated with a median was larger than the median itself.
An "(X)" means that the estimate is not applicable or not available.
An "**" entry in the margin of error column indicates that either no sample observations or too few sample observations were available to compute a standard error and thus the margin of error. A statistical test is not appropriate.
An "***" entry in the margin of error column indicates that the median falls in the lowest interval or upper interval of an open-ended distribution. A statistical test is not appropriate.
An "*****" entry in the margin of error column indicates that the estimate is controlled. A statistical test for sampling variability is not appropriate.
An "N" entry in the estimate and margin of error columns indicates that data for this geographic area cannot be displayed because the number of sample cases is too small.
An "-" following a median estimate means the median falls in the lowest interval of an open-ended distribution.
An "+" following a median estimate means the median falls in the upper interval of an open-ended distribution.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Although the American Community Survey (ACS) produces population, demographic and housing unit estimates, for 2020, the 2020 Census provides the official counts of the population and housing units for the nation, states, counties, cities, and towns. For 2016 to 2019, the Population Estimates Program provides estimates of the population for the nation, states, counties, cities, and towns and intercensal housing unit estimates for the nation, states, and counties.
Supporting documentation on code lists, subject definitions, data accuracy, and statistical testing can be found on the American Community Survey website in the Technical Documentation section. Sample size and data quality measures (including coverage rates, allocation rates, and response rates) can be found on the American Community Survey website in the Methodology section.
Source: U.S. Census Bureau, 2016-2020 American Community Survey 5-Year Estimates.
Data are based on a sample and are subject to sampling variability. The degree of uncertainty for an estimate arising from sampling variability is represented through the use of a margin of error. The value shown here is the 90 percent margin of error. The margin of error can be interpreted roughly as providing a 90 percent probability that the interval defined by the estimate minus the margin of error and the estimate plus the margin of error (the lower and upper confidence bounds) contains the true value. In addition to sampling variability, the ACS estimates are subject to nonsampling error (for a discussion of nonsampling variability, see ACS Technical Documentation). The effect of nonsampling error is not represented in these tables.
For more information on understanding race and Hispanic origin data, please see the Census 2010 Brief entitled, Overview of Race and Hispanic Origin: 2010, issued March 2011 (pdf format).
The Hispanic origin and race codes were updated in 2020. For more information on the Hispanic origin and race code changes, please visit the American Community Survey Technical Documentation website.
The 2016-2020 American Community Survey (ACS) data generally reflect the September 2018 Office of Management and Budget (OMB) delineations of metropolitan and micropolitan statistical areas. In certain instances, the names, codes, and boundaries of the principal cities shown in ACS tables may differ from the OMB delineation lists due to differences in the effective dates of the geographic entities.
Estimates of urban and rural populations, housing units, and characteristics reflect boundaries of urban areas defined based on Census 2010 data. As a result, data for urban and rural areas from the ACS do not necessarily reflect the results of ongoing urbanization.
Explanation of Symbols:
"-": The estimate could not be computed because there were an insufficient number of sample observations. For a ratio of medians estimate, one or both of the median estimates falls in the lowest interval or highest interval of an open-ended distribution.
"N": The estimate or margin of error cannot be displayed because there were an insufficient number of sample cases in the selected geographic area.
"(X)": The estimate or margin of error is not applicable or not available.
"median-": The median falls in the lowest interval of an open-ended distribution (for example "2,500-").
"median+": The median falls in the highest interval of an open-ended distribution (for example "250,000+").
"**": The margin of error could not be computed because there were an insufficient number of sample observations.
"***": The margin of error could not be computed because the median falls in the lowest interval or highest interval of an open-ended distribution.
"*****": A margin of error is not appropriate because the corresponding estimate is controlled to an independent population or housing estimate. Effectively, the corresponding estimate has no sampling error and the margin of error may be treated as zero.
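The margin-of-error interpretation described in this entry reduces to simple arithmetic: the confidence bounds are the estimate minus and plus the published margin. The sketch below illustrates this, together with the conversion from a 90 percent margin of error to a standard error using the factor 1.645 that the Census Bureau documents for ACS margins. The function names and the sample figures in the usage note are hypothetical.

```python
from typing import Tuple

# z-score for the 90 percent confidence level used for published ACS margins of error.
Z_90 = 1.645


def confidence_bounds(estimate: float, moe: float) -> Tuple[float, float]:
    """Return the lower and upper 90 percent confidence bounds."""
    return estimate - moe, estimate + moe


def standard_error(moe: float, z: float = Z_90) -> float:
    """Recover the standard error from a published 90 percent margin of error."""
    return moe / z


def rescale_moe(moe: float, z_target: float = 1.96) -> float:
    """Convert a 90 percent margin of error to another confidence level.

    The default z_target of 1.96 corresponds to a 95 percent level.
    """
    return standard_error(moe) * z_target


def controlled_bounds(estimate: float) -> Tuple[float, float]:
    """Bounds for a controlled estimate ('*****'), whose margin is treated as zero."""
    return confidence_bounds(estimate, 0.0)
```

For a hypothetical estimate of 1,000 with a published margin of 50, `confidence_bounds(1000, 50)` gives the interval from 950 to 1,050. These bounds capture sampling variability only; nonsampling error is not reflected.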
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Overview
This repository contains the coverage time distributions used to produce the figures and statistics for the paper:
S. Sousa, V. Nicosia "Quantifying ethnic segregation in cities through random walks". arXiv: https://arxiv.org/abs/2010.10462
Data
The ccp.zip file contains two subfolders with the coverage time distributions for the US and UK systems. Each file contains a line per node of the network with the format:
"Node ID" "[list with the CCT for each fraction c]"
Note that each line always contains 101 columns: the first column identifies the node, and the remaining 100 give the average coverage time to reach a fraction c of classes.
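A small reader for this line format might look like the sketch below. The exact delimiters are not specified here, so the sketch assumes whitespace-separated columns and tolerates brackets, commas, and double quotes around the list. It also assumes node IDs contain no spaces. The function name `parse_cct_line` is hypothetical and does not come from the authors' repository.

```python
import re
from typing import List, Tuple

N_COLUMNS = 101  # one node ID followed by 100 coverage times, per the description


def parse_cct_line(line: str) -> Tuple[str, List[float]]:
    """Parse one line into (node_id, coverage_times).

    Assumption: columns are separated by whitespace; brackets, commas, and
    double quotes are treated as separators in case the list is written
    like a Python list.
    """
    tokens = re.sub(r'[\[\],"]', " ", line).split()
    if len(tokens) != N_COLUMNS:
        raise ValueError(f"expected {N_COLUMNS} columns, found {len(tokens)}")
    node_id = tokens[0]
    coverage_times = [float(token) for token in tokens[1:]]
    return node_id, coverage_times


def read_cct_file(path: str) -> List[Tuple[str, List[float]]]:
    """Read every non-empty line of a coverage time file."""
    with open(path, encoding="utf-8") as handle:
        return [parse_cct_line(line) for line in handle if line.strip()]
```

Checking the column count on every line catches truncated or differently delimited files early, before they distort any statistic computed from the distributions.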
The dfa.zip file contains the following folders:
The synthetic.zip file contains the coverage time distributions for the experiment with different lattice sizes (scale-test) and the experiment with distinct spatial patterns for the population distribution (topology-test). The file format is the same as in ccp.zip.
Code
Readers interested in replicating the methods used to create the data can obtain the Python scripts from the following repository:
https://github.com/segregation-rw/ethnic-segregation-rw
Note that the repository also includes the code to simulate the CCT random walks on the adjacency graphs so that the whole simulation can be replicated.
This dataset contains Race/Ethnicity codes. It is used to enter patient demographic information.