Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Estimating differences between racial/ethnic groups often requires merging demographic variables from one dataset to variables of interest in another. A common method merges Home Mortgage Disclosure Act data to property databases. One alternative is to acquire this information from voter registration files; another is to predict race with a name-based algorithm. Compared to Census data, which method is more representative varies by location and group. We explore the practical implications of each method by using the matched samples in two empirical applications. Researchers can arrive at different conclusions about racial/ethnic disparities depending on the method selected.
ODC Public Domain Dedication and Licence (PDDL) v1.0 http://www.opendatacommons.org/licenses/pddl/1.0/
License information was derived automatically
A. SUMMARY This dataset includes San Francisco COVID-19 tests by race/ethnicity and by date. This dataset represents the daily count of tests collected, and the breakdown of test results (positive, negative, or indeterminate). Tests in this dataset include all those collected from persons who listed San Francisco as their home address at the time of testing. It also includes tests that were collected by San Francisco providers for persons who were missing a locating address. This dataset does not include tests for residents listing a locating address outside of San Francisco, even if they were tested in San Francisco.
The data were de-duplicated by individual and date, so if a person gets tested multiple times on different dates, all tests will be included in this dataset (on the day each test was collected). If a person tested multiple times on the same date, only one test is included from that date. When there are multiple tests on the same date, a positive result, if one exists, will always be selected as the record for the person. If a PCR and antigen test are taken on the same day, the PCR test will supersede. If a person tests multiple times on the same day and the results are all the same (e.g. all negative or all positive) then the first test done is selected as the record for the person.
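The de-duplication rule described above can be sketched as follows. This is a minimal illustration, not the health department's pipeline: the field names (`person_id`, `date`, `collected_at`, `test_type`, `result`) are hypothetical, and the order in which the rules are applied (positive result first, then PCR over antigen, then earliest test) is an assumption, since the description does not say which rule wins when they conflict.

```python
from collections import defaultdict

def select_daily_record(tests):
    """Keep one test per (person, collection date)."""
    groups = defaultdict(list)
    for t in tests:
        groups[(t["person_id"], t["date"])].append(t)

    def priority(t):
        # Tuples sort element by element; False sorts before True.
        return (
            t["result"] != "positive",  # a positive result wins if one exists
            t["test_type"] != "PCR",    # then PCR supersedes antigen (assumed order)
            t["collected_at"],          # then the first test done that day
        )

    return [min(group, key=priority) for group in groups.values()]

# Hypothetical records for illustration only.
tests = [
    {"person_id": 1, "date": "2021-01-05", "collected_at": "09:00",
     "test_type": "antigen", "result": "negative"},
    {"person_id": 1, "date": "2021-01-05", "collected_at": "14:00",
     "test_type": "PCR", "result": "positive"},
    {"person_id": 1, "date": "2021-01-06", "collected_at": "10:00",
     "test_type": "PCR", "result": "negative"},
    {"person_id": 2, "date": "2021-01-05", "collected_at": "08:00",
     "test_type": "PCR", "result": "negative"},
    {"person_id": 2, "date": "2021-01-05", "collected_at": "11:00",
     "test_type": "PCR", "result": "negative"},
]
selected = select_daily_record(tests)
```

Person 1 keeps one record for each of the two dates, and person 2 keeps only the earlier of the two identical negative results.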
The total number of positive test results is not equal to the total number of COVID-19 cases in San Francisco.
When a person gets tested for COVID-19, they may be asked to report information about themselves. One piece of information that might be requested is a person's race and ethnicity. These data are often incomplete in the laboratory and provider reports of the test results sent to the health department. The data can be missing or incomplete for several possible reasons:
• The person was not asked about their race and ethnicity.
• The person was asked, but refused to answer.
• The person answered, but the testing provider did not include the person's answers in the reports.
• The testing provider reported the person's answers in a format that could not be used by the health department.
For any of these reasons, a person's race/ethnicity will be recorded in the dataset as “Unknown.”
B. NOTE ON RACE/ETHNICITY The different values for Race/Ethnicity in this dataset are "Asian;" "Black or African American;" "Hispanic or Latino/a, all races;" "American Indian or Alaska Native;" "Native Hawaiian or Other Pacific Islander;" "White;" "Multi-racial;" "Other;" and "Unknown."
The Race/Ethnicity categorization increases data clarity by emulating the methodology used by the U.S. Census in the American Community Survey. Specifically, the categories "Asian," "Black or African American," "American Indian or Alaska Native," "Native Hawaiian or Other Pacific Islander," "White," "Multi-racial," and "Other" do NOT include any person who identified as Hispanic/Latino at any time in testing reports that identified them either (1) as an SF resident or (2) as someone tested by an SF provider without a locating address. All persons across all races who identify as Hispanic/Latino are recorded as "Hispanic or Latino/a, all races." This categorization increases data accuracy by correcting the way “Other” persons were counted. Previously, when a person reported “Other” for Race/Ethnicity, they would be recorded as “Unknown.” Under the new categorization, they are counted as “Other” and are distinct from “Unknown.”
If a person records their race/ethnicity as “Asian,” “Black or African American,” “American Indian or Alaska Native,” “Native Hawaiian or Other Pacific Islander,” “White,” or “Other” for their first COVID-19 test, then this data will not change—even if a different race/ethnicity is reported for this person for any future COVID-19 test. There are two exceptions to this rule. The first exception is if a person’s race/ethnicity value is reported as “Unknown” on their first test and then on a subsequent test they report “Asian;” "Black or African American;" "Hispanic or Latino/a, all races;" "American Indian or Alaska Native;" "Native Hawaiian or Other Pacific Islander;" or "White”, then this subsequent reported race/ethnicity will overwrite the previous recording of “Unknown”. If a person has only ever selected “Unknown” as their race/ethnicity, then it will be recorded as “Unknown.” This change provides more specific and actionable data on who is tested in San Francisco.
The second exception is if a person ever marks “Hispanic or Latino/a, all races” for race/ethnicity then this choice will always overwrite any previous or future response. This is because it is an overarching category that can include any and all other races and is mutually exclusive with the other responses.
A person's race/ethnicity will be recorded as “Multi-racial” if they select two or more values among the following choices: “Asian,” “Black or African American,” “American Indian or Alaska Native,” “Native Hawaiian or Other Pacific Islander,” “White,” or “Other.” If a person selects a combination of two or more race/ethnicity answers that includes “Hispanic or Latino/a, all races” then they will still be recorded as “Hispanic or Latino/a, all races”—not as “Multi-racial.”
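The recording rules above can be sketched in code. This is an illustrative reading, not the department's implementation: the function names are hypothetical, reports are assumed to arrive in chronological order, and only the values the description explicitly lists are allowed to overwrite an initial "Unknown".

```python
HISPANIC = "Hispanic or Latino/a, all races"
UNKNOWN = "Unknown"
# Values the description lists as able to overwrite an initial "Unknown".
OVERWRITES_UNKNOWN = {
    "Asian", "Black or African American", HISPANIC,
    "American Indian or Alaska Native",
    "Native Hawaiian or Other Pacific Islander", "White",
}

def categorize_report(selections):
    """Collapse the answers on one test report into a single category."""
    chosen = set(selections) - {UNKNOWN}
    if not chosen:
        return UNKNOWN
    if HISPANIC in chosen:   # Hispanic/Latino takes precedence over Multi-racial
        return HISPANIC
    if len(chosen) >= 2:     # two or more of the remaining choices
        return "Multi-racial"
    return next(iter(chosen))

def recorded_race_ethnicity(reports):
    """reports: one list of selections per test, oldest test first."""
    categories = [categorize_report(r) for r in reports]
    if not categories:
        return UNKNOWN
    if HISPANIC in categories:    # second exception: overrides past and future
        return HISPANIC
    first = categories[0]
    if first != UNKNOWN:          # default: the first test's value does not change
        return first
    for later in categories[1:]:  # first exception: replace an initial Unknown
        if later in OVERWRITES_UNKNOWN:
            return later
    return UNKNOWN
```

For example, `recorded_race_ethnicity([["Unknown"], ["White"]])` yields "White", while `recorded_race_ethnicity([["Asian"], ["White"]])` stays "Asian".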
C. HOW THE DATASET IS CREATED COVID-19 laboratory test data is based on electronic laboratory test reports. Deduplication, quality assurance measures and other data verification processes maximize accuracy of laboratory test information.
D. UPDATE PROCESS Updates automatically at 5:00AM Pacific Time each day. Redundant runs are scheduled at 7:00AM and 9:00AM in case of pipeline failure.
E. HOW TO USE THIS DATASET San Francisco population estimates for race/ethnicity can be found in a view based on the San Francisco Population and Demographic Census dataset. These population estimates are from the 2016-2020 5-year American Community Survey (ACS).
Due to the high degree of variation in the time needed to complete tests by different labs there is a delay in this reporting. On March 24, 2020 the Health Officer ordered all labs in the City to report complete COVID-19 testing information to the local and state health departments.
In order to track trends over time, a user can analyze this data by sorting or filtering by the "specimen_collection_date" field.
Calculating Percent Positivity: The positivity rate is the percentage of tests that return a positive result for COVID-19 (positive tests divided by the sum of positive and negative tests). Indeterminate results, which could not conclusively determine whether COVID-19 virus was present, are not included in the calculation of percent positive. When there are fewer than 20 positive tests for a given race/ethnicity and time period, the positivity rate is not calculated for the public tracker because rates of small test counts are less reliable.
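As a rough sketch of this calculation (the function and parameter names are hypothetical, and returning `None` for a suppressed rate is an assumption about how "not calculated" might be represented):

```python
def percent_positivity(positive, negative, indeterminate=0, min_positives=20):
    """Return the percent of tests that were positive, or None if suppressed."""
    # Indeterminate results are accepted as an argument but deliberately
    # left out of both the numerator and the denominator.
    if positive < min_positives:
        return None
    return 100.0 * positive / (positive + negative)

# Hypothetical counts: 50 positive, 950 negative, 25 indeterminate.
example_rate = percent_positivity(50, 950, 25)
suppressed = percent_positivity(19, 500)
```

With the hypothetical counts above, the 25 indeterminate results do not affect the rate, and the second call is suppressed because it has fewer than 20 positive tests.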
Calculating Testing Rates: To calculate the testing rate per 10,000 residents, divide the total number of tests collected (positive, negative, and indeterminate results) for the specified race/ethnicity by the total number of residents who identify as that race/ethnicity (according to the 2016-2020 American Community Survey (ACS) population estimate), then multiply by 10,000. When there are fewer than 20 total tests for a given race/ethnicity and time period, the testing rate is not calculated for the public tracker because rates of small test counts are less reliable.
Read more about how this data is updated and validated daily: https://sf.gov/information/covid-19-data-questions
F. CHANGE LOG
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset tabulates the Non-Hispanic population of Manns Choice by race. It includes the distribution of the Non-Hispanic population of Manns Choice across various race categories as identified by the Census Bureau. The dataset can be utilized to understand the Non-Hispanic population distribution of Manns Choice across relevant racial categories.
Key observations
Of the Non-Hispanic population in Manns Choice, the largest racial group is White alone with a population of 265 (91.70% of the total Non-Hispanic population).
When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.
Racial categories include:
Variables / Data Columns
Good to know
Margin of Error
Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.
Custom data
If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to ask about the feasibility of a custom tabulation on a fee-for-service basis.
The Neilsberg Research team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.
This dataset is a part of the main dataset for Manns Choice Population by Race & Ethnicity. You can refer to the same here
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The purpose of this study is to advance our thinking about race and racism in geospatial analyses of school choice policy. To do so, we present a critical race spatial analysis of Detroit students’ suburban school choices. To frame our study, we describe the racial and spatial dynamics of school choice, drawing in particular on the concepts of opportunity hoarding and predatory landscapes. We find that Detroit students’ suburban school choices were circumscribed by racial geography and concentrated in just a handful of schools and districts. We also find notable differences between students in different racial groups. For all Detroit exiters, their schools were significantly more segregated and lower quality than those of their suburban peers. We propose future directions for research on families’ school choices as well as school and district behavior at the intersection of race, geography, and school choice policy.
This research result used data structured and maintained by the MERI-Michigan Education Data Center (MEDC). MEDC data are modified for analysis purposes using rules governed by MEDC and are not identical to those data collected and maintained by the Michigan Department of Education (MDE) and/or Michigan’s Center for Educational Performance and Information (CEPI). Results, information, and opinions solely represent the analysis, information, and opinions of the author and are not endorsed by, or reflect the views or positions of, grantors, MDE, and CEPI or any employee thereof. All errors are my own.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This article uses a recent first name list to develop an improvement to an existing Bayesian classifier, namely the Bayesian Improved Surname Geocoding (BISG) method, which combines surname and geography information to impute missing race/ethnicity. The new Bayesian Improved First Name Surname Geocoding (BIFSG) method is validated using a large sample of mortgage applicants who self-report their race/ethnicity. BIFSG outperforms BISG, in terms of accuracy and coverage, for all major racial/ethnic categories. Although the overall magnitude of improvement is somewhat small, the largest improvements occur for non-Hispanic Blacks, a group for which the BISG performance is weakest. When estimating the race/ethnicity effects on mortgage pricing and underwriting decisions with regression models, estimation biases from both BIFSG and BISG are very small, with BIFSG generally having smaller biases, and the maximum a posteriori classifier resulting in smaller biases than through use of estimated probabilities. Robustness checks using voter registration data confirm BIFSG's improved performance vis-a-vis BISG and illustrate BIFSG's applicability to areas other than mortgage lending. Finally, I demonstrate an application of the BIFSG to the imputation of missing race/ethnicity in the Home Mortgage Disclosure Act data, and in the process, offer novel evidence that the incidence of missing race/ethnicity information is correlated with race/ethnicity.
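The Bayesian update behind these classifiers can be illustrated with a small sketch. It assumes the commonly cited formulation in which surname, first name, and geography are conditionally independent given race/ethnicity, so the posterior is proportional to P(race | surname) × P(first name | race) × P(geography | race); the article's exact implementation may differ. All probabilities below are hypothetical numbers, not values from any Census or first-name list.

```python
def posterior(*tables):
    """Multiply per-group factors and normalize so the result sums to 1."""
    groups = tables[0].keys()
    unnormalized = {}
    for group in groups:
        product = 1.0
        for table in tables:
            product *= table[group]
        unnormalized[group] = product
    total = sum(unnormalized.values())
    return {group: value / total for group, value in unnormalized.items()}

# Hypothetical inputs for one applicant (illustration only).
p_race_given_surname = {"White": 0.60, "Black": 0.30, "Hispanic": 0.10}
p_geo_given_race = {"White": 0.001, "Black": 0.004, "Hispanic": 0.002}
p_first_given_race = {"White": 0.002, "Black": 0.006, "Hispanic": 0.001}

# BISG-style update: surname prior combined with geography.
bisg = posterior(p_race_given_surname, p_geo_given_race)
# BIFSG-style update: the first-name factor is added.
bifsg = posterior(p_race_given_surname, p_first_given_race, p_geo_given_race)
# Maximum a posteriori classifier: pick the most probable group.
map_class = max(bifsg, key=bifsg.get)
```

The two ways of using the output that the abstract compares correspond to keeping the full `bifsg` probability vector versus collapsing it to `map_class`.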
This dataset includes race/ethnicity of newly Medi-Cal eligible individuals who identified their race/ethnicity as Hispanic, White, Other Asian or Pacific Islander, Black, Chinese, Filipino, Vietnamese, Asian Indian, Korean, Alaskan Native or American Indian, Japanese, Cambodian, Samoan, Laotian, Hawaiian, Guamanian, Amerasian, or Other, by reporting period. The race/ethnicity data is from the Medi-Cal Eligibility Data System (MEDS) and includes eligible individuals without prior Medi-Cal Eligibility. This dataset is part of the public reporting requirements set forth in California Welfare and Institutions Code 14102.5.
This statistic shows the percentage of adults in the U.S. who would turn to select sources in case of burnout as of February 2017, by ethnicity. It was found that ** percent of African American respondents stated they would turn to family/friends in case of burnout, compared to ** percent of Hispanic respondents.
A study released in November 2022 revealed that most U.S. subscribers to ad-supported video-on-demand tiers or ad-free plans were white, whereas just **** percent were Asian. Both white and Asian users, as well as Hispanic Americans, preferred ad-free plans, whereas Black subscribers were more likely to use ad-supported tiers than ad-free options.
Pursuant to Local Laws 126, 127, and 128 of 2016, certain demographic data is collected voluntarily and anonymously from persons voluntarily seeking social services. This data can be used by agencies and the public to better understand the demographic makeup of client populations and to better understand and serve residents of all backgrounds and identities. The data presented here has been collected through either electronic form or paper surveys offered at the point of application for services. These surveys are anonymous. Each record represents an anonymized demographic profile of an individual applicant for social services, disaggregated by response option, agency, and program. Response options include information regarding ancestry, race, primary and secondary languages, English proficiency, gender identity, and sexual orientation.
Idiosyncrasies or Limitations: Note that while the dataset contains the total number of individuals who have identified their ancestry or languages spoken, because such data is collected anonymously, there may be instances of a single individual completing multiple voluntary surveys. Additionally, the survey being both voluntary and anonymous has advantages as well as disadvantages: it increases the likelihood of full and honest answers, but since it is not connected to the individual case, it does not directly inform delivery of services to the applicant. The paper and online versions of the survey ask the same questions but free-form text is handled differently. Free-form text fields are expected to be entered in English although the form is available in several languages. Surveys are presented in 11 languages.
Paper Surveys
1. Are optional
2. Survey taker is expected to specify the agency that provides service
3. Survey taker can skip or elect not to answer questions
4. Invalid/unreadable data may be entered for survey date, or the date may be skipped
5. OCRing of free-form text fields may fail
6. Analytical value of free-form text answers is unclear
Online Survey
1. Is optional
2. Agency is defaulted based on the URL
3. Some questions must be answered
4. Date of survey is automated
Dataset, GDB, and Online Map created by Renee Haley, NMCDC, May 2023 DATA ACQUISITION PROCESS
Scope and purpose of project: New Mexico is struggling to maintain its healthcare workforce, particularly in Rural areas. This project was undertaken with the intent of looking at flows of healthcare workers into and out of New Mexico at the most granular geographic level possible. This dataset, in combination with others (such as housing cost and availability data) may help us understand where our healthcare workforce is relocating and why.
The most relevant and detailed data on workforce indicators in the United States is housed by the Census Bureau's Longitudinal Employer-Household Dynamics, LEHD, System. Information on this system is available here:
The Job-to-Job flows explorer within this system was used to download the data. Information on the J2J explorer can be found here:
https://j2jexplorer.ces.census.gov/explore.html#1432012
The dataset was built from data queried with the LED Extraction Tool, which allows for the query of more intersectional and detailed data than the explorer. This is a link to the LED extraction tool:
https://ledextract.ces.census.gov/
The geographies used are US Metro areas as determined by the Census, (N=389). The shapefile is named lehd_shp_gb.zip, and can be downloaded under this section of the following webpage: 5.5. Job-to-Job Flow Geographies, 5.5.1. Metropolitan (Complete). A link to the download site is available below:
https://lehd.ces.census.gov/data/schema/j2j_latest/lehd_shapefiles.html
DATA CLEANING PROCESS
This dataset was built from 8 non-intersectional datasets downloaded from the LED Extraction Tool.
Separate datasets were downloaded in order to obtain detailed information on the race, ethnicity, and educational attainment levels of healthcare workers and where they are migrating.
Datasets included information for the four separate quarters of 2021. It was not possible to download annual data, only quarterly. Quarterly data was summed in a later step to derive annual totals for 2021.
4 datasets for healthcare workers moving OUT OF New Mexico, with details on race, ethnicity, and educational attainment, were downloaded. Dataset 1 contained information on educational attainment, Dataset 2 contained information on 7 racial categories identifying as non-Hispanic, Dataset 3 contained information on those same 7 categories also identifying as Hispanic, and Dataset 4 contained information for workers identifying as white and Hispanic.
4 datasets for healthcare workers moving INTO New Mexico, with details on race, ethnicity, and educational attainment, were downloaded with the same details outlined above.
Each dataset was cleaned according to a data template that kept key attributes and discarded excess information. Within each dataset, the J2J Indicators reflecting 6 different types of job migration were totaled in order to simplify analysis, as this information was not needed in detail.
After cleaning, the 4 datasets for workers moving INTO New Mexico were joined. The process was repeated for workers moving OUT OF New Mexico. This resulted in 2 main datasets.
These 2 main datasets still listed all of the variables by each quarter of 2021. Because of this the data was split in JMP, so that attributes of educational attainment, race and ethnicity, of workers migrating by quarter were moved from rows to columns. After this, summary columns for the year of 2021 were derived. This resulted in totals columns for workers identifying as: 6 separate races and all ethnicities, all races and Hispanic, white-Hispanic, and workers of 6 different education levels, reflecting how many workers of each indicator migrated to and from metro areas in New Mexico in 2021.
The data split transposed duplicate rows reflecting differing worker attributes within the same metro area, resulting in one row for each metro area and reflecting the attributes in columns, thus resulting in a mappable dataset.
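The reshaping described above was done in JMP and Excel; the sketch below only illustrates the same idea in Python under stated assumptions. The row layout, the field names (`metro_area`, `quarter`, `attribute`, `count`), and the metro area labels are hypothetical, and the attribute codes E1 and E4 are borrowed from the education categories listed in the variable notes.

```python
from collections import defaultdict

def annual_wide_table(rows):
    """Sum quarterly counts and move worker attributes from rows to columns."""
    wide = defaultdict(lambda: defaultdict(int))
    for row in rows:
        # One output row per metro area, one column per worker attribute;
        # the four quarterly values are added into a single annual total.
        wide[row["metro_area"]][row["attribute"]] += row["count"]
    return {metro: dict(columns) for metro, columns in wide.items()}

# Hypothetical long-format input: one row per metro area, quarter, attribute.
rows = [
    {"metro_area": "Metro A", "quarter": 1, "attribute": "E4", "count": 10},
    {"metro_area": "Metro A", "quarter": 2, "attribute": "E4", "count": 15},
    {"metro_area": "Metro A", "quarter": 1, "attribute": "E1", "count": 3},
    {"metro_area": "Metro B", "quarter": 4, "attribute": "E4", "count": 7},
]
table = annual_wide_table(rows)
```

Attributes with no rows for a metro area are simply absent from that area's columns, which loosely mirrors the null values the mapping step later preserves.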
The 2 datasets were joined (on Metro Area) resulting in one master file containing information on healthcare workers entering and leaving New Mexico.
Rows (N=389) reflect all of the metro areas across the US, and each state. Rows include the 5 metro areas within New Mexico, and New Mexico State.
Columns (N=99) contain information on worker race, ethnicity and educational attainment, specific to each metro area in New Mexico.
78 of these columns reflect workers of specific attributes moving OUT OF the 5 specific Metro Areas in New Mexico and totals for NM State. This level of detail is intended for analyzing who is leaving what area of New Mexico, where they are going to, and why.
13 columns reflect each worker attribute for healthcare workers moving INTO New Mexico by race, ethnicity and education level. Because all 5 metro areas and New Mexico state are contained in the rows, this information for incoming workers is available by metro area and at the state level. There is less possibility for mapping these attributes, since it was not realistic or possible to create a dataset reflecting all of these variables for every healthcare worker from every metro area in the US also coming into New Mexico (that dataset would have over 1,000 columns and be unmappable). Therefore this dataset is easier to utilize in looking at why workers are leaving the state but also includes detailed information on who is coming in.
The remaining 8 columns contain geographic information.
GIS AND MAPPING PROCESS
The master file was opened in ArcGIS Pro and the shapefile of US Metro Areas was also imported.
The Excel file was joined to the shapefile by Metro Area Name, as the names matched exactly.
The resulting layer was exported as a GDB in order to retain null values, which would turn to zeros if exported as a shapefile.
This GDB was uploaded to ArcGIS Online, aliases were inserted as column header names, and the layer was visualized as desired.
SYSTEMS USED
MS Excel was used for data cleaning, summing NM state totals, and summing quarterly to annual data.
JMP was used to transpose, join, and split data.
ArcGIS Desktop was used to create the shapefile uploaded to NMCDC's online platform.
VARIABLE AND RECODING NOTES
Summary of variables selected for datasets downloaded focused on educational attainment:
J2J Flows by Educational Attainment
Summary of variables selected for datasets downloaded focused on race and ethnicity:
J2J Flows by Race and Ethnicity
Note: Variables in Datasets 1 through 4 were downloaded twice, once for workers coming into New Mexico and once for those leaving NM.
Columns: VARIABLE; LEHD VARIABLE DEFINITION; LEHD VARIABLE NOTES; DETAILS OR URL FOR RAW DATA DOWNLOAD
Geography Type - State Origin and Destination State
Data downloaded for worker migration into and out of all US States
Geography Type - Metropolitan Areas Origin and Dest Metro Area
Data downloaded for worker migration into and out of all US Metro Areas
NAICS sectors North American Industry Classification System Under Firm Characteristics Only downloaded for Healthcare and Social Assistance Sectors
Other Firm Characteristics No Firm Age / Size Detail Under Firm Characteristics Downloaded data on all firm ages, sizes, and other details.
Worker Characteristics Education, Race, Ethnicity
Non Intersectional data aside from Race / Ethnicity data.
Sex Gender
0 - All Sexes Selected
Age Age
A00 All Ages (14-99)
Education Education Level E0, E1, E2, E3, E4, E5 E0 - All Education Categories, E1 - Less than high school, E2 - High school or equivalent, no college, E3 - Some college or Associate’s degree, E4 - Bachelor's degree or advanced degree, E5 - Educational attainment not available (workers aged 24 or younger)
Dataset 1 All Education Levels, E1, E2, E3, E4, and E5
RACE
A0, A1, A2, A3, A4, A5 OPTIONS: A0 All Races, A1 White Alone, A2 Black or African American Alone, A3 American Indian or Alaska Native Alone, A4 Asian Alone, A5 Native Hawaiian or Other Pacific Islander Alone, A7 Two or More Race Groups
ETHNICITY
A0, A1, A2 OPTIONS: A0 All Ethnicities, A1 Not Hispanic or Latino, A2 Hispanic or Latino
Dataset 2 All Races (A0) and All Ethnicities (A0)
Dataset 3 6 Races (A1 through A5) and All Ethnicities (A0)
Dataset 4 White (A1) and Hispanic or Latino (A2)
Quarter Quarter and Year
Data from all quarters of 2021 to sum into annual numbers; yearly data was not available
Employer type Sector: Private or Governmental
Query included all healthcare sector workflows from all employer types and firm sizes from every quarter of 2021
J2J indicator categories Detailed types of job migration
All options were selected for all datasets and totaled: AQHire, AQHireS, EE, EES, J2J, J2JS. Counts were selected vs. earnings, and data was not seasonally adjusted (unavailable).
NOTES AND RESOURCES
The following resources and documentation were used to navigate the LEHD and J2J Worker Flows system and to answer questions about variables:
https://lehd.ces.census.gov/data/schema/j2j_latest/lehd_public_use_schema.html
https://www.census.gov/history/www/programs/geography/metropolitan_areas.html
https://lehd.ces.census.gov/data/schema/j2j_latest/lehd_csv_naming.html
Statewide (New
Percent population by race and Hispanic origin for North Carolina and all counties, from the 2012-2016 American Community Survey.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Polling data routinely indicates broad support for the concept of diverse schools, but integration initiatives—both racial and socioeconomic—regularly encounter significant opposition. We leverage a nationally-representative survey experiment to provide novel evidence on public support for integration initiatives. Specifically, we present respondents with a hypothetical referendum where we provide information on two policy options for assigning students to schools: 1) A residence-based assignment option and 2) An option designed to achieve stated racial/ethnic or socioeconomic diversity targets, with respondents randomly assigned to the racial/ethnic or socioeconomic diversity option. After calculating public support and average willingness-to-pay, our results demonstrate a clear plurality of the public preferring residence-based assignment to the racial diversity initiative, but a near-even split in support for residence-based assignment and the socioeconomic integration initiative. Moreover, we find that the decline in support for race-based integration, relative to the socioeconomic diversity initiative, is entirely attributable to white and Republican respondents.
As of June 2024, there were around 3.09 million ethnic Chinese residents in Singapore. Singapore is a multi-ethnic society, with residents categorized into four main racial groups: Chinese, Malay, Indian, and Others. Each resident is assigned a racial category that follows the paternal side. This categorization has an impact on both official and private matters.
Modelling a peaceful, multi-ethnic society
The racial categorization used in Singapore stems from its colonial past and continues to shape its social policies, from public housing quotas in line with the ethnic composition of the country to education policies pertaining to second language, or ‘mother tongue’, instruction. Despite the emphasis on ethnicity and race, Singapore has managed to maintain a peaceful co-existence among its diverse population. Most Singaporeans across ethnic groups view the level of racial and religious harmony there to be moderately high. The level of acceptance and comfort with having people of other ethnicities in their social lives was also relatively high across the different ethnic groups.
Are Singaporeans ready to move away from the CMIO model of ethnic classification?
In recent times, however, there has been more open discussion on racism and the relevance of the CMIO (Chinese, Malay, Indian, Others) ethnic model for Singaporean society. The global discourse on racism has brought to attention the latent discrimination felt by the minority ethnic groups in Singapore, such as in the workplace. In 2010, Singapore introduced the option of having a ‘double-barreled’ race classification, reflecting the increasingly diverse and complicated ethnic background of its population. More than a decade later, there have been calls to do away with such racial classifications altogether. However, with social identity and policy deeply entrenched along these lines, it would be a challenge to move beyond race in Singapore.
In early February 2024, we will be retiring the Mpox Vaccinations Given to SF Residents by Demographics dataset. This dataset will be archived and will no longer be updated. A historic record of this data will remain available.
A. SUMMARY This dataset represents doses of mpox vaccine (JYNNEOS) administered in California to residents of San Francisco ages 18 years or older. This dataset only includes doses of the JYNNEOS vaccine given on or after 5/1/2022. All vaccines given to people who live in San Francisco are included, no matter where the vaccination took place. The data are broken down by multiple demographic stratifications.
B. HOW THE DATASET IS CREATED Information on doses administered to those who live in San Francisco is from the California Immunization Registry (CAIR2), run by the California Department of Public Health (CDPH). Information on individuals’ city of residence, age, race, ethnicity, and sex are recorded in CAIR2 and are self-reported at the time of vaccine administration. Because CAIR2 does not include information on sexual orientation, we pull information from the San Francisco Department of Public Health’s Epic Electronic Health Record (EHR). The populations represented in our Epic data and the CAIR2 data are different. Epic data only include vaccinations administered at SFDPH managed sites to SF residents.
Data notes for population characteristic types are listed below.
Age * Data only include individuals who are 18 years of age or older.
Race/ethnicity * The response option "Other Race" is categorized by the data source system, and the response option "Unknown" refers to a lack of data.
Sex * The response option "Other" is categorized by the source system, and the response option "Unknown" refers to a lack of data.
Sexual orientation * The response option “Unknown/Declined” refers to a lack of data or individuals who reported multiple different sexual orientations during their most recent interaction with SFDPH.
For convenience, we provide the 2020 5-year American Community Survey population estimates.
C. UPDATE PROCESS Updated daily via automated process.
D. HOW TO USE THIS DATASET This dataset includes many different types of demographic groups. Filter the “demographic_group” column to explore a topic area. Then, the “demographic_subgroup” column shows each group or category within that topic area and the total count of doses administered to that population subgroup.
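A minimal sketch of that filtering step is shown below. The column names `demographic_group` and `demographic_subgroup` come from the description; the count column name (`doses`), the group and subgroup labels, and the numbers are hypothetical placeholders.

```python
def subgroup_totals(rows, group):
    """Total doses per subgroup within one demographic group."""
    totals = {}
    for row in rows:
        if row["demographic_group"] == group:  # filter to one topic area
            key = row["demographic_subgroup"]
            totals[key] = totals.get(key, 0) + row["doses"]
    return totals

# Hypothetical rows for illustration only.
rows = [
    {"demographic_group": "Age", "demographic_subgroup": "18-24", "doses": 120},
    {"demographic_group": "Age", "demographic_subgroup": "25-34", "doses": 340},
    {"demographic_group": "Sex", "demographic_subgroup": "Unknown", "doses": 15},
]
age_totals = subgroup_totals(rows, "Age")
```

Filtering on a single group first avoids adding together counts from different stratifications of the same doses.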
E. CHANGE LOG
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This project provides code to replicate the exhibits for Ellison and Pathak (2020), "The Efficiency of Race-Neutral Alternatives to Race-Based Affirmative Action: Evidence from Chicago’s Exam Schools."
This dataset includes, by reporting period, the race of applicants for Insurance Affordability Programs (IAPs) who reported their race as American Indian and/or Alaska Native, Asian Indian, Black or African American, Chinese, Cambodian, Filipino, Guamanian or Chamorro, Hmong, Japanese, Korean, Laotian, Mixed Race, Native Hawaiian, Other, Other Asian, Other Pacific Islander, Samoan, Vietnamese, or White. The race data is from the California Healthcare Eligibility, Enrollment and Retention System (CalHEERS) and includes data from applications submitted directly to CalHEERS, to Covered California, and to County Human Services Agencies through the Statewide Automated Welfare System (SAWS) eHIT interface. Please note that the Other Asian reporting category was removed from the CalHEERS application in September 2017. This dataset is part of public reporting requirements set forth by the California Welfare and Institutions Code 14102.5.
In 2023, about **** million people in Washington were of Hispanic or Latino origin. Furthermore, there were about **** million white people and ******* Asian people living in Washington state in that year.
MIT Licensehttps://opensource.org/licenses/MIT
License information was derived automatically
RACE is a large-scale reading comprehension dataset with over 28,000 passages and nearly 100,000 questions. The dataset is collected from English exams for Chinese middle and high school students and can serve as training and test sets for machine comprehension tasks.
Data Format
article: a string containing the full reading passage. questions: a list of strings, one per question (questions are either statements or fill-in-the-blank items). option: a list; each question has four candidate answers.… See the full description on the dataset page: https://huggingface.co/datasets/XuehangCang/RACE.
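As a rough illustration of the format described above, the sketch below builds one record and checks its shape. The field names "article", "questions", and "option" follow the description; the passage, questions, and answer choices are invented placeholders, the helper `is_valid_record` is hypothetical, and the published dataset may use different key names or contain further fields, since the description is truncated.

```python
# Invented placeholder record; none of this text comes from the dataset.
sample_record = {
    "article": "Tom went to the market on Saturday and bought some apples.",
    "questions": [
        "When did Tom go to the market?",
        "Tom bought some _ at the market.",
    ],
    "option": [
        ["On Friday", "On Saturday", "On Sunday", "On Monday"],
        ["pears", "apples", "books", "shoes"],
    ],
}


def is_valid_record(record):
    """Check the shape described in the text: one article string, a list of
    question strings, and four candidate answers per question."""
    article = record.get("article")
    questions = record.get("questions")
    option = record.get("option")
    if not isinstance(article, str):
        return False
    if not isinstance(questions, list) or not all(isinstance(q, str) for q in questions):
        return False
    if not isinstance(option, list) or len(option) != len(questions):
        return False
    return all(isinstance(choices, list) and len(choices) == 4 for choices in option)


print(is_valid_record(sample_record))
```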
Open Government Licence 3.0http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
The census is undertaken by the Office for National Statistics every 10 years and gives us a picture of all the people and households in England and Wales. The most recent census took place in March of 2021. The census asks every household questions about the people who live there and the type of home they live in. In doing so, it helps to build a detailed snapshot of society. Information from the census helps the government and local authorities to plan and fund local services, such as education, doctors' surgeries and roads.
Key census statistics for Leicester are published on the open data platform to make information accessible to local services, voluntary and community groups, and residents. There is also a dashboard published showcasing various datasets from the census allowing users to view data for Leicester and compare this with national statistics.
Further information about the census and full datasets can be found on the ONS website: https://www.ons.gov.uk/census/aboutcensus/censusproducts
Ethnicity
This dataset provides Census 2021 estimates that classify usual residents in England and Wales by ethnic group. The estimates are as at Census Day, 21 March 2021.
Definition: The ethnic group that the person completing the census feels they belong to. This could be based on their culture, family background, identity or physical appearance. Respondents could choose one out of 19 tick-box response categories, including write-in response options.
This dataset includes data relating to Leicester City and England overall.
U.S. Government Workshttps://www.usa.gov/government-works
License information was derived automatically
Youth arrest data come from a special request to the Minnesota Bureau of Criminal Apprehension and do not include youth arrests by ethnicity. The only options for race categories are American Indian/Alaska Native, Asian, Black/African American, Native Hawaiian/Pacific Islander (suppressed due to low numbers), and White. There is no ‘other’ or ‘mixed race’ option.
Youth population data are from the Office of Juvenile Justice and Delinquency Prevention’s Easy Access to Juvenile Populations (https://www.ojjdp.gov/ojstatbb/ezapop/). Because the arrest data cover only youth 10-17 years old, the youth population data are restricted to the same age range.
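Matching the age range of arrests and population matters when the two are combined into a rate. The sketch below assumes, as an illustration only, that the figures are used to compute arrests per 1,000 youth; the description does not say which rate, if any, is reported, and the function name and numbers are made up.

```python
def arrests_per_1000(arrests, population):
    """Arrests per 1,000 youth, with both counts covering ages 10-17."""
    if population <= 0:
        raise ValueError("population must be positive")
    return 1000 * arrests / population


# Illustrative numbers only, not taken from the dataset.
rate = arrests_per_1000(arrests=50, population=20000)
print(rate)  # 2.5
```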