https://choosealicense.com/licenses/other/
Dataset Card for "race"
Dataset Summary
RACE is a large-scale reading comprehension dataset with more than 28,000 passages and nearly 100,000 questions. The dataset was collected from English examinations in China designed for middle school and high school students. It can serve as training and test sets for machine comprehension.
Supported Tasks and Leaderboards
More Information Needed
Languages
More Information Needed… See the full description on the dataset page: https://huggingface.co/datasets/ehovy/race.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
Definition Of Human Race is a dataset for object detection tasks - it contains Human Race annotations for 3,150 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
The ReAding Comprehension dataset from Examinations (RACE) is a machine reading comprehension dataset consisting of 27,933 passages and 97,867 questions from English exams, targeting Chinese students aged 12-18. RACE consists of two subsets, RACE-M and RACE-H, from middle school and high school exams, respectively. RACE-M has 28,293 questions and RACE-H has 69,574. Each question is associated with 4 candidate answers, one of which is correct. The data generation process of RACE differs from most machine reading comprehension datasets: instead of being generated by heuristics or crowd-sourcing, the questions in RACE are written by domain experts specifically to test human reading skills.
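The structure described above (one passage, a question, four candidate answers, one correct) can be sketched as a small record type with an accuracy helper. This is an illustrative sketch only: the class name `RaceQuestion`, its field names, and the use of the letters A to D for the answer key are assumptions, not the dataset's documented schema. The subset sizes are the figures quoted above.

```python
from dataclasses import dataclass

ANSWER_KEYS = ("A", "B", "C", "D")  # assumed labelling of the four candidates

# Question counts quoted in the description above.
RACE_M_QUESTIONS = 28_293
RACE_H_QUESTIONS = 69_574
TOTAL_QUESTIONS = RACE_M_QUESTIONS + RACE_H_QUESTIONS


@dataclass
class RaceQuestion:
    """Hypothetical in-memory form of one multiple-choice question."""
    passage: str
    question: str
    options: list  # exactly four candidate answers
    answer: str    # key of the correct candidate, "A".."D"

    def __post_init__(self):
        if len(self.options) != 4:
            raise ValueError("each question needs exactly 4 candidate answers")
        if self.answer not in ANSWER_KEYS:
            raise ValueError("answer must be one of A, B, C, D")

    def correct_option(self):
        """Return the text of the correct candidate."""
        return self.options[ANSWER_KEYS.index(self.answer)]


def accuracy(questions, predictions):
    """Fraction of questions whose predicted key matches the answer key."""
    if len(questions) != len(predictions):
        raise ValueError("one prediction per question is required")
    if not questions:
        return 0.0
    hits = sum(q.answer == p for q, p in zip(questions, predictions))
    return hits / len(questions)
```

With four candidates per question, a uniformly random guesser is expected to score about 25% under this metric, which is a useful floor when reading reported results.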
Use this application to view the pattern of concentrations of people by race and Hispanic or Latino ethnicity. Data are provided at the U.S. Census block group level, one of the smallest Census geographies, to provide a detailed picture of these patterns. The data is sourced from the U.S. Census Bureau, 2020 Census Redistricting Data (Public Law 94-171) Summary File.
Definitions: Definitions of the Census Bureau’s categories are provided below. This interactive map shows patterns for all categories except American Indian or Alaska Native and Native Hawaiian or Other Pacific Islander. The total population countywide for these two categories is small (1,582 and 263 respectively).
The Census Bureau uses the following race categories:
Population by Race
White – A person having origins in any of the original peoples of Europe, the Middle East, or North Africa.
Black or African American – A person having origins in any of the Black racial groups of Africa.
American Indian or Alaska Native – A person having origins in any of the original peoples of North and South America (including Central America) and who maintains tribal affiliation or community attachment.
Asian – A person having origins in any of the original peoples of the Far East, Southeast Asia, or the Indian subcontinent including, for example, Cambodia, China, India, Japan, Korea, Malaysia, Pakistan, the Philippine Islands, Thailand, and Vietnam.
Native Hawaiian or Other Pacific Islander – A person having origins in any of the original peoples of Hawaii, Guam, Samoa, or other Pacific Islands.
Some Other Race – This category is chosen by people who do not identify with any of the categories listed above.
People can identify with more than one race. These people are included in the Two or More Races category.
Hispanic or Latino Population
The Hispanic/Latino population is an ethnic group. Hispanic/Latino people may be of any race.
Other layers provided in this tool include the Loudoun County Census block groups, towns and Dulles airport, and the Loudoun County 2021 aerial imagery.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset was developed by the Research & Analytics Group at the Atlanta Regional Commission using data from the U.S. Census Bureau.
For a deep dive into the data model including every specific metric, see the Infrastructure Manifest. The manifest details ARC-defined naming conventions, field names/descriptions and topics, summary levels, source tables, notes and so forth for all metrics.
Naming conventions:
Prefixes:
None: Count
p: Percent
r: Rate
m: Median
a: Mean (average)
t: Aggregate (total)
ch: Change in absolute terms (value in t2 - value in t1)
pch: Percent change ((value in t2 - value in t1) / value in t1)
chp: Change in percent (percent in t2 - percent in t1)
s: Significance flag for change: 1 = statistically significant with a 90% CI, 0 = not statistically significant, blank = cannot be computed
Suffixes:
_e19: Estimate from 2014-19 ACS
_m19: Margin of Error from 2014-19 ACS
_00_v19: Decennial 2000, re-estimated to 2019 geography
_00_19: Change, 2000-19
_e10_v19: 2006-10 ACS, re-estimated to 2019 geography
_m10_v19: Margin of Error from 2006-10 ACS, re-estimated to 2019 geography
_e10_19: Change, 2010-19
The user should note that American Community Survey data represent estimates derived from a surveyed sample of the population, which creates some level of uncertainty, as opposed to an exact measure of the entire population (the full census count is only conducted once every 10 years and does not cover as many detailed characteristics of the population). Therefore, any measure reported by ACS should not be taken as an exact number – this is why a corresponding margin of error (MOE) is also given for ACS measures. The size of the MOE relative to its corresponding estimate value provides an indication of confidence in the accuracy of each estimate. Each MOE is expressed in the same units as its corresponding measure; for example, if the estimate value is expressed as a number, then its MOE will also be a number; if the estimate value is expressed as a percent, then its MOE will also be a percent.
The user should also note that for relatively small geographic areas, such as census tracts shown here, ACS only releases combined 5-year estimates, meaning these estimates represent rolling averages of survey results that were collected over a 5-year span (in this case 2015-2019). Therefore, these data do not represent any one specific point in time or even one specific year. For geographic areas with larger populations, 3-year and 1-year estimates are also available.
For further explanation of ACS estimates and margin of error, visit Census ACS website.
Source: U.S. Census Bureau, Atlanta Regional Commission
Date: 2015-2019
Data License: Creative Commons Attribution 4.0 International (CC by 4.0)
Link to the manifest: https://www.arcgis.com/sharing/rest/content/items/3d489c725bb24f52a987b302147c46ee/data
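The change formulas and suffix conventions described for this dataset can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function names are invented here, the example field names in the usage note are hypothetical, and the MOE-to-estimate ratio is only one simple way to express "the size of the MOE relative to its estimate". None of this is code from the Atlanta Regional Commission.

```python
# Suffix meanings copied from the naming conventions in the description.
SUFFIXES = {
    "_e19": "Estimate from 2014-19 ACS",
    "_m19": "Margin of Error from 2014-19 ACS",
    "_00_v19": "Decennial 2000, re-estimated to 2019 geography",
    "_00_19": "Change, 2000-19",
    "_e10_v19": "2006-10 ACS, re-estimated to 2019 geography",
    "_m10_v19": "Margin of Error from 2006-10 ACS, re-estimated to 2019 geography",
    "_e10_19": "Change, 2010-19",
}


def describe_suffix(field_name):
    """Return the meaning of the longest known suffix, or None if no match."""
    for suffix in sorted(SUFFIXES, key=len, reverse=True):
        if field_name.endswith(suffix):
            return SUFFIXES[suffix]
    return None


def change(value_t1, value_t2):
    """'ch' prefix: change in absolute terms."""
    return value_t2 - value_t1


def percent_change(value_t1, value_t2):
    """'pch' prefix: (t2 - t1) / t1, or None when t1 is zero."""
    if value_t1 == 0:
        return None  # cannot be computed
    return (value_t2 - value_t1) / value_t1


def change_in_percent(percent_t1, percent_t2):
    """'chp' prefix: difference between two percentages."""
    return percent_t2 - percent_t1


def relative_moe(estimate, moe):
    """MOE as a fraction of its estimate; larger values mean less certainty."""
    if estimate == 0:
        return None
    return abs(moe) / abs(estimate)
```

For example, `describe_suffix("pExample_e10_v19")` (a made-up field name) returns the 2006-10 ACS description, and an estimate of 200 with an MOE of 50 has a relative MOE of 0.25, which suggests treating that estimate with some caution.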
Open Government Licence 3.0: http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
This dataset provides Census 2021 estimates that classify usual residents in Birmingham by ethnic group, by religion, and by age.
Ethnic Group: The ethnic group that the person completing the census feels they belong to. This could be based on their culture, family background, identity or physical appearance.
Religion: The religion people connect or identify with (their religious affiliation), whether or not they practise or have belief in it.
Age: A person's age on Census Day, 21 March 2021 in England and Wales.
Coverage
This dataset is focused on the data for Birmingham at city level.
About the 2021 Census
The Census takes place every 10 years and gives us a picture of all the people and households in England and Wales.
Protecting personal data
The ONS sometimes need to make changes to data if it is possible to identify individuals. This is known as statistical disclosure control. In Census 2021, they:
- Swapped records (targeted record swapping). For example, if a household was likely to be identified in datasets because it has unusual characteristics, they swapped the record with a similar one from a nearby small area. Very unusual households could be swapped with one in a nearby local authority.
- Added small changes to some counts (cell key perturbation). For example, they might change a count of four to a three or a five. This might make small differences between tables depending on how the data are broken down when they applied perturbation.
For more geographies, aggregations or topics see the link in the Reference below. Or, to create a custom dataset with multiple variables use the ONS Create a custom dataset tool.
Open Government Licence 3.0: http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
The ethnic group that the person completing the census feels they belong to. This could be based on their culture, family background, identity or physical appearance.
Coverage
This dataset is focused on the data for Birmingham at 2021 constituency level.
About the 2021 Census
The Census takes place every 10 years and gives us a picture of all the people and households in England and Wales.
Protecting personal data
The ONS sometimes need to make changes to data if it is possible to identify individuals. This is known as statistical disclosure control. In Census 2021, they:
- Swapped records (targeted record swapping). For example, if a household was likely to be identified in datasets because it has unusual characteristics, they swapped the record with a similar one from a nearby small area. Very unusual households could be swapped with one in a nearby local authority.
- Added small changes to some counts (cell key perturbation). For example, they might change a count of four to a three or a five. This might make small differences between tables depending on how the data are broken down when they applied perturbation.
For more geographies, aggregations or topics see the link in the Reference below. Or, to create a custom dataset with multiple variables use the ONS Create a custom dataset tool.
Population value
The value column represents All usual residents.
U.S. Government Works: https://www.usa.gov/government-works
License information was derived automatically
This dataset supports measure M.C.5 of SD 2023. The sources of data are the American Community Survey and the Austin Transportation Department. Each row displays the percentage of people in a demographic category who participated in the mobility engagement process, compared with the percentage of people in the same demographic category in Austin. This dataset can be used to understand how well the City reaches different communities and subpopulations when soliciting public input. View more details at https://data.austintexas.gov/stories/s/Percentage-of-participants-in-mobility-public-enga/pfnb-5uev/.
Splitgraph serves as an HTTP API that lets you run SQL queries directly on this data to power Web applications.
See the Splitgraph documentation for more information.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This data set consists of 17 variables that underpin the analysis of the paper entitled "Exploring intergenerational, intra-generational and transnational patterns of family caring in minority ethnic communities: the example of England and Wales", published in the International Journal of Care and Caring.
The methodology for the survey is described in the paper.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
The RACECAR dataset is the first open dataset for full-scale and high-speed autonomous racing. Multi-modal sensor data has been collected from fully autonomous Indy race cars operating at speeds of up to 170 mph (273 kph). Six teams that raced in the Indy Autonomous Challenge during 2021-22 have contributed to this dataset. The dataset spans 11 interesting racing scenarios across two race tracks, including solo laps, multi-agent laps, overtaking situations, high accelerations, banked tracks, obstacle avoidance, and pit entry and exit at different speeds. The data is organized and released in both ROS2 and nuScenes formats. We have also developed a ROS2-to-nuScenes conversion library to achieve this. The RACECAR data is unique because of the high-speed environment of autonomous racing and is suitable for exploring issues of localization, object detection and tracking (LiDAR, Radar, and Camera), and mapping that arise at the limits of operation of the autonomous vehicle.
This data set is no longer being updated and is historical; last update 10/10/2022.
Provides the percentage of COVID-19 cases by race/ethnicity in Jefferson County, KY. It also provides the percentage of Jefferson County vaccine recipients broken out by race/ethnicity, excluding doses administered by Walgreens and CVS clinics.
Fieldname: Definition
race: description of race/ethnicity
CensusCountPCT: percentage of population make-up of Jefferson county
ConfirmedCaseCountPCT: percentage of confirmed cases by race/ethnicity (rounded to the whole percent)
DeceasedCountPCT: percentage of deceased cases by race/ethnicity (rounded to the whole percent)
RecoveredCountPCT: percentage of recovered cases by race/ethnicity (rounded to the whole percent)
VaccinatedCountPCT: percentage of Jefferson county vaccine recipients by race/ethnicity, excluding doses administered by Walgreens and CVS clinics (rounded to the whole percent)
Loaded: Date the data was loaded into the system
Note: This data is preliminary, routinely updated, and is subject to change.
For questions about this data please contact Angela Graham (Angela.Graham@louisvilleky.gov) or YuTing Chen (YuTing.Chen@louisvilleky.gov) or call (502) 574-8279.
https://creativecommons.org/publicdomain/zero/1.0/
Formula One is the highest class of international racing for open-wheel single-seater racing cars sanctioned by the Fédération Internationale de l'Automobile (FIA). Ever since its inaugural season in 1950, Formula 1 has been regarded as the pinnacle of motorsport.
This dataset contains detailed information about qualifying and race results for all the tracks over the course of multiple seasons. There is a separate directory for each season. There are 2 sub-directories for each season, namely: Qualifying Results and Race Results. The Race Results directory contains an overall_race_results.csv file which summarizes the race results throughout the entire season. It also contains multiple .csv files for the results of each race in the season. The Qualifying Results directory contains multiple .csv files for the qualifying results before the start of each race.
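The directory layout described above can be navigated with a short helper. This is a sketch under assumptions: the sub-directory names and `overall_race_results.csv` come from the description, but naming each season directory after its year, and the function name `season_files`, are guesses made for illustration.

```python
from pathlib import Path


def season_files(root, season):
    """Collect the result files for one season.

    Assumes (hypothetically) that each season directory is named after
    the season, e.g. <root>/2021/Race Results/overall_race_results.csv.
    """
    season_dir = Path(root) / str(season)
    race_dir = season_dir / "Race Results"
    quali_dir = season_dir / "Qualifying Results"
    overall = race_dir / "overall_race_results.csv"

    # Per-race files are every .csv in "Race Results" except the summary.
    race_files = sorted(
        p for p in race_dir.glob("*.csv") if p.name != overall.name
    )
    quali_files = sorted(quali_dir.glob("*.csv"))
    return {"overall": overall, "races": race_files, "qualifying": quali_files}
```

The returned paths can then be passed to any CSV reader; the column names inside the files are not documented in the description, so they are best inspected before writing analysis code.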
For the 1982 season and earlier, the qualifying results contain only 1 entry per file, that of the polesitter. The lap times of the other drivers were not recorded, and the official website lists only 1 entry under the qualifying results.
F1 is one of my favorite sports and I almost never miss a race 😄
The motivation behind creating this dataset was to learn more about web scraping and try to perform a statistical analysis of the data. Some of the things you could do with the entire dataset are as follows:
- Identify the driver with the most poles
- Compare qualifying times of different drivers (championship contenders, team-mates, etc)
- Determine how often a particular driver out-qualifies his team-mate
- Compare qualifying lap times of a race from previous seasons
- Identify the driver with the most number of wins at a particular track
- Analyze how the championship battle unfolded based on the number of points scored by the drivers (specially interesting for the 2021 f1 season 👀)
- Identify drivers with the highest number of wins, podiums, DNFs, etc
- Compare the average lap times of different tracks to identify the slowest and fastest tracks on the calendar
- Compare the number of laps for each race in the season (Belgium 2021 being the clear winner 😂)
- Find out who won the Driver's Championship based on the total number of points
- Find out who won the Constructor's Championship based on the total number of points for each team
DNF: Did Not Finish. Commonly used nomenclature for drivers that crashed/failed to complete the entire race
DNQ: Did Not Qualify. Eliminated missing values from the qualifying datasets by introducing this abbreviation for drivers who failed to qualify.
NC: Not Confirmed. For drivers that DNF the term NC is used in the Position column
DQ: Disqualified. Generally drivers are disqualified from races due to technical infringements or a breach of sporting regulations (Example: Sebastian Vettel was disqualified from the 2021 Hungarian Grand Prix due to fuel irregularities and stripped of all the points he earned from finishing the race in P2)
As I collect more data for the previous seasons, I will create new versions for the dataset. The goal with this dataset is to create an archive of qualifying and race data from 1950-2021. The dataset will also be updated when the 2022 season commences.
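Because the Position column can hold abbreviations such as NC or DQ instead of a number, parsing code has to handle both cases. The sketch below shows one way to do that; the function name `parse_position` and the choice to return a `(position, status)` pair are illustrative assumptions, and the abbreviation meanings are copied from the list above.

```python
# Abbreviations as defined in the dataset description.
STATUS_CODES = {
    "DNF": "Did Not Finish",
    "DNQ": "Did Not Qualify",
    "NC": "Not Confirmed",
    "DQ": "Disqualified",
}


def parse_position(value):
    """Split a raw Position cell into (numeric position, status code).

    Exactly one element of the pair is None. A cell that is neither a
    known abbreviation nor an integer raises ValueError.
    """
    text = str(value).strip().upper()
    if text in STATUS_CODES:
        return None, text
    return int(text), None


def count_statuses(position_values):
    """Count how often each abbreviation appears in a column of cells."""
    counts = {code: 0 for code in STATUS_CODES}
    for value in position_values:
        _, status = parse_position(value)
        if status is not None:
            counts[status] += 1
    return counts
```

Keeping the numeric position and the status separate avoids treating a disqualification as a very low finishing place when computing averages or podium counts.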
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘DSS Township Counts - by Ethnicity - CY 2020’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://catalog.data.gov/dataset/bd790316-65e8-4836-b2d1-47a1dd18bf11 on 26 January 2022.
--- Dataset description provided by original source is as follows ---
In order to facilitate public review and access, enrollment data published on the Open Data Portal is provided as promptly as possible after the end of each month or year, as applicable to the data set. Due to eligibility policies and operational processes, enrollment can vary slightly after publication. Please be aware of the point-in-time nature of the published data when comparing to other data published or shared by the Department of Social Services, as this data may vary slightly.
As a general practice, for monthly data sets published on the Open Data Portal, DSS will continue to refresh the monthly enrollment data for three months, after which time it will remain static. For example, when March data is published the data in January and February will be refreshed. When April data is published, February and March data will be refreshed, but January will not change. This allows the Department to account for the most common enrollment variations in published data while also ensuring that data remains as stable as possible over time. In the event of a significant change in enrollment data, the Department may republish reports and will notate such republication dates and reasons accordingly. In March 2020, Connecticut opted to add a new Medicaid coverage group: the COVID-19 Testing Coverage for the Uninsured. Enrollment data on this limited-benefit Medicaid coverage group is being incorporated into Medicaid data effective January 1, 2021. Enrollment data for this coverage group prior to January 1, 2021, was listed under State Funded Medical. An historical accounting of enrollment of the specific coverage group starting in calendar year 2020 will also be published separately. DSS CY 2020 Town counts - Number of people enrolled in DSS services in the calendar year 2020, by township and ethnicity. For privacy considerations, a count of zero is used for counts less than five. A recipient is counted in all townships where that recipient resided in that year.
--- Original source retains full ownership of the source dataset ---
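Two rules in the DSS description above lend themselves to a small worked example: the refresh window for monthly data (publishing March refreshes January and February) and the privacy rule that reports zero for counts below five. The following sketch encodes both; the function names are invented for illustration and this is not DSS code.

```python
def refreshed_months(year, month):
    """Months refreshed when (year, month) is published.

    Follows the example in the description: publishing March refreshes
    January and February, i.e. the two preceding months.
    """
    result = []
    for back in (2, 1):
        index = year * 12 + (month - 1) - back  # months since year 0
        result.append((index // 12, index % 12 + 1))
    return result


def published_count(true_count):
    """Apply the privacy rule: counts below five are reported as zero."""
    return 0 if true_count < 5 else true_count
```

One consequence of the privacy rule is that a published zero is ambiguous: it can mean anything from zero to four enrollees, so sums of published town counts may understate the true total.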
CalEnviroScreen scores represent a combined measure of pollution and the potential vulnerability of a population to the effects of pollution. Like the previous versions, CalEnviroScreen 4.0 does not include indicators of race/ethnicity or age. However, the distribution of the CalEnviroScreen 4.0 cumulative impact scores by race or ethnicity is important. This information can be used to better understand issues related to environmental justice and racial equity in California. CalEPA's racial equity team has released a StoryMap using CalEnviroScreen 3.0 data that examines the connection between racist land use practices of the 1930s and the persistence of environmental injustice. The CalEPA StoryMap and this analysis are both examples of such information.
This dataset includes all verified Hate Crime occurrences investigated by the Hate Crime Unit by reported date since 2018. The Hate Crime categories (bias categories) include Age, Mental or Physical Disability, Race, Ethnicity, Language, Religion, Sexual Orientation, Gender and Other Similar Factor. This data is provided at the offence and/or occurrence level, therefore one occurrence may have multi-bias categories associated to the victim used to categorize the hate crime.
Definitions
Hate Crime
A hate crime is a criminal offence committed against a person or property motivated in whole or in part by bias, prejudice or hate based on race, national or ethnic origin, language, colour, religion, sex, age, mental or physical disability, sexual orientation or gender identity or expression or any other similar factor.
Hate Incident
A hate incident is a non-criminal action or behaviour that is motivated by hate against an identifiable group. Examples of hate incidents include using racial slurs, or insulting a person because of their ethnic or religious dress or how they identify.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
These data are modelled using the OMOP Common Data Model v5.3.
Correlated Data Source
NG tube vocabularies
Generation Rules
- The patient’s age should be between 18 and 100 at the moment of the visit.
- Ethnicity data is using 2021 census data in England and Wales (Census in England and Wales 2021).
- Gender is equally distributed between Male and Female (50% each).
- Every person in the record has a link in procedure_occurrence with the concept “Checking the position of nasogastric tube using X-ray”
- 2% of person records have a link in procedure_occurrence with the concept of “Plain chest X-ray”
- 60% of visit_occurrence has visit concept “Inpatient Visit”, while 40% have “Emergency Room Visit”
Notes
- Version 0
- Generated by man-made rule/story generator
- Structurally correct, all tables linked with the relationship
- We used national ethnicity data to generate a realistic distribution (see below)
2011 Race Census figure in England and Wales
Ethnic Group: Population (%)
Asian or Asian British: Bangladeshi - 1.1
Asian or Asian British: Chinese - 0.7
Asian or Asian British: Indian - 3.1
Asian or Asian British: Pakistani - 2.7
Asian or Asian British: any other Asian background - 1.6
Black or African or Caribbean or Black British: African - 2.5
Black or African or Caribbean or Black British: Caribbean - 1
Black or African or Caribbean or Black British: other Black or African or Caribbean background - 0.5
Mixed multiple ethnic groups: White and Asian - 0.8
Mixed multiple ethnic groups: White and Black African - 0.4
Mixed multiple ethnic groups: White and Black Caribbean - 0.9
Mixed multiple ethnic groups: any other Mixed or multiple ethnic background - 0.8
White: English or Welsh or Scottish or Northern Irish or British - 74.4
White: Irish - 0.9
White: Gypsy or Irish Traveller - 0.1
White: any other White background - 6.4
Other ethnic group: any other ethnic group - 1.6
Other ethnic group: Arab - 0.6
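The generation rules above amount to weighted random sampling, which can be sketched with Python's standard `random` module. This is an illustration only, not the generator that produced the dataset: the function name, the flat dictionary output (rather than OMOP tables), and the uniform age distribution are all assumptions. The weights are the percentages listed above.

```python
import random

# Percentages copied from the ethnicity table in the description.
ETHNICITY_WEIGHTS = {
    "Asian or Asian British: Bangladeshi": 1.1,
    "Asian or Asian British: Chinese": 0.7,
    "Asian or Asian British: Indian": 3.1,
    "Asian or Asian British: Pakistani": 2.7,
    "Asian or Asian British: any other Asian background": 1.6,
    "Black or African or Caribbean or Black British: African": 2.5,
    "Black or African or Caribbean or Black British: Caribbean": 1.0,
    "Black or African or Caribbean or Black British: other": 0.5,
    "Mixed multiple ethnic groups: White and Asian": 0.8,
    "Mixed multiple ethnic groups: White and Black African": 0.4,
    "Mixed multiple ethnic groups: White and Black Caribbean": 0.9,
    "Mixed multiple ethnic groups: any other Mixed background": 0.8,
    "White: English or Welsh or Scottish or Northern Irish or British": 74.4,
    "White: Irish": 0.9,
    "White: Gypsy or Irish Traveller": 0.1,
    "White: any other White background": 6.4,
    "Other ethnic group: any other ethnic group": 1.6,
    "Other ethnic group: Arab": 0.6,
}
VISIT_WEIGHTS = {"Inpatient Visit": 60, "Emergency Room Visit": 40}


def weighted_pick(rng, weights):
    """Pick one key with probability proportional to its weight."""
    return rng.choices(list(weights), weights=list(weights.values()))[0]


def generate_person(rng):
    """Generate one synthetic record following the stated rules."""
    return {
        "age": rng.randint(18, 100),  # uniform ages are an assumption
        "gender": rng.choice(["Male", "Female"]),  # 50% each
        "ethnicity": weighted_pick(rng, ETHNICITY_WEIGHTS),
        "visit_concept": weighted_pick(rng, VISIT_WEIGHTS),
        "has_ng_tube_xray_check": True,  # every person has this procedure
        "has_plain_chest_xray": rng.random() < 0.02,  # about 2% of persons
    }
```

Passing a seeded generator, for example `generate_person(random.Random(0))`, makes runs reproducible. Because `random.choices` uses relative weights, the listed percentages do not need to sum to exactly 100.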
https://datafinder.stats.govt.nz/license/attribution-4-0-international/
Dataset contains ethnic group census usually resident population counts from the 2013, 2018, and 2023 Censuses, as well as the percentage change in the ethnic group population count between the 2013 and 2018 Censuses, and between the 2018 and 2023 Censuses. Data is available by regional council.
The ethnic groups are:
Map shows percentage change in the census usually resident population count for ethnic groups between the 2018 and 2023 Censuses.
Download lookup file from Stats NZ ArcGIS Online or embedded attachment in Stats NZ geographic data service. Download data table (excluding the geometry column for CSV files) using the instructions in the Koordinates help guide.
Footnotes
Geographical boundaries
Statistical standard for geographic areas 2023 (updated December 2023) has information about geographic boundaries as of 1 January 2023. Address data from 2013 and 2018 Censuses was updated to be consistent with the 2023 areas. Due to the changes in area boundaries and coding methodologies, 2013 and 2018 counts published in 2023 may be slightly different to those published in 2013 or 2018.
Subnational census usually resident population
The census usually resident population count of an area (subnational count) is a count of all people who usually live in that area and were present in New Zealand on census night. It excludes visitors from overseas, visitors from elsewhere in New Zealand, and residents temporarily overseas on census night. For example, a person who usually lives in Christchurch city and is visiting Wellington city on census night will be included in the census usually resident population count of Christchurch city.
Caution using time series
Time series data should be interpreted with care due to changes in census methodology and differences in response rates between censuses. The 2023 and 2018 Censuses used a combined census methodology (using census responses and administrative data), while the 2013 Census used a full-field enumeration methodology (with no use of administrative data).
About the 2023 Census dataset
For information on the 2023 dataset see Using a combined census model for the 2023 Census. We combined data from the census forms with administrative data to create the 2023 Census dataset, which meets Stats NZ's quality criteria for population structure information. We added real data about real people to the dataset where we were confident the people who hadn’t completed a census form (which is known as admin enumeration) will be counted. We also used data from the 2018 and 2013 Censuses, administrative data sources, and statistical imputation methods to fill in some missing characteristics of people and dwellings.
Data quality
The quality of data in the 2023 Census is assessed using the quality rating scale and the quality assurance framework to determine whether data is fit for purpose and suitable for release. Data quality assurance in the 2023 Census has more information.
Quality rating of a variable
The quality rating of a variable provides an overall evaluation of data quality for that variable, usually at the highest levels of classification. The quality ratings shown are for the 2023 Census unless stated. There is variability in the quality of data at smaller geographies. Data quality may also vary between censuses, for subpopulations, or when cross tabulated with other variables or at lower levels of the classification. Data quality ratings for 2023 Census variables has more information on quality ratings by variable.
Ethnicity concept quality rating
Ethnicity is rated as high quality.
Ethnicity – 2023 Census: Information by concept has more information, for example, definitions and data quality.
Using data for good
Stats NZ expects that, when working with census data, it is done so with a positive purpose, as outlined in the Māori Data Governance Model (Data Iwi Leaders Group, 2023). This model states that "data should support transformative outcomes and should uplift and strengthen our relationships with each other and with our environments. The avoidance of harm is the minimum expectation for data use. Māori data should also contribute to iwi and hapū tino rangatiratanga”.
Confidentiality
The 2023 Census confidentiality rules have been applied to 2013, 2018, and 2023 data. These rules protect the confidentiality of individuals, families, households, dwellings, and undertakings in 2023 Census data. Counts are calculated using fixed random rounding to base 3 (FRR3) and suppression of ‘sensitive’ counts less than six, where tables report multiple geographic variables and/or small populations. Individual figures may not always sum to stated totals. Applying confidentiality rules to 2023 Census data and summary of changes since 2018 and 2013 Censuses has more information about 2023 Census confidentiality rules.
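To make the idea of fixed random rounding to base 3 concrete, here is a generic sketch. It is not Stats NZ's implementation, which the description does not specify. The assumptions are: counts that are already multiples of 3 are left unchanged, other counts go to the nearer multiple with probability 2/3 and to the farther one with probability 1/3 (a common convention for random rounding), and "fixed" is modelled by deriving the random draw from a hash of a caller-supplied cell key so the same cell always rounds the same way.

```python
import hashlib


def fixed_random_round_base3(count, cell_key):
    """Round a count to a multiple of 3, deterministically per cell.

    cell_key is a hypothetical identifier for the table cell; hashing
    it with the count stands in for a stored random seed.
    """
    remainder = count % 3
    if remainder == 0:
        return count

    digest = hashlib.sha256(f"{cell_key}:{count}".encode()).digest()
    draw = int.from_bytes(digest[:8], "big") / 2**64  # value in [0, 1)

    lower = count - remainder
    upper = lower + 3
    nearer, farther = (lower, upper) if remainder == 1 else (upper, lower)
    return nearer if draw < 2 / 3 else farther


def confidentialise(count, cell_key, sensitive=False):
    """Suppress sensitive counts below six, otherwise apply rounding."""
    if sensitive and count < 6:
        return None  # suppressed
    return fixed_random_round_base3(count, cell_key)
```

Since each cell is rounded independently, rounded components need not add up to the rounded total, which is why individual figures may not always sum to stated totals.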
Symbol
-998 Not applicable
Percentages
To calculate percentages, divide the figure for the category of interest by the figure for ‘Total stated’ where this applies.
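The percentage rule above, together with the percentage change between censuses that this dataset reports, can be written as two small functions. This is a sketch with invented function names; treating the `-998` "Not applicable" symbol as a missing value is an assumption about how a user would want to handle it.

```python
NOT_APPLICABLE = -998  # symbol used in the data for "Not applicable"


def percentage(category_count, total_stated):
    """Category count as a percentage of 'Total stated'."""
    if NOT_APPLICABLE in (category_count, total_stated) or total_stated == 0:
        return None
    return 100 * category_count / total_stated


def percentage_change(earlier_count, later_count):
    """Percentage change between two census counts."""
    if NOT_APPLICABLE in (earlier_count, later_count) or earlier_count == 0:
        return None
    return 100 * (later_count - earlier_count) / earlier_count
```

For instance, a category count of 30 against a 'Total stated' of 120 gives 25.0 percent, and a count that grows from 200 to 250 between censuses is a 25.0 percent increase.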
FairFace is a face image dataset which is race balanced. It contains 108,501 images from 7 different race groups: White, Black, Indian, East Asian, Southeast Asian, Middle Eastern, and Latino. Images were collected from the YFCC-100M Flickr dataset and labeled with race, gender, and age groups.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
The dataset was created using the open-source code released by LNDS (Luxembourg National Data Service). It is meant to be an example of the dataset structure that anyone can generate and personalize by adjusting some fixed parameters, including the sample size. The file format is .csv; rows hold individual profiles and columns hold their personal features. The information in the dataset has been generated based on statistical information about the age-structure distribution, the population of each municipality, the number of different nationalities present in Luxembourg, and salary statistics per municipality. STATEC, the statistics portal of Luxembourg, is the public source we used to gather the real information that we ingested into our synthetic generation model. Other features like Date of birth, Social matricule, First name, Surname, Ethnicity, and physical attributes have been obtained by a logical relationship between variables without exploiting any additional real information. In compliance with the law, the risk of identifying a real person purely by chance is kept close to zero.