Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘ACS and LTDB Race Data by Community Reporting Area’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://catalog.data.gov/dataset/d8e11059-8f97-42b5-986f-9dd40a5fba03 on 27 January 2022.
--- Dataset description provided by original source is as follows ---
Abstract: Census tract-based race and ethnicity data aggregated to City of Seattle Community Reporting Areas (CRAs) from the 1990 and 2010 Brown University Longitudinal Database (LTDB), 2010 decennial census and the 2014-2018 5-year American Community Survey (ACS). Brown University researchers created the LTDB to allow for comparing census data over time (see https://s4.ad.brown.edu/projects/diversity/Researcher/Bridging.htm). The race and ethnicity categories in the 2010 LTDB have been modified from those in the 2010 census to more closely match the 1990 race categories. (Before 2000, census questionnaires allowed respondents to identify as one race only. The LTDB allocates mixed-race people in post-1990 census estimates to non-white categories.) Please remember that the ACS data carry margins of error, and for small racial/ethnic groups they can be significant. The numeric and percentage changes over time are also included. There is also a polygon representation for the City of Seattle as a whole.
Purpose: Census data of racial and ethnic categories from 1990 and 2010 Brown University LTDB, 2010 decennial and 2018 American Community Survey (ACS). Data is for the City of Seattle Community Reporting Areas as well as a polygon representation for the City of Seattle as a whole. Numeric and percentage changes over time are also included.
--- Original source retains full ownership of the source dataset ---
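The description mentions numeric and percentage changes over time without stating how they were derived. The following is a minimal sketch of one plausible reading, assuming the percentage change is relative to the earlier count; the function names and the sample counts are hypothetical and do not come from the dataset.

```python
def numeric_change(earlier, later):
    """Absolute change in a population count between two periods."""
    return later - earlier


def percent_change(earlier, later):
    """Change relative to the earlier count, in percent.

    Returns None when the earlier count is zero, because the ratio is
    undefined in that case (an assumption of this sketch).
    """
    if earlier == 0:
        return None
    return (later - earlier) / earlier * 100.0


# Hypothetical counts for one group in one Community Reporting Area.
count_1990 = 1200
count_2010 = 1500

change = numeric_change(count_1990, count_2010)
pct = percent_change(count_1990, count_2010)
print(change, pct)
```

Because the ACS estimates carry margins of error, a computed change for a small group may fall within sampling noise and should be read with that in mind.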
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
NA: Not applicable, for cells where zero percent of the population fell into that category.
(1) Prevalences and standard errors are calculated using the survey weights from the 5-year visit provided with the dataset. These adjust for unequal probability of selection and response. Survey and subclass estimation commands were used to account for the complex sample design.
(2) Overweight/obesity is defined as a body mass index (BMI) z-score >2 standard deviations (SD) above the age- and sex-specific WHO Childhood Growth Standard reference mean at all time points except birth, where we define overweight/obesity as a weight-for-age z-score >2 SD above the age- and sex-specific WHO Childhood Growth Standard reference mean.
(3) To represent socioeconomic status, we used a composite index that captures multiple social dimensions of socioeconomic status. This composite index, provided in the ECLS-B data, incorporates information about maternal and paternal education, occupations, and household income to create a variable representing family socioeconomic status on several domains. The variable was created using principal components analysis to create a score for family socioeconomic status, which was then normalized by taking the difference between each score and the mean score and dividing by the standard deviation. If data needed for the composite socioeconomic status score were missing, they were imputed by the ECLS-B analysts [9].
(4) We created a 5-category race/ethnicity variable (American Indian/Alaska Native, African American, Hispanic, Asian, white) from the mothers' report of child's race/ethnicity, which originally came from 25 race/ethnic categories. To have adequate sample size in race/ethnic categories, we assigned a single race/ethnic category for children reporting more than one race, using an ordered, stepwise approach similar to previously published work using ECLS-B (3).
First, any child reporting at least one of his/her race/ethnicities as American Indian/Alaska Native (AIAN) was categorized as AIAN. Next, among remaining respondents, any child reporting at least one of his/her ethnicities as African American was categorized as African American. The same procedure was followed for Hispanic, Asian, and white, in that order. This order was chosen with the goal of preserving the highest numbers of children in the American Indian/Alaska Native group and other non-white ethnic groups in order to estimate relationships within ethnic groups, which is often not feasible due to low numbers.
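The ordered, stepwise assignment described above can be sketched as a first-match lookup over a priority list. This is an illustration only, assuming each child's reported categories are already mapped to the five labels; the function name and the example inputs are hypothetical and are not taken from the ECLS-B data.

```python
# Priority order as stated in the text: AIAN first, then African American,
# Hispanic, Asian, and white.
PRIORITY = [
    "American Indian/Alaska Native",
    "African American",
    "Hispanic",
    "Asian",
    "white",
]


def assign_category(reported):
    """Return the highest-priority category present in `reported`.

    `reported` is any iterable of category labels for one child.
    Returns None if none of the five labels is present.
    """
    reported = set(reported)
    for category in PRIORITY:
        if category in reported:
            return category
    return None


# Hypothetical multi-category reports.
examples = [
    {"white", "Asian"},
    {"Hispanic", "American Indian/Alaska Native"},
    {"white"},
]
assigned = [assign_category(r) for r in examples]
print(assigned)
```

Putting the smallest groups first in the list is what preserves their sample sizes: a child is counted in the first matching group and never reconsidered for a later one.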
https://data-seattlecitygis.opendata.arcgis.com/datasets/c66ae5121051454d8d88349c86b5ce31_0/license.json
Abstract: Census tract-based race and ethnicity data aggregated to City of Seattle Community Reporting Areas (CRAs) from the 1990 and 2010 Brown University Longitudinal Database (LTDB), 2010 decennial census and the 2014-2018 5-year American Community Survey (ACS). Brown University researchers created the LTDB to allow for comparing census data over time (see https://s4.ad.brown.edu/projects/diversity/Researcher/Bridging.htm). The race and ethnicity categories in the 2010 LTDB have been modified from those in the 2010 census to more closely match the 1990 race categories. (Before 2000, census questionnaires allowed respondents to identify as one race only. The LTDB allocates mixed-race people in post-1990 census estimates to non-white categories.) Please remember that the ACS data carry margins of error, and for small racial/ethnic groups they can be significant. The numeric and percentage changes over time are also included. There is also a polygon representation for the City of Seattle as a whole.
Purpose: Census data of racial and ethnic categories from 1990 and 2010 Brown University LTDB, 2010 decennial and 2018 American Community Survey (ACS). Data is for the City of Seattle Community Reporting Areas as well as a polygon representation for the City of Seattle as a whole. Numeric and percentage changes over time are also included.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
This dataset contains horse racing data from 1990 to 2020.
There are two different file types, races and horses, one pair for each year from 1990. I hope to update the current year data on a regular basis.
rid - Race id; course - Course of the race, country code in brackets, AW means All Weather, no brackets means UK; time - Time of the race in hh:mm format, London TZ; date - Date of the race; title - Title of the race; rclass - Race class; band - Band; ages - Ages allowed; distance - Distance; condition - Surface condition; hurdles - Hurdles, their type and number; prizes - Places prizes; winningTime - Best time shown; prize - Prizes total (sum of prizes column); metric - Distance in meters; countryCode - Country of the race; ncond - Condition type (created from condition feature); class - Class type (created from rclass feature).
rid - Race id; horseName - Horse name; age - Horse age; saddle - Saddle # where horse starts; decimalPrice - 1/Decimal price; isFav - Was the horse a favorite before the start? There can be more than one favorite in a race; trainerName - Trainer name; jockeyName - Jockey name; position - Finishing position, 40 if the horse didn't finish; positionL - How far the horse finished behind the horse it was pursuing, in horse lengths; dist - How far the horse finished behind the winner, in horse lengths; weightSt - Horse weight in St; weightLb - Horse weight in Lb; overWeight - Overweight code; outHandicap - Handicap; headGear - Head gear code; RPR - RP Rating; TR - Topspeed; OR - Official Rating; father - Horse's Father name; mother - Horse's Mother name; gfather - Horse's Grandfather name; runners - Runners total; margin - Sum of decimalPrices for the race; weight - Horse weight in kg; res_win - Horse won or not; res_place - Horse placed or not.
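Since decimalPrice is described as 1/decimal price and margin as the per-race sum of decimalPrices, the relationship between the two can be sketched as follows. The rows are hypothetical, and dividing each decimalPrice by the race margin to get probabilities that sum to 1 is a common convention assumed here, not something the dataset author describes.

```python
from collections import defaultdict

# Hypothetical rows using only the rid, horseName and decimalPrice fields.
horses = [
    {"rid": 1, "horseName": "A", "decimalPrice": 0.5},
    {"rid": 1, "horseName": "B", "decimalPrice": 0.375},
    {"rid": 1, "horseName": "C", "decimalPrice": 0.25},
    {"rid": 2, "horseName": "D", "decimalPrice": 0.75},
    {"rid": 2, "horseName": "E", "decimalPrice": 0.5},
]


def race_margins(rows):
    """Sum decimalPrice per race id, matching the 'margin' field description."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["rid"]] += row["decimalPrice"]
    return dict(totals)


def normalized_probabilities(rows):
    """Divide each decimalPrice by its race margin (assumed convention)."""
    margins = race_margins(rows)
    return {
        (row["rid"], row["horseName"]): row["decimalPrice"] / margins[row["rid"]]
        for row in rows
    }


margins = race_margins(horses)
probs = normalized_probabilities(horses)
print(margins)
print(probs)
```

A margin above 1 reflects the bookmaker's overround, which is one reason raw decimalPrice values should not be read directly as win probabilities.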
forward.csv contains information collected before a race starts. The odds are averages from Oddschecker.com; RPRc and TRc also have current values.
Please be aware that the prices provided are SP (starting prices), which are not available before a race starts. Prices before the start may therefore differ from SP. Favorites usually stay the same, though, and their prices are often higher than SP. In any case, you can't predict profit accurately based on SP alone.
I suppose predicting horse racing results with machine learning methods is a difficult task. There are no highly correlated features, and the outcome classes are imbalanced. I tried to make my own predictions, but with no luck. I hope to get some inspiration from your research. Please share your experience with everyone or just with me. Thank you!
The data provided has been collected from public open websites, without sign-ups, log-ins and other restrictions from sources. Please, do not use this data for any commercial purposes.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘🗳 Primary Candidates’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/yamqwe/primary-candidatese on 28 January 2022.
--- Dataset description provided by original source is as follows ---
This folder contains the data behind the stories:
- We Researched Hundreds Of Races. Here’s Who Democrats Are Nominating.
- How’s The Progressive Wing Doing In Democratic Primaries So Far?
- We Looked At Hundreds Of Endorsements. Here’s Who Republicans Are Listening To.
This project looks at patterns in open Democratic and Republican primary elections for the U.S. Senate, U.S. House and governor in 2018.
dem_candidates.csv
contains information about the 811 candidates who have appeared on the ballot this year in Democratic primaries for Senate, House and governor, not counting races featuring a Democratic incumbent, as of August 7, 2018.
rep_candidates.csv
contains information about the 774 candidates who have appeared on the ballot this year in Republican primaries for Senate, House and governor, not counting races featuring a Republican incumbent, through September 13, 2018.
Here is a description and source for each column in the accompanying datasets.
dem_candidates.csv and rep_candidates.csv include:
Column: Description
Candidate: All candidates who received votes in 2018’s Democratic primary elections for U.S. Senate, U.S. House and governor in which no incumbent ran. Supplied by Ballotpedia.
State: The state in which the candidate ran. Supplied by Ballotpedia.
District: The office and, if applicable, congressional district number for which the candidate ran. Supplied by Ballotpedia.
Office Type: The office for which the candidate ran. Supplied by Ballotpedia.
Race Type: Whether it was a “regular” or “special” election. Supplied by Ballotpedia.
Race Primary Election Date: The date on which the primary was held. Supplied by Ballotpedia.
Primary Status: Whether the candidate lost (“Lost”) the primary or won/advanced to a runoff (“Advanced”). Supplied by Ballotpedia.
Primary Runoff Status: “None” if there was no runoff; “On the Ballot” if the candidate advanced to a runoff but it hasn’t been held yet; “Advanced” if the candidate won the runoff; “Lost” if the candidate lost the runoff. Supplied by Ballotpedia.
General Status: “On the Ballot” if the candidate won the primary or runoff and has advanced to November; otherwise, “None.” Supplied by Ballotpedia.
Primary %: The percentage of the vote received by the candidate in his or her primary. In states that hold runoff elections, we looked only at the first round (the regular primary). In states that hold all-party primaries (e.g., California), a candidate’s primary percentage is the percentage of the total Democratic vote they received. Unopposed candidates and candidates nominated by convention (not primary) are given a primary percentage of 100 but were excluded from our analysis involving vote share. Numbers come from official results posted by the secretary of state or local elections authority; if those were unavailable, we used unofficial election results from the New York Times.
Won Primary: “Yes” if the candidate won his or her primary and has advanced to November; “No” if he or she lost.
dem_candidates.csv includes:
Column: Description
Gender: “Male” or “Female.” Supplied by Ballotpedia.
Partisan Lean: The FiveThirtyEight partisan lean of the district or state in which the election was held. Partisan leans are calculated by finding the average difference between how a state or district voted in the past two presidential elections and how the country voted overall, with 2016 results weighted 75 percent and 2012 results weighted 25 percent.
Race: “White” if we identified the candidate as non-Hispanic white; “Nonwhite” if we identified the candidate as Hispanic and/or any nonwhite race; blank if we could not identify the candidate’s race or ethnicity. To determine race and ethnicity, we checked each candidate’s website to see if he or she identified as a certain race. If not, we spent no more than two minutes searching online news reports for references to the candidate’s race.
Veteran?: If the candidate’s website says that he or she served in the armed forces, we put “Yes.” If the website is silent on the subject (or explicitly says he or she didn’t serve), we put “No.” If the field was left blank, no website was available.
LGBTQ?: If the candidate’s website says that he or she is LGBTQ (including indirect references like to a same-sex partner), we put “Yes.” If the website is silent on the subject (or explicitly says he or she is straight), we put “No.” If the field was left blank, no website was available.
Elected Official?: We used Ballotpedia, VoteSmart and news reports to research whether the candidate had ever held elected office before, at any level. We put “Yes” if the candidate has held elected office before and “No” if not.
Self-Funder?: We used Federal Election Commission fundraising data (for federal candidates) and state campaign-finance data (for gubernatorial candidates) to look up how much each candidate had invested in his or her own campaign, through either donations or loans. We put “Yes” if the candidate donated or loaned a cumulative $400,000 or more to his or her own campaign before the primary and “No” for all other candidates.
STEM?: If the candidate identifies on his or her website that he or she has a background in the fields of science, technology, engineering or mathematics, we put “Yes.” If not, we put “No.” If the field was left blank, no website was available.
Obama Alum?: We put “Yes” if the candidate mentions working for the Obama administration or campaign on his or her website, or if the candidate shows up on this list of Obama administration members and campaign hands running for office. If not, we put “No.”
Dem Party Support?: “Yes” if the candidate was placed on the DCCC’s Red to Blue list before the primary, was endorsed by the DSCC before the primary, or if the DSCC/DCCC aired pre-primary ads in support of the candidate. (Note: according to the DGA’s press secretary, the DGA does not get involved in primaries.) “No” if the candidate is running against someone for whom one of the above things is true, or if one of those groups specifically anti-endorsed or spent money to attack the candidate. If those groups simply did not weigh in on the race, we left the cell blank.
Emily Endorsed?: “Yes” if the candidate was endorsed by Emily’s List before the primary. “No” if the candidate is running against an Emily-endorsed candidate or if Emily’s List specifically anti-endorsed or spent money to attack the candidate. If Emily’s List simply did not weigh in on the race, we left the cell blank.
Gun Sense Candidate?: “Yes” if the candidate received the Gun Sense Candidate Distinction from Moms Demand Action/Everytown for Gun Safety before the primary, according to media reports or the candidate’s website. “No” if the candidate is running against a candidate with the distinction. If Moms Demand Action simply did not weigh in on the race, we left the cell blank.
Biden Endorsed?: “Yes” if the candidate was endorsed by Joe Biden before the primary. “No” if the candidate is running against a Biden-endorsed candidate or if Biden specifically anti-endorsed the candidate. If Biden simply did not weigh in on the race, we left the cell blank.
Warren Endorsed?: “Yes” if the candidate was endorsed by Elizabeth Warren before the primary. “No” if the candidate is running against a Warren-endorsed candidate or if Warren specifically anti-endorsed the candidate. If Warren simply did not weigh in on the race, we left the cell blank.
Sanders Endorsed?: “Yes” if the candidate was endorsed by Bernie Sanders before the primary. “No” if the candidate is running against a Sanders-endorsed candidate or if Sanders specifically anti-endorsed the candidate. If Sanders simply did not weigh in on the race, we left the cell blank.
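The Partisan Lean column description above gives a weighted-average formula that can be sketched in a few lines. This is an illustration under stated assumptions: the inputs are taken to be presidential vote margins in percentage points with positive values meaning more Democratic, and the function name and sample numbers are hypothetical rather than FiveThirtyEight's own code or data.

```python
def partisan_lean(local_2016, national_2016, local_2012, national_2012):
    """Weighted average of how far a state or district deviated from the nation.

    Weights follow the column description: 2016 counts 75 percent,
    2012 counts 25 percent. Inputs are margins in percentage points
    (assumed sign convention: positive = more Democratic).
    """
    diff_2016 = local_2016 - national_2016
    diff_2012 = local_2012 - national_2012
    return 0.75 * diff_2016 + 0.25 * diff_2012


# Hypothetical district: 10-point margin in 2016 against a 2-point national
# margin, and 8 points in 2012 against a 4-point national margin.
lean = partisan_lean(10.0, 2.0, 8.0, 4.0)
print(lean)
```

With these made-up inputs the district deviates by 8 points in 2016 and 4 points in 2012, so the heavier 2016 weight pulls the result toward the more recent election.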
This dataset contains natality data based on CDC-collected statistics for live births occurring within the United States to U.S. residents. The data capture a range of maternal demographic information, such as state and county of residence, mother's age and race, ethnicity and country of origin, marital status, and education. It also includes health and medical data on these mothers, including prior birth history, prenatal care visits, WIC enrollment, tobacco use, method of delivery, method of payment, and congenital anomalies and other morbidity data. Beyond maternal characteristics, this dataset also includes paternal and infant information that may be relevant to understanding certain social determinants of health. Paternal characteristics include age, race and ethnicity (including country of origin), and education. Infant characteristics include gender, birth weight, delivery, and congenital abnormalities. For researchers and population health teams, this data can be used to identify localities that have had higher-than-average rates of complicated and high-cost births and give insight into possible targeting strategies based on population characteristics. The data are derived from birth certificates reported to the CDC.