Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This dataset presents all the characteristics of the horses that raced in Hong Kong between 2017 and 2020. The data was taken from the Hong Kong Jockey Club website.
The meaning of the columns:
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
This dataset contains horse racing data from 1990 to 2020.
There are two file types, races and horses, with one pair for each year from 1990. I hope to update the current-year data on a regular basis.
Columns in the races files:
rid - Race id
course - Course of the race; country code in brackets, AW means All Weather, no brackets means UK
time - Time of the race in hh:mm format, London TZ
date - Date of the race
title - Title of the race
rclass - Race class
band - Band
ages - Ages allowed
distance - Distance
condition - Surface condition
hurdles - Hurdles, their type and number
prizes - Place prizes
winningTime - Best time shown
prize - Prizes total (sum of the prizes column)
metric - Distance in meters
countryCode - Country of the race
ncond - Condition type (created from the condition feature)
class - Class type (created from the rclass feature)
Columns in the horses files:
rid - Race id
horseName - Horse name
age - Horse age
saddle - Saddle number where the horse starts
decimalPrice - Reciprocal of the decimal price (1 / decimal price)
isFav - Was the horse a favorite before the start? There can be more than one favorite in a race
trainerName - Trainer name
jockeyName - Jockey name
position - Finishing position, 40 if the horse didn't finish
positionL - How far the horse finished behind the horse it was pursuing, in horse lengths
dist - How far the horse finished behind the winner, in horse lengths
weightSt - Horse weight in St
weightLb - Horse weight in Lb
overWeight - Overweight code
outHandicap - Handicap
headGear - Head gear code
RPR - RP Rating
TR - Topspeed
OR - Official Rating
father - Horse's father's name
mother - Horse's mother's name
gfather - Horse's grandfather's name
runners - Runners total
margin - Sum of decimalPrices for the race
weight - Horse weight in kg
res_win - Horse won or not
res_place - Horse placed or not
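The column descriptions above suggest two routine steps: joining horses to races on rid, and rebuilding margin as the per-race sum of decimalPrice. The following is a minimal sketch of both using only the standard library. The sample rows are made up for illustration, and the helper names (read_rows, join_on_rid, race_margins) are hypothetical, not part of the dataset. Dividing each decimalPrice by the race margin to get probabilities that sum to 1 is one common normalisation, shown here as an assumption rather than as the author's method.

```python
import csv
import io
from collections import defaultdict

# Made-up sample data; the real files contain many more columns and rows.
RACES_CSV = """rid,course,metric
101,Ascot,1600
"""
HORSES_CSV = """rid,horseName,decimalPrice,res_win
101,Alpha,0.5,1
101,Bravo,0.4,0
101,Charlie,0.25,0
"""


def read_rows(text):
    """Parse CSV text into a list of dicts (all values stay strings)."""
    return list(csv.DictReader(io.StringIO(text)))


def join_on_rid(races, horses):
    """Attach the race-level columns to every horse row sharing the same rid."""
    races_by_rid = {race["rid"]: race for race in races}
    return [
        {**races_by_rid[horse["rid"]], **horse}
        for horse in horses
        if horse["rid"] in races_by_rid
    ]


def race_margins(horses):
    """Sum decimalPrice (already 1 / decimal price) per race id."""
    totals = defaultdict(float)
    for horse in horses:
        totals[horse["rid"]] += float(horse["decimalPrice"])
    return dict(totals)


races = read_rows(RACES_CSV)
horses = read_rows(HORSES_CSV)
runners = join_on_rid(races, horses)
margins = race_margins(horses)

# Assumed normalisation: divide by the margin so each race sums to 1.
implied = {
    horse["horseName"]: float(horse["decimalPrice"]) / margins[horse["rid"]]
    for horse in horses
}
```

A margin above 1 reflects the bookmaker's overround, which is why the raw decimalPrice values cannot be read directly as probabilities.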
forward.csv contains information collected before a race starts. The odds are averages from Oddschecker.com; RPRc and TRc also have current values.
Please be aware that the prices provided are SP (starting prices), which are not available before a race starts. This means prices before the start may differ from SP. Favorites usually stay the same, though, and their prices are often higher than SP. In any case, you can't predict profit accurately from SP prices alone.
I suppose predicting horse racing results with machine learning methods is a difficult task. There are no highly correlated features, and the outcome classes are imbalanced. I tried to make my own predictions, but with no luck. I hope to get some inspiration from your research. Please share your experience with everyone or just with me. Thank you!
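To make the imbalance concrete: in a race with ten runners, at most one or two rows have res_win equal to 1. The sketch below measures the positive share and derives inverse-frequency class weights, the same heuristic scikit-learn uses for class_weight="balanced". The labels are made up, and the function names are hypothetical.

```python
def positive_share(labels):
    """Fraction of rows labelled 1 (for example, res_win)."""
    return sum(labels) / len(labels)


def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    n = len(labels)
    positives = sum(labels)
    negatives = n - positives
    return {0: n / (2 * negatives), 1: n / (2 * positives)}


# Made-up labels for one ten-runner race: a single winner.
res_win = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
share = positive_share(res_win)
weights = balanced_class_weights(res_win)
```

Weights like these can be passed to a classifier that supports per-class weighting, so that the rare winning rows are not drowned out by the losing ones.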
The data provided has been collected from public open websites, without sign-ups, log-ins and other restrictions from sources. Please, do not use this data for any commercial purposes.
https://data.go.kr/ugs/selectPortalPolicyView.do
Korea Racing Authority provides information on races held at racecourses in Seoul, Busan, Gyeongnam, and Jeju. (The information provided includes the name of the racecourse, race date, race day, race number, number of race days, race distance, grade conditions, burden classification, race conditions, age conditions, conditions by prize, weather, course, race name, 1st place prize, 2nd place prize, 3rd place prize, 4th place prize, 5th place prize, additional prize 1, additional prize 2, additional prize 3 data, as well as information on participating horses and rankings, starting number, horse name, English horse name, horse number, nationality, age, gender, burden weight, rating (grade), jockey name, English jockey name, jockey number, trainer name, English trainer name, trainer number, owner name, English owner name, owner number, race record, horse weight, and record data by race route section.) - If nothing is entered as a request variable among race year/race year month/race date, information for the past month based on the most recent race date is displayed.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Mean displacement minima and maxima based on collated stride data.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Summary of output from linear mixed models: F and significance (p) values.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Number of gallop runs analysed for each shoe-surface combination.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Estimated marginal means and confidence intervals for shoe effects.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Estimated marginal means and confidence intervals for surface effects.