8 datasets found
  1. Ultimate UFC Dataset

    • kaggle.com
    zip
    Updated Jun 22, 2020
    Cite
    mdabbert (2020). Ultimate UFC Dataset [Dataset]. https://www.kaggle.com/mdabbert/ultimate-ufc-dataset
    Explore at:
    Available download formats: zip (437477 bytes)
    Dataset updated
    Jun 22, 2020
    Authors
    mdabbert
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Context

    There are some great UFC datasets available on Kaggle. I want to bring together all of those sets into one set to allow for deeper analysis.

    Content

    Version 4 has data updated through June 22nd, 2020

    Version 4 of this dataset includes: Rajeev Warrier's excellent dataset. This dataset was the basis for my work. It contains data for every UFC bout. The 'red fighter' and 'blue fighter' are improperly recorded prior to around 2010, so that data has been excluded. Additionally, features that could not be easily scraped by me for future fights have been removed.

    My odds dataset. My big contribution was the gambling odds for each fight.

    Mart Jürisoo's Rankings dataset. Includes a history of UFC fighter rankings. A wonderful resource that could have a lot of implications for machine learning models.

    There are 108 columns of data. I have added a detailed description to the data file.

    Additions to the Datasets

    I have created some new features for this dataset. Highlights include a set of differential features (age_dif, avg_td_dif, reach_dif, ...), each computed as the blue fighter's value minus the red fighter's value. The 'empty_arena' feature denotes whether the fight took place in an empty arena.
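The differential features described here can be sketched in a few lines. This is only an illustration, not the author's code: the `B_`/`R_` input keys, the helper name `add_differentials`, and the sample values are assumptions. Only the "blue minus red" rule and the `_dif` suffix come from the description.

```python
def add_differentials(fight, features):
    """Return a copy of `fight` with `<feature>_dif` = blue value minus red value."""
    out = dict(fight)
    for name in features:
        blue = fight.get(f"B_{name}")  # hypothetical input key
        red = fight.get(f"R_{name}")   # hypothetical input key
        # Leave the differential empty when either side is missing.
        out[f"{name}_dif"] = None if blue is None or red is None else blue - red
    return out


# Made-up row for illustration only; not taken from the dataset.
fight = {"B_age": 29, "R_age": 33, "B_reach": 180.0, "R_reach": 185.0}
result = add_differentials(fight, ["age", "reach"])
print(result["age_dif"], result["reach_dif"])  # -4 -5.0
```

A negative value therefore means the red fighter has the larger value for that feature.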

    Update Schedule

    I plan on uploading a file of upcoming fights before every event and updating the main csv after every event.

    My TODO list:

    • A kernel showing how to build models to predict winning bets
    • Add opening and closing betting line information to the dataset
    • Add a file for upcoming events
    • Add DaveRosenman's PPV Sales Data
    • Add a rank differential column

    Want more?

    Poke around my GitHub for this project. Sorry for the lack of documentation. I'll get around to it!

  2. UFC-Fight historical data from 1993 to 2021

    • kaggle.com
    zip
    Updated Mar 21, 2021
    Cite
    Rajeev Warrier (2021). UFC-Fight historical data from 1993 to 2021 [Dataset]. https://www.kaggle.com/rajeevw/ufcdata
    Explore at:
    Available download formats: zip (3876811 bytes)
    Dataset updated
    Mar 21, 2021
    Authors
    Rajeev Warrier
    License

    CC0 1.0 https://creativecommons.org/publicdomain/zero/1.0/

    Description

    UPDATE

    This dataset got a lot of love from the community and I saw many people asking for an updated version, so I have uploaded the latest scraped and processed data (as of 21/03/2021). It is now easy for anyone to get the latest dataset with a single command, so if you need bleeding-edge data or want to see the code, you can look here. Hope this solves all problems! If there are any issues with the data, please forgive me and write about them in the comments or raise an issue on GitHub. I will pick it up 👍 Thank you everyone for the emails and messages. As usual, have fun! ❤️ 😁

    Context

    This is a list of every UFC fight in the history of the organisation. Every row contains information about both fighters, fight details and the winner. The data was scraped from the ufcstats website, which came into the picture after fightmetric ceased to exist. I saw that there was a lot of information on the website about every fight and every event, and there was no existing way of capturing all of it. I used beautifulsoup to scrape the data and pandas to process it. It was a long and arduous process, so please forgive any mistakes. I have provided the raw files in case anybody wants to process them differently. This is my first time creating a dataset, so any suggestions and corrections are welcome! In case anyone wants to check out the work, I have uploaded all the code files, including the scraping module, here

    Have fun!

    Content

    Each row is a compilation of both fighters' stats. Fighters are represented by 'red' and 'blue' (for red and blue corner). For instance, the red fighter has the compiled average stats of all of their fights except the current one. The stats include damage done by the red fighter to the opponent and damage done by the opponent to the fighter (represented by 'opp' in the columns) across all the fights this red fighter has had, excluding the current one because it has not occurred yet (in the data). The same information exists for the blue fighter. The target variable is 'Winner', which is the only column that tells you what happened.
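The "all fights except the current one" rule can be illustrated with a running average that is updated only after each row is built. This is a minimal sketch under stated assumptions, not the dataset's actual processing code: the function name, the (date, fighter, value) record layout, and the sample numbers are all hypothetical.

```python
from collections import defaultdict


def averages_before_each_fight(records):
    """Pair every record with the fighter's mean stat over EARLIER fights only.

    `records` is an iterable of (date, fighter, value) tuples with ISO dates.
    The current fight's value is added to the running totals only after its
    row is built, so it never leaks into its own features.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    rows = []
    for date, fighter, value in sorted(records):
        prior = totals[fighter] / counts[fighter] if counts[fighter] else None
        rows.append((date, fighter, prior))
        totals[fighter] += value
        counts[fighter] += 1
    return rows


# Made-up numbers for illustration only.
records = [
    ("2019-01-01", "Fighter A", 40),
    ("2019-06-01", "Fighter A", 60),
    ("2020-01-01", "Fighter A", 20),
]
rows = averages_before_each_fight(records)
print([prior for _, _, prior in rows])  # [None, 40.0, 50.0]
```

A debut fight has no earlier history, so its average is left empty rather than set to zero.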

    Column definitions:

    • R_ and B_ prefixes signify red and blue corner fighter stats respectively
    • Columns containing _opp_ hold the average of damage done by the opponent to the fighter
    • KD is number of knockdowns
    • SIG_STR is no. of significant strikes 'landed of attempted'
    • SIG_STR_pct is significant strikes percentage
    • TOTAL_STR is total strikes 'landed of attempted'
    • TD is no. of takedowns
    • TD_pct is takedown percentages
    • SUB_ATT is no. of submission attempts
    • PASS is no. of times the guard was passed
    • REV is the no. of Reversals landed
    • HEAD is no. of significant strikes to the head 'landed of attempted'
    • BODY is no. of significant strikes to the body 'landed of attempted'
    • CLINCH is no. of significant strikes in the clinch 'landed of attempted'
    • GROUND is no. of significant strikes on the ground 'landed of attempted'
    • win_by is method of win
    • last_round is last round of the fight (ex. if it was a KO in 1st, then this will be 1)
    • last_round_time is when the fight ended in the last round
    • Format is the format of the fight (3 rounds, 5 rounds etc.)
    • Referee is the name of the Ref
    • date is the date of the fight
    • location is the location in which the event took place
    • Fight_type is which weight class and whether it's a title bout or not
    • Winner is the winner of the fight
    • Stance is the stance of the fighter (orthodox, southpaw, etc.)
    • Height_cms is the height in centimeters
    • Reach_cms is the reach of the fighter (arm span) in centimeters
    • Weight_lbs is the weight of the fighter in pounds (lbs)
    • age is the age of the fighter
    • title_bout is a Boolean value indicating whether it is a title fight or not
    • weight_class is which weight class the fight is in (Bantamweight, heavyweight, Women's flyweight, etc.)
    • no_of_rounds is the number of rounds the fight was scheduled for
    • current_lose_streak is the count of current consecutive losses of the fighter
    • current_win_streak is the count of current consecutive wins of the fighter
    • draw is the number of draws in the fighter's ufc career
    • wins is the number of wins in the fighter's ufc career
    • losses is the number of losses in the fighter's ufc career
    • total_rounds_fought is the average of total rounds fought by the fighter
    • total_time_fought(seconds) is the count of total time spent fighting in seconds
    • total_title_bouts is the total number of title bouts taken part in by the fighter
    • win_by_Decision_Majority is the number of wins by majority judges decision in the fighter's ufc career
    • win_by_Decision_Split is the number of wins by split judges decision in the fighter's ufc career
    • win_by_Decision_Unanimous is the number of wins by unanimous judges decision in the fighter's ufc career
    • win_by_KO/TKO is the number of wins by knockout in the fighter's ufc career
    • win_by_Submission is the number of wins by submission in the fighter's ufc career
    • win_by_TKO_Doctor_Stoppage is the number of wins by doctor stoppage in the fighter's ufc career

    Acknowledgements

    Info about me

    You can check out who I am and what I do here

  3. UFC Fights (2010 - 2020) with Betting Odds

    • kaggle.com
    Updated May 16, 2020
    Cite
    mdabbert (2020). UFC Fights (2010 - 2020) with Betting Odds [Dataset]. https://www.kaggle.com/datasets/mdabbert/ufc-fights-2010-2020-with-betting-odds
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    May 16, 2020
    Dataset provided by
    Kaggle
    Authors
    mdabbert
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Context

    There are some great UFC datasets out there, but I could not find one that included gambling odds, so I went and made one myself. This dataset focuses very generally on the fights and hopes to be able to draw very broad conclusions. For a more in-depth statistical fight analysis I would recommend Rajeev Warrier's excellent dataset, which was the inspiration for my work.

    Content

    This dataset consists of 11 columns of data with basic information about every match that took place between March 21, 2010 and March 14, 2020.

    Column Definitions:

    • R_fighter and B_fighter: The names of the fighter in the red corner and the fighter in the blue corner
    • R_odds and B_odds: The American odds of the fighter winning
    • date: The date of the fight
    • location: The location of the fight
    • country: The country the fight occurred in
    • Winner: The winner of the fight ('Red' or 'Blue')
    • title_bout: Was this fight a title bout? ('True' or 'False')
    • weight_class: What weight class did this fight occur at?
    • gender: Male or Female
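Since R_odds and B_odds are American odds, a common first step is converting them to implied probabilities. The sketch below uses the standard moneyline conversion; the function name and the sample odds are illustrative and do not come from the dataset.

```python
def implied_probability(american_odds):
    """Convert American (moneyline) odds to the bookmaker's implied probability."""
    if american_odds == 0:
        raise ValueError("American odds cannot be zero")
    if american_odds < 0:
        # Favorite: stake |odds| to win 100.
        return -american_odds / (-american_odds + 100)
    # Underdog: stake 100 to win `odds`.
    return 100 / (american_odds + 100)


# Made-up odds for illustration only.
red = implied_probability(-200)
blue = implied_probability(150)
print(round(red, 4), round(blue, 4))  # 0.6667 0.4
print(round(red + blue - 1, 4))       # 0.0667
```

The two probabilities sum to more than 1. The excess is the bookmaker's margin, so dividing each value by the sum gives a normalized estimate.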

    Acknowledgements

    I was inspired by the work of Rajeev Warrier

    Want More?

    My work, including a scraper to help gather data for upcoming events, can be found on my GitHub. I promise I'll add more documentation soon.

  4. UFC 263 Contest Dummy Submission

    • kaggle.com
    zip
    Updated Jun 6, 2021
    Cite
    mdabbert (2021). UFC 263 Contest Dummy Submission [Dataset]. https://www.kaggle.com/mdabbert/ufc-263-contest-dummy-submission
    Explore at:
    Available download formats: zip (488 bytes)
    Dataset updated
    Jun 6, 2021
    Authors
    mdabbert
    Description

    Dataset

    This dataset was created by mdabbert

    Contents

    It contains the following files:

  5. Data from: Integrated Approaches to Manage Multi-Case Families in the...

    • catalog.data.gov
    • datasets.ai
    • +1 more
    Updated Mar 12, 2025
    Cite
    National Institute of Justice (2025). Integrated Approaches to Manage Multi-Case Families in the Criminal Justice System in Maricopa County, Arizona, and Deschutes and Jackson Counties, Oregon, 1999-2005 [Dataset]. https://catalog.data.gov/dataset/integrated-approaches-to-manage-multi-case-families-in-the-criminal-justice-system-in-1999-20a01
    Explore at:
    Dataset updated
    Mar 12, 2025
    Dataset provided by
    National Institute of Justice http://nij.ojp.gov/
    Area covered
    Jackson County, Maricopa County, Arizona
    Description

    The project goal was to collect data on approximately 100 Unified Family Court (UFC) cases at each of the three selected jurisdictions -- Maricopa County, Arizona, Deschutes County, Oregon, and Jackson County, Oregon -- that have developed systems to address the special needs of families with multiple court cases. The purpose of the study was to examine research questions related to: (1) dependency case processing and outcomes, (2) delinquency case processing and outcomes, (3) domestic relations/probate case processing and outcomes, and (4) criminal case processing and outcomes. The data used in this study were generated from a review of the court records of 602 families including 406 families served by the UFC as well as comparison groups of 196 non-UFC multi-case families. During the study's planning phase, an instrument was drafted for use in extracting this information. Data collectors were recruited from former UFC staff and current and former non-UFC court staff. All data collectors were trained by the principal investigator in the use of the data collection form. The vast majority of all data extraction required a manual review of paper files. Variables in this dataset are organized into the following categories: background variables, items from dependency/abuse and neglect filings, delinquency filings, domestic relations/probate filings, civil domestic violence/protection order filings, criminal domestic violence filings, criminal child abuse filings, other criminal filings, and variables from a summary across cases.

  6. UFC Contest Dummy Submission 2021-07-17 Contest

    • kaggle.com
    zip
    Updated Jul 12, 2021
    Cite
    mdabbert (2021). UFC Contest Dummy Submission 2021-07-17 Contest [Dataset]. https://www.kaggle.com/mdabbert/ufc-contest-dummy-submission-20210717-contest
    Explore at:
    Available download formats: zip (455 bytes)
    Dataset updated
    Jul 12, 2021
    Authors
    mdabbert
    Description

    Dataset

    This dataset was created by mdabbert

    Contents

    It contains the following files:

  7. Unified Factory Spolka Akcyjna Financial Reports

    • financialreports.eu
    Updated Mar 3, 2023
    Cite
    FinancialReports UG (2023). Unified Factory Spolka Akcyjna Financial Reports [Dataset]. https://financialreports.eu/companies/unified-factory-spolka-akcyjna/
    Explore at:
    Dataset updated
    Mar 3, 2023
    Dataset authored and provided by
    FinancialReports UG
    License

    https://financialreports.eu/

    Time period covered
    2022 - Present
    Description

    Comprehensive collection of financial reports and documents for Unified Factory Spolka Akcyjna (UFC)

  8. Cleaned_UFC_Data

    • kaggle.com
    zip
    Updated Mar 2, 2020
    Cite
    Jeremy Sun zhaocheng (2020). Cleaned_UFC_Data [Dataset]. https://www.kaggle.com/jeremysunzhaocheng/cleaned-ufc-data
    Explore at:
    Available download formats: zip (1088780 bytes)
    Dataset updated
    Mar 2, 2020
    Authors
    Jeremy Sun zhaocheng
    Description

    Dataset

    This dataset was created by Jeremy Sun zhaocheng

    Contents

    It contains the following files:

Ultimate UFC Dataset

Merging All Kaggle Public UFC Datasets

Explore at:
2 scholarly articles cite this dataset (View in Google Scholar)
Available download formats: zip (437477 bytes)
