18 datasets found
  1. PMData

    • kaggle.com
    • huggingface.co
    zip
    Updated Apr 19, 2021
    Cite
    VT (2021). PMData [Dataset]. https://www.kaggle.com/datasets/vlbthambawita/pmdata-a-sports-logging-dataset/discussion
    Explore at:
    Available download formats: zip (1401630710 bytes)
    Dataset updated
    Apr 19, 2021
    Authors
    VT
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Paper: https://dl.acm.org/doi/10.1145/3339825.3394926

    We present PMData, a dataset that aims to combine traditional lifelogging with sports activity logging. Such a dataset enables the development of several interesting analysis applications, e.g., where additional sports data can be used to predict and analyze everyday developments like a person's weight and sleep patterns, and where traditional lifelog data can be used in a sports context to predict an athlete's performance. In this respect, we have used the Fitbit Versa 2 smartwatch wristband, the PMSys sports logging app, and Google Forms for the data collection, and PMData contains logging data for 5 months from 16 persons. Our initial experiments show that such analyses are possible, but there is still considerable room for improvement.

    Dataset details

    The structure of the main folder:


    • [Main folder]
      • p01
      • p02
      • ...
      • p16
      • participant-overview.xlsx

    The structure of each sub folder (pXX):

    • pXX [folder]: is a folder containing data of participant XX (notation XX represents the identifier of the participant).

      • fitbit [folder]

        • calories.json: shows how many calories the person has burned in the last minute.

        • distance.json: gives the distance moved per minute. Distance seems to be in centimeters.

        • exercise.json: describes each activity in more detail. It contains the date with start and stop time, time in different activity levels, type of activity and various performance metrics depending a bit on type of exercise, e.g., for running, it contains distance, time, steps, calories, speed and pace.

        • heart_rate.json: shows the number of heart beats per minute (bpm) at a given time.

        • lightly_active_minutes.json: sums up the number of lightly active minutes per day.

        • moderately_active_minutes.json: sums up the number of moderately active minutes per day.

        • resting_heart_rate.json: gives the resting heart rate per day.

        • sedentary_minutes.json: sums up the number of sedentary minutes per day.

        • sleep_score.csv: helps understand the sleep each night so you can see trends in the sleep patterns. It contains an overall 0-100 score made up from composition, revitalization and duration scores, the number of deep sleep minutes, the resting heart rate and a restlessness score.

        • sleep.json: is a per sleep breakdown of the sleep into periods of light, deep, rem sleeps and time awake.

        • steps.json: displays the number of steps per minute.

        • time_in_heart_rate_zones.json: gives the number of minutes in different heart rate zones. Using the common formula of 220 minus your age, Fitbit calculates your maximum heart rate and then creates three target heart rate zones based on that number: fat burn (50 to 69 percent of your max heart rate), cardio (70 to 84 percent of your max heart rate), and peak (85 to 100 percent of your max heart rate).

        • very_active_minutes.json: sums up the number of very active minutes per day.

      • googledocs [folder]

        • reporting.csv: contains one line per report, including the date reported for, a timestamp of the report submission time, the eaten meals (breakfast, lunch, dinner and evening meal), the participant's weight that day, the number of glasses drunk, and whether one has consumed alcohol.
      • pmsys [folder]

        • injury.csv: shows injuries with a time and date and corresponding injury locations and a minor and major severity.

        • srpe.csv: contains a training session’s end-time, type of activity, the perceived exertion (RPE), and the duration in the number of minutes. This is, for example, used to calculate the session’s training load or sRPE (RPE×duration).

        • wellness.csv: includes parameters like time and date, fatigue, mood, readiness, sleep duration (number of hours), sleep quality, soreness (and soreness area), and stress. Fatigue, sleep quality, soreness, stress, and mood all have a 1-5 scale. The score 3 is normal, 1-2 are scores below normal, and 4-5 are scores above normal. Sleep length is just a measure of how long the sleep was in hours, and readiness (scale 0-10) is an overall subjective measure of how ready you are to exercise, i.e., 0 means not ready at all and 10 indicates that you cannot feel any better and are ready for anything!

        • food-images.zip: Participants 1, 3 and 5 have taken pictures of everything they have eaten except water during 2 months (February and March). There are food images included in this .zip file, and information about day and time is given in the...
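
    The heart rate zone rule and the sRPE formula described in the file list above can be sketched in a few lines of Python. This is a minimal illustration, not code shipped with PMData: the function names are hypothetical, and treating each zone's lower bound as inclusive (so that values between 69 and 70 percent fall into the lower zone) is an assumption.

```python
def max_heart_rate(age):
    """Estimated maximum heart rate using the 220-minus-age formula."""
    return 220 - age


def heart_rate_zone(bpm, age):
    """Classify a heart rate into the three zones from the description.

    Assumption: lower bounds are inclusive, so e.g. 69.5% of the maximum
    still counts as "fat burn". Below 50% is labeled "out of zone".
    """
    pct = 100.0 * bpm / max_heart_rate(age)
    if pct < 50:
        return "out of zone"
    if pct < 70:
        return "fat burn"
    if pct < 85:
        return "cardio"
    return "peak"


def session_load(rpe, duration_minutes):
    """Session training load (sRPE) = RPE x duration in minutes."""
    return rpe * duration_minutes
```

    For a 30-year-old the estimated maximum is 190 bpm, so 150 bpm (about 79 percent) falls in the cardio zone, and a 60-minute session rated RPE 7 gives a load of 420.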

  2. WNBA Play-by-Play and Box Scores

    • kaggle.com
    zip
    Updated Oct 27, 2025
    Cite
    ZachHT (2025). WNBA Play-by-Play and Box Scores [Dataset]. https://www.kaggle.com/datasets/zachht/wnba-play-by-play-and-box-scores
    Explore at:
    Available download formats: zip (27409879 bytes)
    Dataset updated
    Oct 27, 2025
    Authors
    ZachHT
    License

    http://opendatacommons.org/licenses/dbcl/1.0/

    Description

    Dataset

    This dataset was created by ZachHT

    Released under Database: Open Database, Contents: Database Contents

    Contents

  3. Daily Fantasy Basketball - DraftKings NBA

    • kaggle.com
    zip
    Updated Dec 29, 2017
    Cite
    Alan Du (2017). Daily Fantasy Basketball - DraftKings NBA [Dataset]. https://www.kaggle.com/alandu20/daily-fantasy-basketball-draftkings
    Explore at:
    Available download formats: zip (269586908 bytes)
    Dataset updated
    Dec 29, 2017
    Authors
    Alan Du
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Context

    In Daily Fantasy Sports (DFS) contests, contestants construct a virtual lineup of players that score points based on their real-world performances. Unlike in season-long Fantasy Sports contests, in DFS contestants submit a new lineup for each set of games. DFS contests are held for several professional sports leagues, including the National Football League (NFL), National Basketball Association (NBA), and National Hockey League (NHL). The leading DFS sites today are DraftKings and FanDuel, which control approximately 90% of the $3B DFS market.

    There are three primary types of DFS games: Head-to-Heads (H2Hs), Double-Ups, and Guaranteed Prize Pools (GPPs). In H2H games, two contestants play for a single cash prize. In Double-Up games, a pool of contestants compete to place in the top 50% of lineups, which are awarded twice the entry fee. In GPPs, a pool of contestants compete for a fixed prize structure that tends to be very top-heavy; some contests pay out hundreds of thousands of dollars to the top finisher.
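
    As a rough illustration of the Double-Up rule just described, the sketch below pays twice the entry fee to the top half of a contest. The function name and input shape are hypothetical, ties are broken by sort order, and real contests may apply further rules that are not modeled here.

```python
def double_up_payouts(scores, entry_fee):
    """Return a payout per contestant for a Double-Up style contest.

    scores: dict mapping contestant name -> lineup score (hypothetical shape).
    The top 50% of lineups (rounded down) receive twice the entry fee.
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    n_winners = len(ranked) // 2
    winners = set(ranked[:n_winners])
    return {name: (2 * entry_fee if name in winners else 0) for name in scores}
```

    With four entries scoring 10, 30, 20 and 5 and a fee of 5, the two highest-scoring lineups each receive 10 and the other two receive nothing.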

    Over the last year, I have developed a winning system for daily fantasy football and baseball contests. Building this system from scratch was a fantastic complement to the things I learned as a student, from machine learning and optimization to optimal learning and game theory. I hope others can join me in researching daily fantasy basketball and perhaps get involved with the burgeoning world of daily fantasy sports.

    Content

    This dataset contains 20 days of DraftKings NBA contest data scraped between 2017-11-27 and 2017-12-28. For DraftKings NBA daily fantasy basketball contest rules, see https://www.draftkings.com/help/rules/nba.

    Format:

    • One folder per day
    • One folder per contest for a given day
    • Salary file (“DKSalaries.csv”), payout structure file (“payout_structure.csv”), and contest results file (“contest-standings.csv”) for a given contest. Column headers in each file are pretty self-explanatory.
    • Some additional files (e.g. “players.csv”, “covariance_mat_unfiltered.csv”, “hist_fpts_mat.csv”) for a given contest. These files were for my personal research; feel free to use or ignore them.
    • “projections” folder contains projections data for each player from rotogrinders and daily fantasy nerd, labeled by date.
    • “contests.csv” contains information about each contest, e.g. entry fee, slate, and contest size.

    Acknowledgements

    Thank you to my friend from college, Michael Chiang, for contributions to this project.

    Inspiration

    A few ideas to get started:

    • What kind of position "stacks" tend to maximize correlation within a lineup?
    • How can you minimize correlation between lineups, such that you maximize your chances of winning a GPP?
    • What are the tendencies of some of the top DFS pros?
    • Can you improve rotogrinders and daily fantasy nerd player projections?
    • Can you predict which players are undervalued (i.e. high fantasy points / salary ratio)?
    • Can you predict the ownership percentage for each player in a given contest?
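
    One of the ideas above, finding undervalued players via the fantasy points / salary ratio, can be sketched as follows. The dictionary keys (name, salary, fpts) are hypothetical and are not the dataset's actual column headers, and scaling the ratio to points per $1,000 of salary is an assumption made for readability.

```python
def rank_by_value(players):
    """Rank players by fantasy points per $1,000 of salary, best first.

    players: iterable of dicts with hypothetical keys
    "name", "salary" and "fpts". Players with no salary are skipped.
    """
    rated = [
        (p["name"], p["fpts"] * 1000 / p["salary"])
        for p in players
        if p["salary"] > 0
    ]
    return sorted(rated, key=lambda pair: pair[1], reverse=True)
```

    A player with 28 points at a $4,000 salary (7.0 per $1,000) would rank above one with 50 points at $10,000 (5.0 per $1,000).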
  4. mega_nba2023-2024

    • kaggle.com
    zip
    Updated Nov 17, 2025
    + more versions
    Cite
    monogurui_ii (2025). mega_nba2023-2024 [Dataset]. https://www.kaggle.com/datasets/rickmcintire/mighunba2024
    Explore at:
    Available download formats: zip (31068 bytes)
    Dataset updated
    Nov 17, 2025
    Authors
    monogurui_ii
    Description

    game simulator (basketball): NBA 2023-2024

    The aim of this project is to generate simulations of basketball games between NBA finals teams for the 2023-2024 season for the purpose of modeling predicted outcomes from a player efficiency metric (the "r metric").

    A simulated 82 game season will be run daily.

    2022-2023 box score statistics for players (on a per 100 possessions basis) were gathered from https://www.basketball-reference.com/.

    The players' stats were filtered and transformed to reflect a focus on box score stats measuring playing efficiency, as opposed to measures of volume. For example, Real Shooting Percentage (True Shooting Percentage adjusted for volume, based on points generated above average) was incorporated into the metric as opposed to Points Per Game; Adjusted Assist to Turnover Ratio (Assist to Turnover adjusted for volume, based on assists to turnovers generated above average) was incorporated as opposed to Assists Per Game. The complete list of stats used for the r metric is as follows:

    Real Shooting Percentage
    Offensive Rebounds
    Adjusted Assist to Turnover Ratio
    Steals
    Blocks
    Personal Fouls 
    

    The r metric efficiency rating was derived from performing a boosted regression on the overall team stats for a selection of teams for NBA seasons from 1980 to the present against their Point Differential and then applying the resulting predicted values to individual players.

    An R function was created to generate simulated game outcomes from a Kaggle notebook. The output is produced as a ggplot (visualizing the r metric, in pink, against the traditional box score stats, coded by team in blue/red) and a csv file as a box score. The notebook is scheduled to run daily, randomly selecting teams to play against one another and generating an outcome based on the player stats and metric for each team with an element of random variation.

  5. Sports - Cricket: Year- and Match-wise Scores, Winners, Victory Margins and...

    • dataful.in
    Updated Nov 6, 2025
    + more versions
    Cite
    Dataful (Factly) (2025). Sports - Cricket: Year- and Match-wise Scores, Winners, Victory Margins and Season Winners in ODI World Cups, since 1975 [Dataset]. https://dataful.in/datasets/5809
    Explore at:
    Available download formats: application/x-parquet, xlsx, csv
    Dataset updated
    Nov 6, 2025
    Dataset authored and provided by
    Dataful (Factly)
    License

    https://dataful.in/terms-and-conditions

    Area covered
    Countries of the World
    Variables measured
    Matches
    Description

    The dataset contains year- and match-wise historical data on each match played in all the ODI World Cups since 1975. For each match, the data include the year in which the World Cup was held, venue, first and second batting teams, their scores, results, winners, winning margins by number of runs or wickets, and type of match (league match, quarter-final, semi-final, final, etc.), along with the names of the host country and the season winner.

  6. France and Germany Football Leagues Dataset

    • kaggle.com
    zip
    Updated Oct 6, 2024
    Cite
    Gökhan Ergül (2024). France and Germany Football Leagues Dataset [Dataset]. https://www.kaggle.com/datasets/gokhanergul/france-and-germany-football-leagues-dataset
    Explore at:
    Available download formats: zip (363605 bytes)
    Dataset updated
    Oct 6, 2024
    Authors
    Gökhan Ergül
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Area covered
    Germany, France
    Description

    France and Germany Football Leagues Dataset

    This dataset contains match data from the top football leagues and cup competitions in France and Germany. The dataset provides comprehensive information about home and away teams, their scores, match dates, and seasons. It is a valuable resource for football enthusiasts, data scientists, and analysts interested in exploring football statistics and trends across two of Europe's biggest football nations.

    Dataset Summary:

    • Total Matches: 38,596
    • Countries Covered: France, Germany
    • Leagues Included:
      • Ligue 1 (France)
      • Ligue 2 (France)
      • Coupe de France
      • Bundesliga (Germany)
      • 2. Bundesliga (Germany)
      • DFB-Pokal (Germany)

    Column Descriptions:

    • Country: The country where the match took place (France or Germany).
      Example values: 'France', 'Germany'

    • Lig: The specific league or cup in which the match was played. This column captures whether the match was part of Ligue 1, Ligue 2, Coupe de France, Bundesliga, 2. Bundesliga, or DFB-Pokal.
      Example values: 'Ligue 1', 'Bundesliga', 'DFB-Pokal'

    • home_team: The name of the home team in the match.
      Example values: 'Paris Saint-Germain', 'Bayern Munich'

    • away_team: The name of the away team in the match.
      Example values: 'Olympique Lyonnais', 'Borussia Dortmund'

    • home_score: The number of goals scored by the home team in the match.
      Example values: '3', '0'

    • away_score: The number of goals scored by the away team in the match.
      Example values: '1', '2'

    • season_year: The season in which the match took place. Typically, football seasons run from one year to the next (e.g., 2022-2023 season).
      Example values: '2022/2023', '2021/2022'

    • Date_day: The specific day on which the match was played, formatted as day and month (dd.mm).
      Example values: '05.01', '29.09'

    • Date_hour: The hour and minute the match kicked off, formatted as hh:mm.
      Example values: '20:45', '18:30'
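
    Because Date_day carries only the day and month, a full kickoff timestamp has to be rebuilt from season_year. The sketch below shows one way to do that, assuming seasons start in July so that months from July onward belong to the first year of the season and earlier months to the second. That cutoff, the function name and the season_start_month parameter are assumptions, not part of the dataset.

```python
from datetime import datetime


def match_datetime(season_year, date_day, date_hour, season_start_month=7):
    """Combine season_year ('2022/2023'), Date_day ('dd.mm') and
    Date_hour ('hh:mm') into a single datetime.

    Assumption: months >= season_start_month fall in the season's first
    calendar year, all other months in its second.
    """
    first_year, second_year = (int(y) for y in season_year.split("/"))
    day, month = (int(part) for part in date_day.split("."))
    hour, minute = (int(part) for part in date_hour.split(":"))
    year = first_year if month >= season_start_month else second_year
    return datetime(year, month, day, hour, minute)
```

    Under this assumption, a 2022/2023 match dated '05.01' resolves to 5 January 2023, while one dated '29.09' resolves to 29 September 2022.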

    Use Cases:

    This dataset can be used for various purposes, including:

    • Analyzing team performance trends over different seasons.
    • Comparing goal-scoring patterns in home vs. away matches.
    • Building predictive models to forecast match outcomes based on historical data.
    • Understanding football dynamics in France and Germany through data visualizations.

    Feel free to explore and use this dataset to draw your own insights and conclusions!

  7. NLL Statistics

    • kaggle.com
    zip
    Updated Dec 15, 2020
    Cite
    apalensky (2020). NLL Statistics [Dataset]. https://www.kaggle.com/apalensky/nll-statistics
    Explore at:
    Available download formats: zip (816536 bytes)
    Dataset updated
    Dec 15, 2020
    Authors
    apalensky
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Context

    Lacrosse lags behind the big four sports in data-driven insights. As the third largest indoor league, the National Lacrosse League could be next up in sports analytics breakthroughs. I will work to draw insights from this data and hope others can enjoy the same process. Insights and more files can be accessed in my GitHub repository. Some basic Tableau workbooks are available through my Tableau Public account.

    Content

    Floor player stats by game for every publicly available box score.

    Legend for NLLFloorGameStats.csv:

    Day - day of the week game was played
    Date - date of game
    Location - hosting team
    # - jersey number
    Name - player name
    Captain - denotes Captain and Alternate Captains
    Team - player's team
    G - goals
    A - assists
    +/- - score differential while player is on the floor
    PIM - penalty minutes
    S - shots on goal
    SOFF - shots off goal
    LB - loose balls
    T - turnovers
    CT - caused turnovers
    FO_W - faceoff wins
    FO - total faceoffs taken
    TOF - time on floor
    

    Changes to the legend for all yearly files:

    Points - sum of goals and assists
    PM - score differential while player is on the floor (replacement for +/-)
    ....PG - statistic average per game
    ....p60 - statistic per 60 minutes of floor time (only applicable to 2020 with recording of TOF)
    ATO_ratio - assist to turnover ratio
    FOpercent - faceoff percentage
    ShootingPct - goals scored out of total shots taken
    AdjShootingPct - goals scored out of shots on goal
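
    The derived columns in the yearly files can be recomputed from the per-game legend. The sketch below is one reading of those definitions: it assumes total shots taken means S + SOFF, returns ratios as fractions rather than percentages, and returns None when a denominator is zero. The function names are hypothetical.

```python
def safe_ratio(numerator, denominator):
    """Divide, returning None instead of failing on a zero denominator."""
    return numerator / denominator if denominator else None


def derived_stats(row):
    """Compute yearly-file style stats from one row of per-game columns.

    Assumption: "total shots taken" = shots on goal (S) + shots off goal (SOFF).
    """
    total_shots = row["S"] + row["SOFF"]
    return {
        "Points": row["G"] + row["A"],
        "ATO_ratio": safe_ratio(row["A"], row["T"]),
        "FOpercent": safe_ratio(row["FO_W"], row["FO"]),
        "ShootingPct": safe_ratio(row["G"], total_shots),
        "AdjShootingPct": safe_ratio(row["G"], row["S"]),
    }
```

    For a player with 2 goals, 3 assists, 1 turnover, 5 shots on goal, 3 shots off goal and 6 of 10 faceoffs won, this yields 5 points, an assist to turnover ratio of 3.0, a faceoff fraction of 0.6, a shooting fraction of 0.25 and an adjusted shooting fraction of 0.4.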
    

    Legend for NLLGoaliesGameStats.csv:

    Day - day of the week game was played
    Date - date of game
    Location - hosting team
    # - jersey number
    Name - player name
    Credit - denotes credited win, loss, or designated backup
    Team - player's team
    MIN - minutes in net
    SV Q1 - saves in quarter 1
    SV Q2 - saves in quarter 2
    SV Q3 - saves in quarter 3
    SV Q4 - saves in quarter 4
    SV OT - saves in overtime
    SV - total saves
    SOG - shots on goal seen
    GA - goals allowed
    

    Inspiration

    Lacrosse lags behind the big four sports in data driven insights. As the third largest indoor league, the National Lacrosse League could be next up in sports analytics breakthroughs.

  8. Esports Performance Rankings and Results

    • kaggle.com
    zip
    Updated Dec 12, 2022
    Cite
    The Devastator (2022). Esports Performance Rankings and Results [Dataset]. https://www.kaggle.com/datasets/thedevastator/unlocking-collegiate-esports-performance-with-bu
    Explore at:
    Available download formats: zip (110148 bytes)
    Dataset updated
    Dec 12, 2022
    Authors
    The Devastator
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Esports Performance Rankings and Results

    Performance Rankings and Results from Multiple Esports Platforms

    By [source]

    About this dataset

    This dataset provides a detailed look into the world of competitive video gaming in universities. It covers a wide range of topics, from performance rankings and results across multiple esports platforms to the individual team and university rankings within each tournament. With an incredible wealth of data, fans can discover statistics on their favorite teams or explore the challenges placed upon university gamers as they battle it out to be the best. Dive into the information provided and get an inside view into the world of collegiate esports tournaments as you assess all things from Match ID, Team 1, University affiliations, Points earned or lost in each match and special Seeds or UniSeeds for exceptional teams. Of course don't forget about exploring all the great Team Names along with their corresponding websites for further details on stats across tournaments!


    How to use the dataset

    Download files: First, make sure you have downloaded the CS_week1, CS_week2, CS_week3 and seeds datasets from Kaggle. You will also need to download the currentRankings file for each week of competition. Keep the originally assigned file names (e.g., CS_week1.csv) so that your analysis tools can read them properly.

    Understand the file structure: Once the files are collected, become familiar with what each one contains. The main data files are week1-3 and seedings. The weekly files list the teams matched against one another, with the university, the points from the match results, the team name and the website URL associated with each university entry. The seedings file ranks the university entries and includes team names, website URLs, etc. An additional file contains the currentRankings scores for each player or team for a given period of competition (e.g., the first week).

    Analyze the data: You can explore trends among universities or individual players, either for specific match performances or for overall standings across the weeks of competition. You can also build graphs from the data compiled in the BUECTracker dataset. For example, to compare two universities, say Harvard University and Cornell University, since the beginning of the event, extract the respective points and dates columns (found under the result tab), the regions (e.g., North America vs. Europe) and general stats such as maps played.

    Research Ideas

    • Analyze the performance of teams and identify areas for improvement for better performance in future competitions.
    • Assess which esports platforms are the most popular among gamers.
    • Gain a better understanding of player rankings across different regions, based on rankings system, to create targeted strategies that could boost individual players' scoring potential or team overall success in competitive gaming events

    Acknowledgements

    If you use this dataset in your research, please credit the original authors. Data Source

    License

    License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.

    Columns

    File: CS_week1.csv

    | Column name | Description                                   |
    |:------------|:----------------------------------------------|
    | Match ID    | Unique identifier for each match. (Integer)   |
    | Team 1      | Name of the first team in the match. (String) |
    | University  | University associated with the team. (String) |

    File: CS_week1_currentRankings.csv | Column name | Description | |:--------------|:-----------------------------------------------------------|...

  9. match data for soccer leagues

    • kaggle.com
    Updated Jul 18, 2025
    Cite
    Scottie Meadows (2025). match data for soccer leagues [Dataset]. https://www.kaggle.com/datasets/scottiemeadows/match-data-for-soccer-leagues
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Jul 18, 2025
    Dataset provided by
    Kaggle
    Authors
    Scottie Meadows
    Description

    This CSV file contains a comprehensive dataset of simulated soccer match statistics spanning 25 years (2000-2001 to 2024-2025) for major leagues including MLS, Premier League, La Liga, and Bundesliga. Each row represents a single match and includes details such as team names, scores, match results, yellow/red cards, ball possession, offsides, fouls, team formations, starting lineups, and betting odds from Bet365. The data also breaks down goals by type (penalty, freekick, corner) and half, and includes a "build-up speed" metric. The entire dataset is sorted chronologically by match date. This file is designed to support various analytical questions related to soccer performance, strategy, and betting trends over time.

  10. Tokyo Olympic Dataset

    • kaggle.com
    zip
    Updated Apr 4, 2025
    + more versions
    Cite
    Sahil Kadbhane (2025). Tokyo Olympic Dataset [Dataset]. https://www.kaggle.com/datasets/sahil1kadbhane1234/tokyo-olympic-dataset
    Explore at:
    Available download formats: zip (329690417 bytes)
    Dataset updated
    Apr 4, 2025
    Authors
    Sahil Kadbhane
    Area covered
    Tokyo
    Description


    Tokyo Olympics Dataset: A Comprehensive Analysis of the 2020 Summer Games

    📌 Subtitle: Explore the athletes, events, medal winners, and historical trends from the Tokyo 2020 Summer Olympics.

    📝 Dataset Overview

    The Tokyo Olympics 2020 dataset provides a detailed breakdown of the athletes, events, and medals awarded during the Summer Games. This dataset serves as an essential resource for data analysis, visualization, and machine learning applications related to sports analytics.

    📂 File Descriptions

    1. athletes.csv

    This file contains detailed information about all participating athletes, including demographics and country representation.

    • Columns:

      • Athlete_ID – Unique identifier for each athlete
      • Name – Full name of the athlete
      • Gender – Male (M) / Female (F)
      • Age – Age of the athlete during the event
      • Country – Country the athlete represents
      • Sport – The sport in which the athlete competed
      • Event – Specific event the athlete participated in
    • Insights:

      • Analyze the gender distribution across various sports.
      • Identify the youngest and oldest participants in the Olympics.

    2. medals.csv

    This file lists all medal winners, including details on the type of medal awarded and the event in which it was won.

    • Columns:

      • Athlete_ID – Unique athlete reference
      • Name – Name of the medal-winning athlete
      • Country – Country represented
      • Sport – Sport category
      • Event – Specific event won
      • Medal – Type of medal won (Gold, Silver, Bronze)
    • Insights:

      • Track the top-performing countries based on total medals won.
      • Identify athletes who won multiple medals across different events.

    3. events.csv

    A dataset containing all sporting events held during the Tokyo 2020 Olympics.

    • Columns:

      • Event_ID – Unique event identifier
      • Sport – Name of the sport
      • Event – Name of the event
      • Venue – Location where the event took place
      • Date – Date of the event
    • Insights:

      • Understand the distribution of events across different venues.
      • Identify the busiest days in the Olympic schedule.

    4. results.csv

    This file records the performance outcomes of athletes in various events.

    • Columns:

      • Athlete_ID – Unique reference for the athlete
      • Event_ID – Reference to the event in which they participated
      • Position – Final ranking or placement in the event
      • Time/Score – Performance metric (e.g., time, points, or score)
    • Insights:

      • Analyze event results to find record-breaking performances.
      • Compare performances of athletes from different nations.

    5. countries.csv

    A reference file that provides details on each participating country.

    • Columns:

      • Country_Code – Standard Olympic country abbreviation
      • Country_Name – Full name of the country
      • Continent – Continent to which the country belongs
    • Insights:

      • Explore regional performance trends by grouping countries by continent.
      • Compare medal counts across different continents.
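
    The continent-level comparison suggested above amounts to a lookup from medals.csv into countries.csv. The sketch below assumes that the Country value in medals.csv matches Country_Name in countries.csv, which the description does not confirm, and it counts one medal per row, so team events would count once per listed athlete. The function name is hypothetical.

```python
from collections import Counter


def medals_by_continent(medal_rows, country_rows):
    """Count medal rows per continent.

    medal_rows / country_rows: lists of dicts, e.g. from csv.DictReader.
    Assumption: medals "Country" equals countries "Country_Name".
    Countries missing from the reference file are counted as "Unknown".
    """
    continent_of = {c["Country_Name"]: c["Continent"] for c in country_rows}
    counts = Counter()
    for medal in medal_rows:
        counts[continent_of.get(medal["Country"], "Unknown")] += 1
    return counts
```

    Keeping an explicit "Unknown" bucket makes mismatched country spellings visible instead of silently dropping those medals.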

    📊 Potential Use Cases
    🔹 Sports Analytics – Identify patterns in athlete performance and event results
    🔹 Machine Learning – Predict medal winners based on past data
    🔹 Data Visualization – Create dashboards showing country-wise and event-wise medal counts
    🔹 Time Series Analysis – Analyze trends across multiple Olympic events

    This dataset is a valuable resource for data enthusiasts, sports analysts, and researchers aiming to uncover insights into the Tokyo 2020 Olympics. 🚀


  11. mega_mb0000data

    • kaggle.com
    zip
    Updated Jun 1, 2023
    Cite
    monogurui_ii (2023). mega_mb0000data [Dataset]. https://www.kaggle.com/datasets/rickmcintire/mega-mb0000data
    Explore at:
    Available download formats: zip (31948 bytes)
    Dataset updated
    Jun 1, 2023
    Authors
    monogurui_ii
    Description

    game simulator (basketball): NBA Finals Teams 1980-2022

    • The aim of this project is to generate simulations of basketball games between NBA finals teams from 1980 to the present for the purpose of modeling predicted outcomes from a player efficiency metric (the "r metric").

    • A champion will be determined for the simulated season using a quadruple-elimination format, with teams eliminated from contention upon recording 4 losses until only one team remains.

    • Playoff box score statistics for players (on a per 100 possessions basis) were gathered from https://www.basketball-reference.com/.

    • The players' stats were filtered and transformed to reflect a focus on box score stats measuring playing efficiency, as opposed to measures of volume. For example, Real Shooting Percentage (True Shooting Percentage adjusted for volume, based on points generated above average) was incorporated into the metric as opposed to Points Per Game; Assist to Turnover Ratio was incorporated as opposed to Assists Per Game. The complete list of stats used for the r metric is as follows:

      • Real Shooting Percentage
      • Offensive Rebounds
      • Assist to Turnover Ratio
      • Steals
      • Blocks
      • Personal Fouls
    • The r metric efficiency rating was derived from performing a regression on the overall team stats for a selection of teams for NBA seasons from 1980 to the present against their Point Differential and then applying the resulting predicted values to individual players.

    • An R function was created to generate simulated game outcomes from a Kaggle notebook. The output is produced as a ggplot visualizing the r metric (in pink) against the traditional box score stats (coded by team in blue/red), together with a csv file containing the box score. The notebook is scheduled to run daily, randomly selecting two teams to play against one another and generating an outcome based on the player stats and metric for each team, with an element of random variation.
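The elimination format described above can be sketched in a few lines. This is only an illustration in Python, not the author's R function: the team names, the ratings, and the Gaussian noise model are all hypothetical stand-ins for the r metric and the "element of random variation".

```python
import random


def simulate_season(ratings, max_losses=4, noise_sd=5.0, seed=None):
    """Play random pairings until only one team has fewer than max_losses losses."""
    rng = random.Random(seed)
    losses = {team: 0 for team in ratings}
    alive = list(ratings)
    while len(alive) > 1:
        # Randomly select two remaining teams to play one another.
        team_a, team_b = rng.sample(alive, 2)
        # Each score is the team's rating plus random variation
        # (hypothetical noise model, assumed for this sketch).
        score_a = ratings[team_a] + rng.gauss(0, noise_sd)
        score_b = ratings[team_b] + rng.gauss(0, noise_sd)
        loser = team_b if score_a >= score_b else team_a
        losses[loser] += 1
        # A team is eliminated upon recording its fourth loss.
        if losses[loser] >= max_losses:
            alive.remove(loser)
    return alive[0], losses


# Hypothetical ratings, not taken from the dataset.
ratings = {"Team A": 3.0, "Team B": 1.5, "Team C": 0.0, "Team D": -2.0}
champion, losses = simulate_season(ratings, seed=0)
print(champion, losses)
```

Every eliminated team ends with exactly four losses, and the champion is the one team left with fewer.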

  12. EgyptianLeague

    • kaggle.com
    Updated Sep 29, 2024
    Cite
    Mahmoud Elshabrawy (2024). EgyptianLeague [Dataset]. https://www.kaggle.com/datasets/mahmoudelshabrawy/egyptianleague
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Sep 29, 2024
    Dataset provided by
    Kaggle
    Authors
    Mahmoud Elshabrawy
    License

    https://cdla.io/sharing-1-0/

    Description

    Egyptian Premier League Match Data (2010-2024)

    This dataset contains detailed information about matches played in the Egyptian Premier League from 2010 to 2024. The dataset includes match statistics, team performance, referee decisions, and the outcome of each match.

    Features Overview:

    1. ID: Unique identifier for each match.
    2. Season: The season in which the match took place.
    3. Fixture: Details about the specific fixture in the league.
    4. MatchDay: The match day number within the season.
    5. Date: The date on which the match was played.
    6. Time: The time of the match.
    7. Home Team: The team playing at home.
    8. Away Team: The visiting team.
    9. Referee: The referee officiating the match.
    10. Yellow Home: Number of yellow cards issued to the home team.
    11. Yellow Away: Number of yellow cards issued to the away team.
    12. 2nd Yellow Home: Number of second yellow cards (leading to a red card) for the home team.
    13. 2nd Yellow Away: Number of second yellow cards for the away team.
    14. Red Home: Number of red cards issued to the home team.
    15. Red Away: Number of red cards issued to the away team.
    16. Half Time Result: The score at halftime.
    17. Full Time Result: The final score at the end of the match.
    18. Home Goals: Goals scored by the home team.
    19. Away Goals: Goals scored by the away team.
    20. Winner: Indicates the winner of the match (Home, Away, or Draw).
    21. Label: Various performance labels or categorization criteria.
    22. Count: Frequency or count associated with certain labels.
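Since the Winner feature should follow from Home Goals and Away Goals, a quick consistency check is a natural first step before modeling. The sketch below assumes the CSV headers match the feature names listed above exactly and that Winner holds the literal strings "Home", "Away", or "Draw"; both are assumptions, and the sample rows are made-up placeholders rather than real matches.

```python
def expected_winner(home_goals, away_goals):
    """Derive the match outcome from the two goal counts."""
    if home_goals > away_goals:
        return "Home"
    if away_goals > home_goals:
        return "Away"
    return "Draw"


def find_mismatches(rows):
    """Return the IDs of rows whose Winner disagrees with the goal counts."""
    return [
        row["ID"]
        for row in rows
        if expected_winner(int(row["Home Goals"]), int(row["Away Goals"])) != row["Winner"]
    ]


# Placeholder rows for illustration only (not taken from the dataset).
rows = [
    {"ID": "1", "Home Goals": "2", "Away Goals": "1", "Winner": "Home"},
    {"ID": "2", "Home Goals": "0", "Away Goals": "0", "Winner": "Draw"},
    {"ID": "3", "Home Goals": "1", "Away Goals": "3", "Winner": "Home"},  # deliberately inconsistent
]
mismatches = find_mismatches(rows)
print(mismatches)
```

With the real file, the same function could be fed rows from `csv.DictReader`, provided the header names are as assumed.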

    Potential Use Cases:

    • Match Analysis: Track performance trends for different teams, referees, and players over multiple seasons.
    • Predictive Modeling: Create machine learning models to predict match outcomes based on past performance.
    • Referee Performance: Analyze the impact of referees on match outcomes and team discipline.
    • Team Strategy Insights: Examine the correlation between yellow/red cards and match results.
    • Time Series Analysis: Perform time-based analysis of matches and outcomes across different seasons.

    This dataset is ideal for soccer analysts, sports statisticians, and machine learning enthusiasts who are interested in exploring match data from the Egyptian Premier League.

  13. ODI Cricket Data

    • kaggle.com
    zip
    Updated Feb 23, 2025
    Cite
    willian oliveira (2025). ODI Cricket Data [Dataset]. https://www.kaggle.com/datasets/willianoliveiragibin/odi-cricket-data
    Explore at:
    Available download formats: zip (56818 bytes)
    Dataset updated
    Feb 23, 2025
    Authors
    willian oliveira
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This graph was created in R: [images omitted: foto3.png, gif2.gif, gif1.gif]

    This dataset provides comprehensive information on more than 2400 One Day International (ODI) cricket matches obtained from Cricsheet. It includes detailed batting and bowling statistics, match summaries, and individual player performances. Matches involving Afghanistan's men's team and those played in the Afghanistan Premier League are excluded because of Cricsheet's data policy. The dataset is a resource for sports analytics, machine learning, and cricket strategy modeling: users can analyze player consistency, evaluate team performance, predict fantasy cricket outcomes, and assess match results.

    The dataset is divided into several files:

    • batter_player_stats.csv: batting data such as total runs, strike rate, matches played, and player of the match awards.
    • bowler_player_stats.csv: bowling data including total wickets, economy rate, overs bowled, and matches played as a bowler.
    • detailed_player_data.csv: per-match player performance data such as runs scored, balls faced, wickets taken, catches, and fantasy points.
    • match_summary.csv: match-level information such as toss results, match outcomes (by runs or wickets), player of the match, and venue details.

    Potential use cases include:

    • Player performance analysis: identify the most consistent batters and bowlers across various seasons.
    • Match outcome prediction: develop models that leverage historical performance data.
    • Fantasy cricket strategy optimization: select teams based on previous player performance.
    • Cricket analytics and visualization: explore trends in runs, wickets, and match-winning performances.

  14. 2025 - 2026 NBA Fantasy Projections

    • kaggle.com
    zip
    Updated Oct 20, 2025
    Cite
    Nishaan Amin (2025). 2025 - 2026 NBA Fantasy Projections [Dataset]. https://www.kaggle.com/datasets/nishaanamin/2025-2026-nba-fantasy-projections
    Explore at:
    Available download formats: zip (152070 bytes)
    Dataset updated
    Oct 20, 2025
    Authors
    Nishaan Amin
    Description

    This dataset has projections for season-long NBA fantasy for both points and category leagues. Category league projections are based on the default 9 categories (points, assists, rebounds, 3 pointers made, field goal %, free throw %, steals, blocks, and turnovers). Points league projections are based on the default scoring systems for Yahoo, ESPN, Fantrax, and Sleeper leagues.

  15. Cricket data

    • kaggle.com
    zip
    Updated Jan 20, 2020
    Cite
    mahendran narayanan (2020). Cricket data [Dataset]. https://www.kaggle.com/datasets/mahendran1/icc-cricket
    Explore at:
    Available download formats: zip (383854 bytes)
    Dataset updated
    Jan 20, 2020
    Authors
    mahendran narayanan
    Description

    Context

    Any aspiring data scientist looks at everything through the lens of data, even when relaxing with friends, watching live cricket, and cheering for a favorite team.

    Content

    It includes ODI, Test, and T20 statistics for all players in all three categories (batting, bowling, and fielding).

    Acknowledgements

    We wouldn't be here without the help of cricket. Thank you to all the great cricketers for their wonderful contributions.

  16. ODI World Cup 2023 Complete Dataset

    • kaggle.com
    zip
    Updated Dec 12, 2023
    Cite
    Engg Bilal Ali Khan (2023). ODI World Cup 2023 Complete Dataset [Dataset]. https://www.kaggle.com/datasets/enggbilalalikhan/odi-world-cup-2023-complete-dataset
    Explore at:
    Available download formats: zip (85414 bytes)
    Dataset updated
    Dec 12, 2023
    Authors
    Engg Bilal Ali Khan
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Comprehensive dataset containing detailed information on batting and bowling performances, as well as the schedule and results of matches from the ICC Cricket World Cup 2023. The dataset covers player statistics, match details, and more, providing a rich resource for cricket enthusiasts, analysts, and data scientists interested in exploring the dynamics of the tournament.

    Content

    • batting_summary.csv: Player-wise batting statistics.
    • bowling_summary.csv: Player-wise bowling statistics.
    • matches_schedule_results.csv: Schedule and results of World Cup 2023 matches.

  17. Data from: Pro Kabaddi League Dataset

    • kaggle.com
    zip
    Updated Dec 11, 2023
    Cite
    Sujay Kapadnis (2023). Pro Kabaddi League Dataset [Dataset]. https://www.kaggle.com/datasets/sujaykapadnis/pro-kabaddi-league-dataset
    Explore at:
    Available download formats: zip (5823861 bytes)
    Dataset updated
    Dec 11, 2023
    Authors
    Sujay Kapadnis
    Description

    Data Set

    DS_match.csv contains the match details, with the following fields:

    • match_id: Unique ID for each match
    • match_number: Same as the above, in text format
    • date: Date of the match
    • start_time: Match start time on the day
    • result: Text field explaining the result
    • player_id_of_the_match: Player of the match ID (can be looked up in DS_players using the combination of Match ID and Player ID)
    • player_name_of_the_match: Player of the match
    • series_id: Season identifier
    • series_name: Season name
    • status: Status of the match
    • toss_winner: Toss winner team ID
    • toss_selection: Toss selection
    • venue_id: Location
    • venue_name: Location name
    • home_team_id: Team ID
    • home_team_name: description.

  18. English Premier League EPL xG Results (2023-24)

    • kaggle.com
    zip
    Updated Jul 22, 2024
    Cite
    orkunaktas4 (2024). English Premier League EPL xG Results (2023-24) [Dataset]. https://www.kaggle.com/datasets/orkunaktas/english-premier-league-epl-results-2023-24
    Explore at:
    Available download formats: zip (24100 bytes)
    Dataset updated
    Jul 22, 2024
    Authors
    orkunaktas4
    Description

    Context

    This dataset contains the results and xG values of matches played in the English Premier League in 2023-24.

    Variables

    • Day: Match day
    • Date: Match date
    • Time: Match Time
    • Home: Home Team
    • xG: Home Team expected goal value
    • Score: Match result
    • xG: Away Team expected goal value
    • Away: Away Team
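The variable list above names two columns "xG", one for each side. If the file's header row really repeats that name (an assumption based only on this list), keyed access by column name becomes ambiguous. A minimal sketch of renaming repeated headers by position follows; the function name and the `_home`/`_away` suffixes are hypothetical choices made for this example.

```python
from collections import Counter


def disambiguate(headers, labels=("home", "away")):
    """Append a suffix to repeated header names, in order of appearance."""
    totals = Counter(headers)
    seen = Counter()
    result = []
    for name in headers:
        if totals[name] > 1:
            index = seen[name]
            # Use the supplied labels first, then fall back to a number.
            suffix = labels[index] if index < len(labels) else str(index)
            result.append(f"{name}_{suffix}")
            seen[name] += 1
        else:
            result.append(name)
    return result


# Header order as given in the variable list above.
headers = ["Day", "Date", "Time", "Home", "xG", "Score", "xG", "Away"]
unique = disambiguate(headers)
print(unique)
```

The first "xG" follows the Home column and the second precedes Away, which is why the positional labels are ordered home, then away.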

Cite
VT (2021). PMData [Dataset]. https://www.kaggle.com/datasets/vlbthambawita/pmdata-a-sports-logging-dataset/discussion

PMData

A Sports Logging Dataset

Explore at:
Available download formats: zip (1401630710 bytes)
Dataset updated
Apr 19, 2021
Authors
VT
License

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Description

Paper: https://dl.acm.org/doi/10.1145/3339825.3394926

In this dataset, we present PMData, which aims to combine traditional lifelogging with sports activity logging. Such a dataset enables the development of several interesting analysis applications, e.g., where additional sports data can be used to predict and analyze everyday developments like a person's weight and sleep patterns, and where traditional lifelog data can be used in a sports context to predict an athlete's performance. For the data collection, we used the Fitbit Versa 2 smartwatch wristband, the PMSys sports logging app, and Google Forms; PMData contains logging data for 5 months from 16 persons. Our initial experiments show that such analyses are possible, but there is still considerable room for improvement.

Dataset details

The structure of the main folder:


  • [Main folder]
    • p01
    • p02
    • ...
    • p16
    • participant-overview.xlsx

The structure of each sub folder (pXX):

  • pXX [folder]: is a folder containing data of participant XX (notation XX represents the identifier of the participant).

    • fitbit [folder]

      • calories.json: shows how many calories the person has burned in the last minute.

      • distance.json: gives the distance moved per minute. Distance seems to be in centimeters.

      • exercise.json: describes each activity in more detail. It contains the date with start and stop time, time in different activity levels, type of activity and various performance metrics depending a bit on type of exercise, e.g., for running, it contains distance, time, steps, calories, speed and pace.

      • heart_rate.json: shows the number of heart beats per minute (bpm) at a given time.

      • lightly_active_minutes.json: sums up the number of lightly active minutes per day.

      • moderately_active_minutes.json: sums up the number of moderately active minutes per day.

      • resting_heart_rate.json: gives the resting heart rate per day.

      • sedentary_minutes.json: sums up the number of sedentary minutes per day.

      • sleep_score.csv: helps understand the sleep each night so you can see trends in the sleep patterns. It contains an overall 0-100 score made up from composition, revitalization and duration scores, the number of deep sleep minutes, the resting heart rate and a restlessness score.

      • sleep.json: is a per-sleep breakdown into periods of light, deep, and REM sleep and time awake.

      • steps.json: displays the number of steps per minute.

      • time_in_heart_rate_zones.json: gives the number of minutes in different heart rate zones. Using the common formula of 220 minus your age, Fitbit calculates your maximum heart rate and then creates three target heart rate zones based on that number: fat burn (50 to 69 percent of your max heart rate), cardio (70 to 84 percent of your max heart rate), and peak (85 to 100 percent of your max heart rate).

      • very_active_minutes.json: sums up the number of very active minutes per day.

    • googledocs [folder]

      • reporting.csv: contains one line per report, including the date reported for, a timestamp of the report submission time, the meals eaten (breakfast, lunch, dinner and evening meal), the participant's weight that day, the number of glasses drunk, and whether the participant has consumed alcohol.
    • pmsys [folder]

      • injury.csv: shows injuries with a time and date and corresponding injury locations and a minor and major severity.

      • srpe.csv: contains a training session’s end-time, type of activity, the perceived exertion (RPE), and the duration in minutes. This is, for example, used to calculate the session’s training load, or sRPE (RPE × duration).

      • wellness.csv: includes parameters like time and date, fatigue, mood, readiness, sleep duration (number of hours), sleep quality, soreness (and soreness area), and stress. Fatigue, sleep quality, soreness, stress, and mood all have a 1-5 scale. The score 3 is normal, 1-2 are scores below normal, and 4-5 are scores above normal. Sleep length is just a measure of how long the sleep was in hours, and readiness (scale 0-10) is an overall subjective measure of how ready you are to exercise, i.e., 0 means not ready at all and 10 indicates that you cannot feel any better and are ready for anything!

      • food-images.zip: Participants 1, 3 and 5 have taken pictures of everything they have eaten except water during 2 months (February and March). There are food images included in this .zip file, and information about day and time is given in the...
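Two of the derived quantities described in the file list above, the heart rate zones (220 minus age, then percentage bands) and the session training load (RPE × duration), can be sketched as follows. The function names are hypothetical, and the handling of values that fall between the stated integer bands (for example 69.5 percent) or above 100 percent is an assumption of this sketch rather than documented Fitbit behavior.

```python
def max_heart_rate(age):
    """Common estimate of maximum heart rate: 220 minus age."""
    return 220 - age


def heart_rate_zone(bpm, age):
    """Classify a heart rate into the three zones described in the text."""
    percent = 100 * bpm / max_heart_rate(age)
    # Thresholds follow the stated bands; anything at or above 85 percent
    # (including values over 100 percent) is treated as peak here.
    if percent >= 85:
        return "peak"
    if percent >= 70:
        return "cardio"
    if percent >= 50:
        return "fat burn"
    return "below zones"


def session_load(rpe, duration_minutes):
    """Session training load (sRPE) = RPE x duration in minutes."""
    return rpe * duration_minutes


# Illustrative values only, not taken from the dataset.
print(heart_rate_zone(140, age=30))  # 140 / 190 is about 73.7 percent
print(session_load(7, 60))
```

For a 30-year-old, the estimated maximum is 190 bpm, so 140 bpm falls in the cardio band; an RPE of 7 over a 60-minute session gives a load of 420.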
