20 datasets found
  1. Dataset Player FIFA Football 2025

    • kaggle.com
    Updated Jul 12, 2025
    Cite
    Danish Baariq (2025). Dataset Player FIFA Football 2025 [Dataset]. http://doi.org/10.34740/kaggle/dsv/12446270
    Explore at:
    Croissant: a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Jul 12, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Danish Baariq
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    This dataset contains detailed attributes of professional football players, curated from a FIFA-based database. It includes a wide range of physical, technical, and performance-related statistics for each player.

    The dataset provides valuable insights for sports analysts, data scientists, machine learning practitioners, and football enthusiasts who are interested in exploring player performance, scouting analysis, player valuation modeling, and team formation strategy.

    📩 Dataset Features

    Each record in the dataset represents an individual football player and includes the following types of data:

    • Personal Information:

      • full_name, birth_date, age, height_cm, weight_kgs, nationality
    • Player Role & Identity:

      • positions, preferred_foot, national_team, national_team_position, national_rating, national_jersey_number
    • Overall & Potential Ratings:

      • overall_rating, potential, international_reputation(1-5)
    • Technical Attributes:

      • Ball control: ball_control, dribbling, crossing, curve
      • Shooting: finishing, volleys, shot_power, long_shots, penalties
      • Passing: short_passing, long_passing, vision
      • Defending: marking, standing_tackle, sliding_tackle, interceptions
    • Physical & Mental Attributes:

      • acceleration, sprint_speed, agility, strength, stamina, jumping, balance, aggression, composure, reactions
    • Special Skills:

      • skill_moves(1-5), weak_foot(1-5), freekick_accuracy, heading_accuracy
    • Economic Value:

      • value_euro, wage_euro, release_clause_euro

    🎯 Potential Use Cases

    • Player performance comparison
    • Player valuation prediction
    • Machine learning models for scouting
    • Visual dashboards for team analytics
    • Career trajectory analysis (based on age and potential)
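
    For the career-trajectory use case, the following is a minimal sketch. It assumes the data is available as a CSV with the full_name, age, overall_rating, and potential columns listed above. The sample rows and the growth_headroom helper are hypothetical and are not part of the dataset.

```python
import csv
import io

# Hypothetical sample rows; only the column names come from the feature list.
SAMPLE_CSV = """full_name,age,overall_rating,potential
Player A,19,68,84
Player B,31,82,82
Player C,23,75,80
"""


def growth_headroom(rows):
    """Return (full_name, age, potential - overall_rating), largest headroom first."""
    ranked = []
    for row in rows:
        headroom = int(row["potential"]) - int(row["overall_rating"])
        ranked.append((row["full_name"], int(row["age"]), headroom))
    return sorted(ranked, key=lambda item: item[2], reverse=True)


players = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
ranking = growth_headroom(players)
for name, age, headroom in ranking:
    print(f"{name} (age {age}): headroom {headroom}")
```

    Sorting by headroom puts young players with a large gap between potential and overall_rating first, which is one simple way to shortlist development prospects.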
  2. Football Players Data

    • kaggle.com
    zip
    Updated Nov 13, 2023
    Cite
    Masood Ahmed (2023). Football Players Data [Dataset]. https://www.kaggle.com/datasets/maso0dahmed/football-players-data/suggestions
    Explore at:
    Available download formats: zip (1298758 bytes)
    Dataset updated
    Nov 13, 2023
    Authors
    Masood Ahmed
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    This comprehensive dataset offers detailed information on approximately 17,000 FIFA football players, meticulously scraped from SoFIFA.com.

    It encompasses a wide array of player-specific data points, including but not limited to player names, nationalities, clubs, player ratings, potential, positions, ages, and various skill attributes. This dataset is ideal for football enthusiasts, data analysts, and researchers seeking to conduct in-depth analysis, statistical studies, or machine learning projects related to football players' performance, characteristics, and career progressions.

    Features:

    • name: Name of the player.
    • full_name: Full name of the player.
    • birth_date: Date of birth of the player.
    • age: Age of the player.
    • height_cm: Player's height in centimeters.
    • weight_kgs: Player's weight in kilograms.
    • positions: Positions the player can play.
    • nationality: Player's nationality.
    • overall_rating: Overall rating of the player in FIFA.
    • potential: Potential rating of the player in FIFA.
    • value_euro: Market value of the player in euros.
    • wage_euro: Weekly wage of the player in euros.
    • preferred_foot: Player's preferred foot.
    • international_reputation(1-5): International reputation rating from 1 to 5.
    • weak_foot(1-5): Rating of the player's weaker foot from 1 to 5.
    • skill_moves(1-5): Skill moves rating from 1 to 5.
    • body_type: Player's body type.
    • release_clause_euro: Release clause of the player in euros.
    • national_team: National team of the player.
    • national_rating: Rating in the national team.
    • national_team_position: Position in the national team.
    • national_jersey_number: Jersey number in the national team.
    • crossing: Rating for crossing ability.
    • finishing: Rating for finishing ability.
    • heading_accuracy: Rating for heading accuracy.
    • short_passing: Rating for short passing ability.
    • volleys: Rating for volleys.
    • dribbling: Rating for dribbling.
    • curve: Rating for curve shots.
    • freekick_accuracy: Rating for free kick accuracy.
    • long_passing: Rating for long passing.
    • ball_control: Rating for ball control.
    • acceleration: Rating for acceleration.
    • sprint_speed: Rating for sprint speed.
    • agility: Rating for agility.
    • reactions: Rating for reactions.
    • balance: Rating for balance.
    • shot_power: Rating for shot power.
    • jumping: Rating for jumping.
    • stamina: Rating for stamina.
    • strength: Rating for strength.
    • long_shots: Rating for long shots.
    • aggression: Rating for aggression.
    • interceptions: Rating for interceptions.
    • positioning: Rating for positioning.
    • vision: Rating for vision.
    • penalties: Rating for penalties.
    • composure: Rating for composure.
    • marking: Rating for marking.
    • standing_tackle: Rating for standing tackle.
    • sliding_tackle: Rating for sliding tackle.

    Use Case:

    This dataset is ideal for data analysis, predictive modeling, and machine learning projects. It can be used for:

    • Player performance analysis and comparison.
    • Market value assessment and wage prediction.
    • Team composition and strategy planning.
    • Machine learning models to predict future player potential and career trajectories.

    Note:

    Please ensure to adhere to the terms of service of SoFIFA.com and relevant data protection laws when using this dataset. The dataset is intended for educational and research purposes only and should not be used for commercial gains without proper authorization.

  3. Ultimate 25k+ Matches Football Database -European

    • kaggle.com
    zip
    Updated Dec 23, 2016
    Cite
    Prajit Datta (2016). Ultimate 25k+ Matches Football Database -European [Dataset]. https://www.kaggle.com/prajitdatta/ultimate-25k-matches-football-database-european
    Explore at:
    Available download formats: zip (34297253 bytes)
    Dataset updated
    Dec 23, 2016
    Authors
    Prajit Datta
    License

    http://opendatacommons.org/licenses/dbcl/1.0/

    Description

    The ultimate Soccer database for data analysis and machine learning

    What you get:

    • +25,000 matches
    • +10,000 players
    • 11 European Countries with their lead championship
    • Seasons 2008 to 2016
    ‱ Players and Teams' attributes* sourced from EA Sports' FIFA video game series, including the weekly updates
    ‱ Team line up with squad formation (X, Y coordinates)
    • Betting odds from up to 10 providers
    • Detailed match events (goal types, possession, corner, cross, fouls, cards etc...) for +10,000 matches

    *16th Oct 2016: New table containing teams' attributes from FIFA!

    Original Data Source:

    You can easily find data about soccer matches but they are usually scattered across different websites. A thorough data collection and processing has been done to make your life easier. I must insist that you do not make any commercial use of the data. The data was sourced from:

    When you have a look at the database, you will notice foreign keys for players and matches are the same as the original data sources. I have called those foreign keys "api_id".

    Improving the dataset:

    You will notice that some players are missing from the lineup (NULL values). This is because I have not been able to source their attributes from FIFA. This will be fixed over time as the crawling algorithm is being improved. The dataset will also be expanded to include international games, national cups, Champions League and Europa League. Please ask me if you're after a specific tournament.

    Please get in touch with Hugo Mathien if you want to help improve this dataset.

    CLICK HERE TO ACCESS THE PROJECT GITHUB

    Important note for people interested in using the crawlers: since I first wrote the crawling scripts (in Python), it appears sofifa.com has changed its design, and with it come new requirements for the scripts. The existing script to crawl players ('Player Spider') will not work until I've updated it.

    Exploring the data:

    Now that's the fun part, there is a lot you can do with this dataset. I will be adding visuals and insights to this overview page but please have a look at the kernels and give it a try yourself ! Here are some ideas for you:

    The Holy Grail...

    ... is obviously to predict the outcome of the game. The bookies use 3 classes (Home Win, Draw, Away Win). They get it right about 53% of the time. This is also what I've achieved so far using my own SVM. Though it may sound high for such a random sport, keep in mind that the home team wins about 46% of the time. So the base case (constantly predicting Home Win) is already right 46% of the time.
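
    The base case described here is a majority-class baseline. A minimal sketch of how it could be measured follows. The outcome labels "H", "D", "A" and the toy sample are assumptions made for illustration, built to mirror the roughly 46% home-win share quoted above; they are not taken from the database.

```python
from collections import Counter


def majority_baseline_accuracy(outcomes):
    """Return the most frequent class and the accuracy of always predicting it."""
    counts = Counter(outcomes)
    label, hits = counts.most_common(1)[0]
    return label, hits / len(outcomes)


# Hypothetical toy sample: 46 home wins, 25 draws, 29 away wins.
outcomes = ["H"] * 46 + ["D"] * 25 + ["A"] * 29
label, accuracy = majority_baseline_accuracy(outcomes)
print(label, round(accuracy, 2))
```

    Any classifier is worth keeping only if it beats this number on held-out matches.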

    Probabilities vs Odds

    When running a multi-class classifier like SVM you could also output a probability estimate and compare it to the betting odds. Have a look at your variance vs odds and see for what games you had very different predictions.
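
    One way to make that comparison concrete is to convert the odds into implied probabilities first. The sketch below assumes decimal (European) odds for home, draw, and away, which the description does not confirm. The numbers and both helper functions are hypothetical.

```python
def implied_probabilities(decimal_odds):
    """Convert decimal odds for (home, draw, away) into probabilities summing to 1."""
    raw = [1.0 / odd for odd in decimal_odds]
    total = sum(raw)  # exceeds 1 by the bookmaker margin
    return [p / total for p in raw]


def largest_disagreement(model_probs, decimal_odds):
    """Return (class index, model minus bookmaker probability) for the widest gap."""
    book = implied_probabilities(decimal_odds)
    gaps = [m - b for m, b in zip(model_probs, book)]
    idx = max(range(len(gaps)), key=lambda i: abs(gaps[i]))
    return idx, gaps[idx]


odds = [2.10, 3.40, 3.60]    # hypothetical home/draw/away decimal odds
model = [0.55, 0.25, 0.20]   # hypothetical classifier probabilities
book = implied_probabilities(odds)
idx, gap = largest_disagreement(model, odds)
print(idx, round(gap, 3))
```

    Dividing by the total removes the bookmaker margin, so the two probability vectors are on the same scale before they are compared.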

    Explore and visualize features

    With access to players' and teams' attributes, team formations and in-game events, you should be able to produce some interesting insights into The Beautiful Game. Who knows, Guardiola himself may hire one of you some day!

    Database released under Open Database License, individual papers copyright their original authors

  4. Football: Match Statistics and More! âšœđŸ”„

    • kaggle.com
    zip
    Updated Dec 17, 2024
    Cite
    Tony Gordon Jr. (2024). Football: Match Statistics and More! âšœđŸ”„ [Dataset]. https://www.kaggle.com/datasets/tonygordonjr/football-match-statistics-and-more
    Explore at:
    Available download formats: zip (114937852 bytes)
    Dataset updated
    Dec 17, 2024
    Authors
    Tony Gordon Jr.
    Description

    Have you ever found yourself with a football dataset that almost had it all, but left you short of happiness? Time after time, promising datasets failed to deliver the statistics that truly matter: match events, player performances, team results, and season standings.

    That time is over!

    This in-depth football dataset, curated straight from a RapidAPI endpoint, brings you the data points we've all been waiting for. From fixtures and injuries to goals, assists, and tactical breakdowns, this dataset unlocks the full picture of the beautiful game.

    What You Get 🏆

    ‱ Fixture Stats & Events: Goals, assists, fouls, and match-defining moments across leagues up to 2024.
    ‱ Player Performances: From tackles to dribbles, passes, and shots, every stat that makes a difference.
    ‱ Season Stats & League Standings: Discover how teams dominate, stumble, or rise to glory each season.
    ‱ Team Insights: Analyze home/away performance, goal-scoring patterns, and defensive strengths.
    ‱ Match Highlights: Real-time events like own goals, red cards, and critical substitutions.
    ‱ Injuries & Suspensions: Missing players and their impact on team dynamics.
    ‱ Iconic Stadiums: Explore venues, capacities, and surfaces that set the stage for football's greatest moments.

    Why It's Exciting 🌟

    This isn't just another football dataset; it's the ultimate resource for fans, analysts, and strategists who want to dig deeper. Whether you're predicting outcomes, analyzing player form, or crafting the next big football insights project, you now have all the tools you need.

    Get ready to unlock stories, trends, and insights like never before, because this time, the stats you actually care about are all here. Let's kick it off! ⚜✹

    In terms of fixture stats for players, the endpoint provides data from 2015 up through the 2024 season and I plan to make one more update at the end of all league/cup seasons in June of 2025.

    Disclaimer: This dataset is intended for non-commercial, academic purposes and does not infringe upon any intellectual property rights of the original data providers, including RapidAPI or associated sources. For full details, please refer to the respective terms of use provided by the data sources.

    If you have questions about the data or simply want to connect, reach out on LinkedIn. If you plan on using this data for any type of analysis, please share that with me!

    PS: I am a Ronaldo fan... Suiiiii !!!

    Leagues/Cups in datasets:

    ‱ La Liga
    ‱ Ligue 1
    ‱ Serie A
    ‱ World Cup
    ‱ Bundesliga
    ‱ NWSL Women
    ‱ Pro League
    ‱ Championship League
    ‱ Copa America
    ‱ Premier League
    ‱ CONCACAF Gold Cup
    ‱ Euro Championship
    ‱ UEFA Europa League
    ‱ MLS
    ‱ Africa Cup Of Nations
    ‱ CONCACAF Champions League

    Other Datasets:

    ‱ Spotify
    ‱ Zillow

  5. Player Injuries and Team Performance Dataset

    • kaggle.com
    zip
    Updated Dec 23, 2024
    + more versions
    Cite
    Amrit Biswas (2024). Player Injuries and Team Performance Dataset [Dataset]. https://www.kaggle.com/datasets/amritbiswas007/player-injuries-and-team-performance-dataset
    Explore at:
    Available download formats: zip (31333 bytes)
    Dataset updated
    Dec 23, 2024
    Authors
    Amrit Biswas
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This dataset investigates the impact of player injuries on team performance across eight Premier League clubs from 2019 to 2023: Tottenham, Aston Villa, Brighton, Arsenal, Brentford, Everton, Burnley, and Manchester City. The dataset contains over 600 injury records, offering insights into how player absences influence match results and individual performance metrics.

    Data Sources

    ‱ Transfer Market: Provided player injury records and durations.
    ‱ Football Critic: Offered player ratings for pre- and post-injury matches.
    ‱ Sky Sports: Supplemented additional match statistics and player performance data.

    Dataset Overview

    Each entry includes:

    ‱ Player Information: Name, position, age, FIFA rating (spanning five years).
    ‱ Injury Details: Type of injury, date of injury, date of return.
    ‱ Performance Data: Match results (win, draw, loss), opposition, and goal difference (GD) for three matches before the injury, during missed matches, and for three matches after the player's return. Player ratings for each match, before and after the injury.

    Key Data Points

    ‱ Performance fluctuations around injury events.
    ‱ Match outcomes during player absences.
    ‱ Ratings of players over time to observe any decline or improvement post-injury.

    This dataset is ideal for sports analytics, performance modeling, and evaluating the broader implications of player injuries on Premier League teams. Explore how injuries disrupt team dynamics and contribute to competitive outcomes in one of the world's top football leagues.

  6. Major US Sports Venues Usage and Affiliations

    • kaggle.com
    zip
    Updated Jan 15, 2023
    Cite
    The Devastator (2023). Major US Sports Venues Usage and Affiliations [Dataset]. https://www.kaggle.com/datasets/thedevastator/major-us-sports-venues-usage-and-affiliations
    Explore at:
    Available download formats: zip (36399 bytes)
    Dataset updated
    Jan 15, 2023
    Authors
    The Devastator
    Area covered
    United States
    Description

    Major US Sports Venues Usage and Affiliations

    Team, League, Conference and Population Usage Records

    By Homeland Infrastructure Foundation [source]

    About this dataset

    This dataset provides detailed information on major sport venues, along with their usage and affiliations. It includes data related to the National Association for Stock Car Auto Racing, Indy Racing League, Major League Soccer, Major League Baseball, National Basketball Association, Women's National Basketball Association, National Hockey League, National Football League, PGA Tour, NCAA Division 1 FBS Football, NCAA Division 1 Basketball and thoroughbred horse racing. This dataset contains columns such as USE (the type of use for the venue), TEAM (the team associated with the venue), LEAGUE (the league associated with the venue), CONFERENCE (the conference associated with the venue), DIVISION (the division associated with the venue), INST_AFFIL (the institution affiliation associated with the venue), TRACK_TYPE (the type of track at a specific point in time or over its complete life-cycle), LENGTH_MILEGE (the length of the track in mileage), ROOF_TYPE (the type of roof covering used at a specific point in time or over its complete life-cycle) and plenty of other variables. With this range of data points spanning many leagues, you can explore usage patterns across US sports venues.

    How to use the dataset

    The Major US Sports Venues Usage and Affiliations dataset includes data on major sports venues from leagues including National Association for Stock Car Auto Racing (NASCAR), Indy Racing League (IRL), Major League Soccer (MLS), Major League Baseball (MLB), National Basketball Association (NBA), Women's National Basketball Association (WNBA), National Hockey League (NHL), National Football League (NFL), PGA Tour, NCAA Division 1 FBS Football, NCAA Division 1 Basketball, and thoroughbred horse racing. The columns provided include USE_, USE_POP, TEAM, LEAGUE, CONFERENCE, DIVISION, INST_AFFIL, TRACK_TYPE, LENGTH_MI, ROOF_TYPE, STADIUM_SH, ADDDATAE, USEWEBSITE, and COMMENTS.

    The USE_ column specifies the type of usage of each venue, which can be college athletics or professional athletics. The corresponding USE_POP column reports how many people use each venue for a particular sport at a given time. For example, if 6 NHL games were being played that day, USE_ would say "professional Athletics" while USE_POP would state "NNN", reflecting that NNN people attended those events collectively. The next column is TEAM, which represents the team that sponsors or manages each venue, or the teams that play in it.

    Following on from TEAM is LEAGUE; here you can find out which league each team represents, such as MLB or NBA. The next three columns, CONFERENCE, DIVISION, and INST_AFFIL, provide more specific details as they extend to the collegiate level: CONFERENCE indicates the conference a team belongs to within its division, while INST_AFFIL states the affiliated school body, e.g. Southeastern Conference > University of Arkansas Razorbacks. Rounding up our overview, these last three columns TRACK_TYPE/LENGTH...


    Research Ideas

    • Analyzing the affiliations and usage of different sports venues to determine which teams or leagues have the most presence across a certain geographic area.
    • Comparing different stadiums within a given conference in terms of their roof type, track length, and stadium shape for optimal design features for new construction projects.
    • Placing sponsorships or advertisements within each sporting arena based on audience size, league popularity, and team affiliation within a given conference or division

    Acknowledgements

    If you use this dataset in your research, please credit the original authors. Data Source

    License

    License: Dataset copyright by authors

    You are free to:

    ‱ Share - copy and redistribute the material in any medium or format for any purpose, even commercially.
    ‱ Adapt - remix, transform, and build upon the material for any purpose, even commercially.

    You must:

    ‱ Give appropriate credit - Provide a link to the license, and indicate if changes were made.
    ‱ ShareAlike - You must distribute your contribut...

  7. NFL Pro Football Hall of Fame 1963-2022

    • kaggle.com
    zip
    Updated May 30, 2022
    Cite
    Matt OP (2022). NFL Pro Football Hall of Fame 1963-2022 [Dataset]. https://www.kaggle.com/datasets/mattop/nfl-pro-football-hall-of-fame-19632022
    Explore at:
    Available download formats: zip (12861 bytes)
    Dataset updated
    May 30, 2022
    Authors
    Matt OP
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    The dataset contains every Pro Football Hall of Fame Inductee in NFL history. This includes player stats for rushing, passing and receiving.

    As of the Class of 2022, there are a total of 362 members of the Hall of Fame. Members are referred to as "Gold Jackets" due to the distinctive gold jackets they receive during the induction ceremony. Between four and eight new inductees are normally enshrined every year.

  8. Complete FIFA 2017 Player dataset (Global)

    • kaggle.com
    zip
    Updated Apr 13, 2017
    Cite
    Soumitra Agarwal (2017). Complete FIFA 2017 Player dataset (Global) [Dataset]. https://www.kaggle.com/datasets/artimous/complete-fifa-2017-player-dataset-global/code
    Explore at:
    Available download formats: zip (10237907 bytes)
    Dataset updated
    Apr 13, 2017
    Authors
    Soumitra Agarwal
    License

    http://opendatacommons.org/licenses/dbcl/1.0/

    Description

    The dataset for people who double on Fifa and Data Science

    Content

    • 17,000+ players
    ‱ 50+ attributes per player, ranging from ball skills to aggression
    • Player's attributes sourced from EA Sports' FIFA video game series, including the weekly updates
    • Players from all around the globe
    • URLs to their homepage
    • Club logos
    • Player images male and female
    • National and club team data

    Weekly Updates would include :

    • Real life data (Match events etc.)
    • The fifa generated player dataset
    • Betting odds
    • Growth

    Data Source

    Data was scraped from https://www.fifaindex.com/ first by getting player profile url set (as stored in PlayerNames.csv) and then scraping the individual pages for their attributes

    Improvements

    ‱ You may have noticed that for a lot of players, their national details are absent (team and kit number) even though the nationality is listed. This may be attributed to missing data on FIFA sites.
    ‱ There are many more than just 50 attributes by which FIFA decides what happens to players over time, how they perform under pressure, how they grow, etc. This data would obviously be well hidden by the organisation and thus would be tough to find.

    GITHUB PROJECT

    Important note for people interested in using the scraping: The site is not uniform and thus the scraping script requires considering a lot of corner cases (i.e. interchanged position of different attributes). Also the script contains proxy preferences which may be removed if not required.

    Exploring the data

    For starters you can become a scout:

    • Create attribute dependent or overall best teams
    • Create the fastest/slowest teams
    • See which areas of the world provide which attributes (like Africa : Stamina, Pace)
    • See which players are the best at each position
    • See which outfield players can play a better role at some other position
    • See which youngsters have attributes which can be developed

    And that is just the beginning. This is the playground.. literally!

    Data description

    • The file FullData.csv contains attributes describing the in game play style and also some of the real statistics such as Nationality etc.
    • The file PlayerNames.csv contains URLs for different players from their profiles on fifaindex.com. Append the URLs after the base url fifaindex.com.
    • The compressed file Pictures.zip contains pictures for top 1000 players in Fifa 17.
    • The compressed file Pictures_f.zip contains pictures for top 139 female players in Fifa 17.
    • The compressed file ClubPictures.zip contains pictures for emblems of some major clubs in Fifa 17.
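
    The URL-joining step described for PlayerNames.csv can be sketched as follows. The https://www.fifaindex.com/ base follows the data source note above. The example path and the full_profile_url helper are hypothetical, since the exact format of the stored values is not shown here.

```python
from urllib.parse import urljoin

BASE_URL = "https://www.fifaindex.com/"  # base site named in the data source note


def full_profile_url(relative_url):
    """Join a relative profile path onto the base URL, ignoring stray whitespace."""
    return urljoin(BASE_URL, relative_url.strip())


# Hypothetical path; the real values come from PlayerNames.csv.
print(full_profile_url("/player/0000/example-player/"))
```

    urljoin handles paths with or without a leading slash, which avoids doubled or missing slashes from plain string concatenation.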

    Inspiration

    I am a huge FIFA fanatic. While playing career mode I realised that I picked great young players early on every single time and since a lot of digital learning relies on how our brain works, I thought scouting great qualities in players would be something that can be worked on. Since then I started working on scraping the website and here is the data. I hope we can build something on it.

    With access to players attributes you can become the best scout in the world. Go for it!

  9. FiveThirtyEight NFL Wide Receivers Dataset

    • kaggle.com
    zip
    Updated Apr 26, 2019
    Cite
    FiveThirtyEight (2019). FiveThirtyEight NFL Wide Receivers Dataset [Dataset]. https://www.kaggle.com/fivethirtyeight/fivethirtyeight-nfl-wide-receivers-dataset
    Explore at:
    Available download formats: zip (183689 bytes)
    Dataset updated
    Apr 26, 2019
    Dataset authored and provided by
    FiveThirtyEight (https://abcnews.go.com/538)
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Content

    NFL Wide Receivers

    This folder contains data behind the story The Football Hall Of Fame Has A Receiver Problem.

    advanced-historical.csv contains advanced career stats for NFL receivers, 1932-2013.

    Columns (header: definition):

    ‱ pfr_player_id: Player identification code at Pro-Football-Reference.com
    ‱ player_name: The player's name
    ‱ career_try: Career True Receiving Yards
    ‱ career_ranypa: Adjusted Net Yards Per Attempt (relative to average) of player's career teams, weighted by TRY w/ each team
    ‱ career_wowy: The amount by which career_ranypa exceeds what would be expected from his QBs' (age-adjusted) performance without the receiver
    ‱ bcs_rating: The number of yards per game by which a player would outgain an average receiver on the same team, after adjusting for teammate quality and age (update of http://www.sabernomics.com/sabernomics/index.php/2005/02/ranking-the-all-time-great-wide-receivers/)

    try-per-game-aging-curve.csv contains receiver aging curve definitions.

    Columns (header: definition):

    ‱ age_from: The age (as of December 31st) the player is moving from
    ‱ age_to: The age (as of December 31st) the player is moving to
    ‱ trypg_change: Expected change in TRY/game from one age-season to the next
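
    To illustrate how the aging curve could be applied, the sketch below adds the expected change for each age step to a starting TRY/game value. It assumes each row covers a single one-year step (age_to equals age_from plus one). The sample values and the project_trypg helper are hypothetical and are not taken from try-per-game-aging-curve.csv.

```python
import csv
import io

# Hypothetical rows; only the header names come from the definitions above.
CURVE_CSV = """age_from,age_to,trypg_change
22,23,4.0
23,24,2.5
24,25,-1.0
"""


def project_trypg(start_age, start_trypg, end_age, curve_rows):
    """Add each expected year-over-year change from start_age up to end_age."""
    change_by_age = {int(r["age_from"]): float(r["trypg_change"]) for r in curve_rows}
    value = start_trypg
    for age in range(start_age, end_age):
        value += change_by_age[age]  # raises KeyError if the curve lacks this step
    return value


curve = list(csv.DictReader(io.StringIO(CURVE_CSV)))
projected = project_trypg(22, 50.0, 25, curve)
print(projected)
```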

    Context

    This is a dataset from FiveThirtyEight hosted on their GitHub. Explore FiveThirtyEight data using Kaggle and all of the data sources available through the FiveThirtyEight organization page!

    • Update Frequency: This dataset is updated daily.

    Acknowledgements

    This dataset is maintained using GitHub's API and Kaggle's API.

    This dataset is distributed under the Attribution 4.0 International (CC BY 4.0) license.

  10. NFL Penalties Data (2009-2022 Season)

    • kaggle.com
    Updated Nov 30, 2023
    Cite
    Matt OP (2023). NFL Penalties Data (2009-2022 Season) [Dataset]. https://www.kaggle.com/datasets/mattop/nfl-penalties-data-2009-2022-season
    Explore at:
    Croissant: a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Nov 30, 2023
    Dataset provided by
    Kaggle
    Authors
    Matt OP
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This extensive dataset offers a granular look at the penalties incurred by teams and players, providing a valuable resource for football enthusiasts, analysts, and researchers alike.

    Explore the nuances of each NFL season, dissecting penalty types, frequency, and the teams and players most frequently penalized. Uncover trends, anomalies, and strategic shifts that have shaped the league's dynamic landscape over the years.

    Whether you're an avid fan seeking a deeper understanding of your favorite team's discipline on the field or a data scientist in search of rich, reliable information for analytical purposes, this NFL Penalties Data delivers a comprehensive and insightful perspective into the intricate world of penalties in professional football. From false starts to pass interference, this dataset serves as a powerful tool for unraveling the threads of each NFL season's story, penalty by penalty.

  11. University Football Injury Prediction Dataset

    • kaggle.com
    zip
    Updated Aug 2, 2025
    Cite
    DataMaverick (2025). University Football Injury Prediction Dataset [Dataset]. https://www.kaggle.com/datasets/yuanchunhong/university-football-injury-prediction-dataset/data
    Explore at:
    Available download formats: zip (85919 bytes)
    Dataset updated
    Aug 2, 2025
    Authors
    DataMaverick
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    University Football Injury Prediction Dataset

    🎯 Overview

    This dataset contains comprehensive data from 800 Chinese university football players participating in collegiate and provincial leagues. The goal is to predict whether a player will suffer an injury in the next academic season using machine learning classification methods.

    📊 Dataset Details

    • Samples: 800 university football players
    • Features: 18 input features + 1 target label
    • Task: Binary classification (0 = No injury, 1 = Injury)
    • Balance: Well-balanced dataset
    • Age Range: 18-24 years (typical university students)

    đŸƒâ€â™‚ïž Feature Categories

    Physical Characteristics (4 features)

    • Age, Height, Weight, BMI
    • Measured through standard university health checkups

    Football-Specific Metrics (4 features)

    • Playing position, training hours per week, matches played, injury history
    • Collected from official records and coach evaluations

    Physical Fitness Assessment (6 features)

    • Knee strength, hamstring flexibility, reaction time, balance, sprint speed, agility
    • Professional fitness testing using standardized protocols

    Lifestyle Factors (3 features)

    • Sleep hours, stress level, nutrition quality
    • Self-reported surveys and validated questionnaires

    Training Compliance (1 feature)

    • Warm-up routine adherence (0=Poor, 1=Good)

    🎯 Target Variable

    Injury_Next_Season: Binary classification where injury is defined as training/competition-related injury causing ≥7 consecutive days of absence, verified by university medical center and coaching staff.

    🔬 Data Quality

    • Collection Period: Within 4 weeks of each academic year start
    • Validation: Multi-source verification (medical records, coach reports, student surveys)
    • Quality Control: Reviewed by sports medicine professionals
    • Missing Data: Minimal, verified through multiple channels

    🚀 Potential Applications

    • Sports Medicine Research: Identify key injury risk factors
    • Preventive Healthcare: Data-driven injury prevention strategies
    • University Sports Management: Risk assessment for student athletes
    • Machine Learning: Healthcare classification algorithm validation

    📈 Suggested Approaches

    • Classical ML: Logistic Regression, Random Forest, SVM, XGBoost
    • Evaluation Metrics: Accuracy, Precision, Recall, F1-Score, AUC-ROC
    • Validation: Cross-validation with stratified sampling
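
    As a rough illustration of the evaluation metrics listed above, the sketch below computes accuracy, precision, recall and F1 for binary labels in plain Python. The function name and the example labels are hypothetical and do not come from the dataset.

```python
def classification_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall and F1 for binary labels (0/1)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # injured, predicted injured
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # healthy, predicted healthy
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # healthy, predicted injured
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # injured, predicted healthy

    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical labels for eight players (1 = injury, 0 = no injury).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
```

    AUC-ROC needs predicted scores rather than hard labels, so it is left out of this sketch.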

    🏆 Research Value

    This dataset bridges sports science and machine learning, offering insights into university-level athletic injury prediction. It's particularly valuable for researchers in sports medicine, preventive healthcare, and applied machine learning.

    This dataset is intended for academic research and educational purposes. Please respect data privacy and usage guidelines.

  12. College Football Team Stats Seasons 2013 to 2023

    • kaggle.com
    zip
    Updated Mar 31, 2024
    Cite
    Jeff Gallini (2024). College Football Team Stats Seasons 2013 to 2023 [Dataset]. https://www.kaggle.com/jeffgallini/college-football-team-stats-2019
    Explore at:
    Available download formats: zip (598976 bytes)
    Dataset updated
    Mar 31, 2024
    Authors
    Jeff Gallini
    License

    http://opendatacommons.org/licenses/dbcl/1.0/

    Description

    Context

    College football is one of the longest-lived fascinations in American culture. Its TV ratings routinely dominate the fall TV schedules. The NCAA has a stats website, but it does not have all the team information and uses many obscure acronyms.

    With the data available, I went ahead and scraped the team statistics for each college football season from 2013 to the present.

    Content

    The data contain team statistics for all FBS-level teams for each college season, including offensive, defensive, turnover, red zone, special teams, first down, third down, and fourth down stats. There are around 145 different team statistics that can be used.

    Acknowledgements

    All of this information is thanks to the NCAA stats website which makes the data easy to use and find. See more here: https://www.ncaa.com/stats/football/fbs

    Inspiration

    College football is the only sport in the world where the college version is much older than the professional version. It has a very storied history and many anecdotes about it. Explore the data to learn for yourself the following:

    • Does defense really win championships?
    • What features translate into wins?
    • Are special teams of particular value for a team's performance?
    • Which collegiate conference is the best?
    • What's the correlation between offensive and defensive performance?

  13. World Soccer live data feed

    • kaggle.com
    zip
    Updated Jan 28, 2019
    Cite
    Mohammad Ghahramani (2019). World Soccer live data feed [Dataset]. https://www.kaggle.com/datasets/analystmasters/world-soccer-live-data-feed/discussion
    Explore at:
    Available download formats: zip (911496 bytes)
    Dataset updated
    Jan 28, 2019
    Authors
    Mohammad Ghahramani
    Description

    Context

    This is the first live data stream on Kaggle providing a simple yet rich source of all soccer matches around the world 24/7 in real-time.

    What makes it unique compared to other datasets?

    • It is the first live data feed on Kaggle and it is totally free
    • Unlike “Churn rate” datasets you do not have to wait months to evaluate your predictions; simply check the match’s outcome in a couple of hours
    • You can use your predictions/analysis for your own benefit instead of spending your time and resources on helping a company maximize its profit
    • A five-year-old laptop can do the calculations; you do not need high-end GPUs
    • Couldn’t make it to the top 3 submissions? Never mind, you still have the chance to get your prize on your own
    • You can’t get accurate results on all samples? Do not worry, just filter out the hard ones (e.g. ignore international friendly) and simply choose the ones you are sure of.
    • Need help from human experts for each sample? Every sample comes with at least two opinions from experts
    • You wish you could add your complementary data? Just contact us and we will try to facilitate it.
    • Couldn’t win “Warren Buffett's 2018 March Madness Bracket Contest”? Here is your chance to make your accumulative profit.

    Simply train your algorithm on the first version of the training dataset, which has approximately 11.5k matches, and predict the matches provided in the following data feed.

    Fetch the data stream

    The CSV file is updated every 30 minutes, at minutes 20’ and 50’ of every hour. Please do not download it more than twice per hour, as each download incurs additional cost.

    You may download the CSV file from the following Amazon S3 link by replacing FOLDER_NAME as described below:

    https://s3.amazonaws.com/FOLDER_NAME/amasters.csv

    *. Substitute FOLDER_NAME with "analyst-masters".
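
    The download instructions above can be sketched roughly as follows. The helper names (build_feed_url, may_download, fetch_feed) are hypothetical, the UTF-8 decoding is an assumption, and the feed may no longer be online; only the URL pattern and the twice-per-hour limit come from the description.

```python
from datetime import datetime, timedelta
from urllib.request import urlopen

# URL pattern quoted in the dataset description.
FEED_URL_TEMPLATE = "https://s3.amazonaws.com/{folder}/amasters.csv"


def build_feed_url(folder_name="analyst-masters"):
    """Substitute FOLDER_NAME in the feed URL."""
    return FEED_URL_TEMPLATE.format(folder=folder_name)


def may_download(last_download, now, min_interval=timedelta(minutes=30)):
    """Return True if at least min_interval has passed since the last download.

    A 30-minute gap keeps the client at no more than two downloads per hour.
    """
    return last_download is None or now - last_download >= min_interval


def fetch_feed(url):
    """Download the CSV text (assumes the file is UTF-8 encoded)."""
    with urlopen(url) as response:
        return response.read().decode("utf-8")


feed_url = build_feed_url()
```

    A caller would check may_download(last_download, datetime.now()) before each call to fetch_feed(feed_url) and record the time of every successful download.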

    Content

    Our goal is to identify the outcome of a match as Home, Draw or Away. The variety of sources and the nature of the information provided in this data stream make it a unique database. Currently, five servers are collecting data from soccer matches around the world, communicating with each other and finally aggregating the data based on the dominant features learned from 400,000 matches over 7 years. I describe every column and the data collection below in two categories: Category I – Current situation and Category II – Head-to-Head History. Accordingly, we divide the type of data we have for each team into 4 modes:

    • Mode 1: we have both Category I and Category II available
    • Mode 2: we only have Category I available
    • Mode 3: we only have Category II available
    • Mode 4: neither Category I nor Category II is available
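
    The four modes can be written as a small lookup. This is only a sketch: the function name and its boolean arguments are hypothetical, and how availability is detected in the CSV is not specified here.

```python
def data_mode(has_category_1, has_category_2):
    """Map the availability of Category I / Category II data to modes 1-4."""
    if has_category_1 and has_category_2:
        return 1  # both categories available
    if has_category_1:
        return 2  # only Category I (current situation)
    if has_category_2:
        return 3  # only Category II (head-to-head history)
    return 4      # neither category available


modes = [data_mode(a, b) for a in (True, False) for b in (True, False)]
```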

    Below you can find a full illustration of each category.

    I. Current situation

    Col 1 to 3:

    Votes_for_Home Votes_for_Draw Votes_for_Away
    

    The most distinctive parts of the database are these 3 columns. We are releasing the opinions of over 100 professional soccer analysts predicting the outcome of a match. Their votes reflect every piece of information they receive on players, team line-ups, injuries and a team's urgency to win a match to stay in the league. They are spread around the world in various time zones and are experts on soccer teams from various regions. Our servers aggregate their opinions to update the CSV file until kickoff. Therefore, even if 40 users predict that Real Madrid will win against Real Sociedad at the Santiago Bernabéu on January 6th, 2019, you should doubt the home win if 5 users predict that Real Sociedad (the away team) will be the winner. Here, the “majority of votes” works in conjunction with other features.
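
    One simple way to work with the three vote columns is to turn raw counts into shares. The sketch below reuses the 40-versus-5 example from the paragraph above; the function name is hypothetical and the draw count of 0 is an assumption made for the example.

```python
def vote_shares(votes_home, votes_draw, votes_away):
    """Convert raw analyst vote counts into fractions of all votes cast."""
    total = votes_home + votes_draw + votes_away
    if total == 0:
        return None  # no votes recorded for this match
    return {
        "Home": votes_home / total,
        "Draw": votes_draw / total,
        "Away": votes_away / total,
    }


# 40 votes for the home team, 5 for the away team (draw votes assumed 0).
shares = vote_shares(40, 0, 5)
```

    A minority share like the one for "Away" here is the kind of signal the description suggests combining with other features rather than ignoring.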

    Col 4 to 9:

    Weekday Day Month  Year  Hour  Minute
    

    There are over 60,000 matches in a year, and approximately 400 are usually held per day on weekends. More critical and exciting matches, which are usually less predictable, are held toward the evening in Europe. We currently provide times in Central European Time (CET), equivalent to GMT+01:00.

    *. Please note that the 2nd row of the CSV file gives the time at which the data values from all servers were saved to the file.
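
    The date and time columns can be combined into a timezone-aware timestamp. This is a sketch under the assumption that Day, Month, Year, Hour and Minute are integers; the fixed GMT+01:00 offset follows the description and ignores daylight saving time. The function name and the 20:45 kickoff are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Fixed offset, as stated in the description (CET = GMT+01:00).
CET = timezone(timedelta(hours=1), name="CET")


def kickoff_time(day, month, year, hour, minute):
    """Build a timezone-aware kickoff time from the Day..Minute columns."""
    return datetime(year, month, day, hour, minute, tzinfo=CET)


# Hypothetical kickoff on January 6th, 2019 at 20:45 CET.
kickoff = kickoff_time(6, 1, 2019, 20, 45)
kickoff_utc = kickoff.astimezone(timezone.utc)
```

    Converting to UTC, as in the last line, makes it easier to compare kickoff times with timestamps from other sources.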

    Col 10 to 13:

    Total_Bettors   Bet_Perc_on_Home    Bet_Perc_on_Draw   Bet_Perc_on_Away
    

    This data is recorded a few hours before the match, because people place bets emotionally as kickoff approaches. “Total_Bettors” is the overall number of bettors; the other three columns give the percentage of them betting on the “Home,” “Draw” and “Away” outcomes.
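
    If absolute numbers are more convenient than percentages, they can be recovered approximately from these four columns. The sketch assumes the percentage columns are on a 0-100 scale, which the description does not state; the function name and the example values are hypothetical.

```python
def bettors_per_outcome(total_bettors, perc_home, perc_draw, perc_away):
    """Estimate the number of bettors per outcome (percentages assumed 0-100)."""
    return {
        "Home": total_bettors * perc_home / 100,
        "Draw": total_bettors * perc_draw / 100,
        "Away": total_bettors * perc_away / 100,
    }


# Hypothetical row: 2000 bettors split 55% / 25% / 20%.
counts = bettors_per_outcome(2000, 55, 25, 20)
```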

    Col 14 to 15:

    Team_1 Team_2   
    

    The team playing “Home” is “Team_1” and the opponent playing “Away” is “Team_2”.

    Col 16 to 36:

    League_Rank_1  League_Rank_2  Total_teams     Points_1  Points_2  Max_points Min_points Won_1  Draw_1 Lost_1 Won_2  Draw_2 Lost_2 Goals_Scored_1 Goals_Scored_2 Goals_Rec_1 Goal_Rec_2 Goals_Diff_1  Goals_Diff_2
    

    If the match is betw...

  14. Tom Brady data for cleaning and analysis

    • kaggle.com
    zip
    Updated Jan 30, 2023
    Cite
    Omar Alsharif (2023). Tom Brady data for cleaning and analysis [Dataset]. https://www.kaggle.com/datasets/analyst0111/tom-brady-data-for-cleaning-and-analysis
    Explore at:
    Available download formats: zip (21389 bytes)
    Dataset updated
    Jan 30, 2023
    Authors
    Omar Alsharif
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This data was collected from https://www.pro-football-reference.com/players/B/BradTo00/gamelog and makes a good exercise in data cleaning and data analysis.

  15. US state_trends.csv

    • kaggle.com
    zip
    Updated Jan 18, 2024
    Cite
    ANKITHA SRIDHAR (2024). US state_trends.csv [Dataset]. https://www.kaggle.com/datasets/ankithasridhar/us-state-trends-csv
    Explore at:
    Available download formats: zip (64366 bytes)
    Dataset updated
    Jan 18, 2024
    Authors
    ANKITHA SRIDHAR
    Area covered
    United States
    Description

    This dataset, named "state_trends.csv," contains information about different U.S. states. Let's break down the attributes and understand what each column represents:

    1. state: The name of the U.S. state.
    2. state_code: The two-letter postal code abbreviation for the state.
    3. population: The population of the state.
    4. sq_miles: The total land area of the state in square miles.
    5. pop_density: Population density, which is the number of people per square mile.
    6. region: The geographical region of the United States to which the state belongs (e.g., South, West).
    7. psych_region: A description of the psychological region based on personality traits.
    8. psy_reg: A shortened version of the psychological region.
    9. extraversion: A measure of the state's population tendency toward extraversion.
    10. agreeableness: A measure of the state's population tendency toward agreeableness.
    11. conscientiousness: A measure of the state's population tendency toward conscientiousness.
    12. neuroticism: A measure of the state's population tendency toward neuroticism.
    13. openness: A measure of the state's population tendency toward openness.
    14. data_science: A score related to the state's interest or proficiency in the field of data science.
    15. artificial_intelligence: A score related to the state's interest or proficiency in artificial intelligence.
    16. machine_learning: A score related to the state's interest or proficiency in machine learning.
    17. data_analysis: A score related to the state's interest or proficiency in data analysis.
    18. business_intelligence: A score related to the state's interest or proficiency in business intelligence.
    19. spreadsheet: A score related to the state's interest or proficiency in spreadsheet usage.
    20. statistics: A score related to the state's interest or proficiency in statistics.
    21. art: A score related to the state's interest or involvement in the field of art.
    22. dance: A score related to the state's interest or involvement in dance.
    23. museum: A score related to the state's interest or presence of museums.
    24. basketball: A score related to the state's interest or involvement in basketball.
    25. football: A score related to the state's interest or involvement in football.
    26. baseball: A score related to the state's interest or involvement in baseball.
    27. soccer: A score related to the state's interest or involvement in soccer.
    28. hockey: A score related to the state's interest or involvement in hockey.
    29. has_nba: Indicates whether the state has a National Basketball Association (NBA) team (Yes/No).
    30. has_nfl: Indicates whether the state has a National Football League (NFL) team (Yes/No).
    31. has_mlb: Indicates whether the state has a Major League Baseball (MLB) team (Yes/No).
    32. has_mls: Indicates whether the state has a Major League Soccer (MLS) team (Yes/No).
    33. has_nhl: Indicates whether the state has a National Hockey League (NHL) team (Yes/No).
    34. has_any: Indicates whether the state has any of the mentioned professional sports teams (Yes/No).

    In summary, this dataset provides a variety of information about U.S. states, including demographic data, geographical region, psychological region, personality traits, and scores related to interests or proficiencies in various fields such as data science, art, and sports.

  16. European Football Market Values

    • kaggle.com
    zip
    Updated Nov 29, 2019
    Cite
    A Richt (2019). European Football Market Values [Dataset]. https://www.kaggle.com/aricht1995/european-football-market-values
    Explore at:
    Available download formats: zip (346847 bytes)
    Dataset updated
    Nov 29, 2019
    Authors
    A Richt
    Description

    Contains web-scraped (rvest) market value information and other related data on players from the top 9 European leagues: Premier League, La Liga, Liga NOS, Ligue 1, Bundesliga, Serie A, Premier Liga, Eredivisie and Jupiler Pro League (20+ variables).

  17. 100 Highest Paid Athletes of the World

    • kaggle.com
    zip
    Updated Aug 6, 2024
    Cite
    Batros Jamali (2024). 100 Highest Paid Athletes of the World [Dataset]. https://www.kaggle.com/datasets/batrosjamali/100-highest-paid-athletes-of-the-world
    Explore at:
    Available download formats: zip (3103 bytes)
    Dataset updated
    Aug 6, 2024
    Authors
    Batros Jamali
    License

    http://opendatacommons.org/licenses/dbcl/1.0/

    Description

    In the eight years since he became the world’s highest-paid athlete for the first time, much has changed for Cristiano Ronaldo. The 39-year-old Portuguese soccer star went from lighting up the Bernabéu with Real Madrid to stints with Juventus and Manchester United, until finally landing at his current home, Al Nassr of the Saudi Pro League. But no matter the location, one thing has remained constant: Ronaldo is still drawing outsized paydays. He earned an estimated $260 million over the last 12 months, making him the highest-paid athlete in the world for the fourth time in his career. It estimates Ronaldo’s contract with Al Nassr earned him $200 million this season. And as one of the sports world’s most successful pitchmen, Ronaldo earned another $60 million off the field from an endorsement portfolio that includes Nike, Binance and Herbalife, among others.

  18. Tennis ATP Tour Australian Open Final 2019

    • kaggle.com
    zip
    Updated Mar 2, 2019
    Cite
    Robert Seidl (2019). Tennis ATP Tour Australian Open Final 2019 [Dataset]. https://www.kaggle.com/robseidl/tennis-atp-tour-australian-open-final-2019
    Explore at:
    Available download formats: zip (27979 bytes)
    Dataset updated
    Mar 2, 2019
    Authors
    Robert Seidl
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Context

    Nowadays, in most sports either tracking or event data is available for sports data scientists to analyse leagues, teams, games or players. For example, in soccer, event-based data is available for all major leagues from professional data providers like Opta, Statsbomb or Wyscout. For tennis this is different. Even though camera-based tracking with Hawkeye is possible, this data is not available to the outside, and only the largest courts are equipped with the system. When I think about the latest breakthroughs in machine learning in image classification, detection, NLP (deepl.com) and audio recognition (Siri, Alexa), it is evident that all of these areas provide a huge amount of easily accessible data. Personally, I expect that there would be far more research in tennis if a large amount of match data were freely available. Statistics for all matches played on the ATP Tour are available from different sources. For example, Jeff Sackmann's GitHub repository is a great way to start. He also has a match charting project where point-by-point data is collected. But when I think about tennis, it is about the movement of the players, their tactics, etc. It is the ball movement, the actual rallies and shots I want to be able to see and analyse.

    Event data makes it possible to capture positional, temporal and stroke information. As a proof of concept, and a tribute to Novak Djokovic and Rafael Nadal, two of the greatest tennis players of all time, I manually annotated each rally and stroke of their Australian Open final 2019. Fortunately for me, it only went to three sets.

    Content

    The data consists of all points played in the match. It is built hierarchically from events, to rallies, to actual points.
    - Points: a list of all points played in the final with information about the server, receiver, point type, number of strokes, time of rally, new score of the game.
    - Rallies: A list of all rallies with Server, Returner, etc.
    - Events: Each time a player hit the ball, the stroke type, position of the player, and position of the opponent were recorded.
    - Serves: For each successful serve (i.e., one that was not a fault), the position of the serve in the service box was recorded (whenever possible).

    I have already done the hard part of data cleaning, and the dataset is hopefully easy to understand and ready to use.

    Positions: The x, y positions are with respect to the court coordinate system shown in Figure 1. They were calculated from the pixel coordinates through a direct linear transformation at the beginning of the match. (As the camera angle changed a bit during the match, some of the positions are off.)

    Figure 1: The court coordinate system. The horizontal axis refers to x and the vertical axis to the y-direction. (https://www.dropbox.com/s/gakg677f0uvhmb2/Screenshot%202019-03-02%2021.44.11.png?raw=1)

    Inspiration

    Look into the data, see what you can find. Is there information about the game in positional, temporal and stroke information that can tell you more about the players and the match than simple match sheet statistics like the number of break points or first serves in?
    You can use the dataset however you want, but here are some things you could start with.
    - It is a great way to practice pandas to generate general statistics like points played, serve percentages, games won, breakpoints etc. and compare them with the statistics from other websites.
    - You can visualize the spatial positioning of the players on the court, e.g., to answer the question of whether there is a difference between the return positions of Nadal and Djokovic.
    - You can calculate movement statistics like distance covered.
    - You can calculate the percentage of forehand and backhands, or shot types like slice, topspin for each player.
    - You can find out where the players are serving to. (Do not forget that Nadal is a lefty.)
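
    As one example of the movement statistics suggested above, distance covered can be approximated by summing the straight-line distances between consecutive recorded positions of a player. This is a minimal sketch: the function name and the sample track are hypothetical, and the result is in whatever units the court coordinate system uses.

```python
import math


def distance_covered(positions):
    """Sum the Euclidean distances between consecutive (x, y) positions."""
    return sum(math.dist(a, b) for a, b in zip(positions, positions[1:]))


# Hypothetical sequence of (x, y) court positions for one player.
track = [(0.0, 0.0), (3.0, 4.0), (3.0, 10.0)]
total = distance_covered(track)  # 5.0 + 6.0
```

    Because positions were only recorded when a player hit the ball, and some positions are off after the camera angle changed, a figure computed this way is a lower-bound style estimate rather than a true tracking distance.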

    To get you started, I have created a sample kernel. Find it here.

  19. IPL Dataset 2008-2016

    • kaggle.com
    zip
    Updated Jul 3, 2023
    Cite
    Bhanupratap Biswas (2023). IPL Dataset 2008-2016 [Dataset]. https://www.kaggle.com/datasets/bhanupratapbiswas/ipl-dataset-2008-2016
    Explore at:
    Available download formats: zip (12187 bytes)
    Dataset updated
    Jul 3, 2023
    Authors
    Bhanupratap Biswas
    License

    ODC Public Domain Dedication and Licence (PDDL) v1.0 http://www.opendatacommons.org/licenses/pddl/1.0/
    License information was derived automatically

    Description

    The Indian Premier League (IPL) is a professional Twenty20 cricket league in India. It is one of the most popular and lucrative cricket leagues in the world. The tournament was established by the Board of Control for Cricket in India (BCCI) in 2007 and is played every year during the months of March, April, and May.

    The IPL follows a franchise-based model, where eight teams representing different cities or regions in India compete against each other. The teams are owned by various individuals, companies, and consortiums, including Bollywood actors, business tycoons, and corporate entities. Some of the prominent teams in the IPL include the Mumbai Indians, Chennai Super Kings, Royal Challengers Bangalore, and Kolkata Knight Riders.

    The IPL has a star-studded lineup of players, with both international and domestic cricketers participating in the tournament. Many of the world's top cricketers, such as Virat Kohli, Rohit Sharma, AB de Villiers, and Chris Gayle, have been a part of the IPL. The league has provided a platform for young talents to showcase their skills and has played a significant role in the development of Indian cricket.

    The tournament format involves a round-robin group stage followed by playoffs. Each team plays a total of 14 matches, facing every other team twice, once at home and once away. The top four teams in the group stage advance to the playoffs, which consist of the qualifier matches and the final. The team that finishes first in the group stage gets two chances to reach the final, while the other three teams compete in the eliminator and the second qualifier matches.

    The IPL has witnessed intense rivalries, thrilling matches, and high-scoring contests over the years. The league has also been a platform for innovation in cricket, with features like strategic time-outs, cheerleaders, and the introduction of the Decision Review System (DRS) in the league before it was implemented globally.

    Apart from the cricketing action, the IPL has become a significant entertainment spectacle, attracting a large fan base. The matches are held in various cricket stadiums across India, with fans supporting their favorite teams with enthusiasm and passion. The league has also been associated with glitz, glamour, and celebrity performances, making it a blend of sports and entertainment.

    The IPL's success has led to the emergence of similar leagues in other countries, such as the Big Bash League in Australia and the Caribbean Premier League. It has also contributed to the growth of franchise-based sports leagues worldwide.

    Overall, the Indian Premier League has revolutionized cricket in India and has become a global phenomenon, attracting players, fans, and sponsors from around the world. It has provided a platform for top-level cricket, entertainment, and commercial opportunities, making it a highly anticipated and celebrated event in the cricket calendar.

  20. Pro-Kabaddi League 2019

    • kaggle.com
    zip
    Updated Oct 3, 2019
    Cite
    Sujay Pandit (2019). Pro-Kabaddi League 2019 [Dataset]. https://www.kaggle.com/sujaypandit/prokabbadi-league-2019
    Explore at:
    Available download formats: zip (15030 bytes)
    Dataset updated
    Oct 3, 2019
    Authors
    Sujay Pandit
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Context

    VIVO Pro-Kabaddi league has gained significant popularity since it started in 2014. This dataset presents a learning opportunity for people who want to try their hand at sports analytics. The challenge is divided into 7 parts:

    Task 1: Predict the winner of the tournament
    Task 2: Predict the top team in the points table after the completion of league matches
    Task 3: Predict the team with the highest points for successful raids
    Task 4: Predict the team with the highest points for successful tackles
    Task 5: Predict the team with the highest super-performance total
    Task 6: Predict the player with the highest SUCCESSFUL RAID percentage
    Task 7: Predict the player with the highest SUCCESSFUL TACKLE percentage

    Content

    The dataset contains data up to 30th of September 2019. The datasets are pretty self-explanatory once you have a brief look at them. The tournament ends on 19th of October 2019. Code to update the data as the tournament progresses and reference data for the challenges can be found at:

    https://github.com/sujay-pandit/Upgrad_Hackathon

    I will not be updating the dataset beyond this point so people can use this dataset for predictions in future as well.

    Acknowledgements

    I would like to thank:

    1. VIVO Pro Kabaddi, for helping bring fame to a sport that has deserved it for years. https://www.prokabaddi.com/about-prokabaddi

    2. Upgrad, for organizing the hackathon. https://www.upgrad.com/


Cite
Danish Baariq (2025). Dataset Player FIFA Football 2025 [Dataset]. http://doi.org/10.34740/kaggle/dsv/12446270

Dataset Player FIFA Football 2025

Explore at:
Croissant (a format for machine-learning datasets; learn more about it at mlcommons.org/croissant)
Dataset updated
Jul 12, 2025
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Danish Baariq
License

MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically

Description

This dataset contains detailed attributes of professional football players, curated from a FIFA-based database. It includes a wide range of physical, technical, and performance-related statistics for each player.

The dataset provides valuable insights for sports analysts, data scientists, machine learning practitioners, and football enthusiasts who are interested in exploring player performance, scouting analysis, player valuation modeling, and team formation strategy.

📩 Dataset Features

Each record in the dataset represents an individual football player and includes the following types of data:

  • Personal Information:

    • full_name, birth_date, age, height_cm, weight_kgs, nationality
  • Player Role & Identity:

    • positions, preferred_foot, national_team, national_team_position, national_rating, national_jersey_number
  • Overall & Potential Ratings:

    • overall_rating, potential, international_reputation(1-5)
  • Technical Attributes:

    • Ball control: ball_control, dribbling, crossing, curve
    • Shooting: finishing, volleys, shot_power, long_shots, penalties
    • Passing: short_passing, long_passing, vision
    • Defending: marking, standing_tackle, sliding_tackle, interceptions
  • Physical & Mental Attributes:

    • acceleration, sprint_speed, agility, strength, stamina, jumping, balance, aggression, composure, reactions
  • Special Skills:

    • skill_moves(1-5), weak_foot(1-5), freekick_accuracy, heading_accuracy
  • Economic Value:

    • value_euro, wage_euro, release_clause_euro

🎯 Potential Use Cases

  • Player performance comparison
  • Player valuation prediction
  • Machine learning models for scouting
  • Visual dashboards for team analytics
  • Career trajectory analysis (based on age and potential)