17 datasets found
  1. Detailed NFL Play-by-Play Data 2009-2018

    • kaggle.com
    zip
    Updated Dec 22, 2018
    Cite
    Max Horowitz (2018). Detailed NFL Play-by-Play Data 2009-2018 [Dataset]. https://www.kaggle.com/datasets/maxhorowitz/nflplaybyplay2009to2016
    Available download formats: zip (287411671 bytes)
    Dataset updated
    Dec 22, 2018
    Authors
    Max Horowitz
    Description

    Introduction

    The lack of publicly available National Football League (NFL) data sources has been a major obstacle to modern, reproducible research in football analytics. While clean play-by-play data is available via open-source software packages in other sports (e.g. nhlscrapr for hockey, PitchF/x data in baseball, Basketball Reference for basketball), equivalent datasets are not freely available to researchers interested in the statistical analysis of the NFL. To solve this issue, a group of Carnegie Mellon University statistical researchers, including Maksim Horowitz, Ron Yurko, and Sam Ventura, built and released nflscrapR, an R package that uses an API maintained by the NFL to scrape, clean, parse, and output datasets at the individual play, player, game, and season levels. Using the data output by the package, the trio went on to develop reproducible methods for building expected points and win probability models for the NFL. The outputs of these models are included in this dataset and can be accessed using the nflscrapR package.

    Content

    The dataset made available on Kaggle contains all the regular season plays from the 2009-2016 NFL seasons. The dataset has 356,768 rows and 100 columns. Each play is broken down in great detail, with information on game situation, players involved, results, and advanced metrics such as expected points and win probability values. Detailed information about the dataset, along with more NFL data, can be found at https://github.com/ryurko/nflscrapR-data.

    Acknowledgements

    This dataset was compiled by Ron Yurko, Sam Ventura, and myself. Special shout-out to Ron for improving our current expected points and win probability models and compiling this dataset. All three of us are proud founders of the Carnegie Mellon Sports Analytics Club.

    Inspiration

    This dataset is meant to both grow and bring together the sports analytics community by providing clean and easily accessible NFL data that has never been available on this scale for free.

  2. American Football Player Detection Dataset

    • universe.roboflow.com
    zip
    Updated Jun 4, 2023
    Cite
    FH Technikum Wien (2023). American Football Player Detection Dataset [Dataset]. https://universe.roboflow.com/fh-technikum-wien-m15r2/american-football-player-detection/model/1
    Available download formats: zip
    Dataset updated
    Jun 4, 2023
    Dataset authored and provided by
    FH Technikum Wien
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    American Football Players Bounding Boxes
    Description

    American Football Player Detection

    ## Overview
    
    American Football Player Detection is a dataset for object detection tasks - it contains American Football Players annotations for 171 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
    
    ## License
    
    This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
    
  3. Athlete Career Length

    • kaggle.com
    zip
    Updated Mar 2, 2025
    Cite
    Kevin Parks (2025). Athlete Career Length [Dataset]. https://www.kaggle.com/datasets/kevinparks/athlete-career-length
    Available download formats: zip (580166 bytes)
    Dataset updated
    Mar 2, 2025
    Authors
    Kevin Parks
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Data on athletes' professional career lengths in the sports of baseball, basketball, and American football. The data was compiled from baseball-reference.com, pro-football-reference.com, and basketball-reference.com. The data is split into three different files, one for each sport, identified by the title: baseball_career_length.csv, basketball_career_length.csv, football_career_length.csv.

    Dataset features available in all files:
    - name: The name of the athlete.
    - start_year: The year that the athlete's professional career started.
    - end_year: The last year of the athlete's professional career.
    - hall_of_fame: True for athletes who have been admitted to the hall of fame, False otherwise.
    - status: True if the athlete has finished their career, False otherwise.
    - career_length: The total number of years the athlete was actively playing professionally.
    - sport: The sport of the athlete.

    Additional dataset features available for football_career_length.csv:
    - position: The position that the athlete played in their sport. Multiple positions are separated by a '-'.

    Additional dataset features available for basketball_career_length.csv:
    - position: The position that the athlete played in their sport. Multiple positions are separated by a '-'.
    - height: The height of the athlete in inches.
    - weight: The weight of the athlete in pounds.
    - birth_date: The date of the athlete's birth.
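
    As a rough illustration of how the '-'-separated position field might be handled, the sketch below averages career_length per position using only the Python standard library. The column names come from the description above; the three sample rows, the player names, and the helper name mean_career_length_by_position are hypothetical and do not come from the actual files.

```python
import csv
import io
from collections import defaultdict

# Hypothetical rows that follow the documented football_career_length.csv columns.
SAMPLE_CSV = """name,start_year,end_year,hall_of_fame,status,career_length,sport,position
Player A,2001,2010,False,True,10,football,QB
Player B,2005,2008,False,True,4,football,WR-RB
Player C,2012,2019,True,True,8,football,RB
"""


def mean_career_length_by_position(csv_text):
    """Average career_length per position.

    Athletes listed with several positions (e.g. "WR-RB") count toward each one.
    """
    totals = defaultdict(lambda: [0.0, 0])  # position -> [sum of years, athlete count]
    for row in csv.DictReader(io.StringIO(csv_text)):
        for position in row["position"].split("-"):
            totals[position][0] += float(row["career_length"])
            totals[position][1] += 1
    return {pos: total / count for pos, (total, count) in totals.items()}


averages = mean_career_length_by_position(SAMPLE_CSV)
print(averages)
```

    To run this against the real file, the sample text could be replaced with the contents of football_career_length.csv, assuming its header row matches the feature names listed above.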

  4. Football Manager 2023: 90k+ Player Stats

    • kaggle.com
    zip
    Updated Oct 1, 2025
    Cite
    Siddhraj Thakor (2025). Football Manager 2023: 90k+ Player Stats [Dataset]. https://www.kaggle.com/datasets/siddhrajthakor/football-manager-2023-dataset
    Available download formats: zip (9373378 bytes)
    Dataset updated
    Oct 1, 2025
    Authors
    Siddhraj Thakor
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Football Manager Players Dataset

    Overview

    Dive into the ultimate treasure trove for football enthusiasts, data analysts, and gaming aficionados! The Football Manager Players Dataset is a comprehensive collection of player data extracted from a popular football management simulation game, offering an unparalleled look into the virtual world of football talent. This dataset includes detailed attributes for thousands of players across multiple leagues worldwide, making it a goldmine for analyzing player profiles, scouting virtual stars, and building predictive models for football strategies.

    Whether you're a data scientist exploring sports analytics, a football fan curious about your favorite virtual players, or a game developer seeking inspiration, this dataset is your ticket to unlocking endless possibilities!

    Dataset Description

    This dataset is a meticulously curated compilation of player statistics from five CSV files, merged into a single, unified dataset (merged_players.csv). It captures a diverse range of attributes for players from various clubs, nations, and leagues, including top-tier competitions like the English Premier Division, Argentina's Premier Division, and lower divisions across the globe.

    Key Features

    • Rich Player Attributes: Over 70 columns covering essential metrics such as:
      • Basic Info: UID, Name, Date of Birth (DOB), Nationality, Height, Weight, Age
      • Club & Position: Club, Position (e.g., AM, DM, GK), Based (league/division)
      • Performance Stats: Caps, Appearances (AT Apps), Goals (AT Gls), League Appearances, League Goals
      • Technical Skills: Acceleration, Passing, Dribbling, Finishing, Tackling, and more
      • Mental Attributes: Work Rate, Vision, Leadership, Determination
      • Physical Attributes: Pace, Strength, Stamina, Agility
      • Market Value: Transfer Value (e.g., $0 to millions)
      • Miscellaneous: Preferred Foot, Media Handling, Injury Proneness
    • Global Coverage: Players from diverse regions, including Europe (England, Spain, Italy), South America (Argentina, Brazil), Asia (South Korea, China), Africa (Ivory Coast, Burkina Faso), and North America (USA, Mexico).
    • Varied Player Types: From young prospects (15–18 years old) to veteran stars (up to 45 years old), including amateurs, youth players, and professionals.
    • Realistic Insights: Includes attributes like Media Description (e.g., "Young winger," "Veteran striker") and injury status, mirroring real-world football dynamics.

    Dataset Size

    • Rows: Thousands of player records (exact count depends on deduplication).
    • Columns: 70+ attributes per player.
    • File: merged_players.csv (UTF-8 encoded for compatibility with special characters).

    Potential Use Cases

    • Sports Analytics:
      • Analyze player attributes to identify key traits for success by position (e.g., what makes a top goalkeeper?).
      • Predict transfer values based on skills, age, and performance stats.
      • Cluster players by playing style or potential using machine learning.
    • Scouting & Strategy:
      • Build a dream team by filtering players based on specific attributes (e.g., high Pace and Dribbling for wingers).
      • Compare young talents vs. experienced veterans for team-building strategies.
    • Gaming & Modding:
      • Create custom Football Manager databases or mods.
      • Analyze game balance by studying attribute distributions.
    • Visualization:
      • Develop interactive dashboards to explore player stats by league, nationality, or position.
      • Map player origins to visualize global football talent distribution.
    • Education & Research:
      • Use as a teaching tool for data science, exploring data cleaning, merging, and analysis.
      • Study correlations between mental/physical attributes and in-game performance.

    Why This Dataset Stands Out

    • Comprehensive: Covers every aspect of a player's profile, from technical skills to personality traits.
    • Diverse: Includes players from top-tier to lower divisions, offering a broad spectrum of talent.
    • Engaging: Perfect for football fans and data enthusiasts alike, blending gaming with real-world analytics.
    • Ready-to-Use: Merged and cleaned for immediate analysis, with consistent column structure across all records.

    Getting Started

    1. Download: Grab merged_players.csv and load it into your favorite tool (Python/pandas, R, Excel, etc.).
    2. Explore: Check out columns like Transfer Value, Position, and Media Description to start your analysis.
    3. Analyze: Use Python (e.g., pandas, scikit-learn) or visualization tools (e.g., Tableau, Power BI) to uncover insights.
    4. Share: Build models, visualizations, or scouting reports and share your findings with the Kaggle community!
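
    The filtering idea mentioned under Scouting & Strategy (high Pace and Dribbling) could be sketched as follows, using only the Python standard library. The column names Name, Position, Pace, and Dribbling are taken from the feature list above, but their exact spelling in merged_players.csv is an assumption, and the sample rows and thresholds are hypothetical.

```python
import csv
import io

# Hypothetical rows; real headers in merged_players.csv may be spelled differently.
SAMPLE_CSV = """Name,Position,Pace,Dribbling
Player A,AM,17,16
Player B,DM,11,9
Player C,GK,8,5
Player D,AM,16,18
"""


def fast_dribblers(csv_text, min_pace=15, min_dribbling=15):
    """Return rows meeting both thresholds, best combined score first."""
    rows = csv.DictReader(io.StringIO(csv_text))
    picks = [
        row for row in rows
        if int(row["Pace"]) >= min_pace and int(row["Dribbling"]) >= min_dribbling
    ]
    # Rank by the sum of the two attributes (an arbitrary choice for illustration).
    return sorted(picks, key=lambda r: int(r["Pace"]) + int(r["Dribbling"]), reverse=True)


shortlist = [row["Name"] for row in fast_dribblers(SAMPLE_CSV)]
print(shortlist)
```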

    Example Questions to Explore

    • Which young players (<18 years) have the highest poten...
  5. 70+ Football Leagues Dataset 2019-2023

    • kaggle.com
    zip
    Updated Jun 24, 2023
    Cite
    Damir Mikic (2023). 70+ Football Leagues Dataset 2019-2023 [Dataset]. https://www.kaggle.com/datasets/takidaki/70-football-leagues-data-2019-2023/suggestions
    Available download formats: zip (6455320 bytes)
    Dataset updated
    Jun 24, 2023
    Authors
    Damir Mikic
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Title: 70 Football Leagues Data (2019-2023)

    Dataset Description: This dataset provides comprehensive data on 70 football leagues from various countries around the world. The dataset covers the period from 2019 to 2023, offering a rich collection of football-related information for data analysis, research, and visualization purposes.

    Content: The dataset contains a wealth of football-related data, including match statistics, team information, player details, and league standings. The dataset covers a diverse range of leagues, encompassing top-tier competitions as well as lower divisions, allowing users to explore football data at various levels.

    Key Features:

    - Match Results
    - Home Goals
    - Away Goals
    - Home Goals in First Half
    - Away Goals in First Half
    - Match Odds for 1X2 and O/U 2.5 Goals
    - Total Goals in the Match
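
    For readers unfamiliar with the betting notation in the feature list, the sketch below shows how the 1X2 result and the over/under 2.5 outcome follow from the home and away goal counts (1 = home win, X = draw, 2 = away win). The function name and the keys of the returned dictionary are hypothetical and are not column names from the dataset.

```python
def match_outcomes(home_goals, away_goals):
    """Derive the 1X2 result, total goals, and over/under 2.5 flag for one match."""
    total_goals = home_goals + away_goals
    if home_goals > away_goals:
        result = "1"  # home win
    elif home_goals == away_goals:
        result = "X"  # draw
    else:
        result = "2"  # away win
    return {
        "result_1x2": result,
        "total_goals": total_goals,
        "over_2_5": total_goals > 2.5,  # True when three or more goals were scored
    }


print(match_outcomes(2, 1))
print(match_outcomes(0, 0))
```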

    Potential Use Cases:
    - Statistical Analysis: Analyze match data, team performance, and player statistics to identify trends, patterns, and insights.
    - Predictive Modeling: Utilize historical data to build predictive models for match outcomes, goal predictions, or player performance.
    - Visualizations: Create visualizations, graphs, and charts to present key football data in an easily understandable format.

    Data Source: The data for this dataset is collected from reliable sources, including official football websites, sports news portals, and reputable football data providers. The dataset is carefully curated and quality-checked to ensure accuracy and reliability.

    Updates and Maintenance: The dataset will be periodically updated to include new seasons, leagues, and any necessary data corrections. User feedback and contributions are welcome to improve the dataset and keep it up-to-date.

    Disclaimer: While utmost care has been taken to ensure the accuracy and reliability of the data, errors or inconsistencies may still exist. Users are encouraged to verify the data with official sources before making any critical decisions based on the dataset.

    Acknowledgments: We would like to acknowledge the contributions of the data providers, football organizations, and sports enthusiasts whose efforts have made this dataset possible. Their dedication to collecting and sharing football data is greatly appreciated.

    Note: Please be respectful of the data usage policy and terms of service of the dataset. Use the data responsibly and ensure compliance with any applicable legal requirements.

  6. QASports

    • huggingface.co
    Cite
    Pedro Calciolari Jardim, QASports [Dataset]. https://huggingface.co/datasets/PedroCJardim/QASports
    Authors
    Pedro Calciolari Jardim
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Dataset Summary

    QASports is the first large sports-themed question answering dataset, with over 1.5 million questions and answers about 54k preprocessed wiki pages. Its documents come from the wikis of three of the most popular sports in the world: soccer, American football, and basketball. Each sport can be downloaded individually as a subset with train, test, and validation splits, or all three can be downloaded together.

    🎲 Complete dataset: https://osf.io/n7r23/ 🔧 Processing scripts:… See the full description on the dataset page: https://huggingface.co/datasets/PedroCJardim/QASports.

  7. NFL Game Data: Scores & Plays (2017-2025)

    • kaggle.com
    zip
    Updated Feb 10, 2025
    Cite
    Keoni Mortensen (2025). NFL Game Data: Scores & Plays (2017-2025) [Dataset]. https://www.kaggle.com/datasets/keonim/nfl-game-scores-dataset-2017-2023/versions/31
    Available download formats: zip (24653407 bytes)
    Dataset updated
    Feb 10, 2025
    Authors
    Keoni Mortensen
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Summary

    This dataset contains detailed information from every game listed on the NFL's official website, https://www.nfl.com/. It aims to provide a complete record of scores along with play-by-play data across all available seasons. This dataset was created with the hope of being a valuable resource for sports analysts and data scientists interested in American football statistics. The dataset was last updated on 02/10/2025.

    Data Collection

    The data was collected using a custom web scraper, which is openly available for review and further development. You can access the scraper code and documentation at the following GitHub repository: https://github.com/KeoniM/NFL_Scraper.git

    Dataset Features For Scores
    - Season: The NFL season the game belongs to.
    - Week: Specific week of the NFL season.
    - GameStatus: Current state or final status of the game.
    - Day: Day of the week the game was played.
    - Date: Exact date (month and day) of the game.
    - AwayTeam: Name of the visiting team.
    - AwayRecord: Season record of the away team at the time of the game.
    - AwayScore: Total points scored by the away team.
    - AwayWin: Boolean indicator if the away team won the game.
    - HomeTeam: Name of the home team.
    - HomeRecord: Season record of the home team at the time of the game.
    - HomeScore: Total points scored by the home team.
    - HomeWin: Boolean indicator if the home team won the game.
    - AwaySeeding: Playoff seeding of the away team, if applicable.
    - HomeSeeding: Playoff seeding of the home team, if applicable.
    - PostSeason: Boolean indicating whether the game is a postseason match.

    Dataset Features For Plays
    - Season: The NFL season the play belongs to.
    - Week: Specific week of the NFL season.
    - Day: Day of the week the play was attempted.
    - Date: Exact date (month and day) the play was attempted.
    - AwayTeam: Name of the visiting team.
    - HomeTeam: Name of the home team.
    - Quarter: The quarter of the game in which the play was attempted.
    - DriveNumber: The drive number within the quarter in which the play was attempted.
    - TeamWithPossession: Team with possession that attempted the play.
    - IsScoringDrive: Whether the drive resulted in a score.
    - PlayNumberInDrive: Play number within the drive.
    - IsScoringPlay: Whether the play resulted in a score.
    - PlayOutcome: Short summary of the attempted play.
    - PlayDescription: In-depth summary of the attempted play.
    - PlayStart: Starting point on the field of the attempted play.
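
    Because the scores and plays tables share the Season, Week, AwayTeam, and HomeTeam columns, plays can in principle be matched to their game's final score. The sketch below illustrates this with the standard library only. It assumes that those four columns together identify a single game, which the description does not confirm; the sample rows, team names, and the helper name attach_final_scores are hypothetical.

```python
import csv
import io

# Hypothetical sample rows using a subset of the documented columns.
SCORES_CSV = """Season,Week,AwayTeam,HomeTeam,AwayScore,HomeScore
2024,1,Team A,Team B,17,24
2024,1,Team C,Team D,30,27
"""

PLAYS_CSV = """Season,Week,AwayTeam,HomeTeam,Quarter,TeamWithPossession
2024,1,Team A,Team B,1,Team B
2024,1,Team A,Team B,2,Team A
2024,1,Team C,Team D,4,Team C
"""

# Assumption: these four shared columns uniquely identify one game.
GAME_KEY = ("Season", "Week", "AwayTeam", "HomeTeam")


def attach_final_scores(plays_text, scores_text):
    """Return each play as a dict with its game's AwayScore and HomeScore added."""
    scores_by_game = {
        tuple(row[k] for k in GAME_KEY): row
        for row in csv.DictReader(io.StringIO(scores_text))
    }
    joined = []
    for play in csv.DictReader(io.StringIO(plays_text)):
        game = scores_by_game[tuple(play[k] for k in GAME_KEY)]
        joined.append({
            **play,
            "AwayScore": int(game["AwayScore"]),
            "HomeScore": int(game["HomeScore"]),
        })
    return joined


plays_with_scores = attach_final_scores(PLAYS_CSV, SCORES_CSV)
print(plays_with_scores[0])
```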

    Follow My Data Cleaning Journey

    If you're interested in following my process of refining and cleaning this dataset, check out my Google Colab notebook on GitHub, where I share ongoing updates and insights: https://github.com/KeoniM/NFL_Data_Cleaning.git. The notebook includes data wrangling techniques, code snippets, and continuous improvements, making this dataset even more valuable for analysis.

    Usage Notes

    This dataset is intended for academic and research purposes. Users are encouraged to attribute data to the source https://www.nfl.com/ when employing this dataset in their projects or publications.

  8. Comparison of video-based and sensor-based head impact exposure

    • plos.figshare.com
    tiff
    Updated Jun 3, 2023
    Cite
    Calvin Kuo; Lyndia Wu; Jesus Loza; Daniel Senif; Scott C. Anderson; David B. Camarillo (2023). Comparison of video-based and sensor-based head impact exposure [Dataset]. http://doi.org/10.1371/journal.pone.0199238
    Available download formats: tiff
    Dataset updated
    Jun 3, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Calvin Kuo; Lyndia Wu; Jesus Loza; Daniel Senif; Scott C. Anderson; David B. Camarillo
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Previous research has sought to quantify head impact exposure using wearable kinematic sensors. However, many sensors suffer from poor accuracy in estimating impact kinematics and count, motivating the need for additional independent impact exposure quantification for comparison. Here, we equipped seven collegiate American football players with instrumented mouthguards, and video recorded practices and games to compare video-based and sensor-based exposure rates and impact location distributions. Over 50 player-hours, we identified 271 helmet contact periods in video, while the instrumented mouthguard sensor recorded 2,032 discrete head impacts. Matching video and mouthguard real-time stamps yielded 193 video-identified helmet contact periods and 217 sensor-recorded impacts. To compare impact locations, we binned matched impacts into frontal, rear, side, oblique, and top locations based on video observations and sensor kinematics. While both video-based and sensor-based methods found similar location distributions, our best method utilizing integrated linear and angular position only correctly predicted 81 of 217 impacts. Finally, based on the activity timeline from video assessment, we also developed a new exposure metric unique to American football quantifying number of cross-verified sensor impacts per player-play. We found significantly higher exposure during games (0.35, 95% CI: 0.29–0.42) than practices (0.20, 95% CI: 0.17–0.23) (p

  9. Fantasy Football Data

    • kaggle.com
    zip
    Updated May 14, 2020
    Cite
    Nick (2020). Fantasy Football Data [Dataset]. https://www.kaggle.com/nickdehart/fantasydata
    Available download formats: zip (11334515 bytes)
    Dataset updated
    May 14, 2020
    Authors
    Nick
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Context

    I play fantasy football.

    Content

    Contains fantasy data from 2016-2019

    Inspiration

    Have fun.

  10. NFL Wide Receiver Career Stats and Aging Trends

    • kaggle.com
    zip
    Updated Jan 15, 2023
    Cite
    The Devastator (2023). NFL Wide Receiver Career Stats and Aging Trends [Dataset]. https://www.kaggle.com/datasets/thedevastator/nfl-wide-receiver-career-stats-and-aging-trends/discussion
    Available download formats: zip (197479 bytes)
    Dataset updated
    Jan 15, 2023
    Authors
    The Devastator
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    NFL Wide Receiver Career Stats and Aging Trends

    Examining Target, Yards-Per-Attempt, and Overall Rating Data

    By FiveThirtyEight

    About this dataset

    This repository contains a comprehensive database on the careers of NFL wide receivers, examining their performance over time to offer insights into physical changes and playing style over the years. With data stretching back to 1990, it covers key stats and ratings (including age_from/age_to, trypg_change, career_try/career_ranypa/career_wowy, and bcs_rating) for football fans looking to understand the history and evolution of this position in American football. This dataset is made available under the Creative Commons Attribution 4.0 International License as well as the MIT License, in the hope of facilitating more public understanding and transparency on this subject. We invite anyone who finds it useful to share their stories by contacting us at andrei.scheinkman@fivethirtyeight.com

    How to use the dataset

    To get started with this dataset:
    - Read through the columns to understand what is being measured and how it relates to an individual player's performance.
    - Explore the data by filtering it in different ways (for example, looking only at highly rated players, or comparing older players with younger ones).
    - Look for patterns in how traits such as age affect a player's performance over time by creating graphs or other visualizations.
    - Use these findings to draw your own conclusions about NFL wide receiver aging curves, or about how teams might scout players at different career stages, from rookies through retiring veterans.

    Research Ideas

    • Analyzing the performance of NFL wide receivers over time by comparing their age-from and age-to stats.
    • Comparing the AV rating of NFL wide receivers to their total career receiving yards per attempt.
    • Comparing the career wowy stats of NFL wide receivers to their total career targets in order to assess efficiency levels across different players

    Acknowledgements

    If you use this dataset in your research, please credit the original authors.

    License

    License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.

    Columns

    File: try-per-game-aging-curve.csv

    | Column name  | Description                                                                                                 |
    |:-------------|:------------------------------------------------------------------------------------------------------------|
    | age_from     | Age when the career started. (Integer)                                                                      |
    | age_to       | Age when the career ended. (Integer)                                                                        |
    | trypg_change | Change in the wide receiver's total receiving yards per game from the start to end of their career. (Float) |

    File: advanced-historical.csv

    | Column name   | Description                                                |
    |:--------------|:-----------------------------------------------------------|
    | player_name   | Name of the NFL wide receiver. (String)                    |
    | career_try    | Total number of career targets. (Integer)                  |
    | career_ranypa | Average number of receiving yards per attempt. (Float)     |
    | career_wowy   | Average number of yards per target. (Float)                |
    | bcs_rating    | Player's overall rating according to BCS system. (Integer) |
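
    One possible way to explore the aging-curve file is to average trypg_change for each (age_from, age_to) pair. The sketch below does this with the standard library, using the three column names documented for try-per-game-aging-curve.csv. The sample values are hypothetical and are not taken from the file, and the helper name mean_change_by_age is made up for illustration.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Hypothetical rows with the three documented columns.
SAMPLE_CSV = """age_from,age_to,trypg_change
22,23,5.0
22,23,3.0
30,31,-4.0
30,31,-6.0
"""


def mean_change_by_age(csv_text):
    """Average trypg_change for each (age_from, age_to) pair."""
    changes = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        ages = (int(row["age_from"]), int(row["age_to"]))
        changes[ages].append(float(row["trypg_change"]))
    return {ages: mean(values) for ages, values in changes.items()}


curve = mean_change_by_age(SAMPLE_CSV)
print(curve)
```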

    Acknowledgements

    If you use this dataset in your research, please credit FiveThirtyEight.

  11. Head impacts per play exposure metric.

    • plos.figshare.com
    xls
    Updated Jun 1, 2023
    Cite
    Calvin Kuo; Lyndia Wu; Jesus Loza; Daniel Senif; Scott C. Anderson; David B. Camarillo (2023). Head impacts per play exposure metric. [Dataset]. http://doi.org/10.1371/journal.pone.0199238.t002
    Available download formats: xls
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Calvin Kuo; Lyndia Wu; Jesus Loza; Daniel Senif; Scott C. Anderson; David B. Camarillo
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Head impacts per play exposure metric.

  12. Previous exposure studies.

    • plos.figshare.com
    xls
    Updated Jun 18, 2023
    Cite
    Calvin Kuo; Lyndia Wu; Jesus Loza; Daniel Senif; Scott C. Anderson; David B. Camarillo (2023). Previous exposure studies. [Dataset]. http://doi.org/10.1371/journal.pone.0199238.t001
    Available download formats: xls
    Dataset updated
    Jun 18, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Calvin Kuo; Lyndia Wu; Jesus Loza; Daniel Senif; Scott C. Anderson; David B. Camarillo
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Previous exposure studies.

  13. Sofascore and Transfermarkt Football Data

    • kaggle.com
    zip
    Updated Oct 21, 2024
    Cite
    Felipe Sembay (2024). Sofascore and Transfermarkt Football Data [Dataset]. https://www.kaggle.com/datasets/felipesembay/sofascore-and-transfermarkt-football-data
    Available download formats: zip (3497699 bytes)
    Dataset updated
    Oct 21, 2024
    Authors
    Felipe Sembay
    License

    http://opendatacommons.org/licenses/dbcl/1.0/

    Description

    [en-us] This dataset gathers detailed information on the performance of football players and their market values, collected from two widely recognized sources in the sports world: Sofascore and Transfermarkt. The dataset includes a combination of on-field performance metrics and financial data related to the players' market valuation.

    The performance data is derived from Sofascore, which provides detailed statistics on player performances in various competitions, including goals, assists, completed passes, tackles, and other performance indicators. Meanwhile, the players' market value information is sourced from Transfermarkt, a leading platform that tracks market fluctuations, the highest market values reached by players, and contract expiration dates.

    This dataset is ideal for analyses involving the relationship between sports performance and market value, allowing insights into how on-field performance can impact players’ market value. It is useful for sports analysts, researchers, and enthusiasts looking to explore trends in football, observe the valuation of players over time, and make comparisons between leagues and competitions.

    The dataset contains the following tables:
    - market_value: History of the player's contract values.
    - partidas_sofascore: Game dates, championships, and match IDs.
    - performance_tm: Some player statistics collected from the Transfermarkt website.
    - players_tm: Information related to the club, URL, and player ID on Transfermarkt.
    - statistics_game: Game statistics, with total values and values for the first and second halves.
    - statistics_player: Individual player statistics.

    The championships collected are:
    - Campeonato Brasileiro Série A and B
    - Copa do Brasil
    - Copa Sudamericana
    - Copa Libertadores

    The data runs through 2024-10-12.

    At this initial stage, data has been extracted from championships related to Brazil and South America. More data on other European and South American championships will be added soon.

  14. 2025 Summer Football Transfer Window

    • kaggle.com
    zip
    Updated Jun 23, 2025
    Cite
    abdellah maghous (2025). 2025 Summer Football Transfer Window [Dataset]. https://www.kaggle.com/datasets/abdellahmaghous/2025-summer-football-transfer-window
    Explore at:
    zip (35170 bytes)
    Available download formats
    Dataset updated
    Jun 23, 2025
    Authors
    abdellah maghous
    Description

    This dataset presents a comprehensive overview of all football transfers completed during the Summer 2025 transfer window, just before the FIFA Club World Cup 2025. It was scraped from Transfermarkt (https://www.transfermarkt.fr), one of the most reliable and up-to-date sources for football transfer data.

    It includes over 1,200 player moves across various leagues and countries, capturing transfers involving top clubs, rising talents, and strategic loans across the globe.

    📦 Dataset Summary

    Total transfers: 1,208

    Transfer window: Summer 2025 (before FIFA Club World Cup 2025)

    Source: Scraped from Transfermarkt

    Coverage: Global (Europe, South America, Africa, Asia, North America...)

    📑 Columns Description

    - name: Player's full name
    - position: Playing position (e.g., Striker, Goalkeeper, Midfielder, etc.)
    - age: Player's age at the time of transfer
    - market_value: Estimated market value before the transfer (as listed by Transfermarkt)
    - country_from: Origin country of the club the player is leaving
    - league_from: Origin league
    - club_from: Club the player is leaving
    - country_to: Destination country
    - league_to: Destination league
    - club_to: Club the player is joining
    - fee: Transfer fee (can be free, undisclosed, or in euros)
    - loan: Boolean flag indicating whether the move is a loan (True/False)

    📊 Use Cases

    Track transfer market trends across countries and leagues

    Analyze market value vs. transfer fee

    Explore position-based transfer patterns

    Study the most active clubs or leagues

    Build predictive models: Who is likely to transfer where, at what value?

    Visualize global player flows during a transfer window

    🧠 Example Ideas for Analysis

    What positions are most frequently transferred?

    Which leagues spend the most per player?

    How often do transfers occur between specific countries?

    How many loan deals vs. permanent moves?

    How do transfer fees correlate with market values by age group?

    📌 Notes

    All data was collected manually via web scraping and cleaned using pandas.

    Currency in market_value and fee may need to be parsed into numeric values for quantitative analysis.

    Some entries may include "undisclosed" or "free transfer" as values for fee.
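
    The parsing step mentioned in the notes could look roughly like the sketch below. It assumes values are written in a form such as "€12.50m" or "€500k", which is a guess at the scraped format rather than something the dataset description confirms; the function name `parse_euro_amount` and the choice to map free transfers to 0 are also assumptions made for illustration.

```python
import re
from typing import Optional

# Multipliers for the suffixes this sketch assumes (e.g. "€1.5m", "€500k").
_SUFFIX = {"k": 1_000, "m": 1_000_000, "bn": 1_000_000_000}


def parse_euro_amount(raw: str) -> Optional[float]:
    """Convert a string such as '€12.50m' to euros.

    Returns 0.0 for free transfers and None when no amount can be read
    (for example "undisclosed").
    """
    text = raw.strip().lower().replace("€", "").replace(" ", "")
    if text in {"free", "freetransfer"}:
        return 0.0
    match = re.fullmatch(r"(\d+(?:[.,]\d+)?)(k|m|bn)?", text)
    if match is None:
        return None
    # Accept either "." or "," as the decimal separator.
    number = float(match.group(1).replace(",", "."))
    return number * _SUFFIX.get(match.group(2), 1)
```

    Returning None for unreadable values keeps "undisclosed" fees distinguishable from genuine zero-fee moves, which matters when comparing fee against market_value.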

  15. nfl-big-data-bowl-2021 Feather files

    • kaggle.com
    zip
    Updated Oct 15, 2020
    Cite
    Mathurin Aché (2020). nfl-big-data-bowl-2021 Feather files [Dataset]. https://www.kaggle.com/mathurinache/nflbigdatabowl2021-feather-files
    Explore at:
    zip (475231363 bytes)
    Available download formats
    Dataset updated
    Oct 15, 2020
    Authors
    Mathurin Aché
    Description

    When a quarterback takes a snap and drops back to pass, what happens next may seem like chaos. As offensive players move in various patterns, the defense works together to prevent successful pass completions and then to quickly tackle receivers that do catch the ball. In this year’s Kaggle competition, your goal is to use data science to better understand the schemes and players that make for a successful defense against passing plays.

    In American football, there are a plethora of defensive strategies and outcomes. The National Football League (NFL) has used previous Kaggle competitions to focus on offensive plays, but as the old proverb goes, “defense wins championships.” Though metrics for analyzing quarterbacks, running backs, and wide receivers are consistently a part of public discourse, techniques for analyzing the defensive part of the game lag behind. Identifying player, team, or strategic advantages on the defensive side of the ball would be a significant breakthrough for the game.

    This competition uses NFL’s Next Gen Stats data, which includes the position and speed of every player on the field during each play. You’ll employ player tracking data for all drop-back pass plays from the 2018 regular season. The goal of submissions is to identify unique and impactful approaches to measure defensive performance on these plays. There are several different directions for participants to ‘tackle’ (ha)—which may require levels of football savvy, data aptitude, and creativity. As examples:

    - What are coverage schemes (man, zone, etc.) that the defense employs? What coverage options tend to be better performing?
    - Which players are the best at closely tracking receivers as they try to get open?
    - Which players are the best at closing on receivers when the ball is in the air?
    - Which players are the best at defending pass plays when the ball arrives?
    - Is there any way to use player tracking data to predict whether or not certain penalties – for example, defensive pass interference – will be called?
    - Who are the NFL’s best players against the pass?
    - How does a defense react to certain types of offensive plays?
    - Is there anything about a player – for example, their height, weight, experience, speed, or position – that can be used to predict their performance on defense?

    What does data tell us about defending the pass play? You are about to find out.
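
    As one illustration of the "closely tracking receivers" question, the sketch below computes, for a single frame of tracking data, the distance from each defender to the nearest offensive player. The record layout (`name`, `side`, `x`, `y` in yards) and the sample values are hypothetical and are not the schema of the competition files.

```python
import math
from typing import Dict, List, NamedTuple


class PlayerFrame(NamedTuple):
    """One player's position in one frame (hypothetical layout)."""
    name: str
    side: str   # "offense" or "defense"
    x: float    # yards along the field
    y: float    # yards across the field


def nearest_receiver_distance(frame: List[PlayerFrame]) -> Dict[str, float]:
    """Map each defender's name to the distance to the closest offensive player."""
    receivers = [p for p in frame if p.side == "offense"]
    defenders = [p for p in frame if p.side == "defense"]
    if not receivers:
        return {}
    return {
        d.name: min(math.hypot(d.x - r.x, d.y - r.y) for r in receivers)
        for d in defenders
    }


# Made-up positions for a single frame.
frame = [
    PlayerFrame("WR1", "offense", 30.0, 10.0),
    PlayerFrame("WR2", "offense", 35.0, 40.0),
    PlayerFrame("CB1", "defense", 33.0, 14.0),
    PlayerFrame("S1", "defense", 45.0, 40.0),
]
separation = nearest_receiver_distance(frame)
```

    Averaging this separation over the frames of a play would give one simple coverage-tightness measure; it says nothing about assignment, so a zone defender far from every receiver is not necessarily out of position.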

    Note: Are you a university participant? Students have the option to participate in a college-only Competition, where you’ll work on the identical themes above. Students can opt-in for either the Open or College Competitions, but not both.

  16. Football Matches from Europe and South America

    • kaggle.com
    zip
    Updated Oct 28, 2023
    Cite
    Ivan Posinovec (2023). Football Matches from Europe and South America [Dataset]. https://www.kaggle.com/datasets/ivanposinovec/football-matches-from-europe-and-south-america
    Explore at:
    zip (7146262 bytes)
    Available download formats
    Dataset updated
    Oct 28, 2023
    Authors
    Ivan Posinovec
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Area covered
    South America, Europe, Americas
    Description

    Content

    This dataset was created by scraping the football statistics site FBref. It includes more than 27,000 match reports from the top 5 European leagues and the top 2 South American leagues, along with domestic and international cup games.

    Europe
    - England: Premier League, EFL Cup and FA Cup (2015-2023).
    - France: Ligue 1, Coupe de France and Trophée des Champions (2018-2023); Coupe de la Ligue (2018-2020).
    - Italy: Serie A and Coppa Italia (2015-2023); Supercoppa Italiana (2016-2023).
    - Germany: Fußball-Bundesliga, DFB-Pokal and DFL-Supercup (2015-2023).
    - Spain: La Liga and Copa del Rey (2015-2023); Supercopa de España (2016-2023).
    - International: UEFA Champions League, UEFA Europa League, UEFA Super Cup (2015-2023); UEFA Europa Conference League (2022-2023).

    South America
    - Argentina: Argentine Primera División (2017-2022); Copa de la Liga Profesional (2021-2022).
    - Brazil: Campeonato Brasileiro Série A (2017-2022).
    - International: Copa Libertadores (2017-2022); Copa Sudamericana (2016-2022).

    For each game, the available data includes Match Info, Team Stats, Managers, Captains, Formations, Lineups and Player stats (players with at least 1 minute played). For domestic cup games, match reports are available only for rounds that include first-tier teams.

    Sources

    By using this repository, you are agreeing to Sports Reference LLC Terms of Use.

  17. NFL-QB-Shoulder-Injuries

    • kaggle.com
    zip
    Updated Mar 21, 2023
    Cite
    George Durrant (2023). NFL-QB-Shoulder-Injuries [Dataset]. https://www.kaggle.com/datasets/georgedurrant/nfl-qb-shoulder-injuries/data
    Explore at:
    zip (691056 bytes)
    Available download formats
    Dataset updated
    Mar 21, 2023
    Authors
    George Durrant
    Description

    Introduction

    Shoulder injuries are among the most common types of upper extremity injuries in both contact and noncontact sports. They are a significant source of morbidity for athletes, accounting for almost one third of all sports-related injuries (Enger). For these reasons, shoulder injuries and their post-healing metrics are an important area for research in orthopaedics.

    Shoulder injuries most commonly result from direct trauma or a fall onto the ipsilateral shoulder, making athletes especially prone to these types of injuries (Monica). Some of the most common injuries in this population include anterior and posterior glenohumeral instability, acromioclavicular pathology (including separation, osteolysis, and osteoarthritis), and rotator cuff tears (Gibbs). Acromioclavicular joint injuries are the most common among the athletic population, with an overall incidence rate of 9.2 per 1000 person-years and an average of 18.4 days lost per athlete (Pallis), followed by glenohumeral instability at 2.79 per 1000 person-years (Lanzi).

    With American football being a high-contact sport played at high speeds, the potential for shoulder injuries, from minor sprains to career-ending tears, is significant. Nearly half of players at the NFL Combine have reported a history of shoulder injury, with 34% requiring operative intervention (Kaplan). Quarterbacks are particularly affected by shoulder injuries because their position is targeted by the opposing team on every play and because of the throwing mechanics their role involves. Of all QB injuries reported, shoulder injuries are the second most common at 15.2% (Kelly).

    The purpose of this study was to determine (1) the general impact of shoulder injury on performance metrics among NFL quarterbacks and (2) the impact that surgical interventions to repair these injuries had on career outcomes, using measures such as passer rating, yards run, and successful passes. We hypothesized that quarterbacks in the National Football League who injure their dominant shoulder will (1) have decreased performance metrics after surgery and (2) have better performance metrics if they undergo surgery than if they do not.

    Methods

    Overview

    National Football League (NFL) Injured Reserve (IR) lists for the years 1980 to 2019 were pulled from Pro Sports Transactions and entries were queried to find quarterbacks who were placed on the IR with a shoulder injury.

    Fifty quarterbacks were found to have long-term shoulder injuries, and a subset of these was selected who had first-time shoulder surgery on their dominant, throwing arm. Manual searches were performed to verify the nature of the injury and determine dates of surgery. Age-matched controls were selected with the following criteria:

    - same years of experience
    - same number of career seasons +/- 5
    - same year of NFL play +/- 10

    Detailed

    Quarterbacks (QBs) who suffered a shoulder injury necessitating placement on the Injured Reserve (IR) list were identified. Placement on the IR indicates a long-term injury that leaves a player unable to play for the remainder of the season. Pro Sports Transactions IR data from 1980 to 2019 was extracted, and entries were filtered for injuries using the keywords "shoulder", "labrum", "rotator cuff", "dislocation", and "AC joint". An additional manual search of news articles from the NFL, official team websites, and reputable news sources was performed to confirm surgery types and dates and to obtain information about players who were placed on the IR without a description of their injury. A total of 65 relevant injuries were found. Of these, 14 were repeat injuries for the same player and 17 were injuries to the non-throwing arm; all of these were excluded. The remaining entries were excluded if the shoulder injury was characterized as a "bruise" or a "strain" and was therefore not serious enough to evaluate. Clavicle injuries were also excluded. Players who did not return to play in more than 1 regular season game were excluded from the performance analysis. A total of 19 QBs who received surgery and 11 QBs who suffered a severe shoulder injury but did not receive surgery were included.
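
    The keyword filtering described above can be sketched as follows. This is only an approximation under stated assumptions: the note strings are invented, the function name `select_candidate_notes` is hypothetical, and the sketch applies the inclusion and exclusion keywords in a single pass, whereas the study also relied on manual review.

```python
from typing import Iterable, List

# Keywords quoted in the methods text; matching is case-insensitive.
INCLUDE_KEYWORDS = ("shoulder", "labrum", "rotator cuff", "dislocation", "ac joint")
# Exclusions named in the text: bruises, strains, and clavicle injuries.
EXCLUDE_KEYWORDS = ("bruise", "strain", "clavicle")


def select_candidate_notes(notes: Iterable[str]) -> List[str]:
    """Keep notes that mention a shoulder keyword and no excluded term."""
    selected = []
    for note in notes:
        lowered = note.lower()
        if not any(keyword in lowered for keyword in INCLUDE_KEYWORDS):
            continue
        if any(keyword in lowered for keyword in EXCLUDE_KEYWORDS):
            continue
        selected.append(note)
    return selected


# Invented transaction notes, for illustration only.
notes = [
    "placed on IR with right shoulder injury (torn labrum)",
    "placed on IR with knee injury",
    "placed on IR with shoulder strain",
    "placed on IR with AC joint separation",
    "placed on IR with shoulder injury (fractured clavicle)",
]
candidates = select_candidate_notes(notes)
```

    Plain substring matching is deliberately loose (for instance, "dislocation" would also match a finger dislocation), which is one reason a manual check like the one the authors describe remains necessary.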

    QB performance statistics were extracted from Pro Football Reference, which includes statistics by game for players' entire careers. A total of 269 QBs from 1980 to 2020 were found and used as the entire NFL population of QBs. The performance statistics included were passer rating, passing yards, pass attempts, pass completions, pass completion percentage, passing touchdowns, interceptions, sacks, yards lost to sacks, yards per pass attempt, and adjusted yards per pass attempt. Performance statistics were included only if the player attempted more than 1 pass in a game, and statistics were averaged for each game.
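
    For readers unfamiliar with passer rating, the first metric listed above, the sketch below implements the standard NFL passer rating formula. The study took its statistics from Pro Football Reference and does not say it recomputed them, so this is background on the metric rather than the authors' code.

```python
def passer_rating(attempts: int, completions: int, yards: int,
                  touchdowns: int, interceptions: int) -> float:
    """Standard NFL passer rating for one game or one season (0 to about 158.3)."""
    if attempts <= 0:
        raise ValueError("passer rating is undefined without pass attempts")

    def clamp(value: float) -> float:
        # Each of the four components is limited to the range [0, 2.375].
        return max(0.0, min(value, 2.375))

    a = clamp((completions / attempts - 0.3) * 5)      # completion percentage
    b = clamp((yards / attempts - 3) * 0.25)           # yards per attempt
    c = clamp(touchdowns / attempts * 20)              # touchdown rate
    d = clamp(2.375 - interceptions / attempts * 25)   # interception rate
    return (a + b + c + d) / 6 * 100
```

    Because every component is clamped, very small samples can swing between extremes, which is consistent with the authors' decision to drop games in which the player attempted no more than 1 pass.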

    Unique age and experience matched controls were selected...


Cite
Max Horowitz (2018). Detailed NFL Play-by-Play Data 2009-2018 [Dataset]. https://www.kaggle.com/datasets/maxhorowitz/nflplaybyplay2009to2016

Detailed NFL Play-by-Play Data 2009-2018

nflscrapR-generated NFL dataset with expected points and win probability

Explore at:
zip (287411671 bytes)
Available download formats
Dataset updated
Dec 22, 2018
Authors
Max Horowitz
Description

Introduction

The lack of publicly available National Football League (NFL) data sources has been a major obstacle in the creation of modern, reproducible research in football analytics. While clean play-by-play data is available via open-source software packages in other sports (e.g. nhlscrapr for hockey; PitchF/x data in baseball; the Basketball Reference for basketball), the equivalent datasets are not freely available for researchers interested in the statistical analysis of the NFL. To solve this issue, a group of Carnegie Mellon University statistical researchers, including Maksim Horowitz, Ron Yurko, and Sam Ventura, built and released nflscrapR, an R package that uses an API maintained by the NFL to scrape, clean, parse, and output clean datasets at the individual play, player, game, and season levels. Using the data output by the package, the trio went on to develop reproducible methods for building expected point and win probability models for the NFL. The outputs of these models are included in this dataset and can be accessed using the nflscrapR package.

Content

The dataset made available on Kaggle contains all the regular season plays from the 2009-2016 NFL seasons. The dataset has 356,768 rows and 100 columns. Each play is broken down into great detail containing information on: game situation, players involved, results, and advanced metrics such as expected point and win probability values. Detailed information about the dataset can be found at the following web page, along with more NFL data: https://github.com/ryurko/nflscrapR-data.

Acknowledgements

This dataset was compiled by Ron Yurko, Sam Ventura, and myself. Special shout-out to Ron for improving our current expected points and win probability models and compiling this dataset. All three of us are proud founders of the Carnegie Mellon Sports Analytics Club.

Inspiration

This dataset is meant to both grow and bring together the sports analytics community by providing clean and easily accessible NFL data that has never been available on this scale for free.
