Use our trusted SportMonks Football API to build your own sports application and be at the forefront of football data today.
Our Football API is designed for iGaming, media, developers and football enthusiasts alike, ensuring you can create a football application that meets your needs.
Over 20,000 sports fanatics make use of our data. We know what data works best for you, so we ensured that our Football API has all the necessary tools you need to create a successful football application.
Livescores and schedules: Our Football API features extremely fast livescores and up-to-date season schedules, meaning your app will be the first to notify its customers about a goal scored. This also works to further improve the look and feel of your website.
Statistics and line-ups: We offer various kinds of football statistics, ranging from (live) player statistics to team, match and season statistics. And that’s not all - we also provide pre-match lineups for all important leagues.
Coverage and historical data: Our Football API covers over 1,200 leagues, all managed by our in-house scouts and data platform. That means there’s up to 14 years of historical data available.
Bookmakers and odds: Build your football sportsbook, odds comparison or betting portal with our pre-match and in-play odds collated from all major bookmakers and markets.
TV Stations and highlights: Show your customers where football games are broadcast and provide video highlights of major match events.
Standings and topscorers: Enhance your football website with standings and live standings, and allow your customers to see the top scorers and the current season standings.
https://creativecommons.org/publicdomain/zero/1.0/
Most publicly available football (soccer) statistics are limited to aggregated data such as goals, shots, fouls and cards. When assessing performance or building predictive models, this simple aggregation, without any context, can be misleading. For example, a team that produced 10 shots on target from long range has a lower chance of scoring than a club that produced the same number of shots from inside the box. However, metrics derived from a simple count of shots will assess the two teams as equal.
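The shot-count problem can be made concrete with a small sketch. The per-shot scoring probabilities below are hypothetical values picked only for illustration; they are not taken from this dataset or from any fitted model. Summing per-shot probabilities is the usual way to turn shot-level estimates into a team-level expected goals (xG) figure.

```python
# Hypothetical per-shot scoring probabilities, chosen only for illustration.
LONG_RANGE_XG = 0.03
INSIDE_BOX_XG = 0.12


def expected_goals(shot_probabilities):
    """Sum per-shot scoring probabilities into a team-level xG total."""
    return sum(shot_probabilities)


team_a_shots = [LONG_RANGE_XG] * 10  # ten shots on target from long range
team_b_shots = [INSIDE_BOX_XG] * 10  # ten shots on target from inside the box

team_a_xg = expected_goals(team_a_shots)
team_b_xg = expected_goals(team_b_shots)

# Both teams have the same raw shot count ...
print(len(team_a_shots), len(team_b_shots))
# ... but the context-aware totals differ.
print(round(team_a_xg, 2), round(team_b_xg, 2))
```

A raw shot count treats both teams identically, while the probability-weighted total separates them.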
A football game generates hundreds of events and it is very important and interesting to take into account the context in which those events were generated. This incredibly rich data set should keep football analytics enthusiasts awake for long hours as the size of the data set and number of questions that can be asked is huge.
There are 4 main files containing the data:
1) Competition data: Contains the competition id, competition name, season id, season name, country and gender.
2) Match data: Match information for each match, including competition and season information, stadium and referee information, home and away team information, as well as the data version the match was collected under.
3) Lineup data: Records the lineup information for the players, managers and referees involved with each match. The following variables are collected in the lineups of each match: team id, team name and lineup. The lineup array is nested inside the lineup object and contains, for each team, the following information per player: player id, player name, player nickname, jersey number and country.
4) Event data: Event data comprises general attributes and event-specific attributes. General attributes are recorded for most event types, depending only on applicability. Event-specific attributes help describe the event type in more detail as well as describe the outcome of the event type.
The open data specification document in the doc folder describes the structure of the data along with all attributes in great detail. Take a look at this file for deeper understanding of the data.
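As a rough illustration of the nested lineup structure described above, the sketch below flattens a lineup document into one row per player. The snake_case key names (`team_id`, `player_nickname`, and so on), the sample values and the `flatten_lineups` helper are assumptions made for this example; the specification document remains the authoritative reference for the real field names and layout.

```python
import json

# Hypothetical lineup document. Key names and values are assumptions
# for illustration, not copied from the real data files.
SAMPLE_LINEUPS = """
[
  {"team_id": 1, "team_name": "Home FC",
   "lineup": [
     {"player_id": 101, "player_name": "Alex Example",
      "player_nickname": null, "jersey_number": 9},
     {"player_id": 102, "player_name": "Sam Sample",
      "player_nickname": "Sammy", "jersey_number": 4}
   ]},
  {"team_id": 2, "team_name": "Away United",
   "lineup": [
     {"player_id": 201, "player_name": "Chris Placeholder",
      "player_nickname": null, "jersey_number": 1}
   ]}
]
"""


def flatten_lineups(teams):
    """Turn the nested team -> lineup structure into one flat row per player."""
    rows = []
    for team in teams:
        for player in team["lineup"]:
            rows.append({
                "team_id": team["team_id"],
                "team_name": team["team_name"],
                "player_id": player["player_id"],
                # Prefer the nickname when present, otherwise the full name.
                "display_name": player.get("player_nickname") or player["player_name"],
                "jersey_number": player["jersey_number"],
            })
    return rows


rows = flatten_lineups(json.loads(SAMPLE_LINEUPS))
for row in rows:
    print(row)
```

Flat rows of this kind are easier to join against match and event data than the nested form.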
This data is from the StatsBomb Open Data repository. StatsBomb are committed to sharing new data and research publicly to enhance understanding of the game of Football. They want to actively encourage new research and analysis at all levels. Therefore they have made certain leagues of StatsBomb Data freely available for public use for research projects and genuine interest in football analytics.
There are many questions we can ask with such detailed event data. Here are just a few examples:
What is the value of a shot? In other words, what is the probability of a shot being a goal given its location, shooter, league, assist method, game state, number of players on the pitch and time? Models that answer this are known as expected goals (xG) models.
When are teams more likely to score?
Which teams are the best or sloppiest at holding the lead?
Which teams or players make the best use of set pieces?
How do players compare when they shoot with their weak foot versus their strong foot? Which players are ambidextrous?
Can we identify different styles of play (shooting from long range vs shooting from the box, crossing the ball vs passing the ball, use of headers)?
Which teams have a bias for attacking on a particular flank?
http://opendatacommons.org/licenses/dbcl/1.0/
This dataset contains detailed player performance statistics for the 2023-2024 season from the Big 5 European soccer leagues: Premier League, La Liga, Serie A, Bundesliga, and Ligue 1. The data has been meticulously scraped from FBref.com, a comprehensive source for soccer statistics.
I am passionate about soccer and have created this dataset in the hope that it can be useful for others who share my love for the game. Whether you're conducting analysis, building models, or just exploring player stats, I hope this dataset provides valuable insights and serves as a helpful resource.
https://creativecommons.org/publicdomain/zero/1.0/
My family has always been serious about fantasy football. I've managed my own team since elementary school. It's a fun reason to talk with each other on a weekly basis for almost half the year.
Ever since I was in 8th grade I've dreamed of building an AI that could draft players and choose lineups for me. I started off in Excel and have since worked my way up to more sophisticated machine learning. The one thing that I've been lacking is really good data, which is why I decided to scrape pro-football-reference.com for all recorded NFL player data.
From what I've been able to determine researching, this is the most complete public source of NFL player stats available online. I scraped every NFL player in their database going back to the 1940s. That's over 25,000 players who have played over 1,000,000 football games.
The scraper code can be found here. Feel free to use, alter, or contribute to the repository.
The data was scraped 12/1/17-12/4/17
When I uploaded this dataset back in 2017, I had two people reach out to me who shared my passion for fantasy football and data science. We quickly decided to band together to create machine-learning-generated fantasy football predictions. Our website is https://gridironai.com. Over the last several years, we've worked to add dozens of data sources to our data stream that's collected weekly. Feel free to use this scraper for basic stats, but if you'd like a more complete dataset that's updated every week, check out our site.
The data is broken into two parts. There is a players table where each player has been assigned an ID and a game stats table that has one entry per game played. These tables can be linked together using the player ID.
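A minimal sketch of linking the two tables on the player ID is shown below. The field names (`player_id`, `name`, `game_id`, `rushing_yards`) and the sample rows are hypothetical stand-ins, since the actual column names are not listed here.

```python
from collections import defaultdict

# Hypothetical rows standing in for the players table and the game stats table.
players = [
    {"player_id": 1, "name": "Player One"},
    {"player_id": 2, "name": "Player Two"},
]
game_stats = [
    {"player_id": 1, "game_id": "G1", "rushing_yards": 50},
    {"player_id": 1, "game_id": "G2", "rushing_yards": 70},
    {"player_id": 2, "game_id": "G1", "rushing_yards": 10},
]


def join_on_player_id(players, game_stats):
    """Attach each game row to its player record via the shared player ID."""
    names_by_id = {p["player_id"]: p["name"] for p in players}
    joined = []
    for game in game_stats:
        joined.append({**game, "name": names_by_id[game["player_id"]]})
    return joined


joined = join_on_player_id(players, game_stats)

# Example aggregate: total (hypothetical) rushing yards per player name.
totals = defaultdict(int)
for row in joined:
    totals[row["name"]] += row["rushing_yards"]
print(dict(totals))
```

Because the game stats table has one entry per game played, the join is one-to-many: each player record fans out to as many rows as games.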
Football is not only the most popular spectator sport in England and the wider United Kingdom (UK), but also the most popular team sport to participate in. Between November 2023 and November 2024, roughly 2.2 million people in England played the sport.
Football nation
England is home to the Premier League, which is not only the biggest football league but also the biggest and most successful sports league in the world. The country has many football fans who support famous clubs such as Manchester United, Liverpool FC, Arsenal FC or Manchester City.
Champions League
Some of these top-tier clubs compete in the UEFA Champions League with other high-division teams, primarily from the other ’Big Five’ football leagues in Europe: Germany, Spain, Italy and France. In 2023/24, Real Madrid came out as the victor, winning their 15th Champions League title.
Real Madrid CF was the most visited soccer club website worldwide as of June 2021, with over 1.5 million unique visitors per month. The website of Manchester United followed second in the list, with online traffic of more than 840 thousand visitors. All top ten websites included in the global ranking belong to European soccer clubs.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Here are a few use cases for this project:
Real-time match analysis: The "Football" model can be used to provide real-time insights and statistics about the ongoing match, such as ball possession percentages, player movements, goal attempts, successful corner kicks, and identification of goalkeepers making crucial saves.
Automated highlight generation: By identifying critical events like goals, corners, and exceptional goalkeeper saves, the model can automatically create highlight reels of important moments in a football match, saving content creators and broadcasters significant editing effort.
Performance analytics for teams and coaching staff: The model can be used to analyze and quantify individual player performance and team dynamics during a match, providing valuable insights for coaching staff to optimize strategies, identify strengths and weaknesses, and enhance team performance.
Enhanced fan engagement: With its ability to identify various elements of a football match, the model can be used to develop interactive applications and augmented reality solutions that engage fans and provide them with additional information, such as player statistics, goal breakdowns, or immersive replays of key events.
Referee decision support: The model can be integrated into a decision support system for referees, assisting with offside calls or other contentious decisions by providing accurate information about the positions of the ball, players, and goalkeepers during critical moments.
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
This dataset features player statistics from the 2024-2025 season across the top five European leagues, sourced from FBref. Automatically updated weekly.
It includes two files:
players_data-2024_2025.csv
– A comprehensive dataset with over 250 columns, covering detailed player statistics.
players_data_light-2024_2025.csv
– A streamlined version containing the most crucial attacking, passing, defending, and goalkeeping stats for each player.
Player – Player's name
Nation – Player's nationality
Pos – Position (FW, MF, DF, GK)
Squad – Club name
Comp – League
Age – Age of the player
Born – Year of birth
MP – Matches played
Starts – Games started
Min – Minutes played
90s – Number of full 90-minute matches played
Gls – Goals scored
Ast – Assists provided
G+A – Goals + Assists
xG – Expected goals
xAG – Expected assists
npxG – Non-penalty expected goals
G-PK – Goals excluding penalties
Tkl – Total tackles
TklW – Tackles won
Blocks – Blocks made
Int – Interceptions
Tkl+Int – Combined tackles and interceptions
Clr – Clearances
Err – Errors leading to goals
PrgP – Progressive passes
PrgC – Progressive carries
KP – Key passes (passes leading to a shot)
Cmp%_stats_passing – Pass completion percentage
Ast_stats_passing – Assists
xA – Expected assists
PPA – Passes into the penalty area
GA – Goals conceded
Saves – Saves made
Save% – Save percentage
CS – Clean sheets
CS% – Clean sheet percentage
PKA – Penalties faced
PKsv – Penalty saves
Touches – Total touches of the ball
Carries – Total ball carries
PrgR – Progressive runs (carries moving the ball forward significantly)
Mis – Miscontrols
Dis – Times dispossessed
CrdY – Yellow cards
CrdR – Red cards
PKwon – Penalties won
PKcon – Penalties conceded
Recov – Ball recoveries
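To show how a few of these columns combine, here is a small sketch that derives goal contributions, goals per 90 minutes and goals minus xG from the `Player`, `Min`, `Gls`, `Ast` and `xG` columns. The sample rows are invented for illustration, and the sketch assumes the numeric columns parse cleanly; values in the real CSV files may need cleaning first (for example missing entries or thousands separators).

```python
import csv
import io

# Invented sample rows using a subset of the column names listed above.
SAMPLE_CSV = """Player,Squad,Min,Gls,Ast,xG
Player A,Club X,1800,10,5,8.5
Player B,Club Y,900,6,1,4.0
Player C,Club Z,0,0,0,0.0
"""


def per_90(value, minutes):
    """Scale a season total to a per-90-minutes rate; None if no minutes played."""
    if minutes <= 0:
        return None
    return value * 90 / minutes


def summarise(row):
    minutes = int(row["Min"])
    goals = int(row["Gls"])
    assists = int(row["Ast"])
    return {
        "Player": row["Player"],
        "G+A": goals + assists,
        "Gls_per_90": per_90(goals, minutes),
        # Positive values mean the player scored more than xG suggests.
        "Gls_minus_xG": goals - float(row["xG"]),
    }


summaries = [summarise(row) for row in csv.DictReader(io.StringIO(SAMPLE_CSV))]
for summary in summaries:
    print(summary)
```

Per-90 rates make players with very different playing time comparable, which raw season totals do not.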
Open Government Licence 3.0 http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
Annual release of statistics for football-related arrests and football banning orders. Breakdowns provided are by offence, club supported, overseas arrests and arrests by location (inside/outside stadium).
Source agency: Home Office
Designation: Experimental Official Statistics
Language: English
Alternative title: Statistics on football-related arrests and football banning orders
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
1) PremierLeaguePlayersDataset: This dataset includes statistics ranging from general information such as the goals and assists in a season, to more precise statistics like key passes and dribble attempts. It also includes the player of the year for a given season. Interesting predictive analysis could be done with this attribute. This dataset ranges from the 02/03 season, to the 20/21 season.
2) League Standings: This dataset includes the final standings of a given season. The data ranges from the 10/11 season, to the 20/21. The attributes are the same you may find on the official Premier League site or Sky Sports site (where the data actually comes from)
3) Full Dataset: This dataset merges the two datasets described above. For a given player and season, you have the final ranking of his team. An interesting analysis would be to examine each player's involvement in his team's goals.
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
This dataset provides detailed information on the top 500 football players in 2024, including their market values, performance statistics, and demographics. Key features include:
https://creativecommons.org/publicdomain/zero/1.0/
This dataset provides detailed information on football (soccer) shots, capturing various contextual and technical aspects of each attempt. It is designed for sports analytics, machine learning models, and tactical analysis. It was created with the objective to generate a basic xG model.
http://opendatacommons.org/licenses/dbcl/1.0/
Welcome to the Premier League Match Statistics dataset! ⚽ This guide will help you understand the structure of the dataset, key variables, and how to make the most of the data for analysis and predictions.
This dataset contains detailed match statistics from the English Premier League, including final scores, team performance, player statistics, goals, assists, yellow and red cards, and more. It is ideal for analyzing team performance, predicting match outcomes, and exploring trends in football, and is valuable for football enthusiasts, data analysts, and developers of predictive models and machine learning projects.
The dataset consists of multiple columns, each representing different aspects of a match:
| Column Name | Description |
|---|---|
| Match_ID | Unique identifier for each match |
| Date | Match date (YYYY-MM-DD format) |
| Home_Team | Name of the home team |
| Away_Team | Name of the away team |
| Home_Goals | Goals scored by the home team |
| Away_Goals | Goals scored by the away team |
| Possession_% | Possession percentage of each team |
| Shots_On_Target | Number of shots on target |
| Yellow_Cards | Number of yellow cards given |
| Red_Cards | Number of red cards given |
| Player_of_Match | Best-performing player of the match |
Additional columns may provide more in-depth insights.
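As a starting point for outcome prediction, a target label can be derived from the `Home_Goals` and `Away_Goals` columns. The sketch below uses invented sample rows, and the `H`/`D`/`A` label encoding and the `match_outcome` helper are assumptions made for this example rather than columns of the dataset.

```python
import csv
import io

# Invented sample rows using column names from the table above.
SAMPLE_CSV = """Match_ID,Date,Home_Team,Away_Team,Home_Goals,Away_Goals
1,2024-08-17,Team A,Team B,2,1
2,2024-08-18,Team C,Team D,0,0
3,2024-08-19,Team B,Team C,1,3
"""


def match_outcome(home_goals, away_goals):
    """Label a match as a home win (H), draw (D) or away win (A)."""
    if home_goals > away_goals:
        return "H"
    if home_goals < away_goals:
        return "A"
    return "D"


outcomes = []
for row in csv.DictReader(io.StringIO(SAMPLE_CSV)):
    label = match_outcome(int(row["Home_Goals"]), int(row["Away_Goals"]))
    outcomes.append(label)
    print(row["Date"], row["Home_Team"], "vs", row["Away_Team"], "->", label)
```

A label like this can serve as the dependent variable for a classifier, with the remaining columns (or statistics from earlier matches) as features.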
Here are some ideas to explore using this dataset:
✅ Analyze team performance trends over different seasons.
✅ Predict match outcomes using machine learning models.
✅ Identify key players based on goals, assists, and ratings.
✅ Explore disciplinary records (yellow/red cards) for fair play analysis.
https://creativecommons.org/publicdomain/zero/1.0/
I had the need to collect Europe's top 5 leagues' dataset for my own undergraduate project. The idea was to eliminate human bias from the player scouting process.
More Details: https://github.com/Suwadith/Winning-Eleven-Scout-Evaluation-and-Analysis-to-Enhance-Football-Player-Recommendations-ML-Flask
This dataset contains league table data from 2009 - 2018. Leagues included: La Liga, Bundesliga, Serie A, Ligue 1, Premier League
This dataset was compiled from the https://www.whoscored.com website
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘NFL scores and betting data’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/tobycrabtree/nfl-scores-and-betting-data on 28 January 2022.
--- Dataset description provided by original source is as follows ---
National Football League historic game and betting info
National Football League (NFL) game results since 1966 with betting odds information since 1979. Dataset was created from a variety of sources including games and scores from a variety of public websites such as ESPN, NFL.com, and Pro Football Reference. Weather information is from NOAA data with NFLweather.com a good cross reference. Betting data was used from http://www.repole.com/sun4cast/data.html for 1978-2013 seasons. Pro-football-reference.com data was then cross referenced for betting lines and odds as well as weather data. From 2013 on betting data reflects lines available at sportsline.com.
Helpful sites with interest in football and sports betting include:
https://github.com/fivethirtyeight/nfl-elo-game
http://www.repole.com/sun4cast/data.html
https://www.pro-football-reference.com/
https://github.com/jp-wright/nfl_betting_market_analysis
http://www.aussportsbetting.com/data/historical-nfl-results-and-odds-data/
Can you build a predictive model to better predict NFL game outcomes and identify successful betting strategies?
--- Original source retains full ownership of the source dataset ---
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The presented data are used to determine how changes in teams’ efficiency affect the level of competitive balance in the top European football leagues. Data on team valuations were collected from Transfermarkt, while the number of goals and points were collected from the websites of the national leagues.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
This dataset consists of 22 JSON files, each representing a season of the Spanish Football League ("La Liga").
The dataset represents several hierarchically related elements; however, only the Match, Event and Player elements contain relevant information for analysis. The rest of the elements simply serve to keep the data structured by seasons and matchdays. The dataset collects information from several seasons between the years 2000 and 2022. The attributes of each of the elements that make up the dataset are described below:
Season: JSON documents represent a season, their root contains the following information:
competition: Name by which the competition is known
country: Country where the competition is held
season_id: Identifier of the season, example: Season 2021/22
season_url: Relative URL of the season's web page
rounds: List of Round elements, the days into which the championship is divided
Rounds: (or matchdays) Collection of matches:
number: Name of the matchday, e.g.: Matchday 1.
matches: List of Match elements, matches that are played on the same day/s of the championship.
Match: contains relevant match information.
id: Match identifier used at BeSoccer.com
status: Code representing the status of the match: Played (1), Not Played (0)
home_team: Name of the home team
away_team: Name of the away team
result: List of two integers representing the match score
date_time: Date and time at which the match started
referee: First and last name of the referee of the match
href: URL relative to the match page
home_tactic: Tactical arrangement of the home team, e.g.: 4-3-3
home_lineup: List of players in the starting lineup of the home team
home_bench: List of the home team's substitute players
away_tactic: Tactical arrangement of the away team, e.g. 4-3-3
away_lineup: List of players in the away team's starting lineup
away_bench: List of substitute players of the away team
Event: contains information that defines each of the relevant actions that occur during a soccer match. Events can be described by the following attributes:
player: Player identifier. Relative URL
team: Team of the player who participates in the event
minute: Minute of the match in which the event occurs
type: Event type (Enumeration)
Players: Player information:
name: First name
fullname: Player's full name
dob: Date of birth
country: Nationality
position: Position the player usually occupies: GOA (GoalKeeper), DF (Defender), MID (Midfielder), STR (Striker)
foot: Dominant Foot: Right-footed, Left-footed, Two-footed, Unknown
weight: Weight of player in kilograms
height: Player height in centimeters
elo: Measurement of the player's skills on a scale of 1 to 100
potential: Estimate of the maximum ELO that a player can reach on a scale of 1 to 100.
href: Relative URL of the player's record
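The Season, Round and Match hierarchy described above can be walked with two nested loops. The sketch below uses an invented season document that follows the attribute names listed; the sample values, the empty `result` for the unplayed match and the `played_matches` helper are assumptions for illustration. It sums the two entries of `result` so that it does not depend on which entry belongs to the home team.

```python
# Invented season document following the attribute names described above.
season = {
    "competition": "Example League",
    "country": "Spain",
    "season_id": "Season 2021/22",
    "season_url": "/example-season",
    "rounds": [
        {"number": "Matchday 1", "matches": [
            {"id": 1, "status": 1, "home_team": "Team A",
             "away_team": "Team B", "result": [2, 1]},
            # Assumed representation of a match that has not been played.
            {"id": 2, "status": 0, "home_team": "Team C",
             "away_team": "Team D", "result": []},
        ]},
        {"number": "Matchday 2", "matches": [
            {"id": 3, "status": 1, "home_team": "Team B",
             "away_team": "Team C", "result": [0, 0]},
        ]},
    ],
}


def played_matches(season):
    """Yield (matchday name, match) pairs for matches with status Played (1)."""
    for matchday in season["rounds"]:
        for match in matchday["matches"]:
            if match["status"] == 1:
                yield matchday["number"], match


summary = [
    (number, match["home_team"], match["away_team"], sum(match["result"]))
    for number, match in played_matches(season)
]
for line in summary:
    print(line)
```

The same traversal applies to each of the JSON files after loading it with `json.load`.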
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
We exploit the natural experimental setting provided by the Covid-19 lockdown to analyse how performance is affected by a friendly audience. Specifically, we use data on all football matches in the top-level competitions across France, Germany, Italy, Spain, and the United Kingdom over the 2019/2020 season. We compare the difference between the number of points gained by teams playing at home and teams competing away before the Covid-19 outbreak, when supporters could attend any match, with the same difference after the lockdown, when all matches took place behind closed doors. We find that the performance of the home team is halved when stadiums are empty. Further analyses indicate that offensive (defensive) actions taken by the home team are drastically reduced (increased) once games are played behind closed doors. The referee is affected too, as she changes her behaviour in games without spectators. Finally, the home advantage is entirely driven by teams that do not have international experience. Taken together, our findings corroborate the hypothesis that social pressure influences individual behaviour.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘Barclays Premiere League for last 12 seasons’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/lumierebatalong/english-premiere-league-team-datasets on 28 January 2022.
--- Dataset description provided by original source is as follows ---
The Barclays Premier League is the best league in the world 💯. It has 20 teams competing for the title. Among these 20 teams, 5 have already won the title in the last 12 seasons, namely Man City, Liverpool, Man United, Chelsea and Leicester, with two outsiders, Arsenal and Tottenham. Who is your favorite team and how can you predict their title victory for the current or next season? The ball is in your court 👀.
Notes for Football Data
All data is in csv format, ready for use within standard spreadsheet applications. Please note that some abbreviations are no longer in use and refer to data collected in earlier seasons. Each data contains last 12 seasons of English Premier League.
Key to results data:
Div = League Division
Date = Match Date (dd/mm/yy)
Time = Time of match kick off
HomeTeam = Home Team
AwayTeam = Away Team
FTHG and HG = Full Time Home Team Goals
FTAG and AG = Full Time Away Team Goals
FTR and Res = Full Time Result (H=Home Win, D=Draw, A=Away Win)
HTHG = Half Time Home Team Goals
HTAG = Half Time Away Team Goals
HTR = Half Time Result (H=Home Win, D=Draw, A=Away Win)
Match Statistics (where available)
Attendance = Crowd Attendance
Referee = Match Referee
HS = Home Team Shots
AS = Away Team Shots
HST = Home Team Shots on Target
AST = Away Team Shots on Target
HHW = Home Team Hit Woodwork
AHW = Away Team Hit Woodwork
HC = Home Team Corners
AC = Away Team Corners
HF = Home Team Fouls Committed
AF = Away Team Fouls Committed
HFKC = Home Team Free Kicks Conceded
AFKC = Away Team Free Kicks Conceded
HO = Home Team Offsides
AO = Away Team Offsides
HY = Home Team Yellow Cards
AY = Away Team Yellow Cards
HR = Home Team Red Cards
AR = Away Team Red Cards
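Using the `HomeTeam`, `AwayTeam` and `FTR` fields from the key above, a league points table can be built with the standard three points for a win and one for a draw. The match rows below are invented for illustration, and `league_points` is a hypothetical helper, not part of the dataset.

```python
from collections import defaultdict

# Invented match rows using field names from the key above.
matches = [
    {"HomeTeam": "Team A", "AwayTeam": "Team B", "FTHG": 2, "FTAG": 1, "FTR": "H"},
    {"HomeTeam": "Team B", "AwayTeam": "Team C", "FTHG": 0, "FTAG": 0, "FTR": "D"},
    {"HomeTeam": "Team C", "AwayTeam": "Team A", "FTHG": 1, "FTAG": 3, "FTR": "A"},
]

# (home points, away points) awarded for each full-time result code.
POINTS = {"H": (3, 0), "D": (1, 1), "A": (0, 3)}


def league_points(matches):
    """Accumulate league points per team from full-time results."""
    table = defaultdict(int)
    for match in matches:
        home_points, away_points = POINTS[match["FTR"]]
        table[match["HomeTeam"]] += home_points
        table[match["AwayTeam"]] += away_points
    return dict(table)


table = league_points(matches)
for team, points in sorted(table.items(), key=lambda item: -item[1]):
    print(team, points)
```

Running this over a full season of rows gives the final points totals, which can then be compared across the 12 seasons.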
I remove some features.
This dataset contains data for last 12 seasons of English Premier League. The dataset is sourced from http://www.football-data.co.uk/ website and contains various statistical data such as final and half time result, corners, yellow and red cards etc
Can you explain why Man United has not won the title for the last 12 seasons? Can you predict the victory of your favorite team in every championship game?
--- Original source retains full ownership of the source dataset ---