MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
This dataset contains detailed attributes of professional football players, curated from a FIFA-based database. It includes a wide range of physical, technical, and performance-related statistics for each player.
The dataset provides valuable insights for sports analysts, data scientists, machine learning practitioners, and football enthusiasts who are interested in exploring player performance, scouting analysis, player valuation modeling, and team formation strategy.
Each record in the dataset represents an individual football player and includes the following types of data:
Personal Information:
full_name, birth_date, age, height_cm, weight_kgs, nationality
Player Role & Identity:
positions, preferred_foot, national_team, national_team_position, national_rating, national_jersey_number
Overall & Potential Ratings:
overall_rating, potential, international_reputation(1-5)
Technical Attributes:
ball_control, dribbling, crossing, curve
finishing, volleys, shot_power, long_shots, penalties
short_passing, long_passing, vision
marking, standing_tackle, sliding_tackle, interceptions
Physical & Mental Attributes:
acceleration, sprint_speed, agility, strength, stamina, jumping, balance, aggression, composure, reactions
Special Skills:
skill_moves(1-5), weak_foot(1-5), freekick_accuracy, heading_accuracy
Economic Value:
value_euro, wage_euro, release_clause_euro
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This comprehensive dataset offers detailed information on approximately 17,000 FIFA football players, meticulously scraped from SoFIFA.com.
It encompasses a wide array of player-specific data points, including but not limited to player names, nationalities, clubs, player ratings, potential, positions, ages, and various skill attributes. This dataset is ideal for football enthusiasts, data analysts, and researchers seeking to conduct in-depth analysis, statistical studies, or machine learning projects related to football players' performance, characteristics, and career progressions.
This dataset is ideal for data analysis, predictive modeling, and machine learning projects. It can be used for:
Please adhere to the terms of service of SoFIFA.com and relevant data protection laws when using this dataset. The dataset is intended for educational and research purposes only and should not be used for commercial gain without proper authorization.
http://opendatacommons.org/licenses/dbcl/1.0/
What you get:
16th Oct 2016: New table containing teams' attributes from FIFA!
Original Data Source:
You can easily find data about soccer matches but they are usually scattered across different websites. A thorough data collection and processing has been done to make your life easier. I must insist that you do not make any commercial use of the data. The data was sourced from:
When you have a look at the database, you will notice foreign keys for players and matches are the same as the original data sources. I have called those foreign keys "api_id".
Improving the dataset:
You will notice that some players are missing from the lineup (NULL values). This is because I have not been able to source their attributes from FIFA. This will be fixed over time as the crawling algorithm is improved. The dataset will also be expanded to include international games, national cups, Champions League and Europa League. Please ask me if you're after a specific tournament.
Please get in touch with Hugo Mathien if you want to help improve this dataset.
CLICK HERE TO ACCESS THE PROJECT GITHUB
Important note for people interested in using the crawlers: since I first wrote the crawling scripts (in Python), it appears sofifa.com has changed its design, and with that come new requirements for the scripts. The existing script to crawl players ('Player Spider') will not work until I've updated it.
Exploring the data:
Now that's the fun part: there is a lot you can do with this dataset. I will be adding visuals and insights to this overview page, but please have a look at the kernels and give it a try yourself! Here are some ideas for you:
The Holy Grail... is obviously to predict the outcome of the game. The bookies use 3 classes (Home Win, Draw, Away Win). They get it right about 53% of the time. This is also what I've achieved so far using my own SVM. Though it may sound high for such a random sport, you've got to know that the home team wins about 46% of the time. So the baseline (constantly predicting Home Win) already has 46% accuracy.
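The constant-prediction baseline described above can be checked in a few lines. This is a minimal sketch, not the author's code: the function name and the toy sample (chosen only so that home wins make up 46% of it) are assumptions for illustration.

```python
from collections import Counter


def majority_baseline_accuracy(outcomes):
    """Return the most frequent class and the accuracy of always predicting it."""
    counts = Counter(outcomes)
    label, count = counts.most_common(1)[0]
    return label, count / len(outcomes)


# Hypothetical toy sample, not taken from the database:
# 46 home wins, 26 draws, 28 away wins.
sample = ["H"] * 46 + ["D"] * 26 + ["A"] * 28
label, acc = majority_baseline_accuracy(sample)
print(label, round(acc, 2))
```

Any classifier trained on the match table is only interesting to the extent that it beats this number on held-out games.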
Probabilities vs Odds
When running a multi-class classifier like an SVM, you could also output a probability estimate and compare it to the betting odds. Look at how far your probabilities deviate from the odds and see for which games you had very different predictions.
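One way to make that comparison concrete is to convert the odds into implied probabilities first. The sketch below assumes decimal odds (the source does not say which odds format the database stores) and removes the bookmaker margin by simple normalization; the function names and the example numbers are hypothetical.

```python
def implied_probabilities(decimal_odds):
    """Convert decimal odds to probabilities that sum to 1.

    Assumes decimal odds; the raw inverses sum to more than 1 because of
    the bookmaker margin, so they are rescaled.
    """
    raw = {outcome: 1.0 / odd for outcome, odd in decimal_odds.items()}
    total = sum(raw.values())
    return {outcome: p / total for outcome, p in raw.items()}


def largest_disagreement(model_probs, decimal_odds):
    """Return the outcome where the model and the odds differ most, with the gap."""
    implied = implied_probabilities(decimal_odds)
    gaps = {outcome: model_probs[outcome] - implied[outcome] for outcome in implied}
    outcome = max(gaps, key=lambda k: abs(gaps[k]))
    return outcome, gaps[outcome]


# Hypothetical odds and classifier output for a single match.
odds = {"home": 1.9, "draw": 3.8, "away": 3.8}
model = {"home": 0.6, "draw": 0.2, "away": 0.2}
implied = implied_probabilities(odds)
outcome, gap = largest_disagreement(model, odds)
```

Sorting matches by the absolute gap gives a shortlist of games where the model and the market disagree most.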
Explore and visualize features
With access to player and team attributes, team formations and in-game events, you should be able to produce some interesting insights into The Beautiful Game. Who knows, Guardiola himself may hire one of you some day!
Database released under Open Database License, individual papers copyright their original authors
Have you ever found yourself with a football dataset that almost had it all, but left you short of happiness? Time after time, promising datasets failed to deliver the statistics that truly matter: match events, player performances, team results, and season standings.
That time is over!
This in-depth football dataset, curated straight from a RapidAPI endpoint, brings you the data points we've all been waiting for. From fixtures and injuries to goals, assists, and tactical breakdowns, this dataset unlocks the full picture of the beautiful game.
What You Get
- Fixture Stats & Events: Goals, assists, fouls, and match-defining moments across leagues up to 2024.
- Player Performances: From tackles to dribbles, passes, and shots, every stat that makes a difference.
- Season Stats & League Standings: Discover how teams dominate, stumble, or rise to glory each season.
- Team Insights: Analyze home/away performance, goal-scoring patterns, and defensive strengths.
- Match Highlights: Real-time events like own goals, red cards, and critical substitutions.
- Injuries & Suspensions: Missing players and their impact on team dynamics.
- Iconic Stadiums: Explore venues, capacities, and surfaces that set the stage for football's greatest moments.
Why It's Exciting
This isn't just another football dataset; it's the ultimate resource for fans, analysts, and strategists who want to dig deeper. Whether you're predicting outcomes, analyzing player form, or crafting the next big football insights project, you now have all the tools you need.
Get ready to unlock stories, trends, and insights like never before, because this time, the stats you actually care about are all here. Let's kick it off!
In terms of fixture stats for players, the endpoint provides data from 2015 up through the 2024 season and I plan to make one more update at the end of all league/cup seasons in June of 2025.
Disclaimer: This dataset is intended for non-commercial, academic purposes and does not infringe upon any intellectual property rights of the original data providers, including RapidAPI or associated sources. For full details, please refer to the respective terms of use provided by the data sources.
If you have questions about the data or simply want to connect, reach out on LinkedIn. If you plan on using this data for any type of analysis, please share it with me!
PS: I am a Ronaldo fan... Suiiiii !!!
Leagues/Cups in datasets:
- La Liga
- Ligue 1
- Serie A
- World Cup
- Bundesliga
- NWSL Women
- Pro League
- Championship League
- Copa America
- Premier League
- CONCACAF Gold Cup
- Euro Championship
- UEFA Europa League
- MLS
- Africa Cup Of Nations
- CONCACAF Champions League
https://creativecommons.org/publicdomain/zero/1.0/
This dataset investigates the impact of player injuries on team performance across Premier League clubs from 2019 to 2023, including Tottenham, Aston Villa, Brighton, Arsenal, Brentford, Everton, Burnley, and Manchester City. The dataset contains over 600 injury records, offering insights into how player absences influence match results and individual performance metrics.
Data Sources
- Transfer Market: Provided player injury records and durations.
- Football Critic: Offered player ratings for pre- and post-injury matches.
- Sky Sports: Supplemented additional match statistics and player performance data.
Dataset Overview
Each entry includes:
- Player Information: Name, position, age, FIFA rating (spanning five years).
- Injury Details: Type of injury, date of injury, date of return.
- Performance Data: Match results (win, draw, loss), opposition, and goal difference (GD) for three matches before the injury, during missed matches, and for three matches after the player's return.
- Player ratings for each match, before and after the injury.
Key Data Points
- Performance fluctuations around injury events.
- Match outcomes during player absences.
- Ratings of players over time to observe any decline or improvement post-injury.
This dataset is ideal for sports analytics, performance modeling, and evaluating the broader implications of player injuries on Premier League teams. Explore how injuries disrupt team dynamics and contribute to competitive outcomes in one of the world's top football leagues.
By Homeland Infrastructure Foundation [source]
This dataset provides detailed information on major sport venues, along with their usage and affiliations. It includes data related to the National Association for Stock Car Auto Racing, Indy Racing League, Major League Soccer, Major League Baseball, National Basketball Association, Women's National Basketball Association, National Hockey League, National Football League, PGA Tour, NCAA Division 1 FBS Football, NCAA Division 1 Basketball and thoroughbred horse racing. This dataset contains columns such as USE (the type of use for the venue), TEAM (the team associated with the venue), LEAGUE (the league associated with the venue), CONFERENCE (the conference associated with the venue), DIVISION (the division associated with the venue), INST_AFFIL (the institution affiliated with the venue), TRACK_TYPE (the type of track at a specific point in time or over its complete life cycle), LENGTH_MILEGE (the track length in miles), ROOF_TYPE (the type of roof covering used at a specific point in time or over its complete life cycle) and plenty of other variables. With this astounding range and quantity of data points, spanning countries across different continents and leagues, explore patterns in sports games you never even thought were possible!
The Major US Sports Venues Usage and Affiliations dataset includes data on major sports venues from leagues including National Association for Stock Car Auto Racing (NASCAR), Indy Racing League (IRL), Major League Soccer (MLS), Major League Baseball (MLB), National Basketball Association (NBA), Women's National Basketball Association (WNBA), National Hockey League (NHL), National Football League (NFL), PGA Tour, NCAA Division 1 FBS Football, NCAA Division 1 Basketball, and thoroughbred horse racing. The columns provided include
USE_, USE_POP, TEAM, LEAGUE, CONFERENCE, DIVISION, INST_AFFIL, TRACK_TYPE, LENGTH_MI, ROOF_TYPE, STADIUM_SH, ADDDATAE, USEWEBSITE, and COMMENTS. The USE_ column specifies the type of usage of each venue, which can be college athletics or professional athletics. The corresponding column, USE_POP, tells you how many people use each venue for a particular sport at a given time. For example, if 6 NHL games were being played that day, USE_ would say "professional Athletics" while USE_POP would state "NNN", reflecting that NNN people were spectating those events collectively. The next column is TEAM, which represents the team that sponsors or manages each venue, or the teams that play there.
Following on from TEAM is LEAGUE; here you can find out what league each team represents, such as MLB, NBA, etc. The next three columns, CONFERENCE/DIVISION/INST_AFFIL, provide more specific details as they extend to the collegiate level: CONFERENCE indicates which conference a team belongs to within its respective division, while INST_AFFIL states its affiliated school body, e.g. Southeastern Conference > University of Arkansas Razorbacks. Rounding out our overview, the last three columns, TRACK_TYPE/LENGTH
- Analyzing the affiliations and usage of different sports venues to determine which teams or leagues have the most presence across a certain geographic area.
- Comparing different stadiums within a given conference in terms of their roof type, track length, and stadium shape for optimal design features for new construction projects.
- Placing sponsorships or advertisements within each sporting arena based on audience size, league popularity, and team affiliation within a given conference or division
If you use this dataset in your research, please credit the original authors. Data Source
License: Dataset copyright by authors - You are free to: - Share - copy and redistribute the material in any medium or format for any purpose, even commercially. - Adapt - remix, transform, and build upon the material for any purpose, even commercially. - You must: - Give appropriate credit - Provide a link to the license, and indicate if changes were made. - ShareAlike - You must distribute your contribut...
https://creativecommons.org/publicdomain/zero/1.0/
The dataset contains every Pro Football Hall of Fame Inductee in NFL history. This includes player stats for rushing, passing and receiving.
As of the Class of 2022, there are a total of 362 members of the Hall of Fame. Members are referred to as "Gold Jackets" due to the distinctive gold jackets they receive during the induction ceremony. Between four and eight new inductees are normally enshrined every year.
http://opendatacommons.org/licenses/dbcl/1.0/
Content
Weekly Updates would include :
Data Source
Data was scraped from https://www.fifaindex.com/ first by getting player profile url set (as stored in PlayerNames.csv) and then scraping the individual pages for their attributes
Improvements
Important note for people interested in using the scraping script: The site is not uniform, so the script has to handle a lot of corner cases (e.g. different attributes appearing in interchanged positions). The script also contains proxy preferences, which may be removed if not required.
Exploring the data
For starters you can become a scout:
And that is just the beginning. This is the playground.. literally!
Data description
Inspiration
I am a huge FIFA fanatic. While playing career mode I realised that I picked great young players early on every single time and since a lot of digital learning relies on how our brain works, I thought scouting great qualities in players would be something that can be worked on. Since then I started working on scraping the website and here is the data. I hope we can build something on it.
With access to players attributes you can become the best scout in the world. Go for it!
https://creativecommons.org/publicdomain/zero/1.0/
This folder contains data behind the story The Football Hall Of Fame Has A Receiver Problem.
advanced-historical.csv contains advanced career stats for NFL receivers, 1932-2013.
| Header | Definition |
|---|---|
| pfr_player_id | Player identification code at Pro-Football-Reference.com |
| player_name | The player's name |
| career_try | Career True Receiving Yards |
| career_ranypa | Adjusted Net Yards Per Attempt (relative to average) of player's career teams, weighted by TRY w/ each team |
| career_wowy | The amount by which career_ranypa exceeds what would be expected from his QBs' (age-adjusted) performance without the receiver |
| bcs_rating | The number of yards per game by which a player would outgain an average receiver on the same team, after adjusting for teammate quality and age (update of http://www.sabernomics.com/sabernomics/index.php/2005/02/ranking-the-all-time-great-wide-receivers/) |
try-per-game-aging-curve.csv contains receiver aging curve definitions.
| Header | Definition |
|---|---|
| age_from | The age (as of December 31st) the player is moving from |
| age_to | The age (as of December 31st) the player is moving to |
| trypg_change | Expected change in TRY/game from one age-season to the next |
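To illustrate how the aging-curve table might be used, the sketch below walks a player's TRY/game forward one age-season at a time. It assumes that trypg_change is an additive change in TRY/game, which is one reading of the definition above; the function name and the sample rows are hypothetical, and the values are invented rather than taken from the CSV.

```python
def project_try_per_game(start_age, start_trypg, curve, end_age):
    """Apply per-season expected changes from start_age up to end_age.

    Assumes trypg_change is additive and that the curve has one row for
    every consecutive age pair in the requested range.
    """
    changes = {(row["age_from"], row["age_to"]): row["trypg_change"] for row in curve}
    value = start_trypg
    for age in range(start_age, end_age):
        value += changes[(age, age + 1)]
    return value


# Hypothetical rows shaped like try-per-game-aging-curve.csv (invented values).
curve = [
    {"age_from": 25, "age_to": 26, "trypg_change": 2.0},
    {"age_from": 26, "age_to": 27, "trypg_change": -1.0},
]
projected = project_try_per_game(25, 60.0, curve, 27)
```

With the real file, the rows could be read with `csv.DictReader` and the numeric fields converted before being passed in.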
This is a dataset from FiveThirtyEight hosted on their GitHub. Explore FiveThirtyEight data using Kaggle and all of the data sources available through the FiveThirtyEight organization page!
This dataset is maintained using GitHub's API and Kaggle's API.
This dataset is distributed under the Attribution 4.0 International (CC BY 4.0) license.
https://creativecommons.org/publicdomain/zero/1.0/
This extensive dataset offers a granular look at the penalties incurred by teams and players, providing a valuable resource for football enthusiasts, analysts, and researchers alike.
Explore the nuances of each NFL season, dissecting penalty types, frequency, and the teams and players most frequently penalized. Uncover trends, anomalies, and strategic shifts that have shaped the league's dynamic landscape over the years.
Whether you're an avid fan seeking a deeper understanding of your favorite team's discipline on the field or a data scientist in search of rich, reliable information for analytical purposes, this NFL Penalties Data delivers a comprehensive and insightful perspective into the intricate world of penalties in professional football. From false starts to pass interference, this dataset serves as a powerful tool for unraveling the threads of each NFL season's story, penalty by penalty.
https://creativecommons.org/publicdomain/zero/1.0/
This dataset contains comprehensive data from 800 Chinese university football players participating in collegiate and provincial leagues. The goal is to predict whether a player will suffer an injury in the next academic season using machine learning classification methods.
Injury_Next_Season: Binary classification where injury is defined as a training/competition-related injury causing ≥7 consecutive days of absence, verified by the university medical center and coaching staff.
This dataset bridges sports science and machine learning, offering insights into university-level athletic injury prediction. It's particularly valuable for researchers in sports medicine, preventive healthcare, and applied machine learning.
This dataset is intended for academic research and educational purposes. Please respect data privacy and usage guidelines.
http://opendatacommons.org/licenses/dbcl/1.0/
College football is one of the most enduring fascinations in American culture. Its TV ratings routinely dominate the fall TV schedules. The NCAA has a stats website, but it does not have all the team information and uses many obscure acronyms.
With the data available, I went ahead and scraped the team statistics for each college football season from 2013 to the present.
The data contains the team statistics for all of the FBS-level teams in each college season. It includes offensive, defensive, turnover, redzone, special teams, first down, third down, and fourth down stats. There are around 145 different team statistics that can be used.
All of this information is thanks to the NCAA stats website which makes the data easy to use and find. See more here: https://www.ncaa.com/stats/football/fbs
College Football is the only sport in the world where the college version is much older than the professional version. It has a very storied history and many anecdotes about it. Explore the data to learn for yourself the following:
- Does defense really win championships?
- What features translate into wins?
- Are special teams of particular value for a team's performance?
- Which Collegiate Conference is the best?
- What's the correlation between offensive and defensive performance?
This is the first live data stream on Kaggle providing a simple yet rich source of all soccer matches around the world 24/7 in real-time.
What makes it unique compared to other datasets?
Simply train your algorithm on the first version of training dataset of approximately 11.5k matches and predict the data provided in the following data feed.
The CSV file is updated every 30 minutes, at minutes 20 and 50 of every hour. Please do not download it more than twice per hour, as each download incurs additional cost.
You can download the CSV data file from the Amazon S3 server using the following link, changing FOLDER_NAME as described below:
https://s3.amazonaws.com/FOLDER_NAME/amasters.csv
Substitute FOLDER_NAME with "analyst-masters".
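The substitution, together with the request to download at most twice per hour, can be sketched as follows. The helper names and the 30-minute guard are assumptions for illustration; the guard simply mirrors the update interval and is not something the feed enforces. No network request is made here.

```python
URL_TEMPLATE = "https://s3.amazonaws.com/FOLDER_NAME/amasters.csv"

# Assumption: one download per 30-minute update keeps usage at twice per hour.
MIN_SECONDS_BETWEEN_DOWNLOADS = 30 * 60


def build_feed_url(folder_name="analyst-masters"):
    """Fill the FOLDER_NAME placeholder in the published link."""
    return URL_TEMPLATE.replace("FOLDER_NAME", folder_name)


def may_download(last_download_ts, now_ts):
    """Return True if enough time has passed since the previous download."""
    if last_download_ts is None:
        return True
    return now_ts - last_download_ts >= MIN_SECONDS_BETWEEN_DOWNLOADS


feed_url = build_feed_url()
```

A polling script could call `may_download` with `time.time()` before fetching `feed_url` with whatever HTTP client it prefers.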
Our goal is to identify the outcome of a match as Home, Draw or Away. The variety of sources and nature of information provided in this data stream makes it a unique database. Currently, five servers are collecting data from soccer matches around the world, communicating with each other and finally aggregating the data based on the dominant features learned from 400,000 matches over 7 years. I describe every column and the data collection below in two categories, Category I: Current situation and Category II: Head-to-Head History. Hence, we divide the type of data we have from each team into 4 modes,
Below you can find a full illustration of each category.
I. Current situation
Col 1 to 3:
Votes_for_Home Votes_for_Draw Votes_for_Away
The most distinctive parts of the database are these 3 columns. We are releasing the opinions of over 100 professional soccer analysts predicting the outcome of a match. Their votes reflect every piece of information they receive on players, team line-up, injuries and the urge of a team to win a match to stay in the league. They are spread around the world in various time zones and are experts on soccer teams from various regions. Our servers aggregate their opinions to update the CSV file until kickoff. Therefore, even if 40 users predict Real-Madrid wins against Real-Sociedad in Santiago Bernabeu on January 6th, 2019 but 5 users predict Real-Sociedad (the away team) will be the winner, you should doubt the home win. Here, the "majority of votes" works in conjunction with other features.
Col 4 to 9:
Weekday Day Month Year Hour Minute
There are over 60,000 matches during a year, and approximately 400 are usually held per day on weekends. More critical and exciting matches, which are usually less predictable, are held toward the evening in Europe. We currently provide times in Central European Time (CET), equivalent to GMT+01:00.
Note: the 2nd row of the CSV file records the time at which the data values from all servers were saved to the file.
Col 10 to 13:
Total_Bettors Bet_Perc_on_Home Bet_Perc_on_Draw Bet_Perc_on_Away
This data is recorded a few hours before the match, as people place bets emotionally when kickoff approaches. Each of the "Home", "Draw" and "Away" columns gives the percentage of the overall number of bettors, denoted as "Total_Bettors", who backed that outcome.
Col 14 to 15:
Team_1 Team_2
The team playing "Home" is "Team_1" and the opponent playing "Away" is "Team_2".
Col 16 to 36:
League_Rank_1 League_Rank_2 Total_teams Points_1 Points_2 Max_points Min_points Won_1 Draw_1 Lost_1 Won_2 Draw_2 Lost_2 Goals_Scored_1 Goals_Scored_2 Goals_Rec_1 Goal_Rec_2 Goals_Diff_1 Goals_Diff_2
If the match is betw...
https://creativecommons.org/publicdomain/zero/1.0/
This data was collected from https://www.pro-football-reference.com/players/B/BradTo00/gamelog and is a good exercise in data cleaning and data analysis.
This dataset, named "state_trends.csv," contains information about different U.S. states. Let's break down the attributes and understand what each column represents:
In summary, this dataset provides a variety of information about U.S. states, including demographic data, geographical region, psychological region, personality traits, and scores related to interests or proficiencies in various fields such as data science, art, and sports.
Contains web-scraped (rvest) market value information and other related data on players from the top 9 European leagues: Premier League, La Liga, Liga NOS, Ligue 1, Bundesliga, Serie A, Premier Liga, Eredivisie and Jupiler Pro League (20+ variables)
http://opendatacommons.org/licenses/dbcl/1.0/
In the eight years since he became the world's highest-paid athlete for the first time, much has changed for Cristiano Ronaldo. The 39-year-old Portuguese soccer star went from lighting up the Bernabéu with Real Madrid to stints with Juventus and Manchester United, until finally landing at his current home, Al Nassr of the Saudi Pro League. But no matter the location, one thing has remained constant: Ronaldo is still drawing outsized paydays. He earned an estimated $260 million over the last 12 months, making him the highest-paid athlete in the world for the fourth time in his career. Ronaldo's contract with Al Nassr is estimated to have earned him $200 million this season. And as one of the sports world's most successful pitchmen, Ronaldo earned another $60 million off the field from an endorsement portfolio that includes Nike, Binance and Herbalife, among others.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Nowadays, in most sports either tracking or event data is available for sports data scientists to analyse leagues, teams, games or players. For example, in soccer, event-based data is available for all major leagues from professional data providers like Opta, Statsbomb or Wyscout. For tennis this is different. Even though camera-based tracking with Hawkeye is possible, this data is not available to the outside, and only the largest courts are equipped with the system. When I think about the latest breakthroughs in machine learning in image classification, detection, NLP (deepl.com) and audio recognition (Siri, Alexa), it is evident that all of these areas have a huge amount of easily accessible data. Personally, I expect that there would be far more research in tennis if there were a large amount of freely available match data. Statistics for all matches played on the ATP Tour are available from different sources. For example, Jeff Sackmann's GitHub repository is a great way to start. He also has a match charting project where point-by-point data is collected. But when I think about tennis, it is about the movement of the players, their tactics, etc. It is the ball movement, the actual rallies and shots I want to be able to see and analyse.
Event data makes it possible to capture positional, temporal and stroke information. As a proof of concept, and a tribute to Novak Djokovic and Rafael Nadal, two of the greatest tennis players of all time, I manually annotated each rally and stroke of their Australian Open final 2019. Fortunately for me it only went over three sets.
The data consists of all points played in the match. It is built hierarchically from events, to rallies, to actual points.
- Points: a list of all points played in the final with information about the server, receiver, point type, number of strokes, time of rally, new score of the game.
- Rallies: A list of all rallies with Server, Returner, etc.
- Events: Each time a player hit the ball, the stroke type, position of the player, and position of the opponent were recorded.
- Serves: For each successful serve (i.e. not a fault), the position of the serve in the service box was recorded (whenever possible).
I have already done the hard part of data cleaning, and the dataset is hopefully easy to understand and ready to use.
Positions
The x, y positions are with respect to the court coordinate system shown in Figure 1. They were calculated from the pixel coordinates through a direct linear transformation at the beginning of the match. (As the camera angle changed a bit during the match, some of the positions are off.)
Figure 1: The court coordinate system. The horizontal axis refers to x and the vertical axis to the y-direction.
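For readers unfamiliar with the pixel-to-court mapping, a direct linear transformation yields a 3x3 projective matrix that is then applied to each pixel coordinate. The sketch below shows only that second step, applying an already-estimated matrix; the matrix values are invented for illustration and are not the ones used to build this dataset, and estimating the matrix from court-line correspondences is not shown.

```python
def pixel_to_court(H, px, py):
    """Map a pixel coordinate to court coordinates with a 3x3 projective matrix H."""
    x = H[0][0] * px + H[0][1] * py + H[0][2]
    y = H[1][0] * px + H[1][1] * py + H[1][2]
    w = H[2][0] * px + H[2][1] * py + H[2][2]
    # Divide by the homogeneous coordinate to get back to 2D.
    return x / w, y / w


# Hypothetical matrix; the last row introduces a perspective effect along y.
H_example = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.001, 1.0],
]
court_x, court_y = pixel_to_court(H_example, 100.0, 1000.0)
```

Because the matrix is fixed once, any later change in camera angle shifts the mapped positions, which is consistent with the note above that some positions are off.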
Look into the data, see what you can find. Is there information about the game in positional, temporal and stroke information that can tell you more about the players and the match than simple match sheet statistics like the number of break points or first serves in?
You can use the dataset however you want, but here are some things you could start with.
- It is a great way to practice pandas to generate general statistics like points played, serve percentages, games won, breakpoints etc. and compare them with the statistics from other websites.
- You can visualize the spatial positioning of the players on the court. For example, is there a difference between the return positions of Nadal and Djokovic?
- You can calculate movement statistics like distance covered.
- You can calculate the percentage of forehands and backhands, or of shot types like slice and topspin, for each player.
- You can find out where the players are serving to. (Do not forget that Nadal is a lefty.)
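As a starting point for the stroke-percentage idea in the list above, here is a minimal sketch using only the standard library. The field names `hitter` and `stroke` and the sample events are hypothetical and may not match the column names in the events file; with pandas, a `groupby` on the corresponding columns would do the same job.

```python
from collections import Counter, defaultdict


def stroke_shares(events):
    """Return, for each player, the share of each stroke type among their strokes."""
    counts = defaultdict(Counter)
    for event in events:
        counts[event["hitter"]][event["stroke"]] += 1
    shares = {}
    for player, strokes in counts.items():
        total = sum(strokes.values())
        shares[player] = {stroke: n / total for stroke, n in strokes.items()}
    return shares


# Hypothetical events, not taken from the dataset.
sample_events = [
    {"hitter": "Nadal", "stroke": "forehand"},
    {"hitter": "Nadal", "stroke": "forehand"},
    {"hitter": "Nadal", "stroke": "forehand"},
    {"hitter": "Nadal", "stroke": "backhand"},
    {"hitter": "Djokovic", "stroke": "backhand"},
    {"hitter": "Djokovic", "stroke": "forehand"},
]
shares = stroke_shares(sample_events)
```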
To get you started, I have created a sample kernel. Find it here.
ODC Public Domain Dedication and Licence (PDDL) v1.0: http://www.opendatacommons.org/licenses/pddl/1.0/
License information was derived automatically
The Indian Premier League (IPL) is a professional Twenty20 cricket league in India. It is one of the most popular and lucrative cricket leagues in the world. The tournament was established by the Board of Control for Cricket in India (BCCI) in 2007 and is played every year during the months of March, April, and May.
The IPL follows a franchise-based model, where eight teams representing different cities or regions in India compete against each other. The teams are owned by various individuals, companies, and consortiums, including Bollywood actors, business tycoons, and corporate entities. Some of the prominent teams in the IPL include the Mumbai Indians, Chennai Super Kings, Royal Challengers Bangalore, and Kolkata Knight Riders.
The IPL has a star-studded lineup of players, with both international and domestic cricketers participating in the tournament. Many of the world's top cricketers, such as Virat Kohli, Rohit Sharma, AB de Villiers, and Chris Gayle, have been a part of the IPL. The league has provided a platform for young talents to showcase their skills and has played a significant role in the development of Indian cricket.
The tournament format involves a round-robin group stage followed by playoffs. Each team plays a total of 14 matches, facing every other team twice, once at home and once away. The top four teams in the group stage advance to the playoffs, which consist of the qualifier matches and the final. The team that finishes first in the group stage gets two chances to reach the final, while the other three teams compete in the eliminator and the second qualifier matches.
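The 14-match figure follows from the double round-robin with eight teams described above: each team meets seven opponents twice. A small sketch of that arithmetic, with a hypothetical helper name, also gives the total number of group-stage matches.

```python
def round_robin_schedule(num_teams, meetings_per_pair=2):
    """Return (matches per team, total matches) for a round-robin stage."""
    matches_per_team = meetings_per_pair * (num_teams - 1)
    # Each match involves two teams, so halve the per-team sum.
    total_matches = num_teams * matches_per_team // 2
    return matches_per_team, total_matches


per_team, total = round_robin_schedule(8)
print(per_team, total)  # prints "14 56"
```

The playoff matches come on top of this group-stage total.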
The IPL has witnessed intense rivalries, thrilling matches, and high-scoring contests over the years. The league has also been a platform for innovation in cricket, with features like strategic time-outs and cheerleaders, and it adopted the Decision Review System (DRS) from the 2018 season.
Apart from the cricketing action, the IPL has become a significant entertainment spectacle, attracting a large fan base. The matches are held in various cricket stadiums across India, with fans supporting their favorite teams with enthusiasm and passion. The league has also been associated with glitz, glamour, and celebrity performances, making it a blend of sports and entertainment.
The IPL's success has led to the emergence of similar leagues in other countries, such as the Big Bash League in Australia and the Caribbean Premier League. It has also contributed to the growth of franchise-based sports leagues worldwide.
Overall, the Indian Premier League has revolutionized cricket in India and has become a global phenomenon, attracting players, fans, and sponsors from around the world. It has provided a platform for top-level cricket, entertainment, and commercial opportunities, making it a highly anticipated and celebrated event in the cricket calendar.
https://creativecommons.org/publicdomain/zero/1.0/
The VIVO Pro Kabaddi League has gained significant popularity since it started in 2014. This dataset presents a learning opportunity for people who want to try their hand at sports analytics. The challenge is divided into 7 parts:
- Task 1: Predict the winner of the tournament
- Task 2: Predict the top team in the points table after the completion of league matches
- Task 3: Predict the team with the highest points for successful raids
- Task 4: Predict the team with the highest points for successful tackles
- Task 5: Predict the team with the highest super-performance total
- Task 6: Predict the player with the highest SUCCESSFUL RAID percentage
- Task 7: Predict the player with the highest SUCCESSFUL TACKLE percentage
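For Tasks 6 and 7, a success percentage has to be derived from counts before players can be ranked. The sketch below shows one possible way to do that. The field names (`player`, `raids`, `successful_raids`), the sample figures, and the `min_attempts` cutoff are all hypothetical; they are not taken from the dataset or the challenge rules.

```python
def success_percentage(successful, attempts):
    """Share of successful attempts as a percentage; 0.0 when there are no attempts."""
    return 100.0 * successful / attempts if attempts else 0.0


def top_by_success(rows, successful_key, attempts_key, min_attempts=1):
    """Return the row with the highest success percentage among eligible rows."""
    eligible = [r for r in rows if r[attempts_key] >= min_attempts]
    return max(
        eligible,
        key=lambda r: success_percentage(r[successful_key], r[attempts_key]),
    )


# Hypothetical sample rows, not taken from the dataset.
players = [
    {"player": "Raider A", "raids": 120, "successful_raids": 66},
    {"player": "Raider B", "raids": 4, "successful_raids": 3},
    {"player": "Raider C", "raids": 150, "successful_raids": 90},
]

# A minimum-attempts cutoff (an assumption here) keeps tiny samples from dominating.
best = top_by_success(players, "successful_raids", "raids", min_attempts=50)
```

The same helper would apply to tackles by passing the corresponding tackle count fields.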
The dataset contains data up to 30th of September 2019. The datasets are fairly self-explanatory once you have had a brief look at them. The tournament ends on 19th of October 2019. Code to update the data as the tournament progresses, along with reference data for the challenges, can be found at:
https://github.com/sujay-pandit/Upgrad_Hackathon
I will not be updating the dataset beyond this point, so it can also be used for predictions in the future.
I would like to thank:
VIVO Pro Kabaddi, for helping bring fame to a sport that has deserved it for years. https://www.prokabaddi.com/about-prokabaddi
Upgrad, for organizing the hackathon. https://www.upgrad.com/