Facebook
TwitterApache License, v2.0https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This dataset provides an extensive collection of popular video games across various platforms and genres. It includes key details such as game titles, genres, platforms, release years, and user ratings, allowing for in-depth analysis of gaming trends and player preferences.
With a diverse selection of games, this dataset reflects current industry trends, helping researchers, developers, and gaming enthusiasts explore popularity patterns, platform preferences, and genre evolution over time.
📌 Key Features: 🕹️ Game Information: Game Title 🎮: The name of the game. Genre 🎭: The category of the game (e.g., Action, RPG, FPS, Strategy). Platform 🖥️🎮📱: The platform(s) the game is available on (e.g., PC, PlayStation, Xbox, Switch, Mobile). Release Year 📅: The year the game was released. Developer 🏢: The company or studio that developed the game. ⭐ Popularity & Reception: User Ratings 🌟: Average player ratings based on reviews. Metacritic Score 🎯: Professional critic score from Metacritic (if applicable). Total Sales (in millions) 💰: Estimated number of copies sold worldwide. 🎮 Additional Insights: Multiplayer Support 🔄: Whether the game offers multiplayer features (Yes/No). Game Mode 🎯: Single-player, Multiplayer, or Both. ESRB Rating 🔞: Age classification (E, T, M, etc.). 📊 Use Cases & Applications: 🔹 Analyzing gaming trends over time 🔹 Comparing the popularity of different game genres 🔹 Identifying top-selling games by platform 🔹 Understanding user preferences and ratings 🔹 Exploring relationships between game ratings and sales
⚠️ Important Note: This dataset is synthetically generated for educational and analytical purposes. It does not contain real proprietary data but is designed to resemble real-world gaming statistics.
Facebook
TwitterContext: Valorant, developed by Riot Games, has quickly become one of the most popular tactical first-person shooter games since its release. The game emphasizes strategic team play, individual skills, and tactical execution, making it a fascinating subject for performance analysis. Understanding the various metrics that contribute to player success can offer insights into effective strategies and gameplay techniques. This dataset was created to help players, coaches, and analysts delve into the detailed aspects of player performance and identify key areas for improvement.
Sources: The data for this dataset was collected from various online sources, including:
In-Game Statistics: Aggregated from player profiles and match histories available within the game client. Third-Party Valorant Trackers: Websites and tools that track player statistics and match performance, such as Tracker.gg and Blitz.gg. Community Contributions: Insights and data shared by the Valorant community, including professional players, streamers, and analysts, who often provide detailed breakdowns of their gameplay. Inspiration: The inspiration for compiling this dataset stems from several key areas:
Performance Analysis: In competitive gaming, understanding the granular details of player performance is crucial for improvement. Metrics like win rate, damage per round, and headshot percentage provide actionable insights. Strategic Development: By analyzing this data, players and teams can develop better strategies, identify strengths and weaknesses, and tailor their training regimes accordingly. Predictive Modeling: The dataset serves as a foundation for building predictive models to forecast future performance, which can be useful for coaching, match preparation, and scouting new talent. Community Engagement: Providing this dataset to the wider Valorant community fosters engagement and encourages collaborative analysis. It allows enthusiasts to test hypotheses, share findings, and contribute to a deeper understanding of the game. Educational Purposes: For educators and students in data science, sports analytics, and game design, this dataset offers a real-world application of data analysis techniques and methodologies. Future Directions: The dataset can be expanded by including additional metrics such as agent pick rates, map-specific performance, and team composition analysis. Incorporating more granular data over longer periods can also enhance the depth of analysis and provide a more comprehensive view of player performance trends.
By sharing this dataset, we aim to empower the Valorant community with data-driven insights that can elevate gameplay, inform strategic decisions, and contribute to the overall growth of the esports ecosystem.
Facebook
Twitterhttps://www.winsipedia.com/termshttps://www.winsipedia.com/terms
Complete historical game data between Troy and UNLV including scores, dates, locations, and game statistics.
Facebook
TwitterMIT Licensehttps://opensource.org/licenses/MIT
License information was derived automatically
This contains more detailed information than the dataset from https://www.kaggle.com/datasets/codytipton/understat-data, which includes the individual player stats per game for the English Premier League, La Liga, Bundesliga, Serie A, Ligue 1, and the Russian Football Premier League. In particular, it contains each player's xG, xGBuildup, goals, and shots per game. Furthermore, it has the events for each shot in the events table, clubs and their stats per season in the clubs table, and each game with who lost, won, shots, possession, probabilities of who wins, ect..
This is for educational purposes in our data science bootcamp project.
Facebook
Twitterhttps://www.winsipedia.com/termshttps://www.winsipedia.com/terms
Complete historical game data between UConn and Wake Forest including scores, dates, locations, and game statistics.
Facebook
Twitterhttps://www.winsipedia.com/termshttps://www.winsipedia.com/terms
Complete historical game data between Louisiana Tech and Miami (FL) including scores, dates, locations, and game statistics.
Facebook
TwitterMIT Licensehttps://opensource.org/licenses/MIT
License information was derived automatically
Dive into the intricate world of League of Legends with "Champions of the Rift," an extensive dataset that compiles detailed in-game statistics for all champions. This dataset includes vital information such as base health, mana, armor, attack damage, ability power, and gold efficiency for each champion, categorized by their primary roles. Whether you're a data analyst, a game developer, or an avid gamer looking to deepen your understanding of champion mechanics, this dataset provides a comprehensive foundation for analysis and strategy development in the ever-evolving battlefield of Summoner's Rift. Explore the strengths and weaknesses of your favorite champions and gain a competitive edge with this meticulously curated collection of champion statistics.
Column Descriptions:
Champion Name: The name of the champion in League of Legends.
Role : The primary role or lane typically played by the champion. Common roles include Top, Jungle, Mid, ADC (Attack Damage Carry), and Support.
Base Health: The initial health points (HP) of the champion at level 1.
Base Mana: The initial mana points (MP) of the champion at level 1. Some champions do not use mana, in which case this value may be zero.
Base Armor: The initial armor value of the champion at level 1, which reduces incoming physical damage.
Base Attack Damage: The initial attack damage (AD) of the champion at level 1, which affects the amount of physical damage dealt by basic attacks.
Gold Efficiency: A relative measure of how cost-effective the champion's base stats are, expressed as a ratio. A higher value indicates better gold efficiency.
Facebook
Twitterhttps://www.winsipedia.com/termshttps://www.winsipedia.com/terms
Complete historical game data between Louisiana and Sam Houston including scores, dates, locations, and game statistics.
Facebook
TwitterA team's mean seasons statistics can be used as predictors for their performance in future games. However, these statistics gain additional meaning when placed in the context of their opponents' (and opponents' opponents') performance. This dataset provides this context for each team. Furthermore, predicting games based on post-season stats causes data leakage, which from experience can be significant in this context (15-20% loss in accuracy). Thus, this dataset provides each of these statistics prior to each game of the regular season, preventing any source of data leakage.
All data is derived from the March Madness competition data. Each original column was renamed to "A" and "B" instead of "W" and "L," and the mirrored to represent both orderings of opponents. Each team's mean stats are computed (both their stats, and the mean "allowed" or "forced" statistics by their opponents). To compute the mean opponents' stats, we analyze the games played by each opponent (excluding games played against the team in question), and compute the mean statistics for those games. We then compute the mean of these mean statistics, weighted by the number of times the team in question played each opponent. The opponents' opponent's stats are computed as a weighted average of the opponents' average. This results in statistics similar to those used to compute strength of schedule or RPI, just that they go beyond win percentages (See: https://en.wikipedia.org/wiki/Rating_percentage_index)
The per game statistics are computed by pretending we don't have any of the data on or after the day in question.
Currently, the data isn't computed particularly efficiently. Computing the per game averages for every day of the season is necessary to compute fully accurate opponents' opponents' average, but takes about 90 minutes to obtain. It is probably possible to parallelize this, and the per-game averages involve a lot of repeated computation (basically computing the final averages over and over again for each day). Speeding this up will make it more convenient to make changes to the dataset.
I would like to transform these statistics to be per-possession, add shooting percentages, pace, and number of games played (to give an idea of the amount uncertainty that exists in the per-game averages). Some of these can be approximated with the given data (but the results won't be exact), while others will need to be computed from scratch.
Facebook
Twitterhttps://www.winsipedia.com/termshttps://www.winsipedia.com/terms
Complete historical game data between Rice and UCLA including scores, dates, locations, and game statistics.
Facebook
TwitterThis dataset contains detailed match and player data from League of Legends, one of the most popular multiplayer online battle arena (MOBA) games in the world. It includes 270,000+ matches and contains 900,000+ summoner statistics, capturing a wide range of in-game statistics, such as champion selection, player performance metrics, match outcomes, and more.
The dataset is structured to support a variety of analyses, including:
Whether you are interested in competitive gaming, data science, or predictive modeling, this dataset provides a rich source of structured data to explore the dynamics of League of Legends at scale.
Data was collected from Riot Games API using Python script(link) from Patch 25.19
The datase consists of 7 csv files:
-MySQL Database using Linux -Database Schema Script can be found here. (Works with the gtihub project to collect your own data)
The Riot API only provides the "BOTTOM" lane for bot-lane players. During Data collection, roles were inferred by combining chapions that often played support with CS metrics to distinguish ADC vs Support — especially for ambiguous picks like Senna or off-meta choices.
Data is collected using the official Riot Games API. We thank Riot Games for providing the data and tools that make this project possible. This dataset is not endorsed or certified by Riot Games. No personal or identifiable player data (e.g., Summoner Names, Summoner IDs, or PUUIDs) are included. The SummonerTbl has been intentionally excluded from this public release.
The Python scripts used for data collection, as well as various scripts I developed for API calls, database management, and initial data analytics, can be found on GitHub
Facebook
Twitterhttps://www.winsipedia.com/termshttps://www.winsipedia.com/terms
Complete historical game data between Sam Houston and Southern Miss including scores, dates, locations, and game statistics.
Facebook
TwitterBy Chase Willden [source]
This dataset, named 'College-Bowl-Football.csv', offers a comprehensive study of player performance measurements and game outcomes in college football bowl games, an important aspect of American college athletics.
The dataset is anchored on the premise of evaluating tangible performance metrics to investigate the credibility and impact of the home field advantage phenomenon often touted in sports. Specifically using statistical data from college football bowl games, this dataset examines whether having a home field advantage is as beneficial as it's generally perceived.
The data was meticulously collected via web scraping techniques from a reputable sports news website www.foxsports.com. For deeper insights into this research, interested parties can find valuable content presented in an engaging blog format at The Concept Center website (http://theconceptcenter.com/simple-research-study-college-football-bowl-games/).
The primary unit of analysis in this dataset pertains to individual players and their respective teams denoted by columns such as 'team' and 'player'. Several sub-categories record numerous aspects revolving around different types of plays that occur in American Football like passing, rushing, receiving kicks returns among others.
Essentially covering every conceivable metric related to gameplay including attempts made for each play type (e.g., Passing - Attempts, Rushing - Attempts), success rates (% completions for passing attempts or FG Made Pct for kicking) or even mishaps during gameplay (such as Fumbles Lost), the range is wide
Distance-related attributes are also well documented with total yards covered during successful moves provided along with information on longest distance achieved for numerous actions from punt returns to rushing endeavors. Touchdowns secured via multiple methods like passing or rush receive recognition too through respective data points.
On special factors impacting scoring results directly such as kicking & punting activities are further divided into more granular levels capturing specifics on successful instances alongside indicators about these activities' success rates.
In every respect this dataset presents all necessary dimensions one needs to dissect College Football player performance in Bowl Games exhaustively. Ideal for enthusiasts or analysts seeking to delve into detailed evaluations of game situations and player performance with quantifiable evidence
This dataset comes with a wide range of performance metrics for individual players in the college football bowl games. It opens up opportunities for comprehensive analysis and insights into college football sports data, allowing enthusiasts, analysts or potential scouts to garner valuable insights and trends.
Here's a guide on how you can put this data to good use:
1. Player Performance Analysis:
Evaluate individual player performance across various metrics like passing efficiency (Completion Pct), rushing capabilities (Rushing Attempts, Rushing Touchdowns), receiving skills(Receiving Yards) etc.
Identify top performers in each metric category. This could help in scouting new talent.
2. Team Performance Analysis:
Assess the overall team's performance by aggregating player data.
Understand if specific teams are better at certain aspects of the game like kicking, returning punts etc.
Assist team management in their tactical strategies by highlighting areas of strength and improvement.
3. Predictive Analytics:
Using past statistics available in this dataset, predictive models can be constructed to predict future outcomes such as:
Predicting potential star players
Foresee team’s future performances
4. Statistical Comparison & Benchmarking:
Comparison between teams or between players based on their stats; useful for scouting activities and drafting pick discussions.The possibilities with this archive of information are vast depending upon your objective! As all dataset it might need some preprocessing as per the requirement but it is nothing compared to the insightful journey that lies ahead once you've decoded these numbers from past games! Happy Data Diving!
Note:
- Remember that fields denoted like '**Field-name' means total counts/accumulation while 'Field-name **' denotes a highlight/common term which you will find across...
Facebook
TwitterMIT Licensehttps://opensource.org/licenses/MIT
License information was derived automatically
This dataset provides comprehensive performance statistics for NBA players throughout the 2024/2025 season. It includes both advanced and traditional stats, making it ideal for player performance analysis, efficiency assessments, and exploring game patterns and trends. Data was collected from reliable sources, ensuring quality and consistency across each record.
23.5 = 23 minutes and 30 seconds).YYYY-MM-DD format.This dataset is perfectly suited for: - Statistical analysis: Gain insights into player and team performance trends. - Machine learning projects: Build predictive models using detailed player stats. - Performance prediction: Forecast player outcomes or team results. - Player comparisons: Analyze players across various metrics and categories. - Efficiency analysis: Evaluate player and team efficiency, comparing statistics across games. - Game trend exploration: Investigate patterns within the season, identifying shifts in strategies and performance.
Facebook
TwitterMIT Licensehttps://opensource.org/licenses/MIT
License information was derived automatically
All statistics sourced from FBRef and Transfermarkt and scraped using Ruby on Rails. This dataset contains 3 CSVs, "Teams" with basic info (just name and FBRef link), "Players" with name and summary statistics for the 2023-2024 season as well as their age and value from Transfermarkt, and "Match Logs" which has in-depth information for every Premier League game the player's team took part in over the past 2 seasons (or whatever length of time the player was on said team).
Facebook
TwitterFootball is more than just a game — it’s data-rich and decision-driven. From match results to player statistics, the English Premier League (EPL) offers a goldmine of insights for analysts, fans, and data scientists.
This dataset is part of a personal data preprocessing project designed to transform messy raw data into a clean, structured format — enabling meaningful analysis, modeling, or visualization. Whether you're predicting match outcomes, exploring season trends, or learning data science, this dataset gives you a strong starting point.
This dataset was originally sourced from football-data.co.uk, a trusted source for historical football data. The raw data was downloaded in CSV format and carefully cleaned using Python. The resulting dataset is ready for analysis and includes statistics such as:
Match dates
Full-time and half-time results
Goals, corners, shots, fouls
Yellow and red cards
It’s ideal for building machine learning models, dashboards, or practicing sports analytics.
This dataset is for educational and non-commercial use only. Raw data sourced from football-data.co.uk. Please credit the source if you use or share this dataset.
Facebook
Twitterhttp://www.gnu.org/licenses/fdl-1.3.htmlhttp://www.gnu.org/licenses/fdl-1.3.html
This dataset has the game data for top 300 players for league of legends in the EUW region. This contains in game statistics together with win data. Each players past 20 matches were taken into accord and the data was added to the dataset using the riot api.
Facebook
Twitterhttps://www.winsipedia.com/termshttps://www.winsipedia.com/terms
Complete historical game data between Rice and Utah including scores, dates, locations, and game statistics.
Facebook
TwitterValorant is still relatively new to the eSports scene, so the data analysis for pro or semi-pro games is still in its infancy stage. One of the biggest issues is sourcing the data. Vlr.gg is similar to CSGO's hltv.org that provides some great information on matches, but extracting its data isn't very accessible. Luckily, they (for now) allow scraping the website as much as you want.
I had a lot of issues because even though the HTML/CSS format is generally the same, there's a bunch of edge cases to account for and even times where the formatting completely breaks my parser. I didn't upload my code because it's honestly super messy, but I might in the future when I clean it up. The data set currently get most matches up to Jan 1, 2022, and I think there's like 400 out of ~11.5k that got errors and I couldn't add to the database. Probably about 200 are from the very first matches that got posted on vlr.gg.
There is four tables. The top level is Matches that will tell you teams playing and match (map) score. Game is the next level that breaks down the specific maps played. Then Game_Rounds gives a round by round breakdown which shows who won, economy of each team, win type, and buy type, whenever the info is available. The game rounds are packaged in one string that you should be able to cast as a json. Lastly there is Game_Scoreboard which gives you the player performance, as well as things like number of first kills, first deaths, 2Ks, 3Ks, One v Ones, One v Twos, ect.
https://www.kaggle.com/hidious/valorant-vlrgg-results-and-stats
This dataset scrapes the results pages based on map score, but will only get match score, or map score if they only played 1 map. My data set tries to scrape everything available.
Facebook
Twitterhttps://creativecommons.org/publicdomain/zero/1.0/https://creativecommons.org/publicdomain/zero/1.0/
NOTE: As of now, the source of player data that I had used has changed. So unless I find a new source/figure out how to retrieve it from the changed website, that part of the dataset won't be updated anymore. The tables, matches, and events parts are all fine and will be updated around midweek while the season is going.
NOTE: Version 31 doesn't contain an updated table. I will try to soon either fix how I get it or just compute that information based on the game data.
This dataset contains player, game, event, and table data from Major League Soccer (MLS).
There is currently information on over 6000 matches and almost 420,000 events from those matches.
For an introduction to the data, check out this kernel. For a quick overview of what data is available for what years, visit this spreadsheet
V21 - all_tables.csv updated, now has 1996-2020 and additional columns; includes a section for each conference and then (for most years) an overall table as well V8 - matches.csv now 1996-2020 (dates look a little different for this period) V4 - all updated with matches through 8/17/2020 (except all_tables.csv, webpage with the table seems to be down) V3 - events.csv and matches.csv now 2001-2020 (events only available for 2008 onwards) V2 - events.csv and matches.csv now 2012-2020
I plan to update this dataset weekly while MLS games are being played.
Image: https://www.houstondynamo.com/post/2020/06/10/mls-back-all-26-teams-resume-season-espn-wide-world-sports-starting-july-8 Player Data: https://www.mlssoccer.com/stats Match Data 1996-2000: https://fbref.com/en/comps/22/44/schedule/2000-Major-League-Soccer-Scores-and-Fixtures Table Data: https://en.wikipedia.org/wiki/1996_Major_League_Soccer_season Other Data: https://www.espn.com/soccer/league/_/name/usa.1/
All data was scraped from the above two sources by me. My code is here.
There's a myriad of questions that can be asked about soccer teams and players. - What stats impact the chances of a team winning the most? - Which players have the best win rate and worst win rate over the years? - Which goalkeepers have performed the best? etc.
Facebook
TwitterApache License, v2.0https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This dataset provides an extensive collection of popular video games across various platforms and genres. It includes key details such as game titles, genres, platforms, release years, and user ratings, allowing for in-depth analysis of gaming trends and player preferences.
With a diverse selection of games, this dataset reflects current industry trends, helping researchers, developers, and gaming enthusiasts explore popularity patterns, platform preferences, and genre evolution over time.
📌 Key Features: 🕹️ Game Information: Game Title 🎮: The name of the game. Genre 🎭: The category of the game (e.g., Action, RPG, FPS, Strategy). Platform 🖥️🎮📱: The platform(s) the game is available on (e.g., PC, PlayStation, Xbox, Switch, Mobile). Release Year 📅: The year the game was released. Developer 🏢: The company or studio that developed the game. ⭐ Popularity & Reception: User Ratings 🌟: Average player ratings based on reviews. Metacritic Score 🎯: Professional critic score from Metacritic (if applicable). Total Sales (in millions) 💰: Estimated number of copies sold worldwide. 🎮 Additional Insights: Multiplayer Support 🔄: Whether the game offers multiplayer features (Yes/No). Game Mode 🎯: Single-player, Multiplayer, or Both. ESRB Rating 🔞: Age classification (E, T, M, etc.). 📊 Use Cases & Applications: 🔹 Analyzing gaming trends over time 🔹 Comparing the popularity of different game genres 🔹 Identifying top-selling games by platform 🔹 Understanding user preferences and ratings 🔹 Exploring relationships between game ratings and sales
⚠️ Important Note: This dataset is synthetically generated for educational and analytical purposes. It does not contain real proprietary data but is designed to resemble real-world gaming statistics.