MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Dataset Description
This dataset contains a single CSV file with lifetime statistics for NBA players. The data includes various box score stats and personal information for each player's career.
Data Fields
The CSV file contains the following columns:
FULL_NAME: The player's full name
AST: Total career assists
BLK: Total career blocks
DREB: Total career defensive rebounds
FG3A: Total 3-point field goal attempts
FG3M: Total 3-point field goals made
FG3_PCT: 3-point field…
See the full description on the dataset page: https://huggingface.co/datasets/Hatman/NBA-Player-Career-Stats.
This dataset was collected for working with NBA game data. I used the NBA stats website to create it.
You can find more details about data collection in my GitHub repo: nba predictor repo.
If you want more information about this API endpoint, see the nba_api GitHub repo, which documents each endpoint.
You can find 5 datasets:
CONFERENCE
column
I would like to thank the NBA stats website, which makes all NBA data freely available to everyone through a great API endpoint.
Enjoy it! Nathan
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains end-of-season box-score aggregates for NBA players over the 2012–13 through 2023–24 seasons, split into training and test sets for both regular season and playoffs. Each CSV has one row per player per season with columns for points, rebounds, steals, turnovers, 3-pt attempts, FG attempts, plus identifiers.
End-of-season box-score aggregates (2012–13 to 2023–24), split into train/test
The Jupyter notebook (Analysis.ipynb), in which all the code can be executed
The trained model binary (nba_model.pkl), a serialized Random Forest model artifact
Evaluation plots (LAL vs. whole league) for regular-season and playoff predictions, uploaded here as PNG outputs
FAIR4ML metadata (fair4ml_metadata.jsonld)
See README.md and abbreviations.txt for file details.
Notebook
Analysis.ipynb: Contains the graphical output for the training and test data.
Train/Test CSV Data
Name | Description | PID |
---|---|---|
regular_train.csv | Regular-season training data: seasons 2012-2013 through 2021-2022 | 4421e56c-4cd3-4ec1-a566-a89d7ec0bced |
regular_test.csv | Regular-season test data: the 2022-2023 season | f9d84d5e-db01-4475-b7d1-80cfe9fe0e61 |
playoff_train.csv | Playoff training data: seasons 2012-2013 through 2022-2023 | bcb3cf2b-27df-48cc-8b76-9e49254783d0 |
playoff_test.csv | Playoff test data: the 2023-2024 season | de37d568-e97f-4cb9-bc05-2e600cc97102 |
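The season-based split described in the table can be sketched as follows. This is a minimal illustration, not the repository's own code: the column names (PLAYER, SEASON, PTS), the season label format, and the sample rows are all assumptions made for the example.

```python
import csv
import io

# Hypothetical miniature of a combined regular-season file; the column
# names and values are assumptions, not taken from the dataset.
SAMPLE = """PLAYER,SEASON,PTS
A,2012-13,1500
A,2021-22,1800
A,2022-23,1700
B,2022-23,900
"""


def season_start_year(season: str) -> int:
    """Return the first year of a label such as '2021-22' or '2021-2022'."""
    return int(season.split("-")[0])


def split_by_season(rows, test_season_start: int):
    """Rows from seasons before the test season form the training set;
    rows from the test season itself form the test set."""
    train = [r for r in rows if season_start_year(r["SEASON"]) < test_season_start]
    test = [r for r in rows if season_start_year(r["SEASON"]) == test_season_start]
    return train, test


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
train_rows, test_rows = split_by_season(rows, test_season_start=2022)
print(len(train_rows), len(test_rows))
```

Splitting on the season rather than at random keeps every test row strictly later than the training rows, which matches the layout of the four files above.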
Others
abbrevations.txt: Lists the basic abbreviations used for the columns in the CSV data
Additional Notes
Raw CSV files are taken from Kaggle (Source: https://www.kaggle.com/datasets/shivamkumar121215/nba-stats-dataset-for-last-10-years/data)
Some preprocessing had to be done before uploading to dbrepo
Plots have also been uploaded as output for visual purposes.
A more detailed version can be found on GitHub (Link: https://github.com/bubaltali/nba-prediction-analysis/)
https://creativecommons.org/publicdomain/zero/1.0/
Current and updated dataset of NBA Playoff statistics since the 1949-1950 season!
All standard statistics like Assists Per Game, Minutes Per Game, etc. are present, as well as advanced statistics like Player Efficiency Rating (PER), Value Over Replacement Player (VORP), Win Shares, and more!
This dataset was web scraped from https://www.basketball-reference.com.
Feel free to let me know if there are any statistics or player information that aren't present that you think should be added!
If you want the regular season statistics, check out my other dataset.
For more details on how some statistics are calculated, please see the glossary at https://www.basketball-reference.com/about/glossary.html
NBA Games Data
This data is an updated version of the original NBA Games by Nathan Lauga.
Data source Code Updated to: 2025-02-13
The dataset retains the original format and includes the following files:
games.csv – Summary of NBA games, including scores and team details.
games_details.csv – Detailed player statistics for each game.
players.csv – Player information.
ranking.csv – Daily NBA team rankings.
teams.csv – List of all NBA teams.
As the season has come to an end and we are already deep into playoff basketball, I wanted to see whether I could find some data to predict the MVP of the 2018-19 season. After a quick search, I came across all MVP votings from 1968-69 up to this past season on basketball-reference. I wrote a scraper and got the data, including the data for the current season. However, I scraped only the data from the 1980-81 season onward, because that's when the media started to choose the MVP of the league.
The mvp_votings.csv file represents the training data. It holds various basketball statistics. You can view descriptions of some of the stats in my Medium post. The target value for regression can be the award_share column, which represents the share of the votes that each player won.
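As a rough illustration of using award_share as a regression target, the sketch below computes a share and fits a one-feature least-squares line. It makes several assumptions: that the share is voting points won divided by the maximum points available, that team win percentage is one of the features, and that the toy numbers (which are made up) stand in for real rows.

```python
def award_share(points_won: float, points_max: float) -> float:
    """Fraction of the maximum available voting points a player received
    (assumed definition of the award_share column)."""
    if points_max <= 0:
        raise ValueError("points_max must be positive")
    return points_won / points_max


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Made-up toy rows: (team win percentage, points won, points max).
toy = [(0.50, 100, 1000), (0.60, 300, 1000), (0.70, 500, 1000)]
win_pct = [w for w, _, _ in toy]
shares = [award_share(p, m) for _, p, m in toy]

slope, intercept = fit_line(win_pct, shares)
print(round(slope, 3), round(intercept, 3))
```

A real model would use many more features and seasons; the single-feature fit only shows how the target column plugs into a regression.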
All of the data is owned by basketball reference, and I do not own any of the data.
Image belongs to nba.com
What is the most important statistic in determining who will be the MVP?
What are your predictions for this season?
How did the most important feature change over the years?
How big an impact does a team's win percentage have compared with all other features?
This file contains links to the data behind The Complete History Of The NBA and our NBA Predictions.
nba_elo.csv contains game-by-game Elo ratings and forecasts back to 1946.
This dataset was scrap...
http://opendatacommons.org/licenses/dbcl/1.0/
All data comes from publicly available datasets compiled into one.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The database contains several datasets and files with NBA statistical data spanning four seasons (2015-2016 to 2018-2019). These datasets were procured from the Basketball Reference database (https://www.basketball-reference.com/), a publicly accessible source of NBA data.
The main file, dat.cleaned.csv, includes the Win/Loss records for all thirty NBA teams, along with box scores and advanced statistics. The data captured over the four seasons correspond to about 4,920 regular-season games. A distinguishing feature of this dataset is the repeated measurements per player within a team across the seasons. These repeated measurements are not independent, so hierarchical modelling is needed to handle the data properly.
Two sets of additional text files (per_2017.txt, per_2018.txt, rpm_2017.txt, rpm_2018.txt) provide specific metrics for player performance. The 'PER' files contain the Player Efficiency Rating (PER) for the years 2017 and 2018. The 'RPM' files contain the ESPN-developed score called Real Plus-Minus (RPM) for the same years.
However, potential biases or limitations within the datasets should be acknowledged. For instance, the Basketball Reference website might not include data from some matches or may exclude certain variables, potentially affecting the quality and accuracy of the dataset.
I needed an easy, interesting dataset for a presentation a few days before the NBA Finals this year, so I thought NBA players' stats might catch my audience's attention. Here is the dataset. I've also shared a kernel with a few interactive visualizations of the data. Let me know what you think.
NBA 2018/2019 Season Players' Statistics Data. I obtained the data from NBA.com and converted it to CSV format.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘Predicting Women's NBA (WNBA)’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/yamqwe/wnba-forecastse on 28 January 2022.
--- Dataset description provided by original source is as follows ---
About
This file contains links to the data behind our WNBA Predictions. More information on how our WNBA Elo model works can be found in this article.
wnba_elo.csv contains game-by-game Elo ratings and forecasts since 1997.
wnba_elo_latest.csv contains game-by-game Elo ratings and forecasts for only the latest season.
License
Data released under the Creative Commons Attribution 4.0 License
Source
This dataset was created by data.world's Admin and contains around 6000 samples along with Home Team Postgame Rating, Home Team, technical information and other features such as: - Date - Away Team - and more.
- Analyze Neutral in relation to Home Team Pregame Rating
- Study the influence of Away Team Postgame Rating on Season
If you use this dataset in your research, please credit data.world's Admin
--- Original source retains full ownership of the source dataset ---
http://www.gnu.org/licenses/lgpl-3.0.html
This data was obtained from basketball-reference.com using a self-written web crawler. It contains detailed game data and player-specific stats for each game of the respective season.
Data for each season is arranged in two CSV files. The first file, season_XXXX_basic.csv, contains basic data for each game of the season, such as the date, time, scores and attendance. The second file, season_XXXX_detailed.csv, contains additional statistics for each player participating in a specific game, such as the minutes played, field goals made and field goals attempted. A lot of data is missing for older seasons, since it wasn't recorded and is not listed on basketball-reference.com.
It would be interesting to see which statistics changed over time as the game evolved and teams focused more on 3PT shots, for example.
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
🏀 NBA Shooting Stats: Synthetic Data
This repository contains a synthetic dataset generated from responsible, ethical scraping of advanced NBA team statistics obtained from NBA.com/stats. The project aims to analyze how playing style in the league has evolved, focusing on shot selection by zone and position, as well as on defensive pressure, measured through contested shots and rebounding control.
📊 Dataset description
The dataset covers the evolution of NBA teams from the 1996-1997 season through 2024-2025, aggregating statistics by:
Team
Season
Conference (East/West)
Player position (Guard, Forward, Center)
It includes offensive metrics such as:
Shots attempted, shots made, and shooting percentage by court zone (for example, <5 ft, 5–9 ft, 10–14 ft, etc.)
And defensive metrics such as:
Contested 2pt shots
Contested 3pt shots
Offensive boxouts (off_boxouts)
Defensive boxouts (def_boxouts)
⚙️ Dataset generation
The scraping was carried out with Selenium and BeautifulSoup, automating the filters for season, conference, and position. To ensure good practice:
Permitted access was verified beforehand with the robotparser library, respecting the robots.txt file.
Random wait times and simulated navigation were implemented to mimic human behavior and avoid overloading the servers.
🔐 Important:
The original extracted data is not published in this repository because of the restrictions described in the NBA.com Terms of Use and Privacy Policy. Instead, a synthetic dataset has been generated that is statistically representative but free of proprietary content.
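The two good-practice steps described in this section (a robots.txt check and randomized waits) can be sketched with the Python standard library. This is an illustration under assumptions, not the project's actual scraper: the rules, the user agent name `demo-bot`, the example.com URLs, and the 2 to 6 second range are all hypothetical.

```python
import random
import urllib.robotparser


def is_allowed(robots_lines, user_agent, url):
    """Check a URL against robots.txt rules supplied as a list of lines."""
    parser = urllib.robotparser.RobotFileParser()
    parser.parse(robots_lines)
    return parser.can_fetch(user_agent, url)


def polite_delay(min_seconds=2.0, max_seconds=6.0, rng=random):
    """Pick a random wait time so that requests are not evenly spaced."""
    return rng.uniform(min_seconds, max_seconds)


# Hypothetical rules for illustration; a real run would load the site's own
# file with parser.set_url(...) followed by parser.read().
EXAMPLE_RULES = ["User-agent: *", "Disallow: /private/"]

print(is_allowed(EXAMPLE_RULES, "demo-bot", "https://example.com/stats/teams"))
print(is_allowed(EXAMPLE_RULES, "demo-bot", "https://example.com/private/data"))

delay = polite_delay()
# time.sleep(delay) would go here, before each page request.
```

Parsing the rules from a list keeps the example offline; only the commented `time.sleep` call and the page request itself would touch the network.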
📁 Included files
nba_synthetic_ds.csv: Main dataset in CSV format (comma-delimited)
nba_synthetic_ds_excel.cs: Version of the dataset with the ; delimiter, compatible with Excel
README.md: This document
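Reading the semicolon-delimited variant only requires passing the delimiter explicitly. The sketch below uses an in-memory sample; its column names and values are assumptions, since the real header is not listed here.

```python
import csv
import io

# Hypothetical two-row sample; column names and values are made up.
SAMPLE = "team;season;contested_3pt\nLAL;2023-24;18.2\nBOS;2023-24;19.5\n"


def read_semicolon_csv(text):
    """Parse semicolon-delimited text into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text), delimiter=";"))


rows = read_semicolon_csv(SAMPLE)
print(rows[0]["team"], rows[0]["contested_3pt"])
```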
📌 Data origin
The original data was obtained from:
https://www.nba.com/stats. Official NBA statistics site, property of © NBA Media Ventures, LLC.
The synthetic dataset presented here is a derivative work for exclusively academic purposes; it does not infringe the rights of the original owner and respects the permitted use specified in the Terms and the robots.txt file.
📜 License
This dataset is published under the license: 👉 CC BY-NC-SA 4.0 – Attribution-NonCommercial-ShareAlike
This means that:
You may use, share, and adapt the data for non-commercial purposes
You must credit the original source (NBA.com) and this project
Any derivative work must be distributed under the same license
👥 Authors
Project developed by:
Etel Silva García – esilgar@uoc.edu
José Morote García – josemorote21@uoc.edu
https://creativecommons.org/publicdomain/zero/1.0/
This dataset opens the door to the intricacies of the 2023 NBA season, offering a profound understanding of the art of scoring in professional basketball. It showcases the remarkable prowess of three players, LeBron James, James Harden, and Stephen Curry, true icons of the sport. Delve deep into these players' shooting trends, performance metrics, and precision on the court. Whether you're a passionate basketball enthusiast or a data-driven analyst, this dataset provides a unique and invaluable window into the mastery of these legendary athletes and the ever-evolving game of basketball.
Column Names | Description |
---|---|
Top | The vertical position on the court where the shot was taken. |
Left | The horizontal position on the court where the shot was taken. |
Date | The date when the shot was taken. (e.g., Oct 18, 2022) |
Qtr | The quarter in which the shot was attempted, typically represented as "1st Qtr," "2nd Qtr," etc. |
Time Remaining | The time remaining in the quarter when the shot was attempted, typically displayed as minutes and seconds (e.g., 09:26). |
Result | Indicates whether the shot was successful, with "TRUE" for a made shot and "FALSE" for a missed shot. |
Shot Type | Describes the type of shot attempted, such as a "2" for a two-point shot or "3" for a three-point shot. |
Distance (ft) | The distance in feet from the hoop to where the shot was taken. |
Lead | Indicates whether the team was leading when the shot was attempted, with "TRUE" for a lead and "FALSE" for no lead. |
LeBron Team Score | The team's score (in points) when the shot was taken. |
Opponent Team Score | The opposing team's score (in points) when the shot was taken. |
Opponent | The abbreviation for the opposing team (e.g., GSW for Golden State Warriors). |
Team | The abbreviation for LeBron James's team (e.g., LAL for Los Angeles Lakers). |
Season | The season in which the shots were taken, indicated as the year (e.g., 2023). |
Color | Represents the color code associated with the shot, which may indicate shot outcomes or other characteristics (e.g., "red" or "green"). |
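A few of the columns above need light parsing before analysis. The sketch below converts "Time Remaining" to seconds, reads "Result" as a boolean, and computes accuracy per "Shot Type". The column names come from the table; the three sample rows are made up for illustration.

```python
import csv
import io

# Made-up sample rows using a subset of the documented column names.
SAMPLE = """Qtr,Time Remaining,Result,Shot Type,Distance (ft)
1st Qtr,09:26,TRUE,2,3
1st Qtr,04:10,FALSE,3,26
2nd Qtr,11:02,TRUE,3,25
"""


def seconds_remaining(clock: str) -> int:
    """Convert an 'MM:SS' clock string into seconds."""
    minutes, seconds = clock.split(":")
    return int(minutes) * 60 + int(seconds)


def made(result: str) -> bool:
    """Interpret the Result column, where 'TRUE' marks a made shot."""
    return result.strip().upper() == "TRUE"


def accuracy_by_shot_type(rows):
    """Return {shot type: makes / attempts} for the given rows."""
    totals = {}
    for row in rows:
        attempts, makes = totals.get(row["Shot Type"], (0, 0))
        totals[row["Shot Type"]] = (attempts + 1, makes + made(row["Result"]))
    return {k: makes / attempts for k, (attempts, makes) in totals.items()}


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
print(seconds_remaining(rows[0]["Time Remaining"]))
print(accuracy_by_shot_type(rows))
```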
Data Scientists and Analysts: Employ advanced statistical analysis to uncover hidden patterns and insights in the shooting trends of LeBron James, James Harden, and Stephen Curry.
Basketball Researchers and Analysts: Evaluate the impact of shooting techniques and performance on game outcomes.
NBA Coaches and Officials: Utilize the dataset to study the strengths and weaknesses of individual players, enabling more targeted coaching and defensive strategies.
Sports Journalists and Commentators: Access detailed statistics to enhance game commentary and provide viewers with deeper insights into player performance.
Basketball Enthusiasts and Fans: Gain a new perspective on the game by exploring the shooting trends and performance of their favorite players.
Context-
Stephen Curry's heroics in 3-point shooting led me to create this dataset.
Content-
This dataset contains 3-pointers made, 3-pointers attempted, 3-point field goal percentage, and the percentage share of 3-pointers in total points for the period 1996-2020. The first three columns are taken from the official NBA.com website, and the percentage share of 3-pointers in total points was calculated from data retrieved from the same website.
Column Description-
A) For Sheet 1 (Year wise data): This sheet has average stats for every NBA team for each season
Teams: All the existing teams
Every season (e.g. 1996-97) has 4 columns under it:
3PM: Average 3-pointers made per game in that particular season by the specified team
3PA: Average 3-pointers attempted per game in that particular season by the specified team
3P%: Average 3-pointer shooting percentage per game in that particular season by the specified team
3P% share in Total points: Average share of 3-pointers in total points scored per game by the specified team
B) For Sheet 2 (NBA Average data): This sheet has average stats for the whole of the NBA for each season
Years: Played season year
3PM: Average 3-pointers made per game in that particular season by the specified team
3PA: Average 3-pointers attempted per game in that particular season by the specified team
3P%: Average 3-pointer shooting percentage per game in that particular season by the specified team
3P% share in Total points: Average share of 3-pointers in total points scored per game by the specified team
C) For Sheet 3 (GSW Average data): This sheet has average stats only for GSW for every season
Years: Played season year
3PM: Average 3-pointers made per game in that particular season by the specified team
3PA: Average 3-pointers attempted per game in that particular season by the specified team
3P%: Average 3-pointer shooting percentage per game in that particular season by the specified team
3P% share in Total points: Average share of 3-pointers in total points scored per game by the specified team
D) For Sheet 4 (4-Year Range data): This sheet has 4-year average stats for every NBA team
Years: Played season year
3PM: Average 3-pointers made per game in that particular season by the specified team
3PA: Average 3-pointers attempted per game in that particular season by the specified team
3P%: Average 3-pointer shooting percentage per game in that particular season by the specified team
3P% share in Total points: Average share of 3-pointers in total points scored per game by the specified team
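The derived column "3P% share in Total points" is not given as a formula here. A plausible reading, stated as an assumption, is that each made 3-pointer contributes three points, so the share is 3 × 3PM divided by total points, expressed as a percentage:

```python
def three_point_share(threes_made_per_game: float, points_per_game: float) -> float:
    """Percentage of a team's points that come from 3-pointers.

    Assumes share = 100 * (3 * 3PM) / total points; the dataset's exact
    calculation is not documented in this description.
    """
    if points_per_game <= 0:
        raise ValueError("points_per_game must be positive")
    return 100.0 * 3.0 * threes_made_per_game / points_per_game


# Hypothetical team averaging 10 made threes and 100 points per game.
print(three_point_share(10, 100))
```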
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This file contains links to the data behind The Complete History Of The NBA and our NBA Predictions.
nba_elo.csv contains game-by-game Elo ratings and forecasts back to 1946. nba_elo_latest.csv contains game-by-game Elo ratings and forecasts for only the latest season.
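For readers unfamiliar with Elo forecasts, the sketch below shows the standard Elo win-probability formula. It is the generic textbook form; the exact constants and adjustments behind the forecasts in nba_elo.csv are not stated here, so the `home_advantage` parameter and its default of zero are assumptions.

```python
def elo_win_probability(rating_a: float, rating_b: float,
                        home_advantage: float = 0.0) -> float:
    """Probability that team A beats team B under the standard Elo model.

    home_advantage is added to team A's rating; its size is an assumption
    left to the caller.
    """
    diff = rating_b - (rating_a + home_advantage)
    return 1.0 / (1.0 + 10.0 ** (diff / 400.0))


# Two evenly rated teams on a neutral court.
print(elo_win_probability(1500, 1500))
```

A 400-point rating gap corresponds to roughly 10-to-1 odds in favor of the stronger team.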
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Dataset is based on box score and standing statistics from the NBA.
Calculations such as number of possessions, floor impact counter, strength of schedule, and simple rating system are performed.
Finally, extracts are created based on a perspective:
teamBoxScore.csv communicates game data from each team's perspective
officialBoxScore.csv communicates game data from each official's perspective
playerBoxScore.csv communicates game data from each player's perspective
standing.csv communicates standings data for each team every day during the season
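Of the calculations mentioned above, the number of possessions is usually estimated from box score totals. The sketch below uses one widely quoted approximation; whether this dataset uses the same formula or coefficient is not stated, so treat both as assumptions.

```python
def estimate_possessions(fga: float, oreb: float, tov: float, fta: float,
                         ft_weight: float = 0.44) -> float:
    """Approximate team possessions from box score totals.

    Uses FGA - OREB + TOV + ft_weight * FTA, a common approximation.
    The 0.44 free-throw weight is a conventional value, assumed here.
    """
    return fga - oreb + tov + ft_weight * fta


# Hypothetical single-game totals.
print(estimate_possessions(fga=80, oreb=10, tov=12, fta=25))
```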
Data Sources
Box score and standing statistics were obtained by a Java application using RESTful APIs provided by xmlstats.
Calculation Sources
Another Java application performs advanced calculations on the box score and standing data.
Formulas for these calculations were primarily obtained from these sources:
Favoritism
Does a referee impact the number of fouls made against a player or the pace of a game?
Forecasting
Can the aggregated points scored by and against a team along with their strength of schedule be used to determine their projected winning percentage for the season?
Predicting the Past
For a given game, can games played earlier in the season help determine how a team will perform?
Lots of data elements and possibilities. Let your imagination roam!
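One possible starting point for the forecasting question is the Pythagorean expectation, which projects a winning percentage from points scored and allowed. This is a sketch of that general technique, not something the dataset prescribes: the exponent of 14 is an assumed value (figures in that neighborhood are often quoted for basketball), and strength of schedule is left out.

```python
def pythagorean_win_pct(points_for: float, points_against: float,
                        exponent: float = 14.0) -> float:
    """Projected winning percentage from aggregate points for and against.

    The exponent is an assumption and would normally be tuned to the data.
    """
    if points_for <= 0 or points_against <= 0:
        raise ValueError("point totals must be positive")
    pf = points_for ** exponent
    pa = points_against ** exponent
    return pf / (pf + pa)


# Hypothetical season totals for a team that outscores its opponents.
print(round(pythagorean_win_pct(9200, 8900), 3))
```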
https://creativecommons.org/publicdomain/zero/1.0/
Contains the latitude and longitude coordinates in decimal format of every Major League Baseball (MLB), National Football League (NFL), National Basketball Association (NBA), National Hockey League (NHL), and Major League Soccer (MLS) team's home stadium. Also includes information about each team's division.
Note that as teams change names, new stadiums are built, and sports leagues realign divisions, this information will become out of date.
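Decimal latitude and longitude pairs like these can be used directly to compute distances between stadiums. The sketch below applies the haversine formula; the mean Earth radius of 6371 km is an assumed constant, and the coordinates in the example are arbitrary points rather than values from the dataset.

```python
import math

EARTH_RADIUS_KM = 6371.0  # Mean Earth radius; an assumed constant.


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometers between two decimal-degree points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # Clamp to guard against rounding slightly above 1.0.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


# Arbitrary example points, not taken from the dataset.
print(round(haversine_km(40.0, -75.0, 34.0, -118.0), 1))
```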
Credit to Mick Haupt via Unsplash for the banner photo.