MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Dataset Description
This dataset contains a single CSV file with lifetime statistics for NBA players. The data includes various box score stats and personal information for each player's career.
Data Fields
The CSV file contains the following columns:
FULL_NAME: The player's full name
AST: Total career assists
BLK: Total career blocks
DREB: Total career defensive rebounds
FG3A: Total 3-point field goal attempts
FG3M: Total 3-point field goals made
FG3_PCT: 3-point field…
See the full description on the dataset page: https://huggingface.co/datasets/Hatman/NBA-Player-Career-Stats.
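As a rough illustration of how the 3-point columns relate to each other, the sketch below derives a 3-point percentage from FG3M and FG3A. The sample rows are made up, and whether the dataset stores FG3_PCT as a fraction or as a percentage is an assumption here, not something the description confirms.

```python
from typing import Optional


def three_point_pct(made: int, attempted: int) -> Optional[float]:
    """Return FG3M / FG3A as a fraction, or None when there are no attempts."""
    if attempted == 0:
        return None
    return made / attempted


# Made-up rows that reuse the column names listed above.
rows = [
    {"FULL_NAME": "Player A", "FG3M": 150, "FG3A": 400},
    {"FULL_NAME": "Player B", "FG3M": 0, "FG3A": 0},
]

for row in rows:
    # Assumption: FG3_PCT is a fraction between 0 and 1.
    row["FG3_PCT"] = three_point_pct(row["FG3M"], row["FG3A"])
```

Returning None for players without attempts avoids a division by zero and keeps "no attempts" distinct from "0% accuracy".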
http://opendatacommons.org/licenses/dbcl/1.0/
All data comes from publicly available datasets compiled into one.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains end-of-season box-score aggregates for NBA players over the 2012–13 through 2023–24 seasons, split into training and test sets for both regular season and playoffs. Each CSV has one row per player per season with columns for points, rebounds, steals, turnovers, 3-pt attempts, FG attempts, plus identifiers.
The upload includes:
end-of-season box-score aggregates (2012–13 to 2023–24), split into train/test;
the Jupyter notebook (Analysis.ipynb), in which all the code can be executed;
the trained model binary (nba_model.pkl), a serialized Random Forest model artifact;
evaluation plots (LAL vs. whole league) for regular-season and playoff predictions, uploaded as PNG outputs;
FAIR4ML metadata (fair4ml_metadata.jsonld).
See README.md and abbreviations.txt for file details.
Notebook
Analysis.ipynb: Contains the graphical output for the training and test data.
Training/Test CSV Data
Name | Description | PID |
---|---|---|
regular_train.csv | Regular-season training set: seasons 2012-2013 through 2021-2022 | 4421e56c-4cd3-4ec1-a566-a89d7ec0bced |
regular_test.csv | Regular-season test set: the 2022-2023 season | f9d84d5e-db01-4475-b7d1-80cfe9fe0e61 |
playoff_train.csv | Playoff training set: seasons 2012-2013 through 2022-2023 | bcb3cf2b-27df-48cc-8b76-9e49254783d0 |
playoff_test.csv | Playoff test set: the 2023-2024 season | de37d568-e97f-4cb9-bc05-2e600cc97102 |
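The season-based split described in the table can be sketched as follows. This is only an illustration: the field name `season_end` (the calendar year in which a season ends) and the sample rows are hypothetical, and the actual preprocessing lives in the authors' notebook.

```python
def split_by_season(rows, first_train, last_train, test_season):
    """Split rows into (train, test) by the year in which each season ends."""
    train = [r for r in rows if first_train <= r["season_end"] <= last_train]
    test = [r for r in rows if r["season_end"] == test_season]
    return train, test


# Made-up rows: one per season from 2012-13 (ends 2013) to 2023-24 (ends 2024).
rows = [{"player": "P", "season_end": year} for year in range(2013, 2025)]

# Regular season: train on 2012-13 .. 2021-22, test on 2022-23.
regular_train, regular_test = split_by_season(rows, 2013, 2022, 2023)

# Playoffs: train on 2012-13 .. 2022-23, test on 2023-24.
playoff_train, playoff_test = split_by_season(rows, 2013, 2023, 2024)
```

Splitting by season rather than at random keeps every test row strictly later than the training rows, so the evaluation cannot draw on information from the season being predicted.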
Others
abbrevations.txt: Lists the basic abbreviations used for the columns in the CSV data
Additional Notes
Raw CSV files were taken from Kaggle (Source: https://www.kaggle.com/datasets/shivamkumar121215/nba-stats-dataset-for-last-10-years/data)
Some preprocessing is required before the files can be uploaded into dbrepo
Plots have also been uploaded as an output for visual purposes.
A more detailed version can be found on GitHub (Link: https://github.com/bubaltali/nba-prediction-analysis/)
https://creativecommons.org/publicdomain/zero/1.0/
Current and updated dataset of NBA Playoff statistics since the 1949-1950 season!
All standard statistics like Assists Per Game, Minutes Per Game, etc. are present as well as advanced statistics like Player Efficiency Rating (PER), Value Over Replacement Player (VORP), Win Share, and more!
This dataset was web scraped from https://www.basketball-reference.com.
Feel free to let me know if there are any statistics or player information that isn't present that you think should be added!
If you want the regular season statistics check out my other data set.
For more details on how some statistics are calculated, please see the Basketball Reference glossary: https://www.basketball-reference.com/about/glossary.html
https://creativecommons.org/publicdomain/zero/1.0/
This dataset covers the 2023 NBA season and focuses on scoring in professional basketball. It records the shots of three players: LeBron James, James Harden, and Stephen Curry. Use it to explore their shooting trends, performance metrics, and accuracy on the court. It is aimed at both basketball enthusiasts and data-driven analysts.
Column Names | Description |
---|---|
Top | The vertical position on the court where the shot was taken. |
Left | The horizontal position on the court where the shot was taken. |
Date | The date when the shot was taken. (e.g., Oct 18, 2022) |
Qtr | The quarter in which the shot was attempted, typically represented as "1st Qtr," "2nd Qtr," etc. |
Time Remaining | The time remaining in the quarter when the shot was attempted, typically displayed as minutes and seconds (e.g., 09:26). |
Result | Indicates whether the shot was successful, with "TRUE" for a made shot and "FALSE" for a missed shot. |
Shot Type | Describes the type of shot attempted, such as a "2" for a two-point shot or "3" for a three-point shot. |
Distance (ft) | The distance in feet from the hoop to where the shot was taken. |
Lead | Indicates whether the team was leading when the shot was attempted, with "TRUE" for a lead and "FALSE" for no lead. |
LeBron Team Score | The team's score (in points) when the shot was taken. |
Opponent Team Score | The opposing team's score (in points) when the shot was taken. |
Opponent | The abbreviation for the opposing team (e.g., GSW for Golden State Warriors). |
Team | The abbreviation for LeBron James's team (e.g., LAL for Los Angeles Lakers). |
Season | The season in which the shots were taken, indicated as the year (e.g., 2023). |
Color | Represents the color code associated with the shot, which may indicate shot outcomes or other characteristics (e.g., "red" or "green"). |
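The column formats described in the table above can be parsed along these lines. This is a minimal sketch: the sample record is made up, and the exact string formats (for example "TRUE"/"FALSE" and "Oct 18, 2022") are taken from the column descriptions rather than checked against the CSV itself.

```python
from datetime import datetime


def parse_shot(raw: dict) -> dict:
    """Convert one raw shot record (all values as strings) into typed fields."""
    minutes, seconds = raw["Time Remaining"].split(":")
    return {
        # "Oct 18, 2022" -> date(2022, 10, 18); assumes English month names.
        "date": datetime.strptime(raw["Date"], "%b %d, %Y").date(),
        # "09:26" -> 566 seconds left in the quarter.
        "seconds_remaining": int(minutes) * 60 + int(seconds),
        "made": raw["Result"].strip().upper() == "TRUE",
        "points": int(raw["Shot Type"]),
        "distance_ft": float(raw["Distance (ft)"]),
    }


# Made-up record using the column names from the table.
raw = {
    "Date": "Oct 18, 2022",
    "Time Remaining": "09:26",
    "Result": "TRUE",
    "Shot Type": "3",
    "Distance (ft)": "26",
}
shot = parse_shot(raw)
```

Converting the time to seconds and the result to a boolean up front makes later aggregation, such as accuracy by shot type, a matter of simple sums.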
Data Scientists and Analysts: Employ advanced statistical analysis to uncover hidden patterns and insights in the shooting trends of LeBron James, James Harden, and Stephen Curry.
Basketball Researchers and Analysts: Evaluate the impact of shooting techniques and performance on game outcomes.
NBA Coaches and Officials: Utilize the dataset to study the strengths and weaknesses of individual players, enabling more targeted coaching and defensive strategies.
Sports Journalists and Commentators: Access detailed statistics to enhance game commentary and provide viewers with deeper insights into player performance.
Basketball Enthusiasts and Fans: Gain a new perspective on the game by exploring the shooting trends and performance of their favorite players.
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
🏀 NBA Shooting Stats: Synthetic Data
This repository contains a synthetic dataset generated from the responsible and ethical scraping of advanced NBA team statistics obtained from NBA.com/stats. The project aims to analyze how playing style in the league has evolved, focusing on shot selection by zone and position, as well as on defensive pressure, measured through contested shots and rebounding control.
📊 Dataset description
The dataset covers the evolution of NBA teams from the 1996-1997 season through 2024-2025, aggregating statistics by:
Team
Season
Conference (East/West)
Player position (Guard, Forward, Center)
It includes offensive metrics such as:
Shots attempted, shots made, and shooting percentage by court zone (for example, <5 ft, 5–9 ft, 10–14 ft, etc.)
And defensive metrics such as:
Contested 2pt shots
Contested 3pt shots
Offensive boxouts (off_boxouts)
Defensive boxouts (def_boxouts)
⚙️ Dataset generation
The scraping was performed with Selenium and BeautifulSoup, automating filters by season, conference, and position. To ensure good practice:
Permitted access was checked beforehand with the robotparser library, respecting the robots.txt file.
Random wait times and simulated navigation were implemented to mimic human behavior and avoid overloading the servers.
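The two practices above can be sketched with the standard library alone. This is not the authors' scraper: the robots.txt content, the user-agent name, the URLs, and the delay bounds are all hypothetical, and a real run would load the live file with `RobotFileParser.set_url()` and `read()` instead of parsing an inline string.

```python
import random
import time
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, used so that the example runs offline.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def build_parser(robots_text: str) -> RobotFileParser:
    """Build a robots.txt parser from text already in memory."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser


def polite_delay(min_s: float = 2.0, max_s: float = 5.0, sleep=time.sleep) -> float:
    """Wait a random time between requests and return the chosen delay."""
    delay = random.uniform(min_s, max_s)
    sleep(delay)
    return delay


parser = build_parser(ROBOTS_TXT)
allowed = parser.can_fetch("example-bot", "https://example.com/stats/teams")
blocked = parser.can_fetch("example-bot", "https://example.com/private/data")
```

In a scraping loop, each request would be preceded by a `can_fetch` check and followed by `polite_delay()`. The `sleep` parameter exists only so the wait can be replaced with a no-op in tests.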
🔐 Important:
The original extracted data is not published in this repository because of the restrictions described in the Terms of Use and Privacy Policy of NBA.com. Instead, a synthetic dataset has been generated that is statistically representative but free of proprietary content.
📁 Included files
nba_synthetic_ds.csv: Main dataset in CSV format (comma-delimited)
nba_synthetic_ds_excel.cs: Version of the dataset with the ; delimiter, compatible with Excel
README.md: This document
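The only difference between the two data files is the delimiter, so the same reader can handle both. A small sketch with the standard `csv` module, using made-up column names and inline text in place of the real files:

```python
import csv
import io


def read_rows(text: str, delimiter: str = ",") -> list:
    """Parse delimited text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))


# Made-up content standing in for the comma- and semicolon-delimited files.
comma_text = "team,season\nLAL,2023-24\n"
semicolon_text = "team;season\nLAL;2023-24\n"

comma_rows = read_rows(comma_text)
semicolon_rows = read_rows(semicolon_text, delimiter=";")
```

With real files, the same function would be fed the contents of an `open(path, newline="")` handle rather than an inline string.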
📌 Data source
The original data was obtained from:
https://www.nba.com/stats. Official NBA statistics site, property of © NBA Media Ventures, LLC.
The synthetic dataset presented here is a derivative work for exclusively academic purposes, which does not infringe the rights of the original owner and respects the permitted use specified in the Terms and the robots.txt file.
📜 License
This dataset is published under the license: 👉 CC BY-NC-SA 4.0 – Attribution-NonCommercial-ShareAlike
This means that:
You may use, share, and adapt the data for non-commercial purposes
You must credit the original source (NBA.com) and this project
Any derivative work must be distributed under the same license
👥 Authors
Project developed by:
Etel Silva García – esilgar@uoc.edu
José Morote García – josemorote21@uoc.edu
As the season has come to an end and we are already deep in playoff basketball, I wanted to see whether I could find data to predict the MVP of the 2018-19 season. After a quick search, I came across all MVP votings from 1968-69 up to this past season on basketball-reference. I wrote a scraper and got the data, including the data for the current season. However, I scraped only the data from the 1980-81 season onward, because that is when the media started to choose the MVP of the league.
The mvp_votings.csv file represents the training data. It holds various basketball statistics. You can view descriptions of some of the stats in my Medium post. The target value for regression can be the award_share column, which represents the share of the votes that the players have won.
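For intuition about the regression target, an award share is commonly computed as the voting points a player received divided by the maximum points available that year. The sketch below assumes that definition; the argument names `points_won` and `points_max` and the numbers are illustrative and are not confirmed by the dataset description.

```python
def award_share(points_won: float, points_max: float) -> float:
    """Share of the maximum possible voting points that a player received."""
    if points_max <= 0:
        raise ValueError("points_max must be positive")
    return points_won / points_max


# Made-up ballots: (player, points_won) out of a hypothetical maximum of 1000.
ballots = [("Player A", 900.0), ("Player B", 500.0), ("Player C", 50.0)]
shares = {name: award_share(points, 1000.0) for name, points in ballots}

# A model would predict award_share per player; the highest share wins.
predicted_mvp = max(shares, key=shares.get)
```

Because the share is normalized by the yearly maximum, values stay comparable across seasons with different numbers of voters.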
All of the data is owned by basketball reference, and I do not own any of the data.
Image belongs to nba.com
What is the most important statistic for determining who will be the MVP?
What are your predictions for this season?
How did the most important feature change over the years?
How big an impact does a team's win percentage have compared with all other features?
Context-
Stephen Curry's heroics in 3-point shooting led me to create the dataset.
Content-
This dataset contains 3-point shots made, 3-point shots attempted, 3-point field goal percentage, and the percentage share of 3-pointers in total points for the period 1996-2020. The first three columns are taken from the official NBA.com website; the percentage share of 3-pointers in total points was calculated from the data retrieved there.
Column Description-
A) Sheet 1 (Year-wise data): average stats for every NBA team for each season
Teams: All the existing teams
Each season (e.g., 1996-97) has 4 columns under it:
3PM: Average 3-pointers made per game in that season by the specified team
3PA: Average 3-pointers attempted per game in that season by the specified team
3P%: Average 3-point shooting percentage per game in that season by the specified team
3P% share in Total points: Average share of 3-pointers in total points scored per game by the specified team
B) Sheet 2 (NBA Average data): average stats for the whole NBA for each season
Years: Season played
3PM, 3PA, 3P%, 3P% share in Total points: defined as in Sheet 1
C) Sheet 3 (GSW Average data): average stats for GSW only, for each season
Years: Season played
3PM, 3PA, 3P%, 3P% share in Total points: defined as in Sheet 1
D) Sheet 4 (4-Year Range data): 4-year average stats for every NBA team
Years: Season played
3PM, 3PA, 3P%, 3P% share in Total points: defined as in Sheet 1
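The description does not spell out how "3P% share in Total points" was derived. One plausible reading, shown here purely as an assumption, is that each made three-pointer contributes 3 points, so the share is 3 × 3PM divided by total points per game, expressed as a percentage. The numbers below are made up.

```python
def three_point_share(three_pm: float, total_points: float) -> float:
    """Percentage of points per game that come from made 3-pointers."""
    if total_points <= 0:
        raise ValueError("total_points must be positive")
    # Assumption: each made 3-pointer is worth exactly 3 points.
    return 100.0 * 3 * three_pm / total_points


# Made-up team averages: 10 threes made per game, 100 points per game.
share = three_point_share(10, 100)
```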