MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Dataset Description
This dataset contains a single CSV file with lifetime statistics for NBA players. The data includes various box score stats and personal information for each player's career.
Data Fields
The CSV file contains the following columns:
FULL_NAME: The player's full name
AST: Total career assists
BLK: Total career blocks
DREB: Total career defensive rebounds
FG3A: Total 3-point field goal attempts
FG3M: Total 3-point field goals made
FG3_PCT: 3-point field…
See the full description on the dataset page: https://huggingface.co/datasets/Hatman/NBA-Player-Career-Stats.
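The FG3_PCT description is truncated above, so the following is only a minimal sketch of how a 3-point percentage could be recomputed from the FG3M and FG3A columns, assuming it is simply makes divided by attempts. The sample rows and player names are made up for illustration and are not taken from the dataset.

```python
import csv
import io

# Hypothetical sample rows using column names from the field list above.
# The values are invented for illustration only.
SAMPLE = """FULL_NAME,FG3A,FG3M
Player A,1000,380
Player B,0,0
"""

def three_point_pct(made, attempted):
    """Return made / attempted, or None when there were no attempts."""
    if attempted == 0:
        return None
    return made / attempted

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
pcts = {
    row["FULL_NAME"]: three_point_pct(int(row["FG3M"]), int(row["FG3A"]))
    for row in rows
}
print(pcts)
```

Returning None for players with zero attempts avoids a division by zero and keeps "never attempted" distinct from "attempted and missed everything".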
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains end-of-season box-score aggregates for NBA players over the 2012–13 through 2023–24 seasons, split into training and test sets for both regular season and playoffs. Each CSV has one row per player per season with columns for points, rebounds, steals, turnovers, 3-pt attempts, FG attempts, plus identifiers.
End-of-season box-score aggregates (2012–13 to 2023–24), split into train and test sets.
The Jupyter notebook (Analysis.ipynb); all the code can be executed there.
The trained model binary (nba_model.pkl), a serialized Random Forest model artifact.
Evaluation plots (LAL vs. whole league) for regular-season and playoff predictions, uploaded as PNG outputs.
FAIR4ML metadata (fair4ml_metadata.jsonld).
See README.md and abbreviations.txt for file details.
Notebook
Analysis.ipynb: Contains the graphical output for the training and test data.
Train/Test CSV Data
Name | Description | PID |
regular_train.csv | Regular-season training data: seasons 2012-2013 through 2021-2022 | 4421e56c-4cd3-4ec1-a566-a89d7ec0bced |
regular_test.csv | Regular-season test data: the 2022-2023 season | f9d84d5e-db01-4475-b7d1-80cfe9fe0e61 |
playoff_train.csv | Playoff training data: seasons 2012-2013 through 2022-2023 | bcb3cf2b-27df-48cc-8b76-9e49254783d0 |
playoff_test.csv | Playoff test data: the 2023-2024 season | de37d568-e97f-4cb9-bc05-2e600cc97102 |
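The table describes a split by season rather than a random split: every season before the held-out one goes to training, and the held-out season becomes the test set. A minimal sketch of that idea follows, assuming each row carries a season label such as "2022-23". The column name `season` and the sample rows are hypothetical and may not match the actual CSV headers.

```python
def split_by_season(rows, test_season):
    """Put seasons before test_season in train and test_season itself in test.

    Season labels like "2021-22" sort correctly as plain strings because
    they start with a four-digit year. Later seasons are dropped from both sets.
    """
    train = [r for r in rows if r["season"] < test_season]
    test = [r for r in rows if r["season"] == test_season]
    return train, test

# Hypothetical rows; "season" and "pts" are assumed column names.
rows = [
    {"season": "2012-13", "pts": 1500},
    {"season": "2021-22", "pts": 1720},
    {"season": "2022-23", "pts": 1810},
    {"season": "2023-24", "pts": 1650},
]

# Regular season: hold out 2022-23, as in regular_test.csv.
regular_train, regular_test = split_by_season(rows, "2022-23")
# Playoffs: hold out 2023-24, as in playoff_test.csv.
playoff_train, playoff_test = split_by_season(rows, "2023-24")
```

Splitting on the season boundary keeps future seasons out of the training data, which a random row-level split would not guarantee.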
Others
abbrevations.txt: Lists the basic abbreviations used for the columns in the CSV data
Additional Notes
Raw csv files are taken from Kaggle (Source: https://www.kaggle.com/datasets/shivamkumar121215/nba-stats-dataset-for-last-10-years/data)
Some preprocessing has to be done before uploading into dbrepo
Plots have also been uploaded as an output for visual purposes.
A more detailed version can be found on github (Link: https://github.com/bubaltali/nba-prediction-analysis/)
I needed an easy, interesting dataset for a presentation a few days before the NBA Finals this year, and I thought NBA players' stats might catch my audience's attention. Here is the dataset. I've also shared a kernel with a few interactive visualizations of the data. Let me know what you think.
NBA 2018/2019 Season Players' Statistics Data. I obtained the data from NBA.com and converted it to CSV format.
NBA Games Data
This data is an updated version of the original NBA Games by Nathan Lauga.
Data source Code Updated to: 2025-02-13
The dataset retains the original format and includes the following files:
games.csv – Summary of NBA games, including scores and team details.
games_details.csv – Detailed player statistics for each game.
players.csv – Player information.
ranking.csv – Daily NBA team rankings.
teams.csv – List of all NBA teams.
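Since games.csv holds one row per game and games_details.csv holds one row per player per game, the two files are naturally combined through a shared game identifier. The sketch below assumes that key is a column named `GAME_ID` and that the other columns are `GAME_DATE_EST`, `PLAYER_NAME` and `PTS`. These names and the inline sample rows are assumptions for illustration and should be checked against the actual file headers.

```python
import csv
import io
from collections import defaultdict

# Hypothetical excerpts standing in for games.csv and games_details.csv.
GAMES = """GAME_ID,GAME_DATE_EST
1,2025-02-12
2,2025-02-13
"""
DETAILS = """GAME_ID,PLAYER_NAME,PTS
1,Player A,20
1,Player B,11
2,Player A,31
"""

def index_details_by_game(details_rows):
    """Group player rows by GAME_ID so each game can look up its players."""
    by_game = defaultdict(list)
    for row in details_rows:
        by_game[row["GAME_ID"]].append(row)
    return by_game

games = list(csv.DictReader(io.StringIO(GAMES)))
details = index_details_by_game(csv.DictReader(io.StringIO(DETAILS)))

# Sum the player points recorded for each game date.
points_per_date = {
    g["GAME_DATE_EST"]: sum(int(d["PTS"]) for d in details[g["GAME_ID"]])
    for g in games
}
print(points_per_date)
```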
http://www.gnu.org/licenses/lgpl-3.0.html
This data was obtained from basketball-reference.com using a self-written web crawler. It contains detailed game data and player-specific stats for each game of the respective season.
Data for each season is arranged in two CSV files. The first file, season_XXXX_basic.csv, contains basic data for each game of the season, such as the date, time, scores and attendance. The second file, season_XXXX_detailed.csv, contains additional statistics for each player participating in a specific game, such as the minutes played, field goals made and field goals attempted. A lot of data is missing for older seasons, since it wasn't recorded and is not listed on basketball-reference.com.
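The two-files-per-season layout can be handled with a small helper that builds both file names from one season label. This is a sketch that assumes XXXX stands for a four-digit season year. The helper name `season_files` and the directory `data` are hypothetical.

```python
from pathlib import Path

def season_files(data_dir, season):
    """Return the (basic, detailed) CSV paths for one season.

    Assumes the XXXX placeholder in the file names is the season year.
    """
    base = Path(data_dir)
    basic = base / f"season_{season}_basic.csv"
    detailed = base / f"season_{season}_detailed.csv"
    return basic, detailed

basic_path, detailed_path = season_files("data", 2015)
print(basic_path.name, detailed_path.name)
```

Because older seasons have many missing values, code that reads the detailed file should expect empty fields rather than assume every statistic is present.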
It would be interesting to see what statistics changed over the course of time when the game evolved and teams focused more on 3PT shots for example.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The database contains several datasets and files with NBA statistical data spanning four seasons (2015-2016 to 2018-2019). These datasets were procured from the Basketball Reference database (https://www.basketball-reference.com/), a publicly accessible source of NBA data.
The main file, `dat.cleaned.csv`, includes the Win/Loss records for all thirty NBA teams, along with box scores and advanced statistics. The data captured over the four seasons correspond to about 4,920 regular-season games. A distinguishing feature of this dataset is the repeated measurements per player within a team across the seasons. However, it's important to note that these repeated measurements are not independent, necessitating the use of hierarchical modelling to properly handle the data.
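The figure of about 4,920 games can be checked with simple arithmetic, assuming the standard 82-game regular-season schedule for each of the thirty teams (the schedule length is not stated in the description itself). Each game involves two teams, so the team-game count is halved.

```python
TEAMS = 30
GAMES_PER_TEAM = 82  # assumed standard regular-season schedule length
SEASONS = 4          # 2015-2016 through 2018-2019

# Every game is shared by two teams, so divide the team-game total by two.
games_per_season = TEAMS * GAMES_PER_TEAM // 2
total_games = games_per_season * SEASONS
print(games_per_season, total_games)
```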
Two sets of additional text files (`per_2017.txt`, `per_2018.txt`, `rpm_2017.txt`, `rpm_2018.txt`) provide specific metrics for player performance. The 'PER' files contain the Player Efficiency Rating (PER) for the years 2017 and 2018. The 'RPM' files contain the ESPN-developed score called Real Plus-Minus (RPM) for the same years.
However, potential biases or limitations within the datasets should be acknowledged. For instance, the Basketball Reference website might not include data from some matches or may exclude certain variables, potentially affecting the quality and accuracy of the dataset.