This data set contains 30 million chess positions along with a label that indicates if the position is not check (0), check (1) or checkmate (2). In addition, we provide 3 reference explanations per data point consisting of 8×8 bit masks that highlight certain squares that are relevant for the decision. For each class, we identified one explanation type that characterizes it most accurately:
- No check (0): All squares that are controlled by the enemy player, i.e., all squares that can be reached or captured on by any enemy piece.
- Check (1): All squares (origin or target) of legal moves. As a checkmate is a check where the player under attack has no more legal moves, highlighting legal moves is sufficient to disprove a checkmate.
- Checkmate (2): All squares with pieces that are essential for creating the checkmate. This includes attackers, friendly pieces blocking the King, enemy pieces guarding escape squares and enemy pieces protecting attackers.
The data is saved as a CSV file containing the chess positions in Forsyth–Edwards Notation (FEN) and the label (0-2) as columns.
The FEN string can be read by most chess software packages and encodes the current piece setup, whose turn it is and some more game-specific information (castling rights, en-passant squares).
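As a rough illustration of what a FEN string encodes, the sketch below splits one into its first four fields using only the standard library. The function name `parse_fen` and the returned dictionary layout are illustrative choices, not part of the data set's own code.

```python
def parse_fen(fen: str) -> dict:
    """Split a FEN string into piece placement, side to move,
    castling rights and en-passant square (hypothetical helper)."""
    fields = fen.split()
    if len(fields) < 4:
        raise ValueError("expected at least four FEN fields")
    placement, active, castling, en_passant = fields[:4]

    board = []
    for rank_text in placement.split("/"):  # rank 8 comes first in FEN
        row = []
        for ch in rank_text:
            if ch.isdigit():
                row.extend(["."] * int(ch))  # a digit is a run of empty squares
            else:
                row.append(ch)  # uppercase = white piece, lowercase = black
        if len(row) != 8:
            raise ValueError(f"bad rank in FEN: {rank_text!r}")
        board.append(row)
    if len(board) != 8:
        raise ValueError("FEN must describe eight ranks")

    return {
        "board": board,
        "active": active,
        "castling": castling,
        "en_passant": None if en_passant == "-" else en_passant,
    }


start = parse_fen("rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1")
```

In practice a chess library would normally do this parsing and also validate the position; the sketch only shows the field layout.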
The explanations are saved as 64-bit unsigned integers, which can be converted to SquareSet objects from the chess library.
We provide code for converting between different data and explanation representations.
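The conversion between a 64-bit integer and an 8×8 mask can be sketched without any dependencies. This is not the authors' conversion code; it assumes the bit order used by the chess library's SquareSet (bit 0 is a1, bit 7 is h1, bit 63 is h8), and the function names are illustrative.

```python
def mask_to_grid(mask: int) -> list[list[int]]:
    """Expand a 64-bit mask into 8 rows of 8 bits, rank 8 first.

    Assumes bit index = rank * 8 + file with a1 = 0 (an assumption
    about the data set's encoding, matching the chess library).
    """
    if not 0 <= mask < 1 << 64:
        raise ValueError("mask must fit in 64 unsigned bits")
    grid = []
    for rank in range(7, -1, -1):  # print order: rank 8 down to rank 1
        grid.append([(mask >> (rank * 8 + file)) & 1 for file in range(8)])
    return grid


def grid_to_mask(grid: list[list[int]]) -> int:
    """Inverse of mask_to_grid."""
    mask = 0
    for row_index, row in enumerate(grid):
        rank = 7 - row_index
        for file, bit in enumerate(row):
            if bit:
                mask |= 1 << (rank * 8 + file)
    return mask


corners = 0x8100000000000081  # a1, h1, a8, h8 under the assumed bit order
corner_grid = mask_to_grid(corners)
```

With the chess library installed, `chess.SquareSet(mask)` accepts the same integer directly.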
Our data set is based on the Lichess open database, which contains records of over 3 billion games of chess played online by human players on the free chess website Lichess. To read and process the games and to create the explanations, we used the Python package chess. We selected only those games that end in checkmate, excluding those that end by timeout or resignation. We also skipped the first ten moves of each game, as they lead to many duplicate positions.
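The selection rule described above can be sketched on a list of SAN moves. This is a simplified stand-in for the authors' pipeline, not a reproduction of it: it assumes the final move carries the SAN checkmate suffix `#`, and it assumes "ten moves" means ten full moves (20 plies). Both function names are hypothetical.

```python
def ends_in_checkmate(san_moves: list[str]) -> bool:
    """True if the last SAN move is marked as checkmate with '#'."""
    return bool(san_moves) and san_moves[-1].endswith("#")


def kept_plies(san_moves: list[str], skipped_moves: int = 10) -> list[int]:
    """Indices of the plies whose resulting positions would be kept.

    Assumption: one move = two plies, so the first 2 * skipped_moves
    plies are dropped. Games not ending in checkmate yield nothing.
    """
    if not ends_in_checkmate(san_moves):
        return []
    return list(range(2 * skipped_moves, len(san_moves)))


fools_mate = ["f3", "e5", "g4", "Qh4#"]
long_game = ["Nf3", "Nf6"] * 10 + ["Qh5", "Qh4#"]  # placeholder tokens, 22 plies
```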
This dataset was created by Marcos Garcia.
This dataset is used by the Approvers team's NNUE in the FIDE & Google Efficient Chess AI Challenge.
It uses Bulletformat, a binary format that stores each chess position in 32 bytes, which the Bullet trainer uses to train the NNUE.
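Because each position occupies a fixed 32 bytes, a file can be walked record by record. The sketch below only slices the raw bytes and does not decode the fields inside a record; it assumes the file is a plain concatenation of records with no header, which is not confirmed by the description above.

```python
import io
from typing import BinaryIO, Iterator

RECORD_SIZE = 32  # bytes per position, as stated in the description


def iter_records(stream: BinaryIO, record_size: int = RECORD_SIZE) -> Iterator[bytes]:
    """Yield consecutive fixed-size records from a binary stream.

    Assumes no file header and no padding between records.
    """
    while True:
        chunk = stream.read(record_size)
        if not chunk:
            return
        if len(chunk) != record_size:
            raise ValueError("file ends with a truncated record")
        yield chunk


# Three zero-filled placeholder records stand in for a real data file.
records = list(iter_records(io.BytesIO(bytes(3 * RECORD_SIZE))))
```

Under the same assumption, the number of positions in a file is simply its size in bytes divided by 32.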
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
This dataset contains 31,744 self-play chess games generated using Leela Chess Zero (Lc0) with specific configurations to explore unique gameplay dynamics. The games were generated using:
- 2.25
- cuda-fp16
- 2.5
The games are stored in the PGN (Portable Game Notation) format, which is widely used for recording chess games. This dataset can serve as a resource for:
games-2.5s.pgn: The main dataset file containing all 31,744 games in PGN format.
- 2.25
- cuda-fp16
- 2.5
This dataset can be used to fine-tune or evaluate chess engines by:
- Extracting positions for supervised learning.
- Analyzing endgames or middlegame strategies.
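As a starting point for extracting data from the PGN file, the sketch below splits a PGN stream into games using only the standard library. It is a simplified reader that assumes each game begins with a block of tag pairs such as `[Result "1-0"]`; the function name and the inline sample games are illustrative and not taken from the dataset.

```python
import io
import re
from typing import Iterator, TextIO

TAG_RE = re.compile(r'^\[(\w+)\s+"(.*)"\]\s*$')


def iter_games(stream: TextIO) -> Iterator[tuple[dict[str, str], str]]:
    """Yield (headers, movetext) for each game in a PGN stream."""
    headers: dict[str, str] = {}
    movetext: list[str] = []
    for line in stream:
        line = line.strip()
        match = TAG_RE.match(line)
        if match:
            if movetext:  # a tag after movetext starts the next game
                yield headers, " ".join(movetext)
                headers, movetext = {}, []
            headers[match.group(1)] = match.group(2)
        elif line:
            movetext.append(line)
    if headers or movetext:
        yield headers, " ".join(movetext)


SAMPLE_PGN = '''[Event "A"]
[Result "1-0"]

1. e4 e5 2. Qh5 1-0

[Event "B"]
[Result "0-1"]

1. f3 e5 2. g4 Qh4# 0-1
'''

games = list(iter_games(io.StringIO(SAMPLE_PGN)))
```

For the real file, `open("games-2.5s.pgn", encoding="utf-8")` would replace the in-memory sample. A dedicated PGN parser is the safer choice when positions must be replayed move by move.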
This dataset is shared under the Creative Commons Attribution 4.0 International License (CC BY 4.0). You are free to use, share, and adapt the data, provided you give appropriate credit to the creators.
If you use this dataset or have suggestions for improvements, please leave feedback or share your project in the Kaggle discussion forums.
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
This dataset contains 197,218 self-play chess games generated using Leela Chess Zero (Lc0) with specific configurations to explore unique gameplay dynamics. The games were generated using:
- 2.25
- cuda-fp16
- 2.225
The games are stored in the PGN (Portable Game Notation) format, which is widely used for recording chess games. This dataset can serve as a resource for:
games-2.225s.pgn: The main dataset file containing all 197,218 games in PGN format.
- 2.25
- cuda-fp16
- 2.225
This dataset can be used to fine-tune or evaluate chess engines by:
- Extracting positions for supervised learning.
- Analyzing endgames or middlegame strategies.
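Before positions can be extracted, the movetext of a game has to be reduced to its moves. A minimal sketch, assuming standard PGN movetext with optional `{...}` comments and non-nested `(...)` variations; whether the Lc0 games contain such annotations is not stated, and the helper name is hypothetical.

```python
import re

RESULTS = {"1-0", "0-1", "1/2-1/2", "*"}


def san_moves(movetext: str) -> list[str]:
    """Return the SAN move tokens of one game's movetext."""
    text = re.sub(r"\{[^}]*\}", " ", movetext)  # drop {comments}
    text = re.sub(r"\([^()]*\)", " ", text)     # drop non-nested (variations)
    moves = []
    for token in text.split():
        if token in RESULTS:
            continue
        token = re.sub(r"^\d+\.+", "", token)   # strip "12." or "12..."
        if token and not token.startswith("$"):  # skip numeric annotations
            moves.append(token)
    return moves


moves = san_moves("1. f3 e5 2. g4 {a blunder} Qh4# 0-1")
```

The resulting move list can then be replayed with a chess library to obtain the position after each move.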
This dataset is shared under the Creative Commons Attribution 4.0 International License (CC BY 4.0). You are free to use, share, and adapt the data, provided you give appropriate credit to the creators.
If you use this dataset or have suggestions for improvements, please leave feedback or share your project in the Kaggle discussion forums.
General Info
This is a set of just over 20,000 games collected from a selection of users on the site Lichess.org, along with information on how to collect more. I will also upload more games in the future as I collect them. This set contains the:
Possible Uses
Lots of information is contained within a single chess game, let alone a full dataset of multiple games. Chess is primarily a game of patterns, and data science is all about detecting patterns in data, which is why chess has been one of the most heavily invested-in areas of AI in the past. This dataset collects all of the information available from 20,000 games and presents it in a format that is easy to process for analysis of, for example, what allows a player to win as black or white, how much meta (out-of-game) factors affect a game, and the relationship between openings and victory for black and white.
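As one example of such an analysis, game outcomes can be tallied per opening. The column names `opening_name` and `winner` are assumptions made for this sketch, since the column list is not reproduced above, and the inline rows are made-up placeholders rather than data from the set.

```python
import csv
import io
from collections import Counter, defaultdict
from typing import TextIO

# Made-up rows for illustration only; column names are assumed.
SAMPLE_CSV = """opening_name,winner
Sicilian Defense,white
Sicilian Defense,black
Sicilian Defense,white
Italian Game,draw
Italian Game,black
"""


def results_by_opening(stream: TextIO) -> dict[str, Counter]:
    """Count outcomes (white / black / draw) for each opening."""
    tallies: dict[str, Counter] = defaultdict(Counter)
    for row in csv.DictReader(stream):
        tallies[row["opening_name"]][row["winner"]] += 1
    return tallies


tallies = results_by_opening(io.StringIO(SAMPLE_CSV))
```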
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Digital Chess Pieces Images is a curated collection of chess piece images sourced from online platforms such as Chess.com and Lichess and from various books. It includes 616 files organized by piece type and color, covering bishops, kings, knights, pawns, queens, and rooks. It is useful for chess enthusiasts, AI training, and digital applications. ♟️📚
https://creativecommons.org/publicdomain/zero/1.0/
274,369,477 chess positions evaluated with Stockfish at various depths and node counts. Produced by, and for, the Lichess analysis board, running various flavours of Stockfish within user browsers. This version of the dataset is a de-normalized version of the original dataset and contains 683,913,654 rows.
This dataset is updated monthly, and was last updated on August 6th, 2025.
One row of the dataset as a Python dictionary:
{
    "fen": "2bq1rk1/pr3ppn/1p2p3/7P/2pP1B1P/2P5/PPQ2PB1/R3R1K1 w - -",
    "line": "g2e4 f7f5 e4b7 c8b7 f2f3 b7f3 e1e6 d8h4 c2h2 h4g4",
    "depth": 36,
    "knodes": 206765,
    "cp": 311,
    "mate": None
}
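A row like this can be unpacked with a few lines of Python. The sketch below assumes that exactly one of `cp` and `mate` is set on any row; the helper name `describe_eval` is illustrative and not part of the dataset tooling.

```python
row = {
    "fen": "2bq1rk1/pr3ppn/1p2p3/7P/2pP1B1P/2P5/PPQ2PB1/R3R1K1 w - -",
    "line": "g2e4 f7f5 e4b7 c8b7 f2f3 b7f3 e1e6 d8h4 c2h2 h4g4",
    "depth": 36,
    "knodes": 206765,
    "cp": 311,
    "mate": None,
}


def describe_eval(row: dict) -> str:
    """Render the evaluation as pawns or as a mate distance."""
    if row["mate"] is not None:
        return f"mate in {abs(row['mate'])}"
    return f"{row['cp'] / 100:+.2f} pawns"  # centipawns -> pawns


pv_moves = row["line"].split()       # principal variation as UCI moves
nodes_searched = row["knodes"] * 1000  # kilo-nodes -> nodes
summary = describe_eval(row)
```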
Every row of the dataset contains the following fields:
- fen: string, the position FEN; it only contains pieces, active color, castling rights, and en passant square.
- line: string, the principal variation, in UCI format.
- depth: int, the depth reached by the engine.
- knodes: int, the number of kilo-nodes searched by the engine.
- cp: int, the position's centipawn evaluation. This is None if mate is certain.
- mate: int, the position's mate evaluation. This is None if mate is not certain.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The dataset "Artificial Intelligence Systems & Applications" contains metadata about popular AI chatbots, Large Language Model (LLM) systems, chess engines and apps, such as:
The features (columns) include:
File format: .xlsx
The dataset has been prepared by Farial Mahmod Tishan and the source of data has been added to the "Provenance" section.