This data set contains 30 million chess positions along with a label that indicates whether the position is not check (0), check (1) or checkmate (2). In addition, we provide 3 reference explanations per data point consisting of 8×8 bit masks that highlight certain squares that are relevant for the decision. For each class, we identified one explanation type that characterizes it most accurately:
- No check (0): All squares that are controlled by the enemy player, i.e., all squares that can be reached or captured on by any enemy piece.
- Check (1): All squares (origin or target) of legal moves. As a checkmate is a check where the player under attack has no more legal moves, highlighting legal moves is sufficient to disprove a checkmate.
- Checkmate (2): All squares with pieces that are essential for creating the checkmate. This includes attackers, friendly pieces blocking the King, enemy pieces guarding escape squares and enemy pieces protecting attackers.
The data is saved as a CSV file containing the chess positions in Forsyth–Edwards Notation (FEN) and the label (0-2) as columns.
The FEN string can be read by most chess software packages and encodes the current piece setup, whose turn it is and some more game-specific information (castling rights, en-passant squares).
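As a rough illustration of the layout described above, the sketch below reads a tiny in-memory CSV and splits each FEN string into its six standard fields. The column names `fen` and `label`, the sample rows, and the helper `split_fen` are assumptions made for this example; the actual file may use different headers.

```python
import csv
import io

# Names for the six space-separated fields of a standard FEN string.
FEN_FIELDS = ("placement", "turn", "castling", "en_passant", "halfmove", "fullmove")


def split_fen(fen):
    """Split a FEN string into a dict of its six fields."""
    parts = fen.split()
    if len(parts) != 6:
        raise ValueError(f"expected 6 FEN fields, got {len(parts)}")
    return dict(zip(FEN_FIELDS, parts))


# Hypothetical sample: the header names are assumptions, not the real schema.
sample = io.StringIO(
    "fen,label\n"
    "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1,0\n"
    "rnb1kbnr/pppp1ppp/8/4p3/6Pq/5P2/PPPPP2P/RNBQKBNR w KQkq - 1 3,2\n"
)

# Each row becomes (parsed FEN fields, integer label in 0..2).
rows = [(split_fen(r["fen"]), int(r["label"])) for r in csv.DictReader(sample)]

for fields, label in rows:
    print(fields["turn"], fields["castling"], label)
```

The second sample row is the position after 1. f3 e5 2. g4 Qh4#, so its label is 2 (checkmate) with White to move.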
The explanations are saved as 64-bit unsigned integers, which can be converted to SquareSet objects from the chess library.
We provide code for converting between different data and explanation representations.
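To make the integer encoding concrete, here is a minimal standard-library sketch that decodes a 64-bit mask into square names and an 8×8 text grid. It assumes the bit order used by the chess library, where bit 0 is a1, bit 7 is h1 and bit 63 is h8; the helper names are hypothetical and this is not the conversion code shipped with the data set.

```python
FILES = "abcdefgh"


def mask_to_squares(mask):
    """Return the names of all squares whose bit is set (a1 = bit 0 assumed)."""
    return [FILES[i % 8] + str(i // 8 + 1) for i in range(64) if (mask >> i) & 1]


def squares_to_mask(names):
    """Inverse of mask_to_squares: build a 64-bit integer from square names."""
    mask = 0
    for name in names:
        file_index = FILES.index(name[0])
        rank_index = int(name[1]) - 1
        mask |= 1 << (rank_index * 8 + file_index)
    return mask


def mask_to_grid(mask):
    """Render the mask as 8 text rows, rank 8 first, as a board is usually drawn."""
    rows = []
    for rank in range(7, -1, -1):
        cells = ("1" if (mask >> (rank * 8 + f)) & 1 else "." for f in range(8))
        rows.append(" ".join(cells))
    return "\n".join(rows)


example = squares_to_mask(["e4", "h8"])
print(mask_to_squares(example))
print(mask_to_grid(example))
```

With the chess package installed, `chess.SquareSet(mask)` performs the same decoding and can be iterated over directly.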
Our data set is based on the Lichess open database, which contains records of over 3 billion games of chess played online by human players on the free chess website Lichess. To read and process the games and to create the explanations, we used the Python package chess. We selected only those games that end in checkmate, excluding those that end by timeout or resignation. We also skipped the first ten moves of each game, as they lead to many duplicate positions.
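The selection rules above can be approximated without a chess engine by looking at move suffixes in standard algebraic notation, where `+` marks a check and `#` marks a checkmate. The sketch below is a simplified stand-in for the authors' pipeline, which used the chess package. The function names are hypothetical, and treating "ten moves" as ten half-moves is an assumption that the `skip` parameter makes adjustable.

```python
def ends_in_checkmate(san_moves):
    """True if the final move of the game carries the SAN checkmate suffix '#'."""
    return bool(san_moves) and san_moves[-1].rstrip("!?").endswith("#")


def label_positions(san_moves, skip=10):
    """Return (ply, label) pairs for positions reached after the first `skip` plies.

    Labels follow the data set's scheme: 0 = no check, 1 = check, 2 = checkmate.
    """
    labels = []
    for ply, san in enumerate(san_moves, start=1):
        if ply <= skip:
            continue  # early positions repeat across many games
        move = san.rstrip("!?")  # drop annotation glyphs such as '!' or '?'
        if move.endswith("#"):
            label = 2
        elif move.endswith("+"):
            label = 1
        else:
            label = 0
        labels.append((ply, label))
    return labels


game = ["f3", "e5", "g4", "Qh4#"]
if ends_in_checkmate(game):
    print(label_positions(game, skip=2))
```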
This dataset was created by Marcos Garcia
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
This dataset contains 31,744 self-play chess games generated using Leela Chess Zero (Lc0) with specific configurations to explore unique gameplay dynamics. The games were generated using:
2.25
cuda-fp16
2.5
The games are stored in the PGN (Portable Game Notation) format, which is widely used for recording chess games. This dataset can serve as a resource for:
games-2.5s.pgn: The main dataset file containing all 31,744 games in PGN format.
This dataset can be used to fine-tune or evaluate chess engines by:
- Extracting positions for supervised learning.
- Analyzing endgames or middlegame strategies.
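As a starting point for working with the PGN file, the sketch below splits PGN text into games using only the standard library. It relies on the PGN convention that each game opens with bracketed tag pairs followed by movetext; the sample games and the helper `iter_games` are illustrative and not part of the dataset. For real position extraction, a full PGN parser such as the one in the chess package is the more robust choice.

```python
import re

# Matches a PGN tag pair such as: [Result "1-0"]
TAG_RE = re.compile(r'^\[(\w+) "(.*)"\]$')


def iter_games(lines):
    """Yield (tags, movetext) for each game in an iterable of PGN lines."""
    tags, moves = {}, []
    for raw in lines:
        line = raw.strip()
        match = TAG_RE.match(line)
        if match:
            if moves:  # a tag pair after movetext starts the next game
                yield tags, " ".join(moves)
                tags, moves = {}, []
            tags[match.group(1)] = match.group(2)
        elif line:
            moves.append(line)
    if tags or moves:
        yield tags, " ".join(moves)


# Hypothetical two-game sample in PGN layout.
sample = """[Event "Example"]
[Result "1-0"]

1. e4 e5 2. Qh5 Ke7 3. Qxe5# 1-0

[Event "Example"]
[Result "0-1"]

1. f3 e5 2. g4 Qh4# 0-1
"""

games = list(iter_games(sample.splitlines()))
print(len(games), games[1][0]["Result"])
```

To process the real file, pass an open file handle instead, for example `iter_games(open("games-2.5s.pgn", encoding="utf-8"))`, which streams the games without loading the whole file.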
This dataset is shared under the Creative Commons Attribution 4.0 International License (CC BY 4.0). You are free to use, share, and adapt the data, provided you give appropriate credit to the creators.
If you use this dataset or have suggestions for improvements, please leave feedback or share your project in the Kaggle discussion forums.
https://public.roboflow.ai/object-detection/chess-full
Provided by Roboflow License: Public Domain
This is a dataset of Chess board photos and various pieces. All photos were captured from a constant angle, a tripod to the left of the board. The bounding boxes of all pieces are annotated as follows: white-king, white-queen, white-bishop, white-knight, white-rook, white-pawn, black-king, black-queen, black-bishop, black-knight, black-rook, black-pawn. There are 2894 labels across 292 images.
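When detections from these twelve classes need to feed into chess software, one option is to map each class name to the piece letter used in FEN (uppercase for white, lowercase for black). The sketch below shows such a mapping; the helper `class_to_symbol` is hypothetical and the class order in `CLASS_NAMES` is an assumption, since the dataset's export files define their own class indices.

```python
# Standard FEN piece letters, keyed by the piece part of the class name.
PIECE_LETTERS = {
    "king": "k",
    "queen": "q",
    "bishop": "b",
    "knight": "n",
    "rook": "r",
    "pawn": "p",
}

# The twelve annotation classes, in the order they are listed in the description.
CLASS_NAMES = [
    f"{color}-{piece}"
    for color in ("white", "black")
    for piece in PIECE_LETTERS
]


def class_to_symbol(name):
    """Map a class name such as 'white-knight' to its FEN piece letter ('N')."""
    color, piece = name.split("-")
    letter = PIECE_LETTERS[piece]
    return letter.upper() if color == "white" else letter


for name in CLASS_NAMES:
    print(name, class_to_symbol(name))
```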
Chess Example (image): https://i.imgur.com/nkjobw1.png
Follow this tutorial to see an example of training an object detection model using this dataset or jump straight to the Colab notebook.
At Roboflow, we built a chess piece object detection model using this dataset.
ChessBoss (animation): https://blog.roboflow.ai/content/images/2020/01/chess-detection-longer.gif
You can see a video demo of that here. (We did struggle with pieces that were occluded; for example, at the very beginning of a game many pieces on the board are obscured. Let us know how your results fare!)
We're releasing the data free on a public license.
Roboflow makes managing, preprocessing, augmenting, and versioning datasets for computer vision seamless.
Developers reduce 50% of their boilerplate code when using Roboflow's workflow, save training time, and increase model reproducibility.
General Info
This is a set of just over 20,000 games collected from a selection of users on the site Lichess.org, together with information on how to collect more. I will also upload more games in the future as I collect them. This set contains the:
Possible Uses
A single chess game contains a lot of information, let alone a full dataset of multiple games. Chess is primarily a game of patterns, and data science is all about detecting patterns in data, which is why chess has historically been one of the most heavily invested-in areas of AI. This dataset collects all of the information available from 20,000 games and presents it in a format that is easy to process for analysis of, for example, what allows a player to win as black or white, how much meta (out-of-game) factors affect a game, the relationship between openings and victory for black and white, and more.