9 datasets found

Chess XAI Benchmark
kaggle.com
Updated Jul 25, 2022
Cite
smuecke (2022). Chess XAI Benchmark [Dataset]. https://www.kaggle.com/datasets/smuecke/chess-xai-benchmark
Explore at:
Croissant: a format for machine-learning datasets. Learn more at mlcommons.org/croissant.
Dataset updated
Jul 25, 2022
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
smuecke
Description
This data set contains 30 million chess positions along with a label that indicates if the position is not check (0), check (1) or checkmate (2). In addition, we provide 3 reference explanations per data point consisting of 8×8 bit masks that highlight certain squares that are relevant for the decision. For each class, we identified one explanation type that characterizes it most accurately:

- No check (0): All squares that are controlled by the enemy player, i.e., all squares that can be reached or captured on by any enemy piece.
- Check (1): All squares (origin or target) of legal moves. As a checkmate is a check where the player under attack has no more legal moves, highlighting legal moves is sufficient to disprove a checkmate.
- Checkmate (2): All squares with pieces that are essential for creating the checkmate. This includes attackers, friendly pieces blocking the King, enemy pieces guarding escape squares and enemy pieces protecting attackers.

The data is saved as a CSV file containing the chess positions in Forsyth–Edwards Notation (FEN) and the label (0-2) as columns. The FEN string can be read by most chess software packages and encodes the current piece setup, whose turn it is and some more game-specific information (castling rights, en-passant squares). The explanations are saved as 64-bit unsigned integers, which can be converted to SquareSet objects from the chess library. We provide code for converting between different data and explanation representations.
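As a minimal sketch of what the 64-bit explanation masks encode, the snippet below unpacks one integer into square names and an 8×8 grid using only the standard library. It assumes that bit i corresponds to square i in the a1 = 0, b1 = 1, ..., h8 = 63 ordering used by the chess package's SquareSet; the function names and the example mask are hypothetical and not taken from the dataset's own conversion code.

```python
def mask_to_squares(mask: int) -> list[str]:
    """Return algebraic names of the squares whose bits are set in a 64-bit mask."""
    if not 0 <= mask < 1 << 64:
        raise ValueError("mask must fit in an unsigned 64-bit integer")
    files = "abcdefgh"
    # Assumption: bit i <-> square i, with a1 = 0, b1 = 1, ..., h8 = 63.
    return [f"{files[i % 8]}{i // 8 + 1}" for i in range(64) if (mask >> i) & 1]


def mask_to_grid(mask: int) -> str:
    """Render the mask as 8 text rows, rank 8 first, '1' for a highlighted square."""
    rows = []
    for rank in range(7, -1, -1):
        cells = ("1" if (mask >> (rank * 8 + file)) & 1 else "." for file in range(8))
        rows.append(" ".join(cells))
    return "\n".join(rows)


# Made-up example mask highlighting e1 and e8.
example = (1 << 4) | (1 << 60)
print(mask_to_squares(example))
print(mask_to_grid(example))
```

With the chess package installed, `chess.SquareSet(mask)` should give the equivalent set of squares directly.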

Our data set is based on the Lichess open database, which contains records of over 3 billion games of chess played online by human players on the free chess website Lichess. To read and process the games and to create the explanations, we used the Python package chess. We selected only those games that end in checkmate, excluding those that end by timeout or resignation. We also skip the first ten moves, as they lead to many duplicate positions.
Ai Crowd Chess Challange Dataset
kaggle.com
Updated Aug 14, 2023
Cite
Marcos Garcia (2023). Ai Crowd Chess Challange Dataset [Dataset]. https://www.kaggle.com/marcosgarcia75/ai-crowd-chess-challange-dataset/discussion
Explore at:
Croissant: a format for machine-learning datasets. Learn more at mlcommons.org/croissant.
Dataset updated
Aug 14, 2023
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Marcos Garcia
Description
Dataset

This dataset was created by Marcos Garcia

Contents
test77-2021-nov-filtered
kaggle.com
Updated Feb 16, 2025
Cite
Arseniy Surkov (2025). test77-2021-nov-filtered [Dataset]. https://www.kaggle.com/datasets/rickonaut/test77-2021-nov-filtered
Explore at:
Croissant: a format for machine-learning datasets. Learn more at mlcommons.org/croissant.
Dataset updated
Feb 16, 2025
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Arseniy Surkov
Description
Approvers NNUE

This dataset is used by the Approvers team's NNUE in the FIDE & Google Efficient Chess AI Challenge.

Format

It uses Bulletformat, a binary format that stores chess positions in 32 bytes, which the Bullet trainer uses to train the NNUE.
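Because every position occupies a fixed 32 bytes, such a file can be walked record by record without knowing the internal field layout. The sketch below shows only that fixed-size iteration; it does not decode any fields, and the helper name and the zero-filled stand-in data are hypothetical.

```python
import io
from typing import BinaryIO, Iterator

RECORD_SIZE = 32  # bytes per position, as stated in the description


def iter_records(stream: BinaryIO, record_size: int = RECORD_SIZE) -> Iterator[bytes]:
    """Yield raw fixed-size records; raise if the stream ends mid-record."""
    while True:
        chunk = stream.read(record_size)
        if not chunk:
            return
        if len(chunk) != record_size:
            raise ValueError(f"trailing {len(chunk)} bytes do not form a full record")
        yield chunk


# Stand-in for a real file: three zero-filled 32-byte records.
fake_file = io.BytesIO(bytes(3 * RECORD_SIZE))
print(sum(1 for _ in iter_records(fake_file)))
```

For a file on disk, the same count should equal the file size divided by 32, which is a quick integrity check before training.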
Leela Chess Zero Self-Play Chess Games Dataset 3
kaggle.com
Updated Dec 16, 2024
+ more versions
Cite
AnthonyTherrien (2024). Leela Chess Zero Self-Play Chess Games Dataset 3 [Dataset]. http://doi.org/10.34740/kaggle/dsv/10217785
Explore at:
Croissant: a format for machine-learning datasets. Learn more at mlcommons.org/croissant.
Unique identifier
https://doi.org/10.34740/kaggle/dsv/10217785
Dataset updated
Dec 16, 2024
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
AnthonyTherrien
License
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Description
Dataset Overview

This dataset contains 31,744 self-play chess games generated using Leela Chess Zero (Lc0) with specific configurations to explore unique gameplay dynamics. The games were generated using:

Policy Temperature: 2.25

Backend: cuda-fp16

Time Control: 2.5

The games are stored in the PGN (Portable Game Notation) format, which is widely used for recording chess games. This dataset can serve as a resource for:

Training and evaluating chess engines

Analyzing chess strategies and opening theory

Conducting AI and machine learning experiments in chess

Files

games-2.5s.pgn: The main dataset file containing all 31,744 games in PGN format.
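As a rough sketch of how the header metadata in a PGN file like this can be read, the snippet below collects the tag pairs of each game with a regular expression from the standard library. It is not a full PGN parser, the function name is hypothetical, and the two sample games are invented for illustration rather than taken from games-2.5s.pgn.

```python
import io
import re
from typing import Iterator, TextIO

# A PGN tag pair looks like: [Name "value"]
TAG_RE = re.compile(r'^\[(\w+)\s+"(.*)"\]\s*$')


def iter_headers(stream: TextIO) -> Iterator[dict[str, str]]:
    """Yield one dict of tag pairs per game, ignoring the movetext."""
    headers: dict[str, str] = {}
    in_tags = False
    for line in stream:
        match = TAG_RE.match(line)
        if match:
            # A tag line after movetext marks the start of the next game.
            if not in_tags and headers:
                yield headers
                headers = {}
            in_tags = True
            headers[match.group(1)] = match.group(2)
        else:
            in_tags = False
    if headers:
        yield headers


# Invented sample, not real dataset content.
SAMPLE = '''[Event "Self-play"]
[Result "1-0"]

1. e4 e5 2. Qh5 Ke7 3. Qxe5# 1-0

[Event "Self-play"]
[Result "1/2-1/2"]

1. Nf3 Nf6 1/2-1/2
'''

games = list(iter_headers(io.StringIO(SAMPLE)))
print(len(games), [g["Result"] for g in games])
```

For move-level work (replaying positions, validating legality), a dedicated PGN library is the safer choice; this sketch only touches the headers.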

Dataset Features

Game Format: Each game is recorded in PGN format, including headers for metadata such as player names, result, and opening.

Generated by AI: All games were played by the AI model Leela Chess Zero using the following configuration:

Policy Temperature: 2.25

Backend: cuda-fp16

Time Control: 2.5

Diverse Playstyles: The higher policy temperature introduces greater diversity in the decision-making process, providing a wide range of strategies and outcomes.

High-Quality Self-Play: Games generated by Lc0 are renowned for their depth of calculation and understanding of positional play.
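To illustrate why a temperature above 1 increases diversity, the sketch below applies generic temperature scaling to a move-probability vector: each probability is raised to the power 1/T and the result is re-normalised. This is the textbook technique, shown under the assumption that it is a reasonable stand-in; the source does not describe how Lc0 applies its policy temperature internally, and the function name and prior values are made up.

```python
import math


def apply_temperature(priors: list[float], temperature: float) -> list[float]:
    """Re-normalise a probability vector after raising each entry to 1/temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Work in log space for stability; zero-probability moves stay at zero.
    logs = [math.log(p) / temperature if p > 0 else -math.inf for p in priors]
    peak = max(logs)
    if peak == -math.inf:
        raise ValueError("at least one prior must be positive")
    weights = [math.exp(x - peak) for x in logs]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical priors for three candidate moves.
priors = [0.7, 0.2, 0.1]
print(apply_temperature(priors, 2.25))  # flatter than the input
print(apply_temperature(priors, 1.0))   # unchanged up to rounding
```

With T = 2.25 the top move loses probability mass to the alternatives, which is the mechanism behind the more varied games the description mentions.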

Usage

Chess Engine Training and Evaluation

This dataset can be used to fine-tune or evaluate chess engines by:

- Extracting positions for supervised learning.
- Analyzing endgames or middlegame strategies.

AI Research

Investigate the effects of a high policy temperature on gameplay.

Explore the diversity and quality of play in self-play scenarios.

Chess Strategy Analysis

Study opening trends and preferences.

Analyze game outcomes under the given configurations.

License

This dataset is shared under the Creative Commons Attribution 4.0 International License (CC BY 4.0). You are free to use, share, and adapt the data, provided you give appropriate credit to the creators.

Acknowledgments

Leela Chess Zero: A groundbreaking open-source chess engine that leverages neural networks for unparalleled chess understanding.

CUDA: The GPU acceleration technology that enabled efficient game generation.

Feedback

If you use this dataset or have suggestions for improvements, please leave feedback or share your project in the Kaggle discussion forums.
Leela Chess Zero Self-Play Chess Games Dataset 6
kaggle.com
Updated Dec 22, 2024
Cite
AnthonyTherrien (2024). Leela Chess Zero Self-Play Chess Games Dataset 6 [Dataset]. http://doi.org/10.34740/kaggle/dsv/10273295
Explore at:
Croissant: a format for machine-learning datasets. Learn more at mlcommons.org/croissant.
Unique identifier
https://doi.org/10.34740/kaggle/dsv/10273295
Dataset updated
Dec 22, 2024
Dataset provided by
Kaggle
Authors
AnthonyTherrien
License
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Description
Dataset Overview

This dataset contains 197,218 self-play chess games generated using Leela Chess Zero (Lc0) with specific configurations to explore unique gameplay dynamics. The games were generated using:

Policy Temperature: 2.25

Backend: cuda-fp16

Time Control: 2.225

The games are stored in the PGN (Portable Game Notation) format, which is widely used for recording chess games. This dataset can serve as a resource for:

Training and evaluating chess engines

Analyzing chess strategies and opening theory

Conducting AI and machine learning experiments in chess

Files

games-2.225s.pgn: The main dataset file containing all 197,218 games in PGN format.

Dataset Features

Game Format: Each game is recorded in PGN format, including headers for metadata such as player names, result, and opening.

Generated by AI: All games were played by the AI model Leela Chess Zero using the following configuration:

Policy Temperature: 2.25

Backend: cuda-fp16

Time Control: 2.225

Diverse Playstyles: The higher policy temperature introduces greater diversity in the decision-making process, providing a wide range of strategies and outcomes.

High-Quality Self-Play: Games generated by Lc0 are renowned for their depth of calculation and understanding of positional play.

Usage

Chess Engine Training and Evaluation

This dataset can be used to fine-tune or evaluate chess engines by:

- Extracting positions for supervised learning.
- Analyzing endgames or middlegame strategies.

AI Research

Investigate the effects of a high policy temperature on gameplay.

Explore the diversity and quality of play in self-play scenarios.

Chess Strategy Analysis

Study opening trends and preferences.

Analyze game outcomes under the given configurations.
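One simple way to start analyzing outcomes is to tally the Result tag across all games. The sketch below does this with the standard library only; it assumes each game carries a standard PGN Result tag, and the function name and sample text are hypothetical rather than taken from the dataset file.

```python
import io
import re
from collections import Counter
from typing import TextIO

RESULT_RE = re.compile(r'^\[Result\s+"([^"]+)"\]')


def tally_results(stream: TextIO) -> Counter:
    """Count games per value of the PGN Result tag."""
    counts: Counter = Counter()
    for line in stream:
        match = RESULT_RE.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Invented sample with three games, not real dataset content.
SAMPLE = '''[Result "1-0"]

1. e4 e5 1-0

[Result "0-1"]

1. d4 d5 0-1

[Result "1-0"]

1. c4 c5 1-0
'''

print(tally_results(io.StringIO(SAMPLE)))
```

Reading line by line keeps memory use flat, which matters for a file with close to two hundred thousand games.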

License

This dataset is shared under the Creative Commons Attribution 4.0 International License (CC BY 4.0). You are free to use, share, and adapt the data, provided you give appropriate credit to the creators.

Acknowledgments

Leela Chess Zero: A groundbreaking open-source chess engine that leverages neural networks for unparalleled chess understanding.

CUDA: The GPU acceleration technology that enabled efficient game generation.

Feedback

If you use this dataset or have suggestions for improvements, please leave feedback or share your project in the Kaggle discussion forums.
Chess Game Dataset
kaggle.com
zip
Updated Sep 19, 2020
+ more versions
Cite
Praveen Kumar (2020). Chess Game Dataset [Dataset]. https://www.kaggle.com/penchalaiah123/chess-game-dataset
Explore at:
zip (2903760 bytes)
Dataset updated
Sep 19, 2020
Authors
Praveen Kumar
Description
General Info

This is a set of just over 20,000 games collected from a selection of users on the site Lichess.org, together with a description of how to collect more. I will also upload more games in the future as I collect them. This set contains the:

Game ID;

Rated (T/F);

Start Time;

End Time;

Number of Turns;

Game Status;

Winner;

Time Increment;

White Player ID;

White Player Rating;

Black Player ID;

Black Player Rating;

All Moves in Standard Chess Notation;

Opening Eco (Standardised Code for any given opening);

Opening Name;

Opening Ply (Number of moves in the opening phase)

for each of these separate games from Lichess.

I collected this data using the Lichess API, which enables collection of any given user's game history. The difficult part was collecting usernames to use; however, the API also enables dumping of all users in a Lichess team. There are several teams on Lichess with over 1,500 players, so this proved an effective way to get users to collect games.

Possible Uses

Lots of information is contained within a single chess game, let alone a full dataset of multiple games. It is primarily a game of patterns, and data science is all about detecting patterns in data, which is why chess has been one of the most heavily invested-in areas of AI in the past. This dataset collects all of the information available from 20,000 games and presents it in a format that is easy to process for analysis of, for example, what allows a player to win as black or white, how much meta (out-of-game) factors affect a game, the relationship between openings and victory for black and white and more.
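As a small sketch of the kind of analysis described above, the snippet below computes the share of games won by each side from CSV rows. The column names (`winner`, `rated`, and so on) and the four sample rows are assumptions made for illustration; check the actual file's header before reusing them.

```python
import csv
import io
from collections import Counter
from typing import Iterable, Mapping

# Invented rows with hypothetical column names, not real dataset content.
SAMPLE_CSV = """id,rated,turns,winner,white_rating,black_rating
g1,True,13,white,1500,1191
g2,True,16,black,1322,1261
g3,False,61,white,1496,1500
g4,True,95,draw,1439,1454
"""


def winner_shares(rows: Iterable[Mapping[str, str]]) -> dict[str, float]:
    """Return the fraction of games per value of the 'winner' column."""
    counts = Counter(row["winner"] for row in rows)
    total = sum(counts.values())
    return {key: n / total for key, n in counts.items()} if total else {}


shares = winner_shares(csv.DictReader(io.StringIO(SAMPLE_CSV)))
print(shares)
```

The same pattern extends to grouping by opening code or rating difference by changing the key passed to the counter.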
Digital Chess Pieces Images
kaggle.com
Updated May 8, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Shahir Habib (2025). Digital Chess Pieces Images [Dataset]. https://www.kaggle.com/datasets/shahirhabib/digital-chess-pieces-images/versions/2
Explore at:
Croissant: a format for machine-learning datasets. Learn more at mlcommons.org/croissant.
Dataset updated
May 8, 2025
Dataset provided by
Kaggle
Authors
Shahir Habib
License
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Description
The Digital Chess Pieces Images dataset is a curated collection of chess piece images sourced from online platforms like Chess.com, Lichess, and various books. It includes 616 files organized by piece type and color, covering bishops, kings, knights, pawns, queens, and rooks. Useful for chess enthusiasts, AI training, and digital applications. ♟️📚
Chess Position Evaluations
kaggle.com
Updated Aug 6, 2025
+ more versions
Cite
Lichess (2025). Chess Position Evaluations [Dataset]. https://www.kaggle.com/datasets/lichess/chess-evaluations
Explore at:
Croissant: a format for machine-learning datasets. Learn more at mlcommons.org/croissant.
Dataset updated
Aug 6, 2025
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Lichess
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
Dataset Card for the Lichess Evaluations dataset

Dataset Description

274,369,477 chess positions evaluated with Stockfish at various depths and node counts. Produced by, and for, the Lichess analysis board, running various flavours of Stockfish within user browsers. This version of the dataset is a de-normalized version of the original dataset and contains 683,913,654 rows.

This dataset is updated monthly, and was last updated on August 6th, 2025.

One row of the dataset as a Python dictionary:

{ "fen": "2bq1rk1/pr3ppn/1p2p3/7P/2pP1B1P/2P5/PPQ2PB1/R3R1K1 w - -", "line": "g2e4 f7f5 e4b7 c8b7 f2f3 b7f3 e1e6 d8h4 c2h2 h4g4", "depth": 36, "knodes": 206765, "cp": 311, "mate": None }

Dataset Fields

Every row of the dataset contains the following fields:

fen: string, the position FEN only contains pieces, active color, castling rights, and en passant square.

line: string, the principal variation, in UCI format.

depth: int, the depth reached by the engine.

knodes: int, the number of kilo-nodes searched by the engine.

cp: int, the position's centipawn evaluation. This is None if mate is certain.

mate: int, the position's mate evaluation. This is None if mate is not certain.
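The field list above can be sketched as a typed record with two small helpers: one that formats the evaluation depending on whether `cp` or `mate` is set, and one that pads the four-field FEN with placeholder move counters for tools that expect all six FEN fields. The class and function names are hypothetical, and the placeholder counters 0 and 1 are an assumption, since the dataset does not store them.

```python
from typing import Optional, TypedDict


class EvalRow(TypedDict):
    fen: str
    line: str
    depth: int  # the example row shows an integer here
    knodes: int
    cp: Optional[int]
    mate: Optional[int]


def describe_score(row: EvalRow) -> str:
    """Return a short text form of the evaluation, e.g. '+3.11' or '#-4'."""
    if row["mate"] is not None:
        return f"#{row['mate']}"
    if row["cp"] is None:
        raise ValueError("row has neither cp nor mate")
    return f"{row['cp'] / 100:+.2f}"


def to_full_fen(partial_fen: str, halfmove: int = 0, fullmove: int = 1) -> str:
    """Append placeholder move counters (an assumption) to a four-field FEN."""
    if len(partial_fen.split()) != 4:
        raise ValueError("expected four space-separated FEN fields")
    return f"{partial_fen} {halfmove} {fullmove}"


# The example row from the dataset card.
row: EvalRow = {
    "fen": "2bq1rk1/pr3ppn/1p2p3/7P/2pP1B1P/2P5/PPQ2PB1/R3R1K1 w - -",
    "line": "g2e4 f7f5 e4b7 c8b7 f2f3 b7f3 e1e6 d8h4 c2h2 h4g4",
    "depth": 36,
    "knodes": 206765,
    "cp": 311,
    "mate": None,
}

print(describe_score(row))
print(to_full_fen(row["fen"]))
```

Keeping `cp` and `mate` as separate optional fields mirrors the dataset's rule that exactly one of them is set per row.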
Artificial Intelligence Systems & Applications
kaggle.com
Updated Jul 8, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Farial Mahmod Tishan (2024). Artificial Intelligence Systems & Applications [Dataset]. https://www.kaggle.com/datasets/farialmahmod/artificial-intelligence-systems-and-applications/code
Explore at:
Croissant: a format for machine-learning datasets. Learn more at mlcommons.org/croissant.
Dataset updated
Jul 8, 2024
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Farial Mahmod Tishan
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The dataset "Artificial Intelligence Systems & Applications" contains metadata about popular AI chatbots, Large Language Model (LLM) systems, chess engines and apps, such as:

IBM DeepBlue

IBM Watson

Wolfram|Alpha

Apple Siri

Amazon Alexa

Google Assistant

Bixby

Leela Chess Zero

ChatGPT

Microsoft Copilot

Gemini

YandexGPT

The features (columns) include:

Name

Year

Speciality

Developer

Platform(s)

Programming Language(s)

File format: .xlsx

The dataset has been prepared by Farial Mahmod Tishan and the source of data has been added to the "Provenance" section.

Chess XAI Benchmark

A Sanity Check for Trustworthy AI
