9 datasets found
  1. Chess XAI Benchmark

    • kaggle.com
    Updated Jul 25, 2022
    Cite
    smuecke (2022). Chess XAI Benchmark [Dataset]. https://www.kaggle.com/datasets/smuecke/chess-xai-benchmark
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    smuecke
    Description

    This data set contains 30 million chess positions along with a label that indicates if the position is not check (0), check (1) or checkmate (2). In addition, we provide 3 reference explanations per data point consisting of 8×8 bit masks that highlight certain squares that are relevant for the decision. For each class, we identified one explanation type that characterizes it most accurately:

    • No check (0): All squares that are controlled by the enemy player, i.e., all squares that can be reached or captured on by any enemy piece.
    • Check (1): All squares (origin or target) of legal moves. As a checkmate is a check where the player under attack has no more legal moves, highlighting legal moves is sufficient to disprove a checkmate.
    • Checkmate (2): All squares with pieces that are essential for creating the checkmate. This includes attackers, friendly pieces blocking the King, enemy pieces guarding escape squares and enemy pieces protecting attackers.

    The data is saved as a CSV file containing the chess positions in Forsyth–Edwards Notation (FEN) and the label (0-2) as columns. The FEN string can be read by most chess software packages and encodes the current piece setup, whose turn it is and some more game-specific information (castling rights, en-passant squares). The explanations are saved as 64-bit unsigned integers, which can be converted to SquareSet objects from the chess library. We provide code for converting between different data and explanation representations.
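As a rough illustration of the 64-bit explanation encoding, the sketch below unpacks an unsigned integer into square names and an 8×8 grid using only the Python standard library. It assumes the bit ordering used by the chess package's SquareSet (bit 0 = a1, bit 63 = h8); the function names `mask_to_squares` and `mask_to_grid` are hypothetical and are not the conversion code shipped with the data set.

```python
FILES = "abcdefgh"


def mask_to_squares(mask: int) -> list[str]:
    """Return the names of all squares whose bit is set in a 64-bit mask.

    Assumption: bit 0 is a1, bit 7 is h1, bit 63 is h8.
    """
    if not 0 <= mask < 2**64:
        raise ValueError("mask must fit in an unsigned 64-bit integer")
    names = []
    for index in range(64):
        if (mask >> index) & 1:
            file_char = FILES[index % 8]
            rank_char = str(index // 8 + 1)
            names.append(file_char + rank_char)
    return names


def mask_to_grid(mask: int) -> list[list[int]]:
    """Return an 8x8 grid of 0/1 values, rank 8 first (as a board is usually drawn)."""
    rows = []
    for rank in range(7, -1, -1):
        rows.append([(mask >> (rank * 8 + file)) & 1 for file in range(8)])
    return rows


if __name__ == "__main__":
    example = (1 << 4) | (1 << 60)  # e1 and e8 under the assumed ordering
    print(mask_to_squares(example))
    for row in mask_to_grid(example):
        print("".join(str(bit) for bit in row))
```

With the chess package installed, `chess.SquareSet(mask)` should give the same set of squares directly.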

    Our data set is based on the Lichess open database, which contains records of over 3 billion games of chess played online by human players on the free chess website Lichess. To read and process the games and to create the explanations, we used the Python package chess. We selected only those games that end in checkmate, excluding those that end by timeout or resignation. We also skip the first ten moves of each game, as they lead to many duplicate positions.

  2. Ai Crowd Chess Challange Dataset

    • kaggle.com
    Updated Aug 14, 2023
    Cite
    Marcos Garcia (2023). Ai Crowd Chess Challange Dataset [Dataset]. https://www.kaggle.com/marcosgarcia75/ai-crowd-chess-challange-dataset/discussion
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Marcos Garcia
    Description

    Dataset

    This dataset was created by Marcos Garcia


  3. test77-2021-nov-filtered

    • kaggle.com
    Updated Feb 16, 2025
    Cite
    Arseniy Surkov (2025). test77-2021-nov-filtered [Dataset]. https://www.kaggle.com/datasets/rickonaut/test77-2021-nov-filtered
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Arseniy Surkov
    Description

    Approvers NNUE

    This dataset is used by the Approvers team's NNUE in the FIDE & Google Efficient Chess AI Challenge.

    Format

    It uses Bulletformat, a binary format that stores each chess position in 32 bytes, which the Bullet trainer uses to train the NNUE.
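Because every record has a fixed size, a file in such a format can be split into positions without knowing the field layout. The sketch below does only that splitting and does not decode any fields, since the layout is defined by the Bullet project and is not described here. `RECORD_SIZE` is taken from the description above; `iter_records` is a hypothetical helper.

```python
from typing import Iterator

RECORD_SIZE = 32  # bytes per position, as stated in the dataset description


def iter_records(data: bytes, record_size: int = RECORD_SIZE) -> Iterator[bytes]:
    """Yield consecutive fixed-size records from a binary buffer."""
    if len(data) % record_size != 0:
        raise ValueError("buffer length is not a multiple of the record size")
    for offset in range(0, len(data), record_size):
        yield data[offset:offset + record_size]


if __name__ == "__main__":
    fake_file = bytes(96)  # stand-in for real file contents: three empty records
    print(sum(1 for _ in iter_records(fake_file)))
```

The same idea gives a quick sanity check on a download: the file size divided by 32 should be a whole number of positions.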

  4. Leela Chess Zero Self-Play Chess Games Dataset 3

    • kaggle.com
    Updated Dec 16, 2024
    Cite
    AnthonyTherrien (2024). Leela Chess Zero Self-Play Chess Games Dataset 3 [Dataset]. http://doi.org/10.34740/kaggle/dsv/10217785
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    AnthonyTherrien
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    Dataset Overview

    This dataset contains 31,744 self-play chess games generated using Leela Chess Zero (Lc0) with specific configurations to explore unique gameplay dynamics. The games were generated using:

    • Policy Temperature: 2.25
    • Backend: cuda-fp16
    • Time Control: 2.5

    The games are stored in the PGN (Portable Game Notation) format, which is widely used for recording chess games. This dataset can serve as a resource for:

    • Training and evaluating chess engines
    • Analyzing chess strategies and opening theory
    • Conducting AI and machine learning experiments in chess

    Files

    • games-2.5s.pgn: The main dataset file containing all 31,744 games in PGN format.

    Dataset Features

    1. Game Format: Each game is recorded in PGN format, including headers for metadata such as player names, result, and opening.
    2. Generated by AI: All games were played by the AI model Leela Chess Zero using the following configuration:
      • Policy Temperature: 2.25
      • Backend: cuda-fp16
      • Time Control: 2.5
    3. Diverse Playstyles: The higher policy temperature introduces greater diversity in the decision-making process, providing a wide range of strategies and outcomes.
    4. High-Quality Self-Play: Games generated by Lc0 are renowned for their depth of calculation and understanding of positional play.
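PGN headers are written as bracketed tag pairs such as [Result "1-0"], so simple summaries do not need a full parser. The following sketch tallies game results using only the Python standard library. The sample text is made up for illustration and `count_results` is a hypothetical helper; for move-level work, a dedicated PGN parser is the safer choice.

```python
import re
from collections import Counter

# A PGN tag pair looks like: [Result "1-0"]
TAG_PAIR = re.compile(r'^\[(\w+)\s+"(.*)"\]\s*$')


def count_results(pgn_text: str) -> Counter:
    """Count the values of the Result tag across all games in a PGN string."""
    results = Counter()
    for line in pgn_text.splitlines():
        match = TAG_PAIR.match(line.strip())
        if match and match.group(1) == "Result":
            results[match.group(2)] += 1
    return results


# Invented two-game sample, not taken from the dataset.
SAMPLE = '''[Event "Example"]
[Result "1-0"]

1. e4 e5 2. Qh5 Nc6 3. Bc4 Nf6 4. Qxf7# 1-0

[Event "Example"]
[Result "1/2-1/2"]

1. Nf3 Nf6 2. Ng1 Ng8 1/2-1/2
'''

if __name__ == "__main__":
    print(count_results(SAMPLE))
```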

    Usage

    Chess Engine Training and Evaluation

    This dataset can be used to fine-tune or evaluate chess engines by:

    • Extracting positions for supervised learning.
    • Analyzing endgames or middlegame strategies.

    AI Research

    • Investigate the effects of a high policy temperature on gameplay.
    • Explore the diversity and quality of play in self-play scenarios.
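To see why a higher temperature diversifies play, it helps to look at how temperature reshapes a probability distribution over moves. The sketch below is a generic illustration and not Lc0's actual implementation: it raises each probability to the power 1/T and renormalizes. The move probabilities are invented for the example, and `apply_temperature` is a hypothetical name.

```python
def apply_temperature(probs: list[float], temperature: float) -> list[float]:
    """Flatten (T > 1) or sharpen (T < 1) a distribution: p_i^(1/T), renormalized."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [p ** (1.0 / temperature) for p in probs]
    total = sum(scaled)
    return [s / total for s in scaled]


if __name__ == "__main__":
    policy = [0.80, 0.15, 0.05]  # invented move probabilities
    for t in (1.0, 2.25):
        print(t, [round(p, 3) for p in apply_temperature(policy, t)])
```

At T = 1 the distribution is unchanged; at larger T the favourite loses probability mass to the less likely moves, which is one plausible reading of the diversity described above.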

    Chess Strategy Analysis

    • Study opening trends and preferences.
    • Analyze game outcomes under the given configurations.

    License

    This dataset is shared under the Creative Commons Attribution 4.0 International License (CC BY 4.0). You are free to use, share, and adapt the data, provided you give appropriate credit to the creators.

    Acknowledgments

    • Leela Chess Zero: A groundbreaking open-source chess engine that leverages neural networks for unparalleled chess understanding.
    • CUDA: The GPU acceleration technology that enabled efficient game generation.

    Feedback

    If you use this dataset or have suggestions for improvements, please leave feedback or share your project in the Kaggle discussion forums.

  5. Leela Chess Zero Self-Play Chess Games Dataset 6

    • kaggle.com
    Updated Dec 22, 2024
    Cite
    AnthonyTherrien (2024). Leela Chess Zero Self-Play Chess Games Dataset 6 [Dataset]. http://doi.org/10.34740/kaggle/dsv/10273295
    Dataset provided by
    Kaggle
    Authors
    AnthonyTherrien
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    Dataset Overview

    This dataset contains 197,218 self-play chess games generated using Leela Chess Zero (Lc0) with specific configurations to explore unique gameplay dynamics. The games were generated using:

    • Policy Temperature: 2.25
    • Backend: cuda-fp16
    • Time Control: 2.225

    The games are stored in the PGN (Portable Game Notation) format, which is widely used for recording chess games. This dataset can serve as a resource for:

    • Training and evaluating chess engines
    • Analyzing chess strategies and opening theory
    • Conducting AI and machine learning experiments in chess

    Files

    • games-2.225s.pgn: The main dataset file containing all 197,218 games in PGN format.

    Dataset Features

    1. Game Format: Each game is recorded in PGN format, including headers for metadata such as player names, result, and opening.
    2. Generated by AI: All games were played by the AI model Leela Chess Zero using the following configuration:
      • Policy Temperature: 2.25
      • Backend: cuda-fp16
      • Time Control: 2.225
    3. Diverse Playstyles: The higher policy temperature introduces greater diversity in the decision-making process, providing a wide range of strategies and outcomes.
    4. High-Quality Self-Play: Games generated by Lc0 are renowned for their depth of calculation and understanding of positional play.

    Usage

    Chess Engine Training and Evaluation

    This dataset can be used to fine-tune or evaluate chess engines by:

    • Extracting positions for supervised learning.
    • Analyzing endgames or middlegame strategies.

    AI Research

    • Investigate the effects of a high policy temperature on gameplay.
    • Explore the diversity and quality of play in self-play scenarios.

    Chess Strategy Analysis

    • Study opening trends and preferences.
    • Analyze game outcomes under the given configurations.

    License

    This dataset is shared under the Creative Commons Attribution 4.0 International License (CC BY 4.0). You are free to use, share, and adapt the data, provided you give appropriate credit to the creators.

    Acknowledgments

    • Leela Chess Zero: A groundbreaking open-source chess engine that leverages neural networks for unparalleled chess understanding.
    • CUDA: The GPU acceleration technology that enabled efficient game generation.

    Feedback

    If you use this dataset or have suggestions for improvements, please leave feedback or share your project in the Kaggle discussion forums.

  6. Chess Game Dataset

    • kaggle.com
    zip
    Updated Sep 19, 2020
    Cite
    Praveen Kumar (2020). Chess Game Dataset [Dataset]. https://www.kaggle.com/penchalaiah123/chess-game-dataset
    Available download formats: zip (2903760 bytes)
    Authors
    Praveen Kumar
    Description

    General Info

    This is a set of just over 20,000 games collected from a selection of users on the site Lichess.org, along with a description of how to collect more. I will also upload more games in the future as I collect them. For each game, this set contains the:

    • Game ID;
    • Rated (T/F);
    • Start Time;
    • End Time;
    • Number of Turns;
    • Game Status;
    • Winner;
    • Time Increment;
    • White Player ID;
    • White Player Rating;
    • Black Player ID;
    • Black Player Rating;
    • All Moves in Standard Chess Notation;
    • Opening Eco (Standardised Code for any given opening);
    • Opening Name;
    • Opening Ply (Number of moves in the opening phase).

    I collected this data using the Lichess API, which enables collection of any given user's game history. The difficult part was collecting usernames to use; however, the API also enables dumping of all users in a Lichess team. There are several teams on Lichess with over 1,500 players, so this proved an effective way to get users to collect games.

    Possible Uses

    Lots of information is contained within a single chess game, let alone a full dataset of multiple games. It is primarily a game of patterns, and data science is all about detecting patterns in data, which is why chess has historically been one of the most heavily invested-in areas of AI. This dataset collects all of the information available from 20,000 games and presents it in a format that is easy to process for analysis of, for example, what allows a player to win as black or white, how much meta (out-of-game) factors affect a game, the relationship between openings and victory for black and white, and more.
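One of the analyses mentioned above, the relationship between openings and the winner, can be sketched with the Python standard library alone. The column names `winner` and `opening_name` and the sample rows are assumptions made for illustration; check them against the header of the real CSV file before use. `win_rates_by_opening` is a hypothetical helper.

```python
import csv
import io
from collections import defaultdict

# Invented sample rows; the column names "winner" and "opening_name" are assumptions.
SAMPLE_CSV = """winner,opening_name
white,Sicilian Defense
black,Sicilian Defense
white,Sicilian Defense
draw,Queen's Gambit
white,Queen's Gambit
"""


def win_rates_by_opening(csv_text: str) -> dict[str, dict[str, float]]:
    """Return, per opening, the share of games for each recorded outcome."""
    counts = defaultdict(lambda: defaultdict(int))
    for row in csv.DictReader(io.StringIO(csv_text)):
        counts[row["opening_name"]][row["winner"]] += 1
    rates = {}
    for opening, outcomes in counts.items():
        total = sum(outcomes.values())
        rates[opening] = {outcome: n / total for outcome, n in outcomes.items()}
    return rates


if __name__ == "__main__":
    for opening, rates in win_rates_by_opening(SAMPLE_CSV).items():
        print(opening, rates)
```

On the real file, openings with only a handful of games will produce noisy rates, so a minimum game count per opening is worth adding.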

  7. Digital Chess Pieces Images

    • kaggle.com
    Updated May 8, 2025
    Cite
    Shahir Habib (2025). Digital Chess Pieces Images [Dataset]. https://www.kaggle.com/datasets/shahirhabib/digital-chess-pieces-images/versions/2
    Dataset provided by
    Kaggle
    Authors
    Shahir Habib
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Digital Chess Pieces Images is a curated collection of chess piece images sourced from online platforms like Chess.com and Lichess, and from various books. It includes 616 files organized by piece type and color, covering bishops, kings, knights, pawns, queens, and rooks. It is useful for chess enthusiasts, AI training, and digital applications. ♟️📚

  8. Chess Position Evaluations

    • kaggle.com
    Updated Aug 6, 2025
    Cite
    Lichess (2025). Chess Position Evaluations [Dataset]. https://www.kaggle.com/datasets/lichess/chess-evaluations
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Lichess
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Dataset Card for the Lichess Evaluations dataset

    Dataset Description

    274,369,477 chess positions evaluated with Stockfish at various depths and node counts. Produced by, and for, the Lichess analysis board, running various flavours of Stockfish within user browsers. This version of the dataset is a de-normalized version of the original dataset and contains 683,913,654 rows.

    This dataset is updated monthly, and was last updated on August 6th, 2025.

    One row of the dataset as a Python dictionary:

    {
     "fen": "2bq1rk1/pr3ppn/1p2p3/7P/2pP1B1P/2P5/PPQ2PB1/R3R1K1 w - -",
     "line": "g2e4 f7f5 e4b7 c8b7 f2f3 b7f3 e1e6 d8h4 c2h2 h4g4",
     "depth": 36,
     "knodes": 206765,
     "cp": 311,
     "mate": None
    }
    

    Dataset Fields

    Every row of the dataset contains the following fields:

    • fen: string, the position's FEN, which contains only pieces, active color, castling rights, and en passant square.
    • line: string, the principal variation, in UCI format.
    • depth: int, the depth reached by the engine.
    • knodes: int, the number of kilo-nodes searched by the engine.
    • cp: int, the position's centipawn evaluation. This is None if mate is certain.
    • mate: int, the position's mate evaluation. This is None if mate is not certain.
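A minimal sketch of reading one such row, using only the Python standard library and the example row shown above. It splits the four-field FEN, splits the principal variation into UCI moves, and formats whichever of `cp` or `mate` is present. The helper names `format_evaluation` and `summarize_row` are hypothetical, and the sketch deliberately makes no claim about which side a positive score favours.

```python
from typing import Optional


def format_evaluation(cp: Optional[int], mate: Optional[int]) -> str:
    """Format a row's evaluation: '#N' for a mate score, pawns with sign otherwise."""
    if mate is not None:
        return f"#{mate}"
    if cp is not None:
        return f"{cp / 100:+.2f}"
    raise ValueError("row has neither cp nor mate")


def summarize_row(row: dict) -> dict:
    """Extract a few convenient values from one dataset row."""
    # The FEN here has four fields (no move counters), as described above.
    placement, active_color, castling, en_passant = row["fen"].split()
    moves = row["line"].split()
    return {
        "side_to_move": "white" if active_color == "w" else "black",
        "first_move": moves[0],
        "pv_length": len(moves),
        "evaluation": format_evaluation(row["cp"], row["mate"]),
    }


# The example row from the dataset card.
EXAMPLE_ROW = {
    "fen": "2bq1rk1/pr3ppn/1p2p3/7P/2pP1B1P/2P5/PPQ2PB1/R3R1K1 w - -",
    "line": "g2e4 f7f5 e4b7 c8b7 f2f3 b7f3 e1e6 d8h4 c2h2 h4g4",
    "depth": 36,
    "knodes": 206765,
    "cp": 311,
    "mate": None,
}

if __name__ == "__main__":
    print(summarize_row(EXAMPLE_ROW))
```

Chess libraries that expect a six-field FEN may need the halfmove and fullmove counters appended (for example `0 1`) before they accept these strings.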
  9. Artificial Intelligence Systems & Applications

    • kaggle.com
    Updated Jul 8, 2024
    Cite
    Farial Mahmod Tishan (2024). Artificial Intelligence Systems & Applications [Dataset]. https://www.kaggle.com/datasets/farialmahmod/artificial-intelligence-systems-and-applications/code
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Farial Mahmod Tishan
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The dataset "Artificial Intelligence Systems & Applications" contains metadata about popular AI chatbots, Large Language Model (LLM) systems, chess engines, and apps, such as:

    1. IBM DeepBlue
    2. IBM Watson
    3. Wolfram|Alpha
    4. Apple Siri
    5. Amazon Alexa
    6. Google Assistant
    7. Bixby
    8. Leela Chess Zero
    9. ChatGPT
    10. Microsoft Copilot
    11. Gemini
    12. YandexGPT

    The features (columns) include:

    1. Name
    2. Year
    3. Speciality
    4. Developer
    5. Platform(s)
    6. Programming Language(s)

    File format: .xlsx

    The dataset has been prepared by Farial Mahmod Tishan and the source of data has been added to the "Provenance" section.

Chess XAI Benchmark

A Sanity Check for Trustworthy AI
