http://xmm.vilspa.esa.es/cgidoc/ccb/NRCOread?NRC52. Dataset provided by the ESDC. Please refer to the datasets landing page at http://esdcdoi.esac.esa.int/doi/html/data/astronomy/xmmnewton/031099.html
This data set contains 30 million chess positions along with a label that indicates if the position is not check (0), check (1) or checkmate (2). In addition, we provide 3 reference explanations per data point consisting of 8×8 bit masks that highlight certain squares that are relevant for the decision. For each class, we identified one explanation type that characterizes it most accurately:
- No check (0): All squares that are controlled by the enemy player, i.e., all squares that can be reached or captured on by any enemy piece.
- Check (1): All squares (origin or target) of legal moves. As a checkmate is a check where the player under attack has no more legal moves, highlighting legal moves is sufficient to disprove a checkmate.
- Checkmate (2): All squares with pieces that are essential for creating the checkmate. This includes attackers, friendly pieces blocking the King, enemy pieces guarding escape squares and enemy pieces protecting attackers.
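As an illustration of the "Check (1)" explanation type, the sketch below collects the origin and target squares of all legal moves into a 64-bit mask using the Python package chess. The function name `legal_move_mask` is hypothetical and not part of the data set's own code; the sketch assumes the mask uses the same square numbering as the chess package (a1 is bit 0, h8 is bit 63).

```python
import chess


def legal_move_mask(board: chess.Board) -> int:
    """Return a 64-bit mask of all origin and target squares of legal moves.

    Hypothetical helper: assumes bit i corresponds to square i in the
    numbering of the chess package (a1 = 0, h8 = 63).
    """
    squares = chess.SquareSet()
    for move in board.legal_moves:
        squares.add(move.from_square)
        squares.add(move.to_square)
    return int(squares)


# In the starting position White can move the eight pawns and two knights.
start_mask = legal_move_mask(chess.Board())
print(chess.SquareSet(start_mask))
```

In a checkmate position the side to move has no legal moves, so this mask would be 0 there, which matches the reasoning that a non-empty set of legal moves disproves checkmate.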
The data is saved as a CSV file containing the chess positions in Forsyth–Edwards Notation (FEN) and the label (0-2) as columns.
The FEN string can be read by most chess software packages and encodes the current piece setup, whose turn it is and some more game-specific information (castling rights, en-passant squares).
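A minimal sketch of how a FEN string from the CSV could be loaded and checked against its label, using the Python package chess. The function name `label_position` and the example FEN strings are illustrative and not taken from the data set; the sketch assumes the label refers to the side whose turn it is.

```python
import chess


def label_position(fen: str) -> int:
    """Return 0 (no check), 1 (check) or 2 (checkmate) for the side to move.

    Hypothetical helper: assumes the label describes the player whose turn
    it is according to the FEN string.
    """
    board = chess.Board(fen)
    if board.is_checkmate():
        return 2
    if board.is_check():
        return 1
    return 0


# Illustrative positions (not taken from the data set).
NO_CHECK = chess.STARTING_FEN
CHECK = "rnbqkbnr/ppppp1pp/8/5p1Q/4P3/8/PPPP1PPP/RNB1KBNR b KQkq - 1 2"
CHECKMATE = "rnb1kbnr/pppp1ppp/8/4p3/6Pq/5P2/PPPPP2P/RNBQKBNR w KQkq - 1 3"

for fen in (NO_CHECK, CHECK, CHECKMATE):
    print(label_position(fen), fen)
```

Checkmate is tested first because every checkmate position is also a check.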
The explanations are saved as 64-bit unsigned integers, which can be converted to SquareSet objects from the chess library.
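To show how a 64-bit integer relates to an 8×8 bit mask, here is a small sketch in plain Python. It assumes the bit layout of the chess package, where bit 0 is a1 and bit 63 is h8; the function names `mask_to_grid` and `grid_to_mask` are hypothetical and are not the conversion code shipped with the data set. With the chess package installed, `chess.SquareSet(mask)` gives the corresponding SquareSet object.

```python
from typing import List


def mask_to_grid(mask: int) -> List[List[int]]:
    """Convert a 64-bit integer into an 8x8 grid of 0/1 values.

    Rows run from rank 8 down to rank 1, columns from file a to file h.
    Assumes bit i encodes square i with a1 = 0 and h8 = 63.
    """
    if not 0 <= mask < 2**64:
        raise ValueError("mask must fit into 64 unsigned bits")
    return [
        [(mask >> (rank * 8 + file)) & 1 for file in range(8)]
        for rank in range(7, -1, -1)
    ]


def grid_to_mask(grid: List[List[int]]) -> int:
    """Inverse of mask_to_grid: pack an 8x8 grid back into an integer."""
    mask = 0
    for row_index, row in enumerate(grid):
        rank = 7 - row_index
        for file, bit in enumerate(row):
            if bit:
                mask |= 1 << (rank * 8 + file)
    return mask


# Example: highlight only the e4 square (file index 4, rank index 3).
e4_mask = 1 << (3 * 8 + 4)
for row in mask_to_grid(e4_mask):
    print(row)
```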
We provide code for converting between different data and explanation representations.
Our data set is based on the Lichess open database, which contains records of over 3 billion games of chess played online by human players on the free chess website Lichess. To read and process the games and to create the explanations, we used the Python package chess. We selected only those games that end in checkmate, excluding those that end by timeout or resignation. We also skipped the first ten moves of each game, as they lead to many duplicate positions.
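The selection step described above could look roughly like the following sketch, which reads one game in PGN format with the Python package chess, keeps it only if the final position is checkmate, and skips the opening. The function name `positions_from_game` and the parameter `skip_plies` are hypothetical. Interpreting "the first ten moves" as ten full moves (20 half-moves) is an assumption, as the description does not say which is meant.

```python
import io

import chess
import chess.pgn


def positions_from_game(pgn_text: str, skip_plies: int = 20) -> list:
    """Return FEN strings of a game that ends in checkmate, else an empty list.

    Hypothetical helper: positions reached during the first `skip_plies`
    half-moves are dropped (20 half-moves = 10 full moves, an assumption).
    """
    game = chess.pgn.read_game(io.StringIO(pgn_text))
    if game is None or not game.end().board().is_checkmate():
        return []  # unreadable game, or it ended by resignation or timeout
    board = game.board()
    fens = []
    for ply, move in enumerate(game.mainline_moves(), start=1):
        board.push(move)
        if ply > skip_plies:
            fens.append(board.fen())
    return fens


# Illustrative game (not from the Lichess database): the fool's mate.
FOOLS_MATE = "1. f3 e5 2. g4 Qh4# 0-1"
print(positions_from_game(FOOLS_MATE, skip_plies=0))
```

With the default `skip_plies`, a four-ply game like this one yields no positions at all, since every position falls inside the skipped opening.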
Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Dataset Card for Dataset Name
Dataset Summary
This is a tiny version of the RedPajama dataset. It contains 64 samples from each of the 7 sources. This dataset is intended for developing and testing data/training pipelines for loading the full RedPajama dataset or any general HuggingFace dataset. It is very fast to download and easy to examine. You should not use it for training a full model, but you can use it for overfitting tests or any other sanity checks.… See the full description on the dataset page: https://huggingface.co/datasets/severo/RedPajama-Tiny.