https://choosealicense.com/licenses/cc0-1.0/
[!CAUTION] This dataset is still a work in progress and some breaking changes might occur.
Lichess Rated Standard Chess Games Dataset
Dataset Description
6,771,826,271 standard rated games, played on lichess.org, updated monthly from the database dumps. This version of the data is meant for data analysis. If you need PGN files you can find those here. That said, once you have a subset of interest, it is trivial to convert it back to PGN as shown in the Dataset Usage… See the full description on the dataset page: https://huggingface.co/datasets/Lichess/standard-chess-games.
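As a rough illustration of the PGN round trip mentioned above, here is a minimal sketch that turns one row into PGN text. The column names (`Event`, `Site`, `White`, `Black`, `Result`, `movetext`) and the helper `row_to_pgn` are assumptions made for this example, not names confirmed by the description; check the dataset page for the actual schema.

```python
def row_to_pgn(row, tag_order=("Event", "Site", "White", "Black", "Result")):
    """Format one game row (a dict) as PGN text.

    The column names are hypothetical; adjust them to the real schema.
    """
    tags = []
    for tag in tag_order:
        value = row.get(tag)
        if value is None:
            continue  # skip tags that the row does not provide
        # PGN tag values escape backslashes and double quotes.
        escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
        tags.append(f'[{tag} "{escaped}"]')

    movetext = row["movetext"].strip()
    result = row.get("Result") or "*"  # "*" means unknown/unfinished in PGN
    # PGN movetext ends with the game termination marker.
    if not movetext.endswith(result):
        movetext = f"{movetext} {result}".strip()

    # Tag section, blank line, then movetext.
    return "\n".join(tags) + "\n\n" + movetext + "\n"


# Made-up sample row for demonstration only.
sample_row = {
    "Event": "Rated Blitz game",
    "White": "alice",
    "Black": "bob",
    "Result": "1-0",
    "movetext": "1. e4 e5 2. Qh5 Ke7 3. Qxe5# 1-0",
}
sample_pgn = row_to_pgn(sample_row)
print(sample_pgn)
```

The same function can be mapped over a filtered subset and the results concatenated, separated by blank lines, to produce a multi-game PGN file.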
https://cubig.ai/store/terms-of-service
1) Data Introduction
• The Chess Game Dataset (Lichess) includes details of more than 20,000 chess matches collected on Lichess.org, as well as players, openings, and match results.
2) Data Utilization
(1) The Chess Game Dataset (Lichess) has the following characteristics:
• It provides a variety of variables related to chess matches, including match ID, match start and end times, turn count, winner, player rating, opening code and name, and the full move sequence.
(2) The Chess Game Dataset (Lichess) can be used for:
• Opening win-rate analysis: analyzing how often each opening is played and how often it wins can help in studying effective chess strategies.
• Player skill prediction: player ratings, match results, and move-sequence data can be used to predict wins and losses and to analyze the factors behind performance improvement.
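The opening win-rate analysis described above could be sketched roughly as follows. The field names `opening_name` and `winner`, the helper `opening_win_rates`, and the sample records are all assumptions for illustration; the real column names may differ.

```python
from collections import defaultdict


def opening_win_rates(games):
    """Return {opening_name: (games_played, white_win_rate)}.

    `games` is an iterable of dicts with hypothetical keys
    "opening_name" and "winner" ("white", "black" or "draw").
    """
    counts = defaultdict(lambda: [0, 0])  # [games played, white wins]
    for game in games:
        stats = counts[game["opening_name"]]
        stats[0] += 1
        if game["winner"] == "white":
            stats[1] += 1
    return {name: (n, wins / n) for name, (n, wins) in counts.items()}


# Made-up records standing in for rows of the dataset.
sample_games = [
    {"opening_name": "Sicilian Defense", "winner": "white"},
    {"opening_name": "Sicilian Defense", "winner": "black"},
    {"opening_name": "Italian Game", "winner": "white"},
    {"opening_name": "Italian Game", "winner": "draw"},
    {"opening_name": "Italian Game", "winner": "white"},
]

rates = opening_win_rates(sample_games)
for name, (n, rate) in sorted(rates.items()):
    print(f"{name}: {n} games, white wins {rate:.0%}")
```

With only about 20,000 games, rarely played openings will have few samples, so it is worth filtering on the game count before comparing rates.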
https://choosealicense.com/licenses/cc0-1.0/
Dataset Card for Lichess Puzzles
Dataset Description
5,423,662 chess puzzles, rated and tagged. See them in action on Lichess. This dataset is updated monthly, and was last updated on October 7th, 2025.
Dataset Creation
Generating the initial dataset of chess puzzles took more than 50 years of CPU time. We went through 300,000,000 analyzed games from the Lichess database, and re-analyzed interesting positions with Stockfish 12/13/14/15 NNUE at 40 meganodes. The… See the full description on the dataset page: https://huggingface.co/datasets/Lichess/chess-puzzles.
https://choosealicense.com/licenses/cc0-1.0/
Dataset Card for the Lichess Evaluations dataset
Dataset Description
302,517,109 chess positions evaluated with Stockfish at various depths and node count. Produced by, and for, the Lichess analysis board, running various flavours of Stockfish within user browsers. This version of the dataset is a de-normalized version of the original dataset and contains 752,452,094 rows. This dataset is updated monthly, and was last updated on Thursday 16th, 2025.
Dataset… See the full description on the dataset page: https://huggingface.co/datasets/Lichess/chess-position-evaluations.
A collection of 25,940 chess games played by the user 'pawnster101' on the online platform Lichess.org. The dataset includes game data and player interactions in a clean interface without ads or registration requirements.
General Info
This is a set of just over 20,000 games collected from a selection of users on the site Lichess.org, together with information on how to collect more. I will also upload more games in the future as I collect them. This set contains the:
Possible Uses
A single chess game contains a lot of information, let alone a full dataset of many games. Chess is primarily a game of patterns, and data science is all about detecting patterns in data, which is why chess has been one of the most heavily invested-in areas of AI in the past. This dataset collects all of the information available from 20,000 games and presents it in a format that is easy to process for analysis of, for example, what allows a player to win as black or white, how much meta (out-of-game) factors affect a game, the relationship between openings and victory for black and white, and more.
This is a collection of ~1.5M chess puzzles from the Lichess database of ~3.9M puzzles (as of 2024-05-09). The set of puzzles from "Can You Learn an Algorithm? Generalizing from Easy to Hard Problems with Recurrent Networks" is included, with the exception of 26,079 puzzles that are no longer in the Lichess database (on the assumption that they might have been removed for a good reason). For each puzzle, ctx is a SAN transcript (with every half-move numbered) of an actual Lichess game, up to… See the full description on the dataset page: https://huggingface.co/datasets/EleutherAI/lichess-puzzles.
https://choosealicense.com/licenses/cc0-1.0/
Dataset Card for Lichess Openings
Dataset Description
3546 chess openings with their Encyclopaedia of Chess Openings (ECO) classification.
Dataset Creation
Creating this dataset is detailed in its original GitHub repository. Updates to the original repo will also be reflected in this version. Dataset last updated on October 7th, 2025.
Dataset Usage
Using the datasets library:
from datasets import load_dataset
dset =… See the full description on the dataset page: https://huggingface.co/datasets/Lichess/chess-openings.
Traffic analytics, rankings, and competitive metrics for lichess.org as of August 2025
http://opendatacommons.org/licenses/dbcl/1.0/
This dataset is a comprehensive collection of chess positions sourced from Lichess, one of the most popular online chess platforms. This dataset offers a valuable resource for researchers, enthusiasts, and developers interested in exploring and analyzing chess games.
Note: While some positions may appear multiple times in the dataset, they represent distinct moments within different games. The dataset encourages a comprehensive understanding of chess dynamics by offering a diverse range of positions and evaluations.
This dataset was created by Seqaeon
https://creativecommons.org/publicdomain/zero/1.0/
It all started with the Covid-19 pandemic. The world was stuck at home. I started watching videos from a lot of content creators on YouTube. One of them, a stand-up comedian turned streamer, introduced chess, and lo and behold, my fondness for chess began!
The data in this dataset are all stats from chess matches that I've played, either in the morning or in the evening, with my friends or with random opponents. All the data is mine, but I collected it from Lichess.org, an online web app for playing and learning chess. I entered all of the data manually myself!
Thanks to Lichess.org for creating this wonderful place and making it accessible to me.
Right now I'm having trouble crossing a 1000 rating in 5 min + 0 increment Blitz. So, what can I do, which openings should I go for, which pieces am I good with, and so on?
Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This dataset showcases synthetic chess commentary generated from real Lichess games, designed to empower research in LLMs, sports analytics, and game understanding.
Each JSON object contains:
- Unique match ID
- Player moves and board states
- Stockfish's turn-level analysis
- Synthetic commentary in alpaca format with each move
This dataset supports:
- Fine-tuning LLMs for dynamic game commentary
- Evaluating generative models in sports settings
- Building intelligent chess dashboards or companion apps
https://choosealicense.com/licenses/cc0-1.0/
Condensed Lichess Database
This dataset is a condensed version of the Lichess database. It only includes games for which Stockfish evaluations were available. Currently, the dataset contains the entire year 2023, which consists of >100M games and >2B positions. Games are stored in a format that is much faster to process than the original PGN data.
Requirements: pip install zstandard python-chess datasets
Quick Guide
In the following, I explain the data format… See the full description on the dataset page: https://huggingface.co/datasets/mauricett/lichess_sf.
https://choosealicense.com/licenses/cc0-1.0/
[!CAUTION] This dataset is still a work in progress and some breaking changes might occur. In the meantime, please use https://database.lichess.org/#variant_games
This dataset contains more than 7 million chess matches between players rated above 2200 on Lichess, excluding bullet games.
It contains a folder of monthly rapid games played on Lichess from June 2020 up to June 2021, in PGN format.
It was collected from Lichess to build a chess engine capable of making decent moves and then improving through self-learning.
https://choosealicense.com/licenses/cc0-1.0/
[!CAUTION] This dataset is still a work in progress and some breaking changes might occur.
Note
The FEN column has 961 unique values instead of the expected 960, because some rematches were recorded with invalid castling rights in their starting FEN in November 2023.
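One way to locate the extra value is to group the starting FENs by their piece-placement field and look for a placement that appears with more than one full FEN string. The sketch below assumes the anomaly shows up as a differing castling field for an otherwise identical placement; the note above does not say what the invalid FEN looks like, so the helper `placements_with_conflicting_fens` and the sample strings are illustrative only.

```python
from collections import defaultdict


def placements_with_conflicting_fens(fens):
    """Group FENs by piece placement and keep placements seen with
    more than one distinct full FEN string."""
    by_placement = defaultdict(set)
    for fen in fens:
        placement = fen.split(" ", 1)[0]  # first FEN field: piece placement
        by_placement[placement].add(fen)
    return {
        placement: sorted(variants)
        for placement, variants in by_placement.items()
        if len(variants) > 1
    }


# Made-up sample: two starting positions, one of them also recorded
# with an (assumed) empty castling field.
sample_fens = [
    "bbqnnrkr/pppppppp/8/8/8/8/PPPPPPPP/BBQNNRKR w KQkq - 0 1",
    "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1",
    "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w - - 0 1",
]

conflicts = placements_with_conflicting_fens(sample_fens)
for placement, variants in conflicts.items():
    print(placement, variants)
```

Applied to the unique values of the FEN column, a single placement with two variants would account for the 961st value.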
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Lichess Broadcasts
This is a dataset of chess games from chess tournaments tracked using Lichess Broadcasts. Lichess Broadcasts show live games as they unfold with new moves arriving in real time. They are built to connect to the live-updating PGN file produced by DGT boards but can work with other sources as well. Broadcasts are organized in "tournaments" and "rounds."
https://choosealicense.com/licenses/cc0-1.0/
[!CAUTION] This dataset is still a work in progress and some breaking changes might occur. In the meantime, please use https://database.lichess.org/#variant_games