100+ datasets found

Russian literature
kaggle.com
zip
Updated Dec 10, 2022
Cite
d0rj_ (2022). Russian literature [Dataset]. https://www.kaggle.com/d0rj3228/russian-literature
Available download formats: zip (21547079 bytes)
Dataset updated
Dec 10, 2022
Authors
d0rj_
License
ODC Public Domain Dedication and Licence (PDDL) v1.0 (http://www.opendatacommons.org/licenses/pddl/1.0/)
License information was derived automatically
Area covered
Russia
Description
Content

This repository contains a collection of Russian literature in txt format (all in UTF-8 encoding). In addition, for each author there is a csv file containing information about the year of writing of each work.

This dataset was created for a project to determine the authorship of a piece of text, but I'm sure that you can use this dataset for anything 😉.

The main feature that allows this dataset to be used for any purpose is that the data is not processed at all. The text has not been pre-processed in any way: author designations, chapter headings, and references to translations of foreign-language passages have not been removed.
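
A minimal sketch of how such a collection might be read, assuming a hypothetical layout in which each author has one directory holding the `.txt` works and the `.csv` year table. The directory structure, the function name `load_author`, and the CSV column names are assumptions, not taken from the dataset page, and should be checked against the archive.

```python
import csv
from pathlib import Path


def load_author(author_dir):
    """Read one author's works and year metadata from a directory.

    Assumed (hypothetical) layout: <author_dir>/*.txt holds the works and
    <author_dir>/*.csv holds the per-work year table.
    """
    author_dir = Path(author_dir)

    # The description states that all texts are UTF-8 encoded.
    texts = {
        path.stem: path.read_text(encoding="utf-8")
        for path in sorted(author_dir.glob("*.txt"))
    }

    # The column names are unknown, so rows are kept as plain dicts keyed
    # by whatever header the file provides.
    metadata = []
    for csv_path in sorted(author_dir.glob("*.csv")):
        with csv_path.open(encoding="utf-8", newline="") as handle:
            metadata.extend(csv.DictReader(handle))

    return texts, metadata
```

Because the texts are unprocessed, any downstream step would likely need its own cleaning pass for author lines, chapter headings, and translation notes.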

Acknowledgements

Thanks to Ilibrary, LitLib, Wikisource and all-all-all.
Russian literature and thought
workwithdata.com
Updated Jan 11, 2022
Cite
Work With Data (2022). Russian literature and thought [Dataset]. https://www.workwithdata.com/book-series/Russian%20literature%20and%20thought
Dataset updated
Jan 11, 2022
Dataset authored and provided by
Work With Data
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Area covered
Russia
Description
Russian literature and thought is a series of 11 books by 10 authors, published between 1995 and 2011.
Russian literature texts
kaggle.com
zip
Updated May 25, 2024
+ more versions
Cite
Matvei Danilov (2024). Russian literature texts [Dataset]. https://www.kaggle.com/datasets/luchsmann/russian-literature-texts/suggestions?status=pending&yourSuggestions=true
Available download formats: zip (5306921 bytes)
Dataset updated
May 25, 2024
Authors
Matvei Danilov
License
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
Area covered
Russia
Description
Dataset

This dataset was created by Matvei Danilov

Released under Apache 2.0

Contents
Institute of Russian Literature Dataverse - St Petersburg, Russia
nixa.ca
Updated May 18, 2026
Cite
Institute of Russian Literature Dataverse (2026). Institute of Russian Literature Dataverse - St Petersburg, Russia [Dataset]. https://nixa.ca/data/open-data-dataverse-institute-of-russian-literature-dataverse
Dataset updated
May 18, 2026
Dataset authored and provided by
Institute of Russian Literature Dataverse
Area covered
Russia
Description
Explore Institute of Russian Literature Dataverse open data for St Petersburg, Russia, published by Institute of Russian Literature Dataverse. Browse 45 public datasets, resources, and metadata in Nixa.
Russian books text corpus
kaggle.com
zip
Updated Feb 26, 2023
Cite
Itachi666 (2023). Russian books text corpus [Dataset]. https://www.kaggle.com/datasets/itachi666/dontsova-books
Available download formats: zip (471794638 bytes)
Dataset updated
Feb 26, 2023
Authors
Itachi666
Description
A combined corpus of Russian books. It can be useful for text generation networks such as GPT. I do not own any of these texts, and they should be used for educational purposes only, I guess.
The Uppsala Russian Corpus
researchdata.se
Updated Nov 25, 2020
+ more versions
Cite
Lennart Lönngren (2020). The Uppsala Russian Corpus [Dataset]. https://researchdata.se/en/catalogue/dataset/ext0071-1
Dataset updated
Nov 25, 2020
Dataset provided by
University of Tromsø
Authors
Lennart Lönngren
Time period covered
1960 - 1988
Area covered
Uppsala
Description
The Uppsala Corpus (Upsal'skij korpus russkix tekstov) consists of some 600 Russian texts with a total of one million running words (word tokens), equally divided between informative and literary prose. The informative texts are from between 1985 and 1989, while the literary texts, whose vocabulary does not date as quickly, cover a longer period, 1960-88. The corpus does not include poetry or drama.

Within the given framework, considerable effort has been made to ensure as representative and varied a corpus as possible. The informative texts are drawn from 25 different subject areas: economics, foreign affairs / foreign policy, ideology / domestic policy, party matters, Soviet society, social issues, defence, education, law, history, culture, linguistics, medicine / health care, psychology, environment / ecology, agriculture, engineering, information technology, space research, energy, biology, geology / geography, physics, chemistry and sport. Certain areas which were felt to be more important are represented by a larger volume of texts.

The literary half of the corpus comprises work by the following 40 authors: Abramov, Ajtmatov, Astaf'ev, Baklanov, Bek, Belov, Bitov, Bondarev, Dubov, Ganin, Gladyshev, Granin, Grekova, Goncharov, Iskander, Kaverin, Kazakov, Kochnev, Kozhevnikova, Nagibin, Lixanov, Lidin, Paustovskij, Pogodin, Pristavkin, Troepol'skij, Rasputin, Shcherbakova, Simonov, Solouxin, Shmelev, Tendrjakov, Tokareva, Tolstaja, Trifonov, Vasil'ev, Vorobl'ev, Zalygin and Zorin. Here, too, there is unequal representation, with a larger amount of writing by the better-known authors.

For further details about the corpus, see Lönngren, Lennart (ed.), 1993. Chastotnyj slovar' sovremennogo russkogo jazyka. (A Frequency Dictionary of Modern Russian. With a Summary in English.) Acta Universitatis Upsaliensis, Studia Slavica Upsaliensia 32. 188 pp. Uppsala. ISBN 91-554-3134-8.

Purpose:

The aim is to provide a corpus of Russian prose texts.
Data from: Kitzinger_MappingNetworksC&P_OffstageEdges
works.hcommons.org
csv
Updated Oct 13, 2025
Cite
Chloe Kitzinger; Chloe Kitzinger (2025). Kitzinger_MappingNetworksC&P_OffstageEdges [Dataset]. http://doi.org/10.17613/74pp-vd08
Available download formats: csv
Unique identifier
https://doi.org/10.17613/74pp-vd08
Dataset updated
Oct 13, 2025
Dataset provided by
unknown
Authors
Chloe Kitzinger; Chloe Kitzinger
Time period covered
Jul 2020
Description
These data were used to produce the network graphs that accompany the chapter "Mapping the Networks of Crime and Punishment," published in "Approaches to Teaching Dostoevsky's Crime and Punishment" (ed. M. Katz and A. Burry, MLA Approaches to Teaching World Literature series, forthcoming in 2021). Graph: "Offstage" connections only.
tape
huggingface.co
Updated Oct 23, 2022
Cite
Natural Language Processing in Russian (2022). tape [Dataset]. https://huggingface.co/datasets/RussianNLP/tape
Dataset updated
Oct 23, 2022
Dataset authored and provided by
Natural Language Processing in Russian
License
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
Description
The Winograd schema challenge consists of tasks with syntactic ambiguity that can be resolved with logic and reasoning (Levesque et al., 2012).

The texts for the Winograd schema problem are obtained using a semi-automatic pipeline. First, lists of 11 typical grammatical structures with syntactic homonymy (mainly case) are compiled. For example, two noun phrases with a complex subordinate clause: 'A trinket from Pompeii that has survived the centuries'. Queries corresponding to these constructions are submitted to the search of the Russian National Corpus, or rather to its sub-corpus with homonymy removed. In the resulting 2+k examples, homonymy is removed automatically, with manual validation afterward. Each original sentence is split into multiple examples in binary classification format, indicating whether the homonymy is resolved correctly or not.
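
The final splitting step can be illustrated with a short sketch. The function name `to_binary_examples` and the field names `text`, `candidate`, and `label` are hypothetical, chosen for illustration only; they are not the dataset's actual schema, and the choice of the correct antecedent below is likewise illustrative.

```python
def to_binary_examples(sentence, candidates, correct):
    """Expand one sentence into one binary example per candidate antecedent.

    Field names are hypothetical; label 1 marks the correct resolution.
    """
    if correct not in candidates:
        raise ValueError("the correct antecedent must be one of the candidates")
    return [
        {"text": sentence, "candidate": candidate, "label": int(candidate == correct)}
        for candidate in candidates
    ]


examples = to_binary_examples(
    "A trinket from Pompeii that has survived the centuries",
    candidates=["trinket", "Pompeii"],
    correct="trinket",  # illustrative choice, not taken from the dataset
)
```

Each sentence with n candidate antecedents yields n rows, exactly one of which carries a positive label.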
Swedish reviews of post-Soviet Russian novels published in Swedish...
researchdata.se
docx, tsv, xlsx
Updated Apr 24, 2024
Cite
Malin Podlevskikh Carlström (2024). Swedish reviews of post-Soviet Russian novels published in Swedish translation 1992-2020 [Dataset]. http://doi.org/10.5878/e1k4-1058
Available download formats: tsv (5713), docx (26108), tsv (20224), tsv (97726), tsv (1692), xlsx (2020406), tsv (13006), tsv (78201)
Unique identifier
https://doi.org/10.5878/e1k4-1058
Dataset updated
Apr 24, 2024
Dataset provided by
University of Gothenburg
Authors
Malin Podlevskikh Carlström
Area covered
Sweden
Description
The data material consists of a detailed description of a review corpus used to analyze the reception of Russian literature in Sweden. The investigations that have been and will be conducted on the review corpus analyze, for example, translation visibility, translation criticism, and the image of Russian literature in the Swedish literary system. The review corpus consists of 430 reviews of post-Soviet Russian novels published in Swedish translation between 1992 and 2020. The reviews are protected by copyright and may not be made available. Therefore, the data instead contains a complete specification of the review database, information regarding how the reviews have been classified, and finally, information about thematic coding related to specific investigations (articles).
Swedish reviews of post-Soviet Russian novels published in Swedish...
gimi9.com
Updated Mar 2, 2022
+ more versions
Cite
(2022). Swedish reviews of post-Soviet Russian novels published in Swedish translation 1992-2020 | gimi9.com [Dataset]. https://gimi9.com/dataset/eu_https-doi-org-10-5878-e1k4-1058
Dataset updated
Mar 2, 2022
License
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
License information was derived automatically
Area covered
Sweden, Soviet Union
Description
The data material consists of a detailed description of a review corpus used to analyze the reception of Russian literature in Sweden. The investigations that have been and will be conducted on the review corpus analyze, for example, translation visibility, translation criticism, and the image of Russian literature in the Swedish literary system. The review corpus consists of 430 reviews of post-Soviet Russian novels published in Swedish translation between 1992 and 2020. The reviews are protected by copyright and may not be made available. Therefore, the data instead contains a complete specification of the review database, information regarding how the reviews have been classified, and finally, information about thematic coding related to specific investigations (articles).
POSTMODERNISM IN RUSSIAN LITERATURE
zenodo.org
pdf
Updated Mar 27, 2026
Cite
Dilnoza Boltaeva; Dilnoza Boltaeva (2026). POSTMODERNISM IN RUSSIAN LITERATURE [Dataset]. http://doi.org/10.5281/zenodo.19246343
Available download formats: pdf
Unique identifier
https://doi.org/10.5281/zenodo.19246343
Dataset updated
Mar 27, 2026
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Dilnoza Boltaeva; Dilnoza Boltaeva
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Area covered
Russia
Description
This article examines the phenomenon of Russian literary postmodernism: its chronological scope, key differences from its Western counterpart (its traumatic nature, its reaction to the collapse of the Soviet utopia), and its main aesthetic principles (intertextuality, irony, and demythologization). The author identifies the main trends (Moscow conceptualism, Leningrad metarealism, “other prose”) and key figures (Venedikt Erofeev, Vladimir Sorokin, Viktor Pelevin, Dmitry Prigov, Tatyana Tolstaya). Special attention is given to seminal texts (“Moscow—Petushki,” “Pushkin’s House,” “Chapaev and the Void”) and an analysis of the crisis of postmodernism in the 2000s with the transition to new literary strategies (metamodernism, new sincerity). The material is structured and suitable both for an introduction to the topic and for consolidating knowledge.
National Science and Technology Commission Literature II Discipline Project...
data.gov.tw
csv
Cite
National Science and Technology Council, National Science and Technology Commission Literature II Discipline Project Subsidy List [Dataset]. https://data.gov.tw/en/datasets/40520
Available download formats: csv
Dataset authored and provided by
National Science and Technology Council
License
https://data.gov.tw/license
Description
National Science and Technology Committee Literature II Discipline Project Subsidy List.
Complex Russian Dataset
kaggle.com
zip
Updated Dec 20, 2023
Cite
Artalmaz31 (2023). Complex Russian Dataset [Dataset]. https://www.kaggle.com/datasets/artalmaz31/complex-russian-dataset/code
Available download formats: zip (74930729 bytes)
Dataset updated
Dec 20, 2023
Authors
Artalmaz31
License
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Description
This dataset contains a large number of diverse Russian-language texts collected from a variety of sources:

• articles.txt – Texts of popular articles on various topics published on dzen.ru (~20 million characters)

• books-A.txt – Fragments of various works of world-class Russian and foreign literature (~20 million characters)

• books-B.txt – Fragments of various works of literature, both world-famous and little-known (~20 million characters)

• fanfiction.txt – Texts of popular fanfiction on various topics published on ficbook.net (~20 million characters)

• jokes.txt – Texts of various jokes and puns (~6.7 million characters)

• poems.txt – Texts of various poems by world-famous authors (~40 million characters)
Data from: Kitzinger_MappingNetworksC&P_AllCharactersEdges
works.hcommons.org
csv
Updated Oct 13, 2025
+ more versions
Cite
Chloe Kitzinger; Chloe Kitzinger (2025). Kitzinger_MappingNetworksC&P_AllCharactersEdges [Dataset]. http://doi.org/10.17613/bzp4-0z53
Available download formats: csv
Unique identifier
https://doi.org/10.17613/bzp4-0z53
Dataset updated
Oct 13, 2025
Dataset provided by
unknown
Authors
Chloe Kitzinger; Chloe Kitzinger
Time period covered
Jul 2020
Description
These data were used to produce the network graphs that accompany the chapter "Mapping the Networks of Crime and Punishment," published in "Approaches to Teaching Dostoevsky's Crime and Punishment" (ed. M. Katz and A. Burry, MLA Approaches to Teaching World Literature series, forthcoming in 2021). Graph: Complete character network.
THE EVOLUTION OF THE HERO'S IMAGE IN RUSSIAN LITERATURE OF THE XVIII CENTURY...
zenodo.org
pdf
Updated Feb 26, 2026
Cite
Haydarova Guzaliya Zinurovna; Haydarova Guzaliya Zinurovna (2026). THE EVOLUTION OF THE HERO'S IMAGE IN RUSSIAN LITERATURE OF THE XVIII CENTURY [Dataset]. http://doi.org/10.5281/zenodo.18793624
Available download formats: pdf
Unique identifier
https://doi.org/10.5281/zenodo.18793624
Dataset updated
Feb 26, 2026
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Haydarova Guzaliya Zinurovna; Haydarova Guzaliya Zinurovna
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Area covered
Russia
Description
The article examines the evolution of the hero's image in 18th-century Russian literature in the context of changing artistic trends and the transformation of aesthetic paradigms of the era. The movement from the normative-allegorical model of personality in the system of classicism to the socio-educational and further to the emotional-psychological concept of a person in sentimentalism is analyzed. Based on the works of Alexander Sumarokov, Mikhail Lomonosov, Denis Fonvizin, Gavriil Derzhavin, Alexander Radishchev and Nikolai Karamzin, structural changes in the characterological organization of the hero, the principles of motivation of his actions and ways of artistic representation of the inner world are revealed.
And Quiet Flows the Don
kaggle.com
zip
Updated Apr 19, 2021
Cite
Max B. (2021). And Quiet Flows the Don [Dataset]. https://www.kaggle.com/max398434434/and-quiet-flows-the-don
Available download formats: zip (1785343 bytes)
Dataset updated
Apr 19, 2021
Authors
Max B.
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
Context

And Quiet Flows the Don or Quietly Flows the Don (Russian: Тихий Дон, literally "The Quiet Don") is an epic novel in four volumes by Russian writer Mikhail Alexandrovich Sholokhov. The first three volumes were written from 1925 to 1932 and published in the Soviet magazine Oktyabr in 1928–1932, and the fourth volume was finished in 1940. The English translation of the first three volumes appeared under this title in 1934.

The novel is considered one of the most significant works of world and Russian literature in the 20th century. It depicts the lives and struggles of Don Cossacks during the First World War, the Russian Revolution, and Russian Civil War. In 1965, Sholokhov was awarded the Nobel Prize for Literature for this novel.

source: https://en.wikipedia.org/wiki/And_Quiet_Flows_the_Don

The book is written in Russian, with many dialectisms specific to the basin of the lower and middle Don.

Content

The data is provided as a text file.
Russian Educational Text Collection
academictorrents.com
bittorrent
Updated Jan 25, 2026
+ more versions
Cite
nyuuzyou (2026). Russian Educational Text Collection [Dataset]. https://academictorrents.com/details/1f6b373346a0fa34de6b4d916984d698e0a623b3
Available download formats: bittorrent (304218686)
Dataset updated
Jan 25, 2026
Dataset authored and provided by
nyuuzyou
License
https://academictorrents.com/nolicensespecified
Description
Dataset Card for Russian Educational Text Collection

### Dataset Summary

This dataset contains approximately 1.38M educational texts, primarily in Russian with some content in Ukrainian and English. The content is extracted from presentations and documents, including educational presentations, essays, and various academic documents covering diverse topics from natural sciences to literature.

### Languages

- Russian (ru) - primary language
- Ukrainian (uk) - secondary language
- English (en) - secondary language

Russian is the predominant language in the dataset, while Ukrainian and English content appears less frequently.

## Dataset Structure

### Data Fields

The dataset is split into two parquet files:

- presentations (1,335,171 entries):
  - title: Title of the presentation (string)
  - slide_text: Array of slide contents (list of strings)
- documents (47,474 entries):
  - title: Title of the document (string)
  - document_text: Full text content of the document (string)

## A
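
The two record types listed under Data Fields can be modelled with a short sketch. The field names come from the card above; the class names `Presentation` and `Document`, the helper functions, and the choice to join slides with newlines are assumptions made for illustration.

```python
from typing import List, TypedDict


class Presentation(TypedDict):
    """One row of the presentations file, per the card's field list."""
    title: str
    slide_text: List[str]


class Document(TypedDict):
    """One row of the documents file, per the card's field list."""
    title: str
    document_text: str


def presentation_to_text(row: Presentation) -> str:
    # Joining slides with newlines is an assumption; it gives both record
    # types a single-string form so they can share one pipeline.
    return "\n".join(row["slide_text"])


def document_to_text(row: Document) -> str:
    return row["document_text"]
```

Normalizing both record types to one string keeps later processing independent of which file a row came from.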
World Literature Summaries On Russian
kaggle.com
zip
Updated Jun 17, 2023
Cite
Timur Tuleuov (2023). World Literature Summaries On Russian [Dataset]. https://www.kaggle.com/datasets/timurtuleuov/world-literature-summaries-on-russian
Available download formats: zip (10235582 bytes)
Dataset updated
Jun 17, 2023
Authors
Timur Tuleuov
Area covered
World
Description
The "World Literature Summaries on Russian" dataset is a comprehensive collection of concise summaries of literary works from around the globe, presented in the Russian language. This dataset offers a valuable resource for researchers, students, and literature enthusiasts interested in exploring and analyzing a wide range of literary masterpieces, including novels, plays, poems, and short stories. With summaries spanning various genres, time periods, and cultural backgrounds, this dataset provides a rich source of information, enabling users to gain insights into the plots, themes, and characters of renowned literary works. Whether you're conducting literary analysis, studying world literature, or simply seeking a curated selection of summaries to enhance your reading experience, this dataset is a valuable tool for unlocking the essence of global literature through the lens of the Russian language.
stihi_ru
huggingface.co
Updated Mar 17, 2023
Cite
Ilya Gusev (2023). stihi_ru [Dataset]. https://huggingface.co/datasets/IlyaGusev/stihi_ru
Dataset updated
Mar 17, 2023
Authors
Ilya Gusev
Description
Stihi.ru dataset

Description

Summary: A subset of Taiga, uploaded here for convenience. Additional cleaning was performed.
Script: create_stihi.py
Point of Contact: Ilya Gusev
Languages: Russian

Usage

Prerequisites: pip install datasets zstandard jsonlines pysimdjson

Dataset iteration:

    from datasets import load_dataset

    dataset = load_dataset('IlyaGusev/stihi_ru', split="train", streaming=True)
    for example in dataset:
        print(example["text"])

… See the full description on the dataset page: https://huggingface.co/datasets/IlyaGusev/stihi_ru.
Swedish reviews of contemporary Russian novels published in Swedish...
gimi9.com
Updated Mar 3, 2022
+ more versions
Cite
(2022). Swedish reviews of contemporary Russian novels published in Swedish translation 1994-2020 | gimi9.com [Dataset]. https://gimi9.com/dataset/eu_https-doi-org-10-5878-2maz-cm70/
Dataset updated
Mar 3, 2022
License
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
License information was derived automatically
Area covered
Sweden
Description
The data material consists of a detailed description of a review corpus used in order to analyze the reception of Russian literature in Sweden. The reviews are protected by copyright and may not be made available. Therefore, the data instead contains a complete specification of the review database, information regarding how the reviews have been classified, and finally, information about the authors, translators, critics and media sources related to the material.
