This dataset was created by Balamurugan1603
Attribution-ShareAlike 3.0 (CC BY-SA 3.0) https://creativecommons.org/licenses/by-sa/3.0/
License information was derived automatically
See http://static.wiki for where this dataset is used and https://github.com/segfall/static-wiki for an explanation of the code behind it. The dataset was generated from the XML dumps found at https://dumps.wikimedia.org/enwiki/.
See https://en.wikipedia.org/wiki/Wikipedia:Text_of_Creative_Commons_Attribution-ShareAlike_3.0_Unported_License and https://en.wikipedia.org/wiki/Wikipedia:Copyrights for the licensing of the transformed content in the dataset.
The dataset contains two populated tables, wiki_articles and wiki_article_title_search. The former has three columns: title, text, and redirect. The latter has three columns: title, search_title, and redirect. The search_title column holds the title stripped of spaces and special characters, which speeds up the autocomplete search.
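The two-table layout described above can be sketched as follows. This is only an illustration under stated assumptions, not the project's actual code: it assumes the tables can be queried with SQLite, uses an in-memory database with made-up sample rows, and the normalize_title helper (including its lowercasing) is a hypothetical guess at how search_title might be derived. Only the table and column names come from the description.

```python
import re
import sqlite3


def normalize_title(title: str) -> str:
    """Hypothetical normalization: lowercase, then drop everything
    except ASCII letters and digits (spaces and special characters)."""
    return re.sub(r"[^0-9a-z]", "", title.lower())


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with the two tables and sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE wiki_articles (title TEXT, text TEXT, redirect TEXT);
        CREATE TABLE wiki_article_title_search (
            title TEXT, search_title TEXT, redirect TEXT
        );
        """
    )
    # Made-up sample rows; the real dataset holds Wikipedia articles.
    samples = [
        ("Alan Turing", "placeholder text", None),
        ("Alan Kay", "placeholder text", None),
        ("Ada Lovelace", "placeholder text", None),
    ]
    for title, text, redirect in samples:
        conn.execute(
            "INSERT INTO wiki_articles VALUES (?, ?, ?)",
            (title, text, redirect),
        )
        conn.execute(
            "INSERT INTO wiki_article_title_search VALUES (?, ?, ?)",
            (title, normalize_title(title), redirect),
        )
    return conn


def autocomplete(conn: sqlite3.Connection, prefix: str, limit: int = 5) -> list:
    """Return titles whose normalized form starts with the normalized prefix."""
    key = normalize_title(prefix)
    cursor = conn.execute(
        "SELECT title FROM wiki_article_title_search "
        "WHERE search_title LIKE ? ORDER BY title LIMIT ?",
        (key + "%", limit),
    )
    return [row[0] for row in cursor]


if __name__ == "__main__":
    demo = build_demo_db()
    print(autocomplete(demo, "alan t"))
```

Because the user's input is normalized the same way as the stored search_title, a query such as "alan t" and a query such as "AlanT" hit the same prefix, so the lookup does not need to handle spacing or punctuation at query time.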
Articles are not guaranteed to be 1:1 with their original Wikipedia counterparts. Expect weird formatting bugs like links that are randomly reduced to plaintext.
See https://github.com/segfall/static-wiki#credits for the complete list of credits. Special thanks to Kaggle for allowing up to 100GB per public dataset, for free!