2 datasets found
  1. E-commerce product names for search autocomplete

    • kaggle.com
    Updated Dec 2, 2021
    Cite
    Balamurugan1603 (2021). E-commerce product names for search autocomplete [Dataset]. https://www.kaggle.com/balamurugan1603/ecommerce-product-names-for-search-autocomplete/code
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Balamurugan1603
    Description

    Dataset

    This dataset was created by Balamurugan1603

    Contents

  2. Markdown-like Wikipedia dumps in SQLite

    • kaggle.com
    Updated Jul 24, 2021
    Cite
    segfall (2021). Markdown-like Wikipedia dumps in SQLite [Dataset]. https://www.kaggle.com/datasets/segfall/markdownlike-wikipedia-dumps-in-sqlite
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    segfall
    License

    Attribution-ShareAlike 3.0 (CC BY-SA 3.0): https://creativecommons.org/licenses/by-sa/3.0/
    License information was derived automatically

    Description

    See http://static.wiki for where this dataset is used and https://github.com/segfall/static-wiki for an explanation of the code behind this dataset. Generated from the XML dumps at https://dumps.wikimedia.org/enwiki/.

    See https://en.wikipedia.org/wiki/Wikipedia:Text_of_Creative_Commons_Attribution-ShareAlike_3.0_Unported_License and https://en.wikipedia.org/wiki/Wikipedia:Copyrights for the licensing of the transformed content in the dataset.

    Content

    Contains two filled tables, wiki_articles and wiki_article_title_search. The former has three columns, title, text, and redirect. The latter has three columns, title, search_title, and redirect. search_title is stripped of spaces and special characters to speed up the autocomplete search.
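
The two-table layout described above could be queried for autocomplete roughly as follows. This is a minimal sketch, not the code behind static.wiki: the table and column names come from the description, but the normalization rule (lowercasing and dropping everything except letters and digits), the column types, and the helper names `normalize`, `autocomplete`, and `build_demo_db` are assumptions made for illustration. A tiny in-memory database stands in for the real SQLite file.

```python
import re
import sqlite3


def normalize(query: str) -> str:
    """Strip spaces and special characters from a query.

    Assumption: the dataset's exact rule for search_title is not documented
    here, so this lowercases and keeps only letters and digits.
    """
    return re.sub(r"[\W_]+", "", query.lower())


def autocomplete(conn: sqlite3.Connection, query: str, limit: int = 10) -> list:
    """Return article titles whose search_title starts with the normalized query."""
    prefix = normalize(query)
    if not prefix:
        return []
    rows = conn.execute(
        "SELECT title FROM wiki_article_title_search "
        "WHERE search_title LIKE ? || '%' "
        "ORDER BY search_title LIMIT ?",
        (prefix, limit),
    ).fetchall()
    return [row[0] for row in rows]


def build_demo_db() -> sqlite3.Connection:
    """Build a hypothetical in-memory stand-in with the two described tables."""
    conn = sqlite3.connect(":memory:")
    # Column types are assumed; the description only names the columns.
    conn.execute("CREATE TABLE wiki_articles (title TEXT, text TEXT, redirect TEXT)")
    conn.execute(
        "CREATE TABLE wiki_article_title_search "
        "(title TEXT, search_title TEXT, redirect TEXT)"
    )
    # Made-up sample rows: (title, redirect target or None).
    samples = [
        ("New York City", None),
        ("Newton's laws of motion", None),
        ("NYC", "New York City"),
    ]
    conn.executemany(
        "INSERT INTO wiki_article_title_search VALUES (?, ?, ?)",
        [(title, normalize(title), redirect) for title, redirect in samples],
    )
    return conn


if __name__ == "__main__":
    demo = build_demo_db()
    print(autocomplete(demo, "New Y"))
```

Because the query is normalized to letters and digits before it reaches SQL, it cannot contain the LIKE wildcards `%` or `_`, so no escaping is needed in this sketch. On the real database, a prefix search of this kind generally benefits from an index on search_title; whether the published file includes one is not stated in the description.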

    Articles are not guaranteed to be 1:1 with their original Wikipedia counterparts. Expect weird formatting bugs like links that are randomly reduced to plaintext.

    Acknowledgements

    See https://github.com/segfall/static-wiki#credits for the complete list. Special thanks to Kaggle for allowing up to 100GB per public dataset, for free!

