2 datasets found
  1. Gutenberg-BookCorpus-Cleaned-Data-English

    • huggingface.co
    Updated Oct 28, 2018
    Cite
    Gutenberg-BookCorpus-Cleaned-Data-English [Dataset]. https://huggingface.co/datasets/incredible45/Gutenberg-BookCorpus-Cleaned-Data-English
    Authors
    Lokesh Parab
    Description

    This dataset has been cleaned and preprocessed with the Gutenberg_English_Preprocessor class (given below), starting from the reference Kaggle dataset "75,000+ Gutenberg Books and Metadata 2025". It is restricted to English-language content whose rights are listed as "Public domain in the USA", so according to the author it can be used freely anywhere. The reference Gutenberg metadata is also available and can be downloaded with the following CLI command: pip… See the full description on the dataset page: https://huggingface.co/datasets/incredible45/Gutenberg-BookCorpus-Cleaned-Data-English.

  2. Linux/CMD/MacOS Commands

    • opendatabay.com
    Updated Jun 23, 2025
    Cite
    Datasimple (2025). Linux/CMD/MacOS Commands [Dataset]. https://www.opendatabay.com/data/ai-ml/68822edd-7bdf-485f-a3a7-48fe2fc8136e
    Dataset authored and provided by
    Datasimple
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Area covered
    Data Science and Analytics
    Description

    linux_commands.csv:

    Description: This dataset provides a comprehensive list of Linux commands commonly used in Unix-based operating systems. Each entry includes details such as the command name, a brief description of its functionality, and any additional parameters or options.

    cmd_commands.csv:

    Description: The "cmd_commands.csv" dataset presents a collection of commands for the Windows Command Prompt (cmd). It covers a range of command-line operations and system management tasks specific to the Windows operating system. Entries include the command name, a brief description, and relevant usage information.

    macos_commands.csv:

    Description: This dataset compiles a set of commands tailored for macOS command-line interfaces. It encompasses commands commonly used in Terminal on Apple's macOS operating system. Each entry in the dataset includes the command, a concise description of its purpose, and any pertinent options or arguments.

    vbscript_commands.csv:

    Description: The "vbscript_commands.csv" dataset contains a list of VBScript commands and functionalities. VBScript, or Visual Basic Scripting Edition, is a scripting language developed by Microsoft. This dataset provides insights into various VBScript commands, their applications, and usage details, making it a valuable resource for scripting on Windows-based systems.

    Original Data Source: Linux/CMD/MacOS Commands
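
As a rough illustration of how one of these CSV files might be read, here is a minimal Python sketch. The column names (`command`, `description`, `options`) and the two sample rows are assumptions made for the example; the listing only says that each entry holds a command name, a description, and options, so the real headers in linux_commands.csv may differ.

```python
import csv
import io

# Hypothetical sample rows; the real headers and contents of
# linux_commands.csv are not shown in the listing and may differ.
SAMPLE = """command,description,options
ls,List directory contents,-l -a
grep,Search text for a pattern,-i -r
"""


def load_commands(text):
    """Parse CSV text into a dict keyed by the (assumed) 'command' column."""
    reader = csv.DictReader(io.StringIO(text))
    return {row["command"]: row for row in reader}


commands = load_commands(SAMPLE)
print(commands["grep"]["description"])
```

To use this with the downloaded file, read its text with `open(path, newline="", encoding="utf-8").read()` and adjust the key column to match the actual header.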

