55 datasets found
  1. DELETE-bot-fight-data

    • huggingface.co
    Updated May 16, 2024
    + more versions
    Cite
    Huggingface Projects (2024). DELETE-bot-fight-data [Dataset]. https://huggingface.co/datasets/huggingface-projects/DELETE-bot-fight-data
    Explore at: Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    May 16, 2024
    Dataset provided by
    Hugging Face (https://huggingface.co/)
    Authors
    Huggingface Projects
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    huggingface-projects/DELETE-bot-fight-data dataset hosted on Hugging Face and contributed by the HF Datasets community

  2. delete-episodes-from-dataset-1

    • huggingface.co
    Updated Oct 11, 2025
    Cite
    Cortex AI (2025). delete-episodes-from-dataset-1 [Dataset]. https://huggingface.co/datasets/cortexairobot/delete-episodes-from-dataset-1
    Dataset updated
    Oct 11, 2025
    Dataset authored and provided by
    Cortex AI
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    This dataset was created using LeRobot.

      Dataset Structure
    

    meta/info.json: { "codebase_version": "v2.1", "robot_type": "bi_piper", "total_episodes": 2, "total_frames": 1216, "total_tasks": 1, "total_videos": 8, "total_chunks": 1, "chunks_size": 1000, "fps": 30, "splits": { "train": "0:2" }, "data_path": "data/chunk-{episode_chunk:03d}/episode_{episode_index:06d}.parquet", "video_path":… See the full description on the dataset page: https://huggingface.co/datasets/cortexairobot/delete-episodes-from-dataset-1.
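    The data_path entry in the meta/info.json excerpt above is a template with Python-style format placeholders. The following is a minimal sketch of how it could be filled in, not LeRobot's own code: the helper name episode_path is hypothetical, and the rule that the chunk index equals the episode index integer-divided by chunks_size is an assumption, since the truncated description does not state it.

```python
# Values copied from the meta/info.json excerpt above.
DATA_PATH = "data/chunk-{episode_chunk:03d}/episode_{episode_index:06d}.parquet"
CHUNKS_SIZE = 1000
TOTAL_EPISODES = 2


def episode_path(episode_index: int, chunks_size: int = CHUNKS_SIZE) -> str:
    """Hypothetical helper: fill the data_path template for one episode."""
    # Assumption: episodes are grouped into chunks of `chunks_size`
    # by integer division of the episode index.
    episode_chunk = episode_index // chunks_size
    return DATA_PATH.format(
        episode_chunk=episode_chunk,
        episode_index=episode_index,
    )


if __name__ == "__main__":
    # List the relative parquet path of every episode in the "train" split (0:2).
    for index in range(TOTAL_EPISODES):
        print(episode_path(index))
```

    Under that assumption, both episodes of this dataset fall into chunk 000, because total_episodes (2) is well below chunks_size (1000).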

  3. lmsys-delete-tie-standard

    • huggingface.co
    Updated Jul 23, 2024
    Cite
    RLHFlow (2024). lmsys-delete-tie-standard [Dataset]. https://huggingface.co/datasets/RLHFlow/lmsys-delete-tie-standard
    Explore at: Croissant
    Dataset updated
    Jul 23, 2024
    Dataset authored and provided by
    RLHFlow
    Description

    RLHFlow/lmsys-delete-tie-standard dataset hosted on Hugging Face and contributed by the HF Datasets community

  4. 2bus-delete-block-all-formats

    • huggingface.co
    + more versions
    Cite
    Iain, 2bus-delete-block-all-formats [Dataset]. https://huggingface.co/datasets/SimuGPT/2bus-delete-block-all-formats
    Explore at: Croissant
    Authors
    Iain
    Description

    SimuGPT/2bus-delete-block-all-formats dataset hosted on Hugging Face and contributed by the HF Datasets community

  5. Non-Delete-ORM-Llama3-tmp07-prompt

    • huggingface.co
    Updated Feb 18, 2025
    + more versions
    Cite
    selfcorrexp2 (2025). Non-Delete-ORM-Llama3-tmp07-prompt [Dataset]. https://huggingface.co/datasets/selfcorrexp2/Non-Delete-ORM-Llama3-tmp07-prompt
    Explore at: Croissant
    Dataset updated
    Feb 18, 2025
    Dataset authored and provided by
    selfcorrexp2
    Description

    selfcorrexp2/Non-Delete-ORM-Llama3-tmp07-prompt dataset hosted on Hugging Face and contributed by the HF Datasets community

  6. Remove-Watermarks-Dataset

    • huggingface.co
    Updated Sep 27, 2025
    Cite
    Joel Andrés Navarro (2025). Remove-Watermarks-Dataset [Dataset]. https://huggingface.co/datasets/YarvixPA/Remove-Watermarks-Dataset
    Dataset updated
    Sep 27, 2025
    Authors
    Joel Andrés Navarro
    Description

    YarvixPA/Remove-Watermarks-Dataset dataset hosted on Hugging Face and contributed by the HF Datasets community

  7. filter-delete-1

    • huggingface.co
    Cite
    HydraLM, filter-delete-1 [Dataset]. https://huggingface.co/datasets/HydraLM/filter-delete-1
    Explore at: Croissant
    Dataset authored and provided by
    HydraLM
    Description

    Dataset Card for "deleted-2"

    More Information needed

  8. synth-bg-remove-v4-genbg-1499

    • huggingface.co
    Updated Aug 1, 2024
    + more versions
    Cite
    Dhruv K (2024). synth-bg-remove-v4-genbg-1499 [Dataset]. https://huggingface.co/datasets/unography/synth-bg-remove-v4-genbg-1499
    Explore at: Croissant
    Dataset updated
    Aug 1, 2024
    Authors
    Dhruv K
    Description

    unography/synth-bg-remove-v4-genbg-1499 dataset hosted on Hugging Face and contributed by the HF Datasets community

  9. ROSE-Dataset

    • huggingface.co
    Updated Aug 29, 2025
    Cite
    Kunbyte AI (2025). ROSE-Dataset [Dataset]. https://huggingface.co/datasets/Kunbyte/ROSE-Dataset
    Dataset updated
    Aug 29, 2025
    Dataset authored and provided by
    Kunbyte AI
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0) (https://creativecommons.org/licenses/by-nc/4.0/)
    License information was derived automatically

    Description

    ROSE: Remove Objects with Side Effects in Videos Dataset

    This repository contains the dataset released alongside the paper ROSE: Remove Objects with Side Effects in Videos.

    Paper: ROSE: Remove Objects with Side Effects in Videos
    Project Page: https://rose2025-inpaint.github.io/
    Code: https://github.com/Kunbyte-AI/ROSE

      Abstract
    

    Video object removal has achieved advanced performance due to the recent success of video generative models. However, when addressing the… See the full description on the dataset page: https://huggingface.co/datasets/Kunbyte/ROSE-Dataset.

  10. mesh-2025-update-delete-report

    • huggingface.co
    Updated Dec 16, 2024
    Cite
    Department of Health and Human Services (2024). mesh-2025-update-delete-report [Dataset]. https://huggingface.co/datasets/HHS-Official/mesh-2025-update-delete-report
    Dataset updated
    Dec 16, 2024
    Dataset provided by
    United States Department of Health and Human Services (http://www.hhs.gov/)
    Authors
    Department of Health and Human Services
    License

    https://choosealicense.com/licenses/odbl/

    Description

    MeSH 2025 Update - Delete Report

      Description
    

    (Includes MeSH 2023 and 2024 changes) The MeSH 2025 Update - Delete Report lists Descriptors and Supplementary Concept Records (SCRs) that have been removed from MeSH. This report includes MeSH changes from previous years, starting from 2023.

      Dataset Details
    

    Publisher: National Library of Medicine
    Last Modified: 2024-12-16
    Contact: National Library of Medicine (custserv@nlm.nih.gov)

      Source
    

    Original… See the full description on the dataset page: https://huggingface.co/datasets/HHS-Official/mesh-2025-update-delete-report.

  11. Non-Delete-ORM-Llama3-tmp07-N3-Rewards

    • huggingface.co
    Updated Dec 30, 2024
    + more versions
    Cite
    Hanning Zhang (2024). Non-Delete-ORM-Llama3-tmp07-N3-Rewards [Dataset]. https://huggingface.co/datasets/HanningZhang/Non-Delete-ORM-Llama3-tmp07-N3-Rewards
    Explore at: Croissant
    Dataset updated
    Dec 30, 2024
    Authors
    Hanning Zhang
    Description

    HanningZhang/Non-Delete-ORM-Llama3-tmp07-N3-Rewards dataset hosted on Hugging Face and contributed by the HF Datasets community

  12. filter-dataset-from-6k-delete-empty

    • huggingface.co
    Cite
    Kieu Trong Thanh, filter-dataset-from-6k-delete-empty [Dataset]. https://huggingface.co/datasets/thanhsc02/filter-dataset-from-6k-delete-empty
    Authors
    Kieu Trong Thanh
    Description

    thanhsc02/filter-dataset-from-6k-delete-empty dataset hosted on Hugging Face and contributed by the HF Datasets community

  13. pickup-carrot-remove-parquet-metadata

    • huggingface.co
    Updated Oct 29, 2025
    + more versions
    Cite
    Argus Systems (2025). pickup-carrot-remove-parquet-metadata [Dataset]. https://huggingface.co/datasets/argus-systems/pickup-carrot-remove-parquet-metadata
    Dataset updated
    Oct 29, 2025
    Dataset authored and provided by
    Argus Systems
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    This dataset was created using LeRobot.

      Dataset Structure
    

    meta/info.json: { "codebase_version": "v2.1", "trossen_subversion": "v1.0", "robot_type": "trossen_ai_stationary", "total_episodes": 21, "total_frames": 9383, "total_tasks": 1, "total_videos": 84, "total_chunks": 1, "chunks_size": 1000, "fps": 30, "splits": { "train": "0:21" }, "data_path": "data/chunk-{episode_chunk:03d}/episode_{episode_index:06d}.parquet"… See the full description on the dataset page: https://huggingface.co/datasets/argus-systems/pickup-carrot-remove-parquet-metadata.

  14. augfv-w-random-delete

    • huggingface.co
    Updated Sep 12, 2024
    + more versions
    Cite
    PAD (2024). augfv-w-random-delete [Dataset]. https://huggingface.co/datasets/PAD6/augfv-w-random-delete
    Dataset updated
    Sep 12, 2024
    Authors
    PAD
    Description

    Stanford Question Answering Dataset (SQuAD) is a reading comprehension dataset, consisting of questions posed by crowdworkers on a set of Wikipedia articles, where the answer to every question is a segment of text, or span, from the corresponding reading passage, or the question might be unanswerable.

  15. long-sentence-dataset-from-6k-delete-empty

    • huggingface.co
    + more versions
    Cite
    Kieu Trong Thanh, long-sentence-dataset-from-6k-delete-empty [Dataset]. https://huggingface.co/datasets/thanhsc02/long-sentence-dataset-from-6k-delete-empty
    Authors
    Kieu Trong Thanh
    Description

    thanhsc02/long-sentence-dataset-from-6k-delete-empty dataset hosted on Hugging Face and contributed by the HF Datasets community

  16. h4-tests-format-sft-dataset

    • huggingface.co
    Updated Mar 7, 2024
    Cite
    Hugging Face H4 (2024). h4-tests-format-sft-dataset [Dataset]. https://huggingface.co/datasets/HuggingFaceH4/h4-tests-format-sft-dataset
    Explore at: Croissant
    Dataset updated
    Mar 7, 2024
    Dataset provided by
    Hugging Face (https://huggingface.co/)
    Authors
    Hugging Face H4
    Description

    DO NOT DELETE ME! I'M USED IN THE H4 UNIT TESTS :)

  17. codeparrot-clean

    • huggingface.co
    Updated Dec 7, 2021
    Cite
    CodeParrot (2021). codeparrot-clean [Dataset]. https://huggingface.co/datasets/codeparrot/codeparrot-clean
    Explore at: Croissant
    Dataset updated
    Dec 7, 2021
    Dataset provided by
    Good Engineering, Inc
    Authors
    CodeParrot
    Description

    CodeParrot 🦜 Dataset Cleaned

      What is it?
    

    A dataset of Python files from GitHub. This is the deduplicated version of the codeparrot dataset.

      Processing
    

    The original dataset contains a lot of duplicated and noisy data. Therefore, the dataset was cleaned with the following steps:

    Deduplication
    • Remove exact matches

    Filtering
    • Average line length < 100
    • Maximum line length < 1000
    • Alphanumeric characters fraction > 0.25
    • Remove auto-generated files (keyword search)

    For… See the full description on the dataset page: https://huggingface.co/datasets/codeparrot/codeparrot-clean.
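    The cleaning steps listed above can be illustrated with a short sketch. This is not the CodeParrot pipeline itself: the function names passes_filters and dedupe are hypothetical, the keyword list is an assumption (the description only says "keyword search"), and exact-match deduplication is done here by hashing file contents.

```python
import hashlib

# Thresholds taken from the description above.
MAX_MEAN_LINE_LENGTH = 100
MAX_LINE_LENGTH = 1000
MIN_ALNUM_FRACTION = 0.25

# Assumption: illustrative keywords only; the real keyword list is not given.
AUTOGEN_KEYWORDS = ("auto-generated", "autogenerated", "automatically generated")


def passes_filters(source: str) -> bool:
    """Return True if a file's text satisfies all four filtering rules."""
    lines = source.splitlines()
    if not lines:
        return False  # nothing to keep in an empty file
    lengths = [len(line) for line in lines]
    if sum(lengths) / len(lengths) >= MAX_MEAN_LINE_LENGTH:
        return False  # average line length must be < 100
    if max(lengths) >= MAX_LINE_LENGTH:
        return False  # maximum line length must be < 1000
    alnum_count = sum(ch.isalnum() for ch in source)
    if alnum_count / len(source) <= MIN_ALNUM_FRACTION:
        return False  # alphanumeric fraction must be > 0.25
    lowered = source.lower()
    return not any(keyword in lowered for keyword in AUTOGEN_KEYWORDS)


def dedupe(files: list[str]) -> list[str]:
    """Drop exact duplicates, keeping the first occurrence of each file."""
    seen: set[str] = set()
    unique: list[str] = []
    for source in files:
        digest = hashlib.sha256(source.encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(source)
    return unique


if __name__ == "__main__":
    corpus = [
        "import os\nprint(os.getcwd())\n",
        "import os\nprint(os.getcwd())\n",  # exact duplicate
        "# auto-generated by a tool\nx = 1\n",
    ]
    cleaned = [text for text in dedupe(corpus) if passes_filters(text)]
    print(len(cleaned))
```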

  18. pii-masking-65k

    • huggingface.co
    Updated Apr 5, 2024
    + more versions
    Cite
    Ai4Privacy (2024). pii-masking-65k [Dataset]. http://doi.org/10.57967/hf/2012
    Explore at: Croissant
    Dataset updated
    Apr 5, 2024
    Dataset authored and provided by
    Ai4Privacy
    Description

    Purpose and Features

    The purpose of the model and dataset is to remove personally identifiable information (PII) from text, especially in the context of AI assistants and LLMs. The model is a fine-tuned version of "Distilled BERT", a smaller and faster version of BERT. It was adapted for the task of token classification based on what is, to our knowledge, the largest open-source PII masking dataset, which we are releasing simultaneously. The model size is 62 million parameters. The original… See the full description on the dataset page: https://huggingface.co/datasets/ai4privacy/pii-masking-65k.

  19. pii-masking-200k

    • huggingface.co
    Updated Apr 22, 2024
    Cite
    Ai4Privacy (2024). pii-masking-200k [Dataset]. http://doi.org/10.57967/hf/1532
    Explore at: Croissant
    Dataset updated
    Apr 22, 2024
    Dataset authored and provided by
    Ai4Privacy
    Description

    Ai4Privacy Community

    Join our community at https://discord.gg/FmzWshaaQT to help build open datasets for privacy masking.

      Purpose and Features
    

    Previously the world's largest open dataset for privacy; the largest is now pii-masking-300k. The purpose of the dataset is to train models to remove personally identifiable information (PII) from text, especially in the context of AI assistants and LLMs. The example texts have 54 PII classes (types of sensitive data), targeting 229 discussion… See the full description on the dataset page: https://huggingface.co/datasets/ai4privacy/pii-masking-200k.

  20. xlerobot-candybar-007-remove-fourth

    • huggingface.co
    Updated Oct 24, 2025
    Cite
    Siyu Liu (2025). xlerobot-candybar-007-remove-fourth [Dataset]. https://huggingface.co/datasets/siyulw2025/xlerobot-candybar-007-remove-fourth
    Dataset updated
    Oct 24, 2025
    Authors
    Siyu Liu
    Description

    siyulw2025/xlerobot-candybar-007-remove-fourth dataset hosted on Hugging Face and contributed by the HF Datasets community
