MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
huggingface-projects/DELETE-bot-fight-data dataset hosted on Hugging Face and contributed by the HF Datasets community

Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This dataset was created using LeRobot.
Dataset Structure
meta/info.json: { "codebase_version": "v2.1", "robot_type": "bi_piper", "total_episodes": 2, "total_frames": 1216, "total_tasks": 1, "total_videos": 8, "total_chunks": 1, "chunks_size": 1000, "fps": 30, "splits": { "train": "0:2" }, "data_path": "data/chunk-{episode_chunk:03d}/episode_{episode_index:06d}.parquet", "video_path":… See the full description on the dataset page: https://huggingface.co/datasets/cortexairobot/delete-episodes-from-dataset-1.
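As a rough illustration of how the "data_path" template in the excerpt above expands into file names, here is a minimal sketch. The template, "chunks_size", and "total_episodes" values are copied from the excerpt; the helper name episode_parquet_path and the rule that the chunk index is the episode index integer-divided by the chunk size are assumptions made for illustration, not something the dataset card states.

```python
# Values copied from the meta/info.json excerpt above.
DATA_PATH = "data/chunk-{episode_chunk:03d}/episode_{episode_index:06d}.parquet"
CHUNKS_SIZE = 1000
TOTAL_EPISODES = 2


def episode_parquet_path(episode_index: int) -> str:
    """Expand the data_path template for one episode (hypothetical helper)."""
    # Assumption: episodes are grouped into chunks by integer division.
    episode_chunk = episode_index // CHUNKS_SIZE
    return DATA_PATH.format(
        episode_chunk=episode_chunk,
        episode_index=episode_index,
    )


paths = [episode_parquet_path(i) for i in range(TOTAL_EPISODES)]
for path in paths:
    print(path)
# data/chunk-000/episode_000000.parquet
# data/chunk-000/episode_000001.parquet
```

With only 2 episodes and a chunk size of 1000, both files would land in chunk 000 under this assumption.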

RLHFlow/lmsys-delete-tie-standard dataset hosted on Hugging Face and contributed by the HF Datasets community

SimuGPT/2bus-delete-block-all-formats dataset hosted on Hugging Face and contributed by the HF Datasets community

selfcorrexp2/Non-Delete-ORM-Llama3-tmp07-prompt dataset hosted on Hugging Face and contributed by the HF Datasets community

YarvixPA/Remove-Watermarks-Dataset dataset hosted on Hugging Face and contributed by the HF Datasets community

Dataset Card for "deleted-2"
More Information needed

unography/synth-bg-remove-v4-genbg-1499 dataset hosted on Hugging Face and contributed by the HF Datasets community

Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
ROSE: Remove Objects with Side Effects in Videos Dataset
This repository contains the dataset released alongside the paper ROSE: Remove Objects with Side Effects in Videos.
Paper: ROSE: Remove Objects with Side Effects in Videos
Project Page: https://rose2025-inpaint.github.io/
Code: https://github.com/Kunbyte-AI/ROSE
Abstract
Video object removal has achieved advanced performance due to the recent success of video generative models. However, when addressing the… See the full description on the dataset page: https://huggingface.co/datasets/Kunbyte/ROSE-Dataset.

https://choosealicense.com/licenses/odbl/
MeSH 2025 Update - Delete Report
Description
(Includes MeSH 2023 and 2024 changes) The MeSH 2025 Update - Delete Report lists Descriptors and Supplementary Concept Records (SCRs) that have been removed from MeSH. This report includes MeSH changes from previous years, starting from 2023.
Dataset Details
Publisher: National Library of Medicine
Last Modified: 2024-12-16
Contact: National Library of Medicine (custserv@nlm.nih.gov)
Source
Original… See the full description on the dataset page: https://huggingface.co/datasets/HHS-Official/mesh-2025-update-delete-report.

HanningZhang/Non-Delete-ORM-Llama3-tmp07-N3-Rewards dataset hosted on Hugging Face and contributed by the HF Datasets community

thanhsc02/filter-dataset-from-6k-delete-empty dataset hosted on Hugging Face and contributed by the HF Datasets community

Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This dataset was created using LeRobot.
Dataset Structure
meta/info.json: { "codebase_version": "v2.1", "trossen_subversion": "v1.0", "robot_type": "trossen_ai_stationary", "total_episodes": 21, "total_frames": 9383, "total_tasks": 1, "total_videos": 84, "total_chunks": 1, "chunks_size": 1000, "fps": 30, "splits": { "train": "0:21" }, "data_path": "data/chunk-{episode_chunk:03d}/episode_{episode_index:06d}.parquet"… See the full description on the dataset page: https://huggingface.co/datasets/argus-systems/pickup-carrot-remove-parquet-metadata.

Stanford Question Answering Dataset (SQuAD) is a reading comprehension dataset, consisting of questions posed by crowdworkers on a set of Wikipedia articles, where the answer to every question is a segment of text, or span, from the corresponding reading passage, or the question might be unanswerable.

thanhsc02/long-sentence-dataset-from-6k-delete-empty dataset hosted on Hugging Face and contributed by the HF Datasets community

DO NOT DELETE ME! I'M USED IN THE H4 UNIT TESTS :)

CodeParrot 🦜 Dataset Cleaned
What is it?
A dataset of Python files from GitHub. This is the deduplicated version of the codeparrot dataset.
Processing
The original dataset contains a lot of duplicated and noisy data. Therefore, the dataset was cleaned with the following steps:
Deduplication: remove exact matches
Filtering: average line length < 100; maximum line length < 1000; alphanumeric character fraction > 0.25; remove auto-generated files (keyword search)
For… See the full description on the dataset page: https://huggingface.co/datasets/codeparrot/codeparrot-clean.
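The cleaning steps listed above can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the code used to build the dataset: the thresholds come from the description, while the function names, the use of SHA-256 hashes for exact-match deduplication, and the keyword list for spotting auto-generated files are all hypothetical.

```python
import hashlib

# Hypothetical keywords; the description only says "keyword search".
AUTOGEN_KEYWORDS = ("auto-generated", "autogenerated", "automatically generated")


def passes_filters(source: str) -> bool:
    """Apply the line-length, alphanumeric-fraction and keyword filters."""
    lines = source.splitlines()
    if not lines:
        return False  # nothing to measure in an empty file
    lengths = [len(line) for line in lines]
    mean_len = sum(lengths) / len(lengths)
    max_len = max(lengths)
    alnum_fraction = sum(ch.isalnum() for ch in source) / len(source)
    looks_generated = any(k in source.lower() for k in AUTOGEN_KEYWORDS)
    return (
        mean_len < 100
        and max_len < 1000
        and alnum_fraction > 0.25
        and not looks_generated
    )


def clean_files(files: list[str]) -> list[str]:
    """Drop exact duplicates (first occurrence wins), then filter."""
    seen = set()
    kept = []
    for source in files:
        digest = hashlib.sha256(source.encode("utf-8")).hexdigest()
        if digest in seen:
            continue  # exact match of a file already seen
        seen.add(digest)
        if passes_filters(source):
            kept.append(source)
    return kept
```

Hashing the full file content is one simple way to detect exact matches without keeping every file in memory for comparison.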

Purpose and Features
The purpose of the model and dataset is to remove personally identifiable information (PII) from text, especially in the context of AI assistants and LLMs. The model is a fine-tuned version of "Distilled BERT", a smaller and faster version of BERT. It was adapted for the task of token classification based on what is, to our knowledge, the largest open-source PII masking dataset, which we are releasing simultaneously. The model size is 62 million parameters. The original… See the full description on the dataset page: https://huggingface.co/datasets/ai4privacy/pii-masking-65k.

Ai4Privacy Community
Join our community at https://discord.gg/FmzWshaaQT to help build open datasets for privacy masking.
Purpose and Features
Previously the world's largest open dataset for privacy; that is now pii-masking-300k. The purpose of the dataset is to train models to remove personally identifiable information (PII) from text, especially in the context of AI assistants and LLMs. The example texts have 54 PII classes (types of sensitive data), targeting 229 discussion… See the full description on the dataset page: https://huggingface.co/datasets/ai4privacy/pii-masking-200k.

siyulw2025/xlerobot-candybar-007-remove-fourth dataset hosted on Hugging Face and contributed by the HF Datasets community