MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Can It Edit? Evaluating the Ability of Large Language Models to Follow Code Editing Instructions
CanItEdit is a benchmark for evaluating LLMs on instructional code editing, the task of updating a program given a natural language instruction. The benchmark contains 105 hand-crafted Python programs with before and after code blocks, two types of natural language instructions (descriptive and lazy), and a hidden test suite. The dataset’s dual natural language instructions test model… See the full description on the dataset page: https://huggingface.co/datasets/nuprl/CanItEdit.
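The before/after code blocks plus dual instructions described above can be sketched as a simple record and prompt template. The field names below are illustrative assumptions for exposition, not CanItEdit's official schema.

```python
# Sketch of a CanItEdit-style record and an edit-prompt builder.
# Field names ("before", "after", "instruction_descriptive",
# "instruction_lazy") are assumptions, not the verified dataset schema.

def build_edit_prompt(before: str, instruction: str) -> str:
    """Format a code-editing prompt from the pre-edit code and the instruction."""
    return (
        "## Code Before:\n" + before + "\n"
        "## Instruction:\n" + instruction + "\n"
        "## Code After:\n"
    )

record = {
    "before": "def add(a, b):\n    return a - b\n",
    "after": "def add(a, b):\n    return a + b\n",
    "instruction_descriptive": "Fix the bug: add should return the sum, not the difference.",
    "instruction_lazy": "fix add",
}

prompt = build_edit_prompt(record["before"], record["instruction_descriptive"])
```

The model's completion would then be checked against the hidden test suite rather than string-matched against the "after" block.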
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Dataset Card for PIPE Masks Dataset
Dataset Summary
The PIPE (Paint by InPaint Edit) dataset is designed to enhance the efficacy of mask-free, instruction-following image editing models by providing a large-scale collection of image pairs and diverse object addition instructions. Here, we provide the masks used in the inpainting process to generate the source images for the PIPE dataset, for both the train and test sets. Further details can be found on our project page… See the full description on the dataset page: https://huggingface.co/datasets/paint-by-inpaint/PIPE_Masks.
VectorEdits: A Dataset and Benchmark for Instruction-Based Editing of Vector Graphics
NOTE: Currently only the test set has generated labels; the other splits will receive them soon. Paper: (Soon) We introduce a large-scale dataset for instruction-guided vector image editing, consisting of over 270,000 pairs of SVG images, each accompanied by a natural language edit instruction. Our dataset enables training and evaluation of models that modify vector graphics based on textual commands. We describe the data… See the full description on the dataset page: https://huggingface.co/datasets/mikronai/VectorEdits.
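A VectorEdits-style (source SVG, instruction, target SVG) pair can be represented as below. The record layout is an assumption for illustration; the only real constraint it demonstrates is that both sides must be well-formed SVG/XML before a model can train on them.

```python
# Minimal sketch of an SVG edit pair; the dict layout is an illustrative
# assumption, not the dataset's verified schema.
import xml.etree.ElementTree as ET

pair = {
    "instruction": "Change the circle's fill from red to blue.",
    "source_svg": '<svg xmlns="http://www.w3.org/2000/svg"><circle r="10" fill="red"/></svg>',
    "target_svg": '<svg xmlns="http://www.w3.org/2000/svg"><circle r="10" fill="blue"/></svg>',
}

# Both sides should parse as XML; a malformed pair would be unusable for training.
source_root = ET.fromstring(pair["source_svg"])
target_root = ET.fromstring(pair["target_svg"])
```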
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Dataset Card for MagicBrush
Dataset Summary
MagicBrush is the first large-scale, manually-annotated instruction-guided image editing dataset covering diverse scenarios: single-turn, multi-turn, mask-provided, and mask-free editing. MagicBrush comprises 10K (source image, instruction, target image) triples, which is sufficient to train large-scale image editing models. Please check our website to explore more visual results.
Dataset Structure
"img_id" (str):… See the full description on the dataset page: https://huggingface.co/datasets/osunlp/MagicBrush.
The ImageNet dataset contains 14,197,122 annotated images organized according to the WordNet hierarchy. Since 2010, the dataset has been used in the ImageNet Large Scale Visual Recognition Challenge (ILSVRC), a benchmark in image classification and object detection. The publicly released dataset contains a set of manually annotated training images. A set of test images is also released, with the manual annotations withheld. ILSVRC annotations fall into one of two categories: (1) image-level annotation of a binary label for the presence or absence of an object class in the image, e.g., “there are cars in this image” but “there are no tigers”; and (2) object-level annotation of a tight bounding box and class label around an object instance in the image, e.g., “there is a screwdriver centered at position (20,25) with a width of 50 pixels and a height of 30 pixels”. The ImageNet project does not own the copyright of the images; therefore, only thumbnails and URLs of images are provided.
Total number of non-empty WordNet synsets: 21,841
Total number of images: 14,197,122
Number of images with bounding box annotations: 1,034,908
Number of synsets with SIFT features: 1,000
Number of images with SIFT features: 1.2 million
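The object-level annotation example above (a screwdriver centered at (20,25) with width 50 and height 30) describes a box in center+size form; converting it to corner coordinates is a common preprocessing step for detectors. This helper is an illustrative sketch, not part of ImageNet's tooling.

```python
# Convert a (center_x, center_y, width, height) bounding box to corner
# (x_min, y_min, x_max, y_max) form. Illustrative helper, not ImageNet code.

def center_to_corners(cx: float, cy: float, w: float, h: float):
    """Return the (x_min, y_min, x_max, y_max) corners of a center+size box."""
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)

# The screwdriver example from the text: center (20, 25), width 50, height 30.
box = center_to_corners(20, 25, 50, 30)
```

Note that the example box extends past the left image edge (x_min is negative), so real pipelines typically clip corners to image bounds afterward.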
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Dataset Card for CoEdIT: Text Editing via Instruction Tuning
Paper: CoEdIT: Text Editing by Task-Specific Instruction Tuning
Authors: Vipul Raheja, Dhruv Kumar, Ryan Koo, Dongyeop Kang
Project Repo: https://github.com/vipulraheja/coedit
Dataset Summary
This is the dataset that was used to train the CoEdIT text editing models. Full details of the dataset can be found in our paper.
Dataset Structure
The dataset is in JSON format.… See the full description on the dataset page: https://huggingface.co/datasets/grammarly/coedit.
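Since the card only says the dataset is in JSON format, here is a hedged sketch of parsing one CoEdIT-style record. The field names ("task", "src", "tgt") and the instruction-prefixed "src" string are assumptions based on the paper's instruction-tuning setup, not a verified schema.

```python
# Sketch of parsing a CoEdIT-style JSON record; field names and the
# "instruction: source text" format of "src" are assumptions.
import json

line = json.dumps({
    "_id": "1",
    "task": "gec",
    "src": "Fix grammatical errors in this sentence: She go to school.",
    "tgt": "She goes to school.",
})

record = json.loads(line)
# Split the instruction prefix from the text to be edited.
instruction, _, source_text = record["src"].partition(": ")
```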
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
🐋 The OpenOrca Dataset! 🐋
We are thrilled to announce the release of the OpenOrca dataset! This rich collection of augmented FLAN data aligns, as best as possible, with the distributions outlined in the Orca paper. It has been instrumental in generating high-performing model checkpoints and serves as a valuable resource for all NLP researchers and developers!
Official Models
Mistral-7B-OpenOrca
Our latest model, the first 7B to score better overall than all… See the full description on the dataset page: https://huggingface.co/datasets/Open-Orca/OpenOrca.
Attribution-ShareAlike 3.0 (CC BY-SA 3.0): https://creativecommons.org/licenses/by-sa/3.0/
License information was derived automatically
Dataset Card for "dolly_hhrlhf"
This dataset is a combination of Databricks' dolly-15k dataset and a filtered subset of Anthropic's HH-RLHF. It also includes a test split, which was missing in the original dolly set. That test set is composed of 200 randomly selected samples from dolly plus 4,929 of the HH-RLHF test set samples that made it through the filtering process. The train set contains 59,310 samples: 15,014 - 200 = 14,814 from Dolly, and the remaining 44,496 from… See the full description on the dataset page: https://huggingface.co/datasets/mosaicml/dolly_hhrlhf.
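The split arithmetic from the card can be written out explicitly: 200 Dolly samples move to the test split, and the filtered HH-RLHF subset contributes to both splits.

```python
# Split composition of dolly_hhrlhf, using only counts stated in the card.
dolly_total = 15_014   # Dolly samples after filtering
dolly_test = 200       # randomly selected Dolly samples held out for test
hhrlhf_train = 44_496  # filtered HH-RLHF samples in the train split
hhrlhf_test = 4_929    # filtered HH-RLHF test-set samples

train_size = (dolly_total - dolly_test) + hhrlhf_train  # 14,814 + 44,496
test_size = dolly_test + hhrlhf_test                    # 200 + 4,929
```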
Other license: https://choosealicense.com/licenses/other/
Dataset Card for "imdb"
Dataset Summary
Large Movie Review Dataset. This is a dataset for binary sentiment classification containing substantially more data than previous benchmark datasets. We provide a set of 25,000 highly polar movie reviews for training, and 25,000 for testing. There is additional unlabeled data for use as well.
Supported Tasks and Leaderboards
More Information Needed
Languages
More Information Needed
Dataset Structure… See the full description on the dataset page: https://huggingface.co/datasets/stanfordnlp/imdb.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Update
[01/31/2024] We updated the OpenAI Moderation API results for ToxicChat (0124) based on their moderation model updated on Jan 25, 2024.
[01/28/2024] We released an official T5-Large model trained on ToxicChat (toxicchat0124). Go check it out for your baseline comparison!
[01/19/2024] We have a new version of ToxicChat (toxicchat0124)!
Content
This dataset contains toxicity annotations on 10K user prompts collected from the Vicuna online demo. We utilize a human-AI… See the full description on the dataset page: https://huggingface.co/datasets/lmsys/toxic-chat.
✍️ Commit Message Edits Dataset
This dataset is a collection of expert-labeled commit message edits contributed via the Commit Message Editing app presented in Towards Realistic Evaluation of Commit Message Generation by Matching Online and Offline Settings. Labelers were presented with GPT-4-generated messages for 15 commits from the CMG benchmark in Long Code Arena and asked to manually edit them until they were of good enough quality to submit to VCS. You can check the Manual tab in our… See the full description on the dataset page: https://huggingface.co/datasets/JetBrains-Research/commit-msg-edits.
Dataset origin: https://live.european-language-grid.eu/catalogue/corpus/709/
Description
The human evaluation (HE) dataset created for English to German (EnDe) and English to French (EnFr) MT tasks was a subset of one of the official test sets of the IWSLT 2016 evaluation campaign. The resulting HE sets are composed of 600 segments for both EnDe and EnFr, each corresponding to around 10,000 words. Human evaluation was based on Post-Editing, i.e. the manual correction of the MT… See the full description on the dataset page: https://huggingface.co/datasets/FrancophonIA/IWSLT_2016.
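Post-editing effort of the kind described above is commonly scored with an HTER-style rate: the word-level edit distance between the raw MT output and its human post-edit, normalized by the post-edit length. The sketch below uses plain word-level Levenshtein distance (no TER-style shift operations) and is not the IWSLT campaign's official scorer.

```python
# Simplified HTER-style score: word-level Levenshtein distance between the
# MT output and its post-edit, divided by the post-edit length. Not the
# official IWSLT/TER scorer (real TER also models block shifts).

def word_edit_rate(mt_output: str, post_edit: str) -> float:
    hyp, ref = mt_output.split(), post_edit.split()
    # Classic dynamic-programming Levenshtein over word tokens.
    prev = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, start=1):
        curr = [i]
        for j, r in enumerate(ref, start=1):
            cost = 0 if h == r else 1
            curr.append(min(prev[j] + 1,        # deletion
                            curr[j - 1] + 1,    # insertion
                            prev[j - 1] + cost))  # substitution/match
        prev = curr
    return prev[-1] / max(len(ref), 1)

# Two words ("on", "the") must be inserted into the MT output: rate = 2/6.
rate = word_edit_rate("the cat sat mat", "the cat sat on the mat")
```

A lower rate means the MT output needed fewer corrections, i.e. higher quality under the post-editing view of evaluation.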
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Dataset Card for "commonsense_qa"
Dataset Summary
CommonsenseQA is a new multiple-choice question answering dataset that requires different types of commonsense knowledge to predict the correct answers. It contains 12,102 questions with one correct answer and four distractor answers. The dataset is provided in two major training/validation/testing set splits: "Random split", which is the main evaluation split, and "Question token split"; see the paper for details.… See the full description on the dataset page: https://huggingface.co/datasets/tau/commonsense_qa.
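An item with one correct answer and four distractors can be sketched as below. The nested choices layout mirrors how multiple-choice datasets are often stored on Hugging Face, but it is shown here as an illustrative assumption, and the question text is invented.

```python
# Sketch of a CommonsenseQA-style item: one question, five labeled choices
# (one correct, four distractors), and an answer key. Layout and content
# are illustrative assumptions.
item = {
    "question": "Where would you put a plate after washing it?",
    "choices": {
        "label": ["A", "B", "C", "D", "E"],
        "text": ["dishwasher", "cupboard", "table", "oven", "sink"],
    },
    "answerKey": "B",
}

# Resolve the gold answer text from the answer key.
gold_index = item["choices"]["label"].index(item["answerKey"])
gold_text = item["choices"]["text"][gold_index]
```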
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset provides the template sentences and relationships defined in the ATOMIC common sense dataset. There are three splits - train, test, and dev.
From the authors.
Disclaimer/Content warning: the events in ATOMIC have been automatically extracted from blogs, stories, and books written at various times. The events might depict violent or problematic actions, which we left in the corpus for the sake of learning the (probably negative but still important) commonsense implications associated with the events. We removed a small set of truly outdated events, but might have missed some, so please email us (msap@cs.washington.edu) if you have any concerns.