MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
philipfourie/reverse dataset hosted on Hugging Face and contributed by the HF Datasets community
helloTR/reverse-seed-data dataset hosted on Hugging Face and contributed by the HF Datasets community
Alignment-Lab-AI/reverse dataset hosted on Hugging Face and contributed by the HF Datasets community
willcb/R1-reverse-wikipedia-paragraphs-v1-100 dataset hosted on Hugging Face and contributed by the HF Datasets community
jianqunZ/Length-v1.1-reverse dataset hosted on Hugging Face and contributed by the HF Datasets community
lucadang/reverse-text-qwen72B dataset hosted on Hugging Face and contributed by the HF Datasets community
jianqunZ/Clarity-v1.1-reverse dataset hosted on Hugging Face and contributed by the HF Datasets community
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) (https://creativecommons.org/licenses/by-nc/4.0/)
License information was derived automatically
CAD-Recode: Reverse Engineering CAD Code from Point Clouds
The CAD-Recode dataset is provided in the form of Python (CadQuery) code. Train size is ~1M and validation size is ~1k. The CAD-Recode model and code are released on GitHub: https://github.com/filaPro/cad-recode. If you like it, give us a GitHub 🌟.
Citation
If you find this work useful for your research, please cite our paper: @misc{rukhovich2024cadrecode, title={CAD-Recode: Reverse Engineering CAD Code from Point… See the full description on the dataset page: https://huggingface.co/datasets/filapro/cad-recode.
Arnab13/LIMA-Reverse-Instructi dataset hosted on Hugging Face and contributed by the HF Datasets community
orionweller/instruct-cl-reverse-embeddings dataset hosted on Hugging Face and contributed by the HF Datasets community
willcb/V3-reverse-wiki-paragraphs-test dataset hosted on Hugging Face and contributed by the HF Datasets community
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Dataset Card for OpenAI HumanEval
Dataset Summary
The HumanEval dataset released by OpenAI includes 164 programming problems, each with a function signature, docstring, body, and several unit tests. The problems were handwritten to ensure that they are not included in the training sets of code generation models.
Supported Tasks and Leaderboards
Languages
The programming problems are written in Python and contain English natural text in comments and docstrings.… See the full description on the dataset page: https://huggingface.co/datasets/openai/openai_humaneval.
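As a rough illustration of how the pieces named in the summary (function signature, docstring, body, unit tests) fit together, the sketch below assembles a prompt, a candidate body, and a test into one program and runs it. The record is a made-up toy problem, not an entry from the dataset, and the field names (`task_id`, `prompt`, `canonical_solution`, `test`, `entry_point`) are assumed here and should be checked against the dataset page. The helper `passes_tests` is hypothetical.

```python
# Hypothetical record shaped like a HumanEval problem (NOT an actual dataset entry).
# Field names are an assumption; verify them on the dataset page.
problem = {
    "task_id": "Example/0",
    "prompt": (
        "def add(a: int, b: int) -> int:\n"
        '    """Return the sum of a and b."""\n'
    ),
    "canonical_solution": "    return a + b\n",
    "test": (
        "def check(candidate):\n"
        "    assert candidate(1, 2) == 3\n"
        "    assert candidate(-1, 1) == 0\n"
    ),
    "entry_point": "add",
}


def passes_tests(problem: dict, completion: str) -> bool:
    """Join prompt + completion + tests, execute them, and report success."""
    program = problem["prompt"] + completion + "\n" + problem["test"]
    namespace: dict = {}
    try:
        # exec is unsandboxed: only run code you trust this way.
        exec(program, namespace)
        # Call the test's check() on the function named by entry_point.
        namespace["check"](namespace[problem["entry_point"]])
    except Exception:
        return False
    return True


print(passes_tests(problem, problem["canonical_solution"]))
print(passes_tests(problem, "    return a - b\n"))
```

Model-generated completions should be run in an isolated sandbox rather than with a bare `exec`; the direct call here only keeps the sketch short.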
worldcuisines/vqa-v1.1-reversed dataset hosted on Hugging Face and contributed by the HF Datasets community
rinabuoy/Eng-Khmer-Agg-Reverse dataset hosted on Hugging Face and contributed by the HF Datasets community
jspr/pop-lyrics-reverse-prompts dataset hosted on Hugging Face and contributed by the HF Datasets community
jkkn/guanaco-reverse-instruct-transformed dataset hosted on Hugging Face and contributed by the HF Datasets community
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
LongForm
The LongForm dataset is created by leveraging English corpus examples with reverse instructions. We select a diverse set of human-written documents from existing corpora such as C4 and Wikipedia and generate instructions for the given documents via LLMs. Then, we extend these examples with structured corpora examples such as Stack Exchange and WikiHow and task examples such as question answering, email writing, grammar error correction, story/poem… See the full description on the dataset page: https://huggingface.co/datasets/akoksal/LongForm.
jianqunZ/Audience-v1.1-reverse dataset hosted on Hugging Face and contributed by the HF Datasets community
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
High-Quality OpenAssistant Subset for Reverse Instruction Generation
Dataset Description
This dataset is a carefully curated subset of the Open Assistant dataset, specifically designed for training models to generate instructions based on responses.
Key Features:
Contains only the highest-rated conversation paths from the original dataset
Filtered to include only English language conversations
Removed mentions of Open Assistant to improve generalizability… See the full description on the dataset page: https://huggingface.co/datasets/Arnab13/guanaco-llama2-reverse-instruct.
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
This dataset was created using LeRobot.
Dataset Structure
meta/info.json: { "codebase_version": "v2.0", "robot_type": "koch", "total_episodes": 1, "total_frames": 449, "total_tasks":1, "total_videos": 2, "total_chunks": 1, "chunks_size": 1000, "fps": 10, "splits": { "train": "0:1" }, "data_path": "data/chunk-{episode_chunk:03d}/episode_{episode_index:06d}.parquet", "video_path":… See the full description on the dataset page: https://huggingface.co/datasets/abbyoneill/reverse.
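The `data_path` entry above is a format template with two placeholders. A minimal sketch of how it might resolve to a concrete file path follows, using only the `chunks_size` and `data_path` values quoted above. The rule that an episode's chunk is `episode_index // chunks_size` is an assumption made for illustration, and `episode_data_path` is a hypothetical helper, not part of LeRobot.

```python
# Values copied from the meta/info.json excerpt above.
info = {
    "chunks_size": 1000,
    "data_path": "data/chunk-{episode_chunk:03d}/episode_{episode_index:06d}.parquet",
}


def episode_data_path(info: dict, episode_index: int) -> str:
    """Fill the data_path template for one episode (hypothetical helper)."""
    # Assumption: episodes are grouped into chunks of `chunks_size` by integer division.
    episode_chunk = episode_index // info["chunks_size"]
    return info["data_path"].format(
        episode_chunk=episode_chunk,
        episode_index=episode_index,
    )


print(episode_data_path(info, 0))  # data/chunk-000/episode_000000.parquet
```

With `total_episodes` equal to 1, only episode index 0 exists in this dataset, so the single data file would sit in chunk 000 under that assumption.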