DS-1000 is a code generation benchmark with a thousand data science questions spanning seven Python libraries that (1) reflects diverse, realistic, and practical use cases, (2) has a reliable metric, and (3) defends against memorization by perturbing questions.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
DS-bench: Code Generation Benchmark for Data Science Code
Abstract
We introduce DS-bench, a new benchmark designed to evaluate large language models (LLMs) on complicated data science code generation tasks. Existing benchmarks, such as DS-1000, often consist of overly simple code snippets, imprecise problem descriptions, and inadequate testing. DS-bench sources 1,000 realistic problems from GitHub across ten widely used Python data science libraries, offering… See the full description on the dataset page: https://huggingface.co/datasets/LaPluma077/DS_bench.
The national transport models are based on structural data from 2010. This dataset contains some of the models for 2040. Other models for 2040, as well as for 2010, 2020 and 2030, are available in the other datasets of the project. For an overview of the data structure, see the document: (Read me first) Projektbeschreibung Verkehrsmodellierung im UVEK D/F/I.
dnanper/ft-dscoder_qwen2-7B_eval-ds1000 dataset hosted on Hugging Face and contributed by the HF Datasets community
dnanper/basemodel-qwen2-7B-eval-ds1000 dataset hosted on Hugging Face and contributed by the HF Datasets community
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
Objdet Ds is a dataset for object detection tasks - it contains Objects annotations for 1,000 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Japan EPI: W: EEP: ECD: DS: Rectifiers data was reported at 1.200 Per 1000 in Dec 2016. This stayed constant from the previous number of 1.200 Per 1000 for Nov 2016. Japan EPI: W: EEP: ECD: DS: Rectifiers data is updated monthly, averaging 1.200 Per 1000 from Jan 1995 (Median) to Dec 2016, with 264 observations. The data reached an all-time high of 1.200 Per 1000 in Dec 2016 and a record low of 1.200 Per 1000 in Dec 2016. Japan EPI: W: EEP: ECD: DS: Rectifiers data remains in active status in CEIC and is reported by Bank of Japan. The data is categorized under Global Database’s Japan – Table JP.I160: Export Price Index: 2010=100: Weight.
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
## Overview
DS PG7 MinneApple is a dataset for object detection tasks - it contains Apple annotations for 1,000 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [MIT license](https://opensource.org/licenses/MIT).
This data set combines vegetation datasets from three mapping project areas in the Sacramento Valley and riparian areas of the San Joaquin Valley to facilitate regional planning, conservation, and enhancement of biological resources by state and local agencies, project partners and regional stakeholders. This dataset meets the National Vegetation Classification Standard and California Vegetation Classification and Mapping Standards. Vegetation is mapped to the Alliance level with a 1-acre minimum mapping unit. Polygons are also attributed with total bird's-eye cover of trees, shrubs and herbs. Detailed reports on the classification and mapping standards can be downloaded (see summary for links).
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Japan EPI: W: EEP: ECD: Discrete Semiconductors (DS) data was reported at 4.400 Per 1000 in Apr 2022. This stayed constant from the previous number of 4.400 Per 1000 for Mar 2022. Japan EPI: W: EEP: ECD: Discrete Semiconductors (DS) data is updated monthly, averaging 4.400 Per 1000 from Jan 2015 (Median) to Apr 2022, with 88 observations. The data reached an all-time high of 4.400 Per 1000 in Apr 2022 and a record low of 4.400 Per 1000 in Apr 2022. Japan EPI: W: EEP: ECD: Discrete Semiconductors (DS) data remains in active status in CEIC and is reported by Bank of Japan. The data is categorized under Global Database’s Japan – Table JP.I143: Export Price Index: 2015=100: Weight.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Japan EPI: W: EEP: ECD: DS: Diodes data was reported at 1.300 Per 1000 in Apr 2022. This stayed constant from the previous number of 1.300 Per 1000 for Mar 2022. Japan EPI: W: EEP: ECD: DS: Diodes data is updated monthly, averaging 1.300 Per 1000 from Jan 2015 (Median) to Apr 2022, with 88 observations. The data reached an all-time high of 1.300 Per 1000 in Apr 2022 and a record low of 1.300 Per 1000 in Apr 2022. Japan EPI: W: EEP: ECD: DS: Diodes data remains in active status in CEIC and is reported by Bank of Japan. The data is categorized under Global Database’s Japan – Table JP.I143: Export Price Index: 2015=100: Weight.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Japan EPI: W: EEP: ECD: DS: Transistors data was reported at 3.100 Per 1000 in Apr 2022. This stayed constant from the previous number of 3.100 Per 1000 for Mar 2022. Japan EPI: W: EEP: ECD: DS: Transistors data is updated monthly, averaging 3.100 Per 1000 from Jan 1980 (Median) to Apr 2022, with 508 observations. The data reached an all-time high of 3.100 Per 1000 in Apr 2022 and a record low of 3.100 Per 1000 in Apr 2022. Japan EPI: W: EEP: ECD: DS: Transistors data remains in active status in CEIC and is reported by Bank of Japan. The data is categorized under Global Database’s Japan – Table JP.I143: Export Price Index: 2015=100: Weight.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Japan CGPI: W: ED: DE: DS: Light Emitting Diodes data was reported at 0.700 Per 1000 in May 2012. This stayed constant from the previous number of 0.700 Per 1000 for Apr 2012. Japan CGPI: W: ED: DE: DS: Light Emitting Diodes data is updated monthly, averaging 0.700 Per 1000 from Jan 2005 (Median) to May 2012, with 89 observations. The data reached an all-time high of 0.700 Per 1000 in May 2012 and a record low of 0.700 Per 1000 in May 2012. Japan CGPI: W: ED: DE: DS: Light Emitting Diodes data remains in active status in CEIC and is reported by Bank of Japan. The data is categorized under Global Database’s Japan – Table JP.I289: Corporate Goods Price Index: 2005=100: Weight.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Japan CGPI: W: ED: DE: DS: Transistors data was reported at 1.500 Per 1000 in May 2012. This stayed constant from the previous number of 1.500 Per 1000 for Apr 2012. Japan CGPI: W: ED: DE: DS: Transistors data is updated monthly, averaging 1.500 Per 1000 from Jan 2005 (Median) to May 2012, with 89 observations. The data reached an all-time high of 1.500 Per 1000 in May 2012 and a record low of 1.500 Per 1000 in May 2012. Japan CGPI: W: ED: DE: DS: Transistors data remains in active status in CEIC and is reported by Bank of Japan. The data is categorized under Global Database’s Japan – Table JP.I289: Corporate Goods Price Index: 2005=100: Weight.
Data licence Germany – Attribution – Version 2.0 https://www.govdata.de/dl-de/by-2-0
License information was derived automatically
The information system gives a highly generalized overview of the distribution of raw material deposits in North Rhine-Westphalia. The map series shows energy raw material deposits (lignite and hard coal, natural gas and mine gas) and non-energy raw material deposits (unconsolidated and solid rock, rock salt), as well as the districts of ore and industrial minerals in NRW.
BLiMP is a challenge set for evaluating what language models (LMs) know about major grammatical phenomena in English. BLiMP consists of 67 sub-datasets, each containing 1000 minimal pairs isolating specific contrasts in syntax, morphology, or semantics. The data is automatically generated according to expert-crafted grammars.
To use this dataset:
import tensorflow_datasets as tfds
ds = tfds.load('blimp', split='train')
for ex in ds.take(4):
    print(ex)
See the guide for more information on tensorflow_datasets.
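Minimal pairs like these are commonly used by checking whether a model scores the acceptable sentence above the unacceptable one. The following is a minimal sketch of that comparison, assuming each pair is a `(good, bad)` tuple and `score` is any function that returns a higher value for sentences the model prefers. The names `minimal_pair_accuracy`, `toy_pairs`, and `toy_score` are hypothetical placeholders for illustration; they are not BLiMP data, a real language model, or part of tensorflow_datasets.

```python
from typing import Callable, Iterable, Tuple


def minimal_pair_accuracy(
    pairs: Iterable[Tuple[str, str]],
    score: Callable[[str], float],
) -> float:
    """Return the fraction of (good, bad) pairs where score(good) > score(bad)."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("no pairs given")
    wins = sum(1 for good, bad in pairs if score(good) > score(bad))
    return wins / len(pairs)


# Hypothetical hand-written pairs, NOT taken from BLiMP.
toy_pairs = [
    ("The cats sleep.", "The cats sleeps."),
    ("She has eaten.", "She has ate."),
]


def toy_score(sentence: str) -> float:
    # Placeholder scorer (assumption): subtract one point for each
    # hard-coded ungrammatical fragment. A real setup would use a
    # language model's log-probability of the sentence instead.
    bad_fragments = ("cats sleeps", "has ate")
    return -float(sum(fragment in sentence for fragment in bad_fragments))


accuracy = minimal_pair_accuracy(toy_pairs, toy_score)
print(accuracy)
```

Swapping `toy_score` for a real model's sentence score would leave `minimal_pair_accuracy` unchanged, since the metric depends only on the ordering of the two scores within each pair.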