https://choosealicense.com/licenses/cdla-permissive-1.0/
Dataset Card for Dataset Name
This dataset card aims to be a base template for new datasets. It has been generated using this raw template.
Dataset Details
Dataset Description
Curated by: [More Information Needed]
Funded by [optional]: [More Information Needed]
Shared by [optional]: [More Information Needed]
Language(s) (NLP): [More Information Needed]
License: [More Information Needed]
Dataset Sources [optional]
Repository: [More… See the full description on the dataset page: https://huggingface.co/datasets/mmtg/train-test-valid.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
Valid And Test is a dataset for object detection tasks - it contains Valid And Test annotations for 4,145 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
Extrusion Valid & Test is a dataset for object detection tasks - it contains Defects annotations for 200 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
Test Valid is a dataset for object detection tasks - it contains Biodegradable Non Biodegradable annotations for 1,530 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
nxtr-kiranshivaraju/mft-valid-dataset-test dataset hosted on Hugging Face and contributed by the HF Datasets community
ontocord/VALID-test-2 dataset hosted on Hugging Face and contributed by the HF Datasets community
This dataset was created by Shidqie Taufiqurrahman
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Dataset Card for Alpaca
I have just performed a train, test, and validation split on the original dataset. A repository to reproduce this will be shared here soon. I am including the original dataset card as follows.
Dataset Summary
Alpaca is a dataset of 52,000 instructions and demonstrations generated by OpenAI's text-davinci-003 engine. This instruction data can be used to conduct instruction-tuning for language models and make the language model follow instructions better.… See the full description on the dataset page: https://huggingface.co/datasets/disham993/alpaca-train-validation-test-split.
https://creativecommons.org/publicdomain/zero/1.0/
Please upvote if you find this dataset of use. Thank you!

This version is an update of the earlier version. I ran a dataset-quality evaluation program on the previous version, which found a considerable number of duplicate and near-duplicate images. Duplicate images can lead to falsely high validation and test set accuracy, so I have eliminated these images in this version of the dataset.

Images were gathered from internet searches and scanned with a duplicate-image detector program I wrote. Any duplicate images were removed to prevent bleed-through of images between the train, test, and valid datasets. All images were then resized to 224 × 224 × 3 and converted to jpg format. A csv file is included that, for each image file, contains the relative path to the image file, the image file's class label, and the dataset (train, test, or valid) that the image file resides in.

This is a clean dataset. If you build a good model you should achieve at least 95% accuracy on the test set; with a very good model, for example one using transfer learning, you should be able to achieve 98%+ test set accuracy. If you find this dataset useful please upvote. Thanks!
Collection of sports images covering 100 different sports. Images are 224 × 224 × 3 jpg format. Data is separated into train, test, and valid directories. Additionally, a csv file is included for those who wish to use it to create their own train, test, and validation datasets.
Wanted to build a high-quality, clean dataset that was easy to use and had no bad images or duplication between the train, test, and validation datasets. It provides a good dataset to test your models on, and is designed for straightforward application of Keras preprocessing functions such as ImageDataGenerator.flow_from_directory or, if you use the csv file, ImageDataGenerator.flow_from_dataframe. This dataset was carefully created so that the region of interest (ROI), in this case the sport, occupies approximately 50% of the pixels in each image. As a consequence, even models of moderate complexity should achieve training and validation accuracies in the high 90s.
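As a minimal sketch of the csv-based route, the included file can be split into per-set (train/test/valid) lists with the standard library alone. Note the column names used here (`filepaths`, `labels`, `data set`) are assumptions for illustration — check the csv's actual header row before use:

```python
import csv
import io
from collections import defaultdict

# Inline sample standing in for the dataset's csv file; the header names
# below are hypothetical and should be checked against the real file.
SAMPLE_CSV = """filepaths,labels,data set
train/archery/001.jpg,archery,train
test/archery/002.jpg,archery,test
valid/archery/003.jpg,archery,valid
"""

def split_by_dataset(csv_file):
    """Group (image path, class label) pairs by their train/test/valid assignment."""
    splits = defaultdict(list)
    for row in csv.DictReader(csv_file):
        splits[row["data set"]].append((row["filepaths"], row["labels"]))
    return dict(splits)

splits = split_by_dataset(io.StringIO(SAMPLE_CSV))
print(sorted(splits))  # ['test', 'train', 'valid']
```

The same per-split lists can then feed ImageDataGenerator.flow_from_dataframe, or you can bypass the csv entirely and point flow_from_directory at the train, test, and valid directories.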
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
Kobirds451 Valid Test 2 is a dataset for object detection tasks - it contains Bird annotations for 4,508 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Introduction
This repository hosts the Testing Roads for Autonomous VEhicLes (TRAVEL) dataset. TRAVEL is an extensive collection of virtual roads that have been used for testing lane assist/keeping systems (i.e., driving agents), along with data from their execution in a state-of-the-art, physically accurate driving simulator called BeamNG.tech. Virtual roads consist of sequences of road points interpolated using cubic splines.
Along with the data, this repository contains instructions on how to install the tooling necessary to generate new data (i.e., test cases) and analyze them in the context of test regression. We focus on test selection and test prioritization, given their importance for developing high-quality software following the DevOps paradigms.
This dataset builds on top of our previous work in this area, including work on
test generation (e.g., AsFault, DeepJanus, and DeepHyperion) and the SBST CPS tool competition (SBST2021),
test selection: SDC-Scissor and related tool
test prioritization: automated test cases prioritization work for SDCs.
Dataset Overview
The TRAVEL dataset is available under the data folder and is organized as a set of experiments folders. Each of these folders is generated by running the test-generator (see below) and contains the configuration used for generating the data (experiment_description.csv), various statistics on generated tests (generation_stats.csv) and found faults (oob_stats.csv). Additionally, the folders contain the raw test cases generated and executed during each experiment (test..json).
The following sections describe what each of those files contains.
Experiment Description
The experiment_description.csv contains the settings used to generate the data, including:
Time budget. The overall generation budget in hours. This budget includes both the time to generate and execute the tests as driving simulations.
The size of the map. The size of the squared map defines the boundaries inside which the virtual roads develop in meters.
The test subject. The driving agent that implements the lane-keeping system under test. The TRAVEL dataset contains data generated testing the BeamNG.AI and the end-to-end Dave2 systems.
The test generator. The algorithm that generated the test cases. The TRAVEL dataset contains data obtained using various algorithms, ranging from naive and advanced random generators to complex evolutionary algorithms, for generating tests.
The speed limit. The maximum speed at which the driving agent under test can travel.
Out of Bound (OOB) tolerance. The test cases' oracle that defines the tolerable amount of the ego-car that can lie outside the lane boundaries. This parameter ranges between 0.0 and 1.0. In the former case, a test failure triggers as soon as any part of the ego-vehicle goes out of the lane boundary; in the latter case, a test failure triggers only if the entire body of the ego-car falls outside the lane.
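The OOB oracle described above can be sketched as a small predicate (a minimal illustration; the function name is ours, not part of the TRAVEL tooling, and `oob_percentage` denotes the fraction of the ego-car's body outside the lane):

```python
def oob_failure(oob_percentage: float, oob_tolerance: float) -> bool:
    """Return True if a test fails under the OOB oracle.

    oob_percentage: fraction of the ego-car outside the lane boundary (0.0-1.0).
    oob_tolerance: tolerable out-of-bound fraction (0.0-1.0). With 0.0, any part
    of the car leaving the lane triggers a failure; with a tolerance near 1.0,
    essentially the entire body must be outside the lane to fail. (The exact
    boundary-value semantics here are a simplification.)
    """
    return oob_percentage > oob_tolerance

print(oob_failure(0.96, 0.95))  # True: more of the car is out of bounds than tolerated
print(oob_failure(0.30, 0.95))  # False
```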
Experiment Statistics
The generation_stats.csv contains statistics about the test generation, including:
Total number of generated tests. The number of tests generated during an experiment. This number is broken down into the number of valid tests and invalid tests. Valid tests contain virtual roads that do not self-intersect and contain turns that are not too sharp.
Test outcome. The test outcome contains the number of passed tests, failed tests, and tests in error. Passed and failed tests are defined by the OOB tolerance and an additional (implicit) oracle that checks whether the ego-car is moving or standing. Tests that did not pass because of other errors (e.g., the simulator crashed) are reported in a separate category.
The TRAVEL dataset also contains statistics about the failed tests, including the overall number of failed tests (total oob) and its breakdown into OOB that happened while driving left or right. Further statistics about the diversity (i.e., sparseness) of the failures are also reported.
Test Cases and Executions
Each test..json contains information about a test case and, if the test case is valid, the data observed during its execution as a driving simulation.
The data about the test case definition include:
The road points. The list of points in a 2D space that identify the center of the virtual road, and their interpolation using cubic splines (interpolated_points).
The test ID. The unique identifier of the test in the experiment.
Validity flag and explanation. A flag that indicates whether the test is valid or not, and a brief message describing why the test is not considered valid (e.g., the road contains sharp turns or the road self intersects)
The test data are organized according to the following JSON Schema and can be interpreted as RoadTest objects provided by the tests_generation.py module.
{
  "type": "object",
  "properties": {
    "id": { "type": "integer" },
    "is_valid": { "type": "boolean" },
    "validation_message": { "type": "string" },
    "road_points": {
      "type": "array",
      "items": { "$ref": "schemas/pair" }
    },
    "interpolated_points": {
      "type": "array",
      "items": { "$ref": "schemas/pair" }
    },
    "test_outcome": { "type": "string" },
    "description": { "type": "string" },
    "execution_data": {
      "type": "array",
      "items": { "$ref": "schemas/simulationdata" }
    }
  },
  "required": [ "id", "is_valid", "validation_message", "road_points", "interpolated_points" ]
}
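As an illustration, a single test-case file matching this schema can be loaded and checked with the standard library. This is a minimal sketch: the inline sample below is invented for demonstration, and real data comes from the per-experiment test JSON files described above:

```python
import json

# Required fields, copied from the schema's "required" list.
REQUIRED = ["id", "is_valid", "validation_message",
            "road_points", "interpolated_points"]

# Invented sample test case for demonstration only.
sample = json.loads("""
{
  "id": 1,
  "is_valid": true,
  "validation_message": "",
  "road_points": [[10.0, 10.0], [50.0, 80.0], [120.0, 90.0]],
  "interpolated_points": [[10.0, 10.0], [30.2, 46.1], [50.0, 80.0]]
}
""")

def check_test_case(tc: dict) -> None:
    """Raise ValueError if a field required by the schema is missing."""
    missing = [key for key in REQUIRED if key not in tc]
    if missing:
        raise ValueError(f"missing required fields: {missing}")

check_test_case(sample)
print(sample["is_valid"], len(sample["road_points"]))  # True 3
```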
Finally, the execution data contain a list of timestamped state information recorded by the driving simulation. State information is collected at constant frequency and includes absolute position, rotation, and velocity of the ego-car, its speed in Km/h, and control inputs from the driving agent (steering, throttle, and braking). Additionally, execution data contain OOB-related data, such as the lateral distance between the car and the lane center and the OOB percentage (i.e., how much the car is outside the lane).
The simulation data adhere to the following (simplified) JSON Schema and can be interpreted as Python objects using the simulation_data.py module.
{
  "$id": "schemas/simulationdata",
  "type": "object",
  "properties": {
    "timer": { "type": "number" },
    "pos": { "type": "array", "items": { "$ref": "schemas/triple" } },
    "vel": { "type": "array", "items": { "$ref": "schemas/triple" } },
    "vel_kmh": { "type": "number" },
    "steering": { "type": "number" },
    "brake": { "type": "number" },
    "throttle": { "type": "number" },
    "is_oob": { "type": "number" },
    "oob_percentage": { "type": "number" }
  },
  "required": [ "timer", "pos", "vel", "vel_kmh", "steering", "brake", "throttle", "is_oob", "oob_percentage" ]
}
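For instance, a run's headline numbers (maximum OOB percentage, mean speed, whether the car ever went out of bounds) can be aggregated from these records. A minimal sketch over invented records; real ones come from the execution_data array of a valid test case:

```python
# Invented simulation records for demonstration; field names follow the
# simulationdata schema above (pos/vel omitted for brevity).
records = [
    {"timer": 0.0, "vel_kmh": 0.0,  "is_oob": 0, "oob_percentage": 0.00},
    {"timer": 0.1, "vel_kmh": 42.5, "is_oob": 0, "oob_percentage": 0.00},
    {"timer": 0.2, "vel_kmh": 58.3, "is_oob": 1, "oob_percentage": 0.12},
]

def summarize(run: list) -> dict:
    """Aggregate a run's timestamped state records into summary statistics."""
    return {
        "max_oob": max(r["oob_percentage"] for r in run),
        "mean_speed_kmh": sum(r["vel_kmh"] for r in run) / len(run),
        "ever_oob": any(r["is_oob"] for r in run),
    }

print(summarize(records))
```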
Dataset Content
The TRAVEL dataset is a lively initiative so the content of the dataset is subject to change. Currently, the dataset contains the data collected during the SBST CPS tool competition, and data collected in the context of our recent work on test selection (SDC-Scissor work and tool) and test prioritization (automated test cases prioritization work for SDCs).
SBST CPS Tool Competition Data
The data collected during the SBST CPS tool competition are stored inside data/competition.tar.gz. The file contains the test cases generated by Deeper, Frenetic, AdaFrenetic, and Swat, the open-source test generators submitted to the competition and executed against BeamNG.AI with an aggression factor of 0.7 (i.e., conservative driver).
| Name    | Map Size (m x m) | Max Speed (Km/h) | Budget (h)    | OOB Tolerance (%) | Test Subject    |
|---------|------------------|------------------|---------------|-------------------|-----------------|
| DEFAULT | 200 × 200        | 120              | 5 (real time) | 0.95              | BeamNG.AI - 0.7 |
| SBST    | 200 × 200        | 70               | 2 (real time) | 0.5               | BeamNG.AI - 0.7 |
Specifically, the TRAVEL dataset contains 8 repetitions for each of the above configurations for each test generator totaling 64 experiments.
SDC Scissor
With SDC-Scissor we collected data based on the Frenetic test generator. The data is stored inside data/sdc-scissor.tar.gz. The following table summarizes the used parameters.
| Name        | Map Size (m x m) | Max Speed (Km/h) | Budget (h)     | OOB Tolerance (%) | Test Subject    |
|-------------|------------------|------------------|----------------|-------------------|-----------------|
| SDC-SCISSOR | 200 × 200        | 120              | 16 (real time) | 0.5               | BeamNG.AI - 1.5 |
The dataset contains 9 experiments with the above configuration. For generating your own data with SDC-Scissor follow the instructions in its repository.
Dataset Statistics
Here is an overview of the TRAVEL dataset: generated tests, executed tests, and faults found by all the test generators, grouped by experiment configuration. Some 25,845 test cases were generated by running 4 test generators 8 times in 2 configurations using the SBST CPS Tool Competition code pipeline (SBST in the table). We ran the test generators for 5 hours, allowing the ego-car a generous speed limit (120 Km/h) and defining a high OOB tolerance (i.e., 0.95), and we also ran the test generators using a smaller generation budget (i.e., 2 hours) and speed limit (i.e., 70 Km/h) while setting the OOB tolerance to a lower value (i.e., 0.5). We also collected some 5,971 additional tests with SDC-Scissor (SDC-Scissor in the table) by running it 9 times for 16 hours using Frenetic as a test generator and defining a more realistic OOB tolerance (i.e., 0.50).
Generating new Data
Generating new data, i.e., test cases, can be done using the SBST CPS Tool Competition pipeline and the driving simulator BeamNG.tech.
Extensive instructions on how to install both pieces of software are reported in the SBST CPS Tool Competition pipeline documentation.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
80% Test 3rd Iteration is a dataset for object detection tasks - it contains Surgical Devices annotations for 2,804 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Extended dataset for the validation of the competent Computational Thinking test in grades 3-6
=======================================================
• If you publish material based on this dataset, please cite the following:
• The Zenodo repository : Laila El-Hamamsy, Barbara Bruno, Jessica Dehler Zufferey, & Francesco Mondada (2023). Extended dataset for the validation of the competent Computational Thinking test in grades 3-6 [Data set]. Zenodo. https://doi.org/10.5281/zenodo.7983525
• The article on the validation of the computational thinking test for grades 3-6: El-Hamamsy, L., Zapata-Cáceres, M., Martín-Barroso, E., Mondada, F., Zufferey, J. D., Bruno, B., & Román-González, M. (2025). The competent Computational Thinking test (cCTt): A valid, reliable and gender-fair test for longitudinal CT studies in grades 3–6. Technology, Knowledge and Learning, 1-55. https://doi.org/10.1007/s10758-024-09777-8
• License : This work is licensed under a Creative Commons Attribution 4.0 International license (CC-BY-4.0)
• Creators : El-Hamamsy, L., Bruno, B., Dehler Zufferey, J., and Mondada, F.
• Date: May 30th, 2023
• Subject : Computational Thinking (CT), Assessment, Primary education, Psychometric validation
• Dataset format : CSV. The dataset contains four files (one per grade, see detailed description below). Please note that the spreadsheets may contain missing values due to students not being present for a part of the data collection. To have access to the specific cCTt questions please refer to the original publication [1] and Zenodo repository [2] which provide the full set of questions and correct responses.
• Dataset size < 500 kB
• Data collection period : January and November 2021
• Abbreviations :
- CT : Computational Thinking
- cCTt: competent CT test
• Funding: This work was funded by the NCCR Robotics, a National Centre of Competence in Research, funded by the Swiss National Science Foundation (grant number 51NF40_185543)
# References
[1] El-Hamamsy, L., Zapata-Cáceres, M., Barroso, E. M., Mondada, F., Zufferey, J. D., & Bruno, B. (2022). The Competent Computational Thinking Test: Development and Validation of an Unplugged Computational Thinking Test for Upper Primary School. Journal of Educational Computing Research, 60(7), 1818–1866. https://doi.org/10.1177/07356331221081753
[2] El-Hamamsy, L., Zapata-Cáceres, M., Marcelino, P., Dehler Zufferey, J., Bruno, B., Martín Barroso, E., & Román-González, M. (2022). Dataset for the comparison of two Computational Thinking (CT) test for upper primary school (grades 3-4): the Beginners' CT test (BCTt) and the competent CT test (cCTt) (Version 1) [Data set]. Zenodo. https://doi.org/10.5281/zenodo.5885034
[3] El-Hamamsy, L., Zapata-Cáceres, M., Martín-Barroso, E. et al. The Competent Computational Thinking Test (cCTt): A Valid, Reliable and Gender-Fair Test for Longitudinal CT Studies in Grades 3–6. Tech Know Learn (2025). https://doi.org/10.1007/s10758-024-09777-8
[4] Brennan, K. and Resnick, M. (2012). New frameworks for studying and assessing the development of computational thinking. page 25
[5] El-Hamamsy, L., Zapata-Cáceres, M., Marcelino, P., Bruno, B., Dehler Zufferey, J., Martín-Barroso, E., & Román-González, M. (2022). Comparing the psychometric properties of two primary school Computational Thinking (CT) assessments for grades 3 and 4: The Beginners' CT test (BCTt) and the competent CT test (cCTt). Frontiers in Psychology, 13. https://www.frontiersin.org/articles/10.3389/fpsyg.2022.1082659
Atipico1/nq-test-valid-adversary-replace-processed dataset hosted on Hugging Face and contributed by the HF Datasets community
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains data for temporal validity change prediction, an NLP task that will be defined in an upcoming publication. The dataset consists of five columns.
The duration labels (context_only_tv, combined_tv) are class indices of the following class distribution:
[no time-sensitive information, less than one minute, 1-5 minutes, 5-15 minutes, 15-45 minutes, 45 minutes - 2 hours, 2-6 hours, more than 6 hours, 1-3 days, 3-7 days, 1-4 weeks, more than one month]
Different dataset splits are provided.
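A minimal sketch of mapping a duration class index back to its human-readable label (the label list is copied from above; the variable and function names are ours, for illustration only):

```python
# Duration classes, copied from the list above; list index = class index
# stored in the context_only_tv / combined_tv columns.
DURATION_CLASSES = [
    "no time-sensitive information",
    "less than one minute",
    "1-5 minutes",
    "5-15 minutes",
    "15-45 minutes",
    "45 minutes - 2 hours",
    "2-6 hours",
    "more than 6 hours",
    "1-3 days",
    "3-7 days",
    "1-4 weeks",
    "more than one month",
]

def duration_label(class_index: int) -> str:
    """Map a duration class index to its label."""
    return DURATION_CLASSES[class_index]

print(duration_label(2))  # 1-5 minutes
```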
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The results of the reliability and validity calculations for the research model.
https://www.datainsightsmarket.com/privacy-policy
The global Specimen Validity Test (SVT) Service market size was valued at approximately USD 1187.2 million in 2025 and is projected to grow at a CAGR of around 4.8% during the forecast period, 2025 to 2033. The market is driven by the increasing demand for accurate and reliable drug testing, the growing prevalence of substance abuse, and the stringent government regulations mandating SVTs for employment and legal purposes. Furthermore, the increasing adoption of advanced technologies, such as liquid chromatography-mass spectrometry (LC-MS) and gas chromatography-mass spectrometry (GC-MS), for SVTs is further fueling market growth. North America is expected to dominate the global SVT service market throughout the forecast period due to the high prevalence of substance abuse, the strong presence of key market players, and the stringent government regulations. However, emerging economies in Asia Pacific, such as China and India, are expected to witness significant growth in the coming years due to the increasing adoption of SVTs in the workplace and the growing awareness of the importance of accurate drug testing. The Asia-Pacific is projected to register a CAGR of around 6.2%.
Split version of the garbage classification dataset (link below). train, test and valid folders have been generated as specified by the one-indexed files of the original dataset
Original dataset here: https://www.kaggle.com/asdasdasasdas/garbage-classification
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Recently, online testing has become an increasingly important instrument in developmental research, in particular since the COVID-19 pandemic made in-lab testing impossible. However, online testing comes with two substantial challenges. First, it is unclear how valid the results of online studies really are. Second, implementing online studies can be costly and/or require profound coding skills. This article addresses the validity of an online testing approach that is low-cost and easy to implement: the experimenter shares test materials such as videos or presentations via video chat and interactively moderates the test session. To validate this approach, we compared children's performance on a well-established task, the change-of-location false belief task, in an in-lab and an online test setting. In two studies, 3- and 4-year-olds received online implementations of the false belief version (Study 1) and the false and true belief versions of the task (Study 2). Children's performance in these online studies was compared to data from matching tasks collected in the context of in-lab studies. Results revealed that the typical developmental pattern of performance in these tasks found in in-lab studies could be replicated with the novel online test procedure. These results suggest that the proposed method, which is both low-cost and easy to implement, provides a valid alternative to classical in-person test settings.
Data set containing results from constant mean stress, constant Lode angle true triaxial compression tests performed on Castlegate Sandstone. From the tests performed, the bedding plane and the strain type inside the band of sedimentary rocks can be related to stress histories. The goal of these tests is to understand the conditions that lead to localized deformation in porous sandstone, which has geotechnical applications such as oil and natural gas production, carbon dioxide sequestration, and hazardous waste storage.