5 datasets found

S233
zenodo.org
tar
Updated Oct 6, 2020
Cite
Martin Zurowietz; Martin Zurowietz (2020). S233 [Dataset]. http://doi.org/10.5281/zenodo.3603815
Available download formats: tar
Unique identifier
https://doi.org/10.5281/zenodo.3603815
Dataset updated
Oct 6, 2020
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Martin Zurowietz; Martin Zurowietz
License
Attribution 3.0 (CC BY 3.0): https://creativecommons.org/licenses/by/3.0/
License information was derived automatically
Description
A fully annotated subset of the SO242/2_233-1 image dataset. The annotations are given as train and test splits that can be used to evaluate machine learning methods. The following classes of fauna were used for annotation:

anemone

coral

crustacean

ipnops fish

litter

ophiuroid

other fauna

sea cucumber

sponge

stalked crinoid

For a definition of the classes see [1].

Related datasets:

S083: https://doi.org/10.5281/zenodo.3600132

S155: https://doi.org/10.5281/zenodo.3603803

S171: https://doi.org/10.5281/zenodo.3603809

This dataset contains the following files:

annotations/test.csv: The BIIGLE CSV annotation report of the annotations of the test split of this dataset. These annotations are used to test the performance of the trained Mask R-CNN model.

annotations/train.csv: The BIIGLE CSV annotation report of the annotations of the train split of this dataset. These annotations are used to generate the annotation patches which are transformed with scale and style transfer to be used to train the Mask R-CNN model.

images/: Directory that contains all the original image files.

dataset.json: JSON file that contains information about the dataset.

- name: The name of the dataset.

- images_dir: Name of the directory that contains the original image files.

- metadata_file: Path to the CSV file that contains image metadata.

- test_annotations_file: Path to the CSV file that contains the test annotations.

- train_annotations_file: Path to the CSV file that contains the train annotations.

- annotation_patches_dir: Name of the directory that should contain the scale- and style-transferred annotation patches.

- crop_dimension: Edge length of an annotation or style patch in pixels.

metadata.csv: A CSV file that contains metadata for each original image file. In this case the distance of the camera to the sea floor is given for each image.
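
As a rough illustration of how the files listed above fit together, the following Python sketch reads dataset.json and resolves the paths it names. It assumes that the paths in dataset.json are relative to the directory that contains it and that each CSV file has a header row; neither is stated in the description. The function name load_dataset is hypothetical.

```python
import csv
import json
from pathlib import Path


def load_dataset(root):
    """Read dataset.json under `root` and load the CSV files it points to.

    Assumption: paths in dataset.json are relative to `root`.
    """
    root = Path(root)
    with open(root / "dataset.json", encoding="utf-8") as f:
        info = json.load(f)

    def read_rows(key):
        # Assumption: each CSV has a header row. The column names are not
        # documented in the description, so rows are kept as plain dicts.
        with open(root / info[key], newline="", encoding="utf-8") as f:
            return list(csv.DictReader(f))

    return {
        "name": info["name"],
        "images_dir": root / info["images_dir"],
        "crop_dimension": info["crop_dimension"],
        "metadata": read_rows("metadata_file"),
        "train": read_rows("train_annotations_file"),
        "test": read_rows("test_annotations_file"),
    }
```

Calling load_dataset on the extracted archive directory would return the train and test annotation rows alongside the per-image metadata, which is enough to inspect the splits before any model-specific processing.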
Data from: Time-Split Cross-Validation as a Method for Estimating the...
acs.figshare.com
txt
Updated Jun 2, 2023
Cite
Robert P. Sheridan (2023). Time-Split Cross-Validation as a Method for Estimating the Goodness of Prospective Prediction. [Dataset]. http://doi.org/10.1021/ci400084k.s001
Available download formats: txt
Unique identifier
https://doi.org/10.1021/ci400084k.s001
Dataset updated
Jun 2, 2023
Dataset provided by
ACS Publications
Authors
Robert P. Sheridan
License
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Description
Cross-validation is a common method to validate a QSAR model. In cross-validation, some compounds are held out as a test set, while the remaining compounds form a training set. A model is built from the training set, and the test set compounds are predicted on that model. The agreement of the predicted and observed activity values of the test set (measured by, say, R²) is an estimate of the self-consistency of the model and is sometimes taken as an indication of the predictivity of the model. This estimate of predictivity can be optimistic or pessimistic compared to true prospective prediction, depending on how compounds in the test set are selected. Here, we show that time-split selection gives an R² that is more like that of true prospective prediction than the R² from random selection (too optimistic) or from our analog of leave-class-out selection (too pessimistic). Time-split selection should be used in addition to random selection as a standard for cross-validation in QSAR model building.
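
To make the difference between the selection schemes concrete, here is a minimal Python sketch of random versus time-split selection of a test set. The record layout (a compound ID paired with a date), the 25% test fraction and the function names are assumptions made for illustration; the abstract does not specify them.

```python
import random
from datetime import date


def random_split(records, test_fraction=0.25, seed=0):
    """Hold out a random subset of the records as the test set."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]  # (train, test)


def time_split(records, test_fraction=0.25):
    """Hold out the most recent records as the test set.

    Each record is (compound_id, date). The training set contains only
    compounds dated no later than every compound in the test set.
    """
    ordered = sorted(records, key=lambda r: r[1])
    cut = len(ordered) - int(round(len(ordered) * test_fraction))
    return ordered[:cut], ordered[cut:]  # (train, test)


# Hypothetical records: one compound per month of 2010.
records = [(f"cmpd{i}", date(2010, 1 + i, 1)) for i in range(12)]
train, test = time_split(records)
```

With a time split, the model never sees compounds newer than those it is asked to predict, which mimics prospective use; a random split mixes old and new compounds in both sets.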
Training.gov.au - Web service access to sandbox environment
researchdata.edu.au
data.gov.au
Updated Sep 17, 2014
Cite
Department of Employment and Workplace Relations (2014). Training.gov.au - Web service access to sandbox environment [Dataset]. https://researchdata.edu.au/traininggovau-web-service-sandbox-environment/2996152
Dataset updated
Sep 17, 2014
Dataset provided by
data.gov.au
Authors
Department of Employment and Workplace Relations
License
Attribution 3.0 (CC BY 3.0): https://creativecommons.org/licenses/by/3.0/
License information was derived automatically
Description
Introduction

Training.gov.au (TGA) is the National Register of Vocational Education and Training in Australia and contains authoritative information about Registered Training Organisations (RTOs), Nationally Recognised Training (NRT) and the approved scope of each RTO to deliver NRT as required in national and jurisdictional legislation.

TGA web-services overview

TGA has a web service available to allow external systems to access and utilise information stored in TGA. The TGA web service is exposed through a single interface and web service users are assigned a data reader role which will apply to all data stored in the TGA.

The web service can be broadly split into three categories:

1. RTOs and other organisation types;
2. Training components including Accredited courses, Accredited course Modules, Training Packages, Qualifications, Skill Sets and Units of Competency;
3. System metadata including static data and statistical classifications.

Users will gain access to the TGA web service by first passing a user name and password through to the web server. The web server will then authenticate the user against the TGA security provider before passing the request to the application that supplies the web services.

There are two web services environments:

1. Production - ws.training.gov.au - National Register production web services
2. Sandbox - ws.sandbox.training.gov.au - National Register sandbox web services

The National Register sandbox web service is used to test against the current version of the web services where the functionality will be identical to the current production release. The web service definition and schema of the National Register sandbox database will also be identical to that of production release at any given point in time. The National Register sandbox database will be cleared down at regular intervals and realigned with the National Register production environment.

Each environment has three configured services:

1. Organisation Service;
2. Training Component Service; and
3. Classification Service.

Sandbox environment access

To access the download area for web services, navigate to http://tga.hsd.com.au and use the user name and password below:

Username: WebService.Read (case sensitive)
Password: Asdf098 (case sensitive)

This download area contains various versions of the following artefacts that you may find useful:

• Training.gov.au web service specification document;
• Training.gov.au logical data model and definitions document;
• .NET web service SDK sample app (with source code);
• Java sample client (with source code);
• How to set up a web service client in VS 2010 (video); and
• Web services WSDLs and XSDs.

For the business areas, the specification/definition documents and the sample application are a good place to start, while the IT areas will find the sample source code and the video useful to start developing against the TGA web services.

The web services Sandbox end point is: https://ws.sandbox.training.gov.au/Deewr.Tga.Webservices
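
The two environments, the three configured services and the sandbox end point described above could be collected into a small client-side configuration, sketched here in Python. The helper name endpoint is hypothetical, and the production URL is an assumption: only the sandbox end point is published in this description, so the sketch simply reuses its path on the production host.

```python
# Hosts for the two environments named in the description.
HOSTS = {
    "production": "ws.training.gov.au",
    "sandbox": "ws.sandbox.training.gov.au",
}

# The three services configured in each environment.
SERVICES = (
    "Organisation Service",
    "Training Component Service",
    "Classification Service",
)

# The only end point stated explicitly in the description.
SANDBOX_ENDPOINT = "https://ws.sandbox.training.gov.au/Deewr.Tga.Webservices"


def endpoint(environment):
    """Return the base end point for 'production' or 'sandbox'.

    Assumption: production uses the same path as the published sandbox
    end point. Check the web service specification before relying on it.
    """
    return f"https://{HOSTS[environment]}/Deewr.Tga.Webservices"
```

Keeping the environment name as the only switch makes it straightforward to develop against the sandbox and move to production once separate credentials have been issued.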

Production web service access

Once you are ready to access the production web service, please email the TGA team at tgaproject@education.gov.au to obtain a unique user name and password.
RRegrs study for Growth Yield
figshare.com
txt
Updated Jun 5, 2016
Cite
Cristian Robert Munteanu (2016). RRegrs study for Growth Yield [Dataset]. http://doi.org/10.6084/m9.figshare.3409804.v2
Available download formats: txt
Unique identifier
https://doi.org/10.6084/m9.figshare.3409804.v2
Dataset updated
Jun 5, 2016
Dataset provided by
figshare
Authors
Cristian Robert Munteanu
License
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Description
RRegrs study for Growth Yield on the original and corrected/filtered datasets: input training and test files, R scripts to split the datasets, and a plot for outlier removal.
Detailed breakdown of overfitting comparison of CARRoT output and the other...
plos.figshare.com
figshare.com
txt
Updated Oct 12, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Alina Bazarova; Marko Raseta (2023). Detailed breakdown of overfitting comparison of CARRoT output and the other models. [Dataset]. http://doi.org/10.1371/journal.pone.0292597.s002
Available download formats: txt
Unique identifier
https://doi.org/10.1371/journal.pone.0292597.s002
Dataset updated
Oct 12, 2023
Dataset provided by
PLOS (http://plos.org/)
Authors
Alina Bazarova; Marko Raseta
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Overfitting in terms of absolute/relative error, accuracy/AUROC and accuracy only (for continuous, binary and multinomial outcomes respectively), computed on both the training and test sets for different prediction methods on 43 datasets available in R, using the default 90%/10% training/validation split. The methods used are CARRoT with EPV = 10; a model based on significant predictors only; a lasso-based model; and CARRoT with EPV = 10 and an additional R² constraint. (CSV)
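
As a rough illustration of the kind of comparison described, the Python sketch below makes a 90%/10% split, fits a model on the training part, and compares the mean absolute error on the two parts. Everything here is an assumption for illustration: the model is a plain least-squares line rather than CARRoT or any of the other listed methods, the data are synthetic, and reading overfitting as the validation-minus-training error gap is one possible interpretation, not necessarily the definition used in the dataset.

```python
import random


def split_90_10(rows, seed=0):
    """Shuffle the rows and split them 90% training / 10% validation."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    cut = len(rows) * 9 // 10
    return rows[:cut], rows[cut:]


def fit_line(rows):
    """Ordinary least squares for y = a + b * x on (x, y) pairs."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in rows)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    b = sxy / sxx
    return mean_y - b * mean_x, b


def mean_abs_error(rows, a, b):
    """Mean absolute error of the fitted line on (x, y) pairs."""
    return sum(abs(y - (a + b * x)) for x, y in rows) / len(rows)


# Synthetic continuous outcome: y = 2x plus Gaussian noise.
rng = random.Random(1)
data = [(x, 2.0 * x + rng.gauss(0, 1)) for x in range(50)]

train, valid = split_90_10(data)
a, b = fit_line(train)
# Positive gap: the model does worse on held-out data than on training data.
gap = mean_abs_error(valid, a, b) - mean_abs_error(train, a, b)
```

For binary or multinomial outcomes the same pattern would apply with accuracy or AUROC in place of the absolute error.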