89 datasets found

APD Cadets in Training Interactive Dataset Guide
catalog.data.gov
Updated Nov 25, 2025
Cite
data.austintexas.gov (2025). APD Cadets in Training Interactive Dataset Guide [Dataset]. https://catalog.data.gov/dataset/apd-cadets-in-training-interactive-dataset-guide
Explore at:
Dataset updated
Nov 25, 2025
Dataset provided by
data.austintexas.gov
Description
Guide for APD Cadets in Training Dataset
Dataset of books called The complete guide to rat training
workwithdata.com
Updated Apr 17, 2025
Cite
Work With Data (2025). Dataset of books called The complete guide to rat training [Dataset]. https://www.workwithdata.com/datasets/books?f=1&fcol0=book&fop0=%3D&fval0=The+complete+guide+to+rat+training
Explore at:
Dataset updated
Apr 17, 2025
Dataset authored and provided by
Work With Data
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset is about books. It has 1 row and is filtered where the book is The complete guide to rat training. It features 7 columns including author, publication date, language, and book publisher.
APD Cadets in Training Interactive Dashboard Guide
catalog.data.gov
Updated Nov 25, 2025
Cite
data.austintexas.gov (2025). APD Cadets in Training Interactive Dashboard Guide [Dataset]. https://catalog.data.gov/dataset/apd-cadets-in-training-interactive-dashboard-guide
Explore at:
Dataset updated
Nov 25, 2025
Dataset provided by
data.austintexas.gov
Description
Guide for APD Cadets in Training Dashboard
Data from: Design guide and usability questionnaire to develop and assess...
tandf.figshare.com
search.datacite.org
txt
Updated Jun 1, 2023
Cite
María Luisa Rodríguez-Almendros; María José Rodríguez-Fórtiz; Miguel J. Hornos; José Samos-Jiménez; Carlos Rodríguez-Domínguez; Sandra Rute-Pérez (2023). Design guide and usability questionnaire to develop and assess VIRTRAEL, a web-based cognitive training tool for the elderly [Dataset]. http://doi.org/10.6084/m9.figshare.12110658.v1
Explore at:
Available download formats: txt
Unique identifier
https://doi.org/10.6084/m9.figshare.12110658.v1
Dataset updated
Jun 1, 2023
Dataset provided by
Taylor & Francis: https://taylorandfrancis.com/
Authors
María Luisa Rodríguez-Almendros; María José Rodríguez-Fórtiz; Miguel J. Hornos; José Samos-Jiménez; Carlos Rodríguez-Domínguez; Sandra Rute-Pérez
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
In most developed countries, the population is gradually ageing. Due to this, there is an increasing demand for technologies whose design is specifically oriented towards meeting the needs of the elderly. In this paper, we describe a web-based cognitive training tool for elderly people, called VIRTRAEL, which comprises 18 exercises presented in 13 working sessions. In order to reach a high degree of user acceptance, we have applied a user-centred development methodology and a guide defining a set of design principles and usability guidelines specifically intended for older people. Moreover, a usability questionnaire to assess VIRTRAEL has been especially designed to be completed by this type of users. Both guide and questionnaire can be easily applied in other software developments, and especially in those related to the specific domain of cognitive training for this user group. As a means to objectively measure the usability of VIRTRAEL, an EFA (Exploratory Factor Analysis) has been conducted on a 32-item questionnaire with 149 subjects. The results confirm that our proposal is usable and highlight some differences between user groups (female versus male users, and those who live alone versus those living with other people) that should be taken into consideration in future developments.
Learning about climate change - A guide for teachers
pacific-data.sprep.org
pacificdata.org
pdf
Updated Feb 8, 2025
Cite
Environment and Conservation Division-MELAD (2025). Learning about climate change - A guide for teachers [Dataset]. https://pacific-data.sprep.org/dataset/learning-about-climate-change-guide-teachers
Explore at:
Available download formats: pdf (2030930)
Dataset updated
Feb 8, 2025
Dataset provided by
Environment and Conservation Division-MELAD
License
Public Domain Mark 1.0: https://creativecommons.org/publicdomain/mark/1.0/
License information was derived automatically
Area covered
Kiribati; POLYGON ((-189.51416015625 -3.7363624313238, -189.51416015625 3.9629003196695, -181.25244140625 3.9629003196695, -181.25244140625 -3.7363624313238))
Description
The focus of this resource is on the effects of changes in air and sea surface temperature, rainfall, sea-level rise and extreme weather events on island environments, economies and people. It is vital to enhance individual and community skills to adapt to these changes – in other words, to reduce risks and maximize potential benefits.
Dataset of books called The skills of human relations training : a guide for...
workwithdata.com
Updated Apr 17, 2025
Cite
Work With Data (2025). Dataset of books called The skills of human relations training : a guide for managers and practitioners [Dataset]. https://www.workwithdata.com/datasets/books?f=1&fcol0=book&fop0=%3D&fval0=The+skills+of+human+relations+training+%3A+a+guide+for+managers+and+practitioners
Explore at:
Dataset updated
Apr 17, 2025
Dataset authored and provided by
Work With Data
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset is about books. It has 1 row and is filtered where the book is The skills of human relations training : a guide for managers and practitioners. It features 7 columns including author, publication date, language, and book publisher.
Lunar Reconnaissance Orbiter Imagery for LROCNet Moon Classifier
zenodo.org
bin, zip
Updated Nov 1, 2022
Cite
Emily Dunkel (2022). Lunar Reconnaissance Orbiter Imagery for LROCNet Moon Classifier [Dataset]. http://doi.org/10.5281/zenodo.7041842
Explore at:
Available download formats: zip, bin
Unique identifier
https://doi.org/10.5281/zenodo.7041842
Dataset updated
Nov 1, 2022
Dataset provided by
Zenodo: http://zenodo.org/
Authors
Emily Dunkel
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Summary

We provide imagery used to train LROCNet -- our Convolutional Neural Network classifier of orbital imagery of the moon. Images are divided into train, validation, and test zip files, which contain class specific sub-folders. We have three classes: "fresh crater", "old crater", and "none". Classes are described in detail in the attached labeling guide.

Directory Contents

We include the labeling guide and training, testing, and validation data. Training data was split to avoid upload timeouts.

LROC_Labeling_Intro_for_release.ppt: Labeling guide

val: Validation images divided into class sub-folders
    ejecta: "fresh crater" class
    oldcrater: "old crater" class
    none: "none" class

test: Testing images divided into class sub-folders
    ejecta: "fresh crater" class
    oldcrater: "old crater" class
    none: "none" class

ejecta_train: Training images of "fresh crater" class

oldcrater_train: Training images of "old crater" class

none_train1-4: Training images of "none" class (divided into 4 just for uploading)
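
As an illustration of the folder naming above, the following is a minimal sketch that maps a class folder name to its class label. The helper name `class_for_folder` and the reading of "none_train1-4" as four folders named `none_train1` through `none_train4` are assumptions, not something the dataset description specifies.

```python
import re

# Folder-name prefixes and the class each one stands for, as listed above.
FOLDER_TO_CLASS = {
    "ejecta": "fresh crater",
    "oldcrater": "old crater",
    "none": "none",
}


def class_for_folder(folder_name: str) -> str:
    """Return the class label for a folder such as 'ejecta' or 'none_train3'."""
    # Strip an optional '_train' suffix followed by an optional part number
    # (assumed naming for the split training folders).
    prefix = re.sub(r"_train\d*$", "", folder_name)
    return FOLDER_TO_CLASS[prefix]


print(class_for_folder("none_train3"))  # none
```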

Data Description

We use CDR (Calibrated Data Record) browse imagery (50% resolution) from the Lunar Reconnaissance Orbiter's Narrow Angle Cameras (NACs). Data we get from the NACs are 5-km swaths, at nominal orbit, so we perform a saliency detection step to find surface features of interest. A detector developed for Mars HiRISE (Wagstaff et al.) worked well for our purposes, after updating based on LROC NAC image resolution. We use this detector to create a set of image chipouts (small 227x277 cutouts) from the larger image, sampling the lunar globe.

Class Labeling

We select classes of interest based on what is visible at the NAC resolution, consulting with scientists and performing a literature review. Initially, we have 7 classes: "fresh crater", "old crater", "overlapping craters", "irregular mare patches", "rockfalls and landfalls", "of scientific interest", and "none".

Using the Zooniverse platform, we set up a labeling tool and labeled 5,000 images. We found that "fresh crater" makes up 11% of the data and "old crater" 18%, with the vast majority "none". Due to limited examples of the other classes, we reduce our initial class set to: "fresh crater" (with impact ejecta), "old crater", and "none".

We divide the images into train/validation/test sets making sure no image swaths span multiple sets.
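
The description does not say how the swath-level split was implemented. One possible approach, sketched here under that caveat, is to assign each swath (rather than each chipout) to a set, so every chipout inherits the set of its parent swath. The function names, the hashing scheme, and the 10%/10% validation and test fractions are all hypothetical.

```python
import hashlib


def split_for_swath(swath_id: str, val_frac: float = 0.1, test_frac: float = 0.1) -> str:
    """Deterministically assign a whole swath to 'train', 'val', or 'test'."""
    # Hash the swath ID to a stable number in [0, 1).
    digest = hashlib.sha256(swath_id.encode("utf-8")).hexdigest()
    u = int(digest[:8], 16) / 16**8
    if u < test_frac:
        return "test"
    if u < test_frac + val_frac:
        return "val"
    return "train"


def split_chipouts(chipouts):
    """chipouts: iterable of (chipout_name, swath_id) pairs."""
    splits = {"train": [], "val": [], "test": []}
    for name, swath_id in chipouts:
        # Every chipout follows its parent swath, so no swath spans two sets.
        splits[split_for_swath(swath_id)].append(name)
    return splits
```

Because the assignment depends only on the swath ID, rerunning the split gives the same result and new chipouts from an existing swath stay in that swath's set.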

Data Augmentation

Using PyTorch, we apply the following augmentation on the training set only: horizontal flip, vertical flip, rotation by 90/180/270 degrees, and brightness adjustment (0.5, 2). In addition, we use weighted sampling so that each class is weighted equally. The training set included here does not include augmentation since that was performed within PyTorch.
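
The equal class weighting mentioned above can be sketched in plain Python. This is an illustration of the idea only, not the author's code: the function name is hypothetical, and the 11/18/71 label counts assume that "none" covers the remainder after the stated 11% and 18%. Weights like these could be handed to a sampler such as PyTorch's `WeightedRandomSampler`, although the description does not say which mechanism was used.

```python
from collections import Counter


def balanced_sample_weights(labels):
    """Per-sample weights such that every class has the same total weight."""
    counts = Counter(labels)
    n_classes = len(counts)
    # Each class gets total weight 1 / n_classes, shared equally by its samples.
    return [1.0 / (n_classes * counts[label]) for label in labels]


# Assumed class proportions per 100 images (see the note above).
labels = ["fresh crater"] * 11 + ["old crater"] * 18 + ["none"] * 71
weights = balanced_sample_weights(labels)
```

With these weights, a rare "fresh crater" image is drawn far more often than an individual "none" image, while each class as a whole is drawn equally often.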

Acknowledgements

The author would like to thank the volunteers who provided annotations for this data set, as well as others who contributed to this work (as in the Contributor list). We would also like to thank the PDS Imaging Node for support of this work.

The research was carried out at the Jet Propulsion Laboratory, California Institute of Technology, under a contract with the National Aeronautics and Space Administration (80NM0018D0004).

CL#22-4763

© 2022 California Institute of Technology. Government sponsorship acknowledged.
Dkc Grup Training Guide As Export Import Data | Eximpedia
eximpedia.app
Updated Oct 1, 2025
Cite
(2025). Dkc Grup Training Guide As Export Import Data | Eximpedia [Dataset]. https://www.eximpedia.app/companies/dkc-grup-training-guide-as/75190158
Explore at:
Dataset updated
Oct 1, 2025
Description
Dkc Grup Training Guide As Export Import Data. Follow the Eximpedia platform for HS code, importer-exporter records, and customs shipment details.
Where can I find help? DLI Basics
search.dataone.org
Updated Dec 28, 2023
Cite
Carolyn DeLorey (2023). Where can I find help? DLI Basics [Dataset]. http://doi.org/10.5683/SP3/PSZDOV
Explore at:
Unique identifier
https://doi.org/10.5683/SP3/PSZDOV
Dataset updated
Dec 28, 2023
Dataset provided by
Borealis
Authors
Carolyn DeLorey
Description
Overview of where to go to find help with the DLI, including the DLI Survival Guide, Training Repository, DLI list, and DIGRS (Data Interest Group for Reference Services).
Data from: Basic sea safety for Pacific Island mariners - A training...
pacificdata.org
pdf
Updated Jul 30, 2012
Cite
SPC Fisheries, Aquaculture and Marine Ecosystems division (FAME) (2012). Basic sea safety for Pacific Island mariners - A training strategy and curriculum - Trainer's guide [Dataset]. https://pacificdata.org/data/dataset/activity/oai-www-spc-int-7824cd26-26b7-4a95-8216-0f08cbf71d0a
Explore at:
Available download formats: pdf
Dataset updated
Jul 30, 2012
Dataset provided by
SPC Fisheries, Aquaculture and Marine Ecosystems division (FAME)
Description
Carnie, G. 2000. Basic sea safety for Pacific Island mariners - A training strategy and curriculum - Trainer's guide. Noumea, New Caledonia: SPC, Secretariat of the Pacific Community. 39 p.
Titanic Solution for Beginner's Guide
kaggle.com
zip
Updated Mar 12, 2018
Cite
Harun-Ur-Rashid (2018). Titanic Solution for Beginner's Guide [Dataset]. https://www.kaggle.com/harunshimanto/titanic-solution-for-beginners-guide
Explore at:
Available download formats: zip (34881 bytes)
Dataset updated
Mar 12, 2018
Authors
Harun-Ur-Rashid
Description
Overview

The data has been split into two groups:

training set (train.csv)
test set (test.csv)

The training set should be used to build your machine learning models. For the training set, we provide the outcome (also known as the “ground truth”) for each passenger. Your model will be based on “features” like passengers’ gender and class. You can also use feature engineering to create new features.

The test set should be used to see how well your model performs on unseen data. For the test set, we do not provide the ground truth for each passenger. It is your job to predict these outcomes. For each passenger in the test set, use the model you trained to predict whether or not they survived the sinking of the Titanic.

We also include gender_submission.csv, a set of predictions that assume all and only female passengers survive, as an example of what a submission file should look like.
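
The rule behind gender_submission.csv (all and only female passengers survive) can be sketched as follows. The column headers `PassengerId`, `Sex`, and `Survived` are assumed here and do not appear in the description above, and the two sample rows are made up to stand in for test.csv.

```python
import csv
import io


def gender_baseline(test_rows):
    """Predict Survived = 1 for female passengers and 0 for everyone else."""
    return [
        {"PassengerId": row["PassengerId"], "Survived": 1 if row["Sex"] == "female" else 0}
        for row in test_rows
    ]


def to_submission_csv(predictions) -> str:
    """Serialize predictions to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["PassengerId", "Survived"], lineterminator="\n")
    writer.writeheader()
    writer.writerows(predictions)
    return buffer.getvalue()


# Two made-up rows standing in for test.csv.
sample = [{"PassengerId": "1", "Sex": "male"}, {"PassengerId": "2", "Sex": "female"}]
print(to_submission_csv(gender_baseline(sample)))
```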

Data Dictionary

Variable    Definition                                    Key
survival    Survival                                      0 = No, 1 = Yes
pclass      Ticket class                                  1 = 1st, 2 = 2nd, 3 = 3rd
sex         Sex
Age         Age in years
sibsp       # of siblings / spouses aboard the Titanic
parch       # of parents / children aboard the Titanic
ticket      Ticket number
fare        Passenger fare
cabin       Cabin number
embarked    Port of Embarkation                           C = Cherbourg, Q = Queenstown, S = Southampton

Variable Notes

pclass: A proxy for socio-economic status (SES): 1st = Upper, 2nd = Middle, 3rd = Lower.

age: Age is fractional if less than 1. If the age is estimated, it is in the form xx.5.

sibsp: The dataset defines family relations in this way: Sibling = brother, sister, stepbrother, stepsister; Spouse = husband, wife (mistresses and fiancés were ignored).

parch: The dataset defines family relations in this way: Parent = mother, father; Child = daughter, son, stepdaughter, stepson. Some children travelled only with a nanny, therefore parch=0 for them.
Data from: Machine learning model inputs, outputs, and scripts associated...
osti.gov
Updated Dec 31, 2024
Cite
Bruen, Michael; Fluet-Chouinard, Etienne; Forbes, Brieanne; Garayburu-Caruso, Vanessa A.; Gary, Stefan; Goldman, Amy E.; Malhotra, Avni; Mehan, Sushant; Rivera Waterman, Bre; Rubin, Tod; Scheibe, Timothy D.; Stegen, James C.; Ward, Nicholas (2024). Machine learning model inputs, outputs, and scripts associated with “Artificial intelligence-guided iterations between observations and modeling significantly improve environmental predictions” [Dataset]. https://www.osti.gov/dataexplorer/biblio/2998468
Explore at:
Dataset updated
Dec 31, 2024
Dataset provided by
United States Department of Energy: http://energy.gov/
Office of Science: http://www.er.doe.gov/
Department of Energy Biological and Environmental Research Program
River Corridor Hydro-biogeochemistry from Molecular to Multi-Basin Scales SFA
Authors
Bruen, Michael; Fluet-Chouinard, Etienne; Forbes, Brieanne; Garayburu-Caruso, Vanessa A.; Gary, Stefan; Goldman, Amy E.; Malhotra, Avni; Mehan, Sushant; Rivera Waterman, Bre; Rubin, Tod; Scheibe, Timothy D.; Stegen, James C.; Ward, Nicholas
Description
NOTE: The manuscript associated with this data package is currently in review. The data may be revised based on reviewer feedback. Upon manuscript acceptance, this data package will be updated with the final dataset and additional metadata. This data package is associated with the manuscript “Artificial intelligence-guided iterations between observations and modeling significantly improve environmental predictions” (Malhotra et al., in prep). This effort was designed following ICON (integrated, coordinated, open, and networked) principles to facilitate a model-experiment (ModEx) iteration approach, leveraging crowdsourced sampling across the contiguous United States (CONUS). New machine learning models were created every month to guide sampling locations. Data from the resulting samples were used to test and rebuild the machine learning models for the next round of sampling guidance. Associated sediment and water geochemistry and in situ sensor data can be found at https://data.ess-dive.lbl.gov/datasets/doi:10.15485/1923689, https://data.ess-dive.lbl.gov/datasets/doi:10.15485/1729719, and https://data.ess-dive.lbl.gov/datasets/doi:10.15485/1603775. This data package is associated with two GitHub repositories found at https://github.com/parallelworks/dynamic-learning-rivers and https://github.com/WHONDRS-Hub/ICON-ModEx_Open_Manuscript. In addition to this readme, this data package also includes two file-level metadata (FLMD) files that describe each file and two data dictionaries (DD) that describe all column/row headers and variable definitions. This data package consists of two main folders, (1) dynamic-learning-rivers and (2) ICON-ModEx_Open_Manuscript, which contain snapshots of the associated GitHub repositories. The input data, output data, and machine learning models used to guide sampling locations are within dynamic-learning-rivers.
The folder is organized into five top-level directories: (1) “input_data” holds the training data for the ML models; (2) “ml_models” holds machine learning (ML) models trained on the data in “input_data”; (3) “examples” contains files for direct experimentation with the machine learning model, including scripts for setting up a “hindcast” run; (4) “scripts” contains data preprocessing and postprocessing scripts and intermediate results specific to this data set that bookend the ML workflow; and (5) “output_data” holds the overall results of the ML model on that branch. Each trained ML model resides on its own branch in the repository; this means that inputs and outputs can be different branch-to-branch. There is also one hidden directory “.github/workflows”. This hidden directory contains information for how to run the ML workflow as an end-to-end automated GitHub Action but it is not needed for reusing the ML models archived here. Please see the top-level README.md in the GitHub repository for more details on the automation. The scripts and data used to create figures in the manuscript are within ICON-ModEx_Open_Manuscript. The folder is organized into four folders which contain the scripts, data, and pdf for each figure. Within the “fig-model-score-evolution” folder, there is a folder called “intermediate_branch_data” which contains some intermediate files pulled from dynamic-learning-rivers and reorganized to easily integrate into the workflows. NOTE: THIS FOLDER INCLUDES THE FILES AT THE POINT OF PAPER SUBMISSION. IT WILL BE UPDATED ONCE THE PAPER IS ACCEPTED WITH ANY REVISIONS AND WILL INCLUDE A DD/FLMD AT THAT POINT.
HIPAA Training Courses Report
archivemarketresearch.com
doc, pdf, ppt
Updated Mar 14, 2025
Cite
Archive Market Research (2025). HIPAA Training Courses Report [Dataset]. https://www.archivemarketresearch.com/reports/hipaa-training-courses-57877
Explore at:
Available download formats: ppt, doc, pdf
Dataset updated
Mar 14, 2025
Dataset authored and provided by
Archive Market Research
License
https://www.archivemarketresearch.com/privacy-policy
Time period covered
2025 - 2033
Area covered
Global
Variables measured
Market Size
Description
The HIPAA training market is booming, projected to reach $875 million by 2033 with a 12% CAGR. Learn about market trends, key players (Accountable, OSHAcademy, Medscape, etc.), and regional insights in this comprehensive analysis. Ensure your organization's compliance with this essential guide to HIPAA training courses.
AI SEO Tools Training Programs: Building Team Expertise and Adoption
caseysseo.com
txt
Updated Aug 22, 2025
Cite
Casey Miller (2025). AI SEO Tools Training Programs: Building Team Expertise and Adoption [Dataset]. https://caseysseo.com/ai-seo-tools-training-programs-building-team-expertise-and-adoption
Explore at:
Available download formats: txt
Dataset updated
Aug 22, 2025
Dataset provided by
Casey's SEO
Authors
Casey Miller
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
2025
Variables measured
Military Population, Cost of AI SEO Tools, Team Adoption Concerns, Organic Traffic Increase, Time Savings for Local Rank Tracking, Colorado Springs Mobile Search Growth
Measurement technique
Team performance tracking, Industry benchmarking, User feedback and surveys, Client case studies
Description
This dataset covers a comprehensive guide on how to effectively train teams to adopt and use AI SEO tools, including starting with hands-on experience, addressing the trust gap, creating role-specific training, solving real problems, and building internal champions. It also covers common mistakes to avoid and best practices for measuring success.
mushroom
openml.org
Updated Apr 6, 2014
Cite
Jeff Schlimmer (2014). mushroom [Dataset]. https://www.openml.org/d/24
Explore at:
Croissant: Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Apr 6, 2014
Authors
Jeff Schlimmer
Description
Author: Jeff Schlimmer
Source: UCI - 1981
Please cite: The Audubon Society Field Guide to North American Mushrooms (1981). G. H. Lincoff (Pres.), New York: Alfred A. Knopf

Description

This dataset describes mushrooms in terms of their physical characteristics. They are classified into: poisonous or edible.

Source

(a) Origin: Mushroom records are drawn from The Audubon Society Field Guide to North American Mushrooms (1981). G. H. Lincoff (Pres.), New York: Alfred A. Knopf
(b) Donor: Jeff Schlimmer (Jeffrey.Schlimmer '@' a.gp.cs.cmu.edu)

Dataset description

This dataset includes descriptions of hypothetical samples corresponding to 23 species of gilled mushrooms in the Agaricus and Lepiota Family. Each species is identified as definitely edible, definitely poisonous, or of unknown edibility and not recommended. This latter class was combined with the poisonous one. The Guide clearly states that there is no simple rule for determining the edibility of a mushroom; no rule like "leaflets three, let it be" for Poisonous Oak and Ivy.

Attributes Information

1. cap-shape: bell=b, conical=c, convex=x, flat=f, knobbed=k, sunken=s
2. cap-surface: fibrous=f, grooves=g, scaly=y, smooth=s
3. cap-color: brown=n, buff=b, cinnamon=c, gray=g, green=r, pink=p, purple=u, red=e, white=w, yellow=y
4. bruises?: bruises=t, no=f
5. odor: almond=a, anise=l, creosote=c, fishy=y, foul=f, musty=m, none=n, pungent=p, spicy=s
6. gill-attachment: attached=a, descending=d, free=f, notched=n
7. gill-spacing: close=c, crowded=w, distant=d
8. gill-size: broad=b, narrow=n
9. gill-color: black=k, brown=n, buff=b, chocolate=h, gray=g, green=r, orange=o, pink=p, purple=u, red=e, white=w, yellow=y
10. stalk-shape: enlarging=e, tapering=t
11. stalk-root: bulbous=b, club=c, cup=u, equal=e, rhizomorphs=z, rooted=r, missing=?
12. stalk-surface-above-ring: fibrous=f, scaly=y, silky=k, smooth=s
13. stalk-surface-below-ring: fibrous=f, scaly=y, silky=k, smooth=s
14. stalk-color-above-ring: brown=n, buff=b, cinnamon=c, gray=g, orange=o, pink=p, red=e, white=w, yellow=y
15. stalk-color-below-ring: brown=n, buff=b, cinnamon=c, gray=g, orange=o, pink=p, red=e, white=w, yellow=y
16. veil-type: partial=p, universal=u
17. veil-color: brown=n, orange=o, white=w, yellow=y
18. ring-number: none=n, one=o, two=t
19. ring-type: cobwebby=c, evanescent=e, flaring=f, large=l, none=n, pendant=p, sheathing=s, zone=z
20. spore-print-color: black=k, brown=n, buff=b, chocolate=h, green=r, orange=o, purple=u, white=w, yellow=y
21. population: abundant=a, clustered=c, numerous=n, scattered=s, several=v, solitary=y
22. habitat: grasses=g, leaves=l, meadows=m, paths=p, urban=u, waste=w, woods=d
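
To show how the single-letter codes above might be turned back into readable values, here is a small sketch covering three of the 22 attributes. The `CODES` table and the `decode` helper are illustrative names, and representing a record as a dictionary keyed by attribute name is an assumption about how the data has been loaded.

```python
# Code tables for three of the 22 attributes, copied from the list above.
CODES = {
    "cap-shape": {"b": "bell", "c": "conical", "x": "convex",
                  "f": "flat", "k": "knobbed", "s": "sunken"},
    "odor": {"a": "almond", "l": "anise", "c": "creosote", "y": "fishy", "f": "foul",
             "m": "musty", "n": "none", "p": "pungent", "s": "spicy"},
    "habitat": {"g": "grasses", "l": "leaves", "m": "meadows", "p": "paths",
                "u": "urban", "w": "waste", "d": "woods"},
}


def decode(record: dict) -> dict:
    """Replace single-letter codes with readable names; leave unknown codes as-is."""
    return {attr: CODES.get(attr, {}).get(code, code) for attr, code in record.items()}


print(decode({"cap-shape": "x", "odor": "n", "habitat": "d"}))
```

Note that the same letter means different things in different attributes (for example, "c" is conical for cap-shape but creosote for odor), which is why the lookup is keyed by attribute first.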

Relevant papers

Schlimmer,J.S. (1987). Concept Acquisition Through Representational Adjustment (Technical Report 87-19). Doctoral dissertation, Department of Information and Computer Science, University of California, Irvine.

Iba,W., Wogulis,J., & Langley,P. (1988). Trading off Simplicity and Coverage in Incremental Concept Learning. In Proceedings of the 5th International Conference on Machine Learning, 73-79. Ann Arbor, Michigan: Morgan Kaufmann.

Duch W, Adamczak R, Grabczewski K (1996) Extraction of logical rules from training data using backpropagation networks, in: Proc. of the 1st Online Workshop on Soft Computing, 19-30.Aug.1996, pp. 25-30.

Duch W, Adamczak R, Grabczewski K, Ishikawa M, Ueda H, Extraction of crisp logical rules using constrained backpropagation networks - comparison of two new approaches, in: Proc. of the European Symposium on Artificial Neural Networks (ESANN'97), Bruges, Belgium, 16-18.4.1997.
Code and Source Data for "Knowledge-Guided Machine Learning can improve C...
zenodo.org
zip
Updated Oct 19, 2024
Cite
LICHENG LIU; Zhenong Jin (2024). Code and Source Data for "Knowledge-Guided Machine Learning can improve C cycle quantification in agroecosystems" [Dataset]. http://doi.org/10.5281/zenodo.10155516
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.5281/zenodo.10155516
Dataset updated
Oct 19, 2024
Dataset provided by
Zenodo: http://zenodo.org/
Authors
LICHENG LIU; Zhenong Jin
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Datasets for code and Source Data for the study "Knowledge-Guided Machine Learning can improve C cycle quantification in agroecosystems" https://doi.org/10.1038/s41467-023-43860-5. All files belong to Licheng Liu and Zhenong Jin at University of Minnesota. deposit_code_v2.zip contains packaged codes and sample runs for KGML-ag-Carbon training, validation and implementations. Source Data.zip contains data for generating the figures inside the study.

Note: We used PyTorch 1.6.0 (https://pytorch.org/get-started/previous-versions/, last access: 21 Oct 2023) and Python 3.7.11 (https://www.python.org/downloads/release/python-3711/, last access: 21 Oct 2023) as the programming environment for model development. Statistical analysis, such as linear regression, was conducted using Statsmodels 0.14.0 (https://github.com/statsmodels/statsmodels/, last access: 21 Oct 2023). In order to use a GPU to speed up the training process, we installed the CUDA Toolkit 10.1.243 (https://developer.nvidia.com/cuda-toolkit, last access: 21 Oct 2023).

To use the full kgml_lib functionality, please create a new environment with the same Python and library versions listed above.
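
Before running the packaged code, it may help to check an environment against the versions named in the note. The sketch below is not part of the deposit: the function names are hypothetical, and it assumes the libraries are importable as `torch` and `statsmodels` and expose a `__version__` attribute.

```python
import importlib
import sys

# Versions named in the note above; module names are assumed import names.
PINNED = {"torch": "1.6.0", "statsmodels": "0.14.0"}
PINNED_PYTHON = (3, 7, 11)


def installed_version(module_name):
    """Return the module's version string, or None if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return str(getattr(module, "__version__", "unknown"))


def environment_mismatches(lookup=installed_version, python_version=None):
    """List human-readable differences between this environment and the pins."""
    current = tuple(python_version or sys.version_info[:3])
    problems = []
    if current != PINNED_PYTHON:
        problems.append(f"python: have {'.'.join(map(str, current))}, want 3.7.11")
    for name, wanted in PINNED.items():
        have = lookup(name)
        # Compare only the release part, ignoring local tags such as '+cu101'.
        if have is None or have.split("+")[0] != wanted:
            problems.append(f"{name}: have {have}, want {wanted}")
    return problems
```

Calling `environment_mismatches()` with no arguments inspects the running interpreter; an empty list means the Python, PyTorch, and Statsmodels versions match the note. The CUDA Toolkit version is not checked here.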
Dataset of books called Training for the complete rower : a guide to...
workwithdata.com
Updated Apr 17, 2025
Cite
Work With Data (2025). Dataset of books called Training for the complete rower : a guide to improving performance [Dataset]. https://www.workwithdata.com/datasets/books?f=1&fcol0=book&fop0=%3D&fval0=Training+for+the+complete+rower+:+a+guide+to+improving+performance
Explore at:
Dataset updated
Apr 17, 2025
Dataset authored and provided by
Work With Data
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset is about books. It has 1 row and is filtered where the book is Training for the complete rower : a guide to improving performance. It features 7 columns including author, publication date, language, and book publisher.
Data from: Please Use Our Data!
search.dataone.org
Updated Dec 28, 2023
Cite
Jane Fry; Data Liberation Initiative (DLI) (2023). Please Use Our Data! [Dataset]. http://doi.org/10.5683/SP3/YQPXW5
Explore at:
Unique identifier
https://doi.org/10.5683/SP3/YQPXW5
Dataset updated
Dec 28, 2023
Dataset provided by
Borealis
Authors
Jane Fry; Data Liberation Initiative (DLI)
Description
At the outset of the Data Liberation Initiative (DLI), there were only 9 Data Centres in Canada with experienced staff. A Training Module was developed in 1997 but it is now outdated. Today, there are over 70 Data Centres in Canada. The staff who manage them have varying job descriptions but the new generation, as well as those who have been there longer, need to be able to find information about DLI quickly for their clients, as well as for themselves. The DLI wants to inform its communities of the content of their holdings and then help them to access the data. And thus, the Compleat DLI Survival Kit was born. This presentation will give the background of the Compleat DLI Survival Kit and look at each of the chapters in some detail. And we will show how this tool will prove beneficial to all Canadian Data Centres and to other users of Statistics Canada data.
Alpaca
kaggle.com
zip
Updated Nov 24, 2023
Cite
The Devastator (2023). Alpaca [Dataset]. https://www.kaggle.com/datasets/thedevastator/alpaca-instructions-word-level-classification
Explore at:
Available download formats: zip (26297842 bytes)
Dataset updated
Nov 24, 2023
Authors
The Devastator
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
Alpaca

Alpaca - Training LLMs to follow instructions

By Huggingface Hub

About this dataset

This dataset, TokenBender: 122k Alpaca-Style Instructions Word-Level Classification Towards Accurate Natural Language Understanding, provides a collection of 122K Alpaca-style instructions with their associated input, text, and output for word-level classification. It supports natural language understanding research with entries from diverse areas, such as programming code instructions and gaming instructions, written at varying levels of complexity. Developers applying natural language processing techniques may gain insight into how to improve accuracy and machine comprehension of human language commands. The dataset can also be used to develop models, such as neural networks or decision trees, that understand commands in foreign languages and bridge the gap between machines and humans for different practical purposes.

How to use the dataset

This dataset contains 122k Alpaca-Style Instructions with their corresponding input, text, and output for word-level classification. It is a valuable resource to those who wish to gain insight into natural language understanding through data science approaches. This guide will provide some tips on how to use this dataset in order to maximize accuracy and gain a better understanding of natural language.

Preprocessing: Cleaning the data is an essential step with any text data, including the Alpaca instructions dataset. This involves removing stopwords such as articles and pronouns; normalizing words, for example by lowercasing or lemmatization; filtering for terms relevant to the context or the problem you are trying to solve; and finally tokenizing the remaining text into individual pieces that can serve as input features for different models. SentencePiece is well suited to this task.
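
A minimal sketch of the preprocessing steps described above, using only the standard library. The stopword list is a tiny illustrative sample and the function name is hypothetical; lemmatization and subword tokenization (for example with SentencePiece) are left out.

```python
import re

# A tiny illustrative stopword list; a real project would use a fuller one.
STOPWORDS = {"a", "an", "the", "it", "they", "this", "that", "of", "to", "in"}


def preprocess(text: str) -> list:
    """Lowercase, split into word tokens, and drop stopwords."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [tok for tok in tokens if tok not in STOPWORDS]


print(preprocess("Write a function that reverses the string."))
# ['write', 'function', 'reverses', 'string']
```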

Feature extraction: After preprocessing, extract features with techniques such as Bag-of-Words (BOW) or Term Frequency-Inverse Document Frequency (TF-IDF) vectorization, which can help you understand the context behind each instruction sentence or word in the corpus. Embedding techniques such as word2vec or GloVe can also capture semantic information from the instructions and help build classifiers that predict word-level categories (sequence labeling).
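To make the TF-IDF weighting concrete, here is a small from-scratch sketch. It assumes documents are already tokenized, and it uses one common smoothed variant, idf = ln((1 + N) / (1 + df)) + 1, with term frequency normalized by document length; library implementations may differ in these details. The three example documents are made up for illustration.

```python
import math
from collections import Counter

def tfidf(docs):
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter()
    for doc in docs:
        df.update(set(doc))
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc) or 1  # guard against empty documents
        vectors.append({t: (c / total) * idf[t] for t, c in counts.items()})
    return vectors

# Made-up tokenized instructions, for illustration only.
docs = [
    ["sort", "list", "python"],
    ["reverse", "list", "python"],
    ["write", "poem"],
]
vectors = tfidf(docs)
```

Terms that appear in fewer documents (here `sort`) receive a higher weight than terms shared across documents (here `list`).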

Model selection: Depending on your problem setup, architectures such as Support Vector Machines (SVMs), Conditional Random Fields (CRFs), or attention-based models should work well for sentence-level tasks and for shallow word-level tasks such as part-of-speech tagging. If word order and co-occurrence matter most, a recurrent model such as an LSTM or GRU may be a better choice: its recurrent structure carries context from one word to the next, which BOW and TF-IDF vectors discard.

Evaluating results: After choosing a model, track performance with measures such as the F1 score on a held-out test split. If precision or recall falls significantly below the threshold you have set, adjust the model or the features and compare again on the same train/test split.
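The precision, recall, and F1 measures mentioned above can be computed per label and then macro-averaged. The following is a minimal sketch over flat lists of gold and predicted word labels; the label names and example values are made up for illustration.

```python
from collections import Counter

def per_label_prf(gold, pred):
    """Compute (precision, recall, F1) for each label."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted label p was wrong
            fn[g] += 1  # gold label g was missed
    scores = {}
    for label in set(gold) | set(pred):
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[label] = (precision, recall, f1)
    return scores

def macro_f1(scores):
    """Unweighted mean of the per-label F1 values."""
    return sum(f1 for _, _, f1 in scores.values()) / len(scores)

# Made-up held-out labels and predictions.
gold = ["noun", "verb", "noun", "other", "verb", "noun"]
pred = ["noun", "noun", "noun", "other", "verb", "other"]
scores = per_label_prf(gold, pred)
```

Macro averaging weights every label equally, which makes a drop on a rare label visible even when overall accuracy barely moves.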

Research Ideas

Developing an AI-based algorithm capable of accurately understanding the meaning of natural language instructions.

Using this dataset for training and testing machine learning models to classify specific words and phrases within natural language instructions.

Training a deep learning model to generate visual components based on the given input, text, and output values from this dataset.

Acknowledgements

If you use this dataset in your research, please credit the original authors. Data Source

License

**License: [CC0 1.0 Univer...
a
Data SPR-774 Measuring and Improving the Effectiveness of ADOT’s Employee...
adotrc-agic.hub.arcgis.com
Updated Dec 15, 2023
Cite
AZGeo ArcGIS Online (AGO) (2023). Data SPR-774 Measuring and Improving the Effectiveness of ADOT’s Employee Learning and Development Program [Dataset]. https://adotrc-agic.hub.arcgis.com/documents/b046d872d10f493296ad0809fc7dae93
Explore at:
Dataset updated
Dec 15, 2023
Dataset authored and provided by
AZGeo ArcGIS Online (AGO)
Description
Data: "Focus Group Guide (Employees)", "Focus Group Guide (Leaders)", and Data Summary.

Task 1.3: Employee Learning and Development Training Programs and Current Practices

Task 1.4: Literature Review

Task 2.1: Measures, Data Collection Tools, and Protocols

Task 2.2: Test and Validate Measures – Pilot

Task 3.1: Data Coding, Cleaning, and Validation Procedures

Task 3.2: Data Analysis

Final Report: Measuring and Improving the Effectiveness of ADOT’s Employee Learning and Development Program

Compendium of Research: Measuring and Improving the Effectiveness of ADOT’s Employee Learning and Development Program
