100+ datasets found

Pytorch-data
kaggle.com
Updated Dec 29, 2024
Cite
Luong Hoang Minh (2024). Pytorch-data [Dataset]. https://www.kaggle.com/datasets/minhlnghong/pytorch-data/code
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Dec 29, 2024
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Luong Hoang Minh
Description
Dataset

This dataset was created by Luong Hoang Minh

Data from: PyTorch model for Slovenian Named Entity Recognition SloNER 1.0
live.european-language-grid.eu
Updated Jan 26, 2023
Cite
(2023). PyTorch model for Slovenian Named Entity Recognition SloNER 1.0 [Dataset]. https://live.european-language-grid.eu/catalogue/tool-service/20980
Explore at:
Dataset updated
Jan 26, 2023
License
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Description
SloNER is a model for Slovenian Named Entity Recognition. It is a PyTorch neural network model, intended for use with the HuggingFace transformers library (https://github.com/huggingface/transformers).

The model is based on the Slovenian RoBERTa contextual embeddings model SloBERTa 2.0 (http://hdl.handle.net/11356/1397). The model was trained on the SUK 1.0 training corpus (http://hdl.handle.net/11356/1747). The source code of the model is available in the GitHub repository https://github.com/clarinsi/SloNER.
PyTorch Geometric processed database of water cluster minima
figshare.com
zip
Updated Nov 2, 2022
Cite
Hatem Helal (2022). PyTorch Geometric processed database of water cluster minima [Dataset]. http://doi.org/10.6084/m9.figshare.21456702.v1
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.6084/m9.figshare.21456702.v1
Dataset updated
Nov 2, 2022
Dataset provided by
Figshare (http://figshare.com/)
Authors
Hatem Helal
License
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
Pre-processed dataset for working with the HydroNet dataset in PyTorch Geometric.

See:

https://sites.uw.edu/wdbase/database-of-water-clusters/
Data from: pytorch-lightning
kaggle.com
Updated Mar 31, 2021
Cite
Sunghyun Jun (2021). pytorch-lightning [Dataset]. https://www.kaggle.com/datasets/sunghyunjun/pytorchlightning/data
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Mar 31, 2021
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Sunghyun Jun
Description
Dataset

This dataset was created by Sunghyun Jun

pytorch-reasoning
huggingface.co
+ more versions
Cite
RELAI, pytorch-reasoning [Dataset]. https://huggingface.co/datasets/relai-ai/pytorch-reasoning
Explore at:
Dataset authored and provided by
RELAI
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Samples in this benchmark were generated by RELAI using the following data source(s):
Data Source Name: pytorch
Data Source Link: https://pytorch.org/docs/stable/index.html
Data Source License: https://github.com/pytorch/pytorch/blob/main/LICENSE
Data Source Authors: PyTorch
AI Benchmarks by Data Agents. 2025 RELAI.AI. Licensed under CC BY 4.0. Source: https://relai.ai
PyTorch geometric datasets for morphVQ models
datadryad.org
dataone.org
+2more
zip
Updated Sep 29, 2022
Cite
Oshane Thomas; Hongyu Shen; Ryan L. Rauum; William E. H. Harcourt-Smith; John D. Polk; Mark Hasegawa-Johnson (2022). PyTorch geometric datasets for morphVQ models [Dataset]. http://doi.org/10.5061/dryad.bvq83bkcr
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.5061/dryad.bvq83bkcr
Dataset updated
Sep 29, 2022
Dataset provided by
Dryad
Authors
Oshane Thomas; Hongyu Shen; Ryan L. Rauum; William E. H. Harcourt-Smith; John D. Polk; Mark Hasegawa-Johnson
Time period covered
Sep 2, 2022
Description
These datasets are customized Torch Geometric Datasets that contain raw .off polygon meshes as well as preprocessed .pt files needed for training morphVQ models. morphVQ can be found at https://github.com/oothomas/morphVQ.
Data from: PyTorch model for Slovenian Coreference Resolution
live.european-language-grid.eu
Updated Feb 16, 2023
+ more versions
Cite
(2023). PyTorch model for Slovenian Coreference Resolution [Dataset]. https://live.european-language-grid.eu/catalogue/tool-service/20990
Explore at:
Dataset updated
Feb 16, 2023
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Slovenian model for coreference resolution: a neural network based on a customized transformer architecture, usable with the code published on https://github.com/matejklemen/slovene-coreference-resolution. The model is based on the Slovenian CroSloEngual BERT 1.1 model (http://hdl.handle.net/11356/1330). It was trained on the SUK 1.0 training corpus (http://hdl.handle.net/11356/1747), specifically the SentiCoref subcorpus.

Using the evaluation setting where entity mentions are assumed to be correctly pre-detected, the model achieves the following metric values:
MUC: precision = 0.931, recall = 0.957, F1 = 0.943
BCubed: precision = 0.887, recall = 0.947, F1 = 0.914
CEAFe: precision = 0.945, recall = 0.893, F1 = 0.916
CoNLL-12: precision = 0.921, recall = 0.932, F1 = 0.924
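
The CoNLL-12 values appear to be the unweighted mean of the MUC, BCubed, and CEAFe values, which is the usual definition of the CoNLL score. The following is a small arithmetic check in Python that uses only the numbers reported above; the dictionary layout and the function name are illustrative assumptions, not part of the model's code.

```python
# Scores reported in the description above, as (precision, recall, F1).
reported = {
    "MUC": (0.931, 0.957, 0.943),
    "BCubed": (0.887, 0.947, 0.914),
    "CEAFe": (0.945, 0.893, 0.916),
}


def conll_average(scores):
    """Return the unweighted mean of each column, rounded to 3 decimals."""
    n = len(scores)
    columns = zip(*scores.values())  # regroup into precision, recall, F1 columns
    return tuple(round(sum(column) / n, 3) for column in columns)


conll_precision, conll_recall, conll_f1 = conll_average(reported)
print(conll_precision, conll_recall, conll_f1)
```

The three averages can be compared directly against the CoNLL-12 line in the description.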
Data from: Federated Learning Demonstrator MNIST Example (Version 1.0.1)
explore.openaire.eu
Updated Oct 18, 2024
Cite
Florian Heinrich; Benedikt Franke (2024). Federated Learning Demonstrator MNIST Example (Version 1.0.1) [Dataset]. https://explore.openaire.eu/search/other?orpId=od_1640::02069c46417b50d8cd5088c9b8fbf7d6
Explore at:
Dataset updated
Oct 18, 2024
Authors
Florian Heinrich; Benedikt Franke
Description
Federated Learning Demonstrator MNIST Example (Version 1.0.1)
Data from: Torchtree: flexible phylogenetic model development and inference...
search.dataone.org
datadryad.org
Updated Jun 21, 2025
Cite
Mathieu Fourment; Matthew Macaulay; Christiaan Swanepoel; Xiang Ji; Marc Suchard; Frederick Matsen IV (2025). Torchtree: flexible phylogenetic model development and inference using PyTorch [Dataset]. http://doi.org/10.5061/dryad.zw3r228gv
Explore at:
Unique identifier
https://doi.org/10.5061/dryad.zw3r228gv
Dataset updated
Jun 21, 2025
Dataset provided by
Dryad Digital Repository
Authors
Mathieu Fourment; Matthew Macaulay; Christiaan Swanepoel; Xiang Ji; Marc Suchard; Frederick Matsen IV
Description
Bayesian inference has predominantly relied on the Markov chain Monte Carlo (MCMC) algorithm for many years. However, MCMC is computationally laborious, especially for complex phylogenetic models of time trees. This bottleneck has led to the search for alternatives, such as variational Bayes, which can scale better to large datasets. In this paper, we introduce torchtree, a framework written in Python that allows developers to easily implement rich phylogenetic models and algorithms using a fixed tree topology. One can either use automatic differentiation, or leverage torchtree's plug-in system to compute gradients analytically for model components for which automatic differentiation is slow. We demonstrate that the torchtree variational inference framework performs similarly to BEAST in terms of speed and approximation accuracy. Furthermore, we explore the use of the forward KL divergence as an optimizing criterion for variational inference, which can handle discontinuous and non-diffe...

torchtree: flexible phylogenetic model development and inference using PyTorch

Mathieu Fourment, Matthew Macaulay, Christiaan J Swanepoel, Xiang Ji, Marc A Suchard, Frederick A Matsen IV. torchtree: flexible phylogenetic model development and inference using PyTorch. arXiv:2406.18044 (2024)

Description of the data

The SI.pdf file contains supplementary methods and figures referenced in the main manuscript (found on Zenodo under Supplemental Information).

The data.zip contains input files and phylogenetic trees used for analyses in the associated manuscript. The data are organized by dataset (HCV and SC2) and by tool (beast and torchtree), and include sequence alignments (see next section for SC2 alignment) and configuration files (xml and json files). torchtree uses variational Bayes while BEAST uses MCMC.

data/
├── HCV/
│   ├── HCV.fasta # Sequence alignment for HCV
│   ├── HCV.tree # Newick ...
DUNEdn supporting data
zenodo.org
data.niaid.nih.gov
application/gzip
Updated Jun 1, 2022
Cite
Marco Rossi (2022). DUNEdn supporting data [Dataset]. http://doi.org/10.5281/zenodo.6599305
Explore at:
Available download formats: application/gzip
Unique identifier
https://doi.org/10.5281/zenodo.6599305
Dataset updated
Jun 1, 2022
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Marco Rossi
Description
A dataset containing a sample event inspired by ProtoDUNE-SP simulation.
Checkpoints of trained DUNEdn package models used for Springer original article.
Data from: Federated Learning Client Base Image (Version 1.0.1)
explore.openaire.eu
Updated Oct 18, 2024
Cite
Florian Heinrich; Benedikt Franke (2024). Federated Learning Client Base Image (Version 1.0.1) [Dataset]. https://explore.openaire.eu/search/other?orpId=od_1640::5ef86e2516df78126d40d5faeca2e907
Explore at:
Dataset updated
Oct 18, 2024
Authors
Florian Heinrich; Benedikt Franke
Description
Federated Learning Client Base Image (Version 1.0.1)

Sentence/Table Pair Data from Wikipedia for Pre-training with...

zenodo.org
data.niaid.nih.gov

application/gzip

Updated Oct 29, 2021

Cite

Xiang Deng; Yu Su; Alyssa Lees; You Wu; Cong Yu; Huan Sun (2021). Sentence/Table Pair Data from Wikipedia for Pre-training with Distant-Supervision [Dataset]. http://doi.org/10.5281/zenodo.5612316

Explore at:

Available download formats: application/gzip

Unique identifier

https://doi.org/10.5281/zenodo.5612316

Dataset updated

Oct 29, 2021

Dataset provided by

Zenodo (http://zenodo.org/)

Authors

Xiang Deng; Yu Su; Alyssa Lees; You Wu; Cong Yu; Huan Sun

License

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Description

This is the dataset used for pre-training in "ReasonBERT: Pre-trained to Reason with Distant Supervision", EMNLP'21.

There are two files:

sentence_pairs_for_pretrain_no_tokenization.tar.gz -> Contains only sentences as evidence, Text-only

table_pairs_for_pretrain_no_tokenization.tar.gz -> At least one piece of evidence is a table, Hybrid

The data is chunked into multiple tar files for easy loading. We use WebDataset, a PyTorch Dataset (IterableDataset) implementation providing efficient sequential/streaming data access.

For pre-training code, or if you have any questions, please check our GitHub repo https://github.com/sunlab-osu/ReasonBERT

Below is a sample code snippet to load the data

import webdataset as wds

# Path to the uncompressed files: a directory containing a set of tar shards.
# Brace notation uses two dots ({first..last}) to expand into the shard list.
url = './sentence_multi_pairs_for_pretrain_no_tokenization/{000000..000763}.tar'
dataset = (
  wds.WebDataset(url)  # named wds.Dataset in early WebDataset releases
  .shuffle(1000)  # buffer 1000 samples and shuffle within the buffer
  .decode()
  .to_tuple("json")
  .batched(20)  # group every 20 examples into a batch
)

# See the WebDataset documentation for details on using it as a data loader for PyTorch.
# You can also iterate through all examples and dump them in your preferred data format.

Below we show how the data is organized with two examples.

Text-only

{'s1_text': 'Sils is a municipality in the comarca of Selva, in Catalonia, Spain.', # query sentence
 's1_all_links': {
  'Sils,_Girona': [[0, 4]],
  'municipality': [[10, 22]],
  'Comarques_of_Catalonia': [[30, 37]],
  'Selva': [[41, 46]],
  'Catalonia': [[51, 60]]
 }, # list of entities and their mentions in the sentence (start, end location)
 'pairs': [ # other sentences that share common entity pair with the query, group by shared entity pairs
  {
    'pair': ['Comarques_of_Catalonia', 'Selva'], # the common entity pair
    's1_pair_locs': [[[30, 37]], [[41, 46]]], # mention of the entity pair in the query
    's2s': [ # list of other sentences that contain the common entity pair, or evidence
     {
       'md5': '2777e32bddd6ec414f0bc7a0b7fea331',
       'text': 'Selva is a coastal comarque (county) in Catalonia, Spain, located between the mountain range known as the Serralada Transversal or Puigsacalm and the Costa Brava (part of the Mediterranean coast). Unusually, it is divided between the provinces of Girona and Barcelona, with Fogars de la Selva being part of Barcelona province and all other municipalities falling inside Girona province. Also unusually, its capital, Santa Coloma de Farners, is no longer among its larger municipalities, with the coastal towns of Blanes and Lloret de Mar having far surpassed it in size.',
       's_loc': [0, 27], # in addition to the sentence containing the common entity pair, we also keep its surrounding context. 's_loc' is the start/end location of the actual evidence sentence
       'pair_locs': [ # mentions of the entity pair in the evidence
        [[19, 27]], # mentions of entity 1
        [[0, 5], [288, 293]] # mentions of entity 2
       ],
       'all_links': {
        'Selva': [[0, 5], [288, 293]],
        'Comarques_of_Catalonia': [[19, 27]],
        'Catalonia': [[40, 49]]
       }
      }
    ,...] # there are multiple evidence sentences
   },
 ,...] # there are multiple entity pairs in the query
}
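
The mention locations in the text-only example can be read as character offsets into the sentence. The following is a minimal Python sketch, assuming each span is a half-open [start, end) range, which is consistent with the example sentence. The `sample` dictionary is an abridged copy of the example, and `mention_texts` is a hypothetical helper, not part of the ReasonBERT code.

```python
# Abridged copy of the text-only example above (only the fields used here).
sample = {
    "s1_text": "Sils is a municipality in the comarca of Selva, in Catalonia, Spain.",
    "s1_all_links": {
        "Sils,_Girona": [[0, 4]],
        "municipality": [[10, 22]],
        "Comarques_of_Catalonia": [[30, 37]],
        "Selva": [[41, 46]],
        "Catalonia": [[51, 60]],
    },
}


def mention_texts(text, links):
    """Map each linked entity to the surface strings of its mentions.

    Assumes every span is [start, end) in character offsets.
    """
    return {
        entity: [text[start:end] for start, end in spans]
        for entity, spans in links.items()
    }


mentions = mention_texts(sample["s1_text"], sample["s1_all_links"])
for entity, surfaces in mentions.items():
    print(entity, "->", surfaces)
```

The same slicing should apply to 's1_pair_locs' and to 'pair_locs' inside each evidence entry, with the evidence 'text' as the string being sliced.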

Hybrid

{'s1_text': 'The 2006 Major League Baseball All-Star Game was the 77th playing of the midseason exhibition baseball game between the all-stars of the American League (AL) and National League (NL), the two leagues comprising Major League Baseball.',
 's1_all_links': {...}, # same as text-only
 'sentence_pairs': [{'pair': ..., 's1_pair_locs': ..., 's2s': [...]}], # same as text-only
 'table_pairs': [{
  'tid': 'Major_League_Baseball-1',
  'text': [
    ['World Series Records', 'World Series Records', ...],
    ['Team', 'Number of Series won', ...],
    ['St. Louis Cardinals (NL)', '11', ...],
  ...], # table content, list of rows
  'index': [
    [[0, 0], [0, 1], ...],
    [[1, 0], [1, 1], ...],
  ...], # index of each cell [row_id, col_id]. We keep only a table snippet, but the index here is from the original table.
  'value_ranks': [
    [0, 0, ...],
    [0, 0, ...],
    [0, 10, ...],
  ...], # if the cell contains a numeric value/date, this is its rank ordered from small to large, following TAPAS
  'value_inv_ranks': [], # inverse rank
  'all_links': {
    'St._Louis_Cardinals': {
     '2': [
      [[2, 0], [0, 19]], # [[row_id, col_id], [start, end]]
     ] # list of mentions in the second row, the key is row_id
    },
    'CARDINAL:11': {'2': [[[2, 1], [0, 2]]], '8': [[[8, 3], [0, 2]]]},
  },
  'name': '', # table name, if exists
  'pairs': {
    'pair': ['American_League', 'National_League'],
    's1_pair_locs': [[[137, 152]], [[162, 177]]], # mention in the query
    'table_pair_locs': {
     '17': [ # mention of entity pair in row 17
       [
        [[17, 0], [3, 18]],
        [[17, 1], [3, 18]],
        [[17, 2], [3, 18]],
        [[17, 3], [3, 18]]
       ], # mention of the first entity
       [
        [[17, 0], [21, 36]],
        [[17, 1], [21, 36]],
       ] # mention of the second entity
     ]
    }
   }
 }]
}
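
Because 'index' stores the original-table coordinates of each cell kept in the snippet, a table mention of the form [[row_id, col_id], [start, end]] can be resolved by first mapping coordinates to cell text. The sketch below illustrates this under stated assumptions: the `table` dictionary is a hypothetical, heavily truncated version of the example (two columns, three rows, and without the row-8 mention, whose cell is not shown in the source), and both helper functions are illustrative names.

```python
# Hypothetical, heavily truncated table snippet in the layout shown above.
table = {
    "text": [
        ["World Series Records", "World Series Records"],
        ["Team", "Number of Series won"],
        ["St. Louis Cardinals (NL)", "11"],
    ],
    "index": [
        [[0, 0], [0, 1]],
        [[1, 0], [1, 1]],
        [[2, 0], [2, 1]],
    ],
    "all_links": {
        "St._Louis_Cardinals": {"2": [[[2, 0], [0, 19]]]},
        "CARDINAL:11": {"2": [[[2, 1], [0, 2]]]},
    },
}


def cell_lookup(table):
    """Map original-table (row_id, col_id) to the cell text kept in the snippet."""
    lookup = {}
    for text_row, index_row in zip(table["text"], table["index"]):
        for cell_text, (row_id, col_id) in zip(text_row, index_row):
            lookup[(row_id, col_id)] = cell_text
    return lookup


def table_mentions(table):
    """Resolve every mention in 'all_links' to its surface string."""
    cells = cell_lookup(table)
    resolved = {}
    for entity, rows in table["all_links"].items():
        resolved[entity] = [
            cells[(row_id, col_id)][start:end]
            for row_mentions in rows.values()
            for (row_id, col_id), (start, end) in row_mentions
        ]
    return resolved


print(table_mentions(table))
```

Going through the coordinate lookup rather than indexing 'text' directly matters once the snippet no longer starts at row 0 of the original table.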

Data from: Deep learning neural network derivation and testing to...
tandf.figshare.com
png
Updated Aug 8, 2023
Cite
Omid Mehrpour; Christopher Hoyte; Abdullah Al Masud; Ashis Biswas; Jonathan Schimmel; Samaneh Nakhaee; Mohammad Sadegh Nasr; Heather Delva-Clark; Foster Goss (2023). Deep learning neural network derivation and testing to distinguish acute poisonings [Dataset]. http://doi.org/10.6084/m9.figshare.23694504.v1
Explore at:
Available download formats: png
Unique identifier
https://doi.org/10.6084/m9.figshare.23694504.v1
Dataset updated
Aug 8, 2023
Dataset provided by
Taylor & Francis
Authors
Omid Mehrpour; Christopher Hoyte; Abdullah Al Masud; Ashis Biswas; Jonathan Schimmel; Samaneh Nakhaee; Mohammad Sadegh Nasr; Heather Delva-Clark; Foster Goss
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Acute poisoning is a significant global health burden, and the causative agent is often unclear. The primary aim of this pilot study was to develop a deep learning algorithm that predicts the most probable agent a poisoned patient was exposed to from a pre-specified list of drugs. Data were queried from the National Poison Data System (NPDS) from 2014 through 2018 for eight single-agent poisonings (acetaminophen, diphenhydramine, aspirin, calcium channel blockers, sulfonylureas, benzodiazepines, bupropion, and lithium). Two Deep Neural Networks (PyTorch and Keras) designed for multi-class classification tasks were applied. There were 201,031 single-agent poisonings included in the analysis. For distinguishing among selected poisonings, the PyTorch model had a specificity of 97%, accuracy of 83%, precision of 83%, recall of 83%, and an F1-score of 82%. Keras had a specificity of 98%, accuracy of 83%, precision of 84%, recall of 83%, and an F1-score of 83%. The best performance was achieved in diagnosing single-agent poisoning by lithium, sulfonylureas, diphenhydramine, calcium channel blockers, and then acetaminophen, in PyTorch (F1-score = 99%, 94%, 85%, 83%, and 82%, respectively) and Keras (F1-score = 99%, 94%, 86%, 82%, and 82%, respectively). Deep neural networks can potentially help in distinguishing the causative agent of acute poisoning. This study used a small list of drugs, with polysubstance ingestions excluded. Reproducible source code and results can be obtained at https://github.com/ashiskb/npds-workspace.git.
Data from: pytorch-metric-learning
kaggle.com
Updated Mar 12, 2021
Cite
Ryazantsev Gleb (2021). pytorch-metric-learning [Dataset]. https://www.kaggle.com/permoment/pytorchmetriclearning/tasks
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Mar 12, 2021
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Ryazantsev Gleb
Description
Dataset

This dataset was created by Ryazantsev Gleb

Turbulent Flow data as PyTorch tensors for ML: Kolmogorov Flow at Re=222,...
figshare.manchester.ac.uk
zip
Updated Jun 17, 2025
Cite
Mohammed Sardar; Alex Skillen (2025). Turbulent Flow data as PyTorch tensors for ML: Kolmogorov Flow at Re=222, and Kelvin-Helmholtz instability [Dataset]. http://doi.org/10.48420/29329565.v1
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.48420/29329565.v1
Dataset updated
Jun 17, 2025
Dataset provided by
University of Manchester
Authors
Mohammed Sardar; Alex Skillen
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset contains three files, listed below. The Kolmogorov flow is generated using a spectral solver, available at: https://github.com/google/jax-cfd. The Kelvin-Helmholtz instability is generated using an in-house code.

Case 1: Kolmogorov Flow
nu_0p0045_2500_8f_uv_128.pt -- a PyTorch tensor containing 2500 eight-frame videos of a 2D Re=222 forced turbulent flow (Kolmogorov flow), with only velocity vectors provided. The first 2000 samples are used as training data, the next 450 are used for validation, and the final 50 are used to test the model after training.

Case 2: Kelvin-Helmholtz Instability
Training and Validation: kh_8f_72_208_r34568.pt -- a PyTorch tensor containing 1000 eight-frame videos of a Kelvin-Helmholtz instability flow from 5 realisations of the flow (i.e. initialised from different random seeds). Each block of two hundred videos comes from one simulation; the last two hundred may be used as a validation set.
Testing: kh_8f_72_208_r9.pt -- a PyTorch tensor containing 200 eight-frame videos of a Kelvin-Helmholtz instability flow from a realisation of the flow different from the above. This is used as the test set for a model trained on kh_8f_72_208_r34568.pt.
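
The splits described above can be sketched with plain slicing. This is a minimal illustration, assuming the samples are ordered along the first axis; the function names are hypothetical, and lists of integer ids stand in for the real video tensors. The same slices should work on the first dimension of a tensor obtained from `torch.load`, if the sample axis comes first.

```python
def split_kolmogorov(samples):
    """Split the 2500 Kolmogorov-flow samples into train/validation/test."""
    return samples[:2000], samples[2000:2450], samples[2450:2500]


def split_kelvin_helmholtz(samples, per_realisation=200):
    """Hold out the last realisation (the last 200 videos) as validation."""
    return samples[:-per_realisation], samples[-per_realisation:]


# Stand-in data: integer ids instead of real video tensors.
kolmogorov = list(range(2500))
train, val, test = split_kolmogorov(kolmogorov)
print(len(train), len(val), len(test))

kelvin_helmholtz = list(range(1000))
kh_train, kh_val = split_kelvin_helmholtz(kelvin_helmholtz)
print(len(kh_train), len(kh_val))
```

Holding out a whole realisation for the Kelvin-Helmholtz case keeps the validation videos from sharing a random seed with the training videos.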
Data and scripts from "Unsupervised learning for structure detection in...
zenodo.org
explore.openaire.eu
+1more
zip
Updated Jan 31, 2023
Cite
BARBOT Armand; GATTI Riccardo (2023). Data and scripts from "Unsupervised learning for structure detection in plastically deformed crystals" [Dataset]. http://doi.org/10.5281/zenodo.7582668
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.5281/zenodo.7582668
Dataset updated
Jan 31, 2023
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
BARBOT Armand; GATTI Riccardo
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This document contains the scripts and dataset used for the paper "Unsupervised learning for structure detection in plastically deformed crystals".

More precisely, it contains 4 folders:

DumpForFigures: subfolder containing the atomic positions in .dump format (see the LAMMPS documentation) used for the article figures.

DumpForTraining: subfolder containing the atomic positions in .dump format (see the LAMMPS documentation) used for training the autoencoder.

ScriptsToDetectStructuresFromDump: subfolder containing the scripts used to detect the substructures of the system by combining autoencoder and clustering methods. This folder contains a readme with the details of the contents.

ScriptToGenerateDump: subfolder containing the scripts used to generate the atomic data with molecular dynamics. These data are then used to train the autoencoder. This folder contains a readme with the details of the contents.

REQUIREMENTS:

LAMMPS

Python 3 with packages:

-numpy

-matplotlib

-pyscal

-scikit-learn

-pytorch

-glob
Graph Network Simulator PyTorch training dataset for water drop sample
dataverse.tdl.org
bin, json
Updated Apr 1, 2022
Cite
Krishna Kumar (2022). Graph Network Simulator PyTorch training dataset for water drop sample [Dataset]. http://doi.org/10.18738/T8/HUBMDM
Explore at:
Available download formats: json(365), bin(5933885), bin(7174932), bin(7596095)
Unique identifier
https://doi.org/10.18738/T8/HUBMDM
Dataset updated
Apr 1, 2022
Dataset provided by
Texas Data Repository
Authors
Krishna Kumar
License
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
Dataset for training the PyTorch Graph Network Simulator (https://github.com/geoelements/gns). The repository contains the datasets for the water drop sample.
Data from: Efficient imaging and computer vision detection of two cell...
agdatacommons.nal.usda.gov
datasets.ai
+1more
zip
Updated Feb 21, 2024
+ more versions
Cite
Benjamin P. Graham; Jeremy Park; Grant Billings; Amanda M. Hulse-Kemp; Candace H. Haigler; Edgar Lobaton (2024). Data from: Efficient imaging and computer vision detection of two cell shapes in young cotton fibers [Dataset]. http://doi.org/10.15482/USDA.ADC/1528324
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.15482/USDA.ADC/1528324
Dataset updated
Feb 21, 2024
Dataset provided by
Ag Data Commons
Authors
Benjamin P. Graham; Jeremy Park; Grant Billings; Amanda M. Hulse-Kemp; Candace H. Haigler; Edgar Lobaton
License
U.S. Government Works: https://www.usa.gov/government-works
License information was derived automatically
Description
Methods Cotton plants were grown in a well-controlled greenhouse in the NC State Phytotron as described previously (Pierce et al, 2019). Flowers were tagged on the day of anthesis and harvested three days post anthesis (3 DPA). The distinct fiber shapes had already formed by 2 DPA (Stiff and Haigler, 2016; Graham and Haigler, 2021), and fibers were still relatively short at 3 DPA, which facilitated the visualization of multiple fiber tips in one image. Cotton fiber sample preparation, digital image collection, and image analysis: Ovules with attached fiber were fixed in the greenhouse. The fixative previously used (Histochoice) (Stiff and Haigler, 2016; Pierce et al., 2019; Graham and Haigler, 2021) is obsolete, which led to testing and validation of another low-toxicity, formalin-free fixative (#A5472; Sigma-Aldrich, St. Louis, MO; Fig. S1). The boll wall was removed without damaging the ovules. (Using a razor blade, cut away the top 3 mm of the boll. Make about 1 mm deep longitudinal incisions between the locule walls, and finally cut around the base of the boll.) All of the ovules with attached fiber were lifted out of the locules and fixed (1 h, RT, 1:10 tissue:fixative ratio) prior to optional storage at 4°C. Immediately before imaging, ovules were examined under a stereo microscope (incident light, black background, 31X) to select three vigorous ovules from each boll while avoiding drying. Ovules were rinsed (3 x 5 min) in buffer [0.05 M PIPES, 12 mM EGTA. 5 mM EDTA and 0.1% (w/v) Tween 80, pH 6.8], which had lower osmolarity than a microtubule-stabilizing buffer used previously for aldehyde-fixed fibers (Seagull, 1990; Graham and Haigler, 2021). While steadying an ovule with forceps, one to three small pieces of its chalazal end with attached fibers were dissected away using a small knife (#10055-12; Fine Science Tools, Foster City, CA). 
Each ovule piece was placed in a single well of a 24-well slide (#63430-04; Electron Microscopy Sciences, Hatfield, PA) containing a single drop of buffer prior to applying and sealing a 24 x 60 mm coverslip with vaseline. Samples were imaged with brightfield optics and default settings for the 2.83 mega-pixel, color, CCD camera of the Keyence BZ-X810 imaging system (www.keyence.com; housed in the Cellular and Molecular Imaging Facility of NC State). The location of each sample in the 24-well slides was identified visually using a 2X objective and mapped using the navigation function of the integrated Keyence software. Using the 10X objective lens (plan-apochromatic; NA 0.45) and 60% closed condenser aperture setting, a region with many fiber apices was selected for imaging using the multi-point and z-stack capture functions. The precise location was recorded by the software prior to visual setting of the limits of the z-plane range (1.2 µm step size). Typically, three 24-sample slides (representing three accessions) were set up in parallel prior to automatic image capture. The captured z-stacks for each sample were processed into one two-dimensional image using the full-focus function of the software. (Occasional samples contained too much debris for computer vision to be effective, and these were reimaged.)

Resources in this dataset:

Resource Title: Deltapine 90 - Manually Annotated Training Set. File Name: GH3 DP90 Keyence 1_45 JPEG.zip. Resource Description: These images were manually annotated in Labelbox.

Resource Title: Deltapine 90 - AI-Assisted Annotated Training Set. File Name: GH3 DP90 Keyence 46_101 JPEG.zip. Resource Description: These images were AI-labeled in Roboflow and then manually reviewed in Roboflow.

Resource Title: Deltapine 90 - Manually Annotated Training-Validation Set. File Name: GH3 DP90 Keyence 102_125 JPEG.zip. Resource Description: These images were manually labeled in Labelbox, and then used for training-validation for the machine learning model.

Resource Title: Phytogen 800 - Evaluation Test Images. File Name: Gb cv Phytogen 800.zip. Resource Description: These images were used to validate the machine learning model. They were manually annotated in ImageJ.

Resource Title: Pima 3-79 - Evaluation Test Images. File Name: Gb cv Pima 379.zip. Resource Description: These images were used to validate the machine learning model. They were manually annotated in ImageJ.

Resource Title: Pima S-7 - Evaluation Test Images. File Name: Gb cv Pima S7.zip. Resource Description: These images were used to validate the machine learning model. They were manually annotated in ImageJ.

Resource Title: Coker 312 - Evaluation Test Images. File Name: Gh cv Coker 312.zip. Resource Description: These images were used to validate the machine learning model. They were manually annotated in ImageJ.

Resource Title: Deltapine 90 - Evaluation Test Images. File Name: Gh cv Deltapine 90.zip. Resource Description: These images were used to validate the machine learning model. They were manually annotated in ImageJ.

Resource Title: Half and Half - Evaluation Test Images. File Name: Gh cv Half and Half.zip. Resource Description: These images were used to validate the machine learning model. They were manually annotated in ImageJ.

Resource Title: Fiber Tip Annotations - Manual. File Name: manual_annotations.coco_.json. Resource Description: Annotations in COCO.json format for fibers. Manually annotated in Labelbox.

Resource Title: Fiber Tip Annotations - AI-Assisted. File Name: ai_assisted_annotations.coco_.json. Resource Description: Annotations in COCO.json format for fibers. AI annotated with human review in Roboflow.

Resource Title: Model Weights (iteration 600). File Name: model_weights.zip. Resource Description: The final model, provided as a zipped PyTorch .pth file. It was chosen at training iteration 600. The model weights can be imported for use of the fiber tip type detection neural network in Python. Resource Software Recommended: Google Colab, url: https://research.google.com/colaboratory/
Database of scalable training of neural network potentials for complex...
b2find.eudat.eu
Updated Apr 2, 2025
+ more versions
Cite
(2025). Database of scalable training of neural network potentials for complex interfaces through data augmentation - Dataset - B2FIND [Dataset]. https://b2find.eudat.eu/dataset/46e840d3-d4f3-5754-b86f-30d99487fa30
Explore at:
Dataset updated
Apr 2, 2025
Description
This database contains the reference data used for direct force training of Artificial Neural Network (ANN) interatomic potentials using the atomic energy network (ænet) and ænet-PyTorch packages (https://github.com/atomisticnet/aenet-PyTorch). It also includes the GPR-augmented data used for indirect force training via Gaussian Process Regression (GPR) surrogate models using the ænet-GPR package (https://github.com/atomisticnet/aenet-gpr). Each data file contains atomic structures, energies, and atomic forces in XCrySDen Structure Format (XSF). The dataset includes all reference training/test data and corresponding GPR-augmented data used in the four benchmark examples presented in the reference paper, “Scalable Training of Neural Network Potentials for Complex Interfaces Through Data Augmentation”. A hierarchy of the dataset is described in the README.txt file, and an overview of the dataset is also summarized in supplementary Table S1 of the reference paper.
Towards physics-based deep learning in OpenFOAM: Combining OpenFOAM with the...
b2find.eudat.eu
Updated Jul 15, 2023
+ more versions
Cite
(2023). Towards physics-based deep learning in OpenFOAM: Combining OpenFOAM with the PyTorch C++ API (Source Code and Data) - Dataset - B2FIND [Dataset]. https://b2find.eudat.eu/dataset/ed5539dd-5c59-5b52-981d-4cf60b17f0ab
Explore at:
Dataset updated
Jul 15, 2023
Description
Source code and data snapshot accompanying the training "Towards physics-based deep learning in OpenFOAM: Combining OpenFOAM with the PyTorch C++ API" given at the 17th OpenFOAM Workshop.
