100+ datasets found

AZtec projects reach the data size limit
zenodo.org
data.niaid.nih.gov
zip
Updated Nov 19, 2021
Cite
Xiaohan Zeng; Alec E Davis; Jack Donoghue (2021). AZtec projects reach the data size limit [Dataset]. http://doi.org/10.5281/zenodo.5660090
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.5281/zenodo.5660090
Dataset updated
Nov 19, 2021
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Xiaohan Zeng; Alec E Davis; Jack Donoghue
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Ten Ti-6Al-4V samples were mounted on a multi-sample stage for EBSD on a Thermo Fisher Apreo SEM equipped with an Oxford Instruments' Symmetry 2 detector at the University of Manchester.

In project multi-sample_1, AZtec reported a saving error while scanning the fifth sample and stopped with 5646 frames saved (.oip file of about 4 GB). The project can still be montaged and its maps exported, but edits to the .oip file cannot be saved.

In project multi-sample_2, we restarted the scan on the remaining samples and completed it with 5601 frames. The .oip file is 3.97 GB, just under the size limit. No error was reported during scanning, and the .oip file is still editable.
China's first sub-meter building footprints derived by deep learning (part 2...
explore.openaire.eu
zenodo.org
Updated Jan 10, 2024
+ more versions
Cite
Xin Huang; Zhen Zhang; Jiayi Li (2024). China's first sub-meter building footprints derived by deep learning (part 2 of 2). [Dataset]. http://doi.org/10.5281/zenodo.10043351
Explore at:
Unique identifier
https://doi.org/10.5281/zenodo.10043351
Dataset updated
Jan 10, 2024
Authors
Xin Huang; Zhen Zhang; Jiayi Li
Area covered
China
Description
Download

Due to Zenodo's file size limitations, we are releasing different parts of CBF and GBD in different versions. See below for specific information:

1. China's first sub-meter building footprints (CBF) derived by deep learning:

part1: version12 (v12), City 1 to 210, https://doi.org/10.5281/zenodo.10473278
part2: version13 (v13), City 211 to 356, https://doi.org/10.5281/zenodo.10475803 (current version)

Building attributes:

id: Index number of the current building.
year: Year of construction retrieved from GISA.
height_mean: The average height of the building (computed from the pixels within the building footprint) obtained from CNBH (meters).
height_max: Maximum height of the building (based on the highest pixel value within the building footprint) obtained from CNBH (meters).
height_min: Minimum height of the building (based on the lowest pixel value within the building footprint) obtained from CNBH (meters).
miniDist: Shortest straight-line distance to another building.
dist_id: Index number of the building with the shortest straight-line distance to the current building.
area: Area of the current building (square meters).
perimeter: Perimeter of the current building (meters).
inurban_19: A value of 1 indicates that the building was situated in an urban area in 1990, while a value of 0 signifies that it was located in a rural area in 1990. This determination is made using GUB data.
inurban_1: A value of 1 indicates that the building was situated in an urban area in 1995, while a value of 0 signifies that it was located in a rural area in 1995. This determination is made using GUB data.
inurban_20: A value of 1 indicates that the building was situated in an urban area in 2000, while a value of 0 signifies that it was located in a rural area in 2000. This determination is made using GUB data.
inurban_2: A value of 1 indicates that the building was situated in an urban area in 2005, while a value of 0 signifies that it was located in a rural area in 2005. This determination is made using GUB data.
inurban_3: A value of 1 indicates that the building was situated in an urban area in 2010, while a value of 0 signifies that it was located in a rural area in 2010. This determination is made using GUB data.
inurban_4: A value of 1 indicates that the building was situated in an urban area in 2015, while a value of 0 signifies that it was located in a rural area in 2015. This determination is made using GUB data.
inurban_5: A value of 1 indicates that the building was situated in an urban area in 2020, while a value of 0 signifies that it was located in a rural area in 2020. This determination is made using GUB data.

2. Global Building Dataset (GBD):

This dataset comprises approximately 800,000 images (512*512) with diverse architectural styles worldwide. It can serve as training and test samples for building extraction in different regions globally. To enhance usability, we did not break the continuity of the images and published them in 1024*1024 size.

Version	description	link
v1	All labels. Images of Africa, Australia, and South America.	https://zenodo.org/records/10043352
v2	image of Asia (part 1 to 30 of 53).	https://zenodo.org/records/10456238
v3	image of Asia (part 31 to 53 of 53).	https://zenodo.org/records/10457368
v4	image of Europe (part 1 to 21 of 58).	https://zenodo.org/records/10458273
v5	image of Europe (part 21 to 42 of 58).	https://zenodo.org/records/10460868
v6	image of Europe (part 43 to 58 of 58).	https://zenodo.org/records/10462506
v7	image of North America (part 1 to 20 of 93).	https://zenodo.org/records/10463385
v8	image of North America (part 21 to 40 of 93).	https://zenodo.org/records/10465076
v9	image of North America (part 41 to 60 of 93).	https://zenodo.org/records/10466569
v10	image of North America (part 61 to 80 of 93).	https://zenodo.org/records/10467291
v11	image of North America (part 81 to 93 of 93).	https://zenodo.org/records/10471557
Synthetic dataset used in "The maximum weighted submatrix coverage problem:...
zenodo.org
text/x-python, zip
Updated Jan 24, 2020
Cite
Derval Guillaume; Branders Vincent; Dupont Pierre; Schaus Pierre (2020). Synthetic dataset used in "The maximum weighted submatrix coverage problem: A CP approach" [Dataset]. http://doi.org/10.5281/zenodo.3549866
Explore at:
Available download formats: zip, text/x-python
Unique identifier
https://doi.org/10.5281/zenodo.3549866
Dataset updated
Jan 24, 2020
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Derval Guillaume; Branders Vincent; Dupont Pierre; Schaus Pierre
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Synthetic dataset used in "The maximum weighted submatrix coverage problem: A CP approach".

Includes both the generated datasets as a zip archive and the python script used to generate them.

Each instance is composed of two files in the form

XxY_K_O_0xN_AxB_Smatrix.tsv being the matrix to use. Each row on a separate line, with tab-separated cells.

XxY_K_O_0xN_AxB_Ssolution.txt giving the implanted solution, with one submatrix per line. Each line holds two JSON arrays separated by a tab: the first is the list of rows selected in the submatrix, the second the list of columns.

With:

X and Y the size of the matrix

K the number of submatrices in the implanted solution

O the (minimum) overlap percentage of each submatrix

N the sigma used for the background noise

A and B the size of the implanted submatrices (subject to noise)
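
A minimal sketch of reading the two file types described above. It assumes that each solution line holds exactly two JSON arrays separated by a single tab and that matrix cells are numeric; the function names `parse_solution_line` and `parse_matrix_tsv` are hypothetical and are not taken from the dataset's own generator script.

```python
import json


def parse_solution_line(line):
    """Split one solution line into (rows, columns) index lists.

    Assumption: the line is '<JSON array>\t<JSON array>'.
    """
    rows_json, cols_json = line.rstrip("\n").split("\t")
    return json.loads(rows_json), json.loads(cols_json)


def parse_matrix_tsv(text):
    """Parse matrix file content: one row per line, tab-separated cells."""
    return [
        [float(cell) for cell in row.split("\t")]
        for row in text.splitlines()
        if row  # skip empty trailing lines
    ]
```

Reading a whole solution file then amounts to applying `parse_solution_line` to each non-empty line.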
MELA Dataset: A Benchmark for Mediastinal Lesion Analysis (Training Set Part...
data.niaid.nih.gov
explore.openaire.eu
Updated May 24, 2022
+ more versions
Cite
Mengmeng Zhao (2022). MELA Dataset: A Benchmark for Mediastinal Lesion Analysis (Training Set Part 2) [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_6444097
Explore at:
Dataset updated
May 24, 2022
Dataset provided by
Rui Xu
Mengmeng Zhao
Jiancheng Yang
Bo Du
Yong Luo
Yunlang She
Kaiming Kuang
Shuang Song
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The MELA dataset is a benchmark for developing algorithms for mediastinal lesion analysis. We hope this large-scale dataset can facilitate research on, and application of, automatic mediastinal lesion detection and diagnosis.

The MELA dataset contains 1100 CT scans collected from patients with one or more lesions in the mediastinum. It is split into a subset of 770 CT scans for training, a subset of 110 CT scans for validation, and a test set of 220 CT scans for evaluation.

Due to the size limit of zenodo.org, we split the MELA training set into 3 parts; this is Training Set Part 2 of the MELA dataset, including 260 CTs. Files include:

Train3.zip: 130 CTs in NII format (nii.gz).

Train4.zip: 130 CTs in NII format (nii.gz).
Reproduction package for the paper: "Detection of ultra-fast radio bursts...
explore.openaire.eu
zenodo.org
Updated Oct 19, 2023
Cite
Mark Peter Snelders (2023). Reproduction package for the paper: "Detection of ultra-fast radio bursts from FRB 20121102A" [Dataset]. http://doi.org/10.5281/zenodo.8112803
Explore at:
Unique identifier
https://doi.org/10.5281/zenodo.8112803
Dataset updated
Oct 19, 2023
Authors
Mark Peter Snelders
Description
Reproduction package for the paper "Detection of ultra-fast radio bursts from FRB 20121102A"

Authors: Mark P. Snelders, K. Nimmo, J.W.T. Hessels, Z. Bensellam, L.P. Zwaan, P. Chawla, O.S. Ould-Boukattine, F. Kirsten, J.T. Faber and V. Gajjar.
arXiv link: https://arxiv.org/abs/2307.02303
DOI published article: Nature Astronomy, 19 October 2023, https://doi.org/10.1038/s41550-023-02101-x
This work has been made possible by an NWO Vici grant (Principal investigator, J.W.T.H.).

## Raw Data
- The data are 100% publicly available and are explained in great detail in the following post: http://seti.berkeley.edu:8000/frb-data/
- The data are available from the Breakthrough Initiatives Open Data Portal with target name FRB121102: https://breakthroughinitiatives.org/opendatasearch

In this paper we have re-processed and re-analysed data from the Green Bank Telescope that made use of the Breakthrough Listen digital backend. I will call this the GBT BL data. Below you can find links to multiple papers, GitHub repositories and blogposts that explain the GBT BL data.

- My paper describing the search and analysis of the ultra-fast radio bursts:
  * https://ui.adsabs.harvard.edu/abs/2023arXiv230702303S/abstract
  * https://www.nature.com/articles/s41550-023-02101-x
- First detection of the bursts at 8 GHz: https://ui.adsabs.harvard.edu/abs/2018ApJ...863....2G/abstract
- More bursts from the same dataset with machine learning detections: https://ui.adsabs.harvard.edu/abs/2018ApJ...866..149Z/abstract
- Explaining the Breakthrough Listen project: https://ui.adsabs.harvard.edu/abs/2017AcAau.139...98W/abstract
- Explaining the GBT breakthrough listen recorder: https://ui.adsabs.harvard.edu/abs/2018PASP..130d4502M/abstract
- Explaining the data formats: https://ui.adsabs.harvard.edu/abs/2019PASP..131l4505L/abstract
- Python 2 code to work with the baseband data: https://github.com/greghell/extractor (NOTE THAT IT IS PYTHON 2!!) (I recommend using Python 2.7 if you make use of that repo)
- Structure of the baseband data: https://github.com/UCBerkeleySETI/breakthrough/blob/master/doc/RAW-File-Format.md
- More information: https://github.com/UCBerkeleySETI/breakthrough/blob/master/GBT/waterfall.md
- A version of dspsr, called bl-dspsr, that can work with the GBT BL baseband data: https://github.com/UCBerkeleySETI/bl-dspsr

## Software
- The data was processed on multiple machines with various operating systems, which include, but are not limited to, macOS, Ubuntu and centOS.
- All the used software is open source, see the section above for more information, and also see the 'software' section in the paper.

## Figures and Tables
The files in this Zenodo package should be self-explanatory. E.g. table_1.tar contains all the scripts/notebooks/files needed to make table_1, and also contains table 1 itself. The file 'general_info.tar' is basically a txt file with the same info as provided here, and it contains an offline version of the Breakthrough Listen blogpost that explains the raw data. The file helper_functions.tar is a tarball that contains a Python file with a collection of helper functions that are used in the Jupyter notebooks (and the figures are made in the Jupyter notebooks). It also contains some files that are needed to e.g. remove the instrumental delay from the data.

## End-to-End analysis scripts
The Python code/Jupyter notebooks in the tarfiles are end-to-end.

## Intermediate data products
The file data_and_data_info.tar contains two intermediate data products (both several gigabytes in size) and a txt file explaining the files and how they were made. Due to the Zenodo file size limitations I cannot upload everything. Please contact me at snelders@astron.nl or m.p.snelders@uva.nl or via ORCID to request any other files.
Data from: Predicting parameters for the Quantum Approximate...
zenodo.org
xz
Updated Oct 15, 2021
Cite
Sami Boulebnane; Ashley Montanaro (2021). Predicting parameters for the Quantum Approximate Optimization Algorithm for MAX-CUT from the infinite-size limit [Dataset]. http://doi.org/10.5281/zenodo.5569075
Explore at:
Available download formats: xz
Unique identifier
https://doi.org/10.5281/zenodo.5569075
Dataset updated
Oct 15, 2021
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Sami Boulebnane; Ashley Montanaro
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This is the raw data used to generate the figures of the paper. The code generating the figures is in a separate GitHub repository: https://github.com/sami-b95/predicting_qaoa_parameters_data
Two-bubble simulation and gravitational wave spectrum codes and data
data.niaid.nih.gov
Updated Dec 13, 2024
+ more versions
Cite
Gould, Oliver (2024). Two-bubble simulation and gravitational wave spectrum codes and data [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_5127537
Explore at:
Dataset updated
Dec 13, 2024
Dataset provided by
Gould, Oliver
Sukuvaara, Satumaaria
Weir, David
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Code and data used in the paper with title Vacuum bubble collisions: from microphysics to gravitational waves by Oliver Gould, Satumaaria Sukuvaara, and David Weir [arXiv:2107.05657].

The field simulation and gravitational wave spectrum calculation codes are based on Gravitational radiation from colliding vacuum bubbles by Arthur Kosowsky, Michael S. Turner and Richard Watkins [Inspire].

Contains files:

two_bubbles_code-v1.0.1.zip is a snapshot of a git repository, corresponding to commit v1.0.1. Contains the codes with which the majority of the data was produced.

two_bubbles_data-v1.0.1.zip is a snapshot of a git repository, corresponding to commit v1.0.1. It contains the majority of the data used in the paper. Note, however, that the simulation pickle files are examples run on a coarser lattice due to Zenodo file size restrictions. Apart from a few exceptions, the data in this file was produced by the codes in two_bubbles_code-v1.0.1.zip.

README.md files, specifying and explaining the contents and usage, are included within. The v1.0.1 code README.md and the data README.md can also be found in the repositories.

The update v1.0.1 updates the README and fixes a small error in the calculation of the gravitational wave spectrum. We thank Toby Opferkuch for pointing this out. The error in the code does not affect the results in two_bubbles_data-v1.0.0.zip or the paper as they were produced with a slightly earlier version of the code, before the appearance of this error. The version two_bubbles_data-v1.0.1 updates the README, clarifying some points.
Long-term Continuous SIF-informed Photosynthesis Proxy reconstructed with...
zenodo.org
application/gzip
Updated Jan 10, 2025
+ more versions
Cite
Jianing Fang; Xu Lian; Youngryel Ryu; Sungchan Jeong; Chongya Jiang; Pierre Gentine (2025). Long-term Continuous SIF-informed Photosynthesis Proxy reconstructed with calibrated AVHRR surface reflectance (LCSPP-AVHRR), 2001-2023 [Dataset]. http://doi.org/10.5281/zenodo.14568491
Explore at:
Available download formats: application/gzip
Unique identifier
https://doi.org/10.5281/zenodo.14568491
Dataset updated
Jan 10, 2025
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Jianing Fang; Xu Lian; Youngryel Ryu; Sungchan Jeong; Chongya Jiang; Pierre Gentine
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Usage Notes:
This is the updated LCSPP dataset (v3.2), generated using the LCREF-AVHRR record from 1982–2023. Due to Zenodo’s size constraints, LCSPP-AVHRR is divided into two separate repositories. Previously referred to as "LCSIF," the dataset was renamed to emphasize its role as a SIF-informed long-term photosynthesis proxy derived from surface reflectance and to avoid confusion with directly measured SIF signals.

Key updates in version 3.2 include:

Improved Calibration: Enhanced consistency in calibration methods, addressing technical limitations in version 3.1 including applying more stringent quality filtering and snow masks.

Quality Flags: A new quality flag layer lets users identify whether a pixel is derived from observed surface reflectance (QA=0), high-quality gap-filled values (QA=1), lower-quality gap-filled values based on the mean seasonal cycle (QA=2), or is missing entirely (QA=3). We advise users to rely only on observed and high-quality gap-filled values in their analyses.

Extension to include observations from the year of 2023.

Other LCSPP repositories can be accessed via the following links:

LCSPP-AVHRR v3.2 (1982-2000): 10.5281/zenodo.7916850

LCSPP-MODIS v3.2 (2001-2023): 10.5281/zenodo.11658088

Users can choose between LCSPP-AVHRR and LCSPP-MODIS for the overlapping period from 2001-2023. The two datasets are generally consistent during this overlapping period, although LCSPP-MODIS shows a stronger greening trend between 2001-2023. For studies exploring long-term vegetation dynamics, users can either use only LCSPP-AVHRR or use a blended dataset of LCSPP-AVHRR and LCSPP-MODIS as a sensitivity test.

In addition, the updated long-term continuous reflectance datasets (LCREF), used for the production of LCSPP, can be accessed using the following links:

LCREF-AVHRR v3.2 (1982-2023): 10.5281/zenodo.11905959

LCREF-MODIS v3.2 (2001-2023): 10.5281/zenodo.11657458

A manuscript describing the technical details is available at https://arxiv.org/abs/2311.14987; it also details the uses and limitations of the dataset. In particular, we note that LCSPP is a reconstruction of a SIF-informed photosynthesis proxy and should not be treated as SIF measurements. Although LCSPP has demonstrated skill in tracking the dynamics of GPP and PAR absorbed by canopy chlorophyll (APARchl), it is not suitable for estimating fluorescence quantum yield.

All data outputs from this study are available at 0.05° spatial resolution and biweekly temporal resolution in NetCDF format. Each month is divided into two files, with the first file “a” representative of the 1st day to the 15th day of a month, and the second file “b” representative of the 16th day to the last day of a month.
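
The half-month convention above can be sketched as follows. This is only an illustration of the "a"/"b" split; the helper names `half_month_label` and `half_month_range` are hypothetical, and the actual NetCDF file naming scheme is not specified here.

```python
import calendar
from datetime import date


def half_month_label(d):
    """Return 'a' for days 1-15 and 'b' for day 16 through month end."""
    return "a" if d.day <= 15 else "b"


def half_month_range(year, month, label):
    """Return the (first_day, last_day) covered by file 'a' or 'b'."""
    if label == "a":
        return date(year, month, 1), date(year, month, 15)
    # monthrange returns (weekday of day 1, number of days in the month)
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, 16), date(year, month, last_day)
```

Because the "b" file runs to the last day of the month, its length varies between 12 and 16 days, which is worth keeping in mind when aggregating over time.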

Abstract:

Satellite-observed solar-induced chlorophyll fluorescence (SIF) is a powerful proxy for the photosynthetic characteristics of terrestrial ecosystems. Direct SIF observations are primarily limited to the recent decade, impeding their application in detecting long-term dynamics of ecosystem function. In this study, we leverage two surface reflectance bands available both from Advanced Very High-Resolution Radiometer (AVHRR, 1982-2023) and MODerate-resolution Imaging Spectroradiometer (MODIS, 2001-2023). Importantly, we calibrate and orbit-correct the AVHRR bands against their MODIS counterparts during their overlapping period. Using the long-term bias-corrected reflectance data from AVHRR and MODIS, a neural network is trained to produce a Long-term Continuous SIF-informed Photosynthesis Proxy (LCSPP) by emulating Orbiting Carbon Observatory-2 SIF, mapping it globally over the 1982-2023 period. Compared with previous SIF-informed photosynthesis proxies, LCSPP has similar skill but can be advantageously extended to the AVHRR period. Further comparison with three widely used vegetation indices (NDVI, kNDVI, NIRv) shows a higher or comparable correlation of LCSPP with satellite SIF and site-level GPP estimates across vegetation types, ensuring a greater capacity for representing long-term photosynthetic activity.
RibFrac Dataset: A Benchmark for Rib Fracture Detection, Segmentation and...
zenodo.org
data.niaid.nih.gov
csv, zip
Updated Dec 2, 2020
+ more versions
Cite
Jiancheng Yang; Liang Jin; Bingbing Ni; Ming Li (2020). RibFrac Dataset: A Benchmark for Rib Fracture Detection, Segmentation and Classification (Training Set Part 1) [Dataset]. http://doi.org/10.5281/zenodo.3893508
Explore at:
Available download formats: zip, csv
Unique identifier
https://doi.org/10.5281/zenodo.3893508
Dataset updated
Dec 2, 2020
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Jiancheng Yang; Liang Jin; Bingbing Ni; Ming Li
License
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Description
The RibFrac dataset is a benchmark for developing algorithms for rib fracture detection, segmentation and classification. We hope this large-scale dataset can facilitate both clinical research on automatic rib fracture detection and diagnosis, and engineering research on 3D detection, segmentation and classification.

Due to the size limit of zenodo.org, we split the whole RibFrac Training Set into 2 parts; this is Training Set Part 1 of the RibFrac dataset, including 300 CTs and the corresponding annotations. Files include:

ribfrac-train-images-1.zip: 300 chest-abdomen CTs in NII format (nii.gz).

ribfrac-train-labels-1.zip: 300 annotations in NII format (nii.gz).

ribfrac-train-info-1.csv: labels in the annotation NIIs.

public_id: anonymous patient ID to match images and annotations.

label_id: discrete label value in the NII annotations.

label_code: 0, 1, 2, 3, 4, -1

0: it is background

1: it is a displaced rib fracture

2: it is a non-displaced rib fracture

3: it is a buckle rib fracture

4: it is a segmental rib fracture

-1: it is a rib fracture, but we could not define its type due to ambiguity, diagnosis difficulty, etc. Ignore it in the classification task.
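
A minimal sketch of using the info CSV for the classification task. It assumes the CSV has a header row with the three column names listed above; the helper name `classification_rows` and the short labels in `LABEL_NAMES` are illustrative, not part of the dataset.

```python
import csv
import io

# Label codes as described above; the short names are paraphrases.
LABEL_NAMES = {
    0: "background",
    1: "displaced rib fracture",
    2: "non-displaced rib fracture",
    3: "buckle rib fracture",
    4: "segmental rib fracture",
    -1: "undefined rib fracture",
}


def classification_rows(csv_text):
    """Return (public_id, label_id, label name) tuples, skipping code -1."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        code = int(row["label_code"])
        if code == -1:
            continue  # ambiguous fractures are ignored in classification
        rows.append((row["public_id"], int(row["label_id"]), LABEL_NAMES[code]))
    return rows
```

Rows with code -1 are dropped only for classification; they still mark real fractures and would remain relevant for detection or segmentation.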

If you find this work useful in your research, please acknowledge the RibFrac project teams in the paper and cite this project as:

Liang Jin, Jiancheng Yang, Kaiming Kuang, Bingbing Ni, Yiyi Gao, Yingli Sun, Pan Gao, Weiling Ma, Mingyu Tan, Hui Kang, Jiajun Chen, Ming Li. Deep-Learning-Assisted Detection and Segmentation of Rib Fractures from CT Scans: Development and Validation of FracNet. EBioMedicine (2020). (DOI)

or using bibtex

@article{ribfrac2020,
title={Deep-Learning-Assisted Detection and Segmentation of Rib Fractures from CT Scans: Development and Validation of FracNet},
author={Jin, Liang and Yang, Jiancheng and Kuang, Kaiming and Ni, Bingbing and Gao, Yiyi and Sun, Yingli and Gao, Pan and Ma, Weiling and Tan, Mingyu and Kang, Hui and Chen, Jiajun and Li, Ming},
journal={EBioMedicine},
year={2020},
publisher={Elsevier}
}

The RibFrac dataset is the result of thousands of hours of research effort by experienced radiologists, computer scientists and engineers. We kindly ask you to respect our effort by citing it appropriately and complying with the data license.

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
TEI survey 2021
explore.openaire.eu
Updated Feb 26, 2022
Cite
Philip Daniel Allfrey (2022). TEI survey 2021 [Dataset]. http://doi.org/10.5281/zenodo.6293260
Explore at:
Unique identifier
https://doi.org/10.5281/zenodo.6293260
Dataset updated
Feb 26, 2022
Authors
Philip Daniel Allfrey
Description
This dataset comprises the processing scripts, processed files, and summary statistics from a survey of two million publicly available TEI files on GitHub. The survey was carried out between December 2020 and May 2021 with the motivation of examining real-world usage of the Text Encoding Initiative guidelines. Outputs include counts of TEI and non-TEI tag usage, TEI modules, and other namespaces, languages declared, and lists of filenames/repositories. The processed files are plain text and consist of the tag names used within the corresponding TEI XML file, separated by spaces. E.g. file 1000001.txt contains: ?xml TEI teiHeader fileDesc titleStmt title publicationStmt idno idno availability p p ref p ref p ref notesStmt note sourceDesc biblFull titleStmt title author extent publicationStmt date pubPlace profileDesc creation date textClass keywords term text body div div head head p p p p p pb p p pb Files that also include the XML attributes are available on request, due to size limitations.
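
Since each processed file is a space-separated list of tag names, tag usage can be counted with a few lines of Python. This is a minimal sketch; the function name `count_tags` is hypothetical and is not one of the survey's own processing scripts.

```python
from collections import Counter


def count_tags(processed_text):
    """Count how often each tag name occurs in one processed file."""
    return Counter(processed_text.split())


# First few tag names from the 1000001.txt example quoted above.
counts = count_tags(
    "?xml TEI teiHeader fileDesc titleStmt title publicationStmt idno idno"
)
```

Counters from several files can be summed with `+` to get corpus-wide totals.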
Data and scripts (1) for Storkey et al, "Resolution dependence of...
zenodo.org
data.niaid.nih.gov
application/gzip
Updated May 13, 2024
+ more versions
Cite
David Storkey (2024). Data and scripts (1) for Storkey et al, "Resolution dependence of interlinked Southern Ocean biases in global coupled HadGEM3 models", GMD (2024) [Dataset]. http://doi.org/10.5281/zenodo.11102733
Explore at:
Available download formats: application/gzip
Unique identifier
https://doi.org/10.5281/zenodo.11102733
Dataset updated
May 13, 2024
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
David Storkey
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
Southern Ocean
Description

================================================================
Data and scripts for producing plots from Storkey et al (2024):
"Resolution dependence of interlinked Southern Ocean biases in
global coupled HadGEM3 models"
================================================================

The plots in the paper consist of 10-year mean fields from the third
decade of the spin up and timeseries of scalar quantities for the first
150 years of the spin up. The data to produce these plots are stored
in the MEANS_YEARS_21-30 and TIMESERIES_DATA directories respectively.

Note that due to the size limit on records on Zenodo, the 10-year mean
output from the N216-ORCA12 integration has been stored as a separate
record.

Scripts to produce the plots are in SCRIPT, with section definitions
in SECTIONS. Bespoke plotting scripts are included in SCRIPT. They use
python 3 including the Matplotlib, Iris and Cartopy packages. The
plotting of the timeseries data used the Marine_Val VALSO-VALTRANS
package which is available here:

https://github.com/JMMP-Group/MARINE_VAL/tree/main/VALSO-VALTRANS

Much of the processing of the model output data was performed with the
CDFTools package, which is available here:

https://github.com/meom-group/CDFTOOLS

and the NCO package:

https://web.mit.edu/course/13/13.715/nco-2.8.1/doc/

Global Building Dataset (labels of all, images of Africa, Australia, and...

zenodo.org

bin, tiff

Updated Jul 11, 2024

+ more versions

Cite

Xin Huang; Zhen Zhang; Jiayi Li (2024). Global Building Dataset (labels of all, images of Africa, Australia, and South America). [Dataset]. http://doi.org/10.5281/zenodo.10043352

Explore at:

Available download formats: bin, tiff

Unique identifier

https://doi.org/10.5281/zenodo.10043352

Dataset updated

Jul 11, 2024

Dataset provided by

Zenodo (http://zenodo.org/)

Authors

Xin Huang; Zhen Zhang; Jiayi Li

License

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Area covered

Australia, South America

Description

Download

Due to Zenodo's file size limitations, we are releasing different parts of CBF and GBD in different versions. See below for specific information:

1. China's first sub-meter building footprints (CBF) derived by deep learning:

part1: version12 (v12), City 1 to 210, https://doi.org/10.5281/zenodo.10473278
part2: version13 (v13), City 211 to 356, https://doi.org/10.5281/zenodo.10475803

2. Global Building Dataset (GBD):

This dataset comprises approximately 800,000 images (512*512) with diverse architectural styles worldwide. It can serve as training and test samples for building extraction in different regions globally. To enhance usability, we did not break the continuity of the images and published them in 1024*1024 size.

Version	description	link
v1	All labels. Images of Africa, Australia, and South America.	current version (https://zenodo.org/records/10043352)
v2	image of Asia (part 1 to 30 of 53).	https://zenodo.org/records/10456238
v3	image of Asia (part 31 to 53 of 53).	https://zenodo.org/records/10457368
v4	image of Europe (part 1 to 21 of 58).	https://zenodo.org/records/10458273
v5	image of Europe (part 21 to 42 of 58).	https://zenodo.org/records/10460868
v6	image of Europe (part 43 to 58 of 58).	https://zenodo.org/records/10462506
v7	image of North America (part 1 to 20 of 93).	https://zenodo.org/records/10463385
v8	image of North America (part 21 to 40 of 93).	https://zenodo.org/records/10465076
v9	image of North America (part 41 to 60 of 93).	https://zenodo.org/records/10466569
v10	image of North America (part 61 to 80 of 93).	https://zenodo.org/records/10467291
v11	image of North America (part 81 to 93 of 93).	https://zenodo.org/records/10471557

Speech recognition alignments for Finnish parliament data
data.niaid.nih.gov
zenodo.org
Updated May 24, 2021
Cite
Mansikkaniemi, André (2021). Speech recognition alignments for Finnish parliament data [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4581940
Explore at:
Dataset updated
May 24, 2021
Dataset provided by
Mansikkaniemi, André
Kurimo, Mikko
Virkkunen, Anja
Area covered
Finland
Description
This dataset contains speech from Finnish parliament 2008-2020 plenary sessions, segmented and aligned for speech recognition training. In total, the training set has:

1.4 million samples

3100 hours of audio

460 speakers

over 19 million word tokens

Additionally, the upload contains a 5-hour development set and a 5-hour evaluation set described in publication 10.21437/Interspeech.2017-1115. Because of the size of the training set (~300 GB) and the Zenodo upload limit (50 GB), only the development and evaluation sets are published on Zenodo. The rest of the data is available at: http://urn.fi/urn:nbn:fi:lb-2021051903

The training set comes in two parts:

A 2008-2016 set, originally described in publication 10.21437/Interspeech.2017-1115. This set includes a list of samples from sessions in 2008-2014 that can be combined with the 2015-2020 set to form the 3100-hour training set.

A new 2015-2020 dataset.

All audio samples are single-channel, 16 kHz, 16-bit wav files. Each wav file has a corresponding transcript in a .trn text file. The data is machine-extracted, so small inaccuracies remain in the training set transcripts, and a few Swedish samples may be present. The development and evaluation sets have been corrected by hand.
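
As an illustration of the audio format described above, the following minimal sketch checks whether a downloaded file is single-channel, 16 kHz and 16-bit, using only Python's standard `wave` module. The function names are hypothetical and are not part of the dataset's own tooling.

```python
import wave


def matches_expected_format(path):
    """Return True if the WAV file is mono, 16 kHz, 16-bit PCM."""
    with wave.open(path, "rb") as wav:
        return (
            wav.getnchannels() == 1          # single-channel
            and wav.getframerate() == 16000  # 16 kHz sampling rate
            and wav.getsampwidth() == 2      # 2 bytes per sample = 16 bits
        )


def duration_seconds(path):
    """Return the length of the recording in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()
```

Running `matches_expected_format` over every wav file before training is a cheap way to catch files that were resampled or corrupted during download.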

The licenses can be viewed at:

http://urn.fi/urn:nbn:fi:lb-2019112822 (audio)

http://urn.fi/urn:nbn:fi:lb-2019112823 (text)

The code used in extraction is available at:

https://github.com/aalto-speech/finnish-parliament-scripts (2008-2014, dev and eval sets)

https://github.com/aalto-speech/fi-parliament-tools (2015-2020 set)
ClimateForecasts: Globally Observed Environmental Data for 15,504 Weather...
data.niaid.nih.gov
zenodo.org
Updated Mar 26, 2025
Kindt, Roeland (2025). ClimateForecasts: Globally Observed Environmental Data for 15,504 Weather Station Locations [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_10726088
Explore at:
Dataset updated
Mar 26, 2025
Dataset authored and provided by
Kindt, Roeland
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
ClimateForecasts is a database that provides environmental data for 15,504 weather station locations and 49 environmental variables, including 38 bioclimatic variables, 8 soil variables and 3 topographic variables. Data were extracted from the same 30 arc-seconds global grid layers that were prepared when making the TreeGOER (Tree Globally Observed Environmental Ranges) database that is available from https://doi.org/10.5281/zenodo.7922927. Details on the preparations of these layers are provided by Kindt, R. (2023). TreeGOER: A database with globally observed environmental ranges for 48,129 tree species. Global Change Biology 29: 6303–6318. https://onlinelibrary.wiley.com/doi/10.1111/gcb.16914. A similar extraction process was used for the CitiesGOER database that is also available from Zenodo via https://zenodo.org/doi/10.5281/zenodo.8175429.

ClimateForecasts (as the CitiesGOER) was designed to be used together with TreeGOER and possibly also with the GlobalUsefulNativeTrees database (Kindt et al. 2023) to allow users to filter suitable tree species based on environmental conditions of the planting site. One example of combining data from these different sets in the R statistical environment is available from this Rpub: https://rpubs.com/Roeland-KINDT/1114902.

The identities including the geographical coordinates of weather stations were sourced from Meteostat, specifically by downloading (17-FEB-2024) the ‘lite dump’ data set with information for active weather stations only. Two weather stations where the country could not be determined from the ISO 3166-1 code of ‘XA’ were removed. If weather stations had the same name, but occurred in different ISO 3166-2 regions, this region code was added to the name of the weather station between square brackets. Afterwards duplicates (weather stations of the same name and region) were manually removed.
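
The naming rule described above can be sketched as follows. This is only an illustration: the dictionary keys `name` and `region` are hypothetical, and the automatic removal of duplicates stands in for a step that the source describes as manual.

```python
from collections import defaultdict


def disambiguate(stations):
    """Append the region code to names shared across several regions.

    `stations` is a list of dicts with the hypothetical keys
    'name' and 'region' (an ISO 3166-2 code).
    """
    regions_by_name = defaultdict(set)
    for station in stations:
        regions_by_name[station["name"]].add(station["region"])

    seen = set()
    result = []
    for station in stations:
        name = station["name"]
        # Same name in more than one region: add the region in brackets.
        if len(regions_by_name[name]) > 1:
            name = f"{name} [{station['region']}]"
        key = (name, station["region"])
        if key in seen:
            # Same name and same region: keep the first occurrence only
            # (an assumption; the source removed these by hand).
            continue
        seen.add(key)
        result.append({"name": name, "region": station["region"]})
    return result
```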

Bioclimatic variables for future climates correspond to the median values from 24 Global Climate Models (GCMs) for Shared Socio-Economic Pathway (SSP) 1-2.6 for the 2050s (2041-2060), from 21 GCMs for SSP 3-7.0 for the 2050s and from 13 GCMs for SSP 5-8.5 for the 2090s. Similar methods were used to calculate these median values as in the case studies for the TreeGOER manuscript (calculations were partially done via the BiodiversityR::ensemble.envirem.run function and with downscaled bioclimatic and monthly climate 2.5 arc-minutes future grid layers available from WorldClim 2.1).
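
To illustrate what a median across climate models means per location, here is a minimal sketch in plain Python. It is not the method used for the database, which relied on R and the BiodiversityR package applied to raster layers; the function name and the input layout are assumptions made for the example.

```python
from statistics import median


def ensemble_median(values_by_model):
    """Median across models, computed separately for each location.

    `values_by_model` maps a model name to a list of values, one per
    location, with all lists in the same location order.
    """
    per_location = zip(*values_by_model.values())
    return [median(values) for values in per_location]
```

With an even number of models, such as the 24 GCMs used for SSP 1-2.6, `statistics.median` returns the mean of the two middle values.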

Maps were added in version 2024.03 where locations of weather stations were shown on a map of the Climatic Moisture Index (CMI). These maps were created by a similar process as in the TreeGOER Global Zones Atlas from the environmental raster layers used to create the TreeGOER via the terra package (Hijmans et al. 2022, version 1.7-46) in the R 4.2.1 environment. Added country boundaries were obtained from Natural Earth as Admin 0 – countries vector layers (version 5.1.1). Also added after obtaining them from Natural Earth were Admin 0 – Breakaway, Disputed areas (version 5.1.0, coloured yellow in the atlas) and Roads (version 5.0.0, coloured red in the atlas). For countries where the GlobalUsefulNativeTrees database included subnational levels, boundaries were added and depicted as dot-dash lines. These subnational levels correspond to level 3 boundaries in the World Geographical Scheme for Recording Plant Distributions. These were obtained from https://github.com/tdwg/wgsrpd. Check Brummitt 2001 for details such as the maps shown at the end of this document.

Maps for version 2024.07 modified the dimensions of the sheets to those used in version 2024.06 of the TreeGOER Global Zones Atlas. Another modification was the inclusion of Natural Earth boundaries for Lakes (version 5.0.0, coloured darkblue in the atlas).

Version 2024.10 includes a new data set that documents which Holdridge Life Zones the city locations fall in. Information is given for historical (1901-1920), contemporary (1979-2013) and future (2061-2080; separately for RCP 4.5 and RCP 8.5) periods, using data that are available for download from DRYAD and were created for the following article: Elsen et al. 2022. Accelerated shifts in terrestrial life zones under rapid climate change. Global Change Biology, 28, 918–935. https://doi.org/10.1111/gcb.15962. Version 2024.10 further includes Holdridge Life Zones for the climates included in earlier versions, calculating biotemperatures and life zones with methods similar to those used by Holdridge (1947; 1967) and Elsen et al. (2022) (for future climates, median values were first determined across GCMs for monthly maximum and minimum temperatures). The distributions of the 48,129 species documented in TreeGOER across the Holdridge Life Zones are given in this Zenodo archive: https://zenodo.org/records/14020914.

Version 2024.11 includes a new data set that documents the location of the weather stations in Köppen-Geiger climate zones. Information is given for historical (1901-1930, 1931-1960, 1961-1990) and future (2041-2070 and 2071-2099) climates, with for the future climates seven scenarios each (SSP 1-1.9, SSP 1-2.6, SSP 2-4.5, SSP 3-7.0, SSP 4-3.4, SSP 4-6.0 and SSP 5-8.5). This data set was created from raster layers available via: Beck, H.E., McVicar, T.R., Vergopolan, N. et al. High-resolution (1 km) Köppen-Geiger maps for 1901–2099 based on constrained CMIP6 projections. Sci Data 10, 724 (2023). https://doi.org/10.1038/s41597-023-02549-6.

Version 2025.03 includes extra columns for the baseline, 2050s and 2090s datasets that partially correspond to climate zones used in the GlobalUsefulNativeTrees database. One of these zonations is the set of Whittaker biome types, available as a polygon from the plotbiomes package (see also here). Whittaker biome types were extracted with R scripts similar to those described by Kindt 2025 (these were also used to calculate environmental ranges of TreeGOER species, as archived here).

Version 2025.03 further includes information for the baseline climate on the steady state water table depth, obtained from a 30 arc-seconds raster layer calculated by the GLOBGM v1.0 model (Verkaik et al. 2024).

When using ClimateForecasts in your work, cite this depository and the following:

Fick, S. E., & Hijmans, R. J. (2017). WorldClim 2: New 1‐km spatial resolution climate surfaces for global land areas. International Journal of Climatology, 37(12), 4302–4315. https://doi.org/10.1002/joc.5086

Title, P. O., & Bemmels, J. B. (2018). ENVIREM: An expanded set of bioclimatic and topographic variables increases flexibility and improves performance of ecological niche modeling. Ecography, 41(2), 291–307. https://doi.org/10.1111/ecog.02880

Poggio, L., de Sousa, L. M., Batjes, N. H., Heuvelink, G. B. M., Kempen, B., Ribeiro, E., & Rossiter, D. (2021). SoilGrids 2.0: Producing soil information for the globe with quantified spatial uncertainty. SOIL, 7(1), 217–240. https://doi.org/10.5194/soil-7-217-2021

Kindt, R. (2023). TreeGOER: A database with globally observed environmental ranges for 48,129 tree species. Global Change Biology, 00, 1–16. https://onlinelibrary.wiley.com/doi/10.1111/gcb.16914.

Meteostat (2024) Weather stations: Lite dump with active weather stations. https://github.com/meteostat/weather-stations (accessed 17-FEB-2024)

When using information from the Holdridge Life Zones, also cite:

Elsen, P. R., Saxon, E. C., Simmons, B. A., Ward, M., Williams, B. A., Grantham, H. S., Kark, S., Levin, N., Perez-Hammerle, K.-V., Reside, A. E., & Watson, J. E. M. (2022). Accelerated shifts in terrestrial life zones under rapid climate change. Global Change Biology, 28, 918–935. https://doi.org/10.1111/gcb.15962

When using information from Köppen-Geiger climate zones, also cite:

Beck, H.E., McVicar, T.R., Vergopolan, N., Berg, A., Lutsko, N.J., Dufour, A., Zeng, Z., Jiang, X., van Dijk, A.I. and Miralles, D.G. 2023. High-resolution (1 km) Köppen-Geiger maps for 1901–2099 based on constrained CMIP6 projections. Sci Data 10, 724. https://doi.org/10.1038/s41597-023-02549-6

When using information on the Whittaker biome types, also cite:

Ricklefs, R. E., Relyea, R. (2018). Ecology: The Economy of Nature. United States: W.H. Freeman.

Whittaker, R. H. (1970). Communities and ecosystems.

Valentin Ștefan, & Sam Levin. (2018). plotbiomes: R package for plotting Whittaker biomes with ggplot2 (v1.0.0). Zenodo. https://doi.org/10.5281/zenodo.7145245

When using information on the steady state water table depth, also cite:

Verkaik, J., Sutanudjaja, E. H., Oude Essink, G. H., Lin, H. X., & Bierkens, M. F. (2024). GLOBGM v1.0: a parallel implementation of a 30 arcsec PCR-GLOBWB-MODFLOW global-scale groundwater model. Geoscientific Model Development, 17(1), 275-300. https://gmd.copernicus.org/articles/17/275/2024/

The development of ClimateForecasts and its partial integration in version 2024.03 of the GlobalUsefulNativeTrees database was supported by the Darwin Initiative to project DAREX001 of Developing a Global Biodiversity Standard certification for tree-planting and restoration, by Norway’s International Climate and Forest Initiative through the Royal Norwegian Embassy in Ethiopia to the Provision of Adequate Tree Seed Portfolio project in Ethiopia, by the Green Climate Fund through the IUCN-led Transforming the Eastern Province of Rwanda through Adaptation project and through the Readiness proposal on Climate Appropriate Portfolios of Tree Diversity for Burkina Faso, by the Bezos Earth Fund to the Quality Tree Seed for Africa in Kenya and Rwanda project and by the German International Climate Initiative (IKI) to the regional tree seed programme on The Right Tree for the Right Place for the Right Purpose in Africa.
Indoor Positioning Simulation For Examination And Correction Of Occupancy...
data.niaid.nih.gov
zenodo.org
Updated May 30, 2022
+ more versions
Anonymous (2022). Indoor Positioning Simulation For Examination And Correction Of Occupancy Limits In Architectural Design [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_6591448
Explore at:
Dataset updated
May 30, 2022
Dataset authored and provided by
Anonymous
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Dataset that contains the images of scenarios used for the analysis and the analysis itself.
Biomass encounter rates limit the size scaling of feeding interactions:...
zenodo.org
zip
Updated Jan 24, 2020
Daniel Barrios-O'Neill (2020). Biomass encounter rates limit the size scaling of feeding interactions: trial data [Dataset]. http://doi.org/10.5281/zenodo.3357928
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.5281/zenodo.3357928
Dataset updated
Jan 24, 2020
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Daniel Barrios-O'Neill
Description
Experimental trial data for the publication: Biomass encounter rates limit the size scaling of feeding interactions.
Two 100 ns NVT molecular dynamics simulations of dsDNA and dsRNA "GGGG"...
zenodo.org
bin, sh, zip
Updated Apr 16, 2024
Elise Duboué-Dijon; Elisa Frezza (2024). Two 100 ns NVT molecular dynamics simulations of dsDNA and dsRNA "GGGG" 18-mers (GCGGGGGGGGGGGGGGGC) [Dataset]. http://doi.org/10.5281/zenodo.10647894
Explore at:
Available download formats: sh, bin, zip
Unique identifier
https://doi.org/10.5281/zenodo.10647894
Dataset updated
Apr 16, 2024
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Elise Duboué-Dijon; Elisa Frezza
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Supporting information for "Molecular origin of distinct hydration dynamics in double helical DNA and RNA sequences" by E. Frezza, D. Laage and E. Duboué-Dijon, J. Phys. Chem. Lett. 2024, 15, 4351–4358
Two 100 ns-long NVT molecular dynamics simulations: one of a dsDNA "GGGG" 18-mer (GCGGGGGGGGGGGGGGGC) and one of the analogous dsRNA. The nucleic acid is explicitly solvated in water and neutralized with 0.15 M KCl. Simulations were performed using the Gromacs 5 software. DNA is described with the Amber 99SB-ILDN force field with the BSC0 modifications, RNA is described with the Amber 99SB-ILDN force field with the BSC0 and χOL3 modifications, the SPC/E force field is used for water, and the Joung Cheatham parameters are used for ions. The shared coordinates are saved every 500 fs, half as often as in the original trajectories used for the publication, to keep the shared dataset below the allowed size limit.
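
As a rough worked example of how the saving interval affects dataset size, the sketch below counts the number of saving intervals in a trajectory. The 250 fs interval for the original trajectories is inferred from the description (saved twice as often as every 500 fs), and the helper name is hypothetical; whether an extra initial frame is stored is not stated in the source.

```python
FS_PER_NS = 1_000_000  # 1 ns = 1,000,000 fs


def saving_intervals(length_ns, interval_fs):
    """Number of saving intervals in a trajectory of the given length."""
    return (length_ns * FS_PER_NS) // interval_fs


shared = saving_intervals(100, 500)    # shared trajectories
original = saving_intervals(100, 250)  # inferred original interval
print(shared, original)
```

Halving the saving frequency halves the number of stored coordinate sets, and therefore roughly halves the trajectory file size.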
Replication data for Carleton et al. (Quarterly Journal of Economics, 2022),...
zenodo.org
data.niaid.nih.gov
zip
Updated Dec 26, 2023
Tamma Carleton; Amir Jina; Michael Delgado; Michael Greenstone; Trevor Houser; Solomon Hsiang; Andrew Hultgren; Robert E. Kopp; Kelly E. McCusker; Ishan Nath; James Rising; Ashwin Rode; Hee Kwon Seo; Arvid Viaene; Jiacan Yuan; Alice Tianbo Zhang (2023). Replication data for Carleton et al. (Quarterly Journal of Economics, 2022), "Valuing the mortality consequences of climate change accounting for adaptation costs and benefits" [Dataset]. http://doi.org/10.5281/zenodo.6416119
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.5281/zenodo.6416119
Dataset updated
Dec 26, 2023
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Tamma Carleton; Amir Jina; Michael Delgado; Michael Greenstone; Trevor Houser; Solomon Hsiang; Andrew Hultgren; Robert E. Kopp; Kelly E. McCusker; Ishan Nath; James Rising; Ashwin Rode; Hee Kwon Seo; Arvid Viaene; Jiacan Yuan; Alice Tianbo Zhang
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This repository contains replication data for Carleton et al. (Quarterly Journal of Economics, 2022), "Valuing the mortality consequences of climate change accounting for adaptation costs and benefits". All non-confidential data inputs are included, as well as intermediate data outputs, final data outputs, and final tables and figures for all main text and supplementary tables and figures. Some input data are confidential (e.g., mortality records in some countries); therefore, intermediate regression results files are included in the upload to ensure all later stages of the analysis are fully replicable. The full data output files resulting from Monte Carlo simulations of future climate change impacts on mortality far exceed Zenodo file size limits; therefore, key aggregates of the raw output files are included here, which allow for replication of all tables and figures in the paper.

data.zip contains raw, intermediate, and final datasets

outputs.zip contains output tables and figures

All replication code for the paper is available on a public GitHub repository, accessible here.

The manuscript and supplementary information are available at the QJE, here.
A Pelagic Size Structure database (PSSdb) to support biogeochemical...
zenodo.org
pdf, zip
Updated Jul 10, 2024
+ more versions
Mathilde Dugenne; Marco Corrales-Ugalde; Todd O'Brien; Fabien Lombard; Jean-Olivier Irisson; Lars Stemmann; Charles Stock; Rainer Kiko; Jessica Y. Luo (2024). A Pelagic Size Structure database (PSSdb) to support biogeochemical modeling: update to first release [Dataset]. http://doi.org/10.5281/zenodo.10150020
Explore at:
Available download formats: pdf, zip
Unique identifier
https://doi.org/10.5281/zenodo.10150020
Dataset updated
Jul 10, 2024
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Mathilde Dugenne; Marco Corrales-Ugalde; Todd O'Brien; Fabien Lombard; Jean-Olivier Irisson; Lars Stemmann; Charles Stock; Rainer Kiko; Jessica Y. Luo
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset is an update to the first release of the Pelagic Size Structure database (PSSdb, https://pssdb.net) scientific project, investigating the global particle size distributions measured from multiple pelagic imaging systems. These devices include the Imaging FlowCytobot (Olson and Sosik 2007), benchtop scanners like the ZooScan (Gorsky et al. 2010), and the Underwater Vision Profiler (Picheral et al. 2010). The data sources originate from Ecotaxa (https://ecotaxa.obs-vlfr.fr/), Ecopart (https://ecopart.obs-vlfr.fr/), and Imaging FlowCytobot dashboards (https://ifcb.caloos.org/dashboard and https://ifcb-data.whoi.edu/dashboard). Links to the PSSdb code and documentation are available on the PSSdb webpage (https://pssdb.net).

This updated version includes the following changes:

Duplicate data entries and NaN values have been removed.

Data products now include Normalized Biomass Size Spectra (NBSS) and Particle Size Distribution (PSD), two widely used methods to represent plankton and particle size distributions in marine ecology and biogeochemistry.

Linear regressions are now performed with log10 transformations of the normalized biovolume/abundance and the size classes.

Inclusion of UVP6 and other benchtop plankton Scanner datasets from net tows, which expand the temporal and spatial coverage of the data products.

The unbiased portion of the size spectra is now selected by a new thresholding method that accounts for uncertainty in both particle size (limited by the camera resolution) and particle count, as well as for gaps in the size spectra, so that only size classes with less than 20% uncertainty are retained.

This PSSdb dataset is composed of two products, specific to each imaging device:

Product 1a includes the size distribution, computed from normalized biovolume for NBSS and from normalized abundance for PSD, of plankton and particles within a set of pre-defined size classes (expressed in both biovolume and equivalent circular diameter), averaged by year and month, and in 1-degree longitude/latitude grid cells.

Product 1b includes the NBSS and PSD regression fit parameters (slope, intercept, and coefficient of determination, R2), averaged by year and month, and in 1-degree longitude/latitude grid cells. The regression parameters are obtained from ordinary least squares linear regressions applied to log10-transformed normalized biovolume/normalized abundance and biovolume/diameter size class values.

Size spectra parameters were averaged over a maximum of 16 spatial and temporal subsets (0.5°x0.5°x1 week) to avoid over-representation of repeated sampling events (e.g., time-series datasets) within a grid cell. Linear regressions were performed on the linear portion of the log10-transformed NBSS and PSD estimates, between the size classes with a size measurement or particle count uncertainty greater than 20% (Schartau et al. 2010), and where the maximum NB/PSD is observed and the largest size class before three empty consecutive size classes.
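
The regression step described for Product 1b can be illustrated with a minimal sketch of an ordinary least squares fit on log10-transformed values. This is not the PSSdb code (which is linked from the PSSdb webpage); the function name and argument names are hypothetical, and the selection of the linear portion of the spectrum is assumed to have been done beforehand.

```python
import math


def log10_ols(size_classes, normalized_values):
    """Fit log10(value) = intercept + slope * log10(size) by least squares.

    Returns (slope, intercept, r_squared). All inputs must be positive,
    and there must be at least two distinct sizes and two distinct values.
    """
    x = [math.log10(v) for v in size_classes]
    y = [math.log10(v) for v in normalized_values]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared
```

For a spectrum in which the normalized value drops tenfold for every tenfold increase in size, the fitted slope is close to -1.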

For additional information, please see the PDF documentation available below ...
Data for the legacy BioLiP1 database, part 1.
zenodo.org
bin
Updated Oct 10, 2023
+ more versions
Chengxin Zhang (2023). Data for the legacy BioLiP1 database, part 1. [Dataset]. http://doi.org/10.5281/zenodo.8407896
Explore at:
Available download formats: bin
Unique identifier
https://doi.org/10.5281/zenodo.8407896
Dataset updated
Oct 10, 2023
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Chengxin Zhang
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This is the web service data for version 1 of the BioLiP database, last updated on April 1, 2022. Due to Zenodo's file size limit, the full data is split into two parts (https://doi.org/10.5281/zenodo.8407896 and https://zenodo.org/record/8407920). To concatenate and decompress the files after downloading both parts:
$ cat BioLiP1aa BioLiP1ab > BioLiP.tar.bz2
$ tar -xvf BioLiP.tar.bz2
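
On systems without `cat` and `tar`, the same two steps can be sketched in Python with the standard library. The helper names below are hypothetical; only the file names BioLiP1aa, BioLiP1ab and BioLiP.tar.bz2 come from the commands above.

```python
import shutil
import tarfile


def join_parts(part_paths, out_path):
    """Concatenate the split parts, in order, into a single file."""
    with open(out_path, "wb") as out:
        for part in part_paths:
            with open(part, "rb") as src:
                shutil.copyfileobj(src, out)


def extract_archive(archive_path, dest_dir):
    """Extract a bzip2-compressed tar archive into dest_dir."""
    with tarfile.open(archive_path, "r:bz2") as tar:
        tar.extractall(dest_dir)
```

A call such as `join_parts(["BioLiP1aa", "BioLiP1ab"], "BioLiP.tar.bz2")` followed by `extract_archive("BioLiP.tar.bz2", ".")` mirrors the shell commands; the order of the parts matters.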
