100+ datasets found
  1. dataset

    • zenodo.org
    Updated Feb 17, 2023
    + more versions
    Cite
    Zhou Ang (2023). dataset [Dataset]. http://doi.org/10.5281/zenodo.7397415
    Dataset updated
    Feb 17, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Zhou Ang
    Description

    Dataset for paper

  2. Digital repository capabilities and characteristics mapping spreadsheet

    • zenodo.org
    bin
    Updated Oct 29, 2025
    Cite
    Philipp Conzett; Severine Duvaud; Thomas Jouneau; Joel Kallio; Terje Klemetsen; Jonas Recker; Pavel Straňák; Sanni Tujunen (2025). Digital repository capabilities and characteristics mapping spreadsheet [Dataset]. http://doi.org/10.5281/zenodo.17471457
    Available download formats: bin
    Dataset updated
    Oct 29, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Philipp Conzett; Severine Duvaud; Thomas Jouneau; Joel Kallio; Terje Klemetsen; Jonas Recker; Pavel Straňák; Sanni Tujunen
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    2025
    Description

    This spreadsheet presents the structured mapping of repository capabilities and characteristics conducted in Task 5.2 of the FIDELIS project. It includes metadata and annotations for over 80 resources (such as standards, best practices, and landscape analyses) aligned with the 30 Activities and Functions defined in the FIDELIS Transparent Trustworthy Repository Attributes Matrix (TTRAM). The mapping covers both domain-agnostic and domain-specific resources across five scientific communities and serves as a foundational dataset for the FIDELIS landscape analysis.

  3. Zenodo Code Images

    • kaggle.com
    zip
    Updated Jun 18, 2018
    Cite
    Stanford Research Computing Center (2018). Zenodo Code Images [Dataset]. https://www.kaggle.com/datasets/stanfordcompute/code-images
    Available download formats: zip (0 bytes)
    Dataset updated
    Jun 18, 2018
    Dataset authored and provided by
    Stanford Research Computing Center
    Description

    Code Images

    Context

    This is a subset of the Zenodo-ML Dinosaur Dataset [GitHub] that has been converted to small png files and organized in folders by language, so you can jump right into using machine learning methods that assume image input.

    Content

    Included are .tar.gz files, each named after a file extension. Extracting an archive produces a folder of the same name.

    $ tree -L 1
    .
    ├── c
    ├── cc
    ├── cpp
    ├── cs
    ├── css
    ├── csv
    ├── cxx
    ├── data
    ├── f90
    ├── go
    ├── html
    ├── java
    ├── js
    ├── json
    ├── m
    ├── map
    ├── md
    ├── txt
    └── xml
    

    We can peek inside one (somewhat smaller) folder of the set to see that the subfolders are Zenodo identifiers. A Zenodo identifier corresponds to a single GitHub repository, so the png files in a subfolder are chunks of code of that extension type from one particular repository.

    $ tree map -L 1
    map
    ├── 1001104
    ├── 1001659
    ├── 1001793
    ├── 1008839
    ├── 1009700
    ├── 1033697
    ├── 1034342
    ...
    ├── 836482
    ├── 838329
    ├── 838961
    ├── 840877
    ├── 840881
    ├── 844050
    ├── 845960
    ├── 848163
    ├── 888395
    ├── 891478
    └── 893858
    
    154 directories, 0 files
    

    Within each folder (zenodo id) the files are prefixed by the zenodo id, followed by the index into the original image set array that is provided with the full dinosaur dataset archive.

    $ tree m/891531/ -L 1
    m/891531/
    ├── 891531_0.png
    ├── 891531_10.png
    ├── 891531_11.png
    ├── 891531_12.png
    ├── 891531_13.png
    ├── 891531_14.png
    ├── 891531_15.png
    ├── 891531_16.png
    ├── 891531_17.png
    ├── 891531_18.png
    ├── 891531_19.png
    ├── 891531_1.png
    ├── 891531_20.png
    ├── 891531_21.png
    ├── 891531_22.png
    ├── 891531_23.png
    ├── 891531_24.png
    ├── 891531_25.png
    ├── 891531_26.png
    ├── 891531_27.png
    ├── 891531_28.png
    ├── 891531_29.png
    ├── 891531_2.png
    ├── 891531_30.png
    ├── 891531_3.png
    ├── 891531_4.png
    ├── 891531_5.png
    ├── 891531_6.png
    ├── 891531_7.png
    ├── 891531_8.png
    └── 891531_9.png
    
    0 directories, 31 files
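
The naming convention above can be unpacked with a few lines of Python. This is a minimal sketch, assuming every file name has the form `<zenodo id>_<index>.png` with exactly one underscore; the helper name `parse_code_image_name` is hypothetical and not part of the dataset's tooling. Sorting on the parsed index restores numeric order, which the lexicographic `tree` listing does not show.

```python
from pathlib import Path


def parse_code_image_name(path):
    """Split '<zenodo_id>_<index>.png' into (zenodo_id, index) as integers."""
    stem = Path(path).stem              # e.g. '891531_10'
    zenodo_id, index = stem.split("_")  # assumes exactly one underscore
    return int(zenodo_id), int(index)


# File names taken from the listing above
names = ["891531_0.png", "891531_10.png", "891531_1.png", "891531_2.png"]

# Sort numerically by index rather than lexicographically
ordered = sorted(names, key=lambda n: parse_code_image_name(n)[1])
print(ordered)
```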
    

    So what's the difference?

    The difference is that these files are organized by extension type, and provided as actual png images. The original data is provided as numpy data frames, and is organized by zenodo ID. Both are useful for different things - this particular version is cool because we can actually see what a code image looks like.

    How many images total?

    We can count the number of total images:

    find . -type f -name "*.png" | wc -l
    3,026,993
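
The same count can be taken from Python, which is handy when the total is needed inside a training script. This is a minimal sketch using only the standard library; `count_png_files` is a hypothetical helper name, and the root directory is whatever folder the archives were extracted into.

```python
from pathlib import Path


def count_png_files(root):
    """Recursively count regular files ending in .png under root."""
    return sum(1 for p in Path(root).rglob("*.png") if p.is_file())


if __name__ == "__main__":
    # Count from the current directory, like the find command above
    print(count_png_files("."))
```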
    

    Dataset Curation

    The script to create the dataset is provided here. Essentially, we start with the top extensions as identified by this work (excluding actual image files) and then write each 80x80 image to a png file, organized by extension and then by Zenodo id (as shown above).

    Saving the Image

    I tested a few methods to write the single channel 80x80 data frames as png images, and wound up liking cv2's imwrite function because it would save and then load the exact same content.

    import cv2
    cv2.imwrite(image_path, image)
    

    Loading the Image

    Given the above, it's pretty easy to load an image! Here is an example using imageio, followed by the older scipy call that triggers a deprecation message on newer installs.

    image_path = '/tmp/data1/data/csv/1009185/1009185_0.png'
    from imageio import imread
    
    image = imread(image_path)
    array([[116, 105, 109, ..., 32, 32, 32],
        [ 48, 44, 48, ..., 32, 32, 32],
        [ 48, 46, 49, ..., 32, 32, 32],
        ..., 
        [ 32, 32, 32, ..., 32, 32, 32],
        [ 32, 32, 32, ..., 32, 32, 32],
        [ 32, 32, 32, ..., 32, 32, 32]], dtype=uint8)
    
    
    image.shape
    (80, 80)
    
    
    # Deprecated: scipy.misc.imread has since been removed from SciPy (older installs only)
    from scipy import misc
    misc.imread(image_path)
    
    Image([[116, 105, 109, ..., 32, 32, 32],
        [ 48, 44, 48, ..., 32, 32, 32],
        [ 48, 46, 49, ..., 32, 32, 32],
        ..., 
        [ 32, 32, 32, ..., 32, 32, 32],
        [ 32, 32, 32, ..., 32, 32, 32],
        [ 32, 32, 32, ..., 32, 32, 32]], dtype=uint8)
    

    Remember that the values in the data are characters that have been converted to ordinal. Can you guess what 32 is?

    ord(' ')
    32
    
    # And thus if you wanted to convert it back...
    chr(32)
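
To make the round trip concrete, the leading values printed in the array above can be decoded back into text. This is a small sketch; `row_to_text` is a hypothetical helper, and it accepts any iterable of integers, so a row of the array returned by `imread` should work as well as the plain lists used here.

```python
def row_to_text(row):
    """Map each ordinal in a row back to its character."""
    return "".join(chr(int(v)) for v in row)


# First three values of the first three rows shown in the array above
rows = [
    [116, 105, 109],
    [48, 44, 48],
    [48, 46, 49],
]

decoded = [row_to_text(r) for r in rows]
print(decoded)  # ['tim', '0,0', '0.1']
```

The decoded fragments look like the start of a header and two numeric rows, which fits an image taken from the csv folder.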
    

    So how t...

  4. Most downloaded Zenodo datasets

    • kaggle.com
    zip
    Updated Feb 6, 2020
    Cite
    Chris Gorgolewski (2020). Most downloaded Zenodo datasets [Dataset]. https://www.kaggle.com/chrisfilo/most-downloaded-zenodo-datasets
    Available download formats: zip (22524 bytes)
    Dataset updated
    Feb 6, 2020
    Authors
    Chris Gorgolewski
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Context

    Zenodo.org is a popular data repository hosted by CERN. There are tens of thousands of datasets in the repository, but not all of them are used to the same extent.

    Content

    This dataset includes names and links to the top 500 most downloaded datasets on Zenodo.

    Inspiration

    This dataset can be used to find datasets deposited on Zenodo that would benefit from additional exposure to the DS/ML community by uploading them to Kaggle.

  5. Data from the International Open Data Repository Survey

    • zenodo.org
    • data.niaid.nih.gov
    zip
    Updated May 25, 2022
    Cite
    Markus von der Heyde (2022). Data from the International Open Data Repository Survey [Dataset]. http://doi.org/10.5281/zenodo.2643493
    Available download formats: zip
    Dataset updated
    May 25, 2022
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Markus von der Heyde
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This file collection is part of the ORD Landscape and Cost Analysis Project (DOI: 10.5281/zenodo.2643460), a study jointly commissioned by the SNSF and swissuniversities in 2018.

    Please cite this data collection as:
    von der Heyde, M. (2019). Data from the International Open Data Repository Survey. Retrieved from https://doi.org/10.5281/zenodo.2643493

    Further information is given in the corresponding data paper:
    von der Heyde, M. (2019). International Open Data Repository Survey: Description of collection, collected data, and analysis methods [Data paper]. Retrieved from https://doi.org/10.5281/zenodo.2643450

    Contact

    Swiss National Science Foundation (SNSF)

    Open Research Data Group

    E-mail: ord@snf.ch

    swissuniversities

    Program "Scientific Information"

    Gabi Schneider

    E-Mail: isci@swissuniversities.ch

  6. Repository for Single Cell RNA Sequencing Analysis of The EMT6 Dataset

    • data.niaid.nih.gov
    Updated Nov 20, 2023
    Cite
    Hsu, Jonathan; Stoop, Allart (2023). Repository for Single Cell RNA Sequencing Analysis of The EMT6 Dataset [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_10011621
    Dataset updated
    Nov 20, 2023
    Authors
    Hsu, Jonathan; Stoop, Allart
    Description

    Table of Contents

    • Main Description
    • File Descriptions
    • Linked Files
    • Installation and Instructions

    1. Main Description

    This is the Zenodo repository for the manuscript titled "A TCR β chain-directed antibody-fusion molecule that activates and expands subsets of T cells and promotes antitumor activity.". The code included in the file titled marengo_code_for_paper_jan_2023.R was used to generate the figures from the single-cell RNA sequencing data. The following libraries are required for script execution:

    Seurat, scRepertoire, ggplot2, stringr, dplyr, ggridges, ggrepel, ComplexHeatmap

    File Descriptions

    The code can be downloaded and opened in RStudio. The "marengo_code_for_paper_jan_2023.R" file contains all the code needed to reproduce the figures in the paper. The "Marengo_newID_March242023.rds" file is available at the following address: https://zenodo.org/badge/DOI/10.5281/zenodo.7566113.svg (Zenodo DOI: 10.5281/zenodo.7566113). The "all_res_deg_for_heat_updated_march2023.txt" file contains the unfiltered results from the DGE analysis, which were also used to create the heatmap with DGE and volcano plots. The "genes_for_heatmap_fig5F.xlsx" file contains the genes included in the heatmap in figure 5F.

    Linked Files

    This repository contains code for the analysis of a single-cell RNA-seq dataset. The dataset contains raw FASTQ files as well as the aligned files that were deposited in GEO. The "Rdata" or "Rds" file was deposited in Zenodo. Provided below are descriptions of the linked datasets:

    Gene Expression Omnibus (GEO) ID: GSE223311(https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE223311)

    Title: Gene expression profile at single cell level of CD4+ and CD8+ tumor infiltrating lymphocytes (TIL) originating from the EMT6 tumor model from mSTAR1302 treatment. Description: This submission contains the "matrix.mtx", "barcodes.tsv", and "genes.tsv" files for each replicate and condition, corresponding to the aligned files for single cell sequencing data. Submission type: Private. In order to gain access to the repository, you must use a reviewer token (https://www.ncbi.nlm.nih.gov/geo/info/reviewer.html).

    Sequence read archive (SRA) repository ID: SRX19088718 and SRX19088719

    Title: Gene expression profile at single cell level of CD4+ and CD8+ tumor infiltrating lymphocytes (TIL) originating from the EMT6 tumor model from mSTAR1302 treatment. Description: This submission contains the raw sequencing or .fastq.gz files, which are tab delimited text files. Submission type: Private. In order to gain access to the repository, you must use a reviewer token (https://www.ncbi.nlm.nih.gov/geo/info/reviewer.html).

    Zenodo DOI: 10.5281/zenodo.7566113(https://zenodo.org/record/7566113#.ZCcmvC2cbrJ)

    Title: A TCR β chain-directed antibody-fusion molecule that activates and expands subsets of T cells and promotes antitumor activity. Description: This submission contains the "Rdata" or ".Rds" file, which is an R object file. This file is required to run the code. Submission type: Restricted Access. In order to gain access to the repository, you must contact the author.

    Installation and Instructions

    The code included in this submission requires several essential packages, as listed above. Please follow these instructions for installation:

    Ensure you have R version 4.1.2 or higher for compatibility.

    Although it is not essential, you can use RStudio (version 2022.12.0+353) for accessing and executing the code.

    1. Download the "Rdata" or ".Rds" file from Zenodo (https://zenodo.org/record/7566113#.ZCcmvC2cbrJ) (Zenodo DOI: 10.5281/zenodo.7566113).
    2. Open RStudio (https://www.rstudio.com/tags/rstudio-ide/) or a similar integrated development environment (IDE) for R.
    3. Set your working directory to where the following files are located:

    • marengo_code_for_paper_jan_2023.R
    • Install_Packages.R
    • Marengo_newID_March242023.rds
    • genes_for_heatmap_fig5F.xlsx
    • all_res_deg_for_heat_updated_march2023.txt

    You can use the following code to set the working directory in R:

    setwd(directory)

    4. Open the file titled "Install_Packages.R" and execute it in your R IDE. This script will attempt to install all the necessary packages and their dependencies in order to set up an environment where the code in "marengo_code_for_paper_jan_2023.R" can be executed.
    5. Once the "Install_Packages.R" script has been successfully executed, restart RStudio or your IDE of choice.
    6. Open the file "marengo_code_for_paper_jan_2023.R" in RStudio or your IDE of choice.
    7. Execute the commands in "marengo_code_for_paper_jan_2023.R" in RStudio or your IDE of choice to generate the plots.

  7. Precision Liming Soil Datasets (LimeSoDa) Zenodo Repository

    • repository.soilwise-he.eu
    Updated Sep 8, 2025
    + more versions
    Cite
    (2025). Precision Liming Soil Datasets (LimeSoDa) Zenodo Repository [Dataset]. http://doi.org/10.5281/zenodo.14936177
    Dataset updated
    Sep 8, 2025
    Description

    Overview

    Precision Liming Soil Datasets (LimeSoDa) is a collection of 31 datasets from a field- and farm-scale soil mapping context. These datasets are 'ready-to-use' for modeling purposes, as they include target soil properties and features in a tidy tabular format. Three target soil properties are present in every dataset: (1) soil organic matter (SOM) or soil organic carbon (SOC), (2) pH, and (3) clay content, while the features for modeling are dataset-specific. The primary goal of LimeSoDa is to enable more reliable benchmarking of machine learning methods in digital soil mapping and pedometrics. All the associated materials and data from LimeSoDa can be downloaded in this data repository. However, for a more in-depth analysis, we refer to the published paper 'LimeSoDa: A Dataset Collection for Benchmarking of Machine Learning Regressors in Digital Soil Mapping' by Schmidinger et al. (2025). You may also use our R and Python package, likewise called LimeSoDa.

    Citation

    Upon usage of datasets from LimeSoDa, please cite our associated paper: Schmidinger, J., Vogel, S., Barkov, V., Pham, A.-D., Gebbers, R., Tavakoli, H., Correa, J., Tavares, T.R., Filippi, P., Jones, E. J., Lukas, V., Boenecke, E., Ruehlmann, J., Schroeter, I., Kramer, E., Paetzold, S., Kodaira, M., Wadoux, A.M.J.-C., Bragazza, L., Metzger, K., Huang, J., Valente, D.S.M., Safanelli, J.L., Bottega, E.L., Dalmolin, R.S.D., Farkas, C., Steiger, A., Horst, T. Z., Ramirez-Lopez, L., Scholten, T., Stumpf, F., Rosso, P., Costa, M.M., Zandonadi, R.S., Wetterlind, J. & Atzmueller, M. (2025). LimeSoDa: A Dataset Collection for Benchmarking of Machine Learning Regressors in Digital Soil Mapping.

  8. MDverse datasets

    • nde-dev.biothings.io
    Updated Apr 23, 2023
    + more versions
    Cite
    Poulain, Pierre (2023). MDverse datasets [Dataset]. https://nde-dev.biothings.io/resources?id=zenodo_7856523
    Dataset updated
    Apr 23, 2023
    Dataset provided by
    Poulain, Pierre
    Tiemann, Johanna K. S.
    Chavent, Mathieu
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Files and datasets in Parquet format related to molecular dynamics and retrieved from the Zenodo, Figshare and OSF data repositories. The file 'data_model_parquet.md' is a codebook that contains data models for the Parquet files.

  9. delete

    • zenodo.org
    Updated Apr 7, 2020
    Cite
    Hannes Rathmann (2020). delete [Dataset]. http://doi.org/10.5281/zenodo.3568104
    Dataset updated
    Apr 7, 2020
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Hannes Rathmann
    Description

    delete (repository outdated)

    new repository: https://zenodo.org/record/3713179#.XoxaDWDgq71

  10. Ab-initio data repository for physics-informed data-driven model

    • zenodo.org
    bin, csv, png
    Updated Dec 23, 2024
    Cite
    Sudhanshu Kuthe (2024). Ab-initio data repository for physics-informed data-driven model [Dataset]. http://doi.org/10.5281/zenodo.13236874
    Available download formats: csv, bin, png
    Dataset updated
    Dec 23, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Sudhanshu Kuthe
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Aug 6, 2024
    Description

    Ab-initio Data Repository for Physics-Informed Data-Driven Model

    This repository stores precise Density Functional Theory (DFT) calculations and Vienna Ab initio Simulation Package (VASP) codes to provide a comprehensive dataset for physics-informed models. It specifically considers the steelmaking process by focusing on different types of non-metallic inclusions (NMIs) within the steel melt.

    Data Sets Included:

    1. Hamaker Constants values
    2. Coagulation coefficients and Entrapment Probabilities
    3. Kinematic Viscosity of SAE 1055 steel
    4. Dynamic Viscosity of SAE 1055 steel
    5. Dielectric function of oxides
    6. Lattice parameters for molten and bulk Iron
    7. Surface Tension of molten Fe.

    Purpose and Application: This repository is designed to support advanced physics-informed modeling approaches, such as those using machine learning algorithms to predict clogging and inclusion behaviors in steelmaking processes.

    The datasets include the following types of NMIs, with detailed characteristics in the size range of 1-10 µm:

    • Al2O3
    • CaO
    • MgO
    • MgAl2O4
    • CaS
    • ZrO2

  11. Zenodo

    • bioregistry.io
    Updated Jan 9, 2023
    Cite
    (2023). Zenodo [Dataset]. http://identifiers.org/re3data:r3d100010468
    Dataset updated
    Jan 9, 2023
    Description

    Zenodo is an open repository that allows researchers to deposit research papers, data sets, research software, reports, and any other research related digital artefacts.

  12. nmiracle-datasets

    • huggingface.co
    Updated Dec 25, 2025
    Cite
    Federico Ottomano (2025). nmiracle-datasets [Dataset]. https://huggingface.co/datasets/fedeotto/nmiracle-datasets
    Dataset updated
    Dec 25, 2025
    Authors
    Federico Ottomano
    License

    https://choosealicense.com/licenses/other/

    Description

    Datasets utilized to train NMIRacle

    This dataset repository contains derived data used for the development and evaluation of the NMIRacle framework. The data is not original. It is constructed from the following publicly available Zenodo datasets:

    Multimodal spectroscopic dataset (License: CDLA–Sharing 1.0): https://zenodo.org/records/14770232

    NMR2Struct training data (License: CC-BY-4.0): https://zenodo.org/records/13892026

    Please refer to the original Zenodo repositories for the… See the full description on the dataset page: https://huggingface.co/datasets/fedeotto/nmiracle-datasets.

  13. Repository for Aerial Imagery-Derived Dataset of Manufactured Housing...

    • zenodo.org
    zip
    Updated Aug 11, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Armin Yeganeh; Maria Marshall; Noah Durst (2025). Repository for Aerial Imagery-Derived Dataset of Manufactured Housing Communities in the North Central United States [Dataset]. http://doi.org/10.5281/zenodo.16459113
    Available download formats: zip
    Dataset updated
    Aug 11, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Armin Yeganeh; Maria Marshall; Noah Durst
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    United States
    Description

    This repository contains the dataset referenced in the Scientific Data journal article titled "Aerial Imagery-Derived Dataset of Manufactured Housing Communities in the North Central United States" by Armin Yeganeh, Maria Marshall, and Noah Durst. The associated code scripts are available at https://github.com/arminyeganeh/mhc

  14. Long-Term Wi-Fi fingerprinting dataset and supporting material

    • zenodo.org
    zip
    Updated Apr 11, 2020
    + more versions
    Cite
    Germán Martín Mendoza-Silva; Philipp Richter; Joaquín Torres-Sospedra; Elena Simona Lohan; Joaquín Huerta (2020). Long-Term Wi-Fi fingerprinting dataset and supporting material [Dataset]. http://doi.org/10.5281/zenodo.1066041
    Available download formats: zip
    Dataset updated
    Apr 11, 2020
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Germán Martín Mendoza-Silva; Philipp Richter; Joaquín Torres-Sospedra; Elena Simona Lohan; Joaquín Huerta
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    WiFi measurements database for UJI's library and supporting material.

    The measurements were collected by one person using one Android smartphone over 15 months on two floors of the library building at Universitat Jaume I, in Spain. The database contains 63,504 WiFi fingerprints, which are organized into datasets. Each dataset is the result of a collection campaign.

    The supporting material includes Matlab® scripts to load and filter the desired data, and provides examples of possible studies that the database may enable. The supporting material also includes the local coordinates of the bookshelves.

    Citation request:

    G.M. Mendoza-Silva, P. Richter, J. Torres-Sospedra, E.S. Lohan, J. Huerta, "Long-Term Wi-Fi fingerprinting dataset and supporting material", Zenodo repository, DOI 10.5281/zenodo.1066041.

  15. The North Pacific Eukaryotic Gene Catalog: clustered nucleotide...

    • zenodo.org
    application/gzip
    Updated Jan 22, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Ryan Groussman; Sacha Coesel; E. Virginia Armbrust (2025). The North Pacific Eukaryotic Gene Catalog: clustered nucleotide metatranscripts and read counts [Dataset]. http://doi.org/10.5281/zenodo.13826820
    Available download formats: application/gzip
    Dataset updated
    Jan 22, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Ryan Groussman; Sacha Coesel; E. Virginia Armbrust
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This data continues with the development of the NPEGC Trinity de novo metatranscriptome assemblies from the protein data repository of The North Pacific Eukaryotic Gene Catalog. The nucleotide sequences corresponding to the NPEGC cluster representatives are collected together in these repository files:

    NPac.G1PA.bf100.id99.nt.fasta.gz
    NPac.G2PA.bf100.id99.nt.fasta.gz
    NPac.G3PA.bf100.id99.nt.fasta.gz
    NPac.G3PA_diel.bf100.id99.nt.fasta.gz
    NPac.D1PA.bf100.id99.nt.fasta.gz

    A full description of this data is published in Scientific Data, available here: The North Pacific Eukaryotic Gene Catalog of metatranscriptome assemblies and annotations. Please cite this publication if your research uses this data:

    Groussman, R. D., Coesel, S. N., Durham, B. P., Schatz, M. J., & Armbrust, E. V. (2024). The North Pacific Eukaryotic Gene Catalog of metatranscriptome assemblies and annotations. Scientific Data, 11(1), 1161.

    These nucleotide sequences have been sourced from the Zenodo repository for raw assemblies: The North Pacific Eukaryotic Gene Catalog: Raw assemblies from Gradients 1, 2 and 3

    Key processing steps are sampled below with links to the detailed code on the main github code repository: https://github.com/armbrustlab/NPac_euk_gene_catalog


    The code used to build the kallisto indices and map the short reads against them with kallisto is online in the code repository here: NPEGC.nt_kallisto_counts.sh

    There are two main steps:
    1. Generate the kallisto index on the sets of clustered nucleotide metatranscripts
    2. Map the short reads from environmental samples back to the assembly index

    As generated above, kallisto produces separate results files for each of the sample files. Even after compression, the total size of the tarballed kallisto output directories is prohibitively large (>50GB). We use the code in this template R script to join together the 'est_counts' estimated count values for the tens of millions of protein sequences in each project metatranscriptome, along with length.

    The code in this template script was used for each project: aggregate_kallisto_counts.R
    The output count files for each project are Gzip-compressed and uploaded to the NPEGC nucleotide data repository here:

    G1PA.raw.est_counts.csv.gz
    G2PA.raw.est_counts.csv.gz
    G3PA.raw.est_counts.csv.gz
    G3PA_diel.raw.est_counts.csv.gz
    D1PA.raw.est_counts.csv.gz
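
The join step described above can be illustrated in Python. This is not the authors' aggregate_kallisto_counts.R script, only a minimal sketch of the same idea under stated assumptions: each sample has a kallisto `abundance.tsv` with the usual `target_id`, `length`, `eff_length`, `est_counts` and `tpm` columns, and the function names, sample names and values are made up for illustration. Targets missing from a sample are filled with 0.0, which is also an assumption.

```python
import csv
import io


def read_est_counts(handle):
    """Return ({target_id: est_counts}, {target_id: length}) for one sample."""
    counts, lengths = {}, {}
    for row in csv.DictReader(handle, delimiter="\t"):
        counts[row["target_id"]] = float(row["est_counts"])
        lengths[row["target_id"]] = int(row["length"])
    return counts, lengths


def join_samples(samples):
    """samples maps a sample name to an open abundance.tsv handle."""
    per_sample, lengths = {}, {}
    for name, handle in samples.items():
        per_sample[name], sample_lengths = read_est_counts(handle)
        lengths.update(sample_lengths)
    names = sorted(per_sample)
    header = ["target_id", "length"] + names
    rows = [
        [tid, lengths[tid]] + [per_sample[n].get(tid, 0.0) for n in names]
        for tid in sorted(lengths)
    ]
    return header, rows


# Tiny made-up inputs in the abundance.tsv column layout
a = "target_id\tlength\teff_length\test_counts\ttpm\nt1\t300\t150\t4\t10\nt2\t500\t350\t0\t0\n"
b = "target_id\tlength\teff_length\test_counts\ttpm\nt1\t300\t150\t7\t20\nt2\t500\t350\t2\t5\n"
header, rows = join_samples({"S1": io.StringIO(a), "S2": io.StringIO(b)})
print(header)
print(rows)
```

Keeping only one count column per sample plus a single length column is what makes the joined table so much smaller than the per-sample output directories.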

  16. Sets of mutually similar public GitHub repositories (October 2016)

    • zenodo.org
    json
    Updated Jan 24, 2020
    Cite
    Markovtsev Vadim (2020). Sets of mutually similar public GitHub repositories (October 2016) [Dataset]. http://doi.org/10.5281/zenodo.285377
    Available download formats: json
    Dataset updated
    Jan 24, 2020
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Markovtsev Vadim
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    The format is JSON: a list of lists. Each inner list is a group of very similar repositories (weighted Jaccard similarity threshold 0.8 to 0.9).
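
A list-of-lists layout like this is straightforward to load and index. The sketch below assumes each inner list holds repository names as strings, which the description does not confirm; the sample data and repository names are made up for illustration.

```python
import json

# Hypothetical sample in the described shape: a list of groups,
# each group a list of repository names (all names are made up)
sample = '[["alice/foo", "bob/foo-fork"], ["carol/bar", "dave/bar", "erin/bar2"]]'

groups = json.loads(sample)

# Map every repository to the index of its group for quick lookup
repo_to_group = {repo: i for i, group in enumerate(groups) for repo in group}

# Find the largest group of mutually similar repositories
largest = max(groups, key=len)
print(len(groups), len(largest))  # 2 3
```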

  17. WorldCereal open global harmonized reference data repository (CC-BY-SA...

    • zenodo.org
    Updated Jul 12, 2024
    Cite
    Hendrik Boogaard; Arun Pratihast; Juan Carlos Laso Bayas; Santosh Karanam; Steffen Fritz; Kristof Van Tricht; Jeroen Degerickx; Sven Gilliams (2024). WorldCereal open global harmonized reference data repository (CC-BY-SA licensed data sets) [Dataset]. http://doi.org/10.5281/zenodo.7609546
    Dataset updated
    Jul 12, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Hendrik Boogaard; Arun Pratihast; Juan Carlos Laso Bayas; Santosh Karanam; Steffen Fritz; Kristof Van Tricht; Jeroen Degerickx; Sven Gilliams
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    Within the ESA funded WorldCereal project we have built an open harmonized reference data repository at global extent for model training or product validation in support of land cover and crop type mapping. Data from 2017 onwards were collected from many different sources and then harmonized, annotated and evaluated. These steps are explained in the harmonization protocol (10.5281/zenodo.7584463). This protocol also clarifies the naming convention of the shape files and the WorldCereal attributes (LC, CT, IRR, valtime and sampleID) that were added to the original data sets.

    This publication includes those harmonized data sets of which the original data set was published under the CC-BY-SA license or a license similar to CC-BY-SA. See document "_In-situ-data-World-Cereal - license - CC-BY-SA.pdf" for an overview of the original data sets.

  18. MESSAGEix-GLOBIOM 1.1 R11 no-policy baseline

    • zenodo.org
    • data-staging.niaid.nih.gov
    bin
    Updated Feb 27, 2024
    + more versions
    Cite
    Oliver Fricko; Stefan Frank; Matthew Gidden; Daniel Huppmann; Nils A. Johnson; Paul Natsuo Kishimoto; Peter Kolp; Francesco Lovat; David L. McCollum; Jihoon Min; Shilpa Rao; Keywan Riahi; Holger Rogner; Bas van Ruijven; Adriano Vinca; Behnam Zakeri; Andrey Lessa Derci Augustynczik; Andre Deppermann; Tatiana Ermolieva; Mykola Gusti; Pekka Lauri; Chris Heyes; Wolfgang Schoepp; Zbigniew Klimont; Petr Havlik; Volker Krey (2024). MESSAGEix-GLOBIOM 1.1 R11 no-policy baseline [Dataset]. http://doi.org/10.5281/zenodo.10514052
    Available download formats: bin
    Dataset updated
    Feb 27, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Oliver Fricko; Stefan Frank; Matthew Gidden; Daniel Huppmann; Nils A. Johnson; Paul Natsuo Kishimoto; Peter Kolp; Francesco Lovat; David L. McCollum; Jihoon Min; Shilpa Rao; Keywan Riahi; Holger Rogner; Bas van Ruijven; Adriano Vinca; Behnam Zakeri; Andrey Lessa Derci Augustynczik; Andre Deppermann; Tatiana Ermolieva; Mykola Gusti; Pekka Lauri; Chris Heyes; Wolfgang Schoepp; Zbigniew Klimont; Petr Havlik; Volker Krey
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    This dataset contains the parameterization of a no-policy baseline scenario of the global 11-regional MESSAGEix-GLOBIOM integrated assessment model. Regions, time periods, commodities, technologies and relations included in this model are described in a separate repository. The dataset relies on the MESSAGEix modeling framework (Huppmann et al. 2019) and can be imported into MESSAGEix via the read_excel() functionality, for which a tutorial is available, or via snapshot.load() as described here. After the import the scenario can be solved and modified to create new scenarios. Note that the published scenario as included in the ENGAGE global scenarios dataset has been run with a release candidate of version 3.4.0 of MESSAGEix.

  19. Coda and data for "A natural disaster exacerbates and redistributes disease...

    • zenodo.org
    zip
    Updated Sep 29, 2024
    Cite
    Camille Testard; Gregory Albery (2024). Coda and data for "A natural disaster exacerbates and redistributes disease risk across free-ranging macaques by altering social structure" [Dataset]. http://doi.org/10.5281/zenodo.13856677
    Available download formats: zip
    Dataset updated
    Sep 29, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Camille Testard; Gregory Albery
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This Zenodo repository contains the data and code for the article entitled "A natural disaster exacerbates and redistributes disease risk across free-ranging macaques by altering social structure".

  20. GitHub developer behavior and repository evolution dataset

    • zenodo.org
    application/gzip
    Updated Feb 7, 2020
    Cite
    ShengyuZhao; TianyiZhou (2020). GitHub developer behavior and repository evolution dataset [Dataset]. http://doi.org/10.5281/zenodo.3648084
    Available download formats: application/gzip
    Dataset updated
    Feb 7, 2020
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    ShengyuZhao; TianyiZhou
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    In this work, based on the GitHub Archive project and repository mining tools, we process all available data into a concise, structured format to generate a GitHub developer behavior and repository evolution dataset. Together with the self-configurable interactive analysis tool we provide, it gives a macroscopic view of open source ecosystem evolution.
