This is an auto-generated index table corresponding to a folder of files in this dataset with the same name. This table can be used to extract a subset of files based on their metadata, which can then be used for further analysis. You can view the contents of specific files by navigating to the "cells" tab and clicking on an individual file_id.
The dataset is a relational dataset of 8,000 households, representing a sample of the population of an imaginary middle-income country. The dataset contains two data files: one with variables at the household level, the other with variables at the individual level. It includes variables that are typically collected in population censuses (demography, education, occupation, dwelling characteristics, fertility, mortality, and migration) and in household surveys (household expenditure, anthropometric data for children, asset ownership). The data only includes ordinary households (no community households). The dataset was created using REaLTabFormer, a model that leverages deep learning methods. The dataset was created for the purpose of training and simulation and is not intended to be representative of any specific country.
The full-population dataset (with about 10 million individuals) is also distributed as open data.
The dataset is a synthetic dataset for an imaginary country. It was created to represent the population of this country by province (equivalent to admin1) and by urban/rural areas of residence.
Household, Individual
The dataset is a fully-synthetic dataset representative of the resident population of ordinary households for an imaginary middle-income country.
ssd
The sample size was set to 8,000 households. The fixed number of households to be selected from each enumeration area was set to 25. In the first stage, the number of enumeration areas to be selected in each stratum was calculated, proportional to the size of each stratum (stratification by geo_1 and urban/rural). In the second stage, 25 households were randomly selected within each selected enumeration area. The R script used to draw the sample is provided as an external resource.
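The authoritative implementation is the R script mentioned above; the following is only a rough Python sketch of the same two-stage logic. The column names ea_id and urban are assumptions (geo_1 follows the stratification described above).

```python
# Rough Python sketch of the two-stage design; the authoritative
# implementation is the R script distributed with the dataset.
# Column names ea_id and urban are assumptions.
import pandas as pd

HH_PER_EA = 25
N_EAS = 8_000 // HH_PER_EA  # 320 enumeration areas in total

def draw_sample(frame: pd.DataFrame, seed: int = 1) -> pd.DataFrame:
    # Stage 1: allocate EAs to strata proportionally to stratum size
    # (stratification by geo_1 and urban/rural).
    sizes = frame.groupby(["geo_1", "urban"]).size()
    alloc = (sizes / sizes.sum() * N_EAS).round().astype(int)
    sampled = []
    for (geo, urb), n_eas in alloc.items():
        stratum = frame[(frame["geo_1"] == geo) & (frame["urban"] == urb)]
        eas = pd.Series(stratum["ea_id"].unique()).sample(n_eas, random_state=seed)
        # Stage 2: 25 households at random within each selected EA
        # (assumes every EA contains at least 25 households).
        for ea in eas:
            sampled.append(stratum[stratum["ea_id"] == ea]
                           .sample(HH_PER_EA, random_state=seed))
    return pd.concat(sampled, ignore_index=True)
```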
other
The dataset is a synthetic dataset. Although the variables it contains are variables typically collected in sample surveys or population censuses, no questionnaire is available for this dataset. However, a "fake" questionnaire was created for the sample dataset extracted from this dataset, to be used as training material.
The synthetic data generation process included a set of "validators" (consistency checks based on which synthetic observations were assessed and rejected/replaced when needed). Some post-processing was also applied to the data to produce the distributed data files.
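The actual validators are not distributed with the data; as an illustration only, a consistency check of this kind might look like the following sketch (all rules and field names are hypothetical).

```python
# Hypothetical consistency checks of the kind described above; the
# rules and field names are illustrative assumptions, not the
# generator's actual validators.
def is_consistent(person: dict) -> bool:
    checks = [
        0 <= person["age"] <= 110,
        # Young children should not report completed schooling.
        person["age"] >= 5 or person["education_level"] == "none",
        # Fertility variables apply only to women of reproductive age.
        person["children_born"] == 0
        or (person["sex"] == "female" and person["age"] >= 12),
    ]
    return all(checks)

# During generation, records failing such checks would be rejected
# and replaced, as described above.
```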
This is a synthetic dataset; the "response rate" is 100%.
This file is an example data set from a drought study in the Central Valley of California, corresponding to "recent non-drought conditions" (Scenario 1 in Petrie et al., in review). In 2014, following an 8-year period with 7 below-normal to critically-dry water years, the bioenergetic model TRUEMET was used to assess the impacts of drought on wintering waterfowl habitat and bioenergetics in the Central Valley of California. The goal of the study was to assess whether available foraging habitats could provide enough food to support waterfowl populations (ducks and geese) under a variety of climate and population-level scenarios. This information could then be used by managers to adapt their waterfowl habitat management plans to drought conditions. The study area spanned the Central Valley and included the Sacramento Valley in the north, the San Joaquin Valley in the south, and the Suisun Marsh and Sacramento-San Joaquin River Delta (Delta) east of San Francisco Bay. The data set consists of two foraging guilds (ducks and geese/swans) and five forage types: harvested corn, rice (flooded), rice (unflooded), wetland invertebrates, and wetland moist soil seeds. For more background on the data set, see Petrie et al. (in review).
Attribution-NonCommercial 3.0 (CC BY-NC 3.0), https://creativecommons.org/licenses/by-nc/3.0/
License information was derived automatically
This dataset includes information about a sample of 8,887 Open Educational Resources (OERs) from the SkillsCommons website. It contains title, description, URL, type, availability date, issued date, subjects, and the availability of the following metadata: level, time_required to finish, and accessibility.
This dataset has been used to build a metadata scoring and quality prediction model for OERs.
Sample data for exercises in Further Adventures in Data Cleaning.
This dataset comprises a collection of example DMPs from a wide array of fields, obtained from a number of different sources outlined below. Data extracted from the examples include the discipline and field of study, author, institutional affiliation and funding information, location, date created, title, research and data type, a description of the project, a link to the DMP, and, where possible, external links to related publications or grant pages. This CSV document serves as the content for a McMaster Data Management Plan (DMP) Database as part of the Research Data Management (RDM) Services website, located at https://u.mcmaster.ca/dmps. Other universities and organizations are encouraged to link to the DMP Database or use this dataset as the content for their own DMP Database. This dataset will be updated regularly to include new additions and will be versioned as such. We are gathering submissions at https://u.mcmaster.ca/submit-a-dmp to continue to expand the collection.
MIT License, https://opensource.org/licenses/MIT
License information was derived automatically
Dataset Card for example-dataset
This dataset has been created with distilabel.
Dataset Summary
This dataset contains a pipeline.yaml which can be used to reproduce the pipeline that generated it in distilabel using the distilabel CLI: distilabel pipeline run --config "https://huggingface.co/datasets/CoffeeDoodle/example-dataset/raw/main/pipeline.yaml"
or explore the configuration: distilabel pipeline info --config… See the full description on the dataset page: https://huggingface.co/datasets/CoffeeDoodle/example-dataset.
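Besides reproducing the pipeline with the CLI, the generated data itself can presumably be loaded with the Hugging Face datasets library (the split name below is an assumption).

```python
from datasets import load_dataset

# Load the generated dataset from the Hugging Face Hub.
ds = load_dataset("CoffeeDoodle/example-dataset", split="train")  # split assumed
print(ds)
```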
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The complete dataset used in the analysis comprises 36 samples, each described by 11 numeric features and 1 target. The attributes considered were caspase 3/7 activity, MitoTracker Red CMXRos area and intensity (3 h and 24 h incubations with both compounds), MitoSOX oxidation (3 h incubation with the referred compounds) and oxidation rate, DCFDA fluorescence (3 h and 24 h incubations with either compound) and oxidation rate, and DQ BSA hydrolysis. The target of each instance corresponds to one of the 9 possible classes (4 samples per class): Control, 6.25, 12.5, 25 and 50 µM for 6-OHDA, and 0.03, 0.06, 0.125 and 0.25 µM for rotenone. The dataset is balanced, contains no missing values, and was standardized across features. The small number of samples prevented a full and robust statistical analysis of the results. Nevertheless, it allowed the identification of relevant hidden patterns and trends.
Exploratory data analysis, information gain, hierarchical clustering, and supervised predictive modeling were performed using Orange Data Mining version 3.25.1 [41]. Hierarchical clustering was performed using the Euclidean distance metric and weighted linkage. Cluster maps were plotted to relate the features with higher mutual information (in rows) to instances (in columns), with the color of each cell representing the normalized level of a particular feature in a specific instance. The information is grouped both in rows and in columns by a two-way hierarchical clustering method using Euclidean distances and average linkage. Stratified cross-validation was used to train the supervised decision tree. A set of preliminary empirical experiments was performed to choose the best parameters for each algorithm, and we verified that, within moderate variations, there were no significant changes in the outcome. The following settings were adopted for the decision tree algorithm: minimum number of samples in leaves: 2; minimum number of samples required to split an internal node: 5; stop splitting when majority reaches: 95%; criterion: gain ratio. The performance of the supervised model was assessed using accuracy, precision, recall, F-measure, and area under the ROC curve (AUC) metrics.
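The analysis was run in Orange Data Mining; as a rough scikit-learn analogue of the decision-tree settings (an approximation, since scikit-learn offers neither gain ratio nor a majority-stopping criterion), one might write the following sketch, with random placeholder data standing in for the real feature matrix.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Approximation of the Orange settings listed above. scikit-learn has
# no gain ratio criterion ("entropy", i.e. information gain, is the
# closest) and no "stop when majority reaches 95%" option.
tree = DecisionTreeClassifier(
    criterion="entropy",   # stand-in for Orange's gain ratio
    min_samples_leaf=2,    # minimum number of samples in leaves
    min_samples_split=5,   # minimum samples required to split a node
    random_state=0,
)

# Random placeholder data with the shape described above; the real
# 36 x 11 standardized matrix would be loaded from the dataset.
rng = np.random.default_rng(0)
X = rng.normal(size=(36, 11))
y = np.repeat(np.arange(9), 4)  # 9 balanced classes, 4 samples each

# Stratified cross-validation as in the original analysis (fold count assumed).
scores = cross_val_score(tree, X, y, cv=StratifiedKFold(n_splits=4))
print(scores.mean())
```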
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This data repository contains sample datasets of raw DInSAR time series (NSBAS_PARAMS.h5), raw, interpolated GNSS time series maps (GPS_East/North/Up.h5), errors associated with the GNSS data (GPS_East/North/Up_sigma.h5), and integrated DInSAR + GNSS time series (fused.h5). Details about the data are given in the following publication: [Corsa, B. "Integration of DInSAR Time Series and GNSS data for Continuous Volcanic Deformation Monitoring and Eruption Early Warning Applications" Remote Sens. 2022, 14(3), 784; https://doi.org/10.3390/rs14030784]. The raw DInSAR time series spans 245 dates from 2015-11-11 to 2021-04-13 over the Big Island of Hawaii. The current raw GPS data and fused time series used 22 data points between those same dates.
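The HDF5 files can be inspected with h5py; since the internal group and dataset names are not documented above, the sketch below simply enumerates them rather than assuming any.

```python
import h5py

# Walk one of the distributed HDF5 files and print every group and
# dataset path it contains (no internal names are assumed).
with h5py.File("fused.h5", "r") as f:
    f.visit(print)
```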
Dataset Card for cot-example-dataset
This dataset has been created with distilabel.
Dataset Summary
This dataset contains a pipeline.yaml which can be used to reproduce the pipeline that generated it in distilabel using the distilabel CLI: distilabel pipeline run --config "https://huggingface.co/datasets/dvilasuero/cot-example-dataset/raw/main/pipeline.yaml"
or explore the configuration: distilabel pipeline info --config… See the full description on the dataset page: https://huggingface.co/datasets/dvilasuero/cot-example-dataset.
This data set consists of several files that were created to accompany M.o.R., a Shiny app created by the Surface & Nanostructure Metrology Group in the Engineering Physics Division of the Physical Measurement Laboratory (PML) at the National Institute of Standards and Technology. It was created to simplify model-based metrology. A detailed explanation of proper usage can be found in the M.o.R. documentation.
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This data is quite outdated but is used in the openPMD-api unit tests.
The HDF5 data contains particle patches; the ADIOS1 data does not. It is uploaded here for reference, as a download point, and for test reproducibility.
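A minimal sketch of opening the series with the openPMD-api Python bindings; the file name pattern and the presence of particle species are assumptions about this particular upload.

```python
import openpmd_api as io

# Open the series read-only; the file name pattern is an assumption.
series = io.Series("example_data_%T.h5", io.Access.read_only)
for index, it in series.iterations.items():
    for name, species in it.particles.items():
        # The HDF5 flavour of this data carries particle patches;
        # the ADIOS1 flavour does not.
        print(index, name, "patch records:", list(species.particle_patches))
```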
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Example Proteomics Dataset for C-COMPASS
This dataset is an example of proteomics data generated from gradient centrifugation. It is provided exclusively to the reviewers of the manuscript titled "C-COMPASS: A User-Friendly Neural Network Tool for Profiling Cell Compartments at Protein and Lipid Levels."
Until the manuscript is accepted, this dataset remains unpublished and is intended solely for testing the functionality of C-COMPASS. Any further use or distribution of the data is strictly prohibited.
C-COMPASS_session.npy: Ready-to-use session that can be opened in C-COMPASS. All processing steps have already been performed here.
C-COMPASS_ProteomeInput.tsv: Complete dataset including Fractionation and TotalProteome data. This can be used to reproduce the analysis. The data it contains is identical to the dataset used for the research article.
C-COMPASS_MarkerList.txt: A compatible marker list that can be used to analyze the data from C-COMPASS_ProteomeInput.tsv. It is the same marker list that was used for the research article.
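For a quick look at the files outside of C-COMPASS, a minimal sketch (tab-separated text is assumed for the .tsv/.txt files; the internal layout of the pickled .npy session is specific to C-COMPASS and not documented here).

```python
import numpy as np
import pandas as pd

# Proteome input and marker list (tab separator assumed).
proteome = pd.read_csv("C-COMPASS_ProteomeInput.tsv", sep="\t")
markers = pd.read_csv("C-COMPASS_MarkerList.txt", sep="\t")
print(proteome.shape, markers.shape)

# The session file is a pickled Python object whose structure is
# internal to C-COMPASS.
session = np.load("C-COMPASS_session.npy", allow_pickle=True)
```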
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Despite the wide application of longitudinal studies, they are often plagued by missing data and attrition. The majority of methodological approaches focus on participant retention or modern missing data analysis procedures. This paper, however, takes a new approach by examining how researchers may supplement the sample with additional participants. First, refreshment samples use the same selection criteria as the initial study. Second, replacement samples identify auxiliary variables that may help explain patterns of missingness and select new participants based on those characteristics. A simulation study compares these two strategies for a linear growth model with five measurement occasions. Overall, the results suggest that refreshment samples lead to less relative bias, greater relative efficiency, and more acceptable coverage rates than replacement samples or not supplementing the missing participants in any way. Refreshment samples also have high statistical power. The comparative strengths of the refreshment approach are further illustrated through a real data example. These findings have implications for assessing change over time when researching at-risk samples with high levels of permanent attrition.
Dataset Card for example-preference-dataset
This dataset has been created with distilabel.
Dataset Summary
This dataset contains a pipeline.yaml which can be used to reproduce the pipeline that generated it in distilabel using the distilabel CLI: distilabel pipeline run --config "https://huggingface.co/datasets/aspirina765/example-preference-dataset/raw/main/pipeline.yaml"
or explore the configuration: distilabel pipeline info --config… See the full description on the dataset page: https://huggingface.co/datasets/aspirina765/example-preference-dataset.
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
example data package (local version)
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Information on samples submitted for RNAseq
Rows are individual samples
Columns are: ID; Sample Name; Date sampled; Species; Sex; Tissue; Geographic location; Date extracted; Extracted by; Nanodrop Conc. (ng/µl); 260/280; 260/230; RIN; Plate ID; Position; Index name; Index Seq; Qubit BR kit Conc. (ng/µl); BioAnalyzer Conc. (ng/µl); BioAnalyzer bp (region 200-1200); Submission reference; Date submitted; Conc. (nM); Volume provided; PE/SE; Number of reads; Read length
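As an illustration, the sheet can be loaded and screened with pandas; the file name and the QC thresholds below are assumptions, not part of the original records.

```python
import pandas as pd

# File name is an assumption; columns follow the listing above.
samples = pd.read_csv("rnaseq_samples.csv")

# Illustrative QC screen using common rule-of-thumb thresholds
# (not part of the original submission records).
good = samples[(samples["260/280"] >= 1.8) & (samples["RIN"] >= 7)]
print(f"{len(good)} of {len(samples)} samples pass the illustrative screen")
```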
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Example Proteomics Dataset for C-COMPASS
This dataset is a simulated example of a proteomics dataset containing 8 compartments across 16 fractions over 3 replicates for 2 conditions, including relocalizations. The data is simulated to be highly comparable to a real spatial proteomics dataset based on gradient fractionation, and contains controlled random variations and noise that mimic real experimental variation.
The purpose of this dataset is to test or validate C-COMPASS, or to provide a prepared session that helps with handling the C-COMPASS GUI.
sim_08_16_FractionationData.txt:
Simulated fractionation data for the above-mentioned dataset. It can be imported into C-COMPASS for spatial proteomics analysis.
sim_08_16_MarkerList.txt:
Simulated marker list matching the above-mentioned fractionation dataset. It can be imported into C-COMPASS to assign markers for the 8 simulated compartments, and is necessary for spatial predictions.
sim_08_16_TotalProteomeData.txt:
Simulated total proteome dataset matching the above-mentioned fractionation dataset. It can be imported into C-COMPASS to calculate class-centric changes across conditions.
sim_08_16_CCMPSsession_start.npy:
Prepared C-COMPASS session that can be loaded. This session provides a starting point for a C-COMPASS analysis of the simulated data: the gradient samples are pre-defined, but the subsequent processing steps have not yet been performed.
sim_08_16_CCMPSsession_processed.npy:
Prepared C-COMPASS session that can be loaded. This session provides a fully analyzed state of the simulated data. Results can be exported and plots can be generated. Each processing step can be re-done via the corresponding 'reset' button.
CC0 1.0 Universal Public Domain Dedication, https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This data sample (in support of the article "Indicators on firm level innovation activities from web scraped data", https://ssrn.com/abstract=3938767) contains data on companies' innovative behavior, measured at the firm level from web-scraped data on medium-high- and high-technology companies in the European Union and the United Kingdom. The data were retrieved from individual company websites and cover 96,921 companies in total. They provide information on various aspects of innovation, most significantly the research and development orientation of the company at the company and product levels, the company's collaborative activities, its products, and its use of standards. In addition to the web-scraped data, the dataset aggregates a variety of firm-level indicators, including patenting activities. In total, the dataset includes 28 variables with unique identifiers, which enable linking to other databases such as financial data.
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This is the example data for using SOPiNet, available at https://github.com/RegiusQuant/ESIDLM.
Details of SOPiNet can be found at https://doi.org/10.1016/j.envpol.2023.121509.