10 datasets found
  1. A stakeholder-centered determination of High-Value Data sets: the use-case of Latvia

    • data-staging.niaid.nih.gov
    Updated Oct 27, 2021
    Cite
    Anastasija Nikiforova (2021). A stakeholder-centered determination of High-Value Data sets: the use-case of Latvia [Dataset]. https://data-staging.niaid.nih.gov/resources?id=zenodo_5142816
    Explore at:
    Dataset updated
    Oct 27, 2021
    Dataset provided by
    University of Latvia
    Authors
    Anastasija Nikiforova
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Latvia
    Description

    The data in this dataset were collected as a result of a survey of Latvian society (2021) aimed at identifying high-value data sets for Latvia, i.e. data sets that, in the view of Latvian society, could create value for the Latvian economy and society. The survey was created for both individuals and businesses. It is made public both to serve as supplementary data for the paper "Towards enrichment of the open government data: a stakeholder-centered determination of High-Value Data sets for Latvia" (author: Anastasija Nikiforova, University of Latvia) and to allow other researchers to use these data in their own work.

    The survey was distributed among Latvian citizens and organisations. The structure of the survey is available in the supplementary file (see Survey_HighValueDataSets.odt).

    Description of the data in this data set: structure of the survey and pre-defined answers (if any).

    1. Have you ever used open (government) data? - {(1) yes, once; (2) yes, there has been a little experience; (3) yes, continuously; (4) no, I did not need it; (5) no, I have tried but failed}
    2. How would you assess the value of the open government data that are currently available for your personal use or your business? - 5-point Likert scale, where 1 – none to 5 – very high
    3. If you have ever used open (government) data, what was your purpose for using them? - {(1) have not had to use; (2) to identify the situation for an object or an event (e.g. the current state of Covid-19); (3) data-driven decision-making; (4) enrichment of my data, i.e. by supplementing them; (5) better understanding of government decisions; (6) awareness of the government's actions (increasing transparency); (7) forecasting (e.g. trends etc.); (8) developing data-driven solutions that use only open data; (9) developing data-driven solutions that use open data as a supplement to existing data; (10) training and education purposes; (11) entertainment; (12) other (open-ended question)}
    4. What category(ies) of "high-value datasets" is, in your opinion, able to create added value for society or the economy? - {(1) geospatial data; (2) Earth observation and environment; (3) meteorological; (4) statistics; (5) companies and company ownership; (6) mobility}
    5. To what extent do you think the current data catalogue of Latvia's open data portal corresponds to the needs of data users/consumers? - 10-point Likert scale, where 1 – no data are useful and 10 – fully corresponds, i.e. all potentially valuable datasets are available
    6. Which of the current data categories in Latvia's open data portal, in your opinion, most corresponds to a "high-value dataset"? - {(1) foreign affairs; (2) business economy; (3) energy; (4) citizens and society; (5) education and sport; (6) culture; (7) regions and municipalities; (8) justice, internal affairs and security; (9) transport; (10) public administration; (11) health; (12) environment; (13) agriculture, food and forestry; (14) science and technologies}
    7. Which of them form your TOP 3? - same categories as in question 6
    8. How would you assess the value of the following data categories (each on a 5-point Likert scale, where 1 – not needed to 5 – highly valuable)? 8.1. sensor data; 8.2. real-time data; 8.3. geospatial data
    9. What would these datasets be, i.e. what (sub)topic could these data be associated with? - open-ended question
    10. Which of the currently available data sets could be valuable and useful for society and businesses? - open-ended question
    11. Which of the data sets currently NOT available in Latvia's open data portal could, in your opinion, be valuable and useful for society and businesses? - open-ended question
    12. How did you define them? - {(1) subjective opinion; (2) experience with data; (3) filtering out the most popular datasets, i.e. basing the choice on public opinion; (4) other (open-ended question)}
    13. How high could the value of these data sets be for you or your business? - 5-point Likert scale, where 1 – not valuable to 5 – highly valuable
    14. Do you represent any company/organization (are you employed anywhere)? (if "yes", please fill out the survey twice, i.e. as an individual user AND as a company representative) - {yes; no; I am an individual data user; other (open-ended)}
    15. What industry/sector does your company/organization belong to? (if you do not work at the moment, please choose the last option) - {information and communication services; financial and insurance activities; accommodation and catering services; education; real estate operations; wholesale and retail trade; repair of motor vehicles and motorcycles; transport and storage; construction; water supply; waste water; waste management and recovery; electricity, gas supply, heating and air conditioning; manufacturing industry; mining and quarrying; agriculture, forestry and fisheries; professional, scientific and technical services; operation of administrative and service services; public administration and defence; compulsory social insurance; health and social care; art, entertainment and recreation; activities of households as employers; CSO/NGO; I am not a representative of any company}
    16. To which size category does your company/organization belong? - {small; medium; large; self-employed; I am not a representative of any company}
    17. What age group do you belong to? (if you are an individual user, not a company representative) - {11..15, 16..20, 21..25, 26..30, 31..35, 36..40, 41..45, 46+, "do not want to reveal"}
    18. Please indicate the education or scientific degree that corresponds most to you. (if you are an individual user, not a company representative) - {master's degree; bachelor's degree; Dr. and/or PhD; student (bachelor level); student (master level); doctoral candidate; pupil; do not want to reveal these data}

    Format of the files: .xls, .csv (for the first spreadsheet only), .odt

    Licenses or restrictions: CC BY

  2. Synthesized anthropometric data for the German working-age population

    • data.niaid.nih.gov
    Updated Dec 8, 2023
    Cite
    Ackermann, Alexander; Bonin, Dominik; Jaitner, Thomas; Peters, Markus; Radke, Dörte; Wischniewski, Sascha (2023). Synthesized anthropometric data for the German working-age population [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_8042776
    Explore at:
    Dataset updated
    Dec 8, 2023
    Dataset provided by
    Federal Institute for Occupational Safety and Health (BAuA)
    Institute for Community Medicine - SHIP/KEF, University Medicine Greifswald
    Institute for Sport and Sport Science, TU Dortmund University
    Authors
    Ackermann, Alexander; Bonin, Dominik; Jaitner, Thomas; Peters, Markus; Radke, Dörte; Wischniewski, Sascha
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The anthropometric datasets presented here are virtual datasets. The unweighted virtual dataset was generated using a synthesis and subsequent validation algorithm (Ackermann et al., 2023). The underlying original dataset used in the algorithm was collected within a regional epidemiological public health study in northeastern Germany (SHIP, see Völzke et al., 2022). Important details regarding the collection of the anthropometric dataset within SHIP (e.g. sampling strategy, measurement methodology and quality assurance process) are discussed extensively in the study by Bonin et al. (2022). To approximate nationally representative values for the German working-age population, the virtual dataset was weighted with reference data from the first survey wave of the Study on health of adults in Germany (DEGS1, see Scheidt-Nave et al., 2012).

    Two different algorithms were used for the weighting procedure: (1) iterative proportional fitting (IPF), described in more detail in the publication by Bonin et al. (2022), and (2) a nearest-neighbor approach (1NN), presented in the study by Kumar and Parkinson (2018). Weighting coefficients were calculated for both algorithms, and it is left to the practitioner which coefficients to use in practice. The weighted virtual dataset therefore has two additional columns containing the weighting coefficients calculated with IPF ("WeightCoef_IPF") or 1NN ("WeightCoef_1NN"). Due to the sparse data basis at the distribution edges of SHIP compared to DEGS1, values below the 5th and above the 95th percentile should be treated with caution.

    In addition, the following characteristics describe the weighted and unweighted virtual datasets. According to ISO 15535, values for "BMI" are in [kg/m2], values for "Body mass" are in [kg], and values for all other measures are in [mm]. Anthropometric measures correspond to measures defined in ISO 7250-1. Offset values were calculated for seven anthropometric measures because there were systematic differences in measurement methodology between SHIP and ISO 7250-1 regarding the definition of two bony landmarks: the acromion and the olecranon. Since these seven measures rely on one of these landmarks, and it was not possible to modify the SHIP methodology regarding landmark definitions, offsets had to be calculated to obtain ISO-compliant values. In the presented datasets, two columns exist for these seven measures: one contains the measured values with the landmarking definitions from SHIP, and the other (marked with the suffix "_offs") contains the calculated ISO-compliant values (for more information concerning the offset values, see Bonin et al., 2022).

    The sample size is N = 5000 for both the male and female subsets, whereas the original SHIP dataset has a sample size of N = 1152 (women) and N = 1161 (men). Because of this discrepancy, users may get a false sense of confidence when using the virtual data. A virtual sample size of N = 5000 gives the best possible representation of the original dataset and was confirmed in pre-tests with varying sample sizes, but it must be kept in mind that the statistical properties of the virtual data are based on an original dataset with a much smaller sample size.
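Applying the weighting columns is straightforward: any statistic of interest is computed with the chosen coefficients as sampling weights. Below is a minimal sketch; only the column names "WeightCoef_IPF"/"WeightCoef_1NN" come from the dataset description, while the measure name "Stature" and the toy values are hypothetical stand-ins.

```python
import numpy as np
import pandas as pd

def weighted_percentile(values, weights, q):
    """Percentile of `values` under sampling weights
    (midpoint rule with linear interpolation)."""
    order = np.argsort(values)
    v = np.asarray(values, dtype=float)[order]
    w = np.asarray(weights, dtype=float)[order]
    cum = (np.cumsum(w) - 0.5 * w) / np.sum(w)
    return float(np.interp(q / 100.0, cum, v))

# Toy stand-in for the weighted virtual dataset; the real file holds
# ISO 7250-1 measures in mm plus the two weighting-coefficient columns.
df = pd.DataFrame({
    "Stature": [1620.0, 1655.0, 1701.0, 1748.0, 1802.0],  # hypothetical, mm
    "WeightCoef_IPF": [0.8, 1.1, 1.0, 1.2, 0.9],
})

p50 = weighted_percentile(df["Stature"], df["WeightCoef_IPF"], 50)
```

Per the description above, weighted values below the 5th and above the 95th percentile should be treated with caution.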

  3. Cellular Network Analysis Dataset

    • kaggle.com
    zip
    Updated Jun 16, 2023
    Cite
    Suraj (2023). Cellular Network Analysis Dataset [Dataset]. https://www.kaggle.com/datasets/suraj520/cellular-network-analysis-dataset/code
    Explore at:
    zip (1306071 bytes). Available download formats
    Dataset updated
    Jun 16, 2023
    Authors
    Suraj
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This dataset provides realistic signal metrics for 3G, 4G, 5G, and LTE network analysis using DragonOS, Spike, and SDR devices. It aims to provide a representative sample of signal measurements for various network types and locations in Bihar, India. The dataset also replicates a hardware setup involving the Spike software, DragonOS running on the Valve Steam Deck gaming system, a BB60C spectrum analyzer powered by an external USB3 hub connected to the Steam Deck's USB-C port, and srsRAN running on a separate laptop to create the base station using the bladeRFxA9 device.

    Features: The dataset includes the following features:

    1. Timestamp: The timestamps represent the time at which the signal metrics were recorded, with a 10-minute interval between each timestamp.

    2. Latitude and Longitude: The latitude and longitude coordinates indicate the location of the measurement in Bihar. The dataset covers 20 specified localities in Bihar, including Kankarbagh, Rajendra Nagar, Boring Road, Ashok Rajpath, Danapur, Anandpuri, Bailey Road, Gardanibagh, Patliputra Colony, Phulwari Sharif, Exhibition Road, Pataliputra, Fraser Road, Kidwaipuri, Gandhi Maidan, S.K. Puri, Anisabad, Boring Canal Road, Bankipore, and Kumhrar.

    3. Signal Strength (dBm): The signal strength represents the received signal power in decibels (dBm) for different network types (3G, 4G, 5G, and LTE).

    4. Signal Quality (%): The signal quality represents the percentage of signal strength relative to the maximum possible signal strength. It is calculated from the signal strength values and is applicable to 3G, 4G, 5G, and LTE networks. Unfortunately, the signal quality calculation produced an error, so this field is 0.0 for all records.

    5. Data Throughput (Mbps): The data throughput represents the network's capacity to transmit data, measured in megabits per second (Mbps). Different network types have varying data throughput values.

    6. Latency (ms): Latency refers to the time delay between the transmission and reception of data packets, measured in milliseconds (ms). Different network types have different latency values, generated using a random uniform distribution within appropriate ranges.

    7. Network Type: The network type indicates the technology used for data transmission, such as 3G, 4G, 5G, or LTE.

    8. BB60C Measurement (dBm): The BB60C measurement represents the signal strength measured using the BB60C spectrum analyzer device. The values are generated based on the signal strength values with added random uniform noise specific to 4G, 5G, and LTE networks.

    9. srsRAN Measurement (dBm): The srsRAN measurement represents the signal strength measured using the srsRAN software-defined radio device.

    10. BladeRFxA9 Measurement (dBm): The BladeRFxA9 measurement represents the signal strength measured using the BladeRFxA9 software-defined radio device.

    The dataset is generated with a total of 1926 time periods and covers 20 localities in Bihar. It can be used for various purposes, including network optimization, coverage analysis, and performance evaluation.
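As a rough illustration of how the latency and BB60C columns are described above (uniform latency ranges per network type, uniform noise added to the signal strength), here is a hedged sketch; every numeric range below is an assumption, not a value stated by the dataset author.

```python
import numpy as np

rng = np.random.default_rng(42)

# Illustrative per-network latency ranges in ms; the actual bounds
# used to generate the dataset are not stated.
LATENCY_MS = {"3G": (60, 120), "4G": (30, 60), "LTE": (20, 50), "5G": (5, 20)}

def synth_latency(network_type, n):
    """Draw n latency samples from a uniform range for the network type."""
    lo, hi = LATENCY_MS[network_type]
    return rng.uniform(lo, hi, size=n)

def bb60c_measurement(signal_dbm, noise_db=1.5):
    """Signal strength plus uniform measurement noise, mimicking the
    BB60C column (the noise width is an assumption)."""
    return signal_dbm + rng.uniform(-noise_db, noise_db,
                                    size=np.shape(signal_dbm))

lat_5g = synth_latency("5G", 1926)  # one value per time period
```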

    Hardware Setup: The dataset replicates the hardware setup using the following components:

    • Valve Steam Deck gaming system running DragonOS Focal
    • BB60C spectrum analyzer powered by an external USB3 hub
    • srsRan software-defined radio (SDR) device
    • BladeRFxA9 software-defined radio (SDR) device

    The BB60C spectrum analyzer is connected to the Steam Deck's USB C port via an external USB3 hub. The srsRan and BladeRFxA9 devices are connected to a separate laptop, which is running the srsenb software to create the base station.

    Additionally, the Spike LTE Analysis tools are utilized to decode the LTE information in real-time. The dataset demonstrates how the Spike software, DragonOS, and SDR devices can be integrated to perform LTE analysis, and the results can be combined with a working GPS for mapping purposes within the Spike software.

    Lastly, we would like to credit the volunteers in these localities who helped log the data after replicating the setup.

    Let us know what you build out of this dataset. It is a subset of data being analysed for bio-weapon usage in the Bihar area, which is controlled via wireless signals, to report to international delegates for expedited action.

  4. COVID-19 High Frequency Phone Survey of Households 2020, Round 2 - Viet Nam

    • microdata.worldbank.org
    • catalog.ihsn.org
    Updated Oct 26, 2023
    + more versions
    Cite
    World Bank (2023). COVID-19 High Frequency Phone Survey of Households 2020, Round 2 - Viet Nam [Dataset]. https://microdata.worldbank.org/index.php/catalog/4061
    Explore at:
    Dataset updated
    Oct 26, 2023
    Dataset provided by
    World Bank Group (http://www.worldbank.org/)
    Authors
    World Bank
    Time period covered
    2020
    Area covered
    Vietnam
    Description

    Geographic coverage

    National, regional

    Analysis unit

    Households

    Kind of data

    Sample survey data [ssd]

    Sampling procedure

    The 2020 Vietnam COVID-19 High Frequency Phone Survey of Households (VHFPS) uses a nationally representative household survey from 2018 as the sampling frame. The 2018 baseline survey includes 46,980 households from 3,132 communes (about 25% of all communes in Vietnam). In each commune, one enumeration area (EA) is randomly selected, and 15 households are then randomly selected in each EA for interview. The large module of the baseline survey was used to select households for the official VHFPS interviews, with the small-module households held in reserve for replacement. After data processing, the final sample size for Round 2 is 3,935 households.
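The two-stage design described above (one randomly selected EA per commune, then 15 randomly selected households per EA) can be sketched as follows; the commune/EA structure and counts below are illustrative, not the actual 2018 frame.

```python
import random

random.seed(7)

# Hypothetical frame: commune -> EAs -> household IDs.
frame = {
    f"commune_{c}": {
        f"ea_{e}": [f"hh_{c}_{e}_{h}" for h in range(40)]
        for e in range(4)
    }
    for c in range(10)
}

sample = []
for commune, eas in frame.items():
    ea = random.choice(sorted(eas))            # stage 1: one EA per commune
    sample.extend(random.sample(eas[ea], 15))  # stage 2: 15 households per EA

# 10 communes x 15 households = 150 sampled households
```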

    Mode of data collection

    Computer Assisted Telephone Interview [cati]

    Research instrument

    The questionnaire for Round 2 consisted of the following sections:

    Section 2. Behavior
    Section 3. Health
    Section 5. Employment (main respondent)
    Section 6. Coping
    Section 7. Safety Nets
    Section 8. FIES

    Cleaning operations

    Data cleaning began during the data collection process. Inputs for the cleaning process included interviewers’ notes following each question item, interviewers’ notes at the end of the tablet form, and supervisors’ notes made during monitoring. The data cleaning process was conducted in the following steps:

    • Append households interviewed in ethnic minority languages to the main dataset of interviews conducted in Vietnamese.
    • Remove unnecessary variables that were automatically calculated by SurveyCTO.
    • Remove household duplicates where the same form was submitted more than once.
    • Remove observations of households that were not supposed to be interviewed according to the identified replacement procedure.
    • Format variables according to their object type (string, integer, decimal, etc.).
    • Read through interviewers’ notes and make adjustments accordingly. During interviews, whenever interviewers found it difficult to choose a correct code, they were advised to choose the most appropriate one and write down the respondent’s answer in detail, so that the survey management team could decide which code best suited the answer.
    • Correct data based on supervisors’ notes where enumerators entered a wrong code.
    • Recode the answer option “Other, please specify”. This option is usually followed by a blank line allowing enumerators to type or write text specifying the answer. The data cleaning team checked these answers thoroughly to decide whether each needed recoding into one of the available categories or should be kept as originally recorded. In some cases an answer was assigned a completely new code if it appeared many times in the survey dataset.
    • Examine the accuracy of outlier values, defined as values that lie outside the 5th and 95th percentiles, by listening to interview recordings.
    • Perform a final check on matching the main dataset with the different sections, where information asked at the individual level is kept in separate data files in long form.
    • Label variables using the full question text.
    • Label variable values where necessary.
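Several of these steps (removing duplicate submissions, dropping auto-calculated variables, formatting variable types) map directly onto standard dataframe operations. A minimal sketch with hypothetical column names follows; the actual variable names and SurveyCTO export layout are not given here.

```python
import pandas as pd

# Hypothetical raw submissions; the real data came from SurveyCTO forms.
raw = pd.DataFrame({
    "form_id":   ["f1", "f1", "f2", "f3"],     # f1 submitted twice
    "hh_income": ["1200", "1200", "880", "na"],
    "_auto_duration": [310, 310, 290, 275],    # auto-calculated by the form
})

clean = (
    raw
    .drop_duplicates(subset="form_id", keep="first")          # duplicate forms
    .drop(columns=[c for c in raw if c.startswith("_auto")])  # auto variables
    .assign(hh_income=lambda d: pd.to_numeric(d["hh_income"], errors="coerce"))
)
```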

  5. Average AMI scores (higher is better) of 100 independent runs of various community detection methods over a range of synthetic datasets

    • figshare.com
    xls
    Updated Dec 5, 2024
    Cite
    Devavrat Vivek Dabke; Olga Dorabiala (2024). Average AMI scores (higher is better) of 100 independent runs of various community detection methods over a range of synthetic datasets. [Dataset]. http://doi.org/10.1371/journal.pcsy.0000023.t001
    Explore at:
    xls. Available download formats
    Dataset updated
    Dec 5, 2024
    Dataset provided by
    PLOS Complex Systems
    Authors
    Devavrat Vivek Dabke; Olga Dorabiala
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    STGkM is our method; CC uses dynamic connected components; k-medoids compresses a dynamic graph into a single static graph and applies k-medoids; DCDID is a heuristic method [4]. The best performance is bolded.

  6. Data from: Geochemical Database for Iron Oxide-Copper-Cobalt-Gold-Rare Earth Element Deposits of Southeast Missouri

    • catalog.data.gov
    • data.usgs.gov
    • +3more
    Updated Nov 21, 2025
    + more versions
    Cite
    U.S. Geological Survey (2025). Geochemical Database for Iron Oxide-Copper-Cobalt-Gold-Rare Earth Element Deposits of Southeast Missouri [Dataset]. https://catalog.data.gov/dataset/geochemical-database-for-iron-oxide-copper-cobalt-gold-rare-earth-element-deposits-of-sout
    Explore at:
    Dataset updated
    Nov 21, 2025
    Dataset provided by
    United States Geological Survey (http://www.usgs.gov/)
    Description

    The Geochemical Database for Iron Oxide-Copper-Cobalt-Gold-Rare Earth Element Deposits of Southeast Missouri (IOCG-REE_GX) contains new geochemical data compilations for samples from IOCG-REE type deposits in which each rock sample has one "best value" determination for each analyzed species, greatly improving speed and efficiency of use. IOCG-REE_GX was created and designed to compile whole-rock and trace element data from southeast Missouri in order to facilitate petrologic studies, mineral resource assessments, and the definition and statistics of geochemical baseline values. This relational database serves as a data archive in support of present and future geologic and geochemical studies of IOCG-REE type deposits, and contains data tables in two different formats describing historical and new quantitative and qualitative geochemical analyses. The analytical results were determined by 24 laboratory analytical methods on 457 rock samples collected during two phases from outcrop and drill core sites from throughout the entire St. Francois Mountains terrane made by the U.S. Geological Survey (USGS). In the first phase, the USGS collected and analyzed 315 samples from 1989 to 1995. During the second phase from 2013-2015, 119 samples were collected and analyzed, and 23 samples from the first phase were reanalyzed using analytical methods of higher precision. The data presents the most precise analytical approach to report the best value for each element. 
    In order to facilitate examination of the geochemistry of the broad range of samples reported (i.e., regional samples, ore zone, or ore deposit alteration-related), a short sample description is given and each sample is coded according to: the type of rock suite defined by Kisvarsanyi (1981) (Jcode); whether the sample was collected as a non-ore-deposit-related representative of a given rock suite or as a deposit-related sample (Kcode); and, if the sample was related to a specific ore deposit, the zone within the given deposit (Lcode). These coded data provide a robust tool for evaluating the regional geologic setting of the host terrane as well as assessing the character of hydrothermal alteration related to many of the contained mineral deposits. Data from the first phase are currently maintained in the USGS National Geochemical Database (NGDB), and data from the second phase will soon be added. The data of the IOCG-REE_GX were checked for accuracy regarding sample location, sample media type, and analytical methods used.

    Reference: Kisvarsanyi, E.B., 1981, Geology of the Precambrian St. Francois terrane, southeastern Missouri: Missouri Department of Natural Resources Report of Investigations 64, 58 p.
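The "best value" logic described above (one determination per analyzed species per sample, favoring the most precise method) can be sketched as a rank-and-deduplicate step. The method names, precision ranking, and column names below are illustrative assumptions, not the actual IOCG-REE_GX schema.

```python
import pandas as pd

# Hypothetical long-format analyses; the real database covers
# 24 laboratory analytical methods on 457 rock samples.
analyses = pd.DataFrame({
    "sample_id": ["A", "A", "A", "B", "B"],
    "element":   ["Cu", "Cu", "Au", "Cu", "Au"],
    "method":    ["ICP-MS", "XRF", "FA", "XRF", "FA"],
    "value_ppm": [412.0, 430.0, 0.8, 95.0, 0.2],
})

# Assumed precision ranking (lower = more precise); the ranking actually
# used by the compilation is documented in the database itself.
PRECISION = {"ICP-MS": 0, "FA": 1, "XRF": 2}

best = (
    analyses
    .assign(rank=analyses["method"].map(PRECISION))
    .sort_values("rank")
    .drop_duplicates(subset=["sample_id", "element"], keep="first")
    .drop(columns="rank")
)
# one "best value" row per (sample, element) pair
```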

  7. Means, standard deviations, and longitudinal correlations of hovs in the LuNT data

    • plos.figshare.com
    • datasetcatalog.nlm.nih.gov
    xls
    Updated Nov 30, 2023
    Cite
    Oscar Smallenbroek; Adrian Stanciu; Regina Arant; Klaus Boehnke (2023). Means, standard deviations, and longitudinal correlations of hovs in the LuNT data. [Dataset]. http://doi.org/10.1371/journal.pone.0289487.t002
    Explore at:
    xls. Available download formats
    Dataset updated
    Nov 30, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Oscar Smallenbroek; Adrian Stanciu; Regina Arant; Klaus Boehnke
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Means, standard deviations, and longitudinal correlations of hovs in the LuNT data.

  8. 2000-2015: Monthly Means of Daily Means Wind Speed TIFFs for the Lake Victoria Region

    • dataverse.harvard.edu
    • search.dataone.org
    Updated Oct 11, 2016
    Cite
    Harvard Dataverse (2016). 2000-2015: Monthly Means of Daily Means Wind Speed TIFFs for the Lake Victoria Region [Dataset]. http://doi.org/10.7910/DVN/FQBKX5
    Explore at:
    Dataset updated
    Oct 11, 2016
    Dataset provided by
    Harvard Dataverse
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    2000-2015: Monthly Means of Daily Means Wind Speed TIFFs for the Lake Victoria Region

    Reference System: SR-ORG:14

    Developed for: NSF Award Number 1518532, CNH-L: The Potential for Aquaculture in Lake Victoria and Implications for Wild Fisheries and Fish Commodity Markets. This wind speed data set was developed for the purposes of National Science Foundation (NSF) Coupled Natural and Human systems research.

    File Naming Convention: All TIFFs are named YYYY_MMWS, where YYYY is the year of the data, MM is the month, and WS stands for Wind Speed.

    Data Origin: The original dataset was downloaded from http://apps.ecmwf.int/datasets/data/interim-full-moda/levtype=sfc/. The data downloaded was ERA Interim, Monthly Means of Daily Means, developed by the European Centre for Medium-Range Weather Forecasts (ECMWF), obtained as single-component 10 meter Wind Speed data packaged as NetCDF. Data values are monthly means of daily means, measured in m/s, and are representative of surface-level collection. Origin data were derived and developed from reanalysis. ECMWF, the data provider and developer, defines reanalysis as follows: “Reanalysis (as well as analysis) is a process by which model information and observations of many different sorts are combined in an optimal way to produce a consistent, global best estimate of the various atmospheric, wave and oceanographic parameters.” http://www.ecmwf.int/en/how-are-data-obtained-do-they-come-observations-or-have-they-been-derived-numerical-models

    Extent downloaded (degrees): N: 2, W: 27, S: -5, E: 36. Resolution downloaded (degrees): 0.125 x 0.125 (approx. 14 km at the equator).

    Data Development/Processing: These data were downloaded individually as monthly 10 meter Wind Speed in NetCDF format, then converted to TIFF using the following GDAL script (Windows batch):

    for %A in ("C:\temp*.nc") do gdal_translate -of GTiff -ot FLOAT32 -a_srs "+init=epsg:4326" -unscale -co "COMPRESS=PACKBITS" "%A" "%A.tif"

    Converted TIFF data were validated against the parent NetCDF file for correct cell size and pixel value.

    To use this data, cite: http://onlinelibrary.wiley.com/doi/10.1002/qj.828/abstract

  9. Cronbach’s alpha values, means and standard deviations1.

    • plos.figshare.com
    • figshare.com
    xls
    Updated Jun 3, 2023
    Cite
    Marrie H. J. Bekker; Marcel A. L. M. van Assen (2023). Cronbach’s alpha values, means and standard deviations1. [Dataset]. http://doi.org/10.1371/journal.pone.0181626.t001
    Explore at:
    xls. Available download formats
    Dataset updated
    Jun 3, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Marrie H. J. Bekker; Marcel A. L. M. van Assen
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Cronbach’s alpha values, means and standard deviations1.

  10. PixBox Sentinel-2 pixel collection for CMIX

    • data.niaid.nih.gov
    Updated Dec 20, 2021
    Cite
    Paperin, Michael; Wevers, Jan; Stelzer, Kerstin; Brockmann, Carsten (2021). PixBox Sentinel-2 pixel collection for CMIX [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_5036990
    Explore at:
    Dataset updated
    Dec 20, 2021
    Dataset provided by
    Brockmann Consult GmbH
    Authors
    Paperin, Michael; Wevers, Jan; Stelzer, Kerstin; Brockmann, Carsten
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The PixBox-S2-CMIX dataset was used as a validation reference within the first Cloud Masking Inter-comparison eXercise (CMIX), conducted within the Committee on Earth Observation Satellites (CEOS) Working Group on Calibration & Validation (WGCV) in 2019. The PixBox-S2-CMIX pixel collection predates CMIX: it was compiled in 2018.

    The overarching idea of PixBox is a quantitative assessment of the quality of a pixel classification which is the result of an automated algorithm/procedure. Pixel classification is defined as assigning a certain number of attributes to an image pixel, such as cloud, clear sky, water, land, inland water, flooded, snow etc. Such pixel classification attributes are typically used to further guide higher level processing.

    PixBox dataset production: trained, experienced experts manually classify pixels from an image sensor into a pre-defined, detailed set of classes. These typically cover different cloud transparencies, cloud shadow, and the condition of the underlying surface (“semi-transparent clouds over snow”, “clouds over bright scattering water”). An average collected dataset includes several tens of thousands of pixels, because it has to be representative of all classes and of various observation and environmental conditions, such as climate zones, sun illumination, etc. Quality control of the collected pixels is important in order to detect misclassifications and systematic errors. An auto-associative neural network is trained for this purpose.
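As an illustration of this reconstruction-error style of quality control, the sketch below substitutes a linear auto-associative bottleneck (PCA reconstruction) for the trained auto-associative neural network; the feature values are synthetic and the band layout is assumed.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical band values for manually classified pixels of one class;
# the real collection stores Sentinel-2 L1C reflectances per pixel.
X = rng.normal(size=(500, 6))
X[:, 3] = 0.5 * X[:, 0] + 0.1 * rng.normal(size=500)  # correlated bands

# Linear auto-associative bottleneck: project onto the top principal
# components and reconstruct (a stand-in for the trained network).
mu = X.mean(axis=0)
_, _, Vt = np.linalg.svd(X - mu, full_matrices=False)
P = Vt[:3]  # 3-unit bottleneck

def recon_error(x):
    """Reconstruction error of pixel feature vector(s) x."""
    xc = np.atleast_2d(x) - mu
    return np.linalg.norm(xc - xc @ P.T @ P, axis=1)

suspect = np.array([8.0, -8.0, 8.0, -8.0, 8.0, -8.0])  # likely mislabel
err_suspect = recon_error(suspect)[0]
# pixels whose error far exceeds the training distribution get re-checked
```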

    The PixBox-S2-CMIX dataset is a pixel collection containing 17,351 pixels manually collected from 29 Sentinel-2 A & B Level 1C products. The dataset is spatially, temporally, and thematically well distributed.

    PixBox-S2-CMIX dataset

    The PixBox-S2-CMIX dataset consists of two main ZIP files: one holds the pixel collection and its description, the other all Sentinel-2 L1C data used. The dataset is structured as follows:

    PixBox-S2-CMIX.zip

    The collected features (CSV file).

    A description of all categories and classes, incl. linkage to the used Sentinel-2 L1C products.

    Sentinel-2_L1C.zip

    29 zipped Sentinel-2 Level L1C products [1], used to produce the dataset.

    Files

    pixbox_sentinel2_cmix_20180425.csv - This file contains all collected pixel information in CSV format. All collected classes are stored as integer values; a description of the categories and the mapping of integers to class names is given in the accompanying description file.

    pixbox_sentinel2_cmix_20180425_description.txt - This file gives a clear description of the categories and classes and can be used to convert the class ID numbers stored in the CSV to class names. Additionally, it links the satellite product IDs given in the CSV to the Sentinel-2 L1C product names.

    29 Sentinel-2 L1C products in ZIP format.
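    The CSV/description pairing above could be consumed roughly as sketched below. The column names and the mapping-file syntax here are assumptions for illustration only; the actual layout is defined in pixbox_sentinel2_cmix_20180425_description.txt.

    ```python
    import csv
    import io
    import re

    # Hypothetical excerpt of the description file: "<id> = <class name>"
    # lines under a category header (the real syntax may differ).
    DESCRIPTION = """\
    category: surface_type
    1 = clear sky land
    2 = clear sky water
    3 = opaque cloud
    4 = semi-transparent cloud
    """

    # Hypothetical excerpt of the pixel CSV: integer-coded classes per pixel.
    PIXELS_CSV = """\
    product_id,x,y,surface_type
    12,1045,2210,3
    12,1046,2210,4
    7,801,399,1
    """

    def load_mapping(text):
        """Parse '<id> = <name>' lines into an int -> str dictionary."""
        mapping = {}
        for line in text.splitlines():
            m = re.match(r"\s*(\d+)\s*=\s*(.+)", line)
            if m:
                mapping[int(m.group(1))] = m.group(2).strip()
        return mapping

    def decode_pixels(csv_text, mapping, column="surface_type"):
        """Replace integer class codes in `column` with class names."""
        rows = []
        for row in csv.DictReader(io.StringIO(csv_text)):
            row[column] = mapping[int(row[column])]
            rows.append(row)
        return rows

    classes = load_mapping(DESCRIPTION)
    decoded = decode_pixels(PIXELS_CSV, classes)
    ```

    For the real files one would open the CSV from PixBox-S2-CMIX.zip instead of the inline strings, keeping the same decode step.
    
    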

    References

    [1] Copernicus Sentinel data 2017/2018


Cite
Anastasija Nikiforova (2021). A stakeholder-centered determination of High-Value Data sets: the use-case of Latvia [Dataset]. https://data-staging.niaid.nih.gov/resources?id=zenodo_5142816

A stakeholder-centered determination of High-Value Data sets: the use-case of Latvia

Explore at:
Dataset updated
Oct 27, 2021
Dataset provided by
University of Latvia
Authors
Anastasija Nikiforova
License

Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Area covered
Latvia
Description

The data in this dataset were collected through a 2021 survey of Latvian society aimed at identifying high-value data sets for Latvia, i.e. data sets that, in the view of Latvian society, could create value for the Latvian economy and society. The survey was designed for both individuals and businesses. It is being made public both to serve as supplementary data for the paper "Towards enrichment of the open government data: a stakeholder-centered determination of High-Value Data sets for Latvia" (author: Anastasija Nikiforova, University of Latvia) and to allow other researchers to use these data in their own work.

The survey was distributed among Latvian citizens and organisations. The structure of the survey is available in the supplementary file (see Survey_HighValueDataSets.odt).

Description of the data in this data set: structure of the survey and pre-defined answers (if any)

1. Have you ever used open (government) data? - {(1) yes, once; (2) yes, there has been a little experience; (3) yes, continuously; (4) no, it wasn’t needed for me; (5) no, have tried but failed}
2. How would you assess the value of open government data that are currently available for your personal use or your business? - 5-point Likert scale, where 1 – any to 5 – very high
3. If you ever used the open (government) data, what was the purpose of using them? - {(1) have not had to use; (2) to identify the situation for an object or an event (e.g. Covid-19 current state); (3) data-driven decision-making; (4) for the enrichment of my data, i.e. by supplementing them; (5) for better understanding of decisions of the government; (6) awareness of governments’ actions (increasing transparency); (7) forecasting (e.g. trends etc.); (8) for developing data-driven solutions that use only the open data; (9) for developing data-driven solutions, using open data as a supplement to existing data; (10) for training and education purposes; (11) for entertainment; (12) other (open-ended question)}
4. What category(ies) of “high value datasets” is, in your opinion, able to create added value for society or the economy? - {(1) Geospatial data; (2) Earth observation and environment; (3) Meteorological; (4) Statistics; (5) Companies and company ownership; (6) Mobility}
5. To what extent do you think the current data catalogue of Latvia’s Open data portal corresponds to the needs of data users/consumers? - 10-point Likert scale, where 1 – no data are useful and 10 – fully corresponds, i.e. all potentially valuable datasets are available
6. Which of the current data categories in Latvia’s open data portal, in your opinion, most corresponds to the “high value dataset”? - {(1) Foreign affairs; (2) Business economy; (3) Energy; (4) Citizens and society; (5) Education and sport; (6) Culture; (7) Regions and municipalities; (8) Justice, internal affairs and security; (9) Transport; (10) Public administration; (11) Health; (12) Environment; (13) Agriculture, food and forestry; (14) Science and technologies}
7. Which of them form your TOP-3? - the same categories as in question 6
8. How would you assess the value of the following data categories? (each on a 5-point Likert scale, where 1 – not needed to 5 – highly valuable)
8.1. sensor data
8.2. real-time data
8.3. geospatial data
9. What would these datasets be, i.e. what (sub)topic could these data be associated with? - open-ended question
10. Which of the data sets currently available could be valuable and useful for society and businesses? - open-ended question
11. Which of the data sets currently NOT available in Latvia’s open data portal could, in your opinion, be valuable and useful for society and businesses? - open-ended question
12. How did you define them? - {(1) subjective opinion; (2) experience with data; (3) filtering out the most popular datasets, i.e. basing them on public opinion; (4) other (open-ended question)}
13. How high could the value of these data sets be for you or your business? - 5-point Likert scale, where 1 – not valuable to 5 – highly valuable
14. Do you represent any company/organization (are you working anywhere)? (if “yes”, please fill out the survey twice, i.e. as an individual user AND as a company representative) - {yes; no; I am an individual data user; other (open-ended)}
15. What industry/sector does your company/organization belong to? (if you do not work at the moment, please choose the last option) - {Information and communication services; Financial and insurance activities; Accommodation and catering services; Education; Real estate operations; Wholesale and retail trade, repair of motor vehicles and motorcycles; Transport and storage; Construction; Water supply, waste water, waste management and recovery; Electricity, gas supply, heating and air conditioning; Manufacturing industry; Mining and quarrying; Agriculture, forestry and fisheries; Professional, scientific and technical services; Operation of administrative and service services; Public administration and defence, compulsory social insurance; Health and social care; Art, entertainment and recreation; Activities of households as employers; CSO/NGO; I am not a representative of any company}
16. To which category does your company/organization belong in terms of its size? - {small; medium; large; self-employed; I am not a representative of any company}
17. What is the age group that you belong to? (if you are an individual user, not a company representative) - {11..15, 16..20, 21..25, 26..30, 31..35, 36..40, 41..45, 46+, “do not want to reveal”}
18. Please indicate the education or scientific degree that corresponds most to you (if you are an individual user, not a company representative) - {master’s degree; bachelor’s degree; Dr. and/or PhD; student (bachelor level); student (master level); doctoral candidate; pupil; do not want to reveal these data}
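Once exported to CSV, Likert items such as question 8 above can be summarised with the Python standard library alone. The column names below are invented for illustration and will differ from those in the released .xls/.csv file.

```python
import csv
import io
from statistics import mean

# Hypothetical export of question 8 (value of data categories,
# 1 - not needed ... 5 - highly valuable); real column names will differ.
RESPONSES = """\
respondent,sensor_data,real_time_data,geospatial_data
r1,4,5,5
r2,3,5,4
r3,5,4,5
"""

def likert_means(csv_text):
    """Mean rating per Likert column (every column except 'respondent')."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    questions = [c for c in rows[0] if c != "respondent"]
    return {q: mean(int(r[q]) for r in rows) for q in questions}

summary = likert_means(RESPONSES)
```

The same pattern extends to the categorical questions by swapping `mean` for `collections.Counter`.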

Format of the files: .xls, .csv (for the first spreadsheet only), .odt

Licenses or restrictions: CC-BY
