100+ datasets found
  1. metadata

    • catalog.data.gov
    • datasets.ai
    Updated Nov 12, 2020
    Cite
    U.S. EPA Office of Research and Development (ORD) (2020). metadata [Dataset]. https://catalog.data.gov/dataset/metadata-f2500
    Dataset updated
    Nov 12, 2020
    Dataset provided by
    United States Environmental Protection Agency: http://www.epa.gov/
    Description

    The dataset consists of public domain acute and chronic toxicity and chemistry data for algal species. Data are accessible at https://envirotoxdatabase.org/. Data include algal species, chemical identification, and the concentrations that do and do not affect algal growth.

  2. Data from: Sample metadata

    • fairdomhub.org
    xlsx
    Updated Jul 1, 2021
    Cite
    Thomas Harvey (2021). Sample metadata [Dataset]. https://fairdomhub.org/data_files/1440
    Available download formats: xlsx (43.9 KB)
    Dataset updated
    Jul 1, 2021
    Authors
    Thomas Harvey
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Information on samples submitted for RNAseq

    Rows are individual samples

    Columns are: ID, Sample Name, Date sampled, Species, Sex, Tissue, Geographic location, Date extracted, Extracted by, Nanodrop Conc. (ng/µl), 260/280, 260/230, RIN, Plate ID, Position, Index name, Index Seq, Qubit BR kit Conc. (ng/µl), BioAnalyzer Conc. (ng/µl), BioAnalyzer bp (region 200-1200), Submission reference, Date submitted, Conc. (nM), Volume provided, PE/SE, Number of reads, Read length

  3. Enterprise Metadata Repository (EMR)

    • catalog.data.gov
    • data.wu.ac.at
    Updated May 22, 2025
    Cite
    Social Security Administration (2025). Enterprise Metadata Repository (EMR) [Dataset]. https://catalog.data.gov/dataset/enterprise-metadata-repository-emr
    Dataset updated
    May 22, 2025
    Dataset provided by
    Social Security Administration: http://www.ssa.gov/
    Description

    Stores physical and logical information about relational databases and record structures to assist in data identification and management.

  4. Common Metadata Elements for Cataloging Biomedical Datasets

    • figshare.com
    xlsx
    Updated Jan 20, 2016
    Cite
    Kevin Read (2016). Common Metadata Elements for Cataloging Biomedical Datasets [Dataset]. http://doi.org/10.6084/m9.figshare.1496573.v1
    Available download formats: xlsx
    Dataset updated
    Jan 20, 2016
    Dataset provided by
    Figshare: http://figshare.com/
    Authors
    Kevin Read
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset outlines a proposed set of core, minimal metadata elements that can be used to describe biomedical datasets, such as those resulting from research funded by the National Institutes of Health. It can inform efforts to better catalog or index such data to improve discoverability. The proposed metadata elements are based on an analysis of the metadata schemas used in a set of NIH-supported data sharing repositories. Common elements from these data repositories were identified, mapped to existing data-specific metadata standards from two existing multidisciplinary data repositories, DataCite and Dryad, and compared with metadata used in MEDLINE records to establish a sustainable and integrated metadata schema. From the mappings, we developed a preliminary set of minimal metadata elements that can be used to describe NIH-funded datasets. Please see the readme file for more details about the individual sheets within the spreadsheet.

  5. Data from: A metadata approach to evaluate the state of ocean knowledge:...

    • datadryad.org
    • data.niaid.nih.gov
    • +2more
    zip
    Updated Jun 14, 2019
    Cite
    Juliano Palacios-Abrantes; Andrés M. Cisneros-Montemayor; Miguel A. Cisneros-Mata; Laura Rodríguez; Francisco Arreguín-Sánchez; Veronica Aguilar; Santiago Domínguez-Sánchez; Stuart Fulton; Raquel López-Sagástegui; Hector Reyes-Bonilla; Rocio Rivera-Campos; Silvia Salas; Nuno Simoes; William W. L. Cheung (2019). A metadata approach to evaluate the state of ocean knowledge: strengths, limitations, and application to Mexico [Dataset]. http://doi.org/10.5061/dryad.pt80482
    Available download formats: zip
    Dataset updated
    Jun 14, 2019
    Dataset provided by
    Dryad
    Authors
    Juliano Palacios-Abrantes; Andrés M. Cisneros-Montemayor; Miguel A. Cisneros-Mata; Laura Rodríguez; Francisco Arreguín-Sánchez; Veronica Aguilar; Santiago Domínguez-Sánchez; Stuart Fulton; Raquel López-Sagástegui; Hector Reyes-Bonilla; Rocio Rivera-Campos; Silvia Salas; Nuno Simoes; William W. L. Cheung
    Time period covered
    2019
    Area covered
    Gulf of Mexico (Gulf of America), Mexico, Baja California Peninsula, Yucatan Peninsula, Gulf of California, Eastern Tropical Pacific
    Description

    Mexico Marine Research Metadata Database

    This project compiled metadata on available datasets produced by marine research in Mexico. The data is categorized by region, theme, species (when applicable), and research fields. This dataset corresponds to the associated peer-reviewed paper; the living database can be accessed at http://infoceanos.conabio.gob.mx.

    Mexico Metadata Database DataDryad.csv

  6. Metadata Management Tools Report

    • marketresearchforecast.com
    doc, pdf, ppt
    Updated Mar 21, 2025
    Cite
    Market Research Forecast (2025). Metadata Management Tools Report [Dataset]. https://www.marketresearchforecast.com/reports/metadata-management-tools-46465
    Available download formats: doc, ppt, pdf
    Dataset updated
    Mar 21, 2025
    Dataset authored and provided by
    Market Research Forecast
    License

    https://www.marketresearchforecast.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    The Metadata Management Tools market is experiencing robust growth, driven by the increasing volume and complexity of data across various industries. The market, estimated at $15 billion in 2025, is projected to achieve a Compound Annual Growth Rate (CAGR) of 12% from 2025 to 2033, reaching approximately $40 billion by 2033. This expansion is fueled by several key factors. Firstly, the rising adoption of cloud-based solutions provides scalability and cost-effectiveness, attracting businesses of all sizes. Secondly, the stringent regulatory compliance needs across sectors like BFSI and healthcare necessitate robust metadata management for data governance and security. Furthermore, the growing demand for data-driven decision-making and advanced analytics increases the reliance on accurate and readily accessible metadata. Key trends include the integration of AI and machine learning for automated metadata discovery and classification, and the increasing demand for solutions offering enhanced data lineage capabilities. While the market faces restraints like the complexity of implementation and the need for skilled professionals, the overall positive market outlook is supported by continuous innovation and increasing enterprise awareness of the value proposition of effective metadata management. The market is segmented by deployment (cloud-based and on-premise) and application (BFSI, retail, medical, media, and others). Major players such as Oracle, SAP, IBM, and Informatica dominate the market, while several emerging players are also vying for market share through innovative solutions. The North American region currently holds the largest market share, followed by Europe and Asia Pacific. The competitive landscape is marked by both established players and innovative startups. Established players leverage their existing customer base and extensive product portfolios, while emerging companies often focus on niche solutions and advanced technologies. 
The market is witnessing increased mergers and acquisitions, strategic partnerships, and product advancements, indicative of a dynamic and competitive landscape. Future growth hinges on the ability of vendors to adapt to the evolving technological landscape, meet the growing need for data security and compliance, and provide user-friendly, scalable, and cost-effective solutions. The focus on data quality, interoperability, and governance will continue to shape the development and adoption of metadata management tools across industries. Geographical expansion, especially into developing economies, presents a significant opportunity for market growth.
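The growth figures quoted in this description can be sanity-checked with the standard compound-growth formula, value = start × (1 + rate)^years. The sketch below is illustrative only: the function name `project_value` is hypothetical, and the inputs are the $15 billion 2025 estimate and the 12% CAGR stated in the description.

```python
def project_value(start_value: float, annual_rate: float, years: int) -> float:
    """Compound start_value at annual_rate for the given number of years."""
    return start_value * (1.0 + annual_rate) ** years


# Inputs taken from the description: $15 billion in 2025, 12% CAGR, through 2033.
START_BILLION = 15.0
CAGR = 0.12
YEARS = 2033 - 2025  # eight compounding periods

projected_2033 = project_value(START_BILLION, CAGR, YEARS)
print(f"Projected 2033 market size: ${projected_2033:.1f} billion")
```

Under these inputs the projection comes out at roughly $37 billion, which is in the same range as, though somewhat below, the "approximately $40 billion" quoted in the description.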

  7. github-meta-data

    • huggingface.co
    Updated May 31, 2025
    Cite
    zamal_ (2025). github-meta-data [Dataset]. https://huggingface.co/datasets/zamal/github-meta-data
    Dataset updated
    May 31, 2025
    Authors
    zamal_
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    GitHub Meta Data

    This dataset contains GitHub repository descriptions paired with their tags.

    input: a natural language query or description of a GitHub project
    target: comma-separated tags describing it

    Used for training a T5 model for GitHub-style tag generation.
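As a rough illustration of the record layout described above, the sketch below splits a comma-separated `target` string into a list of tags. The sample record and the helper name `parse_tags` are hypothetical and are not taken from the dataset itself; only the `input` and `target` field names come from the description.

```python
def parse_tags(target: str) -> list[str]:
    """Split a comma-separated tag string into a list of trimmed, non-empty tags."""
    return [tag.strip() for tag in target.split(",") if tag.strip()]


# Hypothetical record following the input/target layout in the description.
record = {
    "input": "A lightweight web framework for building REST APIs",
    "target": "python, web, rest-api",
}

tags = parse_tags(record["target"])
print(tags)
```

Keeping the tags as a single comma-separated string matches a text-to-text setup such as T5, where both input and target are plain strings; splitting is only needed when the tags are consumed downstream.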

  8. gms-index-mediator: a R-tree-based in-memory index for fast spatio-temporal...

    • dataservices.gfz-potsdam.de
    Updated 2018
    Cite
    Daniel Eggert; Mike Sips; Doris Dransch; Mike Sips; Doris Dransch (2018). gms-index-mediator: a R-tree-based in-memory index for fast spatio-temporal queries for the GeoMultiSens platform [Dataset]. http://doi.org/10.5880/gfz.1.5.2018.004
    Dataset updated
    2018
    Dataset provided by
    GFZ Data Services
    datacite
    Authors
    Daniel Eggert; Mike Sips; Doris Dransch; Mike Sips; Doris Dransch
    License

    https://www.gnu.org/licenses/gpl-3.0.html

    Description

    Gms-index-mediator is a standalone index for spatio-temporal data acting as a mediator between an application and a database. Even modern databases need several minutes to execute a spatio-temporal query on huge tables containing several million entries. Our index-mediator speeds up the execution of such queries by several orders of magnitude, resulting in response times around 100 ms. This version is tailored towards the GeoMultiSens database, but can be adapted to work with custom table layouts with reasonable effort.

  9. Database for climate time series homogenization with metadata

    • zenodo.org
    zip
    Updated Aug 15, 2022
    Cite
    Peter Domonkos; Peter Domonkos (2022). Database for climate time series homogenization with metadata [Dataset]. http://doi.org/10.5281/zenodo.6990845
    Available download formats: zip
    Dataset updated
    Aug 15, 2022
    Dataset provided by
    Zenodo: http://zenodo.org/
    Authors
    Peter Domonkos; Peter Domonkos
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Usefulness of metadata in the automatic version of ACMANTv5 was tested.
    A benchmark database has been developed, which consists of 41 datasets
    of 20,500 networks of 170,000 synthetic monthly temperature time series
    and the relating metadata dates. The research was supported by the
    Catalan Meteorological Service. The research results will be published
    in the open access MDPI journal Atmosphere.

    See more in the "Readme.txt" file of the dataset.

  10. Metadata Management Software Report

    • datainsightsmarket.com
    doc, pdf, ppt
    Updated Jan 18, 2025
    Cite
    Data Insights Market (2025). Metadata Management Software Report [Dataset]. https://www.datainsightsmarket.com/reports/metadata-management-software-1974588
    Available download formats: pdf, ppt, doc
    Dataset updated
    Jan 18, 2025
    Dataset authored and provided by
    Data Insights Market
    License

    https://www.datainsightsmarket.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    Market Size and Growth: The global Metadata Management Software market was valued at USD XXX million in 2025 and is projected to grow at a CAGR of XX% from 2025 to 2033, reaching USD XXX million by the end of the forecast period. The increasing demand for efficient and accurate data management, coupled with the growing adoption of cloud-based solutions, are key drivers of this growth. The market is segmented by application (data governance, data integration, data quality, data security, and others) and type (structured, unstructured, and semi-structured). North America and Europe are currently the dominant regional markets, while Asia Pacific is expected to witness significant growth in the coming years. Key Trends and Challenges: One of the major trends in the Metadata Management Software market is the rise of artificial intelligence (AI) and machine learning (ML). AI-powered tools can automate metadata extraction, classification, and analysis tasks, reducing manual effort and improving accuracy. Another trend is the adoption of semantic technologies, which allow organizations to create more meaningful connections between different types of data. However, challenges such as data privacy and security concerns, as well as the lack of skilled professionals, could hinder market growth.

  11. All Metadata - Training Data.xlsx

    • fairdomhub.org
    xlsx
    Updated Feb 5, 2025
    Cite
    Charles Demurjian (2025). All Metadata - Training Data.xlsx [Dataset]. https://fairdomhub.org/data_files/7664
    Available download formats: xlsx (66.7 KB)
    Dataset updated
    Feb 5, 2025
    Authors
    Charles Demurjian
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Description not specified.

  12. GOLD metadata

    • bioregistry.io
    Updated Apr 27, 2021
    Cite
    (2021). GOLD metadata [Dataset]. https://bioregistry.io/gold.meta
    Dataset updated
    Apr 27, 2021
    Description
    • DEPRECATION NOTE - Please keep in mind that this namespace has been superseded by the ‘gold’ prefix at https://registry.identifiers.org/registry/gold. This namespace is kept here to support existing citations; new citations should use the ‘gold’ namespace.

    The GOLD (Genomes OnLine Database) is a resource for centralized monitoring of genome and metagenome projects worldwide. It stores information on complete and ongoing projects, along with their associated metadata. This collection references metadata associated with samples.

  13. Data from: An open source framework for metadata exploration and discovery...

    • search.dataone.org
    • arcticdata.io
    • +1more
    Updated Jul 17, 2020
    Cite
    Christian Mattmann (2020). An open source framework for metadata exploration and discovery of Polar Data [Dataset]. http://doi.org/10.18739/A2R49G96H
    Dataset updated
    Jul 17, 2020
    Dataset provided by
    Arctic Data Center
    Authors
    Christian Mattmann
    Time period covered
    Jan 1, 2015 - Jan 1, 2016
    Area covered
    Earth
    Description

    This project will deliver an open source framework for metadata exploration, automatic text mining and information retrieval of polar data that uses the Apache Tika technology. Apache Tika is currently the de facto "babel fish", aiding in the automatic MIME detection, text extraction, and metadata classification of over 1200 data formats. The PI will expand Tika to handle polar data and scientific data formats, making Polar data more easily available, searchable, and retrievable by all major content management systems. The proposed activity will lay the framework for a thorough automatically generated inventory of polar metadata and data. Expanding Tika to handle polar data will also naturally invite the technology/open source community to deal with polar use cases, helping to increase understanding of the arctic. The resultant software produced through effort will be disseminated to the software and polar communities through the Apache Software Foundation. A computer science graduate student and postdoc will be exposed to Cryosphere and Arctic data, helping to train the next generation of cross disciplinary data scientists in the domain. The PI's Search Engines (20-40 students annual enrollment) and Software Architecture (30-50 students annual enrollment) graduate courses at USC will benefit from the Arctic cyberinfrastructure use cases disseminated through course projects and lecture material. The PI will also work collaboratively with NSF-funded projects dealing with projects focusing on the archiving, discovery and access of polar data, such as ACADIS and the Antarctic Master Directory.

  14. Active Marine Station Metadata

    • catalog.data.gov
    • datadiscoverystudio.org
    • +2more
    Updated Sep 7, 2012
    Cite
    DOC/NOAA/NESDIS/NCEI > National Centers for Environmental Information, NESDIS, NOAA, U.S. Department of Commerce (Point of Contact) (2012). Active Marine Station Metadata [Dataset]. https://catalog.data.gov/ro/dataset/active-marine-station-metadata
    Dataset updated
    Sep 7, 2012
    Dataset provided by
    United States Department of Commerce: http://www.commerce.gov/
    National Centers for Environmental Information: https://www.ncei.noaa.gov/
    National Environmental Satellite, Data, and Information Service
    Description

    The Active Marine Station Metadata is a daily metadata report for active marine buoy and C-MAN (Coastal Marine Automated Network) platforms from the National Data Buoy Center (NDBC). Metadata includes the station id, latitude/longitude (resolution to thousandths of a degree), the station name, the station owner, the program the station is associated with (e.g., TAO, NDBC, tsunami, NOS, etc.), station type (e.g., buoy, fixed, oil rig, etc.), and notification of whether the station observes meteorology, currents, and water quality (signified by 'y' for yes and 'n' for no). If there is a 'y' associated with one of these tags, then the station has reported data in that category within the last 8 hours (or 24 hours for DART stations--Deep-Ocean Assessment Reporting of Tsunamis). If there is an 'n', data has not been received within those times. Stations are removed from the list when they are dismantled. The metadata information is written to a daily XML-formatted file.
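To make the 'y'/'n' reporting flags concrete, here is a minimal sketch that filters stations by category from an XML document. The element and attribute names (`station`, `met`, `currents`, `waterquality`) and both sample stations are assumptions made for illustration; the description does not specify the actual schema of the daily file, so check the real file before relying on these names.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; attribute names and values are assumed, not from the source.
SAMPLE_XML = """<stations>
  <station id="00001" lat="30.000" lon="-90.000" name="Example Buoy"
           owner="NDBC" pgm="NDBC" type="buoy"
           met="y" currents="n" waterquality="n"/>
  <station id="00002" lat="25.500" lon="-80.250" name="Example Platform"
           owner="NOS" pgm="NOS" type="fixed"
           met="n" currents="n" waterquality="y"/>
</stations>"""


def reporting_stations(xml_text: str, category: str) -> list[str]:
    """Return ids of stations whose flag for the given category is 'y'."""
    root = ET.fromstring(xml_text)
    return [
        station.get("id")
        for station in root.iter("station")
        if station.get(category) == "y"
    ]


met_ids = reporting_stations(SAMPLE_XML, "met")
water_ids = reporting_stations(SAMPLE_XML, "waterquality")
print(met_ids, water_ids)
```

A station flagged 'n' in a category is simply skipped, which mirrors the description's rule that 'n' means no data was received within the reporting window.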

  15. Metadata Management Tool (MMT) records archive

    • ontario-geohub-1-3-lio.hub.arcgis.com
    Updated Mar 28, 2022
    Cite
    Land Information Ontario (2022). Metadata Management Tool (MMT) records archive [Dataset]. https://ontario-geohub-1-3-lio.hub.arcgis.com/items/f350b1fca5e841f28d42e59bce0d9294
    Dataset updated
    Mar 28, 2022
    Dataset authored and provided by
    Land Information Ontario
    License

    https://www.ontario.ca/page/open-government-licence-ontario

    Description

    This table represents metadata records which formerly existed on LIO’s Metadata Management Tool. Records representing data licensed for use under the Open Government Licence - Ontario have migrated to the Ontario GeoHub.

    The remaining records could not migrate for one of the following reasons:

    • The data is not spatial.
    • The metadata record is incomplete.
    • The metadata contact information is invalid.
    • The metadata references data that has not been made available to LIO.
    • LIO cannot confirm that the data has been reviewed to be released under the Open Government Licence - Ontario.

    Contact LIO Support at geospatial@ontario.ca for more information or to get an extract of original metadata files.

    Status

    Obsolete: data is no longer relevant

    Maintenance and Update Frequency

    Not planned: there are no plans to update the data

    Contact

    Land Information Ontario (LIO) Support, geospatial@ontario.ca

  16. Data for: Identifying Metadata Quality Issues Across Cultures

    • dataverse.harvard.edu
    • search.dataone.org
    Updated Jul 26, 2023
    Cite
    Julie Shi; Mike Nason; Marco Tullney; Juan Pablo Alperin (2023). Data for: Identifying Metadata Quality Issues Across Cultures [Dataset]. http://doi.org/10.7910/DVN/GZI7IA
    Available format: Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
    Dataset updated
    Jul 26, 2023
    Dataset provided by
    Harvard Dataverse
    Authors
    Julie Shi; Mike Nason; Marco Tullney; Juan Pablo Alperin
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    This sample was drawn from the Crossref API on March 8, 2022. The sample was constructed purposefully on the hypothesis that records with at least one known issue would be more likely to yield issues related to cultural meanings and identity. Records known or suspected to have at least one quality issue were selected by the authors and Crossref staff. The Crossref API was then used to randomly select additional records from the same prefix. Records in the sample represent 51 DOI prefixes that were chosen without regard for the manuscript management or publishing platform used, as well as 17 prefixes for journals known to use the Open Journal Systems manuscript management and publishing platform. OJS was specifically identified due to the authors' familiarity with the platform, its international and multilingual reach, and previous work on its metadata quality.

  17. Priority Toxic Contaminant Metadata Inventory and Associated Total...

    • data.usgs.gov
    • catalog.data.gov
    Cite
    Brian Banks; Trevor Needham; Caitlyn Dugan; Ellie Foss; Emily Majcher, Priority Toxic Contaminant Metadata Inventory and Associated Total Polychlorinated Biphenyls Concentration Data [Dataset]. http://doi.org/10.5066/P9R78SQ6
    Dataset provided by
    United States Geological Survey: http://www.usgs.gov/
    Authors
    Brian Banks; Trevor Needham; Caitlyn Dugan; Ellie Foss; Emily Majcher
    License

    U.S. Government Works: https://www.usa.gov/government-works
    License information was derived automatically

    Time period covered
    Jan 1, 1972 - Dec 23, 2018
    Description

    In June 2019, the U.S. Geological Survey Maryland-Delaware-District of Columbia Water Science Center (MD-DE-DC WSC) team began to collect and inventory available information on toxic contaminants within the Chesapeake Bay Watershed. State agencies were contacted to determine available data. Also, the National Water Information System (NWIS) and National Water Quality Database (NWQD) were queried to gather relevant data for the compilation. The resulting tables contain records for available sites where specific analyte groups, Hg (mercury), PCB (polychlorinated biphenyls), or pesticides, have been collected with appropriate supplemental metadata including media, method, time frame, and frequency of collection. Sample results span 1972-2019. Files included in the data release: Basic_Table.csv, Detailed_Table.csv, NWIS_PCodes.csv, State_Result_Totals.csv, NWIS_Result_Totals.csv

  18. Metadata Records for Gap Analysis Program National Data Resources

    • datadiscoverystudio.org
    Updated Jan 1, 2012
    Cite
    (2012). Metadata Records for Gap Analysis Program National Data Resources [Dataset]. http://datadiscoverystudio.org/geoportal/rest/metadata/item/75207a1c98994c848862cbd6b9c7b79b/html
    Dataset updated
    Jan 1, 2012
    Description

    Link to the ScienceBase Item Summary page for the item described by this metadata record. Service Protocol: Link to the ScienceBase Item Summary page for the item described by this metadata record. Application Profile: Web Browser. Link Function: information

  19. Dataset relating a study on Geospatial Open Data usage and metadata quality

    • zenodo.org
    csv
    Updated Jun 19, 2023
    Cite
    Alfonso Quarati; Alfonso Quarati (2023). Dataset relating a study on Geospatial Open Data usage and metadata quality [Dataset]. http://doi.org/10.5281/zenodo.4584542
    Available download formats: csv
    Dataset updated
    Jun 19, 2023
    Dataset provided by
    Zenodo: http://zenodo.org/
    Authors
    Alfonso Quarati; Alfonso Quarati
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The Open Government Data portals (OGD) thanks to the presence of thousands of geo-referenced datasets, containing spatial information, are of extreme interest for any analysis or process relating to the territory. For this to happen, users must be enabled to access these datasets and reuse them. An element often considered hindering the full dissemination of OGD data is the quality of their metadata. Starting from an experimental investigation conducted on over 160,000 geospatial datasets belonging to six national and international OGD portals, this work has as its first objective to provide an overview of the usage of these portals measured in terms of datasets views and downloads. Furthermore, to assess the possible influence of the quality of the metadata on the use of geospatial datasets, an assessment of the metadata for each dataset was carried out, and the correlation between these two variables was measured. The results obtained showed a significant underutilization of geospatial datasets and a generally poor quality of their metadata. Besides, a weak correlation was found between the use and quality of the metadata, not such as to assert with certainty that the latter is a determining factor of the former.

    The dataset consists of six zipped CSV files, containing the collected datasets' usage data, full metadata, and computed quality values, for about 160,000 geospatial datasets belonging to the three national and three international portals considered in the study, i.e. US (catalog.data.gov), Colombia (datos.gov.co), Ireland (data.gov.ie), HDX (data.humdata.org), EUODP (data.europa.eu), and NASA (data.nasa.gov).

    Data collection occurred in the period: 2019-12-19 -- 2019-12-23.

    The header for each CSV file is:

    [ ,portalid,id,downloaddate,metadata,overallq,qvalues,assessdate,dviews,downloads,engine,admindomain]

    where for each row (a portal's dataset) the following fields are defined as follows:

    • portalid: portal identifier
    • id: dataset identifier
    • downloaddate: date of data collection
    • overallq: overall quality values computed by applying the methodology presented in [1]
    • qvalues: json object containing the quality values computed for the 17 metrics presented in [1]
    • assessdate: date of quality assessment
    • dviews: number of total views for the dataset
    • downloads: number of total downloads for the dataset (made available only by the Colombia, HDX, and NASA portals)
    • engine: identifier of the supporting portal platform: 1(CKAN), 2 (Socrata)
    • admindomain: 1 (national), 3 (international)
    • metadata: the overall dataset's metadata downloaded via API from the portal according to the supporting platform schema

    [1] Neumaier, S.; Umbrich, J.; Polleres, A. Automated Quality Assessment of Metadata Across Open Data Portals. J. Data and Information Quality 2016, 8, 2:1–2:29. doi:10.1145/2964909
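The CSV layout described above can be read with Python's standard `csv` and `json` modules. The sketch below is one possible approach, not the authors' tooling: it builds a single made-up row (every value in it is hypothetical), then decodes the `qvalues` JSON object and maps the `engine` and `admindomain` codes to the labels given in the field list. It assumes an empty `downloads` cell means the portal does not publish download counts.

```python
import csv
import io
import json

# Column names from the documented header; the first column is unnamed.
FIELDS = ["", "portalid", "id", "downloaddate", "metadata", "overallq",
          "qvalues", "assessdate", "dviews", "downloads", "engine", "admindomain"]

ENGINES = {1: "CKAN", 2: "Socrata"}
DOMAINS = {1: "national", 3: "international"}


def make_sample_csv() -> str:
    """Build a one-row CSV with entirely hypothetical values."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(FIELDS)
    writer.writerow(["0", "1", "example-dataset", "2019-12-19",
                     json.dumps({"title": "Example"}), "0.42",
                     json.dumps({"metric_a": 0.5}), "2019-12-20",
                     "120", "", "1", "1"])
    return buffer.getvalue()


def load_rows(text: str) -> list[dict]:
    """Parse rows, decoding JSON fields and mapping numeric codes to labels."""
    rows = []
    for raw in csv.DictReader(io.StringIO(text)):
        rows.append({
            "portalid": raw["portalid"],
            "id": raw["id"],
            "overallq": float(raw["overallq"]),
            "qvalues": json.loads(raw["qvalues"]),
            "dviews": int(raw["dviews"]),
            # Assumption: an empty cell means downloads are not reported.
            "downloads": int(raw["downloads"]) if raw["downloads"] else None,
            "engine": ENGINES.get(int(raw["engine"]), "unknown"),
            "admindomain": DOMAINS.get(int(raw["admindomain"]), "unknown"),
        })
    return rows


rows = load_rows(make_sample_csv())
print(rows[0]["engine"], rows[0]["admindomain"], rows[0]["qvalues"])
```

Using `csv.DictReader` rather than splitting on commas matters here because the `metadata` and `qvalues` cells contain JSON, which itself includes commas and quotes.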

  20. Data from: Advanced technologies and data management practices in...

    • search.dataone.org
    • data.niaid.nih.gov
    • +1more
    Updated Apr 13, 2025
    Cite
    Rebecca R. Hernandez; Matthew S. Mayernik; Michelle L. Murphy-Mariscal; Michael F. Allen (2025). Advanced technologies and data management practices in environmental science: lessons from academia [Dataset]. http://doi.org/10.5061/dryad.cv86385c
    Dataset updated
    Apr 13, 2025
    Dataset provided by
    Dryad Digital Repository
    Authors
    Rebecca R. Hernandez; Matthew S. Mayernik; Michelle L. Murphy-Mariscal; Michael F. Allen
    Time period covered
    Jan 1, 2012
    Description

    Environmental scientists stand uniquely poised to capitalize on recent advancements in technology, computation, and data management; however, the degree to which this is occurring is unknown. We analyzed survey responses of 445 graduate students in California to evaluate understanding and use of such advances in the environmental sciences. Of students who had completed their degree, 64.3% had completed the data life cycle, 30.5% had archived research data so that it is available online, and 61.4% had no plans to create metadata for research data sets. Roughly one-third of students used an environmental sensor and collaborated with someone outside their expertise. Results varied by students’ research status and by university type. Doing excellent science in this data-intensive age may necessitate greater emphasis by university programs on data management best practices borrowed from information technology, and skills supplemented by unique training opportunities, courses, counsel fro...
