100+ datasets found

metadata
catalog.data.gov
datasets.ai
Updated Nov 12, 2020
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
U.S. EPA Office of Research and Development (ORD) (2020). metadata [Dataset]. https://catalog.data.gov/dataset/metadata-f2500
Explore at:
Dataset updated
Nov 12, 2020
Dataset provided by
United States Environmental Protection Agencyhttp://www.epa.gov/
Description
The dataset consists of public domain acute and chronic toxicity and chemistry data for algal species. Data are accessible at: https://envirotoxdatabase.org/ Data include algal species, chemical identification, and the concentrations that do and do not affect algal growth.
M
Metadata Management Tools Report
marketresearchforecast.com
doc, pdf, ppt
Updated Mar 21, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Market Research Forecast (2025). Metadata Management Tools Report [Dataset]. https://www.marketresearchforecast.com/reports/metadata-management-tools-46465
Explore at:
doc, ppt, pdfAvailable download formats
Dataset updated
Mar 21, 2025
Dataset authored and provided by
Market Research Forecast
License
https://www.marketresearchforecast.com/privacy-policyhttps://www.marketresearchforecast.com/privacy-policy
Time period covered
2025 - 2033
Area covered
Global
Variables measured
Market Size
Description
The Metadata Management Tools market is experiencing robust growth, driven by the increasing volume and complexity of data across various industries. The market, estimated at $15 billion in 2025, is projected to achieve a Compound Annual Growth Rate (CAGR) of 12% from 2025 to 2033, reaching approximately $40 billion by 2033. This expansion is fueled by several key factors. Firstly, the rising adoption of cloud-based solutions provides scalability and cost-effectiveness, attracting businesses of all sizes. Secondly, the stringent regulatory compliance needs across sectors like BFSI and healthcare necessitate robust metadata management for data governance and security. Furthermore, the growing demand for data-driven decision-making and advanced analytics increases the reliance on accurate and readily accessible metadata. Key trends include the integration of AI and machine learning for automated metadata discovery and classification, and the increasing demand for solutions offering enhanced data lineage capabilities. While the market faces restraints like the complexity of implementation and the need for skilled professionals, the overall positive market outlook is supported by continuous innovation and increasing enterprise awareness of the value proposition of effective metadata management. The market is segmented by deployment (cloud-based and on-premise) and application (BFSI, retail, medical, media, and others). Major players such as Oracle, SAP, IBM, and Informatica dominate the market, while several emerging players are also vying for market share through innovative solutions. The North American region currently holds the largest market share, followed by Europe and Asia Pacific. The competitive landscape is marked by both established players and innovative startups. Established players leverage their existing customer base and extensive product portfolios, while emerging companies often focus on niche solutions and advanced technologies. The market is witnessing increased mergers and acquisitions, strategic partnerships, and product advancements, indicative of a dynamic and competitive landscape. Future growth hinges on the ability of vendors to adapt to the evolving technological landscape, meet the growing need for data security and compliance, and provide user-friendly, scalable, and cost-effective solutions. The focus on data quality, interoperability, and governance will continue to shape the development and adoption of metadata management tools across industries. Geographical expansion, especially into developing economies, presents a significant opportunity for market growth.
Common Metadata Elements for Cataloging Biomedical Datasets
figshare.com
xlsx
Updated Jan 20, 2016
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Kevin Read (2016). Common Metadata Elements for Cataloging Biomedical Datasets [Dataset]. http://doi.org/10.6084/m9.figshare.1496573.v1
Explore at:
xlsxAvailable download formats
Unique identifier
https://doi.org/10.6084/m9.figshare.1496573.v1
Dataset updated
Jan 20, 2016
Dataset provided by
Figsharehttp://figshare.com/
figshare
Authors
Kevin Read
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset outlines a proposed set of core, minimal metadata elements that can be used to describe biomedical datasets, such as those resulting from research funded by the National Institutes of Health. It can inform efforts to better catalog or index such data to improve discoverability. The proposed metadata elements are based on an analysis of the metadata schemas used in a set of NIH-supported data sharing repositories. Common elements from these data repositories were identified, mapped to existing data-specific metadata standards from to existing multidisciplinary data repositories, DataCite and Dryad, and compared with metadata used in MEDLINE records to establish a sustainable and integrated metadata schema. From the mappings, we developed a preliminary set of minimal metadata elements that can be used to describe NIH-funded datasets. Please see the readme file for more details about the individual sheets within the spreadsheet.
Crunchyroll Meta-Data
kaggle.com
Updated Aug 15, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
BIT_Guber (2023). Crunchyroll Meta-Data [Dataset]. https://www.kaggle.com/datasets/bitguber/crunchyroll-meta-data
Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Aug 15, 2023
Dataset provided by
Kagglehttp://kaggle.com/
Authors
BIT_Guber
License
https://creativecommons.org/publicdomain/zero/1.0/https://creativecommons.org/publicdomain/zero/1.0/
Description
This is just prepared data from crunchyroll web scraped data using code line here I extracted meta-data from crunchyroll websites.

Before please visit Crunchyroll

Dataset contains 7 files

popular.csv

Each row represented a series in popular page. note: some information not updated ( I guess Crunchyroll not update is Popular table in Database )

series.csv

It's also have similar feature as popular.csv but updated data points.

seasons.csv

Each row represented a season from it's corresponding series.

episodes.csv

Information about individual episodes from it's corresponding series.

series_music.csv

Some series have featured music collection.

audio.json

Mapping full representation of audio version of episode dubbed.

categories.json

Mapping each categories of series ,it defined by crunchyroll.
h
github-meta-data
huggingface.co
Updated May 31, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
zamal_ (2025). github-meta-data [Dataset]. https://huggingface.co/datasets/zamal/github-meta-data
Explore at:
Dataset updated
May 31, 2025
Authors
zamal_
License
MIT Licensehttps://opensource.org/licenses/MIT
License information was derived automatically
Description
GitHub Meta Data

This dataset contains GitHub repository descriptions paired with their tags.

input: a natural language query or description of a GitHub project
target: comma-separated tags describing it

Used for training a T5 model for GitHub-style tag generation.
d
Data from: An open source framework for metadata exploration and discovery...
search.dataone.org
arcticdata.io
+1more
Updated Jul 17, 2020
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Christian Mattmann (2020). An open source framework for metadata exploration and discovery of Polar Data [Dataset]. http://doi.org/10.18739/A2R49G96H
Explore at:
Unique identifier
https://doi.org/10.18739/A2R49G96H
Dataset updated
Jul 17, 2020
Dataset provided by
Arctic Data Center
Authors
Christian Mattmann
Time period covered
Jan 1, 2015 - Jan 1, 2016
Area covered
Earth
Description
This project will deliver an open source framework for metadata exploration, automatic text mining and information retrieval of polar data that uses the Apache Tika technology. Apache Tika is currently the de facto "babel fish", aiding in the automatic MIME detection, text extraction, and metadata classification of over 1200 data formats. The PI will expand Tika to handle polar data and scientific data formats, making Polar data more easily available, searchable, and retrievable by all major content management systems. The proposed activity will lay the framework for a thorough automatically generated inventory of polar metadata and data. Expanding Tika to handle polar data will also naturally invite the technology/open source community to deal with polar use cases, helping to increase understanding of the arctic. The resultant software produced through effort will be disseminated to the software and polar communities through the Apache Software Foundation. A computer science graduate student and postdoc will be exposed to Cryosphere and Arctic data, helping to train the next generation of cross disciplinary data scientists in the domain. The PI's Search Engines (20-40 students annual enrollment) and Software Architecture (30-50 students annual enrollment) graduate courses at USC will benefit from the Arctic cyberinfrastructure use cases disseminated through course projects and lecture material. The PI will also work collaboratively with NSF-funded projects dealing with projects focusing on the archiving, discovery and access of polar data, such as ACADIS and the Antarctic Master Directory.
H
Data for: Identifying Metadata Quality Issues Across Cultures
dataverse.harvard.edu
search.dataone.org
Updated Jul 26, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Julie Shi; Mike Nason; Marco Tullney; Juan Pablo Alperin (2023). Data for: Identifying Metadata Quality Issues Across Cultures [Dataset]. http://doi.org/10.7910/DVN/GZI7IA
Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Unique identifier
https://doi.org/10.7910/DVN/GZI7IA
Dataset updated
Jul 26, 2023
Dataset provided by
Harvard Dataverse
Authors
Julie Shi; Mike Nason; Marco Tullney; Juan Pablo Alperin
License
CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
This sample was drawn from the Crossref API on March 8, 2022. The sample was constructed purposefully on the hypothesis that records with at least one known issue would be more likely to yield issues related to cultural meanings and identity. Records known or suspected to have at least one quality issue were selected by the authors and Crossref staff. The Crossref API was then used to randomly select additional records from the same prefix. Records in the sample represent 51 DOI prefixes that were chosen without regard for the manuscript management or publishing platform used, as well as 17 prefixes for journals known to use the Open Journal Systems manuscript management and publishing platform. OJS was specifically identified due to the authors' familiarity with the platform, its international and multilingual reach, and previous work on its metadata quality.
ERB case studies meta data
catalog.data.gov
Updated Feb 8, 2025
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
U.S. EPA Office of Research and Development (ORD) (2025). ERB case studies meta data [Dataset]. https://catalog.data.gov/dataset/erb-case-studies-meta-data
Explore at:
Dataset updated
Feb 8, 2025
Dataset provided by
United States Environmental Protection Agencyhttp://www.epa.gov/
Description
The data are qualitative data consisting of notes recorded during meetings, workshops, and other interactions with case study participants. This dataset is not publicly accessible because: EPA cannot release personally identifiable information regarding living individuals, according to the Privacy Act and the Freedom of Information Act (FOIA). This dataset contains information about human research subjects. Because there is potential to identify individual participants and disclose personal information, either alone or in combination with other datasets, individual level data are not appropriate to post for public access. Restricted access may be granted to authorized persons by contacting the party listed. It can be accessed through the following means: The data cannot be accessed by anyone outside of the research team because of the potential to identify human participants. Format: The data are qualitative data contained in Microsoft Word documents. This dataset is associated with the following publication: Eisenhauer, E., K. Maxwell, B. Kiessling, S. Henson, M. Matsler, R. Nee, M. Shacklette, M. Fry, and S. Julius. Inclusive engagement for equitable resilience: community case study insights. Environmental Research Communications. IOP Publishing, BRISTOL, UK, 6: 125012, (2024).
d
Priority Toxic Contaminant Metadata Inventory and Associated Total...
catalog.data.gov
data.usgs.gov
Updated Jul 6, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
U.S. Geological Survey (2024). Priority Toxic Contaminant Metadata Inventory and Associated Total Polychlorinated Biphenyls Concentration Data [Dataset]. https://catalog.data.gov/dataset/priority-toxic-contaminant-metadata-inventory-and-associated-total-polychlorinated-bipheny
Explore at:
Dataset updated
Jul 6, 2024
Dataset provided by
United States Geological Surveyhttp://www.usgs.gov/
Description
In June 2019, the U.S. Geological Survey Maryland-Delaware-District of Columbia Water Science Center (MD-DE-DC WSC) team began to collect and inventory available information on toxic contaminants within the Chesapeake Bay Watershed. State agencies were contacted to determine available data. Also, the National Water Information System (NWIS) and National Water Quality Database (NWQD) were queried to gather relevant data for the compilation. The resulting tables contain records for available sites where specific analyte groups, Hg (mercury), PCB (polychlorinated biphenyls), or pesticides, have been collected with appropriate supplemental metadata including media, method, time frame, and frequency of collection. Sample results span 1972-2019. Files included in the data release: Basic_Table.csv Detailed_Table.csv NWIS_PCodes.csv State_Result_Totals.csv NWIS_Result_Totals.csv
Data warehouse and metadata holdings relevant to Australias North West Shelf...
devweb.dga.links.com.au
researchdata.edu.au
html
Updated Mar 13, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
CSIRO Oceans & Atmosphere (2025). Data warehouse and metadata holdings relevant to Australias North West Shelf [Dataset]. https://devweb.dga.links.com.au/data/dataset/data-warehouse-and-metadata-holdings-relevant-to-australias-north-west-shelf
Explore at:
htmlAvailable download formats
Dataset updated
Mar 13, 2025
Dataset provided by
CSIROhttp://www.csiro.au/
Authors
CSIRO Oceans & Atmosphere
Area covered
Australia
Description
From the earliest stages of planning the North West Shelf Joint Environmental Management Study it was evident that good management of the scientific data to be used in the research would be important for the success of the Study. A comprehensive review of data sets and other information relevant to the marine ecosystems, the geology, infrastructure and industries of the North West Shelf area had been completed (Heyward et al. 2006). The Data Management Project was established to source and prepare existing data sets for use, requiring the development and use of a range of tools: metadata systems, data visualisation and data delivery applications. These were made available to collaborators to allow easy access to data obtained and generated by the Study. The CMAR MarLIN metadata system was used to document the 285 data sets, those which were identified as potentially useful for the Study and the software and information products generated by and for the Study. This report represents a hard copy atlas of all NWSJEMS data products and the existing data sets identified for potential use as inputs to the Study. It comprises summary metadata elements describing the data sets, their custodianship and how the data sets might be obtained. The identifiers of each data set can be used to refer to the full metadata records in the on-line MarLIN system.
Spatial enablement for data and metadata User Guides and Best Practices
devweb.dga.links.com.au
docx
Updated Jan 20, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Geoscience Australia (2025). Spatial enablement for data and metadata User Guides and Best Practices [Dataset]. https://devweb.dga.links.com.au/data/dataset/spatial-enablement-for-data-and-metadata-user-guides-and-best-practices
Explore at:
docxAvailable download formats
Dataset updated
Jan 20, 2025
Dataset authored and provided by
Geoscience Australiahttp://ga.gov.au/
Description
The pace, with which government agencies, researchers, industry, and the public need to react to national and international challenges of economic, environmental, and social natures, is constantly changing and rapidly increasing. Responses to the global COVID-19 pandemic event, the 2020 Australian bushfire and 2021 flood crisis situations are recent examples of these requirements. Decisions are no longer made on information or data coming from a single source or discipline or a solitary aspect of life: the issues of today are too complex. Solving complex issues requires seamless integration of data across multiple domains and understanding and consideration of potential impacts on businesses, the economy, and the environment. Modern technologies, easy access to information on the web, abundance of openly available data shifts is not enough to overcome previous limitations of dealing with data and information. Data and software have to be Findable, Accessible, Interoperable and Reusable (FAIR), processes have to be transparent, verifiable and trusted. The approaches toward data integration, analysis, evaluation, and access require rethinking to: - Support building flexible re-usable and re-purposeful data and information solutions serving multiple domains and communities. - Enable timely and effective delivery of complex solutions to enable effective decision and policy making. The unifying factor for these events is location: everything is happening somewhere at some time. Inconsistent representation of location (e.g. coordinates, statistical aggregations, and descriptions) and the use of multiple techniques to represent the same data creates difficulty in spatially integrating multiple data streams often from independent sources and providers. To use location for integration, location information needs to be embedded within the datasets and metadata, describing those datasets, so those datasets and metadata would become ‘spatially enabled’.
c
The global enterprise metadata management market size is USD 7.85 billion in...
cognitivemarketresearch.com
pdf,excel,csv,ppt
Updated Apr 15, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Cognitive Market Research (2025). The global enterprise metadata management market size is USD 7.85 billion in 2024 and will expand at a compound annual growth rate (CAGR) of 24.1% from 2024 to 2031. [Dataset]. https://www.cognitivemarketresearch.com/enterprise-metadata-management-market-report
Explore at:
pdf,excel,csv,pptAvailable download formats
Dataset updated
Apr 15, 2025
Dataset authored and provided by
Cognitive Market Research
License
https://www.cognitivemarketresearch.com/privacy-policyhttps://www.cognitivemarketresearch.com/privacy-policy
Time period covered
2021 - 2033
Area covered
Global
Description
According to Cognitive Market Research, the global enterprise metadata management market size is USD 7.85 billion in 2024 and will expand at a compound annual growth rate (CAGR) of 24.1% from 2024 to 2031. Market Dynamics of Enterprise Metadata Management Market

Key Drivers for Enterprise Metadata Management Market

Rapidly expanding data sets- The market growth is fueled by enterprise metadata management. Enterprises need to manage and understand their massive and varied datasets as the amount of data generated by these entities continues to grow at an exponential rate. The management of structured and unstructured data is becoming more complicated as organizations gather massive volumes of data from many sources. Enterprise metadata management is crucial for comprehending data context, linkages, and usage; enterprise metadata management offers a framework for organizing, characterizing, and controlling data using metadata. Moreover, improved data quality, easier data integration, and system-wide consistency are all results of well-managed metadata. Better decision-making and operational efficiency can be achieved when firms use enterprise metadata management because it increases data discoverability, streamlines data processes, and supports advanced analytics. The demand for enterprise metadata management is being driven by these markets becoming more popular because of the growth of big data and advanced analytics tools.

Key Restraints for Enterprise Metadata Management Market

The enterprise metadata management industry is restricted due to a high implementation cost. The implementation and maintenance of enterprise metadata management solutions can be impeded by a lack of trained specialists in this industry.

za Introduction of the Enterprise Metadata Management Market

Enterprise metadata management is the process of managing all of an organization’s information. Metadata is information about other data that gives it organization, meaning, and context. Better management of data, following the rules, and better decisions are all made easier by enterprise metadata management, which makes sure that data is correctly defined and easy to find. The necessity for improved data governance and strict adherence to regulations is mostly driving the global enterprise metadata management market. The demand for enterprise metadata management is also being propelled by the increasingly digital landscape and the widespread use of advanced analytics. In addition, because it aids in managing and securing the metadata created and stored, blockchain technology is gaining traction across many industries, opening up enormous possibilities for enterprise metadata management. As a result, there will likely be a meteoric rise in the business metadata management industry. Issues with data consistency across numerous channels provide a challenge for both business users and IT departments in the enterprise metadata management market.
Dataset relating a study on Geospatial Open Data usage and metadata quality
zenodo.org
csv
Updated Jun 19, 2023
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Alfonso Quarati; Alfonso Quarati (2023). Dataset relating a study on Geospatial Open Data usage and metadata quality [Dataset]. http://doi.org/10.5281/zenodo.4584542
Explore at:
csvAvailable download formats
Unique identifier
https://doi.org/10.5281/zenodo.4584542
Dataset updated
Jun 19, 2023
Dataset provided by
Zenodohttp://zenodo.org/
Authors
Alfonso Quarati; Alfonso Quarati
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The Open Government Data portals (OGD) thanks to the presence of thousands of geo-referenced datasets, containing spatial information, are of extreme interest for any analysis or process relating to the territory. For this to happen, users must be enabled to access these datasets and reuse them. An element often considered hindering the full dissemination of OGD data is the quality of their metadata. Starting from an experimental investigation conducted on over 160,000 geospatial datasets belonging to six national and international OGD portals, this work has as its first objective to provide an overview of the usage of these portals measured in terms of datasets views and downloads. Furthermore, to assess the possible influence of the quality of the metadata on the use of geospatial datasets, an assessment of the metadata for each dataset was carried out, and the correlation between these two variables was measured. The results obtained showed a significant underutilization of geospatial datasets and a generally poor quality of their metadata. Besides, a weak correlation was found between the use and quality of the metadata, not such as to assert with certainty that the latter is a determining factor of the former.

The dataset consists of six zipped CSV files, containing the collected datasets' usage data, full metadata, and computed quality values, for about 160,000 geospatial datasets belonging to the three national and three international portals considered in the study, i.e. US (catalog.data.gov), Colombia (datos.gov.co), Ireland (data.gov.ie), HDX (data.humdata.org), EUODP (data.europa.eu), and NASA (data.nasa.gov).

Data collection occurred in the period: 2019-12-19 -- 2019-12-23.

The header for each CSV file is:

[ ,portalid,id,downloaddate,metadata,overallq,qvalues,assessdate,dviews,downloads,engine,admindomain]

where for each row (a portal's dataset) the following fields are defined as follows:

portalid: portal identifier

id: dataset identifier

downloaddate: date of data collection

overallq: overall quality values computed by applying the methodology presented in [1]

qvalues: json object containing the quality values computed for the 17 metrics presented in [1]

assessdate: date of quality assessment

dviews: number of total views for the dataset

downloads: number of total downloads for the dataset (made available only by the Colombia, HDX, and NASA portals)

engine: identifier of the supporting portal platform: 1(CKAN), 2 (Socrata)

admindomain: 1 (national), 3 (international)

metadata: the overall dataset's metadata downloaded via API from the portal according to the supporting platform schema

[1] Neumaier, S.; Umbrich, J.; Polleres, A. Automated Quality Assessment of Metadata Across Open Data Portals.J. Data and Information Quality2016,8, 2:1–2:29. doi:10.1145/2964909
H
Data from: A general purpose tool-set for representing data relationships:...
dataverse.harvard.edu
Updated May 4, 2018
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Joshua Stillerman, Thomas Fredian, Martin Greenwald, John Wright (2018). A general purpose tool-set for representing data relationships: Converting data into knowledge [Dataset]. http://doi.org/10.7910/DVN/SHYWLB
Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Unique identifier
https://doi.org/10.7910/DVN/SHYWLB
Dataset updated
May 4, 2018
Dataset provided by
Harvard Dataverse
Authors
Joshua Stillerman, Thomas Fredian, Martin Greenwald, John Wright
License
https://dataverse.harvard.edu/api/datasets/:persistentId/versions/1.0/customlicense?persistentId=doi:10.7910/DVN/SHYWLBhttps://dataverse.harvard.edu/api/datasets/:persistentId/versions/1.0/customlicense?persistentId=doi:10.7910/DVN/SHYWLB
Description
Rich metadata is required to find and understand the recorded measurements from modern experiments with their immense and complex data stores. Systems to store and manage these metadata have improved over time, but in most cases are ad-hoc collections of data relationships, often represented in domain or site specific application code. We are developing a general set of tools to store, manage, and retrieve datarelationship metadata. These tools will be agnostic to the underlying data storage mechanisms, and to the data stored in them, making the system applicable across a wide range of science domains. Data management tools typically represent at least one relationship paradigm through implicit or explicit metadata. The addition of these metadata allows the data to be searched and understood by larger groups of users over longer periods of time. Using these systems, researchers are less dependent on one on one communication with the scientists involved in running the experiments, nor to rely on their ability to remember the details of their data. In the magnetic fusion research community, the MDSplus system is widely used to record raw and processed data from experiments. Users create a hierarchical relationship tree for each instance of their experiment, allowing them to record the meanings of what is recorded. Most users of this system, add to this a set of ad-hoc tools to help users locate specific experiment runs, which they can then access via this hierarchical organization. However, the MDSplus tree is only one possible organization of the records, and these additional applications that relate the experiment 'shots' into run days, experimental proposals, logbook entries, run summaries, analysis work flow, publications, etc. have up until now, been implemented on an experiment by experiment basis. The Metadata Provenance Ontology project, MPO, is a system built to record data provenance information about computed results. It allows users to record the inputs and outputs from each step of their computational workflows, in particular, what raw and processed data were used as inputs, what codes were run and what results were produced. The resulting collections of provenance graphs can be annotated, grouped, searched, filtered and browsed. This provides a powerful tool to record, understand, and locate computed results. However, this can be understood as one more specific data relationship, which can be construed as an instance of something more general. Building on concepts developed in these projects, we are developing a general system that could be used to represent all of these kinds of data relationships as mathematical graphs. Just as MDSplus and MPO were generalizations of data management needs for a collection of users, this new system will generalize the storage, location, and retrieval of the relationships between data. The system will store data relationships as data, not encoded in a set of application specific programs or ad hoc data structures. Stored data, would be referred to by URIs allowing the system to be agnostic to the underlying data representations. Users can then traverse these graphs. The system will allow users to construct a collection of graphs describing ANY OR ALL OF the relationships between data items, locate interesting data, see what other graphs these data are members of and navigate into and through them.
o
Making the case for FAIR Data Points
explore.openaire.eu
Updated Apr 8, 2022
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Angus Whyte; Ryan O'Connor; Josefine Nordling (2022). Making the case for FAIR Data Points [Dataset]. http://doi.org/10.5281/zenodo.6256839
Explore at:
Unique identifier
https://doi.org/10.5281/zenodo.6256839
Dataset updated
Apr 8, 2022
Authors
Angus Whyte; Ryan O'Connor; Josefine Nordling
Description
As a service manager how may I assist my organisation to make research data we hold both FAIR and “as open as possible, as closed as necessary”? The FAIR Data Point is a protocol for (meta)data provision championed by GO-FAIR as a solution to this need. In this story we describe how two organisations have applied the FAIR Data Point (FDP) to provide FAIR data or metadata in two contexts. In Leiden University Medical Centre the FDP is used to make metadata about COVID patient data as open as possible in the interest of research, while the data is necessarily closed and held in a variety of different systems. By contrast, Dutch data service provider SURF is applying the FDP to improve the FAIRness of an extensive dataset repository that is openly accessible by default. Based on interviews with the lead protagonists in both organisations' FDP implementations we compare their rationales and approaches, and how they expect this FAIR-enabling technology to benefit their user communities.
d
Quantifying FAIR: metadata improvement and guidance in the DataONE...
dataone.org
search.dataone.org
+1more
Updated Sep 14, 2019
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Matthew Jones; Peter Slaughter; Ted Habermann (2019). Quantifying FAIR: metadata improvement and guidance in the DataONE repository network [Dataset]. https://dataone.org/datasets/urn%3Auuid%3A2b7ce46c-18ed-4afd-89ad-578102214230
Explore at:
Dataset updated
Sep 14, 2019
Dataset provided by
Knowledge Network for Biocomplexity
Authors
Matthew Jones; Peter Slaughter; Ted Habermann
Time period covered
Nov 3, 2003 - May 8, 2019
Area covered
Earth
Description
DataONE has consistently focused on interoperability among data repositories to enable seamless access to well-described data on the Earth and the environment. Our existing services promote data discovery and access through harmonization of the diverse metadata specifications used across communities, and through our integrated data search portal and services. In terms of the FAIR principles, we have done a good job at Findable and Accessible, while as a community we have placed less emphasis on Interoperable and Reusable. We present new DataONE services for quantitatively providing guidance on metadata completeness and effectiveness relative to the FAIR principles. The services produce guidance for FAIRness at both the level of an individual data set and trends through time for repository, user, and funder data collections. These analytical results regarding conformance to FAIR principles are preliminary and based on proposed quantitative assessment metrics for FAIR which will be changed with input from the community. The current statistics are based on version 0.2.0 of the DataONE FAIR suite. Thus, these results should not be viewed as conclusive about the data sets presented, but rather illustrate the types of quantitative comparisons that will be able to be made when the FAIR metrics at DataONE have been finalized.
BLM OR Cadastral PLSS Metadata Glance Polygon Hub
catalog.data.gov
gimi9.com
+1more
Updated Dec 12, 2024
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Bureau of Land Management (2024). BLM OR Cadastral PLSS Metadata Glance Polygon Hub [Dataset]. https://catalog.data.gov/dataset/blm-or-cadastral-plss-metadata-glance-polygon-hub-7b89b
Explore at:
Dataset updated
Dec 12, 2024
Dataset provided by
Bureau of Land Managementhttp://www.blm.gov/
Description
MetadataGlance: MetadataGlance provides PLSS data steward content for individual PLSS units.This dataset represents the GIS Version of the Public Land Survey System including both rectangular and non-rectangular surveys. The primary source for the data is cadastral survey records housed by the BLM supplemented with local records and geographic control coordinates from states, counties as well as other federal agencies such as the USGS and USFS. The data has been converted from source documents to digital form and transferred into a GIS format that is compliant with FGDC Cadastral Data Content Standards and Guidelines for publication. This data is optimized for data publication and sharing rather than for specific "production" or operation and maintenance. This data set includes the following: PLSS Fully Intersected (all of the PLSS feature at the atomic or smallest polygon level), PLSS Townships, First Divisions and Second Divisions (the hierarchical break down of the PLSS Rectangular surveys) PLSS Special surveys (non rectangular components of the PLSS) Meandered Water, Corners and Conflicted Areas (known areas of gaps or overlaps between Townships or state boundaries). The Entity-Attribute section of this metadata describes these components in greater detail.
d
Overview Metadata for the Data used in te Conceptual and Numerical Model of...
catalog.data.gov
gimi9.com
+1more
Updated Jul 6, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
U.S. Geological Survey (2024). Overview Metadata for the Data used in te Conceptual and Numerical Model of the Colorado River (1990-2016) [Dataset]. https://catalog.data.gov/dataset/overview-metadata-for-the-data-used-in-te-conceptual-and-numerical-model-of-the-color-1990
Explore at:
Dataset updated
Jul 6, 2024
Dataset provided by
U.S. Geological Survey
Area covered
Colorado River
Description
This data release contains six different datasets that were used in the report SIR 2018-5108. These datasets contain discharge data, discrete dissolved-solids data, quality-control discrete dissolved data, and computed mean dissolved solids data that were collected at various locations between the Hoover Dam and the Imperial Dam. Study Sites: Site 1: Colorado River below Hoover Dam Site 2: Bill Williams River near Parker Site 3: Colorado River below Parker Dam Site 4: CRIR Main Canal Site 5: Palo Verde Canal Site 6: Colorado River at Palo Verde Dam Site 7: CRIR Lower Main Drain Site 8: CRIR Upper Levee Drain Site 9: PVID Outfall Drain Site 10: Colorado River above Imperial Dam Discrete Dissolved-solids Dataset and Replicate Samples for Discrete Dissolved-solids Dataset: The Bureau of Reclamation collected discrete water-quality samples for the parameter of dissolved-solids (sum of constituents). Dissolved-solids, measured in milligrams per liter, are the sum of the following constituents: bicarbonate, calcium, carbonate, chloride, fluoride, magnesium, nitrate, potassium, silicon dioxide, sodium, and sulfate. These samples were collected on a monthly to bimonthly basis at various time periods between 1990 and 2016 at Sites 1-5 and Sites 7-10. No data were collected for Site 6: Colorado River at Palo Verde Dam. The Bureau of Reclamation and the USGS collected discrete quality-control replicate samples for the parameter of dissolved-solids, sum of constituents measured in milligrams per liter. The USGS collected discrete quality-control replicate samples in 2002 and 2003 and the Bureau of Reclamation collected discrete quality-control replicate samples in 2016 and 2017. Listed below are the sites where these samples were collected at and which agency collected the samples. Site 3: Colorado River below Parker Dam: USGS and Reclamation Site 4: CRIR Main Canal: Reclamation Site 5: Palo Verde Canal: Reclamation Site 7: CRIR Lower Main Drain: Reclamation Site 8: CRIR Upper Levee Drain: Reclamation Site 9: PVID Outfall Drain: Reclamation Site 10: Colorado River above Imperial Dam: USGS and Reclamation Monthly Mean Datasets and Mean Monthly Datasets: Monthly mean discharge data (cfs), flow weighted monthly mean dissolved-solids concentrations (mg/L) data and monthly mean dissolved-solids load data from 1990 to 2016 were computed using raw data from the USGS and the Bureau of Reclamation. This data were computed for all 10 sites. Flow weighted monthly mean dissolved-solids concentration and monthly mean dissolved-solids load were not computed for Site 2: Bill Williams River near Parker. The monthly mean datasets that were calculated for each month for the period between 1990 and 2016 were used to compute the mean monthly discharge and the mean monthly dissolved-solids load for each of the 12 months within a year. Each monthly mean was weighted by how many days were in the month and then averaged for each of the twelve months. This was computed for all 10 sites except mean monthly dissolved-solids load were not computed at Site 2: Bill Williams River near Parker. Site 8a: Colorado River between Parker and Palo Verde Valleys was computed by summing the data from sites 6, 7 and 8. Bill Williams Daily Mean Discharge, Instantaneous Dissolved-solids Concentration, and Daily Means Dissolved-solids Load Dataset: Daily mean discharge (cfs), instantaneous solids concentration (mg/L), and daily mean dissolved solids load were calculated using raw data collected by the USGS and the Bureau of Reclamation. This data were calculated for Site 2: Bill Williams River near Parker for the period of January 1990 to February 2016. Palo Verde Irrigation District Outfall Drain Mean Daily Discharge Dataset: The Bureau of Reclamation collected mean daily discharge data for the period of 01/01/2005 to 09/30/2016 at the Palo Verde Irrigation District (PVID) outfall drain using a stage-discharge relationship.
i
Monitor and extract metadata from FPLC-generated data
hub.ibisba.eu
ibisbahub.eu
zip
Updated Feb 25, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Mauro Di Fenza (2025). Monitor and extract metadata from FPLC-generated data [Dataset]. https://hub.ibisba.eu/data_files/125
Explore at:
zip(17.8 KB)Available download formats
Dataset updated
Feb 25, 2025
Authors
Mauro Di Fenza
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The monitor_and_extract_metadata.py script is designed to monitor a specified parent folder for new subfolders containing a Result.xml file. It extracts selected metadata from the Result.xml file and saves this metadata in both JSON and XML formats within the same subfolder.
v
Data from: Ethical Data Management
data.virginiabeach.gov
data.virginia.gov
Updated Nov 23, 2022
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
City of Virginia Beach - Online Mapping (2022). Ethical Data Management [Dataset]. https://data.virginiabeach.gov/documents/2949ba73014d49fba67bb7717280a8aa
Explore at:
Dataset updated
Nov 23, 2022
Dataset authored and provided by
City of Virginia Beach - Online Mapping
Description
Ethical Data ManagementExecutive SummaryIn the age of data and information, it is imperative that the City of Virginia Beach strategically utilize its data assets. Through expanding data access, improving quality, maintaining pace with advanced technologies, and strengthening capabilities, IT will ensure that the city remains at the forefront of digital transformation and innovation. The Data and Information Management team works under the purpose:“To promote a data-driven culture at all levels of the decision making process by supporting and enabling business capabilities with relevant and accurate information that can be accessed securely anytime, anywhere, and from any platform.”To fulfill this mission, IT will implement and utilize new and advanced technologies, enhanced data management and infrastructure, and will expand internal capabilities and regional collaboration.Introduction and JustificationThe Information technology (IT) department’s resources are integral features of the social, political and economic welfare of the City of Virginia Beach residents. In regard to local administration, the IT department makes it possible for the Data and Information Management Team to provide the general public with high-quality services, generate and disseminate knowledge, and facilitate growth through improved productivity.For the Data and Information Management Team, it is important to maximize the quality and security of the City’s data; to develop and apply the coherent management of information resources and management policies that aim to keep the general public constantly informed, protect their rights as subjects, improve the productivity, efficiency, effectiveness and public return of its projects and to promote responsible innovation. Furthermore, as technology evolves, it is important for public institutions to manage their information systems in such a way as to identify and minimize the security and privacy risks associated with the new capacities of those systems.The responsible and ethical use of data strategy is part of the City’s Master Technology Plan 2.0 (MTP), which establishes the roadmap designed by improve data and information accessibility, quality, and capabilities throughout the entire City. The strategy is being put into practice in the shape of a plan that involves various programs. Although these programs was specifically conceived as a conceptual framework for achieving a cultural change in terms of the public perception of data, it basically covers all the aspects of the MTP that concern data, and in particular the open-data and data-commons strategies, data-driven projects, with the aim of providing better urban services and interoperability based on metadata schemes and open-data formats, permanent access and data use and reuse, with the minimum possible legal, economic and technological barriers within current legislation.Fundamental valuesThe City of Virginia Beach’s data is a strategic asset and a valuable resource that enables our local government carry out its mission and its programs effectively. Appropriate access to municipal data significantly improves the value of the information and the return on the investment involved in generating it. In accordance with the Master Technology Plan 2.0 and its emphasis on public innovation, the digital economy and empowering city residents, this data-management strategy is based on the following considerations.Within this context, this new management and use of data has to respect and comply with the essential values applicable to data. For the Data and Information Team, these values are:Shared municipal knowledge. Municipal data, in its broadest sense, has a significant social dimension and provides the general public with past, present and future knowledge concerning the government, the city, society, the economy and the environment.The strategic value of data. The team must manage data as a strategic value, with an innovative vision, in order to turn it into an intellectual asset for the organization.Geared towards results. Municipal data is also a means of ensuring the administration’s accountability and transparency, for managing services and investments and for maintaining and improving the performance of the economy, wealth and the general public’s well-being.Data as a common asset. City residents and the common good have to be the central focus of the City of Virginia Beach’s plans and technological platforms. Data is a source of wealth that empowers people who have access to it. Making it possible for city residents to control the data, minimizing the digital gap and preventing discriminatory or unethical practices is the essence of municipal technological sovereignty.Transparency and interoperability. Public institutions must be open, transparent and responsible towards the general public. Promoting openness and interoperability, subject to technical and legal requirements, increases the efficiency of operations, reduces costs, improves services, supports needs and increases public access to valuable municipal information. In this way, it also promotes public participation in government.Reuse and open-source licenses. Making municipal information accessible, usable by everyone by default, without having to ask for prior permission, and analyzable by anyone who wishes to do so can foster entrepreneurship, social and digital innovation, jobs and excellence in scientific research, as well as improving the lives of Virginia Beach residents and making a significant contribution to the city’s stability and prosperity.Quality and security. The city government must take firm steps to ensure and maximize the quality, objectivity, usefulness, integrity and security of municipal information before disclosing it, and maintain processes to effectuate requests for amendments to the publicly-available information.Responsible organization. Adding value to the data and turning it into an asset, with the aim of promoting accountability and citizens’ rights, requires new actions, new integrated procedures, so that the new platforms can grow in an organic, transparent and cross-departmental way. A comprehensive governance strategy makes it possible to promote this revision and avoid redundancies, increased costs, inefficiency and bad practices.Care throughout the data’s life cycle. Paying attention to the management of municipal registers, from when they are created to when they are destroyed or preserved, is an essential part of data management and of promoting public responsibility. Being careful with the data throughout its life cycle combined with activities that ensure continued access to digital materials for as long as necessary, help with the analytic exploitation of the data, but also with the responsible protection of historic municipal government registers and safeguarding the economic and legal rights of the municipal government and the city’s residents.Privacy “by design”. Protecting privacy is of maximum importance. The Data and Information Management Team has to consider and protect individual and collective privacy during the data life cycle, systematically and verifiably, as specified in the general regulation for data protection.Security. Municipal information is a strategic asset subject to risks, and it has to be managed in such a way as to minimize those risks. This includes privacy, data protection, algorithmic discrimination and cybersecurity risks that must be specifically established, promoting ethical and responsible data architecture, techniques for improving privacy and evaluating the social effects. Although security and privacy are two separate, independent fields, they are closely related, and it is essential for the units to take a coordinated approach in order to identify and manage cybersecurity and risks to privacy with applicable requirements and standards.Open Source. It is obligatory for the Data and Information Management Team to maintain its Open Data- Open Source platform. The platform allows citizens to access open data from multiple cities in a central location, regional universities and colleges to foster continuous education, and aids in the development of data analytics skills for citizens. Continuing to uphold the Open Source platform with allow the City to continually offer citizens the ability to provide valuable input on the structure and availability of its data. Strategic areasIn order to deploy the strategy for the responsible and ethical use of data, the following areas of action have been established, which we will detail below, together with the actions and emblematic projects associated with them.In general, the strategy pivots on the following general principals, which form the basis for the strategic areas described in this section.Data sovereigntyOpen data and transparencyThe exchange and reuse of dataPolitical decision-making informed by dataThe life cycle of data and continual or permanent accessData GovernanceData quality and accessibility are crucial for meaningful data analysis, and must be ensured through the implementation of data governance. IT will establish a Data Governance Board, a collaborative organizational capability made up of the city’s data and analytics champions, who will work together to develop policies and practices to treat and use data as a strategic asset.Data governance is the overall management of the availability, usability, integrity and security of data used in the city. Increased data quality will positively impact overall trust in data, resulting in increased use and adoption. The ownership, accessibility, security, and quality, of the data is defined and maintained by the Data Governance Board.To improve operational efficiency, an enterprise-wide data catalog will be created to inventory data and track metadata from various data sources to allow for rapid data asset discovery. Through the data catalog, the city will

Facebook

Twitter

Click to copy link

Link copied

Cite

U.S. EPA Office of Research and Development (ORD) (2020). metadata [Dataset]. https://catalog.data.gov/dataset/metadata-f2500

metadata

Explore at:

Dataset updated

Nov 12, 2020

Dataset provided by

United States Environmental Protection Agencyhttp://www.epa.gov/

Description

The dataset consists of public domain acute and chronic toxicity and chemistry data for algal species. Data are accessible at: https://envirotoxdatabase.org/ Data include algal species, chemical identification, and the concentrations that do and do not affect algal growth.

Clear search

Close search

Google apps

Main menu

metadata

Metadata Management Tools Report

Common Metadata Elements for Cataloging Biomedical Datasets

Crunchyroll Meta-Data

Before please visit Crunchyroll

Dataset contains 7 files

popular.csv

series.csv

seasons.csv

episodes.csv

series_music.csv

audio.json

categories.json

github-meta-data

Data from: An open source framework for metadata exploration and discovery...

Data for: Identifying Metadata Quality Issues Across Cultures

ERB case studies meta data

Priority Toxic Contaminant Metadata Inventory and Associated Total...

Data warehouse and metadata holdings relevant to Australias North West Shelf...

Spatial enablement for data and metadata User Guides and Best Practices

The global enterprise metadata management market size is USD 7.85 billion in...

Dataset relating a study on Geospatial Open Data usage and metadata quality

Data from: A general purpose tool-set for representing data relationships:...

Making the case for FAIR Data Points

Quantifying FAIR: metadata improvement and guidance in the DataONE...

BLM OR Cadastral PLSS Metadata Glance Polygon Hub

Overview Metadata for the Data used in te Conceptual and Numerical Model of...

Monitor and extract metadata from FPLC-generated data

Data from: Ethical Data Management

metadataSee More Versions

metadata