The dataset consists of public domain acute and chronic toxicity and chemistry data for algal species. Data are accessible at: https://envirotoxdatabase.org/ Data include algal species, chemical identification, and the concentrations that do and do not affect algal growth.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Information on samples submitted for RNAseq
Rows are individual samples
Columns are: ID, Sample Name, Date sampled, Species, Sex, Tissue, Geographic location, Date extracted, Extracted by, Nanodrop Conc. (ng/µl), 260/280, 260/230, RIN, Plate ID, Position, Index name, Index Seq, Qubit BR kit Conc. (ng/µl), BioAnalyzer Conc. (ng/µl), BioAnalyzer bp (region 200-1200), Submission reference, Date submitted, Conc. (nM), Volume provided, PE/SE, Number of reads, Read length
Stores physical and logical information about relational databases and record structures to assist in data identification and management.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset outlines a proposed set of core, minimal metadata elements that can be used to describe biomedical datasets, such as those resulting from research funded by the National Institutes of Health. It can inform efforts to better catalog or index such data to improve discoverability. The proposed metadata elements are based on an analysis of the metadata schemas used in a set of NIH-supported data sharing repositories. Common elements from these data repositories were identified, mapped to existing data-specific metadata standards and to existing multidisciplinary data repositories (DataCite and Dryad), and compared with metadata used in MEDLINE records to establish a sustainable and integrated metadata schema. From the mappings, we developed a preliminary set of minimal metadata elements that can be used to describe NIH-funded datasets. Please see the readme file for more details about the individual sheets within the spreadsheet.
Mexico Marine Research Metadata Database
This project compiled metadata on available datasets produced by marine research in Mexico. The data are categorized by region, theme, species (when applicable), and research fields. This dataset corresponds to the associated peer-reviewed paper; the living database can be accessed at http://infoceanos.conabio.gob.mx.
Mexico Metadata Database DataDryad.csv
https://www.marketresearchforecast.com/privacy-policy
The Metadata Management Tools market is experiencing robust growth, driven by the increasing volume and complexity of data across various industries. The market, estimated at $15 billion in 2025, is projected to achieve a Compound Annual Growth Rate (CAGR) of 12% from 2025 to 2033, reaching approximately $40 billion by 2033. This expansion is fueled by several key factors. Firstly, the rising adoption of cloud-based solutions provides scalability and cost-effectiveness, attracting businesses of all sizes. Secondly, the stringent regulatory compliance needs across sectors like BFSI and healthcare necessitate robust metadata management for data governance and security. Furthermore, the growing demand for data-driven decision-making and advanced analytics increases the reliance on accurate and readily accessible metadata. Key trends include the integration of AI and machine learning for automated metadata discovery and classification, and the increasing demand for solutions offering enhanced data lineage capabilities. While the market faces restraints like the complexity of implementation and the need for skilled professionals, the overall positive market outlook is supported by continuous innovation and increasing enterprise awareness of the value proposition of effective metadata management. The market is segmented by deployment (cloud-based and on-premise) and application (BFSI, retail, medical, media, and others). Major players such as Oracle, SAP, IBM, and Informatica dominate the market, while several emerging players are also vying for market share through innovative solutions. The North American region currently holds the largest market share, followed by Europe and Asia Pacific. The competitive landscape is marked by both established players and innovative startups. Established players leverage their existing customer base and extensive product portfolios, while emerging companies often focus on niche solutions and advanced technologies. 
The market is witnessing increased mergers and acquisitions, strategic partnerships, and product advancements, indicative of a dynamic and competitive landscape. Future growth hinges on the ability of vendors to adapt to the evolving technological landscape, meet the growing need for data security and compliance, and provide user-friendly, scalable, and cost-effective solutions. The focus on data quality, interoperability, and governance will continue to shape the development and adoption of metadata management tools across industries. Geographical expansion, especially into developing economies, presents a significant opportunity for market growth.
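The growth figures quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming the 12% CAGR compounds annually over the eight years from 2025 to 2033; the function name `project_value` is illustrative and not taken from the report.

```python
def project_value(start_value: float, cagr: float, years: int) -> float:
    """Compound a starting value at a constant annual growth rate."""
    return start_value * (1.0 + cagr) ** years


# Figures quoted in the description: USD 15 billion in 2025, 12% CAGR.
# Assumption: eight compounding periods (2025 -> 2033).
projected_2033 = project_value(15.0, 0.12, 2033 - 2025)
print(f"Projected 2033 market size: ${projected_2033:.1f} billion")
# prints "Projected 2033 market size: $37.1 billion"
```

Under these assumptions the projection comes to roughly $37 billion, somewhat below the "approximately $40 billion" stated in the description.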
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
GitHub Meta Data
This dataset contains GitHub repository descriptions paired with their tags.
input: a natural language query or description of a GitHub project
target: comma-separated tags describing it
Used for training a T5 model for GitHub-style tag generation.
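A record in this input/target layout could be handled roughly as follows. This is only a sketch: the example record is made up, and the lowercasing and de-duplication in `parse_tags` (a hypothetical helper) are assumptions rather than documented preprocessing steps of the dataset.

```python
from typing import List


def parse_tags(target: str) -> List[str]:
    """Split a comma-separated tag string into a clean, de-duplicated list."""
    seen = set()
    tags = []
    for raw in target.split(","):
        tag = raw.strip().lower()  # assumption: tags are case-insensitive
        if tag and tag not in seen:
            seen.add(tag)
            tags.append(tag)
    return tags


# Made-up record for illustration only; not taken from the dataset.
record = {
    "input": "A command-line tool for resizing images in bulk",
    "target": "cli, image-processing, Python, cli",
}

print(parse_tags(record["target"]))
```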
https://www.gnu.org/licenses/gpl-3.0.html
Gms-index-mediator is a standalone index for spatio-temporal data that acts as a mediator between an application and a database. Even modern databases need several minutes to execute a spatio-temporal query on huge tables containing several million entries. Our index-mediator speeds up the execution of such queries by several orders of magnitude, resulting in response times of around 100 ms. This version is tailored towards the GeoMultiSens database, but can be adapted to work with custom table layouts with reasonable effort.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The usefulness of metadata in the automatic version of ACMANTv5 was tested.
A benchmark database has been developed, which consists of 41 datasets
of 20,500 networks of 170,000 synthetic monthly temperature time series
and the related metadata dates. The research was supported by the
Catalan Meteorological Service. The research results will be published
in the open access MDPI journal Atmosphere.
See more in the "Readme.txt" file of the dataset.
https://www.datainsightsmarket.com/privacy-policy
Market Size and Growth: The global Metadata Management Software market was valued at USD XXX million in 2025 and is projected to grow at a CAGR of XX% from 2025 to 2033, reaching USD XXX million by the end of the forecast period. The increasing demand for efficient and accurate data management, coupled with the growing adoption of cloud-based solutions, are key drivers of this growth. The market is segmented by application (data governance, data integration, data quality, data security, and others) and type (structured, unstructured, and semi-structured). North America and Europe are currently the dominant regional markets, while Asia Pacific is expected to witness significant growth in the coming years. Key Trends and Challenges: One of the major trends in the Metadata Management Software market is the rise of artificial intelligence (AI) and machine learning (ML). AI-powered tools can automate metadata extraction, classification, and analysis tasks, reducing manual effort and improving accuracy. Another trend is the adoption of semantic technologies, which allow organizations to create more meaningful connections between different types of data. However, challenges such as data privacy and security concerns, as well as the lack of skilled professionals, could hinder market growth.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description not specified.
GOLD (Genomes OnLine Database) is a resource for centralized monitoring of genome and metagenome projects worldwide. It stores information on complete and ongoing projects, along with their associated metadata. This collection references metadata associated with samples.
This project will deliver an open source framework for metadata exploration, automatic text mining and information retrieval of polar data that uses the Apache Tika technology. Apache Tika is currently the de facto "babel fish", aiding in the automatic MIME detection, text extraction, and metadata classification of over 1200 data formats. The PI will expand Tika to handle polar data and scientific data formats, making polar data more easily available, searchable, and retrievable by all major content management systems. The proposed activity will lay the framework for a thorough, automatically generated inventory of polar metadata and data. Expanding Tika to handle polar data will also naturally invite the technology/open source community to deal with polar use cases, helping to increase understanding of the Arctic. The resultant software produced through this effort will be disseminated to the software and polar communities through the Apache Software Foundation. A computer science graduate student and postdoc will be exposed to Cryosphere and Arctic data, helping to train the next generation of cross-disciplinary data scientists in the domain. The PI's Search Engines (20-40 students annual enrollment) and Software Architecture (30-50 students annual enrollment) graduate courses at USC will benefit from the Arctic cyberinfrastructure use cases disseminated through course projects and lecture material. The PI will also work collaboratively with NSF-funded projects focusing on the archiving, discovery and access of polar data, such as ACADIS and the Antarctic Master Directory.
The Active Marine Station Metadata is a daily metadata report for active marine buoy and C-MAN (Coastal Marine Automated Network) platforms from the National Data Buoy Center (NDBC). Metadata includes the station id, latitude/longitude (resolution to thousandths of a degree), the station name, the station owner, the program the station is associated with (e.g., TAO, NDBC, tsunami, NOS, etc.), station type (e.g., buoy, fixed, oil rig, etc.), and notification of whether the station observes meteorology, currents, and water quality (signified by 'y' for yes and 'n' for no). If there is a 'y' associated with one of these tags, then the station has reported data in that category within the last 8 hours (or 24 hours for DART stations--Deep-Ocean Assessment Reporting of Tsunamis). If there is an 'n', data has not been received within those times. Stations are removed from the list when they are dismantled. The metadata information is written to a daily XML-formatted file.
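The 'y'/'n' flags described above lend themselves to simple filtering once the XML file is parsed. The sketch below is illustrative only: the element name `station` and the attribute names (`id`, `lat`, `lon`, `met`, `currents`, `waterquality`, and so on) are assumptions about the file layout, and the two sample stations are invented.

```python
import xml.etree.ElementTree as ET

# Invented sample; element and attribute names are assumed, not confirmed
# by the description above.
SAMPLE_XML = """<stations>
  <station id="EX001" lat="10.000" lon="-20.000" name="Example Buoy"
           owner="NDBC" pgm="NDBC" type="buoy"
           met="y" currents="n" waterquality="n"/>
  <station id="EX002" lat="30.500" lon="-75.250" name="Example Platform"
           owner="NOS" pgm="NOS" type="fixed"
           met="n" currents="y" waterquality="y"/>
</stations>"""


def stations_reporting(xml_text, category):
    """Return ids of stations whose flag for `category` is 'y'."""
    root = ET.fromstring(xml_text)
    return [s.get("id") for s in root.iter("station") if s.get(category) == "y"]


print(stations_reporting(SAMPLE_XML, "met"))
print(stations_reporting(SAMPLE_XML, "waterquality"))
```

A station flagged 'n' in a category is simply skipped, matching the description that 'n' means no data was received within the reporting window.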
https://www.ontario.ca/page/open-government-licence-ontario
This table represents metadata records which formerly existed on LIO’s Metadata Management Tool. Records representing data licensed for use under the Open Government Licence - Ontario have migrated to the Ontario GeoHub.
The remaining records could not migrate for one of the following reasons:
The data is not spatial.
The metadata record is incomplete.
The metadata contact information is invalid.
The metadata references data that has not been made available to LIO.
LIO cannot confirm that the data has been reviewed to be released under the Open Government Licence - Ontario.
Contact LIO Support at geospatial@ontario.ca for more information or to get an extract of original metadata files.
Status
Obsolete: data is no longer relevant
Maintenance and Update Frequency
Not planned: there are no plans to update the data
Contact
Land Information Ontario (LIO) Support, geospatial@ontario.ca
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This sample was drawn from the Crossref API on March 8, 2022. The sample was constructed purposefully on the hypothesis that records with at least one known issue would be more likely to yield issues related to cultural meanings and identity. Records known or suspected to have at least one quality issue were selected by the authors and Crossref staff. The Crossref API was then used to randomly select additional records from the same prefix. Records in the sample represent 51 DOI prefixes that were chosen without regard for the manuscript management or publishing platform used, as well as 17 prefixes for journals known to use the Open Journal Systems manuscript management and publishing platform. OJS was specifically identified due to the authors' familiarity with the platform, its international and multilingual reach, and previous work on its metadata quality.
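One way to request a random selection of records for a given DOI prefix is the Crossref REST API's `sample` parameter on the `/prefixes/{prefix}/works` route. The sketch below only builds the request URL and makes no network call. It is an assumption that the authors used this mechanism, the limit of 100 reflects my understanding of the API's maximum sample size, and the prefix `10.5555` is a placeholder rather than one of the 68 prefixes in the sample.

```python
from urllib.parse import quote, urlencode

CROSSREF_API = "https://api.crossref.org"


def sample_url(prefix: str, n: int) -> str:
    """Build a URL asking Crossref for n random works under a DOI prefix."""
    if not 1 <= n <= 100:  # assumed API limit on sample size
        raise ValueError("sample size must be between 1 and 100")
    query = urlencode({"sample": n})
    return f"{CROSSREF_API}/prefixes/{quote(prefix, safe='')}/works?{query}"


# Placeholder prefix for illustration only.
print(sample_url("10.5555", 20))
```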
U.S. Government Works https://www.usa.gov/government-works
License information was derived automatically
In June 2019, the U.S. Geological Survey Maryland-Delaware-District of Columbia Water Science Center (MD-DE-DC WSC) team began to collect and inventory available information on toxic contaminants within the Chesapeake Bay Watershed. State agencies were contacted to determine available data. Also, the National Water Information System (NWIS) and National Water Quality Database (NWQD) were queried to gather relevant data for the compilation. The resulting tables contain records for available sites where specific analyte groups, Hg (mercury), PCB (polychlorinated biphenyls), or pesticides, have been collected with appropriate supplemental metadata including media, method, time frame, and frequency of collection. Sample results span 1972-2019. Files included in the data release: Basic_Table.csv, Detailed_Table.csv, NWIS_PCodes.csv, State_Result_Totals.csv, NWIS_Result_Totals.csv
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Open Government Data (OGD) portals, thanks to the presence of thousands of geo-referenced datasets containing spatial information, are of extreme interest for any analysis or process relating to the territory. For this to happen, users must be enabled to access these datasets and reuse them. An element often considered to hinder the full dissemination of OGD data is the quality of their metadata. Starting from an experimental investigation conducted on over 160,000 geospatial datasets belonging to six national and international OGD portals, this work has as its first objective to provide an overview of the usage of these portals, measured in terms of dataset views and downloads. Furthermore, to assess the possible influence of the quality of the metadata on the use of geospatial datasets, an assessment of the metadata for each dataset was carried out, and the correlation between these two variables was measured. The results obtained showed a significant underutilization of geospatial datasets and a generally poor quality of their metadata. In addition, only a weak correlation was found between the use and the quality of the metadata, not strong enough to assert with certainty that the latter is a determining factor of the former.
The dataset consists of six zipped CSV files, containing the collected datasets' usage data, full metadata, and computed quality values, for about 160,000 geospatial datasets belonging to the three national and three international portals considered in the study, i.e. US (catalog.data.gov), Colombia (datos.gov.co), Ireland (data.gov.ie), HDX (data.humdata.org), EUODP (data.europa.eu), and NASA (data.nasa.gov).
Data collection occurred in the period: 2019-12-19 -- 2019-12-23.
The header for each CSV file is:
[ ,portalid,id,downloaddate,metadata,overallq,qvalues,assessdate,dviews,downloads,engine,admindomain]
where, for each row (a portal's dataset), the fields are defined as follows:
[1] Neumaier, S.; Umbrich, J.; Polleres, A. Automated Quality Assessment of Metadata Across Open Data Portals. J. Data and Information Quality 2016, 8, 2:1–2:29. doi:10.1145/2964909
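A file with the header shown above could be read and summarized along the following lines. This is a rough sketch under several assumptions: the rows are invented, `overallq` is treated as a numeric quality score and `dviews` as an integer view count, the `metadata` and `qvalues` fields are replaced by empty placeholders, and Pearson correlation stands in for whichever correlation measure the study actually used.

```python
import csv
import io
import math

HEADER = (",portalid,id,downloaddate,metadata,overallq,qvalues,"
          "assessdate,dviews,downloads,engine,admindomain")

# Invented rows for illustration only; real files hold full metadata records.
SAMPLE_CSV = "\n".join([
    HEADER,
    "0,portalA,ds1,2019-12-19,{},0.2,{},2019-12-20,10,1,CKAN,national",
    "1,portalA,ds2,2019-12-19,{},0.4,{},2019-12-20,20,2,CKAN,national",
    "2,portalA,ds3,2019-12-19,{},0.6,{},2019-12-20,15,0,CKAN,national",
    "3,portalA,ds4,2019-12-19,{},0.8,{},2019-12-20,40,5,CKAN,national",
])


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
quality = [float(r["overallq"]) for r in rows]  # assumed numeric score
views = [int(r["dviews"]) for r in rows]        # assumed integer count
r_value = pearson(quality, views)
print(f"{len(rows)} datasets, quality/views correlation: {r_value:.2f}")
```

Note that the first header field is empty, so `csv.DictReader` exposes that unnamed index column under the key `""`.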
Environmental scientists stand uniquely poised to capitalize on recent advancements in technology, computation, and data management; however, the degree to which this is occurring is unknown. We analyzed survey responses of 445 graduate students in California to evaluate understanding and use of such advances in the environmental sciences. Of students who had completed their degree, 64.3% had completed the data life cycle, 30.5% had archived research data so that it is available online, and 61.4% had no plans to create metadata for research data sets. Roughly one-third of students used an environmental sensor and collaborated with someone outside their expertise. Results varied by students’ research status and by university type. Doing excellent science in this data-intensive age may necessitate greater emphasis by university programs on data management best practices borrowed from information technology, and skills supplemented by unique training opportunities, courses, counsel fro...