CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
License information was derived automatically
This dataset contains the metadata of the datasets published in 101 Dataverse installations, information about the metadata blocks of 106 installations, and the lists of pre-defined licenses or dataset terms that depositors can apply to datasets in the 88 installations that were running versions of the Dataverse software that include the "multiple-license" feature. The data is useful for improving understanding of how certain Dataverse features and metadata fields are used and for learning about the quality of dataset- and file-level metadata within and across Dataverse installations.

How the metadata was downloaded

The dataset metadata and metadata block JSON files were downloaded from each installation between August 25 and August 30, 2024 using the "get_dataverse_installations_metadata" function in a collection of Python functions at https://github.com/jggautier/dataverse-scripts/blob/main/dataverse_repository_curation_assistant/dataverse_repository_curation_assistant_functions.py. To get the metadata from installations that require an installation account API token to use certain Dataverse software APIs, I created a CSV file with two columns: one named "hostname", listing each installation URL for which I was able to create an account, and another named "apikey", listing my accounts' API tokens. The Python script expects this CSV file and the listed API tokens in order to get metadata and other information from installations whose API endpoints require tokens.

How the files are organized

├── csv_files_with_metadata_from_most_known_dataverse_installations
│   ├── author_2024.08.25-2024.08.30.csv
│   ├── contributor_2024.08.25-2024.08.30.csv
│   ├── data_source_2024.08.25-2024.08.30.csv
│   ├── ...
│   └── topic_classification_2024.08.25-2024.08.30.csv
├── dataverse_json_metadata_from_each_known_dataverse_installation
│   ├── Abacus_2024.08.26_15.52.42.zip
│   │   ├── dataset_pids_Abacus_2024.08.26_15.52.42.csv
│   │   ├── Dataverse_JSON_metadata_2024.08.26_15.52.42
│   │   │   ├── hdl_11272.1_AB2_0AQZNT_v1.0(latest_version).json
│   │   │   └── ...
│   │   └── metadatablocks_v5.9
│   │       ├── astrophysics_v5.9.json
│   │       ├── biomedical_v5.9.json
│   │       ├── citation_v5.9.json
│   │       ├── ...
│   │       └── socialscience_v5.6.json
│   ├── ACSS_Dataverse_2024.08.26_00.02.51.zip
│   ├── ...
│   └── Yale_Dataverse_2024.08.25_03.52.57.zip
├── dataverse_installations_summary_2024.08.30.csv
├── dataset_pids_from_most_known_dataverse_installations_2024.08.csv
├── license_options_for_each_dataverse_installation_2024.08.28_14.42.54.csv
└── metadatablocks_from_most_known_dataverse_installations_2024.08.30.csv

This dataset contains two directories and four CSV files not in a directory.

One directory, "csv_files_with_metadata_from_most_known_dataverse_installations", contains 20 CSV files that list the values of many of the metadata fields in the "Citation" and "Geospatial" metadata blocks of datasets in the 101 Dataverse installations. For example, author_2024.08.25-2024.08.30.csv contains the "Author" metadata for the latest versions of all published, non-deaccessioned datasets in the 101 installations, with a column for each of the four child fields: author name, affiliation, identifier type, and identifier.

The other directory, "dataverse_json_metadata_from_each_known_dataverse_installation", contains 106 zip files, one for each of the 106 Dataverse installations whose sites were functioning when I attempted to collect their metadata. Each zip file contains a directory with JSON files that have information about the installation's metadata fields, such as the field names and how they're organized. For installations that had published datasets whose metadata I was able to download using Dataverse APIs, the zip file also contains:

- A CSV file listing information about the datasets published in the installation, including a column indicating whether the Python script was able to download the Dataverse JSON metadata for each dataset.
- A directory of JSON files that contain the metadata of the installation's published, non-deaccessioned dataset versions in the Dataverse JSON metadata schema.

The dataverse_installations_summary_2024.08.30.csv file contains information about each installation, including its name, URL, Dataverse software version, and counts of dataset metadata included and not included in this dataset.

The dataset_pids_from_most_known_dataverse_installations_2024.08.csv file contains the dataset PIDs of published datasets in the 101 Dataverse installations, with a column indicating whether the Python script was able to download each dataset's metadata. It is a union of the "dataset_pids_....csv" files in each of the 101 zip files in the dataverse_json_metadata_from_each_known_dataverse_installation directory.

The license_options_for_each_dataverse_installation_2024.08.28_14.42.54.csv file contains information about the licenses and...
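The two-column API-token CSV described above can be produced with a few lines of Python. A minimal sketch follows; the example hostname, token value, and output file name are placeholders, not values from the actual dataset:

```python
import csv

# Placeholder accounts; replace with your own installation URLs and API tokens.
accounts = [
    {"hostname": "https://demo.dataverse.org",
     "apikey": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"},
]

# Write the two-column CSV ("hostname", "apikey") that the
# metadata-collection script expects.
with open("installation_api_tokens.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["hostname", "apikey"])
    writer.writeheader()
    writer.writerows(accounts)
```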
We compiled macroinvertebrate assemblage data collected from 1995 to 2014 from the St. Louis River Area of Concern (AOC) of western Lake Superior. Our objective was to define depth-adjusted cutoff values for benthos condition classes (poor, fair, reference) to provide a tool useful for assessing progress toward achieving removal targets for the degraded benthos beneficial use impairment in the AOC. The relationship between depth and benthos metrics was wedge-shaped. We therefore used quantile regression to model the limiting effect of depth on selected benthos metrics, including taxa richness, percent non-oligochaete individuals, combined percent Ephemeroptera, Trichoptera, and Odonata individuals, and density of ephemerid mayfly nymphs (Hexagenia). We created a scaled trimetric index from the first three metrics. Metric values at or above the 90th percentile quantile regression model prediction were defined as reference condition for that depth. We set the cutoff between poor and fair condition at the 50th percentile model prediction. We examined sampler type, exposure, geographic zone of the AOC, and substrate type for confounding effects. Based on these analyses we combined data across sampler type and exposure classes and created separate models for each geographic zone. We used the resulting condition class cutoff values to assess the relative benthic condition of three habitat restoration project areas. The depth-limited pattern of ephemerid abundance we observed in the St. Louis River AOC also occurred elsewhere in the Great Lakes. We provide tabulated model predictions for applying our depth-adjusted condition class cutoff values to new sample data. This dataset is associated with the following publication: Angradi, T., W. Bartsch, A. Trebitz, V. Brady, and J. Launspach. A depth-adjusted ambient distribution approach for setting numeric removal targets for a Great Lakes Area of Concern beneficial use impairment: Degraded benthos. Journal of Great Lakes Research 43(1): 108-120 (2017).
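The depth-adjusted cutoffs described above come from quantile regression fits at the 90th and 50th percentiles. A minimal sketch of that approach using statsmodels, with hypothetical sample data and column names (not the study's actual data or code):

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical sample data: a benthos metric vs. depth (m).
df = pd.DataFrame({
    "depth": [1, 2, 3, 5, 8, 12, 15, 20, 25, 30],
    "taxa_richness": [18, 22, 15, 14, 10, 9, 7, 6, 4, 3],
})

# Model the limiting (upper) effect of depth at the 90th percentile,
# and the poor/fair boundary at the 50th percentile.
reference_fit = smf.quantreg("taxa_richness ~ depth", df).fit(q=0.90)
poor_fair_fit = smf.quantreg("taxa_richness ~ depth", df).fit(q=0.50)

# Depth-adjusted condition class cutoffs for a new sample taken at 10 m.
new = pd.DataFrame({"depth": [10]})
print("reference cutoff:", reference_fit.predict(new).iloc[0])
print("poor/fair cutoff:", poor_fair_fit.predict(new).iloc[0])
```

A metric value at or above the 90th percentile prediction for its depth would be classed as reference; below the 50th percentile prediction, as poor.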
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
License information was derived automatically
This dataset was extracted from a set of metadata files harvested from the DataCite metadata store (https://search.datacite.org/ui) during December 2015. Metadata records for items with a resourceType of dataset were collected, 1,647,949 records in total. This dataset contains three files: 1) readme.txt: a readme file. 2) version-results.csv: a CSV file containing three columns: DOI, DOI prefix, and version text contents. 3) version-counts.csv: a CSV file containing counts of unique version text content values.
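The counts file can be reproduced from the results file with a short pandas script. A minimal sketch; the exact column headers are not documented here, so the version-text column is taken by position:

```python
import pandas as pd

# version-results.csv has three columns: DOI, DOI prefix, version text.
results = pd.read_csv("version-results.csv")

# Assume the version text is the third column (check readme.txt to confirm).
version_col = results.columns[2]

# Tally unique version strings, mirroring version-counts.csv.
counts = (results[version_col]
          .value_counts()
          .rename_axis("version")
          .reset_index(name="count"))
counts.to_csv("version-counts.csv", index=False)
```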
https://www.futuremarketinsights.com/privacy-policy
The global enterprise metadata management market is expected to grow at a 14.8% CAGR during the forecast period. As of 2023, the market is valued at US$ 2,626.9 million, and it is expected to reach US$ 10,474.3 million by 2033. Future Market Insights specialists observed a historical CAGR of 12.7% from 2018 to 2022.
| Data Points | Key Statistics |
| --- | --- |
| Expected Market Value (2023) | US$ 2,626.9 million |
| Anticipated Forecast Value (2033) | US$ 10,474.3 million |
| Projected Growth Rate (2023 to 2033) | 14.8% CAGR |
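As a quick arithmetic check, compounding the 2023 base value at the stated CAGR over the ten-year forecast period approximately reproduces the 2033 figure; a minimal sketch:

```python
# Compound the 2023 value at 14.8% per year for 10 years.
base_2023 = 2626.9  # US$ million
cagr = 0.148
years = 10

forecast_2033 = base_2023 * (1 + cagr) ** years
# ~10,444, close to the reported US$ 10,474.3 million; the small gap
# reflects rounding in the reported CAGR.
print(round(forecast_2033, 1))
```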
Report Scope
| Report Attribute | Details |
| --- | --- |
| Market Value in 2023 | US$ 2,626.9 million |
| Market Value in 2033 | US$ 10,474.3 million |
| Growth Rate | CAGR of 14.8% from 2023 to 2033 |
| Base Year for Estimation | 2023 |
| Historical Data | 2018 to 2022 |
| Forecast Period | 2023 to 2033 |
| Quantitative Units | Revenue in US$ million and CAGR from 2023 to 2033 |
| Report Coverage | Revenue Forecast, Volume Forecast, Company Ranking, Competitive Landscape, Growth Factors, Trends and Pricing Analysis |
| Segments Covered | |
| Regions Covered | |
| Key Countries Profiled | |
| Key Companies Profiled | |
| Customization | Available Upon Request |
Open Government Data (OGD) portals, thanks to the thousands of geo-referenced datasets they host, are of great interest for any analysis or process relating to the territory. For that potential to be realized, users must be able to access these datasets and reuse them. The quality of metadata is often considered an element hindering the full dissemination of OGD. Starting from an experimental investigation of over 160,000 geospatial datasets belonging to six national and international OGD portals, the first objective of this work is to provide an overview of the usage of these portals, measured in terms of dataset views and downloads. Furthermore, to assess the possible influence of metadata quality on the use of geospatial datasets, the metadata of each dataset was assessed, and the correlation between these two variables was measured. The results showed a significant underutilization of geospatial datasets and a generally poor quality of their metadata. Moreover, only a weak correlation was found between use and metadata quality, not enough to assert with certainty that the latter is a determining factor of the former.
The dataset consists of six zipped CSV files, containing the collected datasets' usage data, full metadata, and computed quality values, for about 160,000 geospatial datasets belonging to the three national and three international portals considered in the study, i.e. US (catalog.data.gov), Colombia (datos.gov.co), Ireland (data.gov.ie), HDX (data.humdata.org), EUODP (data.europa.eu), and NASA (data.nasa.gov).
Data collection occurred in the period 2019-12-19 to 2019-12-23.
The header for each CSV file is:
[ ,portalid,id,downloaddate,metadata,overallq,qvalues,assessdate,dviews,downloads,engine,admindomain]
where, for each row (a portal's dataset), the fields are defined as follows:
portalid: portal identifier
id: dataset identifier
downloaddate: date of data collection
metadata: the dataset's full metadata, downloaded via API from the portal according to the supporting platform's schema
overallq: the overall quality value computed by applying the methodology presented in [1]
qvalues: a JSON object containing the quality values computed for the 17 metrics presented in [1]
assessdate: date of quality assessment
dviews: total number of views for the dataset
downloads: total number of downloads for the dataset (made available only by the Colombia, HDX, and NASA portals)
engine: identifier of the supporting portal platform: 1 (CKAN), 2 (Socrata)
admindomain: 1 (national), 2 (international)
[1] Neumaier, S.; Umbrich, J.; Polleres, A. Automated Quality Assessment of Metadata Across Open Data Portals. J. Data and Information Quality 2016, 8, 2:1-2:29. doi:10.1145/2964909
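Given the header above, the use-quality correlation reported in the abstract can be recomputed directly from one of the CSV files. A minimal sketch, assuming a hypothetical file name for one of the six unzipped portal files and that SciPy is available:

```python
import pandas as pd
from scipy.stats import spearmanr

# Hypothetical file name; substitute one of the six unzipped CSV files.
df = pd.read_csv("us_datasets.csv")

# Correlate metadata quality with usage, dropping rows with missing values.
subset = df[["overallq", "dviews"]].dropna()
rho, p = spearmanr(subset["overallq"], subset["dviews"])
print(f"Spearman rho = {rho:.3f} (p = {p:.3g})")
```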
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
This dataset contains the metadata of Zenodo's published open access records, including records that have been marked as spam by Zenodo staff and deleted.
The dataset is a gzip-compressed JSON-lines file, where each line is a JSON object representation of a Zenodo record.
Each object contains the terms:
part_of, thesis, description, doi, meeting, imprint, references, recid, alternate_identifiers, resource_type, journal, related_identifiers, title, subjects, notes, creators, communities, access_right, keywords, contributors, publication_date
which correspond to the fields with the same name available in Zenodo's record JSON Schema at https://zenodo.org/schemas/records/record-v1.0.0.json.
In addition, some terms have been altered:
The term files contains a list of dictionaries with only filetype, size, and filename.
The term license contains the short Zenodo ID of the license (e.g., "cc-by").
The term spam contains a boolean value indicating whether the record was marked as spam by Zenodo staff.
Top-level terms whose values were missing from the metadata may contain a null value.
A smaller uncompressed random sample of 200 JSON lines is also included to allow for testing and getting familiar with the format without having to download the entire dataset.
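Since each line is an independent JSON object, the file can be streamed without loading it all into memory. A minimal sketch of reading it; the file name is a placeholder:

```python
import gzip
import json

# Placeholder file name; substitute the actual dataset file.
with gzip.open("zenodo_open_metadata.jsonl.gz", "rt", encoding="utf-8") as f:
    for line in f:
        record = json.loads(line)
        # Skip records flagged as spam by Zenodo staff.
        if record.get("spam"):
            continue
        print(record["recid"], record.get("title"))
```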
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
License information was derived automatically
Unique values and counts of metadata subject fields.
https://www.verifiedmarketresearch.com/privacy-policy/
The metadata management tools market was valued at USD 8.09 billion in 2024 and is projected to reach USD 25.07 billion by 2031, growing at a CAGR of 20.7% from 2024 to 2031.
Global Metadata Management Tools Market Drivers
Data governance and compliance requirements: Owing to growing legal requirements and the need for strong data governance, organizations use metadata management tools to guarantee compliance, data quality, and data lineage.
The rapid expansion of big data and analytics: The growth of big data and analytics programs means enterprises generate data at a scale that requires efficient metadata management to be understood, tracked, and used.
Digital transformation initiatives: Organizations undergoing digital transformation understand the value of metadata in managing heterogeneous data sources, promoting interoperability, and guaranteeing data integration between systems.
The intricacy of data ecosystems: Organizations' data ecosystems are becoming more complex as they deal with a wider range of data sources, types, and architectures; metadata management tools help sift through and understand this complexity.
Cloud usage: As cloud environments and hybrid or multi-cloud architectures are adopted, metadata management tools become increasingly necessary to guarantee data visibility, control, and governance across platforms.
A greater emphasis on master data management and data quality: Increased understanding of the significance of master data management (MDM) and data quality is driving the need for metadata management tools to preserve and improve the integrity of organizational data.
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
License information was derived automatically
Unique values and counts of metadata subject fields.
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
License information was derived automatically
Unique values and counts of metadata location fields.
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
License information was derived automatically
Unique values and counts of metadata facet fields.
http://data.europa.eu/eli/dec/2011/833/oj
The INSPIRE metadata code list register contains the code lists and their values, as defined in the INSPIRE implementing rules on metadata (Commission Regulation (EC) No 1205/2008).
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Watershed metadata was collected for 14 watersheds from studies in which channel length survey data was presented. For variables not found in the publications associated with the channel length surveys, additional sources are referenced; these sources are included in the notes column. Variables without sources were calculated, as described in the Additional Metadata section below. Examples of calculated values include q_avg_mm_per_day, beta, and l_avg_km.
For Python packages, modules, and functions used to find calculated values, please see the associated GitHub repository: https://zenodo.org/record/4057320
Comma-separated values (CSV) files containing the findings for the Southeast region. The files list the site identification number, the p-value, percent change, water year, median before the change point, median after the change point, primary attribution, secondary attribution, level of evidence, and attribution notes and citations.
https://www.mordorintelligence.com/privacy-policy
The report covers Enterprise Metadata Management Companies and is segmented by deployment (On-Cloud, On-Premise), end-user industry (BFSI, Healthcare, Media and Entertainment, IT and Telecom, Retail, Government, other end-user industries), and geography (North America (United States, Canada), Europe (Germany, United Kingdom, France, Rest of Europe), Asia Pacific (China, Japan, South Korea, Rest of Asia Pacific), Latin America, Middle East & Africa). The market sizes and forecasts are provided in terms of value (USD billion) for all of the above segments.
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
License information was derived automatically
Unique values and counts of metadata facet fields.
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
License information was derived automatically
Unique values and counts of metadata subject fields.
This video demonstrates to viewers the importance and value of fit-for-purpose metadata, metadata standards, and metadata profiles.
A small mock Big Five Inventory dataset.
This table contains variable names, labels, and number of missing values. See the complete codebook for more.
| name | label | n_missing |
| --- | --- | --- |
| session | NA | 0 |
| created | user first opened survey | 0 |
| modified | user last edited survey | 0 |
| ended | user finished survey | 0 |
| expired | NA | 28 |
| BFIK_agree_4R | I can be brusque and dismissive toward others. | 0 |
| BFIK_agree_1R | I tend to criticize others. | 0 |
| BFIK_neuro_2R | I am relaxed and do not let stress unsettle me. | 0 |
| BFIK_agree_3R | I can be cold and distant. | 0 |
| BFIK_neuro_3 | I worry a lot. | 0 |
| BFIK_neuro_4 | I get nervous and insecure easily. | 0 |
| BFIK_agree_2 | I trust others easily and believe in the good in people. | 0 |
| BFIK_agree | 4 BFIK_agree items aggregated by aggregation_function | 0 |
| BFIK_neuro | 3 BFIK_neuro items aggregated by aggregation_function | 0 |
| age | Age | 0 |
This dataset was automatically described using the codebook R package (version 0.9.6).
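The n_missing column in a codebook table like this is straightforward to recompute from the raw data. A minimal pandas sketch (not the codebook package's own implementation; the file name is a placeholder):

```python
import pandas as pd

# Placeholder file name for the raw survey data.
df = pd.read_csv("bfi_mock.csv")

# Count missing values per variable, mirroring the n_missing column.
n_missing = df.isna().sum()
print(n_missing.sort_values(ascending=False))
```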
Daily snow depth values from the UW Snoqualmie Pass site. A timelapse camera and 3 snow depth poles were deployed at the forest plot during water year 2015. Manual snow stake observations were taken in the open plot. This comparison of snow depth between the open and forest plots uses the daily snow depth observed with the snow stake, rounded to 5 cm, compared to the average of all visible pole values in the forest (read by eye from photos), also rounded to 5 cm. These data have been processed, aggregated, and rounded. Raw photographs of the forest poles are also available.

UW_Snoqualmie_snow_camera attributes:
Site - Snoqualmie
Cover - Forest or open
WY - water year 2015
Date - yyyy-mm-dd
Method - snow depth pole (with time lapse camera) or manual snow stake observation
Rounding - to nearest 5 cm
variable - snow depth, in cm
value - aggregated and rounded values
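The rounding described above (to the nearest 5 cm) can be expressed in one line; a minimal sketch:

```python
def round_to_5cm(depth_cm: float) -> int:
    """Round a snow depth to the nearest 5 cm, as in the aggregated data."""
    return int(round(depth_cm / 5.0) * 5)

# Example: a pole reading of 87 cm is recorded as 85 cm.
print(round_to_5cm(87))  # 85
```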