100+ datasets found
  1. Dataset metadata of known Dataverse installations, August 2023

    • dataverse.harvard.edu
    • search.dataone.org
    Updated Aug 30, 2024
    + more versions
    Cite
    Julian Gautier (2024). Dataset metadata of known Dataverse installations, August 2023 [Dataset]. http://doi.org/10.7910/DVN/8FEGUV
    Available formats: Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
    Dataset updated
    Aug 30, 2024
    Dataset provided by
    Harvard Dataverse
    Authors
    Julian Gautier
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    This dataset contains the metadata of the datasets published in 85 Dataverse installations and information about each installation's metadata blocks. It also includes the lists of pre-defined licenses or terms of use that dataset depositors can apply to the datasets they publish in the 58 installations running versions of the Dataverse software that include that feature. The data is useful for reporting on the quality of dataset- and file-level metadata within and across Dataverse installations, and for improving understanding of how certain Dataverse features and metadata fields are used. Curators and other researchers can use this dataset to explore how well the Dataverse software, and the repositories using it, help depositors describe data.

    How the metadata was downloaded: The dataset metadata and metadata block JSON files were downloaded from each installation between August 22 and August 28, 2023 using a Python script kept in a GitHub repo at https://github.com/jggautier/dataverse-scripts/blob/main/other_scripts/get_dataset_metadata_of_all_installations.py. To get metadata from installations that require an installation account API token to use certain Dataverse software APIs, I created a CSV file with two columns: one named "hostname" listing each installation URL where I was able to create an account, and another named "apikey" listing my accounts' API tokens. The Python script uses the CSV file and the listed API tokens to get metadata and other information from installations that require API tokens.

    How the files are organized:

    ├── csv_files_with_metadata_from_most_known_dataverse_installations
    │   ├── author(citation)_2023.08.22-2023.08.28.csv
    │   ├── contributor(citation)_2023.08.22-2023.08.28.csv
    │   ├── data_source(citation)_2023.08.22-2023.08.28.csv
    │   ├── ...
    │   └── topic_classification(citation)_2023.08.22-2023.08.28.csv
    ├── dataverse_json_metadata_from_each_known_dataverse_installation
    │   ├── Abacus_2023.08.27_12.59.59.zip
    │   │   ├── dataset_pids_Abacus_2023.08.27_12.59.59.csv
    │   │   ├── Dataverse_JSON_metadata_2023.08.27_12.59.59
    │   │   │   ├── hdl_11272.1_AB2_0AQZNT_v1.0(latest_version).json
    │   │   │   └── ...
    │   │   └── metadatablocks_v5.6
    │   │       ├── astrophysics_v5.6.json
    │   │       ├── biomedical_v5.6.json
    │   │       ├── citation_v5.6.json
    │   │       ├── ...
    │   │       └── socialscience_v5.6.json
    │   ├── ACSS_Dataverse_2023.08.26_22.14.04.zip
    │   ├── ADA_Dataverse_2023.08.27_13.16.20.zip
    │   ├── Arca_Dados_2023.08.27_13.34.09.zip
    │   ├── ...
    │   └── World_Agroforestry_-_Research_Data_Repository_2023.08.27_19.24.15.zip
    ├── dataverse_installations_summary_2023.08.28.csv
    ├── dataset_pids_from_most_known_dataverse_installations_2023.08.csv
    ├── license_options_for_each_dataverse_installation_2023.09.05.csv
    └── metadatablocks_from_most_known_dataverse_installations_2023.09.05.csv

    This dataset contains two directories and four CSV files not in a directory. One directory, "csv_files_with_metadata_from_most_known_dataverse_installations", contains 20 CSV files that list the values of many of the metadata fields in the citation metadata block and geospatial metadata block of datasets in the 85 Dataverse installations. For example, author(citation)_2023.08.22-2023.08.28.csv contains the "Author" metadata for the latest versions of all published, non-deaccessioned datasets in the 85 installations, with a row for each author's name, affiliation, identifier type, and identifier. The other directory, "dataverse_json_metadata_from_each_known_dataverse_installation", contains 85 zipped files, one for each of the 85 Dataverse installations whose dataset metadata I was able to download.
    Each zip file contains a CSV file and two sub-directories. The CSV file contains the persistent IDs and URLs of each published dataset in the Dataverse installation, as well as a column indicating whether the Python script was able to download the Dataverse JSON metadata for each dataset. It also includes the alias/identifier and category of the Dataverse collection that the dataset is in. One sub-directory contains a JSON file for each of the installation's published, non-deaccessioned dataset versions; the JSON files contain the metadata in the "Dataverse JSON" metadata schema. The Dataverse JSON export of the latest version of each dataset includes "(latest_version)" in the file name, which should help those interested in the metadata of only the latest version of each dataset. The other sub-directory contains information about the metadata models (the "metadata blocks" in JSON files) that the installation was using when the dataset metadata was downloaded; I included them so they can be used when extracting metadata from the datasets' Dataverse JSON exports. The dataverse_installations_summary_2023.08.28.csv file contains information about each installation, including its name, URL, Dataverse software version, and counts of dataset metadata...
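The token-file handling described above can be sketched in a few lines. This is a minimal illustration, not the author's actual script: it assumes the CSV layout the description gives ("hostname" and "apikey" columns), and the installation hostname, API token, and dataset PID are made up. No request is actually sent; the function only builds the Dataverse metadata-export URL and the `X-Dataverse-key` header that carries the account token.

```python
import csv
import io

def load_api_tokens(csv_text):
    # One row per installation that requires an account API token.
    return {row["hostname"]: row["apikey"]
            for row in csv.DictReader(io.StringIO(csv_text))}

def dataverse_json_export(hostname, persistent_id, api_key=None):
    # Dataverse's metadata-export endpoint; exporter=dataverse_json selects
    # the "Dataverse JSON" schema mentioned in the description.
    url = (f"https://{hostname}/api/datasets/export"
           f"?exporter=dataverse_json&persistentId={persistent_id}")
    headers = {"X-Dataverse-key": api_key} if api_key else {}
    return url, headers

# Hypothetical installation and dataset PID, for illustration only.
tokens = load_api_tokens("hostname,apikey\ndemo.dataverse.example,0000-TOKEN\n")
url, headers = dataverse_json_export("demo.dataverse.example",
                                     "doi:10.5072/FK2/EXAMPLE",
                                     tokens.get("demo.dataverse.example"))
```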

  2. Data from: Data Catalog Project - A Browsable, Searchable, Metadata System

    • search.dataone.org
    • dataverse.harvard.edu
    • +1more
    Updated Nov 8, 2023
    Cite
    Joshua Stillerman; Thomas Fredian; Martin Greenwald; Gabriele Manduchi (2023). Data Catalog Project - A Browsable, Searchable, Metadata System [Dataset]. http://doi.org/10.7910/DVN/5EZSZC
    Dataset updated
    Nov 8, 2023
    Dataset provided by
    Harvard Dataverse
    Authors
    Joshua Stillerman; Thomas Fredian; Martin Greenwald; Gabriele Manduchi
    Description

    Modern experiments are typically conducted by large, extended teams, where researchers rely on other team members to produce much of the data they use. These experiments record very large numbers of measurements, which can be difficult for users to find, access, and understand. We are developing a system that lets users annotate their data products with structured metadata, providing data consumers with a discoverable, browsable data index. Machine-understandable metadata captures the underlying semantics of the recorded data, which can then be consumed both by programs and interactively by users. Collaborators can use these metadata to select and understand recorded measurements.

  3. SUPER DADA for Dataverse 4+ [Version 3.0]

    • sodha.be
    sh, text/markdown
    Updated Apr 25, 2022
    + more versions
    Cite
    Social Sciences and Digital Humanities Archive – SODHA (2022). SUPER DADA for Dataverse 4+ [Version 3.0] [Dataset]. http://doi.org/10.34934/DVN/PSFVVF
    Available download formats: sh (5376), text/markdown (7474)
    Dataset updated
    Apr 25, 2022
    Dataset provided by
    Social Sciences and Digital Humanities Archive – SODHA
    Dataset funded by
    Belgian Science Policy Office (BELSPO)
    Description

    SUPER DADA is a bash script that adapts XML-DDI metadata files produced by Dataverse in order to make them compliant with the technical requirements of the CESSDA Data Catalogue (CDC). This version of the script is geared towards versions 4+ of Dataverse. In its current state, SUPER DADA modifies XML-DDI files produced by a version 4+ Dataverse installation so that the files become fully compliant with the 'BASIC' level of validation (or 'validation gate') of the CESSDA Metadata Validator against the CESSDA Data Catalogue (CDC) DDI 2.5 Profile 1.0.4. This new version of the script now also ensures that country names in the appropriate metadata field are detected by the CESSDA Data Catalogue, adding them to the "Country" search facet. See the README file for technical details and specifications.
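The actual SUPER DADA tool is a bash script with its own rules, but the kind of in-place DDI fix it performs can be sketched. The example below is an assumption-laden illustration in Python: it ensures a `<nation>` element (the DDI 2.5 field behind a country facet) exists under `sumDscr`, using simplified element paths with no DDI namespace.

```python
import xml.etree.ElementTree as ET

def ensure_nation(ddi_xml, country):
    # Add a <nation> child to <sumDscr> if one is missing, so the country
    # name can be picked up by a catalogue's "Country" facet.
    root = ET.fromstring(ddi_xml)
    sum_dscr = root.find(".//sumDscr")
    if sum_dscr is not None and sum_dscr.find("nation") is None:
        ET.SubElement(sum_dscr, "nation").text = country
    return ET.tostring(root, encoding="unicode")

# Minimal, invented DDI fragment for demonstration.
doc = "<codeBook><stdyDscr><stdyInfo><sumDscr/></stdyInfo></stdyDscr></codeBook>"
fixed = ensure_nation(doc, "Belgium")
```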

  4. Crossref Metadata - 1900 to 2017

    • search.dataone.org
    • dataverse.harvard.edu
    Updated Nov 22, 2023
    Cite
    Moqri, Mahdi (2023). Crossref Metadata - 1900 to 2017 [Dataset]. http://doi.org/10.7910/DVN/7JIWXI
    Dataset updated
    Nov 22, 2023
    Dataset provided by
    Harvard Dataverse
    Authors
    Moqri, Mahdi
    Description

    This data contains the following tab-separated fields: doi, year, citation count, issnp, issne, journal, pub, lic. Let me know if you need other fields (moqri@ufl.edu).
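A minimal reader for the tab-separated layout described above; the field names follow the description (treating issnp/issne as print/electronic ISSN is an assumption), and the sample line is invented.

```python
FIELDS = ["doi", "year", "citation_count", "issnp", "issne",
          "journal", "pub", "lic"]

def parse_line(line):
    # Split one tab-separated record into a field-name -> value dict.
    return dict(zip(FIELDS, line.rstrip("\n").split("\t")))

row = parse_line(
    "10.1000/xyz123\t2001\t42\t1234-5678\t9876-5432\tJ. Ex.\tPubCo\tcc-by\n")
```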

  5. Dataverse Study for Metadata Testing

    • borealisdata.ca
    • search.dataone.org
    Updated Nov 10, 2023
    Cite
    Author Name; John Huck (2023). Dataverse Study for Metadata Testing [Dataset]. http://doi.org/10.7939/DVN/10977
    Available formats: Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
    Dataset updated
    Nov 10, 2023
    Dataset provided by
    Borealis
    Authors
    Author Name; John Huck
    License

    https://borealisdata.ca/api/datasets/:persistentId/versions/2.3/customlicense?persistentId=doi:10.7939/DVN/10977

    Time period covered
    Dec 31, 1999 - Jan 1, 2000
    Area covered
    Canada, Other, Alberta, Edmonton, Camrose, Canada, Alberta
    Dataset funded by
    Grant Number Agency
    Description

    This study was created as a tool for library staff to check metadata through Dataverse system upgrades and migrations. Record saturation is achieved by supplying a value for every metadata element, enabling comparisons to be made between test and production instances. Second Description field
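The saturation check described above amounts to comparing a record before and after a migration: every populated metadata element in production should survive unchanged in the test instance. A minimal sketch, with hypothetical field names:

```python
def diff_metadata(before, after):
    # Elements present before the migration but absent afterwards,
    # and elements whose values changed.
    missing = {k for k in before if k not in after}
    changed = {k for k in before if k in after and before[k] != after[k]}
    return missing, changed

prod = {"title": "Dataverse Study for Metadata Testing",
        "geographicCoverage": "Canada"}
test = {"title": "Dataverse Study for Metadata Testing"}
missing, changed = diff_metadata(prod, test)
```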

  6. SUPER DADA for Dataverse 5+ [Version 2.0]

    • sodha.be
    sh, text/markdown
    Updated Apr 25, 2022
    + more versions
    Cite
    Social Sciences and Digital Humanities Archive – SODHA (2022). SUPER DADA for Dataverse 5+ [Version 2.0] [Dataset]. http://doi.org/10.34934/DVN/AOQRSJ
    Available download formats: sh (9352), text/markdown (6386)
    Dataset updated
    Apr 25, 2022
    Dataset provided by
    Social Sciences and Digital Humanities Archive – SODHA
    Dataset funded by
    Belgian Science Policy Office (BELSPO)
    Description

    SUPER DADA is a bash script that adapts XML-DDI metadata files produced by Dataverse in order to make them compliant with the technical requirements of the CESSDA Data Catalogue (CDC). This version of the script is geared towards versions 5+ of Dataverse. In its current state, SUPER DADA modifies XML-DDI files produced by a version 5+ Dataverse installation so that the files become fully compliant with the 'BASIC' level of validation (or 'validation gate') of the CESSDA Metadata Validator against the CESSDA Data Catalogue (CDC) DDI 2.5 Profile 1.0.4. This new version of the script now also ensures that country names in the appropriate metadata field are detected by the CESSDA Data Catalogue, adding them to the "Country" search facet. See the README file for technical details and specifications.

  7. Metadata Mapping Dataverse 5.4 - CESSDA Metadata Model 2.0

    • explore.openaire.eu
    Updated Jun 22, 2021
    Cite
    Laura Huis in 't Veld; Ricarda Braukmann; Marion Wittenberg (2021). Metadata Mapping Dataverse 5.4 - CESSDA Metadata Model 2.0 [Dataset]. http://doi.org/10.5281/zenodo.5011416
    Dataset updated
    Jun 22, 2021
    Authors
    Laura Huis in 't Veld; Ricarda Braukmann; Marion Wittenberg
    Description

    Excel and .csv version of a mapping of metadata fields from Dataverse version 5.4 to the CESSDA Metadata Model (CMM) 2.0. This mapping was created by DANS in the context of the ODISSEI Project (NWO grant number 184.035.014).

  8. ckanext-dataverse

    • catalog.civicdataecosystem.org
    Updated Jun 4, 2025
    Cite
    (2025). ckanext-dataverse [Dataset]. https://catalog.civicdataecosystem.org/dataset/ckanext-dataverse
    Dataset updated
    Jun 4, 2025
    Description

    The Dataverse extension for CKAN facilitates integration and interaction with Dataverse installations. It likely empowers users to connect their CKAN instance with Dataverse repositories, potentially allowing for the discovery, harvesting, and management of datasets residing in Dataverse. Given the limited information available, the exact features and capabilities must be derived from the source code.

    Key features (assumed based on the extension name):
    • Dataverse integration: likely provides functionality to connect to and interact with remote Dataverse instances, potentially including retrieving metadata about published datasets.
    • Dataset discovery: may include tools to search and discover datasets within connected Dataverse repositories directly from the CKAN interface.
    • Data harvesting (potential): could offer data harvesting capabilities, making it possible to import datasets from Dataverse into CKAN for centralized management.

    Technical integration (limited information): the exact integration methods are unclear, but the extension likely uses CKAN's plugin system and API to add functionality for managing Dataverse interactions, and may involve configuration settings to specify Dataverse endpoints and credentials. Given that it is a GeoSolutions extension, there may be related GeoServer functionality if CKAN and Dataverse can be integrated or configured to share common workflows.

    Benefits and impact (inferred): connecting CKAN with Dataverse could promote data accessibility and interoperability between the platforms. It would allow users to take advantage of both systems' capabilities, potentially enabling the seamless transfer of datasets and catalog information and broader collaboration across a wide variety of systems.

  9. Customer Experience Management & CRM - Crossref Bibliographic Metadata

    • dataverse.harvard.edu
    Updated May 7, 2025
    Cite
    Diomar Anez; Dimar Anez (2025). Customer Experience Management & CRM - Crossref Bibliographic Metadata [Dataset]. http://doi.org/10.7910/DVN/EEJST3
    Available formats: Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
    Dataset updated
    May 7, 2025
    Dataset provided by
    Harvard Dataverse
    Authors
    Diomar Anez; Dimar Anez
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    This dataset provides detailed bibliographic metadata records for scholarly publications related to 'Customer Experience Management' (CEM) and 'Customer Relationship Management' (CRM), as retrieved from Crossref.org. This metadata corpus facilitates in-depth exploration of the academic discourse surrounding these interconnected domains.

    Contextual overview of Customer Experience Management & CRM:
    1. Definition and context: Customer Relationship Management (CRM) focuses on the practices, strategies, and technologies that companies use to manage and analyze customer interactions and data throughout the customer lifecycle, aiming to improve business relationships, assist in customer retention, and drive sales growth. Customer Experience Management (CEM) is a broader strategy that focuses on the perceptions and feelings a customer has as a result of all their interactions with a company over time. Both gained significant traction from the late 1990s, fueled by technology and a shift towards customer-centricity.
    2. Strengths and weaknesses: Strengths include enhanced customer loyalty, increased sales, improved customer service, and better customer insights. Weaknesses often stem from costly and complex system implementations, challenges in achieving a truly unified customer view, potential data privacy concerns, and organizational resistance to customer-centric cultural shifts. Success requires strategic alignment, robust data governance, and consistent execution across all touchpoints, not just technological deployment.
    3. Relevance and research potential: CEM and CRM are critical in today's competitive, digitally driven markets where customer expectations are high. They are central to marketing, sales, and service management. Research avenues include the impact of AI and big data on CRM/CEM effectiveness, measuring and managing omnichannel customer journeys, the role of employee experience in delivering customer experience, personalization at scale, and the ethical implications of customer data utilization, particularly within emerging digital ecosystems.

    Dataset structure and content: The dataset consists of one or more archives. Each archive contains a series of approximately 850 monthly folders (e.g., spanning from January 1950 to January 2025), reflecting a granular month-by-month process of metadata retrieval and curation for CEM/CRM. Within each monthly folder, users will find several JSON files documenting the search and filtering process for that specific month:
    • term_results/: a subfolder containing JSON files with the results of initial broad keyword searches related to CEM/CRM.
    • merged_results.json: aggregated results from these individual term searches, before advanced filtering.
    • filtered_results.json: results after applying a more specific, complex Boolean query (e.g., ("customer experience management" OR CRM ...) AND ("strategy" OR ...)) and exact phrase matching to refine relevance. The exact query used is detailed within this file.
    • final_results.json: the primary file of interest for most users. It contains the curated list of unique publication metadata records, deduplicated by DOI, deemed most relevant to 'Customer Experience Management & CRM' for that specific month. It includes fields such as Title, Authors, DOI, Publication Date, Source Title, and Abstract (if available from Crossref).
    • statistics_results.json: summary statistics of the search and filtering process for the month.
    This granular monthly structure allows researchers to trace the evolution of academic discourse on CEM/CRM and identify relevant publications with high temporal precision. For an overview of the general retrieval methodology, refer to the parent Dataverse description (Management Tool Bibliographic Metadata (Crossref)). Users interested in aggregated publication counts or trend analysis for CEM/CRM should consult the corresponding datasets in the Raw Extracts Dataverse and the Comparative Indices Dataverse.
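The curation step above deduplicates records by DOI before writing final_results.json. A minimal version of that step (the record fields and sample DOIs are illustrative, and DOIs are compared case-insensitively since DOI matching is case-insensitive):

```python
def dedupe_by_doi(records):
    # Keep the first record seen for each DOI; DOIs are lowercased for
    # comparison because DOI matching is case-insensitive.
    seen, unique = set(), []
    for rec in records:
        doi = rec.get("DOI", "").lower()
        if doi and doi not in seen:
            seen.add(doi)
            unique.append(rec)
    return unique

records = [
    {"DOI": "10.1000/a", "title": "CRM study"},
    {"DOI": "10.1000/A", "title": "CRM study (duplicate)"},
    {"DOI": "10.1000/b", "title": "CEM study"},
]
unique = dedupe_by_doi(records)
```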

  10. lina dataverse

    • search.dataone.org
    • dataverse.harvard.edu
    Updated Dec 16, 2023
    Cite
    hu, lina (2023). lina dataverse [Dataset]. http://doi.org/10.7910/DVN/ID4DGD
    Dataset updated
    Dec 16, 2023
    Dataset provided by
    Harvard Dataverse
    Authors
    hu, lina
    Description

    experimental data. Visit https://dataone.org/datasets/sha256%3A18735774f162e6915a7d05c2276ae4ddf535e237e1559bebab64d219355e9ca8 for complete metadata about this dataset.

  11. The Politics of Metadata. Sami Traces. SNHB Kulturmiljöbild

    • dataverse.harvard.edu
    • search.dataone.org
    Updated Jan 28, 2022
    + more versions
    Cite
    Vendela Grundell Gachoud (2022). The Politics of Metadata. Sami Traces. SNHB Kulturmiljöbild [Dataset]. http://doi.org/10.7910/DVN/JWLHMI
    Available formats: Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
    Dataset updated
    Jan 28, 2022
    Dataset provided by
    Harvard Dataverse
    Authors
    Vendela Grundell Gachoud
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    Search results for images and image metadata pertaining to the keywords 'same' and 'samisk' in the collections of the Swedish National Heritage Board in the in-house image database Kulturmiljöbild. The results are part of datasets produced by Vendela Grundell Gachoud comprising images, image metadata and interview notes pertaining to collections of the Swedish National Heritage Board presented in the in-house image database Kulturmiljöbild and on the social media site Flickr Commons. The research for this dataset was conducted within the project The Politics of Metadata at the Department of Culture and Aesthetics at Stockholm University, funded by the Swedish Research Council (grant no. 2018-01068). The project leader is Anna Näslund Dahlgren. Results of this research are presented in Digital Approaches to Inclusion and Participation in Cultural Heritage (eds. Giglitto et al, Routledge 2022).

  12. Data for: Identifying Metadata Quality Issues Across Cultures

    • search.dataone.org
    • dataverse.harvard.edu
    Updated Nov 8, 2023
    Cite
    Shi, Julie; Nason, Mike; Tullney, Marco; Alperin, Juan Pablo (2023). Data for: Identifying Metadata Quality Issues Across Cultures [Dataset]. http://doi.org/10.7910/DVN/GZI7IA
    Dataset updated
    Nov 8, 2023
    Dataset provided by
    Harvard Dataverse
    Authors
    Shi, Julie; Nason, Mike; Tullney, Marco; Alperin, Juan Pablo
    Description

    This sample was drawn from the Crossref API on March 8, 2022. The sample was constructed purposefully on the hypothesis that records with at least one known issue would be more likely to yield issues related to cultural meanings and identity. Records known or suspected to have at least one quality issue were selected by the authors and Crossref staff. The Crossref API was then used to randomly select additional records from the same prefix. Records in the sample represent 51 DOI prefixes that were chosen without regard for the manuscript management or publishing platform used, as well as 17 prefixes for journals known to use the Open Journal Systems manuscript management and publishing platform. OJS was specifically identified due to the authors' familiarity with the platform, its international and multilingual reach, and previous work on its metadata quality.
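The "randomly select additional records from the same prefix" step can be reproduced against the public Crossref REST API, which supports a `filter=prefix:...` filter and a `sample` parameter for random selection. The sketch below only builds the request URL; the prefix and contact address are hypothetical and no request is sent.

```python
from urllib.parse import urlencode

def sample_query(prefix, n=20, mailto="you@example.org"):
    # Crossref's /works endpoint: filter by DOI prefix and request a
    # random sample of n records; mailto identifies polite-pool usage.
    params = {"filter": f"prefix:{prefix}", "sample": n, "mailto": mailto}
    return "https://api.crossref.org/works?" + urlencode(params)

url = sample_query("10.1234", n=5)
```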

  13. Data from: Call metadata from: Data collection smart and simple: Evaluation...

    • dataverse.harvard.edu
    • search.dataone.org
    Updated Jan 24, 2024
    Cite
    Anton Eitzinger (2024). Call metadata from: Data collection smart and simple: Evaluation and metanalysis of call data from studies applying the 5Q approach [Dataset]. http://doi.org/10.7910/DVN/CMIVQK
    Available formats: Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
    Dataset updated
    Jan 24, 2024
    Dataset provided by
    Harvard Dataverse
    Authors
    Anton Eitzinger
    License

    https://dataverse.harvard.edu/api/datasets/:persistentId/versions/1.1/customlicense?persistentId=doi:10.7910/DVN/CMIVQK

    Time period covered
    Mar 1, 2015 - May 30, 2021
    Area covered
    United Republic of Tanzania, Uganda, Colombia, Ghana, Rwanda
    Dataset funded by
    Bill & Melinda Gates Foundation (OPP1107891)
    Deutsche Gesellschaft für Internationale Zusammenarbeit (GIZ) (81206685)
    OPEC Fund for International Development (OFID)
    Description

    The original research publication that uses this dataset evaluated the application of the 5Q approach (5Q) combined with interactive voice response (IVR) call campaigns for agile data collection. The dataset includes metadata for 37,503 calls from 102 IVR call campaigns across five countries. It provides insights into call status, average call duration, reached IVR blocks, and differences in response rate between different call types and survey topics.
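One of the metrics the study reports, response rate per IVR campaign, can be computed directly from per-call status records. A sketch with hypothetical field names and invented sample calls:

```python
from collections import defaultdict

def response_rates(calls):
    # Fraction of calls per campaign that reached "completed" status.
    totals, completed = defaultdict(int), defaultdict(int)
    for call in calls:
        totals[call["campaign"]] += 1
        if call["status"] == "completed":
            completed[call["campaign"]] += 1
    return {c: completed[c] / totals[c] for c in totals}

calls = [
    {"campaign": "A", "status": "completed"},
    {"campaign": "A", "status": "no-answer"},
    {"campaign": "B", "status": "completed"},
]
rates = response_rates(calls)
```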

  14. Correspondence Metadata from the Digital Scholarly Edition of Edvard Munch's...

    • dataverse.no
    • dataverse.azure.uit.no
    • +1more
    txt, xml
    Updated Nov 20, 2023
    Cite
    Annika Rockenberger; Annika Rockenberger; Loke Sjølie; Hilde Bøe; Hilde Bøe; Loke Sjølie (2023). Correspondence Metadata from the Digital Scholarly Edition of Edvard Munch's Writings [Dataset]. http://doi.org/10.18710/TAFUSV
    Available download formats: txt (15499), xml (4112303)
    Dataset updated
    Nov 20, 2023
    Dataset provided by
    DataverseNO
    Authors
    Annika Rockenberger; Annika Rockenberger; Loke Sjølie; Hilde Bøe; Hilde Bøe; Loke Sjølie
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Time period covered
    1874 - 1944
    Area covered
    Bad Kösen, Germany; Leipzig, Saxony, Germany; Thuringia; Rome, Italy; Prague, Czech Republic; Norway (GeoNames IDs: 2953418, 2879139, 2822542, 3169070, 3067696)
    Dataset funded by
    Teksthub, University of Oslo
    Description

    The eMunch dataset contains correspondence metadata for 8,527 letters to and from the Norwegian painter Edvard Munch (1863-1944). The dataset is derived from the digital scholarly edition of Edvard Munch's Writings, eMunch.no, edited by Hilde Bøe, The Munch Museum, Oslo. The eMunch dataset is part of NorKorr - Norwegian Correspondences, a project that aims to collect metadata from all correspondences in the collections of Norwegian academic and cultural heritage institutions (project website on GitHub). A Python script was developed to parse the XML files on eMunch.no and supplementary data files (an Excel spreadsheet with updated dates and a CSV file with GeoNames IDs for places) and extract the following metadata: sender's name, receiver's name, place name, date, and letter ID in the scholarly edition. These metadata were then converted into the Correspondence Metadata Interchange Format (CMIF). The entire dataset has been integrated into CorrespSearch, the international search service for scholarly editions of letters hosted by the Berlin-Brandenburg Academy of Sciences.
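The conversion step described above, one letter's metadata becoming a CMIF entry, can be sketched roughly: CMIF encodes each letter as a TEI `<correspDesc>` with `<correspAction>` elements for sender and receiver. This is a simplified illustration of that structure as I understand the CMIF guidelines (no TEI namespace, and the sample letter data is invented), not the project's actual script.

```python
import xml.etree.ElementTree as ET

def to_corresp_desc(letter):
    # Build one CMIF-style correspDesc entry from a flat metadata record.
    desc = ET.Element("correspDesc", {"ref": letter["id"]})
    sent = ET.SubElement(desc, "correspAction", {"type": "sent"})
    ET.SubElement(sent, "persName").text = letter["sender"]
    ET.SubElement(sent, "placeName").text = letter["place"]
    ET.SubElement(sent, "date", {"when": letter["date"]})
    received = ET.SubElement(desc, "correspAction", {"type": "received"})
    ET.SubElement(received, "persName").text = letter["receiver"]
    return ET.tostring(desc, encoding="unicode")

# Invented sample record, for illustration only.
xml_out = to_corresp_desc({
    "id": "https://emunch.no/letter/EXAMPLE",
    "sender": "Edvard Munch", "receiver": "Karen Bjølstad",
    "place": "Kristiania", "date": "1890-01-01",
})
```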

  15. Data from: Daily data

    • search.dataone.org
    • dataverse.harvard.edu
    Updated Nov 22, 2023
    Cite
    Zhang, Lynda (2023). Daily data [Dataset]. http://doi.org/10.7910/DVN/K8HGXS
    Dataset updated
    Nov 22, 2023
    Dataset provided by
    Harvard Dataverse
    Authors
    Zhang, Lynda
    Description
  16. Ethical Considerations of Including Personal Demographic Information in Open...

    • search.dataone.org
    • dataverse.harvard.edu
    Updated Nov 14, 2023
    Cite
    Suiter, Greta; Nerissa Lindsey; Kurt Hanselman (2023). Ethical Considerations of Including Personal Demographic Information in Open Knowledge Platforms survey data [Dataset]. http://doi.org/10.7910/DVN/UTPPN9
    Dataset updated
    Nov 14, 2023
    Dataset provided by
    Harvard Dataverse
    Authors
    Suiter, Greta; Nerissa Lindsey; Kurt Hanselman
    Description

    This is data collected from a survey of members of GLAM institutions who were contributing to open knowledge projects (Wikidata, Wikipedia, SNAC, etc.). The purpose of the survey was to learn about the policies and practices, or lack thereof, that GLAM staff follow when contributing demographic information about living people (e.g., sex or gender, ethnic group, race, sexual orientation) to open knowledge projects. Information collected from this survey will inform an ethical investigation into issues surrounding these practices.

  17. Supporting Data for: Globally Accessible Distributed Data Sharing (GADDS): a...

    • search.dataone.org
    • dataverse.no
    • +1more
    Updated Jan 5, 2024
    Cite
    Vazquez, Pavel; Rayner, Simon (2024). Supporting Data for: Globally Accessible Distributed Data Sharing (GADDS): a decentralized FAIR platform to facilitate data sharing in the life sciences [Dataset]. http://doi.org/10.18710/7SNGIX
    Dataset updated
    Jan 5, 2024
    Dataset provided by
    DataverseNO
    Authors
    Vazquez, Pavel; Rayner, Simon
    Description

    Experimental data and metadata used to demonstrate the use of the Globally Accessible Distributed Data Sharing (GADDS) platform. The experimental data is included only to illustrate the use of the GADDS platform in a life-science context. GADDS is a decentralized platform built to facilitate FAIR-like data sharing in cross-disciplinary research collaborations.

  18. Benchmarking - Crossref Bibliographic Metadata

    • dataverse.harvard.edu
    Updated May 7, 2025
    Cite
    Diomar Anez; Dimar Anez (2025). Benchmarking - Crossref Bibliographic Metadata [Dataset]. http://doi.org/10.7910/DVN/MMAVWO
    Available formats: Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
    Dataset updated
    May 7, 2025
    Dataset provided by
    Harvard Dataverse
    Authors
    Diomar Anez; Dimar Anez
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

This dataset provides detailed bibliographic metadata records for scholarly publications related to 'Benchmarking', as retrieved from Crossref.org. This metadata corpus facilitates in-depth exploration of the academic discourse surrounding Benchmarking.

Contextual Overview of Benchmarking:

1. Definition and Context: Benchmarking is a systematic process of measuring an organization's products, services, and processes against those of organizations recognized as leaders in their field ("best in class") to identify areas for improvement. Its core purpose is to learn from others and implement best practices to enhance performance. While informal comparisons have always existed, formal benchmarking gained widespread adoption in the 1980s, particularly influenced by companies like Xerox, aiming to achieve competitive superiority.

2. Strengths and Weaknesses: Strengths include providing objective goals, fostering innovation by learning from external successes, improving processes, and enhancing competitiveness. It can motivate change by highlighting performance gaps. Weaknesses may involve difficulties in finding comparable organizations or data, the risk of merely copying without understanding context, potential legal and ethical issues in data sharing, and the resources required for a thorough study. It may also lead to incremental rather than breakthrough improvements if not applied creatively.

3. Relevance and Research Potential: Benchmarking remains a relevant tool for continuous improvement and strategic positioning across industries. It is integral to quality management, operational excellence, and competitive strategy. Research opportunities include the evolution of benchmarking in the digital age (e.g., digital benchmarking, AI-driven comparisons), its application to intangible assets and complex services, ethical considerations in competitive benchmarking, and its integration with other improvement methodologies like Lean and Six Sigma for synergistic effects.

Dataset Structure and Content: The dataset consists of one or more archives. Each archive contains a series of approximately 850 monthly folders (e.g., spanning from January 1950 to January 2025), reflecting a granular month-by-month process of metadata retrieval and curation for Benchmarking. Within each monthly folder, users will find several JSON files documenting the search and filtering process for that specific month:

• term_results/: A subfolder containing JSON files with the results of initial broad keyword searches related to Benchmarking.
• merged_results.json: Aggregated results from these individual term searches, before advanced filtering.
• filtered_results.json: Results after applying a more specific, complex Boolean query (e.g., "benchmarking" AND ("performance" OR "best practices" ...)) and exact phrase matching to refine relevance. The exact query used is detailed within this file.
• final_results.json: The primary file of interest for most users. It contains the curated list of unique publication metadata records, deduplicated by DOI, deemed most relevant to 'Benchmarking' for that specific month. It includes fields such as Title, Authors, DOI, Publication Date, Source Title, and Abstract (when available from Crossref).
• statistics_results.json: Summary statistics of the search and filtering process for the month.

This granular monthly structure allows researchers to trace the evolution of academic discourse on Benchmarking and identify relevant publications with high temporal precision. For an overview of the general retrieval methodology, refer to the parent Dataverse description (Management Tool Bibliographic Metadata (Crossref)). Users interested in aggregated publication counts or trend analysis for Benchmarking should consult the corresponding datasets in the Raw Extracts Dataverse and the Comparative Indices Dataverse.
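A minimal sketch of how the monthly folders could be traversed to assemble one DOI-deduplicated corpus from the per-month final_results.json files. The folder-naming pattern and the record field names ("DOI", "title") are assumptions about the JSON layout and should be checked against an actual archive:

```python
import json
from pathlib import Path

def collect_final_results(root):
    """Walk the monthly folders under `root` and gather records from each
    final_results.json, deduplicating by DOI (case-insensitive).
    Field names like "DOI" are assumptions about the JSON layout."""
    seen, records = set(), []
    for path in sorted(Path(root).glob("*/final_results.json")):
        for rec in json.loads(path.read_text(encoding="utf-8")):
            doi = (rec.get("DOI") or "").lower()
            if doi and doi not in seen:
                seen.add(doi)
                records.append(rec)
    return records
```

Since each monthly file is already deduplicated within its month, the cross-month pass above only removes records that recur across months.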

  19. d

    Harvard Library Bibliographic Metadata: Detailed Content Inventory

    • search.dataone.org
    • dataverse.harvard.edu
    Updated Nov 8, 2023
    Cite
    Eslao, Christine (2023). Harvard Library Bibliographic Metadata: Detailed Content Inventory [Dataset]. http://doi.org/10.7910/DVN/Y5WUTU
    Explore at:
    Dataset updated
    Nov 8, 2023
    Dataset provided by
    Harvard Dataverse
    Authors
    Eslao, Christine
    Description

    Detailed content inventory for Harvard Library Bibliographic Metadata.

  20. H

    Strategic Planning - Crossref Bibliographic Metadata

    • dataverse.harvard.edu
    Updated May 7, 2025
    Cite
    Diomar Anez; Dimar Anez (2025). Strategic Planning - Crossref Bibliographic Metadata [Dataset]. http://doi.org/10.7910/DVN/4ETI8W
    Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    May 7, 2025
    Dataset provided by
    Harvard Dataverse
    Authors
    Diomar Anez; Dimar Anez
    License

    CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

This dataset provides detailed bibliographic metadata records for scholarly publications related to 'Strategic Planning', as retrieved from Crossref.org. This metadata corpus facilitates in-depth exploration of the academic discourse surrounding Strategic Planning.

Contextual Overview of Strategic Planning:

1. Definition and Context: Strategic Planning is a fundamental organizational management activity used to set priorities, focus energy and resources, strengthen operations, ensure that employees and other stakeholders are working toward common goals, and establish agreement around intended outcomes and results. It is a disciplined effort that produces fundamental decisions and actions shaping an organization's future. While a long-standing practice, its methodologies have evolved, particularly with shifts in competitive landscapes and analytical capabilities across all organizational types.

2. Strengths and Weaknesses: The core strength of Strategic Planning lies in providing direction, alignment, and a proactive approach to shaping an organization's future. It facilitates resource allocation and performance measurement. However, weaknesses can include becoming overly bureaucratic, static in dynamic environments ("paralysis by analysis"), or poorly implemented, leading to plans that are not executed. Its effectiveness often depends on stakeholder involvement, realistic assessments, and an adaptive rather than rigid approach.

3. Relevance and Research Potential: Strategic Planning remains critically relevant as organizations navigate complexity, technological change, and market disruptions. It is a cornerstone of strategic management theory and practice. Research opportunities include examining the efficacy of different planning frameworks (e.g., SWOT, PESTEL, VRIO), the role of agility and dynamic capabilities in strategic planning, its integration with execution, and its impact on organizational performance and resilience across varying contexts and industries, particularly in the digital age.

Dataset Structure and Content: The dataset consists of one or more archives. Each archive contains a series of approximately 850 monthly folders (e.g., spanning from January 1950 to January 2025), reflecting a granular month-by-month process of metadata retrieval and curation for Strategic Planning. Within each monthly folder, users will find several JSON files documenting the search and filtering process for that specific month:

• term_results/: A subfolder containing JSON files with the results of initial broad keyword searches related to Strategic Planning.
• merged_results.json: Aggregated results from these individual term searches, before advanced filtering.
• filtered_results.json: Results after applying a more specific, complex Boolean query (e.g., ("strategic planning" OR "strategic management" ...) AND ("process" OR ...)) and exact phrase matching to refine relevance. The exact query used is detailed within this file.
• final_results.json: The primary file of interest for most users. It contains the curated list of unique publication metadata records, deduplicated by DOI, deemed most relevant to 'Strategic Planning' for that specific month. It includes fields such as Title, Authors, DOI, Publication Date, Source Title, and Abstract (when available from Crossref).
• statistics_results.json: Summary statistics of the search and filtering process for the month.

This granular monthly structure allows researchers to trace the evolution of academic discourse on Strategic Planning and identify relevant publications with high temporal precision. For an overview of the general retrieval methodology, refer to the parent Dataverse description (Management Tool Bibliographic Metadata (Crossref)). Users interested in aggregated publication counts or trend analysis for Strategic Planning should consult the corresponding datasets in the Raw Extracts Dataverse and the Comparative Indices Dataverse.
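To illustrate the kind of Boolean-query-plus-exact-phrase step that produces filtered_results.json, here is a hedged sketch. The real query is recorded inside each month's filtered_results.json; the term lists and the "title"/"abstract" field names below are illustrative assumptions, not the actual query:

```python
# Illustrative only: the authoritative Boolean query is stored inside each
# month's filtered_results.json. These term lists are placeholders.
PRIMARY = ("strategic planning", "strategic management")
CONTEXT = ("process", "framework", "implementation")

def matches(record):
    """Exact-phrase Boolean filter: a record passes if its title or
    abstract contains a primary phrase AND a context term."""
    text = " ".join(record.get(k, "") or "" for k in ("title", "abstract")).lower()
    return any(p in text for p in PRIMARY) and any(c in text for c in CONTEXT)

def apply_filter(merged_records):
    """Reduce merged_results.json-style records to the filtered subset."""
    return [rec for rec in merged_records if matches(rec)]
```

Exact-phrase matching (substring tests on lowercased text) is stricter than bag-of-words keyword search, which is why filtered_results.json is typically much smaller than merged_results.json.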

Cite
Julian Gautier (2024). Dataset metadata of known Dataverse installations, August 2023 [Dataset]. http://doi.org/10.7910/DVN/8FEGUV

Dataset metadata of known Dataverse installations, August 2023

Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Aug 30, 2024
Dataset provided by
Harvard Dataverse
Authors
Julian Gautier
License

CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically

Description

This dataset contains the metadata of the datasets published in 85 Dataverse installations and information about each installation's metadata blocks. It also includes the lists of pre-defined licenses or terms of use that dataset depositors can apply to the datasets they publish in the 58 installations that were running versions of the Dataverse software that include that feature. The data is useful for reporting on the quality of dataset- and file-level metadata within and across Dataverse installations and for improving understanding of how certain Dataverse features and metadata fields are used. Curators and other researchers can use this dataset to explore how well the Dataverse software and the repositories using it help depositors describe data.

How the metadata was downloaded

The dataset metadata and metadata block JSON files were downloaded from each installation between August 22 and August 28, 2023 using a Python script kept in a GitHub repo at https://github.com/jggautier/dataverse-scripts/blob/main/other_scripts/get_dataset_metadata_of_all_installations.py. In order to get the metadata from installations that require an installation account API token to use certain Dataverse software APIs, I created a CSV file with two columns: one named "hostname", listing each installation URL at which I was able to create an account, and another named "apikey", listing my accounts' API tokens. The Python script expects this CSV file and uses the listed API tokens to get metadata and other information from installations that require them.

How the files are organized

├── csv_files_with_metadata_from_most_known_dataverse_installations
│   ├── author(citation)_2023.08.22-2023.08.28.csv
│   ├── contributor(citation)_2023.08.22-2023.08.28.csv
│   ├── data_source(citation)_2023.08.22-2023.08.28.csv
│   ├── ...
│   └── topic_classification(citation)_2023.08.22-2023.08.28.csv
├── dataverse_json_metadata_from_each_known_dataverse_installation
│   ├── Abacus_2023.08.27_12.59.59.zip
│   │   ├── dataset_pids_Abacus_2023.08.27_12.59.59.csv
│   │   ├── Dataverse_JSON_metadata_2023.08.27_12.59.59
│   │   │   ├── hdl_11272.1_AB2_0AQZNT_v1.0(latest_version).json
│   │   │   └── ...
│   │   └── metadatablocks_v5.6
│   │       ├── astrophysics_v5.6.json
│   │       ├── biomedical_v5.6.json
│   │       ├── citation_v5.6.json
│   │       ├── ...
│   │       └── socialscience_v5.6.json
│   ├── ACSS_Dataverse_2023.08.26_22.14.04.zip
│   ├── ADA_Dataverse_2023.08.27_13.16.20.zip
│   ├── Arca_Dados_2023.08.27_13.34.09.zip
│   ├── ...
│   └── World_Agroforestry_-_Research_Data_Repository_2023.08.27_19.24.15.zip
├── dataverse_installations_summary_2023.08.28.csv
├── dataset_pids_from_most_known_dataverse_installations_2023.08.csv
├── license_options_for_each_dataverse_installation_2023.09.05.csv
└── metadatablocks_from_most_known_dataverse_installations_2023.09.05.csv

This dataset contains two directories and four CSV files not in a directory.

One directory, "csv_files_with_metadata_from_most_known_dataverse_installations", contains 20 CSV files that list the values of many of the metadata fields in the citation metadata block and geospatial metadata block of datasets in the 85 Dataverse installations. For example, author(citation)_2023.08.22-2023.08.28.csv contains the "Author" metadata for the latest versions of all published, non-deaccessioned datasets in the 85 installations, with a row for author names, affiliations, identifier types, and identifiers.

The other directory, "dataverse_json_metadata_from_each_known_dataverse_installation", contains 85 zipped files, one for each of the 85 Dataverse installations whose dataset metadata I was able to download. Each zip file contains a CSV file and two sub-directories:

• The CSV file contains the persistent IDs and URLs of each published dataset in the Dataverse installation, as well as a column that indicates whether the Python script was able to download the Dataverse JSON metadata for each dataset. It also includes the alias/identifier and category of the Dataverse collection that the dataset is in.
• One sub-directory contains a JSON file for each of the installation's published, non-deaccessioned dataset versions. The JSON files contain the metadata in the "Dataverse JSON" metadata schema. The Dataverse JSON export of the latest version of each dataset includes "(latest_version)" in the file name, which should help those who are interested in the metadata of only the latest version of each dataset.
• The other sub-directory contains information about the metadata models (the "metadata blocks" in JSON files) that the installation was using when the dataset metadata was downloaded. I included them so that they can be used when extracting metadata from each dataset's Dataverse JSON exports.

The dataverse_installations_summary_2023.08.28.csv file contains information about each installation, including its name, URL, Dataverse software version, and counts of dataset metadata...
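As a sketch of how the per-dataset Dataverse JSON exports might be consumed, the following pulls author names out of one export, assuming the standard Dataverse JSON nesting (datasetVersion > metadataBlocks > citation > fields, with compound "author" values). The nesting reflects the published Dataverse export format, but field availability varies by installation, so treat the exact keys as assumptions to verify against the included metadata block files:

```python
def author_names(dataverse_json):
    """Extract author names from a Dataverse JSON export.
    Navigates datasetVersion > metadataBlocks > citation > fields and
    reads the compound "author" field's "authorName" sub-values."""
    fields = (dataverse_json.get("datasetVersion", {})
              .get("metadataBlocks", {})
              .get("citation", {})
              .get("fields", []))
    names = []
    for field in fields:
        if field.get("typeName") == "author":
            for compound in field.get("value", []):
                name = compound.get("authorName", {}).get("value")
                if name:
                    names.append(name)
    return names
```

The same traversal pattern (match on "typeName", then unpack the compound "value" list) applies to the other citation-block fields listed in the CSV files, such as contributor and data source.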
