This dataset contains the metadata of the datasets published in 77 Dataverse installations, information about each installation's metadata blocks, and the list of standard licenses that dataset depositors can apply to the datasets they publish in the 36 installations running more recent versions of the Dataverse software. The data is useful for reporting on the quality of dataset- and file-level metadata within and across Dataverse installations. Curators and other researchers can use this dataset to explore how well the Dataverse software and the repositories using it help depositors describe data.

How the metadata was downloaded

The dataset metadata and metadata block JSON files were downloaded from each installation on October 2 and October 3, 2022 using a Python script kept in a GitHub repo at https://github.com/jggautier/dataverse-scripts/blob/main/other_scripts/get_dataset_metadata_of_all_installations.py. To get the metadata from installations that require an installation account API token to use certain Dataverse software APIs, I created a CSV file with two columns: one named "hostname", listing the URL of each installation where I was able to create an account, and another named "apikey", listing my accounts' API tokens. The Python script reads the API tokens from this CSV file to get metadata and other information from installations that require them.
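The actual logic lives in the linked get_dataset_metadata_of_all_installations.py script; the following is only a minimal sketch of the approach it describes, assuming the two-column hostname/apikey CSV layout above and using the standard Dataverse Search and export APIs. The CSV file name, the single-page shortcut, and the output naming are illustrative, not taken from the script.

import csv
import requests

# Read the two-column CSV described above. The file name is illustrative;
# the "hostname" and "apikey" column names match the description.
with open("installations.csv", newline="") as f:
    installations = list(csv.DictReader(f))

for inst in installations:
    base = inst["hostname"].rstrip("/")
    # Installations that require an API token get it via this header.
    headers = {"X-Dataverse-key": inst["apikey"]} if inst.get("apikey") else {}

    # The Search API lists published datasets. Only the first page is
    # fetched here; a real script would page through all results.
    search = requests.get(
        f"{base}/api/search",
        params={"q": "*", "type": "dataset", "per_page": 100},
        headers=headers,
    ).json()

    for item in search["data"]["items"]:
        pid = item["global_id"]
        # Export each dataset's metadata in the "Dataverse JSON" schema.
        export = requests.get(
            f"{base}/api/datasets/export",
            params={"exporter": "dataverse_json", "persistentId": pid},
        )
        safe_name = pid.replace(":", "_").replace("/", "_")
        with open(f"{safe_name}.json", "wb") as out:
            out.write(export.content)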
How the files are organized

├── csv_files_with_metadata_from_most_known_dataverse_installations
│   ├── author(citation).csv
│   ├── basic.csv
│   ├── contributor(citation).csv
│   ├── ...
│   └── topic_classification(citation).csv
├── dataverse_json_metadata_from_each_known_dataverse_installation
│   ├── Abacus_2022.10.02_17.11.19.zip
│   │   ├── dataset_pids_Abacus_2022.10.02_17.11.19.csv
│   │   ├── Dataverse_JSON_metadata_2022.10.02_17.11.19
│   │   │   ├── hdl_11272.1_AB2_0AQZNT_v1.0.json
│   │   │   └── ...
│   │   └── metadatablocks_v5.6
│   │       ├── astrophysics_v5.6.json
│   │       ├── biomedical_v5.6.json
│   │       ├── citation_v5.6.json
│   │       ├── ...
│   │       └── socialscience_v5.6.json
│   ├── ACSS_Dataverse_2022.10.02_17.26.19.zip
│   ├── ADA_Dataverse_2022.10.02_17.26.57.zip
│   ├── Arca_Dados_2022.10.02_17.44.35.zip
│   ├── ...
│   └── World_Agroforestry_-_Research_Data_Repository_2022.10.02_22.59.36.zip
├── dataset_pids_from_most_known_dataverse_installations.csv
├── licenses_used_by_dataverse_installations.csv
└── metadatablocks_from_most_known_dataverse_installations.csv

This dataset contains two directories and three CSV files not in a directory.

One directory, "csv_files_with_metadata_from_most_known_dataverse_installations", contains 18 CSV files with the values from common metadata fields of all 77 Dataverse installations. For example, author(citation)_2022.10.02-2022.10.03.csv contains the "Author" metadata for all published, non-deaccessioned versions of all datasets in the 77 installations, with a row for each author name, affiliation, identifier type, and identifier.

The other directory, "dataverse_json_metadata_from_each_known_dataverse_installation", contains 77 zipped files, one for each of the 77 Dataverse installations whose dataset metadata I was able to download using Dataverse APIs. Each zip file contains a CSV file and two sub-directories:
- The CSV file contains the persistent IDs and URLs of each published dataset in the Dataverse installation, along with a column indicating whether or not the Python script was able to download the Dataverse JSON metadata for each dataset. For installations running Dataverse software versions whose Search APIs report each dataset's owning Dataverse collection name and alias, the CSV file also records which Dataverse collection (within the installation) each dataset was published in.
- One sub-directory contains a JSON file for each of the installation's published, non-deaccessioned dataset versions. The JSON files contain the metadata in the "Dataverse JSON" metadata schema.
- The other sub-directory contains information about the metadata models (the "metadata blocks" in JSON files) that the installation was using when the dataset metadata was downloaded. I saved them so that they can be used when extracting metadata from the Dataverse JSON files (a sketch of this extraction follows this entry).

The dataset_pids_from_most_known_dataverse_installations.csv file contains the dataset PIDs of all published datasets in the 77 Dataverse installations, with a column indicating whether the Python script was able to download each dataset's metadata. It is a union of all of the "dataset_pids_..." files in the 77 zip files.

The licenses_used_by_dataverse_installations.csv file contains information about the licenses that a number of the installations let depositors choose when creating datasets. When I collected ...

Visit https://dataone.org/datasets/sha256%3Ad27d528dae8cf01e3ea915f450426c38fd6320e8c11d3e901c43580f997a3146 for complete metadata about this dataset.
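Since each Dataverse JSON file nests field values under the metadata blocks described above, extracting per-author rows (as in author(citation).csv) means walking the citation block's compound "author" field. A minimal sketch, assuming the standard Dataverse JSON export layout and reusing one of the file names from the tree above:

import json

# Load one Dataverse JSON export (file name taken from the tree above).
with open("hdl_11272.1_AB2_0AQZNT_v1.0.json") as f:
    metadata = json.load(f)

# Field values live under the metadata blocks of the dataset version.
fields = metadata["datasetVersion"]["metadataBlocks"]["citation"]["fields"]

# "author" is a compound field: its value is a list of subfield dicts,
# one per author, keyed by typeNames like "authorName".
for field in fields:
    if field["typeName"] != "author":
        continue
    for author in field["value"]:
        print(
            author.get("authorName", {}).get("value", ""),
            author.get("authorAffiliation", {}).get("value", ""),
            author.get("authorIdentifierScheme", {}).get("value", ""),
            author.get("authorIdentifier", {}).get("value", ""),
            sep=" | ",
        )

The same walk generalizes to the other compound citation-block fields represented by the CSV files in the first directory.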
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
License information was derived automatically
This dataset contains raw data and processed data from the Dataverse Community Survey 2022. The main goal of the survey was to help the Global Dataverse Community Consortium (GDCC; https://dataversecommunity.global/) and the Dataverse Project (https://dataverse.org/) decide what actions to take to improve the Dataverse software and the larger ecosystem of integrated tools and services, and to better support community members. The results of the survey may also be of interest to other communities working on software and services for managing research data. The survey was designed to map out the current status as well as the roadmaps and priorities of Dataverse installations around the world. The main target group for the survey was the people and teams responsible for operating Dataverse installations around the world; a secondary target group was people and teams at organizations that are planning to deploy or considering deploying a Dataverse installation. Thirty-four existing and planned Dataverse installations participated in the survey.
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
License information was derived automatically
Comparative review of open access data repositories, collected to inform product development for the Dataverse Project at the Harvard Institute for Quantitative Social Science. More information about the scope, purpose, and development of this review is at https://dataverse.org/blog/comparative-review-various-data-repositories.
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
License information was derived automatically
A dataset of small image files (screenshots of SPA code, tools, and the development environment) used to demonstrate the files table.
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
License information was derived automatically
test dataset
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
License information was derived automatically
Exercising fields used by the schema.org exporter.
Project portal for publishing, citing, sharing, and discovering research data. Software, protocols, and community connections for creating research data repositories that automate professional archival practices, guarantee long-term preservation, and enable researchers to share, retain control of, and receive web visibility and formal academic citations for their data contributions. Researchers, data authors, publishers, data distributors, and affiliated institutions all receive appropriate credit. Hosts multiple dataverses. Each dataverse contains studies or collections of studies, and each study contains cataloging information that describes the data plus the actual data files and complementary files. Data related to social sciences, health, medicine, humanities, or other sciences with an emphasis on human behavior are uploaded to the IQSS Dataverse Network (Harvard). You can create your own dataverse for free and start adding studies for your data files and complementary material (documents, software, etc.). You may also install your own Dataverse Network for your university or organization.
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
License information was derived automatically
TXST Dataverse Quick Start Guide - Example RDM Dataverse
Experimental data. Visit https://dataone.org/datasets/sha256%3A18735774f162e6915a7d05c2276ae4ddf535e237e1559bebab64d219355e9ca8 for complete metadata about this dataset.
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
License information was derived automatically
These four datasets are used in conjunction with the senior seminar course, "Storytelling with Data: The Beyonce Edition." The compilation includes information on the singer's music videos, live performances, award nominations and wins, and the chart performance of her songs.
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
License information was derived automatically
This repo contains data produced from the manuscript entitled "Discovering non-additive heritability using additive GWAS summary statistics". Here, we provide the additive and cis-interaction LD scores used for the real data analyses of 25 well-studied quantitative phenotypes from 349,468 individuals of self-identified European ancestry in the UK Biobank and up to 159,095 individuals in BioBank Japan. Note that for the UK Biobank analysis, LD scores were computed using a reference panel of 489 individuals from the European superpopulation (EUR) of the 1000 Genomes Project. For the BioBank Japan analysis, we downloaded publicly available GWAS summary statistics for the 25 traits from http://jenger.riken.jp/en/result; the initial GWAS study adjusted for age, sex, and the first ten principal components. We then used individuals from the East Asian (EAS) superpopulation of the 1000 Genomes Project Phase 3 as the reference panel to calculate paired LD scores.
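In the standard univariate form, the LD score referenced above is the sum of squared correlations between a SNP and its neighbors in a reference panel. The following is a generic, minimal sketch of that computation, not the manuscript's pipeline; the window size, the random toy panel, and the omission of the usual small-sample bias correction are all simplifying assumptions.

import numpy as np

def ld_scores(genotypes: np.ndarray, window: int = 100) -> np.ndarray:
    """Univariate LD scores from a reference-panel genotype matrix.

    genotypes: (n_individuals, n_snps) allele counts in {0, 1, 2}.
    The LD score of SNP j is the sum of squared correlations r^2
    between SNP j and every SNP k within `window` positions of it.
    """
    # Standardize each SNP column so correlations are simple dot products.
    g = (genotypes - genotypes.mean(axis=0)) / genotypes.std(axis=0)
    n, m = g.shape
    scores = np.empty(m)
    for j in range(m):
        lo, hi = max(0, j - window), min(m, j + window + 1)
        r = g[:, lo:hi].T @ g[:, j] / n  # correlations with SNP j
        scores[j] = np.sum(r ** 2)
    return scores

# Toy example: 489 individuals (the UK Biobank reference-panel size
# mentioned above); the SNP count here is arbitrary.
panel = np.random.default_rng(0).integers(0, 3, size=(489, 1000)).astype(float)
print(ld_scores(panel)[:5])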
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
License information was derived automatically
Total File Search Selection
Piper, Andrew, 2016, "Fictionality", doi:10.7910/DVN/5WKTZV, Harvard Dataverse, V1. Contains LIWC feature tables for all ~27,000 documents used in this study, R and Python code used to generate statistical results, and all supporting tables. The original CA article was published at http://culturalanalytics.org/2016/12/fictionality/. The Community Norms [http://best-practices.dataverse.org/harvard-policies/community-norms.html], as well as good scientific practice, expect that proper credit is given via citation.
This dataset contains data, documentation, and code files associated with studies performed on snapshots of the contents of Harvard Dataverse taken on 28 and 29 October 2019.
Harvard Dataverse => Digital Library - Projects & Theses - Prof. Dr. Scholz

Introduction and background information to "Digital Library - Projects & Theses - Prof. Dr. Scholz".
The URL of the dataverse: http://dataverse.harvard.edu/dataverse/LibraryProfScholz
The URL of this (introduction) dataset: http://doi.org/10.7910/DVN/R33RS9
YOU MAY HAVE BEEN DIRECTED HERE BECAUSE THE CALLING PAGE HAS NO OTHER ENTRY POINT (with DOI) INTO THIS DATAVERSE. Click on the title of this page to reach the start page of the dataverse!

Introduction to the Data in this Dataverse
This dataverse is about:
- Aircraft Design
- Flight Mechanics
- Aircraft Systems
This dataverse contains research data and software produced by students for their projects and theses on the above topics. Get linked to all other resources from their reports using the URN from the German National Library (DNB) as given in each dataset under "Metadata": https://nbn-resolving.org/html/urn:nbn:de:gbv:18302-aeroJJJJ-MM-DD.01x
Alternative sites that store the data given in this dataverse are http://library.ProfScholz.de and https://archive.org/details/@profscholz. Open an "item". Under "DOWNLOAD OPTIONS", select the file (as far as available) called "ZIP" to download DataXxxx.zip. Alternatively, go to "SHOW ALL"; in the new window, click "View Contents" next to DataXxxx.zip, or select the URL next to "Data-list" to download a single file from DataXxxx.zip.

Data Publishing
Data publishing means publishing research data for (re)use by others. It consists of preparing single files or a dataset containing several files for access on the WWW. This practice is part of the open science movement. There is consensus about the benefits resulting from Open Data, especially in connection with Open Access publishing. It is important to link the publication (e.g. a thesis) with the underlying data and vice versa.
General (not disciplinary) and free data repositories are:
- Harvard Dataverse (this one!)
- figshare (emphasis: multimedia)
- Zenodo (emphasis: results from EU research, mainly text)
- Mendeley Data (emphasis: data associated with journal articles)
To find data repositories, use http://re3data.org. Read more at https://en.wikipedia.org/wiki/Data_publishing
Etalab Open License 2.0 (https://spdx.org/licenses/etalab-2.0.html)
The Sicpa_OpenData libraries facilitate publishing data to the INRAE dataverse in a transparent way: 1/ by simplifying the creation of the metadata document from data already present in the information systems, and 2/ by simplifying the use of the dataverse.org APIs. Available as a DLL, the SicpaOpenData for .NET library can be used from any development based on the Microsoft .NET platform.
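SicpaOpenData itself is a .NET DLL; as an illustration of the kind of raw dataverse.org API call that such wrapper libraries hide, here is a minimal Python sketch creating a dataset through the standard Dataverse native API. The host URL, collection alias, API token, and metadata values are placeholders, and this is not SicpaOpenData's interface.

import json
import requests

HOST = "https://data.example.org"   # hypothetical installation URL
COLLECTION_ALIAS = "my-collection"  # hypothetical collection alias
API_TOKEN = "xxxxxxxx-xxxx-xxxx"    # placeholder token

# A skeletal "Dataverse JSON" metadata document; wrapper libraries
# build this from data already in the information systems.
dataset_json = {
    "datasetVersion": {
        "metadataBlocks": {
            "citation": {
                "displayName": "Citation Metadata",
                "fields": [
                    {"typeName": "title", "typeClass": "primitive",
                     "multiple": False, "value": "Example dataset"},
                    # ... author, datasetContact, dsDescription, subject ...
                ],
            }
        }
    }
}

# Create the dataset in the target collection via the native API.
resp = requests.post(
    f"{HOST}/api/dataverses/{COLLECTION_ALIAS}/datasets",
    headers={"X-Dataverse-key": API_TOKEN, "Content-Type": "application/json"},
    data=json.dumps(dataset_json),
)
print(resp.status_code, resp.json())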
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
License information was derived automatically
The Cora data contains bibliographic records of machine learning papers that have been manually clustered into groups that refer to the same publication. Originally, Cora was prepared by Andrew McCallum, and his versions of this data set are available on his Data web page. The data is also hosted here. Note that various versions of the Cora data set have been used by many publications in record linkage and entity resolution over the years.
Harvard College course enrollment statistics for the most recent semester, including course, department, class number, and number of students (categorized by affiliation).
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
License information was derived automatically
Documentation file for the do-files and datasets corresponding to the paper "Public Health Policy at Scale: Impact of a Government-sponsored Information Campaign on Infant Mortality in Denmark" by Onur Altindag, Jane Greve, and Erdal Tekin. This document describes the datasets and the Stata and R programs that replicate the results of the paper in the Review of Economics and Statistics (version accepted in February 2021).
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
License information was derived automatically
Replication Data for: "Social Ties in Academia: A Friend is a Treasure"