Data archive to assist in the sharing of research grade information pertaining to the social and economic sciences. The majority of digital content currently consists of social science research data from experiments, program files with the code for analyzing the data, requisite documentation to use and understand the data, and associated files. Access to the ISPS Data Archive is provided at no cost and is granted for scholarship and research purposes only. When possible, Data is linked to Projects and Publications, via the ISPS KnowledgeBase. ISPS operates in accordance with the prevailing standards and practices of the digital preservation community including the Open Archival Information System (OAIS) Reference Model (ISO 14721:2003) and the Data Documentation Initiative (DDI) standard. Accordingly, ISPS supports digital life-cycle management, interoperability, and preferred methods of preservation. The ISPS Data Archive is intended for use by social science researchers, policy-makers, and practitioners who are conducting or analyzing field (and other) experiments in various social science disciplines. Currently, Replication Files originate with ISPS-affiliated scholars.
U.S. Government Works: https://www.usa.gov/government-works
License information was derived automatically
This data release presents the Yale stocks and flows database (YSTAFDB). Its data describe the use of 102 materials through anthropogenic cycles, along with their recycling and criticality properties, from the early 1800s to circa 2013 and on spatial scales ranging from suburbs to global. These data were previously scattered across multiple non-uniformly formatted sources such as journal papers, reports, and unpublished spreadsheets. They have been synthesized into YSTAFDB, which is presented as individual comma-separated text files and also in MySQL and PostgreSQL database formats. Consolidation of these data into a single database can increase their accessibility and reusability, which is relevant to diverse stakeholders ranging from researchers in sustainability science to government employees involved in national emergency planning.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This Zenodo upload accompanies a Data Descriptor in Scientific Data, titled "YSTAFDB, a unified database of material stocks and flows for sustainability science". We refer to this database as the Yale stocks and flows database (YSTAFDB).
Here, we provide core and supplementary ('hierarchy') tables for YSTAFDB in comma separated value file format. We additionally provide an example of a template that we used to manually prepare data for entry into YSTAFDB.
A complete description of YSTAFDB can be found in the Data Descriptor mentioned above. Key properties of YSTAFDB include:
100,000+ material cycles, criticality, and recycling data records
quantitative stocks and flows data for 62 elemental cycles and various engineering material cycles
data records describing stocks and flows at spatial scales from cities to global, and from the early 1800s to ca. 2013
https://www.cancerimagingarchive.net/data-usage-policies-and-restrictions/
We present a dataset of 11,892 longitudinal brain MRI studies from 1,430 patients with clinically confirmed brain metastases. T1-weighted pre-contrast, T1-weighted post-contrast, T2-weighted, and fluid-attenuated inversion recovery MRI sequence images are provided in NIfTI format. Additionally, an Excel spreadsheet with patient demographic information, scanner details, and image acquisition parameters is provided. This dataset will facilitate the development of AI models to assist in the long-term management of patients with brain metastasis.
Brain metastases are associated with significant morbidity, (Achrol 2019, Sacks 2020, Lamba 2021) necessitating frequent radiologic assessment in collaboration with neuro-oncologists to evaluate treatment response and disease progression. Magnetic resonance imaging (MRI) is a cornerstone in the management of central nervous system metastases, (Brenner 2022, Lin 2015, Vogelbaum 2022, Le Rhun 2021, Patil 2017, Kraft 2019, Aldawsari 2023) providing critical insights over time. (Kang 2009, Friedman 2001, Lunsford 1998)
Artificial intelligence (AI) has emerged as a valuable tool for prognosis and treatment planning in neuro-oncology. (Cassinelli Petersen 2022, Aneja 2019, Aboian 2022, Aboian 2022, Rudie 2021, Xue 2020) However, the creation of widely applicable clinical models is constrained by the scarcity of large-scale, heterogeneous datasets. (Aneja 2019, Rudie 2019) This underscores the need for a longitudinal imaging dataset that captures a diverse range of imaging patterns, scanner technologies, and acquisition techniques. In response to this gap, we introduce a dataset spanning nearly 20 years, including pre- and post-treatment imaging across four essential MRI sequences.
To our knowledge, this is the largest publicly available MRI dataset of patients with brain metastases. By providing open access to this resource, we hope to enable diverse research applications, from conventional radiologic investigations to state-of-the-art machine learning approaches, ultimately contributing to better patient outcomes and a more comprehensive understanding of brain metastases. The inclusion of both imaging and clinical data makes this dataset a valuable asset for researchers in oncology, neuroradiology, and data science.
The following subsections describe how the data were selected, acquired, and prepared for publication, including the approximate date range of the imaging studies.
The electronic medical record (EMR) system at Yale New Haven Hospital was searched for MRI scans performed between 2004 and 2023 that evaluated brain metastases. This automated query initially retrieved 46,364 MRI studies from 7,111 patients with potential intracranial metastatic disease. A subsequent manual review of the electronic health record (EHR) excluded cases lacking radiologic or pathologic confirmation of brain metastases.
To ensure consistency, only MRI exams containing axial T1-weighted (T1W), contrast-enhanced T1-weighted (T1CE), T2-weighted (T2), or fluid-attenuated inversion recovery (FLAIR) sequences were selected. For patients who underwent treatments targeting brain metastases—such as stereotactic radiosurgery, whole-brain radiotherapy, or surgical resection—pre-treatment scans taken within 30 days before treatment initiation were retained, along with all follow-up imaging to enable longitudinal analysis of disease progression and treatment effects. After these refinements, the final dataset comprised 11,892 MRI studies from 1,430 patients with confirmed brain metastases.
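The retention rule for treated patients can be illustrated with a minimal sketch. It assumes the 30-day window is inclusive and that every study on or after treatment initiation counts as follow-up; the function name, parameters, and dates are hypothetical and are not taken from the dataset's actual pipeline.

```python
from datetime import date, timedelta


def retain_studies(study_dates, treatment_start, window_days=30):
    """Keep pre-treatment studies within the window plus all later studies.

    Assumption: a study exactly `window_days` before treatment is retained.
    """
    earliest = treatment_start - timedelta(days=window_days)
    # Anything on or after `earliest` is either an in-window baseline scan
    # or follow-up imaging, so a single comparison covers both cases.
    return [d for d in sorted(study_dates) if d >= earliest]


# Hypothetical example: treatment begins on 2020-03-10.
studies = [date(2020, 1, 1), date(2020, 2, 15), date(2020, 3, 1), date(2020, 6, 1)]
kept = retain_studies(studies, date(2020, 3, 10))
```

Here the January study falls outside the window and is dropped, while the two later pre-treatment studies and the June follow-up are kept.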
This retrospective study was approved by the Institutional Review Board of Yale University on 10/01/2020, protocol 2000029055.
Radiology: Most of the MRI scans were obtained using 1.5T or 3T scanners manufactured by Siemens Healthineers or General Electric Healthcare. Image data and associated metadata were extracted through the application programming interface of Visage (Visage 7, Visage Imaging, Inc., San Diego, CA). DICOM metadata enabled the retrieval of key imaging parameters, including study location, scanner manufacturer, scanner model, magnetic field strength, acquisition type (2D vs. 3D), sequence designation, slice thickness, slice spacing, repetition time, echo time, and inversion time. A comprehensive breakdown of these acquisition parameters for each scan is available in the accompanying Excel file.
Clinical: Patient baseline information was extracted from the EMR for each study time point. The recorded data include the patient's age at the time of imaging, sex, and study date. All data were retrieved as of December 2023.
MRI Sequence Selection and Standardization: The MRI sequences T1W, T1CE, T2, and FLAIR were chosen for inclusion due to their essential role in evaluating brain metastases, as they provide complementary imaging characteristics critical for diagnosis and longitudinal assessment. To ensure consistency across the dataset, MRI sequence names were standardized to address variations in DICOM metadata arising from differences in scanners, radiology technicians, imaging sites, and longitudinal studies.
A manual review of studies guided the development of a rules-based image classifier and validation process. Images were filtered based on factors such as orientation, acquisition technique, contrast enhancement, and spin echo variations to retain only relevant sequences. Additionally, redundant sequence identifiers were removed to streamline naming conventions. This structured approach ensured precise inclusion and uniform labeling of MRI sequences, enhancing the dataset’s reliability for longitudinal analysis.
The selected studies were subsequently exported as NIfTI files to a secure external drive using the Visage application programming interface. Following sequence selection and standardization, HD-BET was applied to extract brain parenchyma from each image, ensuring the removal of identifiable facial features.
MRI scans are accompanied by an Excel file containing separate sheets for clinical data and radiologic image acquisition parameters. Each brain metastasis study includes up to four files corresponding to T1W, T1CE, T2W, and/or FLAIR sequences. All imaging data were exported from the Visage AI Accelerator in NIfTI format and processed for brain extraction.
File names follow a standardized format, incorporating an anonymous patient identifier, anonymized study date-time, and sequence type, structured as caseID_date-time_sequence.nii.gz to ensure clarity and consistency across the dataset.
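As an illustration of this naming convention, the sketch below splits a file name into its three parts. It assumes the case identifier and the sequence label contain no underscores; the exact date-time encoding is not specified here, so it is kept as an opaque string. The function name and the sample file name are hypothetical.

```python
from typing import NamedTuple

SUFFIX = ".nii.gz"


class ScanName(NamedTuple):
    case_id: str
    date_time: str
    sequence: str


def parse_scan_name(filename: str) -> ScanName:
    """Split caseID_date-time_sequence.nii.gz into its components."""
    if not filename.endswith(SUFFIX):
        raise ValueError(f"expected a {SUFFIX} file, got {filename!r}")
    parts = filename[: -len(SUFFIX)].split("_")
    if len(parts) < 3:
        raise ValueError(f"expected caseID_date-time_sequence, got {filename!r}")
    # First field is the case ID, last is the sequence; anything between
    # is treated as the anonymized date-time (assumption, see lead-in).
    return ScanName(parts[0], "_".join(parts[1:-1]), parts[-1])


# Hypothetical file name, for illustration only.
scan = parse_scan_name("case0001_20100101-120000_FLAIR.nii.gz")
```

Grouping the parsed names by `case_id` and sorting by `date_time` would give the per-patient study order needed for longitudinal analysis, provided the date-time encoding sorts chronologically.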
Timeseries data from 'Yale Boathouse, Thames River, CT' (noaa_nos_co_ops_8461467)
cdm_data_type=TimeSeries
cdm_timeseries_variables=station,longitude,latitude
contributor_email=feedback@axiomdatascience.com
contributor_name=Axiom Data Science
contributor_role=processor
contributor_role_vocabulary=NERC
contributor_url=https://www.axiomdatascience.com
Conventions=IOOS-1.2, CF-1.6, ACDD-1.3, NCCSV-1.2
defaultDataQuery=sea_surface_height_amplitude_due_to_geocentric_ocean_tide_above_mllw_qc_agg,z,sea_surface_height_amplitude_due_to_geocentric_ocean_tide_above_mllw,time&time>=max(time)-3days
Easternmost_Easting=-72.0933
featureType=TimeSeries
geospatial_lat_max=41.43
geospatial_lat_min=41.43
geospatial_lat_units=degrees_north
geospatial_lon_max=-72.0933
geospatial_lon_min=-72.0933
geospatial_lon_units=degrees_east
geospatial_vertical_max=0.0
geospatial_vertical_min=0.0
geospatial_vertical_positive=up
geospatial_vertical_units=m
history=Downloaded from NOAA Center for Operational Oceanographic Products and Services (CO-OPS) at https://tidesandcurrents.noaa.gov/api/
id=116052
infoUrl=https://sensors.ioos.us/#metadata/116052/station
institution=NOAA Center for Operational Oceanographic Products and Services (CO-OPS)
naming_authority=com.axiomdatascience
Northernmost_Northing=41.43
platform=fixed
platform_name=Yale Boathouse, Thames River, CT
platform_vocabulary=http://mmisw.org/ont/ioos/platform
processing_level=Level 2
references=https://tidesandcurrents.noaa.gov/stationhome.html?id=8461467,https://tidesandcurrents.noaa.gov/api/,
sourceUrl=https://tidesandcurrents.noaa.gov/api/
Southernmost_Northing=41.43
standard_name_vocabulary=CF Standard Name Table v72
station_id=116052
time_coverage_end=2025-07-20T10:01:00Z
time_coverage_start=2022-06-03T04:53:00Z
Westernmost_Easting=-72.0933
https://search.gesis.org/research_data/datasearch-httpwww-da-ra-deoaip--oaioai-da-ra-de517945
Abstract (en): Inclusion/exclusion: (1) Three groups of scholars were surveyed about their experiences attempting to replicate statistical studies: students from the author's PhD methods class, students from Gary King’s PhD methods class, and subscribers to the Political Methodology listserv. (2) Data was collected on the availability of replication files for recent publications in the two top political science journals, the American Political Science Review (APSR) since 2010 and the American Journal of Political Science (AJPS) since 2009. Source of metadata: The Institution for Social and Policy Studies (2014). Science Deserves Better: The Imperative to Share Complete Replication Files. Retrieved August 8th, 2016 from http://isps.yale.edu/research/data/d108
The 2022 Environmental Performance Index (EPI) ranks 180 countries on 40 performance indicators in the following 11 issue categories: air quality, sanitation and drinking water, heavy metals, waste management, biodiversity and habitat, ecosystem services, fisheries, acid rain, agriculture, water resources, and climate change mitigation. These categories track performance and progress on three broad policy objectives: environmental health, ecosystem vitality, and climate change. The EPI's proximity-to-target methodology facilitates cross-country comparisons among economic and regional peer groups. The data set includes the 2022 EPI, component scores, and time-series source data. It is the result of a collaboration of the Yale Center for Environmental Law and Policy (YCELP), Yale University, and the Columbia University Center for International Earth Science Information Network (CIESIN).
The Yale Center for Earth Observation (YCEO) Surface Urban Heat Islands, Version 4, 2003-2018 includes annual, summertime, and wintertime Surface Urban Heat Island (SUHI) intensities for daytime and nighttime for over 10,000 global urban extents. This global SUHI data set was created using the Simplified Urban-Extent (SUE) algorithm and is available at the pixel and urban cluster-levels (i.e. at the level of larger urban agglomerations). Monthly composites are also available as urban cluster means. A summary of older versions, including changes from the data set created and analyzed in the originally published manuscript (Chakraborty and Lee, 2019) can be found on the YCEO Global Surface UHI Explorer website (https://yceo.yale.edu/research/global-surface-uhi-explorer).
LISTOS_Ground_YaleCoastal_Data is the Long Island Sound Tropospheric Ozone Study (LISTOS) ground site data collected at the Yale Coastal ground site during the LISTOS field campaign. This product is a result of a joint effort across multiple agencies, including NASA, NOAA, the EPA, Northeast States for Coordinated Air Use Management (NESCAUM), Maine Department of Environmental Protection, New Jersey Department of Environmental Protection, New York State Department of Environmental Conservation, and several research groups at universities. Data collection is complete.
The New York City (NYC) metropolitan area (comprising portions of New Jersey, New York, and Connecticut in and around NYC) is home to over 20 million people, and millions more live downwind in neighboring states. This area persistently struggles to meet past and recently revised federal health-based air quality standards for ground-level ozone, which affects the health and well-being of residents living in the area. A unique feature of this chronic ozone problem is the pollution transported in a northeast direction out of NYC over Long Island Sound. The relatively cool waters of Long Island Sound confine the pollutants in a shallow and stable marine boundary layer. Afternoon heating over coastal land creates a sea breeze that carries the air pollution inland from the confined marine layer, resulting in high ozone concentrations in Connecticut and, at times, farther east into Rhode Island and Massachusetts. To investigate the evolving nature of ozone formation and transport in the NYC region and downwind, Northeast States for Coordinated Air Use Management (NESCAUM) launched the Long Island Sound Tropospheric Ozone Study (LISTOS). LISTOS was a multi-agency collaborative study focusing on Long Island Sound and the surrounding coastlines that continually suffer from poor air quality exacerbated by land/water circulation. The primary measurements took place between June and September 2018 and included in-situ and remote sensing instrumentation integrated aboard three aircraft, along with a network of ground sites, mobile vehicles, boat measurements, and ozonesondes. The goal of LISTOS was to improve the understanding of ozone chemistry and sea-breeze-transported pollution over Long Island Sound and its coastlines.
LISTOS also provided NASA the opportunity to test air quality remote sensing retrievals with the use of its airborne simulators (GEOstationary Coastal and Air Pollution Events (GEO-CAPE) Airborne Simulator (GCAS), and Geostationary Trace gas and Aerosol Sensor Optimization (GeoTASO)) in preparation for the Tropospheric Emissions: Monitoring of Pollution (TEMPO) observations for monitoring air quality from space. LISTOS also helped collaborators validate Tropospheric Monitoring Instrument (TROPOMI) science products using airborne and ground-based measurements of ozone, NO2, and HCHO.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Units are randomly assigned, with equal probability, to receive a positive or negative fact on their Wikipedia page. In some studies, units were also randomly assigned, with equal probability, to have the fact cited or not. Treatment was the addition of either a true positive or true negative fact to the Wikipedia page of US Senators. Outcome measure was the length of time fact lasted on Wikipedia page. Data also archived at http://isps.yale.edu/research/data/d132.
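The assignment scheme described above can be sketched as independent fair coin flips per unit. This is an illustrative sketch only, not the study's actual randomization code; the function name, seed handling, and unit labels are hypothetical.

```python
import random


def assign_treatments(units, seed=0, with_citation=False):
    """Assign each unit a positive or negative fact with equal probability.

    If `with_citation` is True, a second independent fair draw decides
    whether the fact is cited (assumption: the two draws are independent).
    """
    rng = random.Random(seed)  # fixed seed keeps the assignment reproducible
    assignments = {}
    for unit in units:
        arm = {"fact": rng.choice(["positive", "negative"])}
        if with_citation:
            arm["cited"] = rng.choice([True, False])
        assignments[unit] = arm
    return assignments


# Hypothetical unit labels, for illustration only.
plan = assign_treatments(["senator_a", "senator_b", "senator_c"], seed=1, with_citation=True)
```

Because each unit is drawn independently, the arms will not necessarily be equal in size; a design that requires balanced arms would shuffle a fixed list of treatment labels instead.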
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This page contains the data necessary to replicate the results of "The Whistleblower Industrial Complex" by Alexander I. Platt, published in the Yale Journal on Regulation (2023).
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This data was collected with support from J-PAL's Cash Transfers for Child Health (CaTCH) initiative with the aim of understanding whether mobile phones can improve women's awareness and take-up of maternity benefits. The data is also part of a larger study focused on understanding constraints to women's mobile phone use and how to close India’s digital gender gap. Under the CaTCH research, women were called and provided information about how to access public maternal health-focused conditional cash transfers (CCTs); phone and in-person surveys were used to understand knowledge changes. This dataset includes three waves of phone surveys and a final follow-up survey conducted in person.
Note for users:
1. There are a total of 13 files that are relevant to this dataset. Please download the full packet (data and documentation) for the best user experience. [Access Dataset > Original Format ZIP].
2. Please read the "0. User guide.pdf" document first. It contains important information about the other files in the ZIP file.
3. Please also note that this dataset can be used to conduct descriptive analysis but does not contain treatment indicators.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
An exit survey of Tunisian voters conducted in five governorates on the day of the 2014 parliamentary elections. Collected by Chantal Berman (Princeton University) and Elizabeth Nugent (Yale University). Replication data for the article "Regionalism in New Democracies: The Authoritarian Origins of Voter-Party Linkages," forthcoming at Political Research Quarterly.
The 2018 Environmental Performance Index (EPI) ranks 180 countries on 24 performance indicators in the following 10 issue categories: air quality, water and sanitation, heavy metals, biodiversity and habitat, forests, fisheries, climate and energy, air pollution, water resources, and agriculture. These categories track performance and progress on two broad policy objectives, environmental health and ecosystem vitality. The EPI's proximity-to-target methodology facilitates cross-country comparisons among economic and regional peer groups. The data set includes the 2018 EPI, component scores, and time-series source data. The 2018 EPI was formally released in Davos, Switzerland, at the annual meeting of the World Economic Forum in January 2018. It is the result of collaboration of the Yale Center for Environmental Law and Policy (YCELP), Yale University, Columbia University Center for International Earth Science Information Network (CIESIN), and the World Economic Forum (WEF). The Interactive Website for the 2018 EPI is at https://epi.envirocenter.yale.edu/.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
The replication archive contains R and Stata scripts as well as datasets to reproduce simulation and applied results in Aronow, Samii, and Assenova, “Cluster Robust Variance Estimation for Dyadic Data.” R scripts 1-3 reproduce the simulation results. Stata script 4 reproduces Russett and Oneal’s original estimates and also creates a dataset for our reanalysis. R script 5 reproduces our reanalysis of Russett and Oneal. R script 6 reproduces our reanalysis of Fisman et al. TRIANGLE.dta are the data for the Russett and Oneal application (obtained from Russett’s homepage: http://pantheon.yale.edu/~brusset/PeaceStata.zip). Speed Dating Data.csv are the data for the Fisman et al. application (obtained from the website for Gelman and Hill 2007: http://www.stat.columbia.edu/~gelman/arm/).
The 2020 Environmental Performance Index (EPI) ranks 180 countries on 32 performance indicators in the following 11 issue categories: air quality, sanitation and drinking water, heavy metals, waste management, biodiversity and habitat, ecosystem services, fisheries, climate change, pollution emissions, agriculture, and water resources. These categories track performance and progress on two broad policy objectives, environmental health and ecosystem vitality. The EPI's proximity-to-target methodology facilitates cross-country comparisons among economic and regional peer groups. The data set includes the 2020 EPI, component scores, and time-series source data. It is the result of a collaboration of the Yale Center for Environmental Law and Policy (YCELP), Yale University, and the Columbia University Center for International Earth Science Information Network (CIESIN). The Interactive Website for the 2020 EPI is at https://epi.yale.edu/.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This page contains the code and instructions necessary to replicate the results of "Restoring Indian Reservation Status: An Empirical Analysis" by Michael K. Velchik & Jeffery Y. Zhang, published in the Yale Journal on Regulation (2023).
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Data extraction categories.