100+ datasets found

Dataset for Exploring case-control samples with non-targeted analysis
catalog.data.gov
datasets.ai
Updated Nov 12, 2020
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
U.S. EPA Office of Research and Development (ORD) (2020). Dataset for Exploring case-control samples with non-targeted analysis [Dataset]. https://catalog.data.gov/dataset/dataset-for-exploring-case-control-samples-with-non-targeted-analysis
Explore at:
Dataset updated
Nov 12, 2020
Dataset provided by
United States Environmental Protection Agencyhttp://www.epa.gov/
Description
These data contain the results of GC-MS, LC-MS and immunochemistry analyses of mask sample extracts. The data include tentatively identified compounds through library searches and compound abundance. This dataset is not publicly accessible because: EPA cannot release personally identifiable information regarding living individuals, according to the Privacy Act and the Freedom of Information Act (FOIA). This dataset contains information about human research subjects. Because there is potential to identify individual participants and disclose personal information, either alone or in combination with other datasets, individual level data are not appropriate to post for public access. Restricted access may be granted to authorized persons by contacting the party listed. It can be accessed through the following means: The data can not be accessed. Format: The dataset contains the identification of compounds found in the mask samples as well as the abundance of those compounds for individuals who participated in the trial. This dataset is associated with the following publication: Pleil, J., M. Wallace, J. McCord, M. Madden, J. Sobus, and G. Ferguson. How do cancer-sniffing dogs sort biological samples? Exploring case-control samples with non-targeted LC-Orbitrap, GC-MS, and immunochemistry methods. Journal of Breath Research. Institute of Physics Publishing, Bristol, UK, 14(1): 016006, (2019).
CASE_Dataset-full
springernature.figshare.com
zip
Updated May 30, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Claudio Castellini; Friedhelm Schwenker; Karan Sharma; Egon L. van den Broek; Alin Albu-Schaeffer (2023). CASE_Dataset-full [Dataset]. http://doi.org/10.6084/m9.figshare.8869157.v1
Explore at:
zipAvailable download formats
Unique identifier
https://doi.org/10.6084/m9.figshare.8869157.v1
Dataset updated
May 30, 2023
Dataset provided by
Figsharehttp://figshare.com/
figshare
Authors
Claudio Castellini; Friedhelm Schwenker; Karan Sharma; Egon L. van den Broek; Alin Albu-Schaeffer
License
CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
This file contains the full (i.e., data from all 30 participants) CASE dataset.
w
Dataset of books called Clinical cases in critical care
workwithdata.com
Updated Apr 17, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Work With Data (2025). Dataset of books called Clinical cases in critical care [Dataset]. https://www.workwithdata.com/datasets/books?f=1&fcol0=book&fop0=%3D&fval0=Clinical+cases+in+critical+care
Explore at:
Dataset updated
Apr 17, 2025
Dataset authored and provided by
Work With Data
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset is about books. It has 1 row and is filtered where the book is Clinical cases in critical care. It features 7 columns including author, publication date, language, and book publisher.
i
COVID-19 Case Demographics by County and District - Dataset - The Indiana...
hub.mph.in.gov
Updated Jul 27, 2020
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
(2020). COVID-19 Case Demographics by County and District - Dataset - The Indiana Data Hub [Dataset]. https://hub.mph.in.gov/dataset/covid-19-case-demographics-by-county
Explore at:
Dataset updated
Jul 27, 2020
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Archived as of 11/15/2023: With the end of the federal emergency and reporting requirements continuing to evolve, the Indiana Department of Health will no longer publish and refresh the COVID-19 datasets after November 15, 2023 - one final dataset publication will continue to be available as an archival copy. By county breakdown of demographics (age group, gender, race, ethnicity) All data displayed is preliminary and subject to change as more information is reported to ISDH. Expect historical data to change as data is reported to ISDH.
Case Mix Index
data.chhs.ca.gov
data.ca.gov
+3more
docx, pdf, xlsx, zip
Updated Nov 13, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Department of Health Care Access and Information (2024). Case Mix Index [Dataset]. https://data.chhs.ca.gov/dataset/case-mix-index
Explore at:
xlsx(185114), zip, pdf, docxAvailable download formats
Dataset updated
Nov 13, 2024
Dataset authored and provided by
Department of Health Care Access and Information
Description
The Case Mix Index (CMI) is the average relative DRG weight of a hospital’s inpatient discharges, calculated by summing the Medicare Severity-Diagnosis Related Group (MS-DRG) weight for each discharge and dividing the total by the number of discharges. The CMI reflects the diversity, clinical complexity, and resource needs of all the patients in the hospital. A higher CMI indicates a more complex and resource-intensive case load. Although the MS-DRG weights, provided by the Centers for Medicare & Medicaid Services (CMS), were designed for the Medicare population, they are applied here to all discharges regardless of payer. Note: It is not meaningful to add the CMI values together.
Earnings Case Management System
catalog.data.gov
s.cnmilf.com
+1more
Updated Sep 19, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Social Security Administration (2025). Earnings Case Management System [Dataset]. https://catalog.data.gov/dataset/earnings-case-management-system
Explore at:
Dataset updated
Sep 19, 2025
Dataset provided by
Social Security Administrationhttp://ssa.gov/
Description
This database supports Earnings Corrections.
Federal Court Cases: Integrated Data Base, 2014
catalog.data.gov
icpsr.umich.edu
Updated Mar 12, 2025
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Bureau of Justice Statistics (2025). Federal Court Cases: Integrated Data Base, 2014 [Dataset]. https://catalog.data.gov/dataset/federal-court-cases-integrated-data-base-2014-37ca8
Explore at:
Dataset updated
Mar 12, 2025
Dataset provided by
Bureau of Justice Statisticshttp://bjs.ojp.gov/
Description
The purpose of this data collection is to provide an official public record of the business of the federal courts. The data originate from district and appellate court offices throughout the United States. Information was obtained at two points in the life of appellate and civil cases: filing and termination. The termination data (archived in this data collection) contain information on both filing and terminations, while the pending data (archived as a separate data collection) contain only filing information on the most recent pending cases. The unit of analysis for the appellate and civil terminations data is the case. This collection also contains data on criminal cases in federal courts. However, the unit of analysis for the criminal data is the defendant, and a defendant can be included in several cases.
h
Caselaw_Access_Project_FAISS_index
huggingface.co
Updated Mar 2, 2024
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
FreeLaw (2024). Caselaw_Access_Project_FAISS_index [Dataset]. https://huggingface.co/datasets/free-law/Caselaw_Access_Project_FAISS_index
Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Mar 2, 2024
Dataset authored and provided by
FreeLaw
License
https://choosealicense.com/licenses/cc0-1.0/https://choosealicense.com/licenses/cc0-1.0/
Description
The Caselaw Access Project

In collaboration with Ravel Law, Harvard Law Library digitized over 40 million U.S. court decisions consisting of 6.7 million cases from the last 360 years into a dataset that is widely accessible to use. Access a bulk download of the data through the Caselaw Access Project API (CAPAPI): https://case.law/caselaw/ Find more information about accessing state and federal written court decisions of common law through the bulk data service documentation here:… See the full description on the dataset page: https://huggingface.co/datasets/free-law/Caselaw_Access_Project_FAISS_index.
h
MultiCaRe_Dataset
huggingface.co
Updated Apr 4, 2025
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Mauro Nievas Offidani (2025). MultiCaRe_Dataset [Dataset]. https://huggingface.co/datasets/mauro-nievoff/MultiCaRe_Dataset
Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Apr 4, 2025
Authors
Mauro Nievas Offidani
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The dataset contains multi-modal data from over 75,000 open access and de-identified case reports, including metadata, clinical cases, image captions and more than 130,000 images. Images and clinical cases belong to different medical specialties, such as oncology, cardiology, surgery and pathology. The structure of the dataset allows to easily map images with their corresponding article metadata, clinical case, captions and image labels. Details of the data structure can be found in the file… See the full description on the dataset page: https://huggingface.co/datasets/mauro-nievoff/MultiCaRe_Dataset.
Data from: COVID-19 Case Surveillance Public Use Data with Geography
data.cdc.gov
odgavaprod.ogopendata.com
+4more
csv, xlsx, xml
Updated Jul 9, 2024
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
CDC Data, Analytics and Visualization Task Force (2024). COVID-19 Case Surveillance Public Use Data with Geography [Dataset]. https://data.cdc.gov/w/n8mc-b4w4/tdwk-ruhb?cur=pc3QpBSU3ov&from=D_Lyelr9Yy9
Explore at:
xlsx, xml, csvAvailable download formats
Dataset updated
Jul 9, 2024
Dataset provided by
Centers for Disease Control and Preventionhttp://www.cdc.gov/
Authors
CDC Data, Analytics and Visualization Task Force
License
https://www.usa.gov/government-workshttps://www.usa.gov/government-works
Description
Note: Reporting of new COVID-19 Case Surveillance data will be discontinued July 1, 2024, to align with the process of removing SARS-CoV-2 infections (COVID-19 cases) from the list of nationally notifiable diseases. Although these data will continue to be publicly available, the dataset will no longer be updated.

Authorizations to collect certain public health data expired at the end of the U.S. public health emergency declaration on May 11, 2023. The following jurisdictions discontinued COVID-19 case notifications to CDC: Iowa (11/8/21), Kansas (5/12/23), Kentucky (1/1/24), Louisiana (10/31/23), New Hampshire (5/23/23), and Oklahoma (5/2/23). Please note that these jurisdictions will not routinely send new case data after the dates indicated. As of 7/13/23, case notifications from Oregon will only include pediatric cases resulting in death.

This case surveillance public use dataset has 19 elements for all COVID-19 cases shared with CDC and includes demographics, geography (county and state of residence), any exposure history, disease severity indicators and outcomes, and presence of any underlying medical conditions and risk behaviors.

Currently, CDC provides the public with three versions of COVID-19 case surveillance line-listed data: this 19 data element dataset with geography, a 12 data element public use dataset, and a 33 data element restricted access dataset.

The following apply to the public use datasets and the restricted access dataset:
Data elements can be found on the COVID-19 case report form located at www.cdc.gov/coronavirus/2019-ncov/downloads/pui-form.pdf.
Data are considered provisional by CDC and are subject to change until the data are reconciled and verified with the state and territorial data providers.
Some data are suppressed to protect individual privacy.
Datasets will include all cases with the earliest date available in each record (date received by CDC or date related to illness/specimen collection) at least 14 days prior to the creation of the current datasets. This 14-day lag allows case reporting to be stabilized and ensure that time-dependent outcome data are accurately captured.
Datasets are updated monthly.
Datasets are created using CDC’s Policy on Public Health Research and Nonresearch Data Management and Access and include protections designed to protect individual privacy.
For more information about data collection and reporting, please see https://www.cdc.gov/coronavirus/2019-ncov/covid-data/about-us-cases-deaths.html.
For more information about the COVID-19 case surveillance data, please see https://www.cdc.gov/coronavirus/2019-ncov/covid-data/faq-surveillance.html

Overview

The COVID-19 case surveillance database includes individual-level data reported to U.S. states and autonomous reporting entities, including New York City and the District of Columbia (D.C.), as well as U.S. territories and affiliates. On April 5, 2020, COVID-19 was added to the Nationally Notifiable Condition List and classified as “immediately notifiable, urgent (within 24 hours)” by a Council of State and Territorial Epidemiologists (CSTE) Interim Position Statement (Interim-20-ID-01). CSTE updated the position statement on August 5, 2020, to clarify the interpretation of antigen detection tests and serologic test results within the case classification (Interim-20-ID-02). The statement also recommended that all states and territories enact laws to make COVID-19 reportable in their jurisdiction, and that jurisdictions conducting surveillance should submit case notifications to CDC. COVID-19 case surveillance data are collected by jurisdictions and reported voluntarily to CDC.

For more information: NNDSS Supports the COVID-19 Response | CDC.

COVID-19 Case Reports COVID-19 case reports are routinely submitted to CDC by public health jurisdictions using nationally standardized case reporting forms. On April 5, 2020, CSTE released an Interim Position Statement with national surveillance case definitions for COVID-19. Current versions of these case definitions are available at: https://ndc.services.cdc.gov/case-definitions/coronavirus-disease-2019-2021/. All cases reported on or after were requested to be shared by public health departments to CDC using the standardized case definitions for lab-confirmed or probable cases. On May 5, 2020, the standardized case reporting form was revised. States and territories continue to use this form.

Data are Considered Provisional

The COVID-19 case surveillance data are dynamic; case reports can be modified at any time by the jurisdictions sharing COVID-19 data with CDC. CDC may update prior cases shared with CDC based on any updated information from jurisdictions. For instance, as new information is gathered about previously reported cases, health departments provide updated data to CDC. As more information and data become available, analyses might find changes in surveillance data and trends during a previously reported time window. Data may also be shared late with CDC due to the volume of COVID-19 cases.
Annual finalized data: To create the final NNDSS data used in the annual tables, CDC works carefully with the reporting jurisdictions to reconcile the data received during the year until each state or territorial epidemiologist confirms that the data from their area are correct.

Access Addressing Gaps in Public Health Reporting of Race and Ethnicity for COVID-19, a report from the Council of State and Territorial Epidemiologists, to better understand the challenges in completing race and ethnicity data for COVID-19 and recommendations for improvement.

Data Limitations

To learn more about the limitations in using case surveillance data, visit FAQ: COVID-19 Data and Surveillance.

Data Quality Assurance Procedures

CDC’s Case Surveillance Section routinely performs data quality assurance procedures (i.e., ongoing corrections and logic checks to address data errors). To date, the following data cleaning steps have been implemented:
Questions that have been left unanswered (blank) on the case report form are reclassified to a Missing value, if applicable to the question. For example, in the question "Was the individual hospitalized?" where the possible answer choices include "Yes," "No," or "Unknown," the blank value is recoded to "Missing" because the case report form did not include a response to the question.
Logic checks are performed for date data. If an illogical date has been provided, CDC reviews the data with the reporting jurisdiction. For example, if a symptom onset date in the future is reported to CDC, this value is set to null until the reporting jurisdiction updates the date appropriately.
Additional data quality processing to recode free text data is ongoing. Data on symptoms, race, ethnicity, and healthcare worker status have been prioritized.

Data Suppression

To prevent release of data that could be used to identify people, data cells are suppressed for low frequency (<11 COVID-19 case records with a given values). Suppression includes low frequency combinations of case month, geographic characteristics (county and state of residence), and demographic characteristics (sex, age group, race, and ethnicity). Suppressed values are re-coded to the NA answer option; records with data suppression are never removed.

Additional COVID-19 Data

COVID-19 data are available to the public as summary or aggregate count files, including total counts of cases and deaths by state and by county. These and other COVID-19 data are available from multiple public locations: COVID Data Tracker; United States COVID-19 Cases and Deaths by State; COVID-19 Vaccination Reporting Data Systems; and COVID-19 Death Data and Resources.

Notes:

March 1, 2022: The "COVID-19 Case Surveillance Public Use Data with Geography" will be updated on a monthly basis.

April 7, 2022: An adjustment was made to CDC’s cleaning algorithm for COVID-19 line level case notification data. An assumption in CDC's algorithm led to misclassifying deaths that were not COVID-19 related. The algorithm has since been revised, and this dataset update reflects corrected individual level information about death status for all cases collected to date.

June 25, 2024: An adjustment
m
Appeal Cases heard at the Supreme Court of Nigeria Dataset
data.mendeley.com
Updated Jun 9, 2023
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Jeremiah Balogun (2023). Appeal Cases heard at the Supreme Court of Nigeria Dataset [Dataset]. http://doi.org/10.17632/ky6zfyf669.1
Explore at:
Unique identifier
https://doi.org/10.17632/ky6zfyf669.1
Dataset updated
Jun 9, 2023
Authors
Jeremiah Balogun
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
Nigeria
Description
The dataset contains information about appeal cases heard at the Supreme Court of Nigeria (SCN) between the years 1962 to 2022. The dataset was extracted from case files that were provided by The Prison Law Pavillion; a data archiving firm in Nigeria. The dataset originally consisted of documentation of the various appeal cases alongside the outcome of the judgment of the SCN. Feature extraction techniques were used to generate a structured dataset containing information about a number of annotated features. Some of the features were stored as string values while some of the features were stored as numeric values. The dataset consists of information about 14 features including the outcome of the judgment. 13 features are the input variables among which 4 are stored as strings while the remaining 9 were stored as numeric values. Missing values among the numeric values were represented using the value -1. Unsupervised and Supervised machine learning algorithms can be applied to the dataset for the purpose of extracting important information required for gaining a better understanding of the relationship that exists among the features and with respect to predicting the target class which is the outcome of the SCN judgment.
J
E-Courts dataset (2010-2018)
justicehub.in
pdf
Updated Mar 1, 2021
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
(2021). E-Courts dataset (2010-2018) [Dataset]. https://justicehub.in/dataset/e-courts-dataset-2010-2018
Explore at:
pdf(4596833)Available download formats
Dataset updated
Mar 1, 2021
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
We obtained case records from the Indian e-Courts platform - a public system put in place by the Indian government in 2013. The publicly available information includes the filing, registration, hearing, and decision dates for each case, as well as petitioner and respondent names, the position of the presiding judge, the acts and sections under which the case was filed, and the final decision or disposition.

The database covers India's lower judiciary -- all courts including and under the jurisdiction of District and Sessions courts. We also obtained data on judges pertaining to all courts in the Indian lower judiciary from the e-Courts platform. The data for each judge includes the judge's name, their position or designation, and the start and end date of the judge's appointment to each court. We joined the case-level data with the judge-level data based on the judge's designation and the initial case filing date. However, this case-judge matching process was only conducted for criminal cases and has a match rate between 50% and 75% depending on the jurisdiction.
h
casehold
huggingface.co
Updated Jul 2, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
CaseHOLD (2023). casehold [Dataset]. https://huggingface.co/datasets/casehold/casehold
Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Jul 2, 2023
Dataset authored and provided by
CaseHOLD
Description
CaseHOLD (Case Holdings On Legal Decisions) is a law dataset comprised of over 53,000+ multiple choice questions to identify the relevant holding of a cited case.
w
Dataset of books by Alison A. Case
workwithdata.com
Updated Apr 17, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Work With Data (2025). Dataset of books by Alison A. Case [Dataset]. https://www.workwithdata.com/datasets/books?f=1&fcol0=author&fop0=%3D&fval0=Alison+A.+Case
Explore at:
Dataset updated
Apr 17, 2025
Dataset authored and provided by
Work With Data
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset is about books. It has 6 rows and is filtered where the author is Alison A. Case. It features 7 columns including author, publication date, language, and book publisher.
w
Dataset of books called Multiple case study analysis
workwithdata.com
Updated Apr 17, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Work With Data (2025). Dataset of books called Multiple case study analysis [Dataset]. https://www.workwithdata.com/datasets/books?f=1&fcol0=book&fop0=%3D&fval0=Multiple+case+study+analysis
Explore at:
Dataset updated
Apr 17, 2025
Dataset authored and provided by
Work With Data
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset is about books. It has 1 row and is filtered where the book is Multiple case study analysis. It features 7 columns including author, publication date, language, and book publisher.
United States Supreme Court Judicial Database Terms Series
catalog.data.gov
datasets.ai
+1more
Updated Mar 12, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Bureau of Justice Statistics (2025). United States Supreme Court Judicial Database Terms Series [Dataset]. https://catalog.data.gov/dataset/united-states-supreme-court-judicial-database-terms-series-aea2b
Explore at:
Dataset updated
Mar 12, 2025
Dataset provided by
Bureau of Justice Statisticshttp://bjs.ojp.gov/
Area covered
United States
Description
Investigator(s): Harold J. Spaeth, James L. Gibson, Michigan State University This data collection encompasses all aspects of United States Supreme Court decision-making from the beginning of the Warren Court in 1953 up to the completion of the 1995 term of the Rehnquist Court on July 1, 1996, including any decisions made afterward but before the start of the 1996 term on October 7, 1996. In this collection, distinct aspects of the court's decisions are covered by six types of variables: (1) identification variables including case citation, docket number, unit of analysis, and number of records per unit of analysis, (2) background variables offering information on origin of case, source of case, reason for granting cert, parties to the case, direction of the lower court's decision, and manner in which the Court takes jurisdiction, (3) chronological variables covering date of term of court, chief justice, and natural court, (4) substantive variables including multiple legal provisions, authority for decision, issue, issue areas, and direction of decision, (5) outcome variables supplying information on form of decision, disposition of case, winning party, declaration of unconstitutionality, and multiple memorandum decisions, and (6) voting and opinion variables pertaining to the vote in the case and to the direction of the individual justices' votes.Years Produced: Annually
w
Dataset of books called Case counterfeit
workwithdata.com
Updated Apr 17, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Work With Data (2025). Dataset of books called Case counterfeit [Dataset]. https://www.workwithdata.com/datasets/books?f=1&fcol0=book&fop0=%3D&fval0=Case+counterfeit
Explore at:
Dataset updated
Apr 17, 2025
Dataset authored and provided by
Work With Data
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset is about books. It has 1 row and is filtered where the book is Case counterfeit. It features 7 columns including author, publication date, language, and book publisher.
h
Data from: RadCases
huggingface.co
Updated Sep 28, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Michael Yao (2024). RadCases [Dataset]. http://doi.org/10.57967/hf/6013
Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Unique identifier
https://doi.org/10.57967/hf/6013
Dataset updated
Sep 28, 2024
Authors
Michael Yao
License
MIT Licensehttps://opensource.org/licenses/MIT
License information was derived automatically
Description
RadCases Dataset

This HuggingFace (HF) dataset contains the raw case labels for input patient "one-liner" case summaries according to the ACR Appropriateness Criteria. Because many of the sources of data used to construct the RadCases dataset require credentialed access, we cannot publicly release the input patient case summaries. Instead, the "cases" included in this publicly available dataset are the cryptographically secure SHA-512 hashes of the original, "human-readable" cases.… See the full description on the dataset page: https://huggingface.co/datasets/michaelsyao/RadCases.
I
Cline Center Coup d’État Project Dataset
databank.illinois.edu
Updated May 11, 2025
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Buddy Peyton; Joseph Bajjalieh; Dan Shalmon; Michael Martin; Emilio Soto (2025). Cline Center Coup d’État Project Dataset [Dataset]. http://doi.org/10.13012/B2IDB-9651987_V7
Explore at:
Unique identifier
https://doi.org/10.13012/B2IDB-9651987_V7
Dataset updated
May 11, 2025
Authors
Buddy Peyton; Joseph Bajjalieh; Dan Shalmon; Michael Martin; Emilio Soto
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Coups d'Ètat are important events in the life of a country. They constitute an important subset of irregular transfers of political power that can have significant and enduring consequences for national well-being. There are only a limited number of datasets available to study these events (Powell and Thyne 2011, Marshall and Marshall 2019). Seeking to facilitate research on post-WWII coups by compiling a more comprehensive list and categorization of these events, the Cline Center for Advanced Social Research (previously the Cline Center for Democracy) initiated the Coup d’État Project as part of its Societal Infrastructures and Development (SID) project. More specifically, this dataset identifies the outcomes of coup events (i.e., realized, unrealized, or conspiracy) the type of actor(s) who initiated the coup (i.e., military, rebels, etc.), as well as the fate of the deposed leader. Version 2.1.3 adds 19 additional coup events to the data set, corrects the date of a coup in Tunisia, and reclassifies an attempted coup in Brazil in December 2022 to a conspiracy. Version 2.1.2 added 6 additional coup events that occurred in 2022 and updated the coding of an attempted coup event in Kazakhstan in January 2022. Version 2.1.1 corrected a mistake in version 2.1.0, where the designation of “dissident coup” had been dropped in error for coup_id: 00201062021. Version 2.1.1 fixed this omission by marking the case as both a dissident coup and an auto-coup. Version 2.1.0 added 36 cases to the data set and removed two cases from the v2.0.0 data. This update also added actor coding for 46 coup events and added executive outcomes to 18 events from version 2.0.0. A few other changes were made to correct inconsistencies in the coup ID variable and the date of the event. Version 2.0.0 improved several aspects of the previous version (v1.0.0) and incorporated additional source material to include: • Reconciling missing event data • Removing events with irreconcilable event dates • Removing events with insufficient sourcing (each event needs at least two sources) • Removing events that were inaccurately coded as coup events • Removing variables that fell below the threshold of inter-coder reliability required by the project • Removing the spreadsheet ‘CoupInventory.xls’ because of inadequate attribution and citations in the event summaries • Extending the period covered from 1945-2005 to 1945-2019 • Adding events from Powell and Thyne’s Coup Data (Powell and Thyne, 2011)
Items in this Dataset 1. Cline Center Coup d'État Codebook v.2.1.3 Codebook.pdf - This 15-page document describes the Cline Center Coup d’État Project dataset. The first section of this codebook provides a summary of the different versions of the data. The second section provides a succinct definition of a coup d’état used by the Coup d'État Project and an overview of the categories used to differentiate the wide array of events that meet the project's definition. It also defines coup outcomes. The third section describes the methodology used to produce the data. Revised February 2024 2. Coup Data v2.1.3.csv - This CSV (Comma Separated Values) file contains all of the coup event data from the Cline Center Coup d’État Project. It contains 29 variables and 1000 observations. Revised February 2024 3. Source Document v2.1.3.pdf - This 325-page document provides the sources used for each of the coup events identified in this dataset. Please use the value in the coup_id variable to identify the sources used to identify that particular event. Revised February 2024 4. README.md - This file contains useful information for the user about the dataset. It is a text file written in markdown language. Revised February 2024
Citation Guidelines 1. To cite the codebook (or any other documentation associated with the Cline Center Coup d’État Project Dataset) please use the following citation: Peyton, Buddy, Joseph Bajjalieh, Dan Shalmon, Michael Martin, Jonathan Bonaguro, and Scott Althaus. 2024. “Cline Center Coup d’État Project Dataset Codebook”. Cline Center Coup d’État Project Dataset. Cline Center for Advanced Social Research. V.2.1.3. February 27. University of Illinois Urbana-Champaign. doi: 10.13012/B2IDB-9651987_V7 2. To cite data from the Cline Center Coup d’État Project Dataset please use the following citation (filling in the correct date of access): Peyton, Buddy, Joseph Bajjalieh, Dan Shalmon, Michael Martin, Jonathan Bonaguro, and Emilio Soto. 2024. Cline Center Coup d’État Project Dataset. Cline Center for Advanced Social Research. V.2.1.3. February 27. University of Illinois Urbana-Champaign. doi: 10.13012/B2IDB-9651987_V7
z
Requirements data sets (user stories)
zenodo.org
data.mendeley.com
txt
Updated Jan 13, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Fabiano Dalpiaz; Fabiano Dalpiaz (2025). Requirements data sets (user stories) [Dataset]. http://doi.org/10.17632/7zbk8zsd8y.1
Explore at:
txtAvailable download formats
Unique identifier
https://doi.org/10.17632/7zbk8zsd8y.1
Dataset updated
Jan 13, 2025
Dataset provided by
Mendeley Data
Authors
Fabiano Dalpiaz; Fabiano Dalpiaz
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
A collection of 22 data set of 50+ requirements each, expressed as user stories.

The dataset has been created by gathering data from web sources and we are not aware of license agreements or intellectual property rights on the requirements / user stories. The curator took utmost diligence in minimizing the risks of copyright infringement by using non-recent data that is less likely to be critical, by sampling a subset of the original requirements collection, and by qualitatively analyzing the requirements. In case of copyright infringement, please contact the dataset curator (Fabiano Dalpiaz, f.dalpiaz@uu.nl) to discuss the possibility of removal of that dataset [see Zenodo's policies]

The data sets have been originally used to conduct experiments about ambiguity detection with the REVV-Light tool: https://github.com/RELabUU/revv-light

This collection has been originally published in Mendeley data: https://data.mendeley.com/datasets/7zbk8zsd8y/1

Overview of the datasets [data and links added in December 2024]

The following text provides a description of the datasets, including links to the systems and websites, when available. The datasets are organized by macro-category and then by identifier.

Public administration and transparency

g02-federalspending.txt (2018) originates from early data in the Federal Spending Transparency project, which pertain to the website that is used to share publicly the spending data for the U.S. government. The website was created because of the Digital Accountability and Transparency Act of 2014 (DATA Act). The specific dataset pertains a system called DAIMS or Data Broker, which stands for DATA Act Information Model Schema. The sample that was gathered refers to a sub-project related to allowing the government to act as a data broker, thereby providing data to third parties. The data for the Data Broker project is currently not available online, although the backend seems to be hosted in GitHub under a CC0 1.0 Universal license. Current and recent snapshots of federal spending related websites, including many more projects than the one described in the shared collection, can be found here.

g03-loudoun.txt (2018) is a set of extracted requirements from a document, by the Loudoun County Virginia, that describes the to-be user stories and use cases about a system for land management readiness assessment called Loudoun County LandMARC. The source document can be found here and it is part of the Electronic Land Management System and EPlan Review Project - RFP RFQ issued in March 2018. More information about the overall LandMARC system and services can be found here.

g04-recycling.txt(2017) concerns a web application where recycling and waste disposal facilities can be searched and located. The application operates through the visualization of a map that the user can interact with. The dataset has obtained from a GitHub website and it is at the basis of a students' project on web site design; the code is available (no license).

g05-openspending.txt (2018) is about the OpenSpending project (www), a project of the Open Knowledge foundation which aims at transparency about how local governments spend money. At the time of the collection, the data was retrieved from a Trello board that is currently unavailable. The sample focuses on publishing, importing and editing datasets, and how the data should be presented. Currently, OpenSpending is managed via a GitHub repository which contains multiple sub-projects with unknown license.

g11-nsf.txt (2018) refers to a collection of user stories referring to the NSF Site Redesign & Content Discovery project, which originates from a publicly accessible GitHub repository (GPL 2.0 license). In particular, the user stories refer to an early version of the NSF's website. The user stories can be found as closed Issues.

(Research) data and meta-data management

g08-frictionless.txt (2016) regards the Frictionless Data project, which offers an open source dataset for building data infrastructures, to be used by researchers, data scientists, and data engineers. Links to the many projects within the Frictionless Data project are on GitHub (with a mix of Unlicense and MIT license) and web. The specific set of user stories has been collected in 2016 by GitHub user @danfowler and are stored in a Trello board.

g14-datahub.txt (2013) concerns the open source project DataHub, which is currently developed via a GitHub repository (the code has Apache License 2.0). DataHub is a data discovery platform which has been developed over multiple years. The specific data set is an initial set of user stories, which we can date back to 2013 thanks to a comment therein.

g16-mis.txt (2015) is a collection of user stories that pertains a repository for researchers and archivists. The source of the dataset is a public Trello repository. Although the user stories do not have explicit links to projects, it can be inferred that the stories originate from some project related to the library of Duke University.

g17-cask.txt (2016) refers to the Cask Data Application Platform (CDAP). CDAP is an open source application platform (GitHub, under Apache License 2.0) that can be used to develop applications within the Apache Hadoop ecosystem, an open-source framework which can be used for distributed processing of large datasets. The user stories are extracted from a document that includes requirements regarding dataset management for Cask 4.0, which includes the scenarios, user stories and a design for the implementation of these user stories. The raw data is available in the following environment.

g18-neurohub.txt (2012) is concerned with the NeuroHub platform, a neuroscience data management, analysis and collaboration platform for researchers in neuroscience to collect, store, and share data with colleagues or with the research community. The user stories were collected at a time NeuroHub was still a research project sponsored by the UK Joint Information Systems Committee (JISC). For information about the research project from which the requirements were collected, see the following record.

g22-rdadmp.txt (2018) is a collection of user stories from the Research Data Alliance's working group on DMP Common Standards. Their GitHub repository contains a collection of user stories that were created by asking the community to suggest functionality that should part of a website that manages data management plans. Each user story is stored as an issue on the GitHub's page.

g23-archivesspace.txt (2012-2013) refers to ArchivesSpace: an open source, web application for managing archives information. The application is designed to support core functions in archives administration such as accessioning; description and arrangement of processed materials including analog, hybrid, and
born digital content; management of authorities and rights; and reference service. The application supports collection management through collection management records, tracking of events, and a growing number of administrative reports. ArchivesSpace is open source and its

Facebook

Twitter

Click to copy link

Link copied

Cite

U.S. EPA Office of Research and Development (ORD) (2020). Dataset for Exploring case-control samples with non-targeted analysis [Dataset]. https://catalog.data.gov/dataset/dataset-for-exploring-case-control-samples-with-non-targeted-analysis

Dataset for Exploring case-control samples with non-targeted analysis

Explore at:

Dataset updated

Nov 12, 2020

Dataset provided by

United States Environmental Protection Agencyhttp://www.epa.gov/

Description

These data contain the results of GC-MS, LC-MS and immunochemistry analyses of mask sample extracts. The data include tentatively identified compounds through library searches and compound abundance. This dataset is not publicly accessible because: EPA cannot release personally identifiable information regarding living individuals, according to the Privacy Act and the Freedom of Information Act (FOIA). This dataset contains information about human research subjects. Because there is potential to identify individual participants and disclose personal information, either alone or in combination with other datasets, individual level data are not appropriate to post for public access. Restricted access may be granted to authorized persons by contacting the party listed. It can be accessed through the following means: The data can not be accessed. Format: The dataset contains the identification of compounds found in the mask samples as well as the abundance of those compounds for individuals who participated in the trial. This dataset is associated with the following publication: Pleil, J., M. Wallace, J. McCord, M. Madden, J. Sobus, and G. Ferguson. How do cancer-sniffing dogs sort biological samples? Exploring case-control samples with non-targeted LC-Orbitrap, GC-MS, and immunochemistry methods. Journal of Breath Research. Institute of Physics Publishing, Bristol, UK, 14(1): 016006, (2019).

Clear search

Close search

Google apps

Main menu

Dataset for Exploring case-control samples with non-targeted analysis

CASE_Dataset-full

Dataset of books called Clinical cases in critical care

COVID-19 Case Demographics by County and District - Dataset - The Indiana...

Case Mix Index

Earnings Case Management System

Federal Court Cases: Integrated Data Base, 2014

Caselaw_Access_Project_FAISS_index

MultiCaRe_Dataset

Data from: COVID-19 Case Surveillance Public Use Data with Geography

Data are Considered Provisional

Data Limitations

Data Quality Assurance Procedures

Data Suppression

Additional COVID-19 Data

Appeal Cases heard at the Supreme Court of Nigeria Dataset

E-Courts dataset (2010-2018)

casehold

Dataset of books by Alison A. Case

Dataset of books called Multiple case study analysis

United States Supreme Court Judicial Database Terms Series

Dataset of books called Case counterfeit

Data from: RadCases

Cline Center Coup d’État Project Dataset

Requirements data sets (user stories)

Overview of the datasets [data and links added in December 2024]

Public administration and transparency

(Research) data and meta-data management

Dataset for Exploring case-control samples with non-targeted analysis