100+ datasets found

Department of Licensing Data Sharing Contract Audits History
catalog.data.gov
data.wa.gov
+1more
Updated Jan 24, 2025
Cite
data.wa.gov (2025). Department of Licensing Data Sharing Contract Audits History [Dataset]. https://catalog.data.gov/dataset/department-of-licensing-data-sharing-contract-audits-history
Explore at:
Dataset updated
Jan 24, 2025
Dataset provided by
data.wa.gov
Description
The Department of Licensing (DOL) shares data under the strict terms of a data sharing agreement. People and organizations agree to undergo regular data security and permissible use audits. This dataset is a record of the audits that DOL conducts each year.
COVID-19 Case Surveillance Public Use Data
healthdata.gov
opendatalab.com
+5more
application/rdfxml +5
Updated Feb 25, 2021
+ more versions
Cite
data.cdc.gov (2021). COVID-19 Case Surveillance Public Use Data [Dataset]. https://healthdata.gov/w/knt4-7efa/default?cur=xbTVFQpGL_I
Explore at:
Available download formats: csv, json, application/rssxml, tsv, application/rdfxml, xml
Dataset updated
Feb 25, 2021
Dataset provided by
data.cdc.gov
Description
Note: Reporting of new COVID-19 Case Surveillance data will be discontinued July 1, 2024, to align with the process of removing SARS-CoV-2 infections (COVID-19 cases) from the list of nationally notifiable diseases. Although these data will continue to be publicly available, the dataset will no longer be updated.

Authorizations to collect certain public health data expired at the end of the U.S. public health emergency declaration on May 11, 2023. The following jurisdictions discontinued COVID-19 case notifications to CDC: Iowa (11/8/21), Kansas (5/12/23), Kentucky (1/1/24), Louisiana (10/31/23), New Hampshire (5/23/23), and Oklahoma (5/2/23). Please note that these jurisdictions will not routinely send new case data after the dates indicated. As of 7/13/23, case notifications from Oregon will only include pediatric cases resulting in death.

This case surveillance public use dataset has 12 elements for all COVID-19 cases shared with CDC and includes demographics, any exposure history, disease severity indicators and outcomes, presence of any underlying medical conditions and risk behaviors, and no geographic data.

CDC has three COVID-19 case surveillance datasets:
COVID-19 Case Surveillance Public Use Data with Geography: Public use, patient-level dataset with clinical data (including symptoms), demographics, and county and state of residence. (19 data elements)
COVID-19 Case Surveillance Public Use Data: Public use, patient-level dataset with clinical and symptom data and demographics, with no geographic data. (12 data elements)
COVID-19 Case Surveillance Restricted Access Detailed Data: Restricted access, patient-level dataset with clinical and symptom data, demographics, and state and county of residence. Access requires a registration process and a data use agreement. (33 data elements)
The following apply to all three datasets:
Data elements can be found on the COVID-19 case report form located at www.cdc.gov/coronavirus/2019-ncov/downloads/pui-form.pdf.
Data are considered provisional by CDC and are subject to change until the data are reconciled and verified with the state and territorial data providers.
Some data cells are suppressed to protect individual privacy.
The datasets will include all cases with the earliest date available in each record (date received by CDC or date related to illness/specimen collection) at least 14 days prior to the creation of the current datasets. This 14-day lag allows case reporting to be stabilized and ensures that time-dependent outcome data are accurately captured.
Datasets are updated monthly.
Datasets are created using CDC’s Policy on Public Health Research and Nonresearch Data Management and Access and include protections designed to protect individual privacy.
For more information about data collection and reporting, please see https://www.cdc.gov/coronavirus/2019-ncov/covid-data/about-us-cases-deaths.html.
For more information about the COVID-19 case surveillance data, please see https://www.cdc.gov/coronavirus/2019-ncov/covid-data/faq-surveillance.html
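The 14-day lag rule listed above can be illustrated with a small date filter. This is only a sketch of the stated rule, not CDC's actual pipeline: the field names `received_date` and `illness_date`, the function names, and the sample records are all hypothetical.

```python
from datetime import date, timedelta

LAG_DAYS = 14  # lag stated in the dataset description


def earliest_date(record):
    """Return the earliest non-missing date in a record (field names are assumed)."""
    candidates = [record.get("received_date"), record.get("illness_date")]
    present = [d for d in candidates if d is not None]
    return min(present) if present else None


def eligible_records(records, creation_date):
    """Keep records whose earliest date is at least LAG_DAYS before creation_date."""
    cutoff = creation_date - timedelta(days=LAG_DAYS)
    kept = []
    for record in records:
        first = earliest_date(record)
        if first is not None and first <= cutoff:
            kept.append(record)
    return kept


# Made-up records for illustration only.
records = [
    {"id": 1, "received_date": date(2021, 2, 1), "illness_date": date(2021, 1, 28)},
    {"id": 2, "received_date": date(2021, 2, 20), "illness_date": None},
    {"id": 3, "received_date": date(2021, 2, 12), "illness_date": date(2021, 2, 11)},
]
kept_ids = [r["id"] for r in eligible_records(records, date(2021, 2, 25))]
print(kept_ids)  # [1, 3]
```

With a creation date of Feb 25, 2021, the cutoff is Feb 11, 2021, so record 2 (earliest date Feb 20) is held back until a later release.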

Overview

The COVID-19 case surveillance database includes individual-level data reported to U.S. states and aut
Annotated Terms of Service of 100 Online Platforms
data.mendeley.com
Updated Dec 12, 2023
+ more versions
Cite
Przemyslaw Palka (2023). Annotated Terms of Service of 100 Online Platforms [Dataset]. http://doi.org/10.17632/dtbj87j937.3
Explore at:
Unique identifier
https://doi.org/10.17632/dtbj87j937.3
Dataset updated
Dec 12, 2023
Authors
Przemyslaw Palka
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The dataset contains information about the contents of 100 Terms of Service (ToS) of online platforms. The documents were analyzed and evaluated from the point of view of the European Union consumer law. The main results have been presented in the table titled "Terms of Service Analysis and Evaluation_RESULTS." This table is accompanied by the instruction followed by the annotators, titled "Variables Definitions," allowing for the interpretation of the assigned values. In addition, we provide the raw data (analyzed ToS, in the folder "Clear ToS") and the annotated documents (in the folder "Annotated ToS," further subdivided).

SAMPLE: The sample contains 100 contracts of digital platforms operating in sixteen market sectors: Cloud storage, Communication, Dating, Finance, Food, Gaming, Health, Music, Shopping, Social, Sports, Transportation, Travel, Video, Work, and Various. The selected companies' main headquarters span four legal surroundings: the US, the EU, Poland specifically, and Other jurisdictions. The chosen platforms are both privately held and publicly listed and offer both fee-based and free services. Although the sample cannot be treated as representative of all online platforms, it nevertheless accounts for the most popular consumer services in the analyzed sectors and contains a diverse and heterogeneous set.

CONTENT: Each ToS has been assigned the following information:
1. Metadata: 1.1. the name of the service; 1.2. the URL; 1.3. the effective date; 1.4. the language of ToS; 1.5. the sector; 1.6. the number of words in ToS; 1.7–1.8. the jurisdiction of the main headquarters; 1.9. if the company is public or private; 1.10. if the service is paid or free.
2. Evaluative Variables: remedy clauses (2.1–2.5); dispute resolution clauses (2.6–2.10); unilateral alteration clauses (2.11–2.15); rights to police the behavior of users (2.16–2.17); regulatory requirements (2.18–2.20); and various (2.21–2.25).
3. Count Variables: the number of clauses seen as unclear (3.1) and the number of other documents referred to by the ToS (3.2).
4. Pull-out Text Variables: rights and obligations of the parties (4.1) and descriptions of the service (4.2).

ACKNOWLEDGEMENT: The research leading to these results has received funding from the Norwegian Financial Mechanism 2014-2021, project no. 2020/37/K/HS5/02769, titled “Private Law of Data: Concepts, Practices, Principles & Politics.”
OCP Procurement Agreements
data.detroitmi.gov
detroitdata.org
+2more
Updated Dec 12, 2019
Cite
City of Detroit (2019). OCP Procurement Agreements [Dataset]. https://data.detroitmi.gov/datasets/ocp-procurement-agreements/explore
Explore at:
Dataset updated
Dec 12, 2019
Dataset authored and provided by
City of Detroit
Description
The Procurement Agreements dataset provides details about contract agreements between the City of Detroit and suppliers who provide materials, equipment and services to the City. Initial and amended contracts and purchase orders associated with the contracts are included in the dataset. In some cases, purchase orders are generated to pay suppliers for work completed under a contract. If available, a link to the contract agreement document in PDF format is provided in the 'Contract Link' field of each record (row) in the dataset. This dataset is updated weekly with data from the Office of Contracting and Procurement (OCP).
Procurement Contracts - Datasets - Lincolnshire Open Data
lincolnshire.ckan.io
Updated Aug 16, 2017
Cite
lincolnshire.ckan.io (2017). Procurement Contracts - Datasets - Lincolnshire Open Data [Dataset]. https://lincolnshire.ckan.io/dataset/contracts
Explore at:
Dataset updated
Aug 16, 2017
Dataset provided by
CKAN: https://ckan.org/
License
Open Government Licence 3.0: http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
Area covered
Lincolnshire
Description
Existing Contracts Register for awarded contracts over £5,000. An extract of all published contract awards starting from 01 April 2017. The data names the buyer and the awarded suppliers, plus information on the value and duration of the contract itself. This is a work in progress: by improving data quality and maintaining compliance with procurement regulations, the Council is working towards a complete dataset. Other details about Contracts shown in this dataset may be available on the ProContract website (source link shown below). That website may, for example, also provide information on suppliers for Contracts that have more than one supplier. The data is updated quarterly. Data source: Procurement Lincolnshire, Lincolnshire County Council. For any enquiries about this publication, contact procontract.support@lincolnshire.gov.uk
National Inpatient Sample (NIS) - Restricted Access Files
catalog.data.gov
healthdata.gov
+2more
Updated Feb 22, 2025
Cite
Agency for Healthcare Research and Quality, Department of Health & Human Services (2025). National Inpatient Sample (NIS) - Restricted Access Files [Dataset]. https://catalog.data.gov/dataset/hcup-national-nationwide-inpatient-sample-nis-restricted-access-file
Explore at:
Dataset updated
Feb 22, 2025
Dataset provided by
Agency for Healthcare Research and Quality: http://www.ahrq.gov/
United States Department of Health and Human Services: http://www.hhs.gov/
Description
The Healthcare Cost and Utilization Project (HCUP) National Inpatient Sample (NIS) is the largest publicly available all-payer inpatient care database in the United States. The NIS is designed to produce U.S. regional and national estimates of inpatient utilization, access, cost, quality, and outcomes. Unweighted, it contains data from more than 7 million hospital stays each year. Weighted, it estimates more than 35 million hospitalizations nationally. Developed through a Federal-State-Industry partnership sponsored by the Agency for Healthcare Research and Quality (AHRQ), HCUP data inform decision making at the national, State, and community levels. Starting with the 2012 data year, the NIS is a sample of discharges from all hospitals participating in HCUP, covering more than 97 percent of the U.S. population. For prior years, the NIS was a sample of hospitals. The NIS allows for weighted national estimates to identify, track, and analyze national trends in health care utilization, access, charges, quality, and outcomes. The NIS's large sample size enables analyses of rare conditions, such as congenital anomalies; uncommon treatments, such as organ transplantation; and special patient populations, such as the uninsured. NIS data are available since 1988, allowing analysis of trends over time. The NIS inpatient data include clinical and resource use information typically available from discharge abstracts with safeguards to protect the privacy of individual patients, physicians, and hospitals (as required by data sources). Data elements include but are not limited to: diagnoses, procedures, discharge status, patient demographics (e.g., sex, age), total charges, length of stay, and expected payment source, including but not limited to Medicare, Medicaid, private insurance, self-pay, or those billed as ‘no charge’. The NIS excludes data elements that could directly or indirectly identify individuals. 
Restricted access data files are available with a data use agreement and brief online security training.
Contracts
catalog.data.gov
data.austintexas.gov
+1more
Updated Jul 25, 2025
Cite
data.austintexas.gov (2025). Contracts [Dataset]. https://catalog.data.gov/dataset/contracts-d35f9
Explore at:
Dataset updated
Jul 25, 2025
Dataset provided by
data.austintexas.gov
Description
Information about City's authorized spending limit, contract lifetime (called inception-to-date) ordering and spending. Contracts are visible only while active. For the purposes of this data set, a contract is a long-term (multi-year) contract for goods and services, and contracts for construction activity. Within the City, these are referred to as Master Agreements and Purchase Contracts.
Contract Understanding Atticus Dataset (CUAD)
kaggle.com
Updated Mar 12, 2021
Cite
The Atticus Project (2021). Contract Understanding Atticus Dataset (CUAD) [Dataset]. http://doi.org/10.34740/kaggle/dsv/2015428
Explore at:
Unique identifier
https://doi.org/10.34740/kaggle/dsv/2015428
Dataset updated
Mar 12, 2021
Dataset provided by
Kaggle: http://kaggle.com/
Authors
The Atticus Project
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Please download the full version of the dataset from Zenodo, here.

Contract Understanding Atticus Dataset (CUAD) v1 is a corpus of more than 13,000 labels in 510 commercial legal contracts that have been manually labeled by The Atticus Project to identify 41 categories of important clauses that lawyers look for when reviewing contracts.

We tested CUAD v1 against ten pretrained AI models and published the results on arXiv here.

Code for replicating the results, together with the model trained on CUAD, is published on Github here.
Procurement Contracts
data.tempe.gov
data-academy.tempe.gov
+8more
Updated Jun 12, 2020
Cite
City of Tempe (2020). Procurement Contracts [Dataset]. https://data.tempe.gov/documents/bb33874274f44b6384598b633b017a4e
Explore at:
Dataset updated
Jun 12, 2020
Dataset authored and provided by
City of Tempe
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The purpose of this site is to allow public access to the City's procurement contracts. These contracts represent a diverse list of resources required by Tempe to support the community's needs including equipment, vehicles, products, materials and services. Click on Open or double arrow to enter full screen mode.

In line with the City's strong commitment to transparency and Smart City initiatives, we are pleased to make this information available in this accessible format.

A user guide may be viewed by clicking here.

Additional Information
Source: The original data source originates from the City of Tempe's Purchasing Contracts document storage and the purchasing contract financials application.
Contact (author):
Contact E-Mail (author):
Contact (maintainer): Michael Greene, Procurement Administration
Contact E-Mail (maintainer): michael_greene@tempe.gov
Data Source Type: Data Source Types are: Sql Server, Oracle and pdf file storage.
Preparation Method: Data is extracted using a programmatic automation process that pulls data from the defined sources at regular time intervals.
Publish Frequency: Once per week.
Publish Method: Automatic via developed automation process.
Data Dictionary
Archived, Pilot of the Open Contracting Data Standard (250 contract records)...
open.canada.ca
html, json, pdf
Updated Jul 30, 2025
+ more versions
Cite
Public Services and Procurement Canada (2025). Archived, Pilot of the Open Contracting Data Standard (250 contract records) [Dataset]. https://open.canada.ca/data/en/dataset/60f22648-c173-446f-aa8a-4929d75d63e3
Explore at:
Available download formats: html, json, pdf
Dataset updated
Jul 30, 2025
Dataset provided by
Public Services and Procurement Canada: http://www.pwgsc.gc.ca/
License
Open Government Licence - Canada 2.0: https://open.canada.ca/en/open-government-licence-canada
License information was derived automatically
Time period covered
Jan 1, 2014 - Dec 30, 2018
Description
This dataset includes the results of the pilot activity that Public Services and Procurement Canada undertook as part of Canada’s 2018-2020 National Action Plan on Open Government. The purpose is to demonstrate the usage and implementation of the Open Contracting Data Standard (OCDS). OCDS is an international data standard that is used to standardize how contracting data and documents can be published in an accessible, structured, and repeatable way. OCDS uses a standard language for contracting data that can be understood by all users.

What procurement data is included in the OCDS Pilot?

Procurement data included as part of this pilot is a cross-section of at least 250 contract records for a variety of contracts, including major projects.

Methodology and lessons learned

The Lessons Learned Report documents the methodology used and the lessons learned during the process of compiling the pilot data.
MAUD v1
zenodo.org
data.niaid.nih.gov
zip
Updated Jul 15, 2024
Cite
The Atticus Project; The Atticus Project (2024). MAUD v1 [Dataset]. http://doi.org/10.5281/zenodo.7500064
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.5281/zenodo.7500064
Dataset updated
Jul 15, 2024
Dataset provided by
Zenodo: http://zenodo.org/
Authors
The Atticus Project; The Atticus Project
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Merger Agreement Understanding Dataset (MAUD) v1 is a corpus of 47,000+ labels in 152 merger agreements that have been manually labeled under the supervision of experienced lawyers to identify 92 questions in each agreement used by the 2021 American Bar Association (ABA) Public Target Deal Points Study.

MAUD is curated and maintained by The Atticus Project, Inc. to support NLP research and development in legal contract review.

ReadMe and Datasheet are published here. Code for replicating the results, together with the model trained on CUAD, is published on Github here.
A set of generated Instagram Data Download Packages (DDPs) to investigate...
data.niaid.nih.gov
zenodo.org
Updated Jan 28, 2021
Cite
Laura Boeschoten (2021). A set of generated Instagram Data Download Packages (DDPs) to investigate their structure and content [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4472605
Explore at:
Dataset updated
Jan 28, 2021
Dataset provided by
Laura Boeschoten
Daniel Oberski
Ruben van den Goorbergh
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Instagram data-download example dataset

In this repository you can find a data-set consisting of 11 personal Instagram archives, or Data-Download Packages (DDPs).

How the data was generated

These Instagram accounts were all new and were created by a group of researchers who wanted to work out in detail the structure, and the variety in structure, of these Instagram DDPs. The participants used the Instagram accounts extensively for approximately a week. The participants also communicated intensively with each other so that the data can be used as an example of a network.

The data was primarily generated to evaluate the performance of de-identification software. Therefore, the text in the DDPs contains many randomly chosen (Dutch) first names, phone numbers, e-mail addresses and URLs. In addition, the images in the DDPs contain many faces and text as well. The DDPs contain faces and text (usernames) of third parties. However, only content of so-called `professional accounts' is shared, such as accounts of famous individuals or institutions who self-consciously and actively seek publicity, and these sources are easily publicly available. Furthermore, the DDPs do not contain sensitive personal data of these individuals.

Obtaining your Instagram DDP

After using the Instagram accounts intensively for approximately a week, the participants requested their personal Instagram DDPs by using the following steps. You can follow these steps yourself if you are interested in your personal Instagram DDP.

Go to www.instagram.com and log in

Click on your profile picture, go to Settings and Privacy and Security

Scroll to Data download and click Request download

Enter your email address and click Next

Enter your password and click Request download

Instagram then delivered the data in a compressed zip folder with the format username_YYYYMMDD.zip (i.e., Instagram handle and date of download) to the participant, and the participants shared these DDPs with us.
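The archive naming scheme described above (username_YYYYMMDD.zip) can be split back into its two parts with a regular expression. The following is a minimal sketch, assuming handles may themselves contain underscores; the function name `parse_ddp_filename` and the example filename are hypothetical and not part of the dataset's tooling.

```python
import re
from datetime import datetime

# Pattern for "username_YYYYMMDD.zip"; the greedy username group keeps
# any underscores inside the handle, since the date is the final field.
DDP_NAME = re.compile(r"^(?P<username>.+)_(?P<date>\d{8})\.zip$")


def parse_ddp_filename(filename):
    """Split a DDP archive name into (username, download date), or return None."""
    match = DDP_NAME.match(filename)
    if match is None:
        return None
    try:
        downloaded = datetime.strptime(match.group("date"), "%Y%m%d").date()
    except ValueError:  # eight digits that are not a real calendar date
        return None
    return match.group("username"), downloaded


parsed = parse_ddp_filename("example_user_20210128.zip")
print(parsed)
```

Names that do not follow the scheme, or whose eight digits are not a valid date, yield `None` rather than raising, which makes the helper easy to use when scanning a folder of mixed files.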

Data cleaning

To comply with the Instagram user agreement, participants shared their full name, phone number and e-mail address. In addition, Instagram logged the IP addresses the participant used during their active period on Instagram. After collecting the DDPs, we manually replaced such information with random replacements such that the DDPs shared here do not contain any personal data of the participants.

How this data-set can be used

This data-set was generated with the intention to evaluate the performance of the de-identification software. We invite other researchers to use this data-set for example to investigate what type of data can be found in Instagram DDPs or to investigate the structure of Instagram DDPs. The packages can also be used for example data-analyses, although no substantive research questions can be answered using this data as the data does not reflect how research subjects behave `in the wild'.

Authors

The data collection is executed by Laura Boeschoten, Ruben van den Goorbergh and Daniel Oberski of Utrecht University. For questions, please contact l.boeschoten@uu.nl.

Acknowledgments

The researchers would like to thank everyone who participated in this data-generation project.
Meta data and supporting documentation
res1catalogd-o-tdatad-o-tgov.vcapture.xyz
Updated Nov 12, 2020
Cite
U.S. EPA Office of Research and Development (ORD) (2020). Meta data and supporting documentation [Dataset]. https://res1catalogd-o-tdatad-o-tgov.vcapture.xyz/dataset/meta-data-and-supporting-documentation
Explore at:
Dataset updated
Nov 12, 2020
Dataset provided by
United States Environmental Protection Agency: http://www.epa.gov/
Description
We include a description of the data sets in the meta-data as well as sample code and results from a simulated data set.

This dataset is not publicly accessible because: EPA cannot release personally identifiable information regarding living individuals, according to the Privacy Act and the Freedom of Information Act (FOIA). This dataset contains information about human research subjects. Because there is potential to identify individual participants and disclose personal information, either alone or in combination with other datasets, individual level data are not appropriate to post for public access. Restricted access may be granted to authorized persons by contacting the party listed. It can be accessed through the following means: The R code is available online here: https://res1githubd-o-tcom.vcapture.xyz/warrenjl/SpGPCW.

Format:

Abstract

The data used in the application section of the manuscript consist of geocoded birth records from the North Carolina State Center for Health Statistics, 2005-2008. In the simulation study section of the manuscript, we simulate synthetic data that closely match some of the key features of the birth certificate data while maintaining confidentiality of any actual pregnant women.

Availability

Due to the highly sensitive and identifying information contained in the birth certificate data (including latitude/longitude and address of residence at delivery), we are unable to make the data from the application section publicly available. However, we will make one of the simulated datasets available for any reader interested in applying the method to realistic simulated birth records data. This will also allow the user to become familiar with the required inputs of the model, how the data should be structured, and what type of output is obtained. While we cannot provide the application data here, access to the North Carolina birth records can be requested through the North Carolina State Center for Health Statistics and requires an appropriate data use agreement.

Description

Permissions: These are simulated data without any identifying information or informative birth-level covariates. We also standardize the pollution exposures on each week by subtracting off the median exposure amount on a given week and dividing by the interquartile range (IQR) (as in the actual application to the true NC birth records data). The dataset that we provide includes weekly average pregnancy exposures that have already been standardized in this way while the medians and IQRs are not given. This further protects identifiability of the spatial locations used in the analysis.

File format: R workspace file.

Metadata (including data dictionary)
• y: Vector of binary responses (1: preterm birth, 0: control)
• x: Matrix of covariates; one row for each simulated individual
• z: Matrix of standardized pollution exposures
• n: Number of simulated individuals
• m: Number of exposure time periods (e.g., weeks of pregnancy)
• p: Number of columns in the covariate design matrix
• alpha_true: Vector of “true” critical window locations/magnitudes (i.e., the ground truth that we want to estimate)

This dataset is associated with the following publication: Warren, J., W. Kong, T. Luben, and H. Chang. Critical Window Variable Selection: Estimating the Impact of Air Pollution on Very Preterm Birth. Biostatistics. Oxford University Press, OXFORD, UK, 1-30, (2019).
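The per-week standardization mentioned in the description (subtract the weekly median, divide by the weekly IQR) might look roughly like the sketch below. It is written in plain Python rather than the authors' R code, and several things are assumptions: the matrix orientation (rows are individuals, columns are weeks), the quartile convention (`method="inclusive"`), the function names, and the made-up exposure values.

```python
from statistics import median, quantiles


def standardize_week(exposures):
    """Subtract the week's median and divide by its interquartile range."""
    # Assumption: quartiles use the "inclusive" convention; the source
    # does not say which quantile definition the authors used.
    q1, _, q3 = quantiles(exposures, n=4, method="inclusive")
    iqr = q3 - q1
    if iqr == 0:
        raise ValueError("IQR is zero; exposures cannot be scaled")
    center = median(exposures)
    return [(value - center) / iqr for value in exposures]


def standardize_exposures(z):
    """Standardize an n-by-m exposure matrix column by column (one column per week)."""
    columns = [standardize_week(list(col)) for col in zip(*z)]
    return [list(row) for row in zip(*columns)]


# Made-up exposures: 5 individuals (rows) by 2 weeks (columns).
z_raw = [
    [1.0, 10.0],
    [2.0, 20.0],
    [3.0, 30.0],
    [4.0, 40.0],
    [5.0, 50.0],
]
z_std = standardize_exposures(z_raw)
print(z_std[0], z_std[2], z_std[4])  # [-1.0, -1.0] [0.0, 0.0] [1.0, 1.0]
```

Because only the standardized values are released, the original medians and IQRs cannot be recovered from `z_std`, which is the identifiability protection the description refers to.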
Contract Administration Management System (CAMS)
catalog.data.gov
data.va.gov
+2more
Updated Aug 2, 2025
+ more versions
Cite
Department of Veterans Affairs (2025). Contract Administration Management System (CAMS) [Dataset]. https://catalog.data.gov/dataset/contract-administration-management-system-cams
Explore at:
Dataset updated
Aug 2, 2025
Dataset provided by
United States Department of Veterans Affairs: http://va.gov/
Description
The Contract Administration and Management System (CAMS) is a data management system designed specifically for the Veterans Health Administration Office of Facilities Management (FM) for the management of contract and funding data. It provides a means of sorting and tracking data related to major Architect-Engineer and construction contracts such as contract type, project locations, project status, and contract funding.
Atticus Open Contract Dataset (AOK) (beta)
live.european-language-grid.eu
data.niaid.nih.gov
+1more
csv
Updated Jun 22, 2023
Cite
(2023). Atticus Open Contract Dataset (AOK) (beta) [Dataset]. https://live.european-language-grid.eu/catalogue/corpus/7648
Explore at:
Available download formats: csv
Dataset updated
Jun 22, 2023
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Atticus Open Contract Dataset (AOK) (beta) is a corpus of 5,000+ labels in 200 commercial legal contracts that have been manually labeled by legal experts to identify 40 types of clauses that are important during contract review in connection with corporate transactions, such as mergers and acquisitions, IPO, and corporate financing. The AOK Dataset is curated and maintained by The Atticus Project, Inc., a non-profit organization, to support NLP research and development in legal contract review. If you download this dataset, we'd love to know more about you and your project! Please fill out this short form: https://forms.gle/h47GUENTTbBqH39m7
Check out our website at atticusprojectai.org.
Update: The expanded 1.0 version of the dataset is available here https://zenodo.org/record/4595826
HCUP State Emergency Department Databases (SEDD) - Restricted Access File
catalog.data.gov
healthdata.gov
+4more
Updated Jul 29, 2025
+ more versions
Cite
Agency for Healthcare Research and Quality, Department of Health & Human Services (2025). HCUP State Emergency Department Databases (SEDD) - Restricted Access File [Dataset]. https://catalog.data.gov/dataset/hcup-state-emergency-department-databases-sedd-restricted-access-file
Explore at:
Dataset updated
Jul 29, 2025
Dataset provided by
Agency for Healthcare Research and Quality: http://www.ahrq.gov/
United States Department of Health and Human Services: http://www.hhs.gov/
Description
The Healthcare Cost and Utilization Project (HCUP) State Emergency Department Databases (SEDD) contain the universe of emergency department visits in participating States. The data are translated into a uniform format to facilitate multi-State comparisons and analyses. The SEDD consist of data from hospital-based emergency department visits that do not result in an admission. The SEDD include all patients, regardless of the expected payer including but not limited to Medicare, Medicaid, private insurance, self-pay, or those billed as ‘no charge’. Developed through a Federal-State-Industry partnership sponsored by the Agency for Healthcare Research and Quality (AHRQ), HCUP data inform decision making at the national, State, and community levels. The SEDD contain clinical and resource use information included in a typical discharge abstract, with safeguards to protect the privacy of individual patients, physicians, and facilities (as required by data sources). Data elements include but are not limited to: diagnoses, procedures, admission and discharge status, patient demographics (e.g., sex, age, race), total charges, length of stay, and expected payment source, including but not limited to Medicare, Medicaid, private insurance, self-pay, or those billed as ‘no charge’. In addition to the core set of uniform data elements common to all SEDD, some include State-specific data elements. The SEDD exclude data elements that could directly or indirectly identify individuals. For some States, hospital and county identifiers are included that permit linkage to the American Hospital Association Annual Survey File and the Bureau of Health Professions' Area Resource File except in States that do not allow the release of hospital identifiers. Restricted access data files are available with a data use agreement and brief online security training.
Simulation Data Set
catalog.data.gov
s.cnmilf.com
Updated Nov 12, 2020
Cite
U.S. EPA Office of Research and Development (ORD) (2020). Simulation Data Set [Dataset]. https://catalog.data.gov/dataset/simulation-data-set
Explore at:
Dataset updated
Nov 12, 2020
Dataset provided by
United States Environmental Protection Agency (http://www.epa.gov/)
Description
These are simulated data without any identifying information or informative birth-level covariates. We also standardize the pollution exposures in each week by subtracting the median exposure for that week and dividing by the interquartile range (IQR) (as in the actual application to the true NC birth records data). The dataset that we provide includes weekly average pregnancy exposures that have already been standardized in this way, while the medians and IQRs are not given. This further protects against identification of the spatial locations used in the analysis.

This dataset is not publicly accessible because: EPA cannot release personally identifiable information regarding living individuals, according to the Privacy Act and the Freedom of Information Act (FOIA). This dataset contains information about human research subjects. Because there is potential to identify individual participants and disclose personal information, either alone or in combination with other datasets, individual level data are not appropriate to post for public access. Restricted access may be granted to authorized persons by contacting the party listed.

It can be accessed through the following means: File format: R workspace file; “Simulated_Dataset.RData”.

Metadata (including data dictionary)
• y: Vector of binary responses (1: adverse outcome, 0: control)
• x: Matrix of covariates; one row for each simulated individual
• z: Matrix of standardized pollution exposures
• n: Number of simulated individuals
• m: Number of exposure time periods (e.g., weeks of pregnancy)
• p: Number of columns in the covariate design matrix
• alpha_true: Vector of “true” critical window locations/magnitudes (i.e., the ground truth that we want to estimate)

Code

Abstract
We provide R statistical software code (“CWVS_LMC.txt”) to fit the linear model of coregionalization (LMC) version of the Critical Window Variable Selection (CWVS) method developed in the manuscript. We also provide R code (“Results_Summary.txt”) to summarize/plot the estimated critical windows and posterior marginal inclusion probabilities.

Description
“CWVS_LMC.txt”: This code is delivered to the user in the form of a .txt file that contains R statistical software code. Once the “Simulated_Dataset.RData” workspace has been loaded into R, the text in the file can be used to identify/estimate critical windows of susceptibility and posterior marginal inclusion probabilities.
“Results_Summary.txt”: This code is also delivered to the user in the form of a .txt file that contains R statistical software code. Once the “CWVS_LMC.txt” code has been applied to the simulated dataset and the program has completed, this code can be used to summarize and plot the identified/estimated critical windows and posterior marginal inclusion probabilities (similar to the plots shown in the manuscript).

Optional Information (complete as necessary)
Required R packages:
For running “CWVS_LMC.txt”:
• msm: Sampling from the truncated normal distribution
• mnormt: Sampling from the multivariate normal distribution
• BayesLogit: Sampling from the Polya-Gamma distribution
For running “Results_Summary.txt”:
• plotrix: Plotting the posterior means and credible intervals

Instructions for Use

Reproducibility (Mandatory)
What can be reproduced: The data and code can be used to identify/estimate critical windows from one of the actual simulated datasets generated under setting E4 from the presented simulation study.
How to use the information:
• Load the “Simulated_Dataset.RData” workspace
• Run the code contained in “CWVS_LMC.txt”
• Once the “CWVS_LMC.txt” code is complete, run “Results_Summary.txt”.

Format: Below is the replication procedure for the attached data set for the portion of the analyses using a simulated data set:

Data
The data used in the application section of the manuscript consist of geocoded birth records from the North Carolina State Center for Health Statistics, 2005-2008. In the simulation study section of the manuscript, we simulate synthetic data that closely match some of the key features of the birth certificate data while maintaining confidentiality of any actual pregnant women.

Availability
Due to the highly sensitive and identifying information contained in the birth certificate data (including latitude/longitude and address of residence at delivery), we are unable to make the data from the application section publicly available. However, we will make one of the simulated datasets available for any reader interested in applying the method to realistic simulated birth records data. This will also allow the user to become familiar with the required inputs of the model, how the data should be structured, and what type of output is obtained. While we cannot provide the application data here, access to the North Carolina birth records can be requested through the North Carolina State Center for Health Statistics, and requires an appropriate data use agreement.

Description
Permissions: These are simulated data without any identifying information or informative birth-level covariates. We also standardize the pollution exposures in each week by subtracting the median exposure for that week and dividing by the interquartile range (IQR) (as in the actual application to the true NC birth records data). The dataset that we provide includes weekly average pregnancy exposures that have already been standardized in this way, while the medians and IQRs are not given. This further protects against identification of the spatial locations used in the analysis.

This dataset is associated with the following publication: Warren, J., W. Kong, T. Luben, and H. Chang. Critical Window Variable Selection: Estimating the Impact of Air Pollution on Very Preterm Birth. Biostatistics. Oxford University Press, OXFORD, UK, 1-30, (2019).
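The weekly standardization described for this dataset (subtract each week's median exposure, then divide by that week's IQR) can be illustrated with a minimal Python sketch. This is not the authors' R code: the function names, the toy exposure values, and the quartile convention (`method="inclusive"`) are assumptions made for illustration, and the original analysis may compute quantiles differently.

```python
from statistics import median, quantiles


def standardize_week(values):
    """Return (v - median) / IQR for one week's exposures."""
    med = median(values)
    # Quartile convention is an assumption; the source does not specify one.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    if iqr == 0:
        raise ValueError("IQR is zero; cannot standardize this week")
    return [(v - med) / iqr for v in values]


def standardize_exposures(z):
    """Standardize each column (week) of a row-per-individual matrix."""
    columns = [standardize_week(list(col)) for col in zip(*z)]
    # Transpose back to one row per individual.
    return [list(row) for row in zip(*columns)]


# Hypothetical raw exposures: 5 individuals (rows) by 2 weeks (columns).
raw = [
    [8.0, 12.0],
    [10.0, 15.0],
    [12.0, 18.0],
    [14.0, 21.0],
    [16.0, 24.0],
]
z_std = standardize_exposures(raw)
```

After this step each week's column has median 0 and IQR 1, which is why the original medians and IQRs cannot be recovered from the standardized matrix alone.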
HCUP Nationwide Emergency Department Database (NEDS) Restricted Access File
catalog.data.gov
data.virginia.gov
+3more
Updated Jul 29, 2025
Cite
Agency for Healthcare Research and Quality, Department of Health & Human Services (2025). HCUP Nationwide Emergency Department Database (NEDS) Restricted Access File [Dataset]. https://catalog.data.gov/dataset/hcup-nationwide-emergency-department-database-neds-restricted-access-file
Explore at:
Dataset updated
Jul 29, 2025
Dataset provided by
Agency for Healthcare Research and Quality (http://www.ahrq.gov/)
United States Department of Health and Human Services (http://www.hhs.gov/)
Description
The Healthcare Cost and Utilization Project (HCUP) Nationwide Emergency Department Sample (NEDS) is the largest all-payer emergency department (ED) database in the United States, yielding national estimates of hospital-owned ED visits. Unweighted, it contains data from over 30 million ED visits each year. Weighted, it estimates roughly 145 million ED visits nationally. Developed through a Federal-State-Industry partnership sponsored by the Agency for Healthcare Research and Quality, HCUP data inform decision making at the national, State, and community levels. Sampled from the HCUP State Inpatient Databases (SID) and State Emergency Department Databases (SEDD), the HCUP NEDS can be used to create national and regional estimates of ED care. The SID contain information on patients initially seen in the ED and subsequently admitted to the same hospital. The SEDD capture information on ED visits that do not result in an admission (i.e., treat-and-release visits and transfers to another hospital). The NEDS contains information about geographic characteristics, hospital characteristics, patient characteristics, and the nature of visits (e.g., common reasons for ED visits, including injuries). The NEDS contains clinical and resource use information included in a typical discharge abstract, with safeguards to protect the privacy of individual patients, physicians, and hospitals (as required by data sources). It includes ED charge information for over 85% of patients, regardless of expected payer, including but not limited to Medicare, Medicaid, private insurance, self-pay, or those billed as ‘no charge’.
The NEDS excludes data elements that could directly or indirectly identify individuals, hospitals, or states. Restricted access data files are available with a data use agreement and brief online security training.
Data from: Weather conditions and Legionellosis: A nationwide case-crossover...
catalog.data.gov
Updated Mar 29, 2025
Cite
U.S. EPA Office of Research and Development (ORD) (2025). Weather conditions and Legionellosis: A nationwide case-crossover study among Medicare recipients [Dataset]. https://catalog.data.gov/dataset/weather-conditions-and-legionellosis-a-nationwide-case-crossover-study-among-medicare-reci
Explore at:
Dataset updated
Mar 29, 2025
Dataset provided by
United States Environmental Protection Agency (http://www.epa.gov/)
Description
Data consist of CMS Medicare data files, which are restricted access and cannot be released publicly. This dataset is not publicly accessible because: EPA cannot release personally identifiable information regarding living individuals, according to the Privacy Act and the Freedom of Information Act (FOIA). This dataset contains information about human research subjects. Because there is potential to identify individual participants and disclose personal information, either alone or in combination with other datasets, individual level data are not appropriate to post for public access. Restricted access may be granted to authorized persons by contacting the party listed. EPA cannot release CBI, or data protected by copyright, patent, or otherwise subject to trade secret restrictions. Requests for access to CBI data may be directed to the dataset owner by an authorized person by contacting the party listed. It can be accessed through the following means: CMS Medicare data are available from: https://www.cms.gov/data-research/files-for-order/data-disclosures-and-data-use-agreements-duas/limited-data-set-lds with the requirement of a signed Data Use Agreement. Weather data are available at https://prism.oregonstate.edu/. Format: The data that support the findings of this study are available from the Centers for Medicare and Medicaid Services (CMS). Restrictions apply to the availability of these data, which were provided under a Data Use Agreement specific to this study. Data are available from: https://www.cms.gov/data-research/files-for-order/data-disclosures-and-data-use-agreements-duas/limited-data-set-lds with the requirement of a signed Data Use Agreement. The data do not contain personally identifiable information but are classified as Limited Data Set files, and their distribution requires an agreement between CMS and the requester and approval by CMS. Weather data are available at https://prism.oregonstate.edu/.
Because the data do not contain identifiable private information and were not obtained through interaction or intervention with individuals, the Institutional Review Board for the University of North Carolina and the US Environmental Protection Agency Human Research Protocol Officer determined that use of this data does not constitute human subjects research. This dataset is associated with the following publication: Wade, T., and C. Herbert. Weather conditions and legionellosis: a nationwide case-crossover study among Medicare recipients. EPIDEMIOLOGY AND INFECTION. Cambridge University Press, Cambridge, UK, 152: E125, (2024).
COVID-19 Case Surveillance Restricted Access Detailed Data
data.amerigeoss.org
data.virginia.gov
+3more
Updated May 10, 2021
Cite
United States (2021). COVID-19 Case Surveillance Restricted Access Detailed Data [Dataset]. https://data.amerigeoss.org/dataset/covid-19-case-surveillance-restricted-access-detailed-data
Explore at:
Dataset updated
May 10, 2021
Dataset provided by
United States
License
https://www.usa.gov/government-works
Description
This case surveillance publicly available dataset has 32 elements for all COVID-19 cases shared with CDC and includes demographics, geography (county and state of residence), any exposure history, disease severity indicators and outcomes, and presence of any underlying medical conditions and risk behaviors. This dataset requires a registration process and a data use agreement.

CDC has three COVID-19 case surveillance datasets:

COVID-19 Case Surveillance Public Use Data with Geography: Public use, patient-level dataset with clinical data (including symptoms), demographics, and county and state of residence. (19 data elements)

COVID-19 Case Surveillance Public Use Data: Public use, patient-level dataset with clinical and symptom data and demographics, with no geographic data. (12 data elements)

COVID-19 Case Surveillance Restricted Access Data: Restricted access, patient-level dataset with clinical data (including symptoms), demographics, and county and state of residence. Access requires a registration process and a data use agreement. (32 data elements)

Requesting Access to the COVID-19 Case Surveillance Restricted Access Detailed Data

Please review the following documents to determine your interest in accessing the COVID-19 Case Surveillance Restricted Access Detailed Data file:

1) CDC COVID-19 Case Surveillance Restricted Access Detailed Data: Summary, Guidance, Limitations Information, and Restricted Access Data Use Agreement Information

2) Data Dictionary for the COVID-19 Case Surveillance Restricted Access Detailed Data

The next step is to complete the Registration Information and Data Use Restrictions Agreement (RIDURA). Once complete, CDC will review your agreement. After access is granted, Ask SRRG (eocevent394@cdc.gov) will email you information about how to access the data through GitHub. If you have questions about obtaining access, email eocevent394@cdc.gov.

Overview

The COVID-19 case surveillance database includes patient-level data reported by U.S. states and autonomous reporting entities, including New York City and the District of Columbia, as well as U.S. territories and affiliates. On April 5, 2020, COVID-19 was added to the Nationally Notifiable Condition List and classified as “immediately notifiable, urgent (within 24 hours)” by a Council of State and Territorial Epidemiologists (CSTE) Interim Position Statement (Interim-20-ID-01). CSTE updated the position statement on August 5, 2020, to clarify the interpretation of antigen detection tests and serologic test results within the case classification. The statement also recommended that all states and territories enact laws to make COVID-19 reportable in their jurisdiction, and that jurisdictions conducting surveillance should submit case notifications to CDC.

COVID-19 case surveillance data are collected by jurisdictions and are shared voluntarily with CDC. For more information, visit: https://wwwn.cdc.gov/nndss/conditions/coronavirus-disease-2019-c
