100+ datasets found

Contract Understanding Atticus Dataset (CUAD)
kaggle.com
Updated Mar 12, 2021
Cite
The Atticus Project (2021). Contract Understanding Atticus Dataset (CUAD) [Dataset]. http://doi.org/10.34740/kaggle/dsv/2015428
Explore at:
Croissant: Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Unique identifier
https://doi.org/10.34740/kaggle/dsv/2015428
Dataset updated
Mar 12, 2021
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
The Atticus Project
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Please download the full version of the dataset from Zenodo, here.

Contract Understanding Atticus Dataset (CUAD) v1 is a corpus of more than 13,000 labels in 510 commercial legal contracts that have been manually labeled by The Atticus Project to identify 41 categories of important clauses that lawyers look for when reviewing contracts.

We tested CUAD v1 against ten pretrained AI models and published the results on arXiv here.

Code for replicating the results, together with the model trained on CUAD, is published on GitHub here.
WITS Global Preferential Trade Agreement Database - Dataset - waterdata
wbwaterdata.org
Updated Mar 16, 2020
+ more versions
Cite
(2020). WITS Global Preferential Trade Agreement Database - Dataset - waterdata [Dataset]. https://wbwaterdata.org/dataset/wits-global-preferential-trade-agreement-database
Explore at:
Dataset updated
Mar 16, 2020
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The Global Preferential Trade Agreements Database (GPTAD) provides information on preferential trade agreements (PTAs) around the world, including agreements that have not been notified to the World Trade Organization (WTO). It is designed to help trade policy makers, scholars, and business operators better understand and navigate the world of PTAs. The GPTAD contains the original text of PTAs that have been notified to the WTO as well as agreements that have not yet been notified. The database is updated on a regular basis and currently comprises more than 330 PTAs. Agreements in the database have been indexed using a classification consistent with the WTO criteria. The GPTAD is a unique online tool that allows users to search PTAs around the world by provisions or keywords and to compare provisions across multiple agreements.
Terms of Service Dataset
paperswithcode.com
Updated Feb 21, 2024
Cite
Marco Lippi; Przemyslaw Palka; Giuseppe Contissa; Francesca Lagioia; Hans-Wolfgang Micklitz; Giovanni Sartor; Paolo Torroni (2024). Terms of Service Dataset [Dataset]. https://paperswithcode.com/dataset/terms-of-service
Explore at:
Dataset updated
Feb 21, 2024
Authors
Marco Lippi; Przemyslaw Palka; Giuseppe Contissa; Francesca Lagioia; Hans-Wolfgang Micklitz; Giovanni Sartor; Paolo Torroni
Description
The Terms of Service dataset is a legal dataset for the task of identifying whether contractual terms are potentially unfair. This is a binary classification task, where positive examples are potentially unfair contractual terms (clauses) from the terms of service in consumer contracts. Article 3 of Directive 93/13 on Unfair Terms in Consumer Contracts defines an unfair contractual term as follows. A contractual term is unfair if: (1) it has not been individually negotiated; and (2) contrary to the requirement of good faith, it causes a significant imbalance in the parties' rights and obligations, to the detriment of the consumer. The Terms of Service dataset consists of 9,414 examples.
Prepaid Product Agreements Database
datasets.ai
catalog.data.gov
+1 more
Updated Aug 26, 2024
Cite
Consumer Financial Protection Bureau (2024). Prepaid Product Agreements Database [Dataset]. https://datasets.ai/datasets/prepaid-product-agreements-database
Explore at:
Dataset updated
Aug 26, 2024
Dataset authored and provided by
Consumer Financial Protection Bureau
Description
Prepaid account agreement data, which contain general terms and conditions, pricing, and fee information, that issuers submit to the Bureau under the terms of the Prepaid Rule. Data is refreshed nightly.
VHA Data Sharing Agreement Repository
catalog.data.gov
data.va.gov
+4 more
Updated May 1, 2021
Cite
Department of Veterans Affairs (2021). VHA Data Sharing Agreement Repository [Dataset]. https://catalog.data.gov/dataset/vha-data-sharing-agreement-repository
Explore at:
Dataset updated
May 1, 2021
Dataset provided by
United States Department of Veterans Affairs (http://va.gov/)
Description
The VHA Data Sharing Agreement Repository serves as a centralized location to collect and report on agreements that share VHA data with entities outside of VA. It provides senior management an overall view of existing data sharing agreements; fosters productive sharing of health information with VHA's external partners; and streamlines data acquisition to improve data management responsibilities overall. Agreements that VHA has established with entities within the VA are not candidates for this Repository.
Contract Discovery Dataset
paperswithcode.com
opendatalab.com
Updated Oct 16, 2022
Cite
Łukasz Borchmann; Dawid Wiśniewski; Andrzej Gretkowski; Izabela Kosmala; Dawid Jurkiewicz; Łukasz Szałkiewicz; Gabriela Pałka; Karol Kaczmarek; Agnieszka Kaliska; Filip Graliński (2022). Contract Discovery Dataset [Dataset]. https://paperswithcode.com/dataset/contract-discovery
Explore at:
Dataset updated
Oct 16, 2022
Authors
Łukasz Borchmann; Dawid Wiśniewski; Andrzej Gretkowski; Izabela Kosmala; Dawid Jurkiewicz; Łukasz Szałkiewicz; Gabriela Pałka; Karol Kaczmarek; Agnieszka Kaliska; Filip Graliński
Description
A shared task on semantic retrieval from legal texts, called contract discovery, in which legal clauses are extracted from documents given a few examples of similar clauses from other legal acts.
Atticus Open Contract Dataset (AOK) (beta)
live.european-language-grid.eu
explore.openaire.eu
+2 more
csv
Updated Jun 22, 2023
Cite
(2023). Atticus Open Contract Dataset (AOK) (beta) [Dataset]. https://live.european-language-grid.eu/catalogue/corpus/7648
Explore at:
Available download formats: csv
Dataset updated
Jun 22, 2023
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Atticus Open Contract Dataset (AOK) (beta) is a corpus of 5,000+ labels in 200 commercial legal contracts that have been manually labeled by legal experts to identify 40 types of clauses that are important during contract review in connection with corporate transactions, such as mergers and acquisitions, IPOs, and corporate financing. The AOK Dataset is curated and maintained by The Atticus Project, Inc., a non-profit organization, to support NLP research and development in legal contract review. If you download this dataset, we'd love to know more about you and your project! Please fill out this short form: https://forms.gle/h47GUENTTbBqH39m7
Check out our website at atticusprojectai.org.
Update: The expanded 1.0 version of the dataset is available at https://zenodo.org/record/4595826
kl3m-data-edgar-agreements
huggingface.co
Updated Apr 10, 2025
+ more versions
Cite
ALEA Institute (2025). kl3m-data-edgar-agreements [Dataset]. https://huggingface.co/datasets/alea-institute/kl3m-data-edgar-agreements
Explore at:
Croissant: Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Apr 10, 2025
Authors
ALEA Institute
Description
KL3M Data Project

Note: This page provides general information about the KL3M Data Project. Additional details specific to this dataset will be added in future updates. For complete information, please visit the GitHub repository or refer to the KL3M Data Project paper.

Description

This dataset is part of the ALEA Institute's KL3M Data Project, which provides copyright-clean training resources for large language models.

Dataset Details

Format: Parquet… See the full description on the dataset page: https://huggingface.co/datasets/alea-institute/kl3m-data-edgar-agreements.
OCP Procurement Agreements
detroitdata.org
data.ferndalemi.gov
+2 more
Updated Jan 24, 2025
Cite
City of Detroit (2025). OCP Procurement Agreements [Dataset]. https://detroitdata.org/dataset/ocp-procurement-agreements
Explore at:
Available download formats: arcgis geoservices rest api, geojson, kml, zip, html, csv
Dataset updated
Jan 24, 2025
Dataset provided by
City of Detroit
Description
The Procurement Agreements dataset provides details about contract agreements between the City of Detroit and suppliers who provide materials, equipment, and services to the City. Initial and amended contracts and purchase orders associated with the contracts are included in the dataset. In some cases, purchase orders are generated to pay suppliers for work completed under a contract. If available, a link to the contract agreement document in PDF format is provided in the 'Contract Link' field of each record (row) in the dataset.

This dataset is updated weekly with data from the Office of Contracting and Procurement (OCP).
Annotated Terms of Service of 100 Online Platforms
data.mendeley.com
Updated Dec 12, 2023
+ more versions
Cite
Przemyslaw Palka (2023). Annotated Terms of Service of 100 Online Platforms [Dataset]. http://doi.org/10.17632/dtbj87j937.3
Explore at:
Unique identifier
https://doi.org/10.17632/dtbj87j937.3
Dataset updated
Dec 12, 2023
Authors
Przemyslaw Palka
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The dataset contains information about the contents of 100 Terms of Service (ToS) of online platforms. The documents were analyzed and evaluated from the point of view of the European Union consumer law. The main results have been presented in the table titled "Terms of Service Analysis and Evaluation_RESULTS." This table is accompanied by the instruction followed by the annotators, titled "Variables Definitions," allowing for the interpretation of the assigned values. In addition, we provide the raw data (analyzed ToS, in the folder "Clear ToS") and the annotated documents (in the folder "Annotated ToS," further subdivided).

SAMPLE: The sample contains 100 contracts of digital platforms operating in sixteen market sectors: Cloud storage, Communication, Dating, Finance, Food, Gaming, Health, Music, Shopping, Social, Sports, Transportation, Travel, Video, Work, and Various. The selected companies' main headquarters span four legal surroundings: the US, the EU, Poland specifically, and Other jurisdictions. The chosen platforms are both privately held and publicly listed and offer both fee-based and free services. Although the sample cannot be treated as representative of all online platforms, it nevertheless accounts for the most popular consumer services in the analyzed sectors and contains a diverse and heterogeneous set.

CONTENT: Each ToS has been assigned the following information: 1. Metadata: 1.1. the name of the service; 1.2. the URL; 1.3. the effective date; 1.4. the language of ToS; 1.5. the sector; 1.6. the number of words in ToS; 1.7–1.8. the jurisdiction of the main headquarters; 1.9. if the company is public or private; 1.10. if the service is paid or free. 2. Evaluative Variables: remedy clauses (2.1–2.5); dispute resolution clauses (2.6–2.10); unilateral alteration clauses (2.11–2.15); rights to police the behavior of users (2.16–2.17); regulatory requirements (2.18–2.20); and various (2.21–2.25). 3. Count Variables: the number of clauses seen as unclear (3.1) and the number of other documents referred to by the ToS (3.2). 4. Pull-out Text Variables: rights and obligations of the parties (4.1) and descriptions of the service (4.2).

ACKNOWLEDGEMENT: The research leading to these results has received funding from the Norwegian Financial Mechanism 2014-2021, project no. 2020/37/K/HS5/02769, titled “Private Law of Data: Concepts, Practices, Principles & Politics.”
Public Contracts
catalog.data.gov
data.bloomington.in.gov
+1 more
Updated Jul 5, 2025
+ more versions
Cite
data.bloomington.in.gov (2025). Public Contracts [Dataset]. https://catalog.data.gov/dataset/public-contracts
Explore at:
Dataset updated
Jul 5, 2025
Dataset provided by
data.bloomington.in.gov
Description
Public contracts with the City of Bloomington since 2018.
Bilateral Labor Agreements Dataset - Version 2 (2022)
dataone.org
Updated Nov 8, 2023
+ more versions
Cite
Chilton, Adam (2023). Bilateral Labor Agreements Dataset - Version 2 (2022) [Dataset]. http://doi.org/10.7910/DVN/QWSLYI
Explore at:
Unique identifier
https://doi.org/10.7910/DVN/QWSLYI
Dataset updated
Nov 8, 2023
Dataset provided by
Harvard Dataverse
Authors
Chilton, Adam
Description
This dataset provides three resources on Bilateral Labor Agreements signed between 1945 and 2020. More information on this project is available at: https://www.law.uchicago.edu/bilateral-labor-agreements-dataset.
Investor Relations - Credit Agreements
catalog.data.gov
datacatalog.cookcountyil.gov
+3 more
Updated Nov 29, 2021
Cite
datacatalog.cookcountyil.gov (2021). Investor Relations - Credit Agreements [Dataset]. https://catalog.data.gov/dataset/investor-relations-credit-agreements
Explore at:
Dataset updated
Nov 29, 2021
Dataset provided by
datacatalog.cookcountyil.gov
Description
The County is a party to various credit agreements, including short-term notes, Direct Pay variable rate agreements, Direct Placement variable rate agreements, and an operating Line of Credit. Current credit agreements that the County is a party to are made available below.
Non-trade issues in preferential trade agreements dataset (issues and scope)...
datacatalogue.cessda.eu
data.aussda.at
Updated Sep 14, 2024
Cite
Lechner, Lisa (2024). Non-trade issues in preferential trade agreements dataset (issues and scope) between 1945 and 2020 (OA edition) [Dataset]. http://doi.org/10.11587/Z4BPCP
Explore at:
Unique identifier
https://doi.org/10.11587/Z4BPCP
Dataset updated
Sep 14, 2024
Dataset provided by
University of Innsbruck
Authors
Lechner, Lisa
Time period covered
Oct 30, 2013 - Dec 31, 2021
Area covered
Austria
Variables measured
Text unit
Measurement technique
Content coding
Description
Full edition for public use. This dataset contains information on which non-trade issues occur in preferential trade agreements signed between 1945 and 2020. These range from various environmental protection issues, such as clean air or biodiversity, through basic human rights, such as the right to vote, to economic and social rights, such as the prohibition of forced labor.
Albemarle Open Space Use Agreement
data-old-uvalibrary.opendata.arcgis.com
data-uvalibrary.opendata.arcgis.com
Updated Aug 9, 2019
+ more versions
Cite
University of Virginia (2019). Albemarle Open Space Use Agreement [Dataset]. https://data-old-uvalibrary.opendata.arcgis.com/datasets/albemarle-open-space-use-agreement
Explore at:
Dataset updated
Aug 9, 2019
Dataset authored and provided by
University of Virginia
Description
This dataset contains all parcels that are currently under an Open Space Use Agreement between the owners of the parcel and the County of Albemarle. These agreements limit construction and development activity on the property owner's land, and last from 4 to 10 years. For more information on any particular agreement, contact the Real Estate division of the County of Albemarle's Finance Department.
Bilateral Labor Agreements Dataset and Additional Replication Data for:...
dataverse.harvard.edu
Updated May 13, 2019
Cite
MARGARET PETERS (2019). Bilateral Labor Agreements Dataset and Additional Replication Data for: Immigration and International Law [Dataset]. http://doi.org/10.7910/DVN/9ADZUF
Explore at:
Croissant: Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Unique identifier
https://doi.org/10.7910/DVN/9ADZUF
Dataset updated
May 13, 2019
Dataset provided by
Harvard Dataverse
Authors
MARGARET PETERS
License
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
Bilateral Labor Agreements Dataset and additional replication files for "Immigration and International Law"
understanding agreements - Dataset - Open Government Data
opendata.gov.jo
Updated Apr 6, 2023
Cite
(2023). understanding agreements - Dataset - Open Government Data [Dataset]. https://opendata.gov.jo/dataset/understanding-agreements-1875-2023
Explore at:
Dataset updated
Apr 6, 2023
Description
understanding agreements
Dataset of news about General Agreement on Tariffs and Trade...
workwithdata.com
Updated May 16, 2025
+ more versions
Cite
Work With Data (2025). Dataset of news about General Agreement on Tariffs and Trade (Organization)-History [Dataset]. https://www.workwithdata.com/datasets/news?f=1&fcol0=page_name&fop0=%3D&fval0=General+Agreement+on+Tariffs+and+Trade+%28Organization%29-History
Explore at:
Dataset updated
May 16, 2025
Dataset authored and provided by
Work With Data
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset is about news. It has 3 rows and is filtered where the keywords include General Agreement on Tariffs and Trade (Organization)-History. It features 10 columns including source, publication date, section, and news link.
Credit Card Agreements Database
datasets.ai
catalog.data.gov
+1 more
Updated Aug 7, 2024
+ more versions
Cite
Consumer Financial Protection Bureau (2024). Credit Card Agreements Database [Dataset]. https://datasets.ai/datasets/credit-card-agreements-database
Explore at:
Dataset updated
Aug 7, 2024
Dataset authored and provided by
Consumer Financial Protection Bureau (http://www.consumerfinance.gov/)
Description
The Credit Card Agreements (CCA) database includes credit card agreements from more than 600 card issuers. These agreements include general terms and conditions, pricing, and fee information and are collected quarterly pursuant to requirements in the CARD Act.
MAUD v1
zenodo.org
data.niaid.nih.gov
zip
Updated Jul 15, 2024
Cite
The Atticus Project; The Atticus Project (2024). MAUD v1 [Dataset]. http://doi.org/10.5281/zenodo.7500064
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.5281/zenodo.7500064
Dataset updated
Jul 15, 2024
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
The Atticus Project; The Atticus Project
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Merger Agreement Understanding Dataset (MAUD) v1 is a corpus of 47,000+ labels in 152 merger agreements that have been manually labeled under the supervision of experienced lawyers to identify 92 questions in each agreement used by the 2021 American Bar Association (ABA) Public Target Deal Points Study.

MAUD is curated and maintained by The Atticus Project, Inc. to support NLP research and development in legal contract review.

ReadMe and Datasheet are published here. Code for replicating the results, together with the model trained on CUAD, is published on GitHub here.

Contract Understanding Atticus Dataset (CUAD)

A dataset of legal contracts with rich expert annotations.

Explore at:

44 scholarly articles cite this dataset (View in Google Scholar)
