6 datasets found
  1. Contract Understanding Atticus Dataset (CUAD)

    • kaggle.com
    Updated Mar 12, 2021
    Cite
    The Atticus Project (2021). Contract Understanding Atticus Dataset (CUAD) [Dataset]. http://doi.org/10.34740/kaggle/dsv/2015428
    Dataset updated
    Mar 12, 2021
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    The Atticus Project
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Please download the full version of the dataset from Zenodo, here.

    Contract Understanding Atticus Dataset (CUAD) v1 is a corpus of more than 13,000 labels in 510 commercial legal contracts that have been manually labeled by The Atticus Project to identify 41 categories of important clauses that lawyers look for when reviewing contracts.

    We tested CUAD v1 against ten pretrained AI models and published the results on arXiv here.

    Code for replicating the results, together with the model trained on CUAD, is published on GitHub here.

  2. Atticus Open Contract Dataset (AOK) (beta)

    • explore.openaire.eu
    • live.european-language-grid.eu
    • +2 more
    Updated Oct 5, 2020
    Cite
    The Atticus Project (2020). Atticus Open Contract Dataset (AOK) (beta) [Dataset]. http://doi.org/10.5281/zenodo.4064880
    Dataset updated
    Oct 5, 2020
    Authors
    The Atticus Project
    Description

    Atticus Open Contract Dataset (AOK) (beta) is a corpus of 5,000+ labels in 200 commercial legal contracts that have been manually labeled by legal experts to identify 40 types of clauses that are important during contract review in connection with corporate transactions, such as mergers and acquisitions, IPOs, and corporate financing. The AOK Dataset is curated and maintained by The Atticus Project, Inc., a non-profit organization, to support NLP research and development in legal contract review. If you download this dataset, we'd love to know more about you and your project! Please fill out this short form: https://forms.gle/h47GUENTTbBqH39m7. Check out our website at atticusprojectai.org. Update: The expanded 1.0 version of the dataset is available at https://zenodo.org/record/4595826

  3. acord

    • huggingface.co
    Updated Feb 7, 2025
    Cite
    The Atticus Project (2025). acord [Dataset]. https://huggingface.co/datasets/theatticusproject/acord
    Dataset updated
    Feb 7, 2025
    Authors
    The Atticus Project
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    ACORD: Atticus Clause Retrieval Dataset

    Atticus Clause Retrieval Dataset (ACORD) is the first retrieval benchmark for contract drafting fully annotated by experts. It includes 114 queries and over 126,662 query-clause pairs, each ranked on a scale from 1 to 5 stars. The task is to find the most relevant precedent clauses to a query. ACORD focuses on complex contract clauses such as Limitation of Liability, Indemnification, Change of Control, and Most Favored Nation.

    FORMAT

    ACORD consists of… See the full description on the dataset page: https://huggingface.co/datasets/theatticusproject/acord.

  4. cuad_qa

    • huggingface.co
    Updated Sep 15, 1999
    Cite
    Chenghao Mou (1999). cuad_qa [Dataset]. https://huggingface.co/datasets/chenghao/cuad_qa
    Dataset updated
    Sep 15, 1999
    Authors
    Chenghao Mou
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Dataset Card for CUAD

    This is a modified version of the original CUAD that trims each question to its label form.

    Dataset Summary

    Contract Understanding Atticus Dataset (CUAD) v1 is a corpus of more than 13,000 labels in 510 commercial legal contracts that have been manually labeled to identify 41 categories of important clauses that lawyers look for when reviewing contracts in connection with corporate transactions. CUAD is curated and maintained by The Atticus Project, Inc.… See the full description on the dataset page: https://huggingface.co/datasets/chenghao/cuad_qa.

  5. CUAD (Contract Understanding Atticus Dataset)

    • opendatalab.com
    zip
    Updated Sep 22, 2022
    + more versions
    Cite
    The Nueva School (2022). CUAD (Contract Understanding Atticus Dataset) [Dataset]. https://opendatalab.com/OpenDataLab/CUAD
    Available download formats: zip (18309308 bytes)
    Dataset updated
    Sep 22, 2022
    Dataset provided by
    The Nueva School
    University of California, Berkeley
    Description

    Contract Understanding Atticus Dataset (CUAD) is a dataset for legal contract review. CUAD was created with dozens of legal experts from The Atticus Project and consists of over 13,000 annotations. The task is to highlight salient portions of a contract that are important for a human to review.

  6. MAUD v1

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jul 15, 2024
    Cite
    The Atticus Project (2024). MAUD v1 [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7500063
    Dataset updated
    Jul 15, 2024
    Dataset authored and provided by
    The Atticus Project
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Merger Agreement Understanding Dataset (MAUD) v1 is a corpus of 47,000+ labels in 152 merger agreements that have been manually labeled under the supervision of experienced lawyers to identify 92 questions in each agreement used by the 2021 American Bar Association (ABA) Public Target Deal Points Study.

    MAUD is curated and maintained by The Atticus Project, Inc. to support NLP research and development in legal contract review.

    ReadMe and Datasheet are published here. Code for replicating the results, together with the model trained on CUAD, is published on GitHub here.



Contract Understanding Atticus Dataset (CUAD)

A dataset of legal contracts with rich expert annotations.

45 scholarly articles cite this dataset (View in Google Scholar)