39 datasets found
  1. Contract Understanding Atticus Dataset (CUAD)

    • kaggle.com
    Updated Mar 12, 2021
    The Atticus Project (2021). Contract Understanding Atticus Dataset (CUAD) [Dataset]. http://doi.org/10.34740/kaggle/dsv/2015428
    Dataset updated
    Mar 12, 2021
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    The Atticus Project
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Please download the full version of the dataset from Zenodo, here.

    Contract Understanding Atticus Dataset (CUAD) v1 is a corpus of more than 13,000 labels in 510 commercial legal contracts that have been manually labeled by The Atticus Project to identify 41 categories of important clauses that lawyers look for when reviewing contracts.

    We tested CUAD v1 against ten pretrained AI models and published the results on arXiv here.

    Code for replicating the results, together with the model trained on CUAD, is published on GitHub here.

  2. CUAD (Contract Understanding Atticus Dataset)

    • opendatalab.com
    zip
    Updated Sep 22, 2022
    + more versions
    The Nueva School (2022). CUAD (Contract Understanding Atticus Dataset) [Dataset]. https://opendatalab.com/OpenDataLab/CUAD
    Available download formats: zip (18309308 bytes)
    Dataset updated
    Sep 22, 2022
    Dataset provided by
    The Nueva School
    University of California, Berkeley
    Description

    Contract Understanding Atticus Dataset (CUAD) is a dataset for legal contract review. CUAD was created with dozens of legal experts from The Atticus Project and consists of over 13,000 annotations. The task is to highlight salient portions of a contract that are important for a human to review.

  3. filtered-cuad

    • huggingface.co
    Updated Aug 1, 2011
    Alex Apostolopoulos (2011). filtered-cuad [Dataset]. https://huggingface.co/datasets/alex-apostolo/filtered-cuad
    Dataset updated
    Aug 1, 2011
    Authors
    Alex Apostolopoulos
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Dataset Card for filtered_cuad

      Dataset Summary
    

    Contract Understanding Atticus Dataset (CUAD) v1 is a corpus of more than 13,000 labels in 510 commercial legal contracts that have been manually labeled to identify 41 categories of important clauses that lawyers look for when reviewing contracts in connection with corporate transactions. This dataset is a filtered version of CUAD. It excludes legal contracts with an Agreement date prior to 2002 and contracts which are not… See the full description on the dataset page: https://huggingface.co/datasets/alex-apostolo/filtered-cuad.
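
    The date criterion described above can be sketched as a simple filter. This is an illustration only, not the dataset author's code: the `agreement_date` field name, the record layout, and the choice to drop undated contracts are all assumptions, and the second (truncated) exclusion criterion is not modeled.

```python
from datetime import date

# Contracts dated before this cutoff are excluded (per the description above).
CUTOFF = date(2002, 1, 1)


def keep_contract(contract: dict) -> bool:
    """Return True if the contract has an agreement date on or after the cutoff.

    Assumption: records carry a hypothetical `agreement_date` field holding a
    datetime.date or None; undated contracts are dropped in this sketch.
    """
    agreement_date = contract.get("agreement_date")
    return agreement_date is not None and agreement_date >= CUTOFF


def filter_contracts(contracts: list) -> list:
    """Keep only the contracts that pass the date criterion."""
    return [c for c in contracts if keep_contract(c)]
```

    Calling `filter_contracts` on a list of such records returns only those dated 2002 or later.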

  4. cuad_qa

    • huggingface.co
    Updated Sep 15, 1999
    Chenghao Mou (1999). cuad_qa [Dataset]. https://huggingface.co/datasets/chenghao/cuad_qa
    Dataset updated
    Sep 15, 1999
    Authors
    Chenghao Mou
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Dataset Card for CUAD

    This is a modified version of the original CUAD that trims each question to its label form.

      Dataset Summary
    

    Contract Understanding Atticus Dataset (CUAD) v1 is a corpus of more than 13,000 labels in 510 commercial legal contracts that have been manually labeled to identify 41 categories of important clauses that lawyers look for when reviewing contracts in connection with corporate transactions. CUAD is curated and maintained by The Atticus Project, Inc.… See the full description on the dataset page: https://huggingface.co/datasets/chenghao/cuad_qa.

  5. cuad-deepseek

    • huggingface.co
    Updated May 9, 2025
    ZenML (2025). cuad-deepseek [Dataset]. https://huggingface.co/datasets/zenml/cuad-deepseek
    Dataset updated
    May 9, 2025
    Dataset authored and provided by
    ZenML
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    CUAD-DeepSeek: Enhanced Legal Contract Understanding Dataset

    CUAD-DeepSeek is an enhanced version of the Contract Understanding Atticus Dataset (CUAD), enriched with expert rationales and reasoning traces provided by the DeepSeek language model. This dataset aims to improve legal contract analysis by providing not just classifications but detailed explanations for why specific clauses belong to particular legal categories.

      Purpose and Scope
    

    Legal contract review is… See the full description on the dataset page: https://huggingface.co/datasets/zenml/cuad-deepseek.

  6. CUAD_v1_Contract_Understanding_PDF

    • huggingface.co
    Updated Jan 20, 2025
    + more versions
    Daniel Voigt Godoy (2025). CUAD_v1_Contract_Understanding_PDF [Dataset]. https://huggingface.co/datasets/dvgodoy/CUAD_v1_Contract_Understanding_PDF
    Dataset updated
    Jan 20, 2025
    Authors
    Daniel Voigt Godoy
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Dataset Card for Contract Understanding Atticus Dataset (CUAD) PDF

    This dataset contains the PDFs and the full text of 509 commercial legal contracts from the original CUAD dataset. One of the original 510 contracts was removed due to being a scanned copy. The extracted text was cleaned using clean-text. The PDFs were encoded in base64 and added as the pdf_bytes_base64 feature. You can easily and quickly load it: dataset = load_dataset("dvgodoy/CUAD_v1_Contract_Understanding_PDF")… See the full description on the dataset page: https://huggingface.co/datasets/dvgodoy/CUAD_v1_Contract_Understanding_PDF.
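
    As a rough sketch of how the `pdf_bytes_base64` feature might be turned back into a PDF: the `load_dataset` call and the feature name come from the description above, while the helper names, the header check, and the choice to read the first available split are assumptions made for illustration.

```python
import base64

# Every well-formed PDF file begins with this marker.
PDF_MAGIC = b"%PDF-"


def decode_pdf(record: dict, field: str = "pdf_bytes_base64") -> bytes:
    """Decode one record's base64-encoded PDF and sanity-check its header."""
    # b64decode accepts both ASCII str and bytes input.
    pdf = base64.b64decode(record[field])
    if not pdf.startswith(PDF_MAGIC):
        raise ValueError("decoded payload does not look like a PDF")
    return pdf


def first_contract_pdf() -> bytes:
    """Download the dataset and decode the first contract (needs network).

    Hypothetical helper: requires the `datasets` package; the split is chosen
    by taking whichever one comes first, since split names are not stated here.
    """
    from datasets import load_dataset

    dataset = load_dataset("dvgodoy/CUAD_v1_Contract_Understanding_PDF")
    first_split = next(iter(dataset.values()))
    return decode_pdf(first_split[0])
```

    The returned bytes can be written to disk with `open("contract.pdf", "wb").write(...)` or passed to a PDF parser.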

  7. MAUD v1

    • zenodo.org
    zip
    Updated Jul 15, 2024
    The Atticus Project (2024). MAUD v1 [Dataset]. http://doi.org/10.5281/zenodo.7500064
    Available download formats: zip
    Dataset updated
    Jul 15, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    The Atticus Project
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Merger Agreement Understanding Dataset (MAUD) v1 is a corpus of 47,000+ labels in 152 merger agreements that have been manually labeled under the supervision of experienced lawyers to identify 92 questions in each agreement used by the 2021 American Bar Association (ABA) Public Target Deal Points Study.

    MAUD is curated and maintained by The Atticus Project, Inc. to support NLP research and development in legal contract review.

    ReadMe and Datasheet are published here. Code for replicating the results, together with the model trained on CUAD, is published on GitHub here.

  8. CUADGoverningLawLegalBenchClassification

    • huggingface.co
    Updated May 11, 2025
    Massive Text Embedding Benchmark (2025). CUADGoverningLawLegalBenchClassification [Dataset]. https://huggingface.co/datasets/mteb/CUADGoverningLawLegalBenchClassification
    Dataset updated
    May 11, 2025
    Dataset authored and provided by
    Massive Text Embedding Benchmark
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    CUADGoverningLawLegalBenchClassification An MTEB dataset Massive Text Embedding Benchmark

    This task was constructed from the CUAD dataset. It consists of determining if the clause specifies which state/country’s law governs the contract.

    Task category t2c

    Domains Legal, Written

    Reference https://huggingface.co/datasets/nguha/legalbench

      How to evaluate on this task
    

    You can evaluate an embedding model on this dataset using the following code: import mteb… See the full description on the dataset page: https://huggingface.co/datasets/mteb/CUADGoverningLawLegalBenchClassification.

  9. CUADInsuranceLegalBenchClassification

    • huggingface.co
    Updated May 11, 2025
    Massive Text Embedding Benchmark (2025). CUADInsuranceLegalBenchClassification [Dataset]. https://huggingface.co/datasets/mteb/CUADInsuranceLegalBenchClassification
    Dataset updated
    May 11, 2025
    Dataset authored and provided by
    Massive Text Embedding Benchmark
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    CUADInsuranceLegalBenchClassification An MTEB dataset Massive Text Embedding Benchmark

    This task was constructed from the CUAD dataset. It consists of determining if the clause creates a requirement for insurance that must be maintained by one party for the benefit of the counterparty.

    Task category t2c

    Domains Legal, Written

    Reference https://huggingface.co/datasets/nguha/legalbench

      How to evaluate on this task
    

    You can evaluate an embedding model on this… See the full description on the dataset page: https://huggingface.co/datasets/mteb/CUADInsuranceLegalBenchClassification.

  10. MAUD v1

    • data.niaid.nih.gov
    Updated Jul 15, 2024
    The Atticus Project (2024). MAUD v1 [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7500063
    Dataset updated
    Jul 15, 2024
    Dataset authored and provided by
    The Atticus Project
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Merger Agreement Understanding Dataset (MAUD) v1 is a corpus of 47,000+ labels in 152 merger agreements that have been manually labeled under the supervision of experienced lawyers to identify 92 questions in each agreement used by the 2021 American Bar Association (ABA) Public Target Deal Points Study.

    MAUD is curated and maintained by The Atticus Project, Inc. to support NLP research and development in legal contract review.

    ReadMe and Datasheet are published here. Code for replicating the results, together with the model trained on CUAD, is published on GitHub here.

  11. Merger Agreement Understanding Dataset (MAUD) Dataset

    • paperswithcode.com
    Updated Jan 1, 2023
    Steven H. Wang; Antoine Scardigli; Leonard Tang; Wei Chen; Dimitry Levkin; Anya Chen; Spencer Ball; Thomas Woodside; Oliver Zhang; Dan Hendrycks (2023). Merger Agreement Understanding Dataset (MAUD) Dataset [Dataset]. https://paperswithcode.com/dataset/merger-agreement-understanding-dataset-maud
    Dataset updated
    Jan 1, 2023
    Authors
    Steven H. Wang; Antoine Scardigli; Leonard Tang; Wei Chen; Dimitry Levkin; Anya Chen; Spencer Ball; Thomas Woodside; Oliver Zhang; Dan Hendrycks
    Description

    MAUD is an expert-annotated merger agreement reading comprehension dataset based on the American Bar Association's 2021 Public Target Deal Points study, where lawyers and law students answered 92 questions about 152 merger agreements.

    With over 39,000 examples and 47,000 total annotations, it is the largest expert-annotated legal reading comprehension dataset in the English language, as well as the first expert-annotated merger agreement dataset.

  12. CUADAntiAssignmentLegalBenchClassification

    • huggingface.co
    Updated May 11, 2025
    Massive Text Embedding Benchmark (2025). CUADAntiAssignmentLegalBenchClassification [Dataset]. https://huggingface.co/datasets/mteb/CUADAntiAssignmentLegalBenchClassification
    Dataset updated
    May 11, 2025
    Dataset authored and provided by
    Massive Text Embedding Benchmark
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    CUADAntiAssignmentLegalBenchClassification An MTEB dataset Massive Text Embedding Benchmark

    This task was constructed from the CUAD dataset. It consists of determining if the clause requires consent or notice of a party if the contract is assigned to a third party.

    Task category t2c

    Domains Legal, Written

    Reference https://huggingface.co/datasets/nguha/legalbench

      How to evaluate on this task
    

    You can evaluate an embedding model on this dataset using the… See the full description on the dataset page: https://huggingface.co/datasets/mteb/CUADAntiAssignmentLegalBenchClassification.

  13. CUADWarrantyDurationLegalBenchClassification

    • huggingface.co
    Updated May 11, 2025
    Massive Text Embedding Benchmark (2025). CUADWarrantyDurationLegalBenchClassification [Dataset]. https://huggingface.co/datasets/mteb/CUADWarrantyDurationLegalBenchClassification
    Dataset updated
    May 11, 2025
    Dataset authored and provided by
    Massive Text Embedding Benchmark
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    CUADWarrantyDurationLegalBenchClassification An MTEB dataset Massive Text Embedding Benchmark

    This task was constructed from the CUAD dataset. It consists of determining if the clause specifies a duration of any warranty against defects or errors in technology, products, or services provided under the contract.

    Task category t2c

    Domains Legal, Written

    Reference https://huggingface.co/datasets/nguha/legalbench

      How to evaluate on this task
    

    You can evaluate an… See the full description on the dataset page: https://huggingface.co/datasets/mteb/CUADWarrantyDurationLegalBenchClassification.

  14. CUADLicenseGrantLegalBenchClassification

    • huggingface.co
    Updated May 11, 2025
    + more versions
    Massive Text Embedding Benchmark (2025). CUADLicenseGrantLegalBenchClassification [Dataset]. https://huggingface.co/datasets/mteb/CUADLicenseGrantLegalBenchClassification
    Dataset updated
    May 11, 2025
    Dataset authored and provided by
    Massive Text Embedding Benchmark
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    CUADLicenseGrantLegalBenchClassification An MTEB dataset Massive Text Embedding Benchmark

    This task was constructed from the CUAD dataset. It consists of determining if the clause contains a license granted by one party to its counterparty.

    Task category t2c

    Domains Legal, Written

    Reference https://huggingface.co/datasets/nguha/legalbench

      How to evaluate on this task
    

    You can evaluate an embedding model on this dataset using the following code: import… See the full description on the dataset page: https://huggingface.co/datasets/mteb/CUADLicenseGrantLegalBenchClassification.

  15. CUADThirdPartyBeneficiaryLegalBenchClassification

    • huggingface.co
    Updated May 11, 2025
    Massive Text Embedding Benchmark (2025). CUADThirdPartyBeneficiaryLegalBenchClassification [Dataset]. https://huggingface.co/datasets/mteb/CUADThirdPartyBeneficiaryLegalBenchClassification
    Dataset updated
    May 11, 2025
    Dataset authored and provided by
    Massive Text Embedding Benchmark
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    CUADThirdPartyBeneficiaryLegalBenchClassification An MTEB dataset Massive Text Embedding Benchmark

    This task was constructed from the CUAD dataset. It consists of determining if the clause specifies that there is a non-contracting party who is a beneficiary to some or all of the clauses in the contract and therefore can enforce its rights against a contracting party.

    Task category t2c

    Domains Legal, Written

    Reference https://huggingface.co/datasets/nguha/legalbench… See the full description on the dataset page: https://huggingface.co/datasets/mteb/CUADThirdPartyBeneficiaryLegalBenchClassification.

  16. CUADMostFavoredNationLegalBenchClassification

    • huggingface.co
    Updated May 11, 2025
    Massive Text Embedding Benchmark (2025). CUADMostFavoredNationLegalBenchClassification [Dataset]. https://huggingface.co/datasets/mteb/CUADMostFavoredNationLegalBenchClassification
    Dataset updated
    May 11, 2025
    Dataset authored and provided by
    Massive Text Embedding Benchmark
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    CUADMostFavoredNationLegalBenchClassification An MTEB dataset Massive Text Embedding Benchmark

    This task was constructed from the CUAD dataset. It consists of determining if the clause provides that, if a third party gets better terms on the licensing or sale of technology/goods/services described in the contract, the buyer of such technology/goods/services under the contract shall be entitled to those better terms.

    Task category t2c

    Domains Legal, Written

    Reference… See the full description on the dataset page: https://huggingface.co/datasets/mteb/CUADMostFavoredNationLegalBenchClassification.

  17. CUADRevenueProfitSharingLegalBenchClassification

    • huggingface.co
    Updated May 11, 2025
    Massive Text Embedding Benchmark (2025). CUADRevenueProfitSharingLegalBenchClassification [Dataset]. https://huggingface.co/datasets/mteb/CUADRevenueProfitSharingLegalBenchClassification
    Dataset updated
    May 11, 2025
    Dataset authored and provided by
    Massive Text Embedding Benchmark
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    CUADRevenueProfitSharingLegalBenchClassification An MTEB dataset Massive Text Embedding Benchmark

    This task was constructed from the CUAD dataset. It consists of determining if the clause requires a party to share revenue or profit with the counterparty for any technology, goods, or services.

    Task category t2c

    Domains Legal, Written

    Reference https://huggingface.co/datasets/nguha/legalbench

      How to evaluate on this task
    

    You can evaluate an embedding… See the full description on the dataset page: https://huggingface.co/datasets/mteb/CUADRevenueProfitSharingLegalBenchClassification.

  18. CUADExpirationDateLegalBenchClassification

    • huggingface.co
    Updated May 11, 2025
    + more versions
    Massive Text Embedding Benchmark (2025). CUADExpirationDateLegalBenchClassification [Dataset]. https://huggingface.co/datasets/mteb/CUADExpirationDateLegalBenchClassification
    Dataset updated
    May 11, 2025
    Dataset authored and provided by
    Massive Text Embedding Benchmark
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    CUADExpirationDateLegalBenchClassification An MTEB dataset Massive Text Embedding Benchmark

    This task was constructed from the CUAD dataset. It consists of determining if the clause specifies the date upon which the initial term expires.

    Task category t2c

    Domains Legal, Written

    Reference https://huggingface.co/datasets/nguha/legalbench

      How to evaluate on this task
    

    You can evaluate an embedding model on this dataset using the following code: import mteb… See the full description on the dataset page: https://huggingface.co/datasets/mteb/CUADExpirationDateLegalBenchClassification.

  19. CUADPriceRestrictionsLegalBenchClassification

    • huggingface.co
    Updated May 11, 2025
    Massive Text Embedding Benchmark (2025). CUADPriceRestrictionsLegalBenchClassification [Dataset]. https://huggingface.co/datasets/mteb/CUADPriceRestrictionsLegalBenchClassification
    Dataset updated
    May 11, 2025
    Dataset authored and provided by
    Massive Text Embedding Benchmark
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    CUADPriceRestrictionsLegalBenchClassification An MTEB dataset Massive Text Embedding Benchmark

    This task was constructed from the CUAD dataset. It consists of determining if the clause places a restriction on the ability of a party to raise or reduce prices of technology, goods, or services provided.

    Task category t2c

    Domains Legal, Written

    Reference https://huggingface.co/datasets/nguha/legalbench

      How to evaluate on this task
    

    You can evaluate an… See the full description on the dataset page: https://huggingface.co/datasets/mteb/CUADPriceRestrictionsLegalBenchClassification.

  20. CUADVolumeRestrictionLegalBenchClassification

    • huggingface.co
    Updated May 11, 2025
    Massive Text Embedding Benchmark (2025). CUADVolumeRestrictionLegalBenchClassification [Dataset]. https://huggingface.co/datasets/mteb/CUADVolumeRestrictionLegalBenchClassification
    Dataset updated
    May 11, 2025
    Dataset authored and provided by
    Massive Text Embedding Benchmark
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    CUADVolumeRestrictionLegalBenchClassification An MTEB dataset Massive Text Embedding Benchmark

    This task was constructed from the CUAD dataset. It consists of determining if the clause specifies a fee increase, consent requirement, etc. if one party's use of the product/services exceeds a certain threshold.

    Task category t2c

    Domains Legal, Written

    Reference https://huggingface.co/datasets/nguha/legalbench

      How to evaluate on this task
    

    You can evaluate an… See the full description on the dataset page: https://huggingface.co/datasets/mteb/CUADVolumeRestrictionLegalBenchClassification.

Contract Understanding Atticus Dataset (CUAD)

A dataset of legal contracts with rich expert annotations.

