56 datasets found

Supreme Court Judgment Prediction
kaggle.com
zip
Updated Dec 16, 2021
Cite
Deep Contractor (2021). Supreme Court Judgment Prediction [Dataset]. https://www.kaggle.com/datasets/deepcontractor/supreme-court-judgment-prediction
Explore at:
Available download formats: zip (1397072 bytes)
Dataset updated
Dec 16, 2021
Authors
Deep Contractor
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
Context

Artificial intelligence is now used in many domains, and the legal system is no exception. However, very few well-annotated datasets of legal documents from the Supreme Court of the United States (SCOTUS) are available for public use. Even though Supreme Court rulings are in the public domain, doing meaningful work with them is much harder than it should be, because the data must be gathered and processed manually from scratch each time. Hence, our goal is to create a high-quality dataset of SCOTUS cases that can be readily used in natural language processing (NLP) research and other data-driven applications. Additionally, recent advances in NLP provide the tools to build predictive models that can reveal patterns that influence court decisions. By using NLP algorithms to analyze previous court cases, trained models can predict and classify a court's judgment given the facts of the case from the plaintiff and the defendant in textual format; in other words, the model emulates a human jury by generating a final verdict.

Content

The dataset contains 3304 cases from the Supreme Court of the United States from 1955 to 2021. Each case has the case's identifiers as well as the facts of the case and the decision outcome. Other related datasets rarely included the facts of the case which could prove to be helpful in natural language processing applications. One potential use case of this dataset is determining the outcome of a case using its facts.

Target Variable: First Party Winner. If true, the first party won; if false, the second party won. Use NLP techniques to build features out of the facts column.
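
As a rough illustration of building features from the facts column, the sketch below turns each facts text into a simple term-count vector and the target into a 0/1 label. The column names `facts` and `first_party_winner` and the two example rows are assumptions made for illustration; they are not taken from the dataset files, and a real pipeline would likely use a dedicated NLP library.

```python
import re
from collections import Counter

# Hypothetical rows: the column names and the texts are assumed for illustration.
rows = [
    {"facts": "The petitioner appealed the ruling of the lower court.",
     "first_party_winner": True},
    {"facts": "The court dismissed the appeal of the petitioner.",
     "first_party_winner": False},
]

def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

# Sorted vocabulary over all facts texts, so every vector has the same layout.
vocabulary = sorted({tok for row in rows for tok in tokenize(row["facts"])})

def to_counts(text):
    """Return a term-count vector aligned with `vocabulary`."""
    counts = Counter(tokenize(text))
    return [counts[term] for term in vocabulary]

features = [to_counts(row["facts"]) for row in rows]
labels = [int(row["first_party_winner"])  # 1 = first party won, 0 = second party won
          for row in rows]
```

The resulting `features` and `labels` lists could then be passed to any classifier; term counts are only one possible featurization.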

research team's jupyter notebook: click here

Acknowledgements

Mohammad Alali, Shaayan Syed, Mohammed Alsayed, Smit Patel, Hemanth Bodala
US Supreme Court Cases, 1946-2016
kaggle.com
zip
Updated Jan 25, 2017
Cite
Washington University (2017). US Supreme Court Cases, 1946-2016 [Dataset]. https://www.kaggle.com/datasets/wustl/supreme-court
Explore at:
Available download formats: zip (670011 bytes)
Dataset updated
Jan 25, 2017
Dataset authored and provided by
Washington University
License
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Area covered
United States
Description
Content

The Supreme Court database is the definitive source for researchers, students, journalists, and citizens interested in the United States Supreme Court. The database contains more than two hundred variables regarding each case decided by the Court between the 1946 and 2015 terms. Examples include the identity of the court whose decision the Supreme Court reviewed, the parties to the suit, the legal provisions considered in the case, and the votes of the Justices. The database codebook is available here.

Acknowledgements

The database was compiled by Professor Spaeth of Washington University Law and funded with a grant from the National Science Foundation.
United States Supreme Court Judicial Database Terms Series
catalog.data.gov
datasets.ai
Updated Nov 14, 2025
Cite
Bureau of Justice Statistics (2025). United States Supreme Court Judicial Database Terms Series [Dataset]. https://catalog.data.gov/dataset/united-states-supreme-court-judicial-database-terms-series-aea2b
Explore at:
Dataset updated
Nov 14, 2025
Dataset provided by
Bureau of Justice Statistics (http://bjs.ojp.gov/)
Area covered
United States
Description
Investigator(s): Harold J. Spaeth, James L. Gibson, Michigan State University

This data collection encompasses all aspects of United States Supreme Court decision-making from the beginning of the Warren Court in 1953 up to the completion of the 1995 term of the Rehnquist Court on July 1, 1996, including any decisions made afterward but before the start of the 1996 term on October 7, 1996. In this collection, distinct aspects of the court's decisions are covered by six types of variables: (1) identification variables including case citation, docket number, unit of analysis, and number of records per unit of analysis, (2) background variables offering information on origin of case, source of case, reason for granting cert, parties to the case, direction of the lower court's decision, and manner in which the Court takes jurisdiction, (3) chronological variables covering date of term of court, chief justice, and natural court, (4) substantive variables including multiple legal provisions, authority for decision, issue, issue areas, and direction of decision, (5) outcome variables supplying information on form of decision, disposition of case, winning party, declaration of unconstitutionality, and multiple memorandum decisions, and (6) voting and opinion variables pertaining to the vote in the case and to the direction of the individual justices' votes.

Years Produced: Annually
Data from: Expanded United States Supreme Court Judicial Database, 1946-1968...
catalog.data.gov
icpsr.umich.edu
Updated Nov 14, 2025
Cite
Bureau of Justice Statistics (2025). Expanded United States Supreme Court Judicial Database, 1946-1968 Terms [Dataset]. https://catalog.data.gov/dataset/expanded-united-states-supreme-court-judicial-database-1946-1968-terms-45a5d
Explore at:
Dataset updated
Nov 14, 2025
Dataset provided by
Bureau of Justice Statistics (http://bjs.ojp.gov/)
Area covered
United States
Description
This data collection is an expanded version of UNITED STATES SUPREME COURT JUDICIAL DATABASE, 1953-1996 TERMS (ICPSR 9422), encompassing all aspects of United States Supreme Court decision-making from the beginning of the Vinson Court in 1946 to the end of the Warren Court in 1968. Two major differences distinguish the expanded version of the database from the original collection: the addition of data on the decisions of the Vinson Court, and the inclusion of the conference votes of the Vinson and Warren courts. Whereas the original collection contained only the vote as reported in the UNITED STATES SUPREME COURT REPORTS, the expanded database includes all votes cast in conference. Concomitant with the expansion of the database is a shift in its basic unit of analysis. The original collection contained every case in which at least one justice wrote an opinion, and cases without opinions were excluded. This version includes every case in which the Court cast a conference vote, with and without opinions. The justices cast many more votes than they wrote opinions, and hence, the number of Warren Court records in this version increased by more than a factor of two over the original version. 
As in the original collection, distinct aspects of the Court's decisions are covered by six types of variables: (1) identification variables including case citation, docket number, unit of analysis, and number of records per unit of analysis, (2) background variables offering information on origin of case, source of case, reason for granting cert, parties to the case, direction of the lower court's decision, and manner in which the Court takes jurisdiction, (3) chronological variables covering date of term of court, chief justice, and natural court, (4) substantive variables including multiple legal provisions, authority for decision, issue, issue areas, and direction of decision, (5) outcome variables supplying information on form of decision, disposition of case, winning party, declaration of unconstitutionality, and multiple memorandum decisions, and (6) voting and opinion variables pertaining to the vote in the case and to the direction of the individual justices' votes.
Pending Court Cases in India
kaggle.com
zip
Updated Sep 4, 2023
Cite
Julie Kri (2023). Pending Court Cases in India [Dataset]. https://www.kaggle.com/datasets/juliekri/pending-court-cases-in-india
Explore at:
Available download formats: zip (1319 bytes)
Dataset updated
Sep 4, 2023
Authors
Julie Kri
License
https://creativecommons.org/publicdomain/zero/1.0/
Area covered
India
Description
Pendency of Court Cases in India

India has the largest number of pending court cases in the world. Many judges and government officials have said that the pendency of cases is the biggest challenge before the Indian judiciary. According to a 2018 NITI Aayog strategy paper, at the then-prevailing rate of disposal of cases in our courts, it would take more than 324 years to clear the backlog.

Pendency of court cases in India is the delay in the disposal of cases (lawsuits) to provide justice to the aggrieved person or organization by judicial courts at all levels. The judiciary in India works in a hierarchy at three levels: the Supreme Court, state or high courts, and district courts. Court cases are categorized into two types: civil and criminal. In 2022, the total number of pending cases of all types and at all levels rose to 50 million or 5 crores, including over 169,000 court cases pending for more than 30 years in district and high courts.

Causes of pendency

Low strength of judges and non-judicial staff

Vacant position of the judges

Inadequate funding

Lack of infrastructure

Abuse of legal procedure

State-wise statistics

Courthall shortfall is calculated as lack of courthalls as percentage of the total sanctioned strength of the judges. A negative percentage means courthalls are in excess. Case clearance rate (CCR) is cases disposed in a year as a percentage of new cases filed in the same year. CCR of less than 100 means case pendency will increase, CCR equal to 100 means case pendency will remain same, CCR of more than 100 means case pendency will decrease. NA: Not Available. (Source: India Justice Report, 2022)
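
The two statistics defined above can be written out as a small sketch. The function and parameter names are hypothetical, and the shortfall formula is one reading of "lack of courthalls as percentage of the total sanctioned strength of the judges"; the India Justice Report may compute it differently.

```python
def case_clearance_rate(disposed, filed):
    """CCR: cases disposed in a year as a percentage of new cases filed that year."""
    return 100.0 * disposed / filed

def courthall_shortfall(sanctioned_judges, courthalls):
    """Missing courthalls as a percentage of sanctioned judge strength.

    A negative value means there are more courthalls than sanctioned judges.
    """
    return 100.0 * (sanctioned_judges - courthalls) / sanctioned_judges

def pendency_trend(ccr):
    """Interpret a CCR value as described in the text."""
    if ccr < 100:
        return "pendency increases"
    if ccr > 100:
        return "pendency decreases"
    return "pendency unchanged"
```

For example, a state that disposes of 90 cases for every 100 newly filed has a CCR of 90, so its backlog grows.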
Registration Status, Case Category, and Disposal Period wise Disposed Cases...
dataful.in
Updated May 19, 2025
+ more versions
Cite
Dataful (Factly) (2025). Registration Status, Case Category, and Disposal Period wise Disposed Cases in Supreme Court by Year of Disposal [Dataset]. https://dataful.in/datasets/21379
Explore at:
Available download formats: csv, xlsx, application/x-parquet
Dataset updated
May 19, 2025
Dataset authored and provided by
Dataful (Factly)
License
https://dataful.in/terms-and-conditions
Area covered
All India
Variables measured
Number of disposed cases
Description
This dataset presents detailed statistics on court case disposals in India, categorized by the case type and the registration status. It includes data categorized by age of the case at the time of disposal. It captures the absolute number of cases disposed within various time brackets, ranging from within 1 year to more than 21 years.
Appeal Cases heard at the Supreme Court of Nigeria Dataset
data.mendeley.com
Updated Jun 9, 2023
+ more versions
Cite
Jeremiah Balogun (2023). Appeal Cases heard at the Supreme Court of Nigeria Dataset [Dataset]. http://doi.org/10.17632/ky6zfyf669.1
Explore at:
Unique identifier
https://doi.org/10.17632/ky6zfyf669.1
Dataset updated
Jun 9, 2023
Authors
Jeremiah Balogun
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
Nigeria
Description
The dataset contains information about appeal cases heard at the Supreme Court of Nigeria (SCN) between 1962 and 2022. The dataset was extracted from case files provided by The Prison Law Pavillion, a data archiving firm in Nigeria. It originally consisted of documentation of the various appeal cases alongside the outcome of the judgment of the SCN. Feature extraction techniques were used to generate a structured dataset containing a number of annotated features. The dataset consists of 14 features, including the outcome of the judgment. The other 13 features are the input variables, of which 4 are stored as strings and the remaining 9 as numeric values. Missing values among the numeric features are represented by the value -1. Unsupervised and supervised machine learning algorithms can be applied to the dataset to better understand the relationships among the features and to predict the target class, which is the outcome of the SCN judgment.
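
Because missing numeric values are encoded as -1, they should be masked before computing statistics, or the sentinel will skew the results. A minimal sketch follows; the feature names `feature_a` and `feature_b` and the values are hypothetical, since the description does not list the actual columns.

```python
MISSING = -1  # sentinel used for missing numeric values, per the description

# Hypothetical records; the real feature names are not given in the description.
records = [
    {"feature_a": 3, "feature_b": -1},
    {"feature_a": -1, "feature_b": 7},
    {"feature_a": 5, "feature_b": 2},
]

def mask_missing(record):
    """Replace the -1 sentinel with None so it is not treated as a real value."""
    return {key: (None if value == MISSING else value)
            for key, value in record.items()}

cleaned = [mask_missing(record) for record in records]

def column_mean(rows, name):
    """Mean of a numeric column, ignoring masked (None) entries."""
    values = [row[name] for row in rows if row[name] is not None]
    return sum(values) / len(values) if values else None
```

Without masking, the mean of `feature_a` here would be computed over 3, -1, and 5 instead of just 3 and 5.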
U.S. Appeals and Supreme Court dataset for: Judicial hierarchy and...
data.niaid.nih.gov
search.dataone.org
+1 more
zip
Updated Oct 3, 2023
Cite
Felix Herron; Michael Livermore; Daniel Rockmore; Keith Carlson (2023). U.S. Appeals and Supreme Court dataset for: Judicial hierarchy and discursive influence [Dataset]. http://doi.org/10.5061/dryad.w3r2280wt
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.5061/dryad.w3r2280wt
Dataset updated
Oct 3, 2023
Dataset provided by
Sorbonne Université
University of Virginia
Dartmouth College
Authors
Felix Herron; Michael Livermore; Daniel Rockmore; Keith Carlson
License
https://spdx.org/licenses/CC0-1.0.html
Description
This dataset contains written opinions from the 11 numbered Courts of Appeals and the DC Circuit Court of Appeals (not including the Federal Circuit), as well as the SCOTUS. It also contains metadata pertaining to each opinion, such as author, year, etc. It also contains the processed outputs of the rDIM model (Gerow et al. 2018) pertaining to the experiments performed in our paper. These results contain the assigned influence and topic distribution for each case.

Methods

The data was curated from four main sources:

Harvard Caselaw Access Project (case.law)

Federal Judicial Center (FJC) list of judges

A list of federal appeals court cases selected for review, as well as their corresponding SCOTUS opinions, from Livermore et al., "The Supreme Court and the Judicial Genre"

The Supreme Court Database (SCDB)

The opinions were cleaned using standard text cleaning techniques. The authors were deduced by performing regular expression matches between the noisy Caselaw author field and a list of judges from the FJC.
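
The author-matching step might look roughly like the sketch below. This is not the authors' code: the judge names are placeholders, the noisy strings are invented, and the matching rule (first case-insensitive whole-word surname hit) is an assumption.

```python
import re

# Placeholder surnames standing in for the FJC list of judges.
judges = ["Smith", "Garcia", "O'Neil"]

# One alternation over all surnames; longer names first so they win ties.
pattern = re.compile(
    r"\b(" + "|".join(re.escape(name)
                      for name in sorted(judges, key=len, reverse=True)) + r")\b",
    re.IGNORECASE,
)

def match_author(noisy_field):
    """Return the canonical judge name found in a noisy author field, or None."""
    match = pattern.search(noisy_field)
    if match is None:
        return None
    found = match.group(1).lower()
    return next(name for name in judges if name.lower() == found)
```

Here `match_author("SMITH, Circuit Judge:")` resolves to `"Smith"`, while a field such as `"PER CURIAM"` yields `None` and would need separate handling.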
⚖️Swiss Federal Supreme Court Dataset (SCD)
kaggle.com
zip
Updated Nov 1, 2023
Cite
Mahdavi.1202 (2023). ⚖️Swiss Federal Supreme Court Dataset (SCD) [Dataset]. https://www.kaggle.com/datasets/mahdavi1202/swiss-federal-supreme-court-dataset-scd
Explore at:
Available download formats: zip (16930128 bytes)
Dataset updated
Nov 1, 2023
Authors
Mahdavi.1202
License
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
📌For most researchers and all applications which do not require accessing the judgment texts, we recommend using the standard .csv export.

📌Along with the csv file, a pdf file has been uploaded that provides guidance and more details about the dataset and its columns.

Description

The Swiss Federal Supreme Court Dataset (SCD) provides a record of all 118,443 cases decided by the Swiss Federal Supreme Court between 2007 and September 2023. The SCD includes 31 variables that document basic case information, the court composition, the area of law, information about the appealed judgment, the parties, the case outcome, and about citations and publication status.

The dataset can be used as data infrastructure for both qualitative and quantitative analysis of Federal Supreme Court jurisprudence. It is generated using a fully automated pipeline and will be updated quarterly until at least 2025 to include the latest judgments and possible expansions.

Number of instances: 118443

Number of attributes: 31

A brief description of the columns:

✅docref: Reference to the document

✅url: Web address or link associated with the document

✅date: Date related to the document

✅year: Year related to the document

✅proc_type: Type of judicial process

✅merged_cases: Number of merged cases

✅division: Division or department division

✅division_type: Type of division or department division

✅n_judges: Number of judges

✅language: Document language

✅length: Length of the document

✅area_general: General topic

✅area_intermediate: Intermediate topic

✅area_detailed: Detailed topic

✅topic: Topic

✅issue: Issue

✅source_date: Source date

✅source_canton: Source canton

✅proc_duration: Duration of the judicial process

✅app_class: Applicant class

✅app_represented: Applicant represented type

✅resp_class: Respondent class

✅resp_represented: Respondent represented type

✅outcome: Outcome

✅outcome_binary: Binary outcome

✅cited_bger: Citation to the Swiss Federal Court

✅n_cited_bger: Number of citations to the Swiss Federal Court

✅cited_bge: Citation to the Swiss Civil Court

✅n_cited_bge: Number of citations to the Swiss Civil Court

✅leading_case: Precedent case

✅doi_version: Digital Object Identifier (DOI) version
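
As a quick way to work with the .csv export, the sketch below computes the share of cases per year with a positive binary outcome. The column names `docref`, `year`, and `outcome_binary` come from the list above, but the inline sample rows and the 0/1 encoding of `outcome_binary` are assumptions; check the accompanying pdf for the actual coding.

```python
import csv
import io
from collections import defaultdict

# Made-up sample standing in for the real export; 0/1 coding is assumed.
sample = """docref,year,outcome_binary
A,2007,1
B,2007,0
C,2008,1
"""

totals = defaultdict(int)
positives = defaultdict(int)

for row in csv.DictReader(io.StringIO(sample)):
    year = int(row["year"])
    totals[year] += 1
    positives[year] += int(row["outcome_binary"])

# Fraction of cases per year with outcome_binary == 1.
share_by_year = {year: positives[year] / totals[year] for year in sorted(totals)}
```

To run this on the real file, replace `io.StringIO(sample)` with an opened file handle for the downloaded csv.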
Data from: United States Supreme Court Judicial Database, 1953-1997 Terms
icpsr.umich.edu
catalog.data.gov
ascii, sas, spss +1
Updated Nov 4, 2005
+ more versions
Cite
Spaeth, Harold J. (2005). United States Supreme Court Judicial Database, 1953-1997 Terms [Dataset]. http://doi.org/10.3886/ICPSR09422.v9
Explore at:
Available download formats: ascii, stata, sas, spss
Unique identifier
https://doi.org/10.3886/ICPSR09422.v9
Dataset updated
Nov 4, 2005
Dataset provided by
Inter-university Consortium for Political and Social Research (https://www.icpsr.umich.edu/web/pages/)
Authors
Spaeth, Harold J.
License
https://www.icpsr.umich.edu/web/ICPSR/studies/9422/terms
Time period covered
1953 - 1997
Area covered
United States
Description
This data collection encompasses all aspects of United States Supreme Court decision-making from the beginning of the Warren Court in 1953 to the completion of the most recent term of the Rehnquist Court. In this collection, distinct aspects of the Court's decisions are covered by six types of variables: (1) identification variables including citations and docket numbers, (2) background variables offering information on how the Court took jurisdiction, origin and source of case, and the reason the Court granted cert, (3) chronological variables covering date of decision, Court term, and natural court, (4) substantive variables including legal provisions, issues, and direction of decision, (5) outcome variables supplying information on disposition of case, winning party, formal alteration of precedent, and declaration of unconstitutionality, and (6) voting and opinion variables pertaining to how individual justices voted, their opinions and interagreements, and the direction of their votes.
Indian-Supreme-Court-Judgements-Chunked
huggingface.co
Cite
Vihaan Nama, Indian-Supreme-Court-Judgements-Chunked [Dataset]. https://huggingface.co/datasets/vihaannnn/Indian-Supreme-Court-Judgements-Chunked
Authors
Vihaan Nama
License
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Area covered
India
Description
Indian Supreme Court Judgements Chunked

Executive Summary

The dataset aims to address the chronic backlog in the Indian judiciary system, particularly in the Supreme Court, by creating a dataset optimized for legal language models (LLMs). The dataset will consist of pre-processed, chunked, and embedded textual data derived from the Supreme Court's judgment PDFs.

Problem and Importance - Motivation

Indian courts are overwhelmed with pending cases, with the… See the full description on the dataset page: https://huggingface.co/datasets/vihaannnn/Indian-Supreme-Court-Judgements-Chunked.
Legal Dataset: SC Judgments India (1950–2024)
kaggle.com
zip
Updated Apr 22, 2025
Cite
FanaticAuthorship (2025). Legal Dataset: SC Judgments India (1950–2024) [Dataset]. https://www.kaggle.com/datasets/adarshsingh0903/legal-dataset-sc-judgments-india-19502024
Explore at:
Available download formats: zip (6872192295 bytes)
Dataset updated
Apr 22, 2025
Authors
FanaticAuthorship
License
https://creativecommons.org/publicdomain/zero/1.0/
Area covered
India
Description
🏛️ Supreme Court Judgments (India, 1950–2024)

This dataset is a comprehensive collection of Supreme Court of India judgments from 1950 to early 2025, covering approximately 98% of the documents available on the Indian Kanoon website under the Supreme Court judgments section.

📂 Contents

26000 PDF files of judgments spanning over 75 years.

Rich legal language suitable for Legal NLP, RAG systems, Summarization, Classification, and Legal Information Retrieval.

Each file represents an official judgment document delivered by the Supreme Court of India.

📌 Source & Coverage

Source: Scraped and compiled from Indian Kanoon.

Coverage: ~98% of the Supreme Court judgments available on the Indian Kanoon website as of early 2025.

⚙️ Use Cases

Legal Language Modeling and Pretraining

Retrieval-Augmented Generation (RAG) for Law

Legal Document Summarization

Case Similarity & Legal Analytics

Timeline-based legal precedent analysis

Recommended For

Legal AI researchers

Law and public policy scholars

NLP practitioners working on domain-specific language models

Developers building legal chatbots or legal tech products
Replication Data for: Justice Speaks, But Who’s Listening? Mass Public...
dataverse.harvard.edu
Updated Dec 31, 2020
Cite
Matthew Hitt; Kyle L. Saunders; Kevin M. Scott (2020). Replication Data for: Justice Speaks, But Who’s Listening? Mass Public Awareness of U.S. Supreme Court Cases [Dataset]. http://doi.org/10.7910/DVN/DYBF2Y
Unique identifier
https://doi.org/10.7910/DVN/DYBF2Y
Dataset updated
Dec 31, 2020
Dataset provided by
Harvard Dataverse
Authors
Matthew Hitt; Kyle L. Saunders; Kevin M. Scott
License
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Area covered
United States
Description
Theories of the relationship between the Supreme Court and the public assume that the public can potentially monitor the Court's behavior. We seek to measure the impact of Court decisions on public awareness of its cases. Public awareness of cases varies according to individual differences: more educated, knowledgeable, and informationally-motivated citizens are more likely to report awareness. Further, decision announcements increase awareness more generally, especially in cases of moderate salience. The results suggest that while the public may eventually respond to the behavior of national institutions, this response is likely first filtered through an elite subset of the population.
Replication data for: Impartial Judges? Race, Institutional Context, and...
dataverse.unc.edu
dataverse-staging.rdmc.unc.edu
Updated Nov 19, 2015
Cite
UNC Dataverse (2015). Replication data for: Impartial Judges? Race, Institutional Context, and U.S. State Supreme Courts [Dataset]. http://doi.org/10.15139/S3/12123
Explore at:
Available download formats: tsv (32053436), text/plain; charset=us-ascii (9722)
Unique identifier
https://doi.org/10.15139/S3/12123
Dataset updated
Nov 19, 2015
Dataset provided by
UNC Dataverse
Area covered
United States
Description
We address a fundamental question in judicial politics: other things being equal, do African American judges behave differently than white judges? Many presume that white judges differ from their minority counterparts in terms of sentencing, deliberation, and propensity to overturn decisions. However, to date, there is little empirical evidence on whether there are systematic differences in behavior between these judges. Here, we utilize the newly created judge-level U.S. State Supreme Court Database to assess whether judicial decisionmaking is affected by the race of the judge. Looking at all criminal cases decided by U.S. state supreme court judges from 1995-1998, we find evidence of differences between white and non-white judges, but only in states where there is no intermediate appellate court. This suggests the effects of race on judicial decisionmaking are conditioned by the institutional structure of the court system.
Supreme Court of Pakistan Judgments Dataset
ieee-dataport.org
Updated Jul 15, 2025
Cite
Muhammad Ammar (2025). Supreme Court of Pakistan Judgments Dataset [Dataset]. https://ieee-dataport.org/documents/supreme-court-pakistan-judgments-dataset
Explore at:
Dataset updated
Jul 15, 2025
Authors
Muhammad Ammar
Area covered
Pakistan
Description
Supreme Court of Pakistan Judgments Dataset

This dataset contains almost 1200 judgments made by the Supreme Court of Pakistan up to May 2025. This dataset includes the judgments made by
Usages of 'our' by U.S. Supreme Court Justices, 1791-2011
data-staging.niaid.nih.gov
data.niaid.nih.gov
Updated May 9, 2021
Cite
Nystrom, Eric C.; Tanenhaus, David S. (2021). Usages of 'our' by U.S. Supreme Court Justices, 1791-2011 [Dataset]. https://data-staging.niaid.nih.gov/resources?id=zenodo_4279657
Explore at:
Dataset updated
May 9, 2021
Dataset provided by
University of Nevada, Las Vegas
Arizona State University
Authors
Nystrom, Eric C.; Tanenhaus, David S.
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
United States
Description
This is data to accompany an article by Eric C. Nystrom and David S. Tanenhaus, "'Our Most Sacred Legal Commitments:' A Digital Exploration of the U.S. Supreme Court Defining Who We Are and How They Should Opine," University of Cincinnati Law Review 89, no. 4 (May 2021).

Data Creation

This data was generated using the "cap-tools" suite of programs (written by Eric C. Nystrom and available at https://github.com/ericnystrom/captools). The current data version (05202020, 07152020) included with this repository was generated by running "cap-tools" against:

Caselaw Access Project (CAP), United States jurisdiction, rev. 20200303

CAP New York jurisdiction rev. 20200302

Harold J. Spaeth, Lee Epstein, Andrew D. Martin, Jeffrey A. Segal, Theodore J. Ruger, and Sara C. Benesh. 2019 Supreme Court Database, Version 2019 Release 01. URL: http://Supremecourtdatabase.org

Harold J. Spaeth, Lee Epstein, Andrew D. Martin, Jeffrey A. Segal, Theodore J. Ruger, and Sara C. Benesh. 2019 Supreme Court Database, Version Legacy Release 05. URL: http://Supremecourtdatabase.org

Eric C. Nystrom and David S. Tanenhaus, (2020). Connecting U.S. Supreme Court Case Information and Opinion Authorship (SCDB) to Full Case Text Data (CAP), 1791-2011 (Version 1.0) [Data set]. Zenodo. http://doi.org/10.5281/zenodo.4344917

Data Files

"CURRENT-our-kwic-cap-scdb" is a TSV of keyword-in-context (KWIC) results for the term "our," with a six word window on each side of the term. Basic results were then filtered to exclude any results that were not found in the SCDB-CAP map, and the SCDB ID was added. The file contains 79693 records plus a first-line header.

Fields:

1: "cap-id" -- ID of the case in CAP system.

2: "casename" -- short form of the case name

3: "cite" -- reporter citation (typically US Reports)

4: "date" -- year the case was decided

5: "courtname" -- Court name, in CAP. This should be "U.S." for US Supreme Court, but some records were misfiled within the CAP data and have something else here. These were manually checked for actually being US Supreme Court records, however.

6: "courtslug" -- CAP "slug" representing this court. Typically "us" but there are a handful of variations.

7: "numopins" -- Number of opinions in CAP in this case, with counting beginning at 1. CAP's detection routines get this right a lot, but there are definitely exceptions where the actual opinion count in the case, as measured by a human observer, would be different.

8: "opintype" -- The type of opinion, as determined by CAP. Generally right, with some allowance for errors, as mentioned in the other fields.

9: "opinnum" -- The number of the particular opinion in this case from which this match was drawn. 1-based counting.

10: "casematch" -- The sequential number of this match for the case as a whole, numbered from 1.

11: "opinmatch" -- The sequential number of this match for this opinion only, numbered from 1.

12: "before" -- the string of words prior to the matching word; in this data, six words. (lowercase)

13: "term" -- the term itself, here, it is always "our"

14: "after" -- the string of six words following the term (lowercase)

15: "scdb-id" -- the SCDB identification number of this case, matched using the CAP-SCDB match described above.
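
To make the "before", "term", and "after" fields concrete, here is a minimal keyword-in-context sketch with a configurable word window. It is not the cap-tools implementation: the function name `kwic` and the tokenization rule (lowercased runs of letters and apostrophes) are assumptions.

```python
import re

def kwic(text, term="our", window=6):
    """Return (before, term, after) tuples for each occurrence of `term`.

    `before` and `after` hold up to `window` lowercase words on each side.
    """
    words = re.findall(r"[a-z']+", text.lower())
    hits = []
    for index, word in enumerate(words):
        if word == term:
            before = " ".join(words[max(0, index - window):index])
            after = " ".join(words[index + 1:index + 1 + window])
            hits.append((before, term, after))
    return hits
```

Each returned tuple corresponds to one record's fields 12 to 14; near the start or end of a text the window is simply shorter.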

"CURRENT-our-pos-cap-scdb" -- a TSV file very similar to the KWIC results file described above, with the same header and field structure, and the same results from a case perspective. The difference is that the text in fields 12, 13, and 14 was tagged with parts of speech (POS) using the Perl Lingua::EN::Tagger library, v0.28, by Aaron Coburn. The window was lengthened to seven words on each side of "our" and then tags were applied, but since the tagger also tags punctuation separately in many cases, sometimes more than seven term/TAG "words" exist in fields 12 and 14. A complete list of the tags supported by the tagger and their grammatical meanings can be found at: https://metacpan.org/source/ACOBURN/Lingua-EN-Tagger-0.30/README

"RESULTS-our-kwic-followers-opinauth-chief_071520.tsv" further extends the results contained in the files above, by isolating the noun phrase following "our" using the grammatical tags above. These noun phrases were individually categorized by our legal historian as constitutive of "culture" or "process" (or falling into an ambiguous category). (See Tanenhaus and Nystrom, listed above.) The data was further augmented by applying the opinion author's name and SCDB author ID number from the corrected opinion authorship information, available separately as Nystrom and Tanenhaus (cite above). The Chief Justice information was also added, from SCDB.

"our-casecount-by-year_normalized.tsv" -- a TSV file containing 4 columns and no header. Column 1 is the year, column 2 is the number of individual cases (not opinions) decided in that year that contained the word "our," column 3 is the total number of cases decided in that year, and column 4 is the percentage of column 3 represented by column 2 (i.e. percent of cases in a year containing "our"). Note that number of cases per year is determined from SCDB, so any minor actions such as denial of cert not included in SCDB would not be included here either.
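
The relationship between the four columns can be checked with a short sketch that recomputes column 4 from columns 2 and 3. The two sample rows are invented for illustration and are not values from the file.

```python
import csv
import io

# Invented header-less rows: year, cases containing "our", total cases.
sample = "1990\t50\t100\n1991\t30\t120\n"

rows = []
for year, with_our, total in csv.reader(io.StringIO(sample), delimiter="\t"):
    with_our, total = int(with_our), int(total)
    # Column 4: percentage of that year's cases that contain "our".
    percent = 100.0 * with_our / total
    rows.append((int(year), with_our, total, percent))
```

The real file already includes the fourth column, so its rows would unpack into four fields rather than three.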
Mortgages cases disposed orders made - Dataset - data.gov.uk
ckan.publishing.service.gov.uk
Updated Jun 2, 2025
+ more versions
Cite
ckan.publishing.service.gov.uk (2025). Mortgages cases disposed orders made - Dataset - data.gov.uk [Dataset]. https://ckan.publishing.service.gov.uk/dataset/mortdisordlgd
Dataset updated
Jun 2, 2025
Dataset provided by
CKAN (https://ckan.org/)
License
Open Government Licence 3.0 (http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/)
License information was derived automatically
Description
The number of final orders made against mortgage cases disposed in the High Court. Datasets are produced on an annual basis. The data are entered onto ICOS, the Integrated Courts Operations System, then extracted and merged with the Central Postcode Directory, and the aggregated information is uploaded to this portal.

Northern Ireland Courts and Tribunals Service collects information on writs and originating summonses issued in respect of mortgages in the Chancery Division of the Northern Ireland High Court. This covers both Northern Ireland Housing Executive and private mortgages, and relates to both domestic and commercial properties. A mortgage case may involve more than one address or a land property. In such cases, the first postcode address entered onto ICOS is used.

Not all writs and originating summonses lead to eviction. A plaintiff begins an action for an order for possession of property. The court, following a judicial hearing, may grant an order for possession. This entitles the plaintiff to apply for an order to have the defendant evicted. However, even where an order for eviction is issued, the parties can still negotiate a compromise to prevent eviction.

When a case is disposed of, it may have more than one final order made. This database contains the last final order made. A description of the orders is below:

Possession: The court orders the defendant to deliver possession of the property to the plaintiff within a specified time. If the defendant fails to comply with the court order, the plaintiff may proceed to apply to the Enforcement of Judgements Office to repossess the property and give possession of it to the plaintiff.

Sale and Possession: If the plaintiff seeks possession of property which is subject to an ‘equitable mortgage’ (i.e. normally one created informally by the deposit of deeds rather than the execution of a mortgage deed), the court may order a sale of the property to enable enforcement of the equitable mortgage and that the defendant give up possession for that purpose. The sale price is subject to approval by the court.

Suspended Possession: The court may postpone the date for delivery of possession if it is satisfied that the defendant is likely to be able, within a reasonable period, to pay any sums due under the mortgage, or to remedy any other breach of the obligations under the mortgage. A suspended possession order cannot be enforced by the plaintiff without the permission of the court, which will only be granted after a further hearing.

Other: Other orders include strike out, dismiss action, and other less common orders.

Strike out: This occurs when the moving party does not wish to proceed any further, or when the court rules that there is no reasonable ground for bringing or defending the mortgage action.

Dismiss action: The mortgage action is dismissed by the courts.

Other orders: These include (a) Declaration of possession coupled with an order for sale in lieu of partition and (b) Stay of Eviction: after a Possession Order is granted but prior to actual repossession, the Defendant may apply to Court to seek a stay of eviction which, if granted, prevents repossession for a certain defined period.

Because some cells hold low values, individuals might be able to identify themselves in this data. Primary and secondary disclosure control methods have therefore been applied, denoted by cells with missing data in the tables. Values of less than four, but not zero, were initially suppressed. Some of these values could still have been calculated from row and column totals, so secondary suppression was applied to the next lowest value in the row and column.

The data contain the number of final orders made against cases disposed by each Local Government District and have the following proportions of postcode coverage: 2012, 97.7%; 2013, 96.5%; 2014, 96.0%; 2015, 94.8%; 2016, 95.5%; 2017, 95.1%; 2018, 94.8%; 2019, 93.8%; 2020, 95.6%; 2021, 93.6%; 2022, 95.3%; 2023, 97.5%; 2024, 95.7%.
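The primary and secondary suppression described in this entry can be illustrated with a small sketch. It is not the publisher's procedure: it assumes row and column totals are published alongside the table (so a single hidden cell in a row or column could be recovered by subtraction), it makes one pass over rows and then columns, and it treats every visible cell, including zeros, as eligible for secondary suppression. The function name and the example table are invented.

```python
def suppress(table, threshold=4):
    """Return a copy of `table` (rows of non-negative ints) with
    suppressed cells replaced by None."""
    out = [list(row) for row in table]

    # Primary suppression: hide values below the threshold, but not zeros.
    for row in out:
        for j, value in enumerate(row):
            if 0 < value < threshold:
                row[j] = None

    def protect(cells):
        """If exactly one cell in this row/column is hidden, also hide
        the lowest remaining visible value."""
        hidden = [(r, c) for r, c in cells if out[r][c] is None]
        if len(hidden) == 1:
            visible = [(out[r][c], r, c) for r, c in cells if out[r][c] is not None]
            if visible:
                _, r, c = min(visible)
                out[r][c] = None

    n_rows, n_cols = len(out), len(out[0])
    # Secondary suppression: rows first, then columns (single pass).
    for r in range(n_rows):
        protect([(r, c) for c in range(n_cols)])
    for c in range(n_cols):
        protect([(r, c) for r in range(n_rows)])
    return out

# Invented counts, one row per district and one column per order type.
counts = [
    [2, 10, 7],
    [5, 8, 6],
    [9, 4, 12],
]
masked = suppress(counts)
for row in masked:
    print(row)
```

A single pass is not guaranteed to leave every row and column safe, because suppressing a cell in the column pass can leave its row with only one hidden value; a production routine would repeat the passes until nothing changes.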
Data for: 3659611
data.mendeley.com
Updated Jul 23, 2020
+ more versions
Cite
Lorianne Updike Toler (2020). Data for: 3659611 [Dataset]. http://doi.org/10.17632/3vf7m4gtzb.1
Unique identifier
https://doi.org/10.17632/3vf7m4gtzb.1
Dataset updated
Jul 23, 2020
Authors
Lorianne Updike Toler
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
This dataset includes every reference to the 1787 United States Constitutional Convention in Supreme Court opinions through the 2019 term. Variables include the citing justice, case name, year, and the portion of the opinion quoting the Convention, among many others.
State Supreme Court Cases
dataverse.harvard.edu
search.dataone.org
Updated Feb 7, 2022
Cite
Francesca Amato (2022). State Supreme Court Cases [Dataset]. http://doi.org/10.7910/DVN/OYW9GT
Unique identifier
https://doi.org/10.7910/DVN/OYW9GT
Dataset updated
Feb 7, 2022
Dataset provided by
Harvard Dataverse
Authors
Francesca Amato
License
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
License information was derived automatically
Description
Contains the Supreme Court cases for the following states from the years 2000-2020: California, Hawaii, Idaho, Maryland, Massachusetts, Oklahoma, Utah, Vermont, West Virginia, and Wyoming. The data is contained in an RData file; all of the PDF opinions have been converted to character vectors.
Replication Data for: A High Court Plays the Accordion: Validating Ex Ante...
dataverse.azure.uit.no
dataverse.no
+1 more
tsv, txt
Updated Sep 28, 2023
Cite
Henrik L. Bentsen; Gunnar Grendstad; William R. Shaffer; Eric N. Waltenburg (2023). Replication Data for: A High Court Plays the Accordion: Validating Ex Ante Case Complexity on Oral Arguments [Dataset]. http://doi.org/10.18710/DWIX6Y
Explore at:
Available download formats: tsv (235966), txt (213402), txt (6671)
Unique identifier
https://doi.org/10.18710/DWIX6Y
Dataset updated
Sep 28, 2023
Dataset provided by
DataverseNO
Authors
Henrik L. Bentsen; Gunnar Grendstad; William R. Shaffer; Eric N. Waltenburg
License
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
License information was derived automatically
Description
The data set (saved in Stata .dta and .txt formats) contains all observations (Norwegian Supreme Court cases 2008-2018 decided in five-justice panels) and variables (independent variables measuring complexity of cases and the dependent variable measuring time in hours scheduled for oral arguments) relevant for a complete replication of the study.

ABSTRACT OF STUDY: While high courts with fixed time for oral arguments deprive researchers of the opportunity to extract temporal variance, courts that apply the “accordion model” institutional design and adjust the time for oral arguments according to the perceived complexity of a case are a boon for research that seeks to validate case complexity well ahead of the courts’ opinion writing. We analyse an original data set of all 1,402 merits decisions of the Norwegian Supreme Court from 2008 to 2018 where the justices set time for oral arguments to accommodate the anticipated difficulty of the case. Our validation model empirically tests whether and how attributes of a case associated with ex ante complexity are linked with time allocated for oral arguments. Cases that deal with international law and civil law, have several legal players, or are cross-appeals from lower courts are indicative of greater case complexity. We argue that these results speak powerfully to the use of case attributes and/or the time reserved for oral arguments as ex ante measures of case complexity. To enhance the external validity of our findings, future studies should examine whether these results are confirmed in high courts with similar institutional design for oral arguments. Subsequent analyses should also test the degree to which complex cases and/or time for oral arguments have predictive validity on more divergent opinions among the justices and on the time courts and justices need to render a final opinion.

Supreme Court Judgment Prediction

Predict the judgment of a court using NLP

3 scholarly articles cite this dataset (View in Google Scholar)

Available download formats: zip (1397072 bytes)

Dataset updated

Dec 16, 2021

Authors

Deep Contractor

License

https://creativecommons.org/publicdomain/zero/1.0/

Description

Context

Artificial intelligence is being utilized in many domains as of late, and the legal system is no exception. However, as it stands now, the number of well-annotated datasets pertaining to legal documents from the Supreme Court of the United States (SCOTUS) is very limited for public use. Even though Supreme Court rulings are in the public domain, trying to do meaningful work with them becomes a much greater task due to the need to manually gather and process that data from scratch each time. Hence, our goal is to create a high-quality dataset of SCOTUS court cases so that they may be readily used in natural language processing (NLP) research and other data-driven applications. Additionally, recent advances in NLP provide us with the tools to build predictive models that can be used to reveal patterns that influence court decisions. By using advanced NLP algorithms to analyze previous court cases, the trained models are able to predict and classify a court's judgment given the case's facts from the plaintiff and the defendant in textual format; in other words, the model is emulating a human jury by generating a final verdict.

Content

The dataset contains 3304 cases from the Supreme Court of the United States from 1955 to 2021. Each case has the case's identifiers as well as the facts of the case and the decision outcome. Other related datasets rarely included the facts of the case which could prove to be helpful in natural language processing applications. One potential use case of this dataset is determining the outcome of a case using its facts.

Target Variable: First Party Winner. If true, the first party won; if false, the second party won. Use NLP techniques to build features out of the facts column.
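One very simple way to turn the facts text into features is a bag-of-words count, sketched below with the standard library only. The column names `facts` and `first_party_winner`, the string encoding of the target, and the two sample cases are assumptions for illustration; they should be checked against the downloaded file, and a real model would likely use a richer representation.

```python
import re
from collections import Counter

def to_label(value):
    """Map the target to 1 (first party won) or 0 (second party won)."""
    return 1 if str(value).strip().lower() == "true" else 0

def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())

def bag_of_words(facts, vocabulary):
    """Count how often each vocabulary word occurs in one case's facts."""
    counts = Counter(tokenize(facts))
    return [counts[word] for word in vocabulary]

# Invented cases standing in for rows of the dataset.
cases = [
    {"facts": "The petitioner challenged the statute as unconstitutional.",
     "first_party_winner": "True"},
    {"facts": "The respondent argued the statute was valid.",
     "first_party_winner": "False"},
]

vocabulary = sorted({word for case in cases for word in tokenize(case["facts"])})
X = [bag_of_words(case["facts"], vocabulary) for case in cases]
y = [to_label(case["first_party_winner"]) for case in cases]

print(vocabulary)
print(X)
print(y)
```

The resulting `X` and `y` can be passed to any classifier; the vocabulary should be built from the training split only, so that test cases do not leak into the features.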

research team's jupyter notebook: click here

Acknowledgements

Mohammad Alali, Shaayan Syed, Mohammed Alsayed, Smit Patel, Hemanth Bodala
