29 datasets found
  1. Connecting U.S. Supreme Court Case Information and Opinion Authorship (SCDB)...

    • zenodo.org
    • data.niaid.nih.gov
    tsv
    Updated Dec 19, 2020
    Cite
    Eric C. Nystrom; David S. Tanenhaus (2020). Connecting U.S. Supreme Court Case Information and Opinion Authorship (SCDB) to Full Case Text Data (CAP), 1791-2011 [Dataset]. http://doi.org/10.5281/zenodo.4344917
    Available download formats
    tsv
    Dataset updated
    Dec 19, 2020
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Eric C. Nystrom; David S. Tanenhaus
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Area covered
    United States
    Description

    This dataset was constructed to connect the rich metadata created by the Supreme Court Database (SCDB) to the Caselaw Access Project (CAP) full-text court opinion data. Since the SCDB includes only substantive opinions, it is necessarily a subset of the full range of opinions available through CAP.

    There are two parts to this data: the map connecting each SCDB ID to its corresponding CAP case number, and a more advanced (but error-prone) version in which the authorship of each opinion text identified for the case in CAP is attributed to the Justice who wrote it. Each of these data products has been hand-corrected to the best of this author's ability.

    SCDB-CAP map

    The SCDB->CAP map began as a relatively straightforward automated matching process, based on the US Reports citation for each case as expressed in both SCDB and CAP. Slightly over 80% of SCDB entries found a single CAP data match this way. From there, the data was entirely hand-corrected, with non-matches or duplicate matches individually investigated and manually corrected.

    Some SCDB entries simply could not be matched to an appropriate CAP text. Initially, the entirety of US Reports volume 44 was missing, but with the help of CAP staff, the volume was located as having been filed in the New York jurisdiction rather than the United States jurisdiction. The case numbers were then added to the map, but until the volume is relocated to the United States jurisdiction, it may be necessary to also incorporate the New York jurisdiction in full text analysis so that the cases from volume 44 can be searched. 108 more missing cases are from US Reports volume 131, which was a "catch up" volume published in the 19th century. These catch-up cases, many heard by the Supreme Court decades prior, were numbered with lowercase roman numerals instead of the ordinary numbers, which is almost certainly why CAP's software dismissed the catch-up section as prefatory material. Many of the rest of the errors seem largely to be examples where the SCDB project recognized a separate court action that CAP did not. Perhaps most of these seem to have been later rehearings for a case previously decided, which in the 19th century particularly were commonly reported out at the end of the first decision text. While SCDB sometimes gave these subsequent but related actions a separate SCDB entry, CAP seems to have largely incorporated them as part of the text of the main case. Additionally, there were a few that simply could not be found, despite a careful look through each database as well as the original US Reports and sometimes adjacent volumes. Finally, the cases were only matched up through the 2011 court term. After the 2011 term, the mismatches between CAP and SCDB were extensive and frequently seemed impossible to resolve.

    Even so, with the manual correction, the overall error rate is low. Of 28,304 cases, only 191 do not have a match, and of those, 108 are contained within the vol. 131 "catch up" volume. Since most of the rest are extremely short subsequent actions that were separately noted by SCDB, the effect of these non-matched cases would seem to be small in most cases.
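
    The figures quoted above can be turned into rates with a few lines of arithmetic. This is only a worked check using the three numbers stated in the description; the variable names are illustrative.

```python
# Figures quoted in the description above.
total_cases = 28_304
unmatched = 191
unmatched_vol_131 = 108

# Unmatched cases that fall outside the vol. 131 "catch up" section.
unmatched_other = unmatched - unmatched_vol_131

unmatched_rate = unmatched / total_cases
other_rate = unmatched_other / total_cases

print(f"unmatched overall: {unmatched} ({unmatched_rate:.2%})")
print(f"unmatched outside vol. 131: {unmatched_other} ({other_rate:.2%})")
```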

    The typical use case would be that the researcher would generate some kind of results based on searching in the CAP full text, then could use the CAP ID to look up the SCDB ID in the map. With the SCDB ID, of course, the rich metadata from the SCDB can then be connected to each result as needed.
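
    The lookup described above might be sketched as follows. This is a minimal illustration, not the authors' code: the function name, the placeholder IDs and the assumption that an unmatched SCDB entry carries an empty CAP ID are all hypothetical, and only the column order (SCDB ID first, CAP ID second) comes from the description.

```python
from typing import Dict, Iterable, List, Sequence


def build_cap_to_scdb(rows: Iterable[Sequence[str]]) -> Dict[str, List[str]]:
    """Index map rows by CAP ID; each value lists the SCDB IDs mapped to it."""
    index: Dict[str, List[str]] = {}
    for row in rows:
        scdb_id, cap_id = row[0], row[1]
        if cap_id:  # assumption: an unmatched SCDB entry has an empty CAP ID
            index.setdefault(cap_id, []).append(scdb_id)
    return index


# Placeholder rows in the documented column order; every value is invented.
map_rows = [
    ("SCDB-A", "CAP-1", "0 U.S. 0", "1900-01-01", "Example v. Example"),
    ("SCDB-B", "", "0 U.S. 0", "1900-01-01", "Unmatched v. Example"),
]
cap_to_scdb = build_cap_to_scdb(map_rows)

# CAP IDs returned by some full-text search (also invented).
search_hits = ["CAP-1", "CAP-999"]
linked = {cap_id: cap_to_scdb.get(cap_id, []) for cap_id in search_hits}
print(linked)
```

    Keeping a list per CAP ID avoids silently dropping a record if more than one SCDB entry ever points at the same CAP case.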

    Opinion authorship

    Being able to use the rich metadata of SCDB in conjunction with a case's full text is exciting, but it immediately prompts a further question -- what if the texts could be attributed directly to the Justices who authored them? SCDB produces its data in two forms; one is "case centered," where each record represents one case, and the other is "justice centered," in which each record is the vote of one Justice in one case. CAP, in turn, breaks the total text of the case into distinct opinions, and tries to attribute those opinions to their authors by scraping a string of text from the raw input. Therefore, the challenge was to connect these two sources at the opinion level.

    Connecting the opinions, like connecting the cases, involved an initial match by machines, followed by manual correction and revision. In this case, the scope of the manual effort was much larger than that posed by the case-level connection, and more errors were noted in both SCDB and CAP.

    The matching process involved a number of steps. First a list of opinions was generated from the CAP data, then matched to SCDB using the SCDB-CAP connector data described above. (Thus, a case without a CAP match in the SCDB-CAP data will not appear in the opinion author data either.) CAP opinions were numbered in the order they were encountered in each CAP case JSON object, and these numbers are used to distinguish the opinions.

    Next, a round of automatic matching was performed. If there was only one opinion, and only one author listed in the SCDB data, then the majority opinion author (as listed in SCDB) was safely assumed to be the author. If there was no author listed in SCDB, "percuriam" was recorded as the author in this data. If there were exactly two opinions and two authors, the process was also straightforward, as the SCDB-identified majority opinion author was assigned to opinion 1, and the remaining author assigned opinion 2.
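
    The automatic rules in this step could be expressed roughly as below. This is a sketch of one reading of the description, not the original procedure: the function and parameter names are hypothetical, and applying the "percuriam" rule only to single-opinion cases is an assumption.

```python
from typing import List, Optional


def auto_assign(
    n_opinions: int,
    majority_author: Optional[str],
    other_authors: List[str],
) -> Optional[List[str]]:
    """Return one author per opinion, in opinion order, or None when the
    case falls outside the simple rules and needs the later steps."""
    n_authors = (1 if majority_author else 0) + len(other_authors)

    # No author listed in SCDB: record "percuriam" (assumed single-opinion case).
    if n_opinions == 1 and n_authors == 0:
        return ["percuriam"]
    # One opinion, one listed author: the majority opinion author wrote it.
    if n_opinions == 1 and n_authors == 1 and majority_author:
        return [majority_author]
    # Two opinions, two authors: majority author gets opinion 1, the other opinion 2.
    if n_opinions == 2 and n_authors == 2 and majority_author:
        return [majority_author, other_authors[0]]
    return None
```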

    Subsequently, cases with more than two opinions were processed. A potential match (i.e. a "guess") for each opinion in a given case was created by listing each Justice identified by SCDB as having written an opinion in the case. These guesses were then parsed using a semi-automatic procedure with Levenshtein distance fuzzy name matching. With sufficiently conservative parameters, a successful fuzzy match meant that the non-successful guesses for that opinion could be deleted. These sorted guesses were then reviewed manually. Particular care was also taken for any opinion that contained authored opinions by Justices who had similar names (for example, Clark and Black differ by only a single letter). These sorts of cases, as well as instances of co-authorship, were identified and fixed manually.
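
    A conservative fuzzy match of this kind can be sketched with a plain Levenshtein distance. The threshold, the function names and the assumption that a surname has already been extracted from the CAP author string are illustrative choices, not details taken from the dataset.

```python
from typing import List, Optional


def levenshtein(a: str, b: str) -> int:
    """Edit distance (insertions, deletions, substitutions) between a and b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def best_guess(cap_author: str, candidates: List[str], max_distance: int = 1) -> Optional[str]:
    """Return the only candidate within max_distance, else None.

    Returning None when zero or several candidates qualify leaves the
    opinion for manual review instead of guessing.
    """
    target = cap_author.strip().lower()
    hits = [c for c in candidates if levenshtein(target, c.lower()) <= max_distance]
    return hits[0] if len(hits) == 1 else None


# An OCR-damaged surname that is close to exactly one candidate.
print(best_guess("Clerk", ["Clark", "Black"]))
# A string equally close to both candidates is left unresolved.
print(best_guess("Blark", ["Clark", "Black"]))
```

    Refusing to choose between two equally close names is what keeps similar surnames from being silently confused.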

    Those opinions whose authorship could not be matched then were fixed by hand. These included some where the CAP author strings were more complicated than SCDB's strict interpretation; others where the OCR in CAP which contained the Justice name was especially bad; and a number of others where "Mr. Chief Justice" couldn't be directly matched with an author name by the machine. After this light manual correction, almost 500 opinions with substantial errors remained to be individually investigated in depth, by examining the CAP record, the SCDB record, and images of the US Reports for that case. For these last tough customers, errors in the source data were commonly the cause of matching problems. Typically these were of three kinds: examples where CAP should have split the text but didn't (e.g. 2 opinions together in one opinion entry in CAP); examples where SCDB either did not identify or mis-identified an author (such as attributing it to Swayne when it was written by Miller); and examples of non-valid opinions (such as where CAP mistakenly split the opinion too early, leaving an opinion fragment).

    For these errors, a system of codes was created in the author field to signal the error type so that researchers can be suitably cautious. The error code is always at the beginning of the field and is followed by a comma and the names of each author, separated by a comma with no space to facilitate parsing. Note also that co-authors are listed as comma-separated names in this same field with no error code. Researchers will probably want to disaggregate this field to create duplicate records with each individual author for most purposes. The justice number field also contains information about all justices authoring the opinion but the error codes have been omitted here.

    • !C -- error: multiple opinion texts combined (i.e. CAP splitting error)
    • !X -- error: unattributed or misattributed opinion (not listed in SCDB as writer)
    • !D -- error: extra opinion that should be deleted, i.e. not a valid opinion
    • !W -- error: listed as Writers by SCDB, but should be co-authors
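
    Splitting the author field into its error code and names might look like the sketch below. It assumes only what is stated above (an optional leading code, then names separated by commas with no space); the function names and the sample values are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# The four codes listed above, with shortened descriptions.
ERROR_CODES: Dict[str, str] = {
    "!C": "multiple opinion texts combined",
    "!X": "unattributed or misattributed opinion",
    "!D": "extra opinion that should be deleted",
    "!W": "listed as writers by SCDB, but should be co-authors",
}


def parse_author_field(field: str) -> Tuple[Optional[str], List[str]]:
    """Split an author field into (error_code_or_None, [author names])."""
    parts = [p for p in field.split(",") if p]
    if parts and parts[0] in ERROR_CODES:
        return parts[0], parts[1:]
    return None, parts


def disaggregate(field: str) -> List[Tuple[Optional[str], str]]:
    """One (error_code, author) pair per author, for per-author analysis."""
    code, authors = parse_author_field(field)
    return [(code, author) for author in authors]


# Invented sample values; the real spelling of names in the file may differ.
print(parse_author_field("!C,miller,swayne"))
print(disaggregate("miller,swayne"))
```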

    Data file structure

    "scdb_cap-051820.tsv" is a Tab-separated data file containing 5 columns: SCDB ID, CAP ID, US Reports citation, case date, and case name (the latter three from the SCDB data).

    "scdb-cap-opinion-authorship_051920.tsv" is a Tab-separated data file containing seven columns: SCDB ID, CAP ID, US Reports citation, case name, opinion number in the case, opinion author, and SCDB justice ID. See above for caveats about disaggregating and error codes in fields six and seven.

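    The two files could be read with Python's standard csv module as sketched below. The column labels are names chosen here to mirror the column order given above, and whether the files carry a header row is not stated, so the has_header flag is an assumption the reader should check against the actual file.

```python
import csv
import io
from typing import Dict, Iterator, List, TextIO

# Labels chosen for this sketch, following the documented column order.
MAP_COLUMNS = ["scdb_id", "cap_id", "us_cite", "case_date", "case_name"]
AUTHORSHIP_COLUMNS = [
    "scdb_id", "cap_id", "us_cite", "case_name",
    "opinion_number", "author", "justice_id",
]


def read_tsv(handle: TextIO, columns: List[str], has_header: bool = False) -> Iterator[Dict[str, str]]:
    """Yield one dict per row, keyed by the supplied column labels."""
    # QUOTE_NONE keeps quotation marks inside case names as literal text.
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    if has_header:
        next(reader, None)
    for row in reader:
        yield dict(zip(columns, row))


# An in-memory stand-in for the authorship file; every value is invented.
sample = io.StringIO("S1\tC1\t0 U.S. 0\tExample v. Example\t1\tpercuriam\t0\n")
rows = list(read_tsv(sample, AUTHORSHIP_COLUMNS))
print(rows[0]["author"])
```
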
    Errors

    It is likely that errors remain in this data, and it is also hoped that some of the errors beyond the author's immediate control might be fixed in the upstream data so that they can be corrected here. Authors would be grateful for error reports, and also reports of errors fixed, if any.

  2. Additional file 1 of SCDb: an integrated database of stomach cancer

    • figshare.com
    • springernature.figshare.com
    xlsx
    Updated Feb 19, 2024
    Cite
    Erli Gu; Wei Song; Ajing Liu; Hong Wang (2024). Additional file 1 of SCDb: an integrated database of stomach cancer [Dataset]. http://doi.org/10.6084/m9.figshare.12413894.v1
    Available download formats
    xlsx
    Dataset updated
    Feb 19, 2024
    Dataset provided by
    figshare
    Authors
    Erli Gu; Wei Song; Ajing Liu; Hong Wang
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Additional file 1.

  3. U.S. Appeals and Supreme Court dataset for: Judicial hierarchy and...

    • data.niaid.nih.gov
    • search.dataone.org
    zip
    Updated Oct 3, 2023
    Cite
    Felix Herron; Michael Livermore; Daniel Rockmore; Keith Carlson (2023). U.S. Appeals and Supreme Court dataset for: Judicial hierarchy and discursive influence [Dataset]. http://doi.org/10.5061/dryad.w3r2280wt
    Available download formats
    zip
    Dataset updated
    Oct 3, 2023
    Dataset provided by
    University of Virginia
    Sorbonne Université
    Dartmouth College
    Authors
    Felix Herron; Michael Livermore; Daniel Rockmore; Keith Carlson
    License

    https://spdx.org/licenses/CC0-1.0.html

    Description

    This dataset contains written opinions from the 11 numbered Courts of Appeals and the DC Circuit Court of Appeals (not including the Federal Circuit), as well as SCOTUS. It also contains metadata pertaining to each opinion, such as author and year, and the processed outputs of the rDIM model (Gerow et al. 2018) from the experiments performed in our paper. These results contain the assigned influence and topic distribution for each case.

    Methods

    The data was curated from four main sources:

    • Harvard Caselaw Access Project (case.law)
    • Federal Judicial Center (FJC) list of judges
    • A list of federal appeals court cases selected for review, as well as their corresponding SCOTUS opinions, from Livermore et al., "The Supreme Court and the Judicial Genre"
    • The Supreme Court Database (SCDB)

    The opinions were cleaned using standard text cleaning techniques. The authors were deduced by performing regular expression matches between the noisy Caselaw author field and a list of judges from the FJC.
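
    A regular-expression match of this kind might be sketched as follows. This is not the authors' code: the function name, the whole-word and case-insensitive matching, and the rule of rejecting ambiguous hits are assumptions, and the sample strings are illustrative rather than taken from the dataset.

```python
import re
from typing import List, Optional


def match_judge(noisy_author: str, judge_surnames: List[str]) -> Optional[str]:
    """Return the single surname found as a whole word in the noisy field,
    or None when no surname or more than one surname matches."""
    hits = [
        name
        for name in judge_surnames
        if re.search(r"\b" + re.escape(name) + r"\b", noisy_author, flags=re.IGNORECASE)
    ]
    return hits[0] if len(hits) == 1 else None


judges = ["Posner", "Easterbrook"]
print(match_judge("POSNER, Circuit Judge.", judges))
print(match_judge("Before POSNER and EASTERBROOK, Circuit Judges.", judges))
```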

  4. Classification of majority opinions and headnotes written by U.S. Supreme...

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jan 21, 2020
    Cite
    Nystrom, Eric C. (2020). Classification of majority opinions and headnotes written by U.S. Supreme Court Justice Antonin Scalia [Dataset]. https://data.niaid.nih.gov/resources?id=ZENODO_3333947
    Dataset updated
    Jan 21, 2020
    Dataset provided by
    Berger, Linda L.
    Nystrom, Eric C.
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Area covered
    United States
    Description

    This is data to accompany an article by Linda L. Berger and Eric C. Nystrom, "A Rhetorical-Computational Analysis of Justice Antonin Scalia's 'Remarkable Influence': The Unexpected Importance of Deceptively Unanimous and Contested Majority Opinions," Journal of Appellate Practice and Process 20, no. 2 (2020).

    In "scalia-HN-with-ruletype.tsv," Berger classified each headnote from a Scalia-authored majority opinion as one of the following rhetorical types: argument, scalia rule, or preexisting rule. (See article for further explanation of these categories.) Organized by SCDB ID and headnote number.

    In "unanimity.tsv," Berger addressed each case with a Scalia-authored majority opinion, assessing the degree of unanimity, which may or may not be the same as that implied by the for/against vote in the case. Fields include case SCDB ID, majority-minority vote, and degree of unanimity.

    Both data files are in Tab-separated format. For further information, please contact the authors.

  5. National Forest Estate Subcompartments Scotland 2019

    • hub.arcgis.com
    • dtechtive.com
    Updated Oct 18, 2017
    Cite
    mapping.geodata_forestry (2017). National Forest Estate Subcompartments Scotland 2019 [Dataset]. https://hub.arcgis.com/maps/1a971b7b3e14439f8481d016f46d99d3_0/about
    Dataset updated
    Oct 18, 2017
    Dataset authored and provided by
    mapping.geodata_forestry
    Description

    All organisations hold information about the core of their business. The Forestry Commission holds information on trees and forests. We use this information to help us run our business and make decisions.

    The role of the Forest Inventory (the Sub-compartment Database (SCDB) and the stock maps) is to be our authoritative data source, giving us information for recording, monitoring, analysis and reporting.

    Through this it supports decision-making on the whole of the FC estate. Information from the Inventory is used by the FC, wider government, industry and the public for economic, environmental and social forest-related decision-making. Furthermore, it supports forest-related national policy development and government initiatives, and helps us meet our national and international forest-related reporting responsibilities.

    Information on our current forest resource, and the future expansion and availability of wood products from our forests, is vital for planners both in and outside the FC. It is used when looking at the development of processing industries, regional infrastructure, the effect upon communities of our actions, and to prepare and monitor government policies.

    The Inventory (SCDB and stock maps), with ‘Future Forest Structure’ and the ‘rollback’ functionality of Forester, will help provide a definitive measure of trends in extent, structure, composition, health, status, use, and management of all FC land holdings. We require this to meet national and international commitments, to report on the sustainable management of forests as well as to help us through the process of business and Forest Design Planning. As well as helping with the above, the SCDB helps us address detailed requests from industry, government, non-government organisations and the public for information on our estate.

    The FC’s growing national and international responsibilities and the requirements for monitoring and reporting on a range of forest statistics have highlighted the technical challenges we face in providing consistent, national level data. A well kept and managed SCDB and GIS (Geographical Information System - Forester) will provide the best solution for this and assist Countries in evidence-based policy making.

    Looking ahead at international reporting commitments, one example of an area where requirements look set to increase will be reporting on our work to combat climate change and how our estate contributes to carbon sequestration. We have put in place processes to ensure that at least the basics of our inventory are covered:

    1. The inventory of forests;
    2. The land-uses;
    3. The land we own (Deeds);
    4. The roads we manage.

    We depend on others to allow us to manage the forests and to provide us with funds and in doing so we need to be seen to be responsible and accountable for our actions. A foundation of achieving this is good record keeping.

    A sub compartment should be recognisable on the ground. It will be similar enough in land use, species or habitat composition, yield class, age, condition, thinning history etc. to be treated as a single unit. They will generally be contiguous in nature and will not be split by roads, rivers, open space etc. Distinct boundaries are required, and these will often change as crops are felled, thinned, replanted and resurveyed.

    In some parts of the country foresters used historical and topographical features to delineate sub-compartment boundaries, such as hedges, walls and escarpments. In other areas no account of the history and topography of the site was taken, with field boundaries, hedges, walls, streams etc. being subsumed into the sub-compartment. Also, these features may or may not appear on the OS backdrop, again this was dependent on the staff involved and what they felt was relevant to the map. The main point is that, as managers we may find such obvious features in the middle of a sub-compartment when nothing is indicated on the stock map, while the same thing would be indicated elsewhere.

    Attributes:

    • FOREST: Cost centre Nos.
    • COMPTMENT: Compartment Nos.
    • SUBCOMPT: Sub-compartment letter
    • SUBCOMPTID: Unique identifier
    • BLOCK: Block nos.
    • CULTCODE: Cultivation Code
    • CULTIVATN: Cultivation
    • PRIHABCODE: Primary Habitat Code
    • PRIHABITAT: Primary Habitat
    • PRILANDUSE: Land Use of primary component
    • PRISPECIES: Primary component tree species
    • PRI_PLYEAR: Primary component year planted
    • PRIPCTAREA: Primary component %Area of sub-compartment
    • SECHABCODE: Secondary Habitat Code
    • SECHABITAT: Secondary Habitat
    • SECLANDUSE: Land Use of secondary component
    • SECSPECIES: Secondary component tree species
    • SEC_PLYEAR: Secondary component year planted
    • SECPCTAREA: Secondary component %Area of sub-compartment
    • TERLANDUSE: Land Use of tertiary component
    • TERSPECIES: Tertiary component tree species
    • TER_PLYEAR: Tertiary component year planted
    • TERPCTAREA: Tertiary component %Area of sub-compartment
    • TERHABITAT: Tertiary Habitat
    • TERHABCODE: Tertiary Habitat Code

  6. NATIONAL FOREST ESTATE SUBCOMPARTMENTS ENGLAND 2016

    • data.wu.ac.at
    Updated Aug 8, 2018
    Cite
    Forestry Commission (2018). NATIONAL FOREST ESTATE SUBCOMPARTMENTS ENGLAND 2016 [Dataset]. https://data.wu.ac.at/odso/data_gov_uk/MjRlY2EzZTgtMmJlYS00YmVkLTg5NmEtZGUzMzY3ZGRjYTRi
    Dataset updated
    Aug 8, 2018
    Dataset provided by
    Forestry Commission
    License

    Open Government Licence 3.0 (http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/)
    License information was derived automatically

    Description

    All organisations hold information about the core of their business. The Forestry Commission holds information on trees and forests. We use this information to help us run our business and make decisions.

    The role of the Forest Inventory (the Sub-compartment Database (SCDB) and the stock maps) is to be our authoritative data source, giving us information for recording, monitoring, analysis and reporting.

    Through this it supports decision-making on the whole of the FC estate. Information from the Inventory is used by the FC, wider government, industry and the public for economic, environmental and social forest-related decision-making. Furthermore, it supports forest-related national policy development and government initiatives, and helps us meet our national and international forest-related reporting responsibilities.

    Information on our current forest resource, and the future expansion and availability of wood products from our forests, is vital for planners both in and outside the FC. It is used when looking at the development of processing industries, regional infrastructure, the effect upon communities of our actions, and to prepare and monitor government policies.

    The Inventory (SCDB and stock maps), with ‘Future Forest Structure’ and the ‘rollback’ functionality of Forester, will help provide a definitive measure of trends in extent, structure, composition, health, status, use, and management of all FC land holdings. We require this to meet national and international commitments, to report on the sustainable management of forests as well as to help us through the process of business and Forest Design Planning. As well as helping with the above, the SCDB helps us address detailed requests from industry, government, non-government organisations and the public for information on our estate.

    The FC’s growing national and international responsibilities and the requirements for monitoring and reporting on a range of forest statistics have highlighted the technical challenges we face in providing consistent, national level data. A well kept and managed SCDB and GIS (Geographical Information System - Forester) will provide the best solution for this and assist Countries in evidence-based policy making.

    Looking ahead at international reporting commitments, one example of an area where requirements look set to increase will be reporting on our work to combat climate change and how our estate contributes to carbon sequestration. We have put in place processes to ensure that at least the basics of our inventory are covered:

    1. The inventory of forests;
    2. The land-uses;
    3. The land we own (Deeds);
    4. The roads we manage.

    We depend on others to allow us to manage the forests and to provide us with funds and in doing so we need to be seen to be responsible and accountable for our actions. A foundation of achieving this is good record keeping.

    A sub compartment should be recognisable on the ground. It will be similar enough in land use, species or habitat composition, yield class, age, condition, thinning history etc. to be treated as a single unit. They will generally be contiguous in nature and will not be split by roads, rivers, open space etc. Distinct boundaries are required, and these will often change as crops are felled, thinned, replanted and resurveyed.

    In some parts of the country foresters used historical and topographical features to delineate sub-compartment boundaries, such as hedges, walls and escarpments. In other areas no account of the history and topography of the site was taken, with field boundaries, hedges, walls, streams etc. being subsumed into the sub-compartment. Also, these features may or may not appear on the OS backdrop, again this was dependent on the staff involved and what they felt was relevant to the map. The main point is that, as managers we may find such obvious features in the middle of a sub-compartment when nothing is indicated on the stock map, while the same thing would be indicated elsewhere.

    Attributes:

    • FOREST: Cost centre Nos.
    • COMPTMENT: Compartment Nos.
    • SUBCOMPT: Sub-compartment letter
    • SUBCOMPTID: Unique identifier
    • BLOCK: Block nos.
    • CULTCODE: Cultivation Code
    • CULTIVATN: Cultivation
    • PRIHABCODE: Primary Habitat Code
    • PRIHABITAT: Primary Habitat
    • PRILANDUSE: Land Use of primary component
    • PRISPECIES: Primary component tree species
    • PRI_PLYEAR: Primary component year planted
    • PRIPCTAREA: Primary component %Area of sub-compartment
    • SECHABCODE: Secondary Habitat Code
    • SECHABITAT: Secondary Habitat
    • SECLANDUSE: Land Use of secondary component
    • SECSPECIES: Secondary component tree species
    • SEC_PLYEAR: Secondary component year planted
    • SECPCTAREA: Secondary component %Area of sub-compartment
    • TERLANDUSE: Land Use of tertiary component
    • TERSPECIES: Tertiary component tree species
    • TER_PLYEAR: Tertiary component year planted
    • TERPCTAREA: Tertiary component %Area of sub-compartment
    • TERHABITAT: Tertiary Habitat
    • TERHABCODE: Tertiary Habitat Code

    Any maps produced using this data should contain the following Forestry Commission acknowledgement: “Contains, or is based on, information supplied by the Forestry Commission. © Crown copyright and database right [Year] Ordnance Survey [100021242]”. Attribution statement: Contains OS data © Crown copyright [and database right] [year].

  7. Cadastral Control (Point) (LGATE-224) - Datasets - data.wa.gov.au

    • catalogue.data.wa.gov.au
    Updated May 11, 2018
    Cite
    (2018). Cadastral Control (Point) (LGATE-224) - Datasets - data.wa.gov.au [Dataset]. https://catalogue.data.wa.gov.au/dataset/cadastral-control-point
    Dataset updated
    May 11, 2018
    Area covered
    Western Australia
    Description

    This dataset contains both GESMAR (Geodetic Survey Mark Register database) survey marks and non-geodetic control points that are used by Landgate to maintain and improve the spatial accuracy of the Spatial Cadastral Database (SCDB), which is the official digital cadastral map base of all crown and freehold land parcels within the State of Western Australia. Non-geodetic control points (referred to as Cadastral Control) are a set of non-GESMAR survey marks that have been spatially adjusted against the GESMAR network. Connections between Cadastral Control points and cadastral marks are used to improve spatial accuracy of the SCDB. The dataset can also be used to assist with the spatial upgrade or improvement of other SCDB datasets. Like cadastral point coordinates, the spatial location (coordinates) of Cadastral Control points is dynamic and may change as a result of adjustments to the GESMAR (Geodetic) network. This dataset should not be confused with the Geodetic Survey Control (LGATE-076) layer also available in SLIP, which contains detailed information relating to Geodetic Survey Control marks (GESMAR). NOTE: This product is for information purposes only and is not guaranteed. The information may be out of date and should not be relied upon without further verification from the original documents. Where the information is being used for legal purposes then the original documents must be searched for all legal requirements. © Western Australian Land Information Authority (Landgate). Use of Landgate data is subject to Personal Use License terms and conditions unless otherwise authorised under approved License terms and conditions.

  8. Lodged Cadastre (Land) (LGATE-223)

    • data.gov.au
    esri featureserver +8
    Updated Dec 28, 2021
    Cite
    Landgate (2021). Lodged Cadastre (Land) (LGATE-223) [Dataset]. https://data.gov.au/dataset/ds-wa-c6da1fb8-3dba-42d1-8cf0-74ba2ad3f4d8
    Available download formats
    geopackage, geojson, shp, pdf, esri mapserver, esri featureserver, wfs, wms, fgdb
    Dataset updated
    Dec 28, 2021
    Dataset provided by
    Western Australian Land Information Authority (http://www.landgate.wa.gov.au/)
    Description

    Community Titles Act changes coming Feb 2022 - refer to the Data Dictionary herein. As with Cadastre (Land) (LGATE-218), Lodged Cadastre (Land) (LGATE-223) does not contain SCDB polygon_numbers, but instead is based on land_id as contained within the SCDB maintenance environment and combines polygons of common land parcel identifiers into single multi-polygon features with aggregated areas. It is a digital representation of all survey documents lodged at Landgate that have not yet progressed past the lodged survey status. The dataset covers the State of Western Australia and the Commonwealth jurisdictions of Cocos Keeling Island and Christmas Island. A Data Dictionary for this dataset is still in development and will be published as soon as is available. NOTE: This product is for information purposes only and is not guaranteed. The information may be out of date and should not be relied upon without further verification from the original documents. Where the information is being used for legal purposes then the original documents must be searched for all legal requirements. © Western Australian Land Information Authority (Landgate). Use of Landgate data is subject to Personal Use License terms and conditions unless otherwise authorised under approved License terms and conditions. For further information, please contact your Landgate Service Manager or email BusinessSolutions@landgate.wa.gov.au.

  9. Cadastre (Land) (LGATE-218) - Datasets - data.wa.gov.au

    • catalogue.data.wa.gov.au
    Updated May 11, 2018
    Cite
    (2018). Cadastre (Land) (LGATE-218) - Datasets - data.wa.gov.au [Dataset]. https://catalogue.data.wa.gov.au/dataset/cadastre-land
    Dataset updated
    May 11, 2018
    Area covered
    Western Australia
    Description

    Changes will be applied to this dataset resulting from the implementation of the Community Titles Act 2018. These changes will be applied 9th February 2022 - please refer to the Data Dictionary and sample data below. The cadastral "LAND" based dataset is a digital representation of all land parcel boundaries within Western Australia. This cadastral dataset does not contain SCDB polygon_numbers, but instead is based on land_id as contained within the SCDB maintenance environment and combines polygons of common land parcel identifiers into single multi-polygon features with aggregated areas. It represents all crown land and freehold land and is sourced from the Spatial Cadastral Database (SCDB) which is the official digital cadastral map base of all crown and freehold land parcels within the State of Western Australia. The dataset covers the State of Western Australia and the Commonwealth jurisdictions of Cocos Keeling Island and Christmas Island. The dataset is updated with the current cadastral update cycle for data contained within SLIP. NOTE: This product is for information purposes only and is not guaranteed. The information may be out of date and should not be relied upon without further verification from the original documents. Where the information is being used for legal purposes then the original documents must be searched for all legal requirements. © Western Australian Land Information Authority (Landgate). Use of Landgate data is subject to Personal Use License terms and conditions unless otherwise authorised under approved License terms and conditions. For further information please contact your Landgate Service Manager or email BusinessSolutions@landgate.wa.gov.au.
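
    The aggregation described here, in which polygons sharing a land parcel identifier are combined into one multi-polygon feature with a summed area, can be illustrated in plain Python. This is a conceptual sketch only: the record layout, the function name and the sample values are hypothetical and do not reflect Landgate's actual schema or tooling.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Ring = List[Tuple[float, float]]          # one polygon outline as (x, y) pairs
Parcel = Tuple[str, Ring, float]          # (land_id, polygon, area)


def merge_by_land_id(parcels: Iterable[Parcel]) -> Dict[str, dict]:
    """Group polygons by land_id, collecting geometries and summing areas."""
    merged: Dict[str, dict] = defaultdict(lambda: {"polygons": [], "area": 0.0})
    for land_id, ring, area in parcels:
        merged[land_id]["polygons"].append(ring)
        merged[land_id]["area"] += area
    return dict(merged)


# Invented sample: two polygons share land_id "L1", one belongs to "L2".
parcels = [
    ("L1", [(0, 0), (10, 0), (10, 10), (0, 10)], 100.0),
    ("L1", [(20, 0), (30, 0), (30, 5), (20, 5)], 50.0),
    ("L2", [(0, 20), (4, 20), (4, 24), (0, 24)], 16.0),
]
features = merge_by_land_id(parcels)
print(features["L1"]["area"], len(features["L1"]["polygons"]))
```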

  10. Townsites (LGATE-248)

    • data.gov.au
    data, esri mapserver +6
    Updated Jul 31, 2023
    + more versions
    Cite
    Landgate (2023). Townsites (LGATE-248) [Dataset]. https://data.gov.au/dataset/ds-wa-feba4da1-f683-4615-b913-40446b7fc059
    Explore at:
    Available download formats: wfs, esri mapserver, data, shp, fgdb, geopackage, geojson, wms
    Dataset updated
    Jul 31, 2023
    Dataset provided by
    Western Australian Land Information Authority (http://www.landgate.wa.gov.au/)
    Description

    This layer supersedes Townsites (LGATE-007). Townsites are the urban centres described by technical description. A townsite must be approved (by document) by the Minister for Lands, under the Land Administration Act 1997. A townsite consists of urban land, rather than rural land. Derived from the Spatial Cadastral Database (SCDB) and based on GDA 94. Updated regularly when amendments/changes are formalised. For more information, please contact your Landgate Service Manager or email BusinessSolutions@landgate.wa.gov.au. © Western Australian Land Information Authority (Landgate). Use of Landgate data is subject to Personal Use License terms and conditions unless otherwise authorised under approved License terms and conditions.

  11. Cadastral Control (Line) (LGATE-225) - Datasets - data.wa.gov.au

    • catalogue.data.wa.gov.au
    Updated May 11, 2018
    + more versions
    Cite
    (2018). Cadastral Control (Line) (LGATE-225) - Datasets - data.wa.gov.au [Dataset]. https://catalogue.data.wa.gov.au/dataset/cadastral-control-line
    Explore at:
    Dataset updated
    May 11, 2018
    Area covered
    Western Australia
    Description

    This dataset contains cadastral control lines that are used by Landgate to maintain and improve the spatial accuracy of the Spatial Cadastral Database (SCDB) which is the official digital cadastral map base of all crown and freehold land parcels within the State of Western Australia. The dataset can also be used to assist with the spatial upgrade or improvement of other SCDB datasets. NOTE: This product is for information purposes only and is not guaranteed. The information may be out of date and should not be relied upon without further verification from the original documents. Where the information is being used for legal purposes then the original documents must be searched for all legal requirements. © Western Australian Land Information Authority (Landgate). Use of Landgate data is subject to Personal Use License terms and conditions unless otherwise authorised under approved License terms and conditions.

  12. Survey control register - Queensland series

    • data.qld.gov.au
    • researchdata.edu.au
    • +1more
    rest +4
    Updated Oct 18, 2023
    Cite
    Natural Resources and Mines, Manufacturing, and Regional and Rural Development (2023). Survey control register - Queensland series [Dataset]. https://www.data.qld.gov.au/dataset/survey-control-register-queensland-series
    Explore at:
    Available download formats: shp, tab, fgdb, kmz, gpkg(56623104), wms(1024), rest(1024), xml(1024), spatial data format(56623104)
    Dataset updated
    Oct 18, 2023
    Dataset authored and provided by
    Natural Resources and Mines, Manufacturing, and Regional and Rural Development
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Area covered
    Queensland
    Description

    These datasets contain records of Queensland's geodetic survey control information. The database provides for the effective management of the geodetic survey control information for Queensland, for which the Department of Resources is responsible under the Survey and Mapping Infrastructure Act 2003. The records contain: registered number (number of the survey control mark); local authority (name of the local authority); vertical height (height of the mark); vertical datum (datum of the height).

  13. Agricultural Areas (LGATE-228)

    • data.gov.au
    data +8
    Updated Jun 1, 2022
    + more versions
    Cite
    Landgate (2022). Agricultural Areas (LGATE-228) [Dataset]. https://data.gov.au/dataset/ds-wa-624a4605-5491-455c-ba98-90107c4ef08d?q=
    Explore at:
    Available download formats: data, geojson, geopackage, fgdb, esri mapserver, esri featureserver, shp, wms, wfs
    Dataset updated
    Jun 1, 2022
    Dataset provided by
    Western Australian Land Information Authority (http://www.landgate.wa.gov.au/)
    Description

    This layer supersedes Agricultural Areas (LGATE-183). Republished with a redefined set of attributes, Agricultural Areas are declared parcels of land allocated under the Land Act 1933 to which special provisions are applied for both alienation and improvement. Many of the Agricultural Areas were declared shortly after World War I, for the purposes of soldier resettlement, and were eventually opened up for sale to civilians also. Derived from the Spatial Cadastral Database (SCDB) and based on GDA 94 and replaces previously published Agricultural Areas (LGATE-183). If you have any questions or require further information, please contact your Landgate Service Manager or email BusinessSolutions@landgate.wa.gov.au. © Western Australian Land Information Authority (Landgate). Use of Landgate data is subject to Personal Use License terms and conditions unless otherwise authorised under approved License terms and conditions.

  14. Port Authorities (LGATE-243)

    • data.gov.au
    data, esri mapserver +6
    Updated Jun 28, 2023
    + more versions
    Cite
    Landgate (2023). Port Authorities (LGATE-243) [Dataset]. https://data.gov.au/dataset/ds-wa-b48837ee-ba70-46df-825b-75609a22b47f
    Explore at:
    Available download formats: wfs, wms, data, shp, geojson, geopackage, fgdb, esri mapserver
    Dataset updated
    Jun 28, 2023
    Dataset provided by
    Western Australian Land Information Authority (http://www.landgate.wa.gov.au/)
    Description

    This layer supersedes Port Authorities (LGATE-132). Port Authorities for the State of Western Australia. As originally declared under individual Port Authority Acts (7) and more recently declared under the Port Authorities Act 1999 (2). Derived from the Spatial Cadastral Database (SCDB) and based on GDA 94. Updated regularly when amendments/changes to boundaries are formalised, all of which are now gazetted under the Port Authorities Act 1999. For more information, please contact your Landgate Service Manager or email BusinessSolutions@landgate.wa.gov.au. © Western Australian Land Information Authority (Landgate). Use of Landgate data is subject to Personal Use License terms and conditions unless otherwise authorised under approved License terms and conditions.

  15. Plan Extent (LGATE-284) - Datasets - data.wa.gov.au

    • catalogue.data.wa.gov.au
    Updated Aug 1, 2019
    + more versions
    Cite
    (2019). Plan Extent (LGATE-284) - Datasets - data.wa.gov.au [Dataset]. https://catalogue.data.wa.gov.au/dataset/plan-extent
    Explore at:
    Dataset updated
    Aug 1, 2019
    Area covered
    Western Australia
    Description

    This dataset has been developed to assist users in identifying survey documents required for survey and plan preparation. It is a derived spatial index of survey documents lodged at Landgate since 2002 as stored within the Spatial Cadastral Database (SCDB), combined with aggregated cadastral polygons of the same survey/plan identifier where plan surrounds/extent does not exist within the SCDB (no digital plan lodgement). This dataset is complemented by the following datasets published by Landgate: Field Record Labels (LGATE-282), Plan Labels (LGATE-283), and Scanned Survey Index Plans (LGATE-285). These 4 layers combined comprise the most comprehensive digital representation of survey information held at Landgate. NOTE: This product is for information purposes only and is not guaranteed and should not be relied upon for legal purposes without further verification from the original documents.

  16. Tenure Polygons - Interests (LGATE-352) - Datasets - data.wa.gov.au

    • catalogue.data.wa.gov.au
    Updated Sep 6, 2022
    + more versions
    Cite
    (2022). Tenure Polygons - Interests (LGATE-352) - Datasets - data.wa.gov.au [Dataset]. https://catalogue.data.wa.gov.au/dataset/polygons-interests-lgate-352
    Explore at:
    Dataset updated
    Sep 6, 2022
    Area covered
    Western Australia
    Description

    Polygons - Interests is one of a suite of feature classes (5 in total) contained within Landgate's Tenure-by-Polygon SLIP service and provides a processed "flattened" data structure for cadastral polygons with land tenure type and ownership details. This layer complements the Polygon - Master layer and contains all polygons captured within the Spatial Cadastral Database (SCDB) that are considered an "interest". NOTE: This product is for information purposes only and is not guaranteed. The information may be out of date and should not be relied upon without further verification from the original documents. Where the information is being used for legal purposes then the original documents must be searched for all legal requirements. Strict access criteria apply due to the sensitivity of the information contained in this data service; please contact BusinessSolutions@landgate.wa.gov.au for further information.

  17. Marine and Harbour Areas (LGATE-235) - Datasets - data.wa.gov.au

    • catalogue.data.wa.gov.au
    Updated May 11, 2018
    + more versions
    Cite
    (2018). Marine and Harbour Areas (LGATE-235) - Datasets - data.wa.gov.au [Dataset]. https://catalogue.data.wa.gov.au/dataset/marine-and-harbour-areas
    Explore at:
    Dataset updated
    May 11, 2018
    Area covered
    Western Australia
    Description

    Republished with a redefined set of attributes, Marine and Harbour Areas are proclaimed areas of both land and water under the Marine and Harbours Act 1981, vested in the Minister for Transport and administered by the Department of Transport. Derived from the Spatial Cadastral Database (SCDB) and based on GDA 94 and replaces previously published Marine and Harbour Areas (LGATE-191). If you have any questions or require further information, please contact your Landgate Service Manager or email BusinessSolutions@landgate.wa.gov.au. © Western Australian Land Information Authority (Landgate). Use of Landgate data is subject to Personal Use License terms and conditions unless otherwise authorised under approved License terms and conditions.

  18. Historical Cadastre 2024 (Polygon) (LGATE-489) - Datasets - data.wa.gov.au

    • catalogue.data.wa.gov.au
    Updated Feb 18, 2025
    + more versions
    Cite
    (2025). Historical Cadastre 2024 (Polygon) (LGATE-489) - Datasets - data.wa.gov.au [Dataset]. https://catalogue.data.wa.gov.au/dataset/historical-cadastre-2024-polygon-lgate
    Explore at:
    Dataset updated
    Feb 18, 2025
    Area covered
    Western Australia
    Description

    This cadastral polygon dataset is a digital representation of all land parcel boundaries within Western Australia at a particular point in time. It represents all crown land (land owned by the State) and freehold land (land held in fee simple) and was sourced from the Spatial Cadastral Database (SCDB) which is the official digital cadastral map base of all crown and freehold land parcels within the State of Western Australia. The dataset covers the State of Western Australia and includes the Commonwealth jurisdictions of Cocos Keeling Island and Christmas Island. This dataset contains only those land parcel boundaries that were current (integrated) at the time of extraction (ie: does not contain "lodged" land parcels) and includes easement and other interests. NOTE: This product is for information purposes only and is not guaranteed. The information may be out of date and should not be relied upon without further verification from the original documents. Where the information is being used for legal purposes then the original documents must be searched for all legal requirements. © Western Australian Land Information Authority (Landgate).

  19. Lodged Cadastre (Line) (LGATE-221) - Datasets - data.wa.gov.au

    • catalogue.data.wa.gov.au
    Updated May 11, 2018
    + more versions
    Cite
    (2018). Lodged Cadastre (Line) (LGATE-221) - Datasets - data.wa.gov.au [Dataset]. https://catalogue.data.wa.gov.au/dataset/lodged-cadastre-line
    Explore at:
    Dataset updated
    May 11, 2018
    Area covered
    Western Australia
    Description

    This dataset is part of the Spatial Cadastre Database (SCDB), an integrated database comprising a number of datasets (layers) of digital spatial data that define all Crown and Freehold land parcels within the State as well as subsidiary survey network control. It includes an integrated administrative boundaries dataset and a lodged layer containing recent survey. NOTE: This product is for information purposes only and is not guaranteed. The information may be out of date and should not be relied upon without further verification from the original documents. Where the information is being used for legal purposes then the original documents must be searched for all legal requirements. For further information please contact your Landgate Service Manager or email BusinessSolutions@landgate.wa.gov.au. © Western Australian Land Information Authority (Landgate). Use of Landgate data is subject to Personal Use License terms and conditions unless otherwise authorised under approved License terms and conditions.

  20. Cadastre (Point) (LGATE-215) - Datasets - data.wa.gov.au

    • catalogue.data.wa.gov.au
    Updated May 11, 2018
    + more versions
    Cite
    (2018). Cadastre (Point) (LGATE-215) - Datasets - data.wa.gov.au [Dataset]. https://catalogue.data.wa.gov.au/dataset/cadastre-point
    Explore at:
    Dataset updated
    May 11, 2018
    Area covered
    Western Australia
    Description

    This cadastral dataset, of point geometry type, is a digital representation of all land parcel boundaries within Western Australia. It provides detailed representation of all land parcel boundaries and aligns with the Cadastre (polygon) (LGATE-217) and Cadastre (Line) (LGATE-216) datasets. It provides feature level capture dates and accuracy. This dataset is sourced from the Spatial Cadastral Database (SCDB) which is the official digital cadastral map base of all crown and freehold land parcels within the State of Western Australia. The dataset covers the State of Western Australia and the Commonwealth jurisdictions of Cocos Keeling Island and Christmas Island. NOTE: This product is for information purposes only and is not guaranteed. The information may be out of date and should not be relied upon without further verification from the original documents. Where the information is being used for legal purposes then the original documents must be searched for all legal requirements. © Western Australian Land Information Authority (Landgate). Use of Landgate data is subject to Personal Use License terms and conditions unless otherwise authorised under approved License terms and conditions. For further information, please contact your Landgate Service Manager or email BusinessSolutions@landgate.wa.gov.au.

Cite
Eric C. Nystrom; David S. Tanenhaus (2020). Connecting U.S. Supreme Court Case Information and Opinion Authorship (SCDB) to Full Case Text Data (CAP), 1791-2011 [Dataset]. http://doi.org/10.5281/zenodo.4344917

Connecting U.S. Supreme Court Case Information and Opinion Authorship (SCDB) to Full Case Text Data (CAP), 1791-2011

Explore at:
Available download formats: tsv
Dataset updated
Dec 19, 2020
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Eric C. Nystrom; David S. Tanenhaus
License

Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically

Area covered
United States
Description

This dataset was constructed to connect the rich metadata created by the Supreme Court Database (SCDB) to the Caselaw Access Project (CAP) full-text court opinion data. Since the SCDB includes only substantive opinions, it is necessarily a subset of the full range of opinions available through CAP.

There are two parts to this data: the map connecting each SCDB ID to its corresponding CAP case number, and a more advanced (but error-prone) version in which the authorship of each opinion text identified for the case in CAP is attributed to the Justice who wrote it. Each of these data products has been hand-corrected to the best of this author's ability.

SCDB-CAP map

The SCDB->CAP map began as a relatively straightforward automated matching process, based on the US Reports citation for each case as expressed in both SCDB and CAP. Slightly over 80% of SCDB entries found a single CAP data match this way. From there, the data was entirely hand-corrected, with non-matches or duplicate matches individually investigated and manually corrected.

Some SCDB entries simply could not be matched to an appropriate CAP text. Initially, the entirety of US Reports volume 44 was missing, but with the help of CAP staff, the volume was located as having been filed in the New York jurisdiction rather than the United States jurisdiction. The case numbers were then added to the map, but until the volume is relocated to the United States jurisdiction, it may be necessary to also incorporate the New York jurisdiction in full text analysis so that the cases from volume 44 can be searched. 108 more missing cases are from US Reports volume 131, which was a "catch up" volume published in the 19th century. These catch-up cases, many heard by the Supreme Court decades prior, were numbered with lowercase roman numerals instead of the ordinary numbers, which is almost certainly why CAP's software dismissed the catch-up section as prefatory material. Many of the remaining errors seem to be examples where the SCDB project recognized a separate court action that CAP did not. Most of these seem to have been later rehearings for a case previously decided, which in the 19th century particularly were commonly reported out at the end of the first decision text. While SCDB sometimes gave these subsequent but related actions a separate SCDB entry, CAP seems to have largely incorporated them as part of the text of the main case. Additionally, there were a few that simply could not be found, despite a careful look through each database as well as the original US Reports and sometimes adjacent volumes. Finally, the cases were only matched up through the 2011 court term. After the 2011 term, the mismatches between CAP and SCDB were extensive and frequently seemed impossible to resolve.

Even so, with the manual correction, the overall error rate is low. Of 28,304 cases, only 191 do not have a match, and of those, 108 are contained within the vol. 131 "catch up" volume. Since most of the rest are extremely short subsequent actions that were separately noted by SCDB, the effect of these non-matched cases would seem to be small in most cases.

The typical use case would be that the researcher would generate some kind of results based on searching in the CAP full text, then could use the CAP ID to look up the SCDB ID in the map. With the SCDB ID, of course, the rich metadata from the SCDB can then be connected to each result as needed.
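The lookup described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the column order (SCDB ID first, CAP ID second) follows the file description, but whether the file has a header row and how unmatched entries are encoded are not stated, so the sketch assumes no header row and an empty CAP ID for unmatched entries. The sample rows and IDs are invented.

```python
import csv
import io

def load_cap_to_scdb(tsv_file):
    """Build a dict mapping CAP ID -> SCDB ID from the five-column map file."""
    cap_to_scdb = {}
    for row in csv.reader(tsv_file, delimiter="\t"):
        if len(row) < 2:
            continue  # skip blank or malformed lines
        scdb_id, cap_id = row[0].strip(), row[1].strip()
        if cap_id:  # assumption: unmatched SCDB entries have an empty CAP ID
            cap_to_scdb[cap_id] = scdb_id
    return cap_to_scdb

# Invented sample rows, for illustration only (not real data).
sample = io.StringIO(
    "SCDB-A\t1001\t1 U.S. 1\t1800-01-01\tExample v. Example\n"
    "SCDB-B\t\t2 U.S. 2\t1801-01-01\tUnmatched v. Case\n"
)
cap_to_scdb = load_cap_to_scdb(sample)

# CAP IDs returned by a hypothetical full-text search.
search_hits = ["1001", "9999"]
linked = {cap_id: cap_to_scdb.get(cap_id) for cap_id in search_hits}
```

A CAP ID that maps to None has no SCDB counterpart in the map, which is expected because SCDB covers only a subset of the opinions in CAP.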

Opinion authorship

Being able to use the rich metadata of SCDB in conjunction with a case's full text is exciting, but it immediately prompts a further question -- what if the texts could be attributed directly to the Justices who authored them? SCDB produces its data in two forms; one is "case centered," where each record represents one case, and the other is "justice centered," in which each record is the vote of one Justice in one case. CAP, in turn, breaks the total text of the case into distinct opinions, and tries to attribute those opinions to their authors by scraping a string of text from the raw input. Therefore, the challenge was to connect these two sources at the opinion level.

Connecting the opinions, like connecting the cases, involved an initial match by machines, followed by manual correction and revision. In this case, the scope of the manual effort was much larger than that posed by the case-level connection, and more errors were noted in both SCDB and CAP.

The matching process involved a number of steps. First a list of opinions was generated from the CAP data, then matched to SCDB using the SCDB-CAP connector data described above. (Thus, a case without a CAP match in the SCDB-CAP data will not appear in the opinion author data either.) CAP opinions were numbered in the order they were encountered in each CAP case JSON object, and these numbers are used to distinguish the opinions.

Next, a round of automatic matching was performed. If there was only one opinion, and only one author listed in the SCDB data, then the majority opinion author (as listed in SCDB) was safely assumed to be the author. If there was no author listed in SCDB, "percuriam" was recorded as the author in this data. If there were exactly two opinions and two authors, the process was also straightforward, as the SCDB-identified majority opinion author was assigned to opinion 1, and the remaining author assigned opinion 2.
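The rules for these simple cases can be written down as a short Python sketch. The function name and its arguments are hypothetical, and one reading of the text is assumed: the "percuriam" rule is applied only to single-opinion cases with no SCDB author. Anything the rules do not cover returns None and is left for the later steps.

```python
def auto_assign_authors(num_opinions, majority_author, writers):
    """Return {opinion_number: author} for the simple cases, else None.

    majority_author: SCDB majority opinion writer, or None if none is listed.
    writers: every Justice SCDB lists as having written in the case.
    """
    writers = list(dict.fromkeys(writers))  # de-duplicate, keep order
    if num_opinions == 1:
        if not writers and majority_author is None:
            return {1: "percuriam"}  # no author listed in SCDB
        if writers == [majority_author]:
            return {1: majority_author}
    if num_opinions == 2 and len(writers) == 2 and majority_author in writers:
        other = next(w for w in writers if w != majority_author)
        return {1: majority_author, 2: other}
    return None  # needs fuzzy matching or manual review
```

For example, `auto_assign_authors(2, "Miller", ["Swayne", "Miller"])` assigns Miller to opinion 1 and Swayne to opinion 2, regardless of the order in which the writers are listed.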

Subsequently, cases with more than two opinions were processed. A potential match (i.e. a "guess") for each opinion in a given case was created by listing each Justice identified by SCDB as having written an opinion in the case. These guesses were then parsed using a semi-automatic procedure with Levenshtein distance fuzzy name matching. With sufficiently conservative parameters, a successful fuzzy match meant that the non-successful guesses for that opinion could be deleted. These sorted guesses were then reviewed manually. Particular care was also taken for any case that contained opinions authored by Justices who had similar names (for example, Clark and Black differ by only two letters). These sorts of cases, as well as instances of co-authorship, were identified and fixed manually.
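A minimal sketch of Levenshtein-based name matching is given below. It is not the procedure actually used for this dataset: the tokenization, the `max_distance` threshold, and the function names are assumptions chosen for illustration. The sketch accepts a guess only when exactly one candidate surname is within the threshold, so ambiguous strings fall through to manual review.

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance between two strings."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]

def match_author(cap_author_string, candidates, max_distance=1):
    """Return the single candidate surname that fuzzily matches, else None."""
    tokens = [t.strip(".,:;").lower() for t in cap_author_string.split()]
    tokens = [t for t in tokens if t]
    hits = []
    for name in candidates:
        # Smallest distance between this surname and any word in the string.
        best = min((levenshtein(tok, name.lower()) for tok in tokens),
                   default=None)
        if best is not None and best <= max_distance:
            hits.append(name)
    return hits[0] if len(hits) == 1 else None
```

With the default threshold, an OCR-damaged string such as "Mr. Justice Harlam" still resolves to the candidate "Harlan". Loosening the threshold makes similar surnames collide, in which case the function returns None rather than picking one.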

Those opinions whose authorship could not be matched then were fixed by hand. These included some where the CAP author strings were more complicated than SCDB's strict interpretation; others where the OCR in CAP which contained the Justice name was especially bad; and a number of others where "Mr. Chief Justice" couldn't be directly matched with an author name by the machine. After this light manual correction, almost 500 opinions with substantial errors remained to be individually investigated in depth, by examining the CAP record, the SCDB record, and images of the US Reports for that case. For these last tough customers, errors in the source data were commonly the cause of matching problems. Typically these were of three kinds: examples where CAP should have split the text but didn't (e.g. 2 opinions together in one opinion entry in CAP); examples where SCDB either did not identify or mis-identified an author (such as attributing it to Swayne when it was written by Miller); and examples of non-valid opinions (such as where CAP mistakenly split the opinion too early, leaving an opinion fragment).

For these errors, a system of codes was created in the author field to signal the error type so that researchers can be suitably cautious. The error code is always at the beginning of the field and is followed by a comma and the names of each author, separated by a comma with no space to facilitate parsing. Note also that co-authors are listed as comma-separated names in this same field with no error code. Researchers will probably want to disaggregate this field to create duplicate records with each individual author for most purposes. The justice number field also contains information about all justices authoring the opinion but the error codes have been omitted here.

  • !C -- error: multiple opinion texts combined (i.e. CAP splitting error)
  • !X -- error: unattributed or misattributed opinion (not listed in SCDB as writer)
  • !D -- error: extra opinion that should be deleted, i.e. not a valid opinion
  • !W -- error: listed as Writers by SCDB, but should be co-authors
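The author field format described above can be parsed and disaggregated along these lines. This is a sketch only: the function names are hypothetical, the sample record is invented, and the author field is taken to be the sixth column (index 5) of the authorship file. The justice ID field is left untouched because its exact layout for co-authored opinions is not described here.

```python
ERROR_CODES = {
    "!C": "multiple opinion texts combined (CAP splitting error)",
    "!X": "unattributed or misattributed opinion",
    "!D": "extra opinion that should be deleted",
    "!W": "listed as writers by SCDB, but should be co-authors",
}

def parse_author_field(field):
    """Split an author field into (error_code_or_None, [author, ...])."""
    parts = [p for p in field.strip().split(",") if p]
    if parts and parts[0] in ERROR_CODES:
        return parts[0], parts[1:]
    return None, parts

def disaggregate(row, author_index=5):
    """Return one (error_code, row_copy) pair per author in the record."""
    code, authors = parse_author_field(row[author_index])
    out = []
    for author in authors or [""]:  # keep records that name no author
        new_row = list(row)
        new_row[author_index] = author
        out.append((code, new_row))
    return out

# Invented seven-column record, for illustration only (not real data).
row = ["SCDB-A", "1001", "1 U.S. 1", "Example v. Example", "2",
       "!C,Miller,Swayne", "0,0"]
records = disaggregate(row)
```

Keeping the error code alongside each duplicated record lets a researcher filter out, or treat separately, the opinions flagged as problematic.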

Data file structure

"scdb_cap-051820.tsv" is a Tab-separated data file containing 5 columns: SCDB ID, CAP ID, US Reports citation, case date, and case name (the latter three from the SCDB data).

"scdb-cap-opinion-authorship_051920.tsv" is a Tab-separated data file containing seven columns: SCDB ID, CAP ID, US Reports citation, case name, opinion number in the case, opinion author, and SCDB justice ID. See above for caveats about disaggregating and error codes in fields six and seven.

Errors

It is likely that errors remain in this data, and it is also hoped that some of the errors beyond the author's immediate control might be fixed in the upstream data so that they can be corrected here. Authors would be grateful for error reports, and also reports of errors fixed, if any.
