100+ datasets found
  1. Corporations and Other Entities: All Filings

    • catalog.data.gov
    • data.ny.gov
    Updated Nov 29, 2025
    + more versions
    Cite
    data.ny.gov (2025). Corporations and Other Entities: All Filings [Dataset]. https://catalog.data.gov/dataset/corporations-and-other-entities-all-filings
    Explore at:
    Dataset updated
    Nov 29, 2025
    Dataset provided by
    data.ny.gov
    Description

    This data contains all Corporations and Other Entity filings in the Department of State electronic database. Each record contains the Department of State ID number, Filing ID, Date Filed, Effective Date, Entity Name, the law under which the filing was made and other pertinent filing information.

  2. SEC filings data (1993-2024)

    • kaggle.com
    zip
    Updated Apr 6, 2025
    Cite
    João Brás Oliveira (2025). SEC filings data (1993-2024) [Dataset]. https://www.kaggle.com/datasets/joaobrasoliveira/securities-and-exchange-comission-sec-master/data
    Explore at:
    Available download formats: zip (674759917 bytes)
    Dataset updated
    Apr 6, 2025
    Authors
    João Brás Oliveira
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Dataset intro:

    This dataset, available in two formats (.csv and .parquet), compiles all submissions filed with the Securities and Exchange Commission (SEC) between 1993 and 2024. The data was taken from the regulator's website (https://www.sec.gov/) and as such is in the public domain.

    It provides answers to the following questions:

    • Who made the submission? (columns "*CIK*" and "*Company Name*")
    • When was the submission filed with the SEC? (column "*Date filed*")
    • What form was submitted/filed with the SEC? (column "*Form Type*")
    • Where can I find/access said submission/filing? (column "*Filename*")

    It was obtained by merging all quarterly submission index files (.idx) for the 1993-2024 interval. For more information, see the "*Code for the Collection Methodology*" section.

    Its main benefit is for research: if you do not have access to a costly proprietary database, you can still filter the documents filed with the SEC by type, filer info, date, or filename.

    The "*Filename*" column is particularly helpful: prefixing its value with "https://www.sec.gov/Archives/" yields the direct link to the document within the SEC's EDGAR (Electronic Data Gathering, Analysis, and Retrieval) system.
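
    As a minimal sketch of the link construction described above, the helper below prefixes a "*Filename*" value with the archive root. The function name `filing_url` and the example path are hypothetical, chosen only for illustration; the archive root is the one quoted in the description.

```python
from urllib.parse import urljoin

# Archive root quoted in the dataset description
EDGAR_ARCHIVES = "https://www.sec.gov/Archives/"


def filing_url(filename: str) -> str:
    """Prefix a value from the "Filename" column with the EDGAR archive root."""
    # Strip whitespace and any leading slash so the join stays under /Archives/
    return urljoin(EDGAR_ARCHIVES, filename.strip().lstrip("/"))


# Hypothetical Filename value, for illustration only (not a real filing)
example = "edgar/data/0000000/0000000000-00-000000.txt"
print(filing_url(example))
```

    Stripping a leading slash matters because `urljoin` would otherwise resolve the path against the site root and drop the `/Archives/` prefix.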

    Code for the Collection Methodology:

    import os

    import pandas as pd

    # Placeholder paths, not defined in the original snippet; adjust as needed
    input_folder = "path/to/idx_files"
    output_csv = "sec_filings.csv"
    output_parquet = "sec_filings.parquet"

    # Initialize an empty list to store data from all files
    all_data = []

    # Loop through all files in the folder
    for file_name in os.listdir(input_folder):
        if file_name.endswith(".idx"):  # Only process .idx files
            file_path = os.path.join(input_folder, file_name)

            # Print the current file being processed
            print(f"Processing file: {file_name}")

            # Read the file; latin-1 is an assumption that avoids decode errors on non-UTF-8 bytes
            with open(file_path, "r", encoding="latin-1") as file:
                lines = file.readlines()

            # Extract the column headers and data rows
            column_names = lines[9].strip().split("|")  # 10th line (index 9) holds the column headers
            data_rows = [
                line.strip().split("|") for line in lines[11:]  # Skip the separator line (index 10)
                if line.strip() and not line.startswith("-")  # Exclude blank lines and dashed lines
            ]

            # Add the extracted data to the list
            all_data.extend(data_rows)

    # Create a single DataFrame from all the collected data
    df = pd.DataFrame(all_data, columns=column_names)
    print(df.head())

    df.to_csv(output_csv, index=False)  # for .csv file - Version 1
    df.to_parquet(output_parquet, engine="pyarrow")  # for .parquet file - Version 3
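
    The parsing and filtering steps can also be tried without downloading anything. The sketch below uses only the standard library and assumes the same layout as the snippet above (column headers on the 10th line, a dashed separator on the 11th, pipe-delimited rows after that). The function name `parse_idx`, the column header text, and every sample row are made up for illustration and do not come from real filings.

```python
def parse_idx(text: str) -> list[dict[str, str]]:
    """Parse pipe-delimited index text into one dict per filing row."""
    lines = text.splitlines()
    header = lines[9].strip().split("|")  # 10th line holds the column names
    rows = []
    for line in lines[11:]:  # skip the separator line at index 10
        fields = line.strip().split("|")
        if len(fields) == len(header):  # ignore blank or malformed lines
            rows.append(dict(zip(header, fields)))
    return rows


# Made-up sample text that mimics the assumed layout
sample = "\n".join(
    ["(preamble line)"] * 9
    + ["CIK|Company Name|Form Type|Date Filed|Filename", "-" * 40]
    + [
        "1|Example Corp A|10-K|2024-01-15|edgar/data/1/a.txt",
        "2|Example Corp B|8-K|2024-02-01|edgar/data/2/b.txt",
        "",
        "3|Example Corp C|10-K|2024-03-10|edgar/data/3/c.txt",
    ]
)

rows = parse_idx(sample)

# Filter by form type, as the description suggests
ten_ks = [r for r in rows if r["Form Type"] == "10-K"]
print([r["Company Name"] for r in ten_ks])
```

    Checking the field count per row is a design choice of this sketch: it silently drops lines that do not match the header, which keeps later column lookups safe at the cost of hiding malformed input.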
    
    
  3. US Company Filings Database

    • lseg.com
    Updated Oct 13, 2025
    Cite
    LSEG (2025). US Company Filings Database [Dataset]. https://www.lseg.com/en/data-analytics/financial-data/filings/company-filings-database
    Explore at:
    Available download formats: csv, html, json, pdf, python, text, user interface, xml
    Dataset updated
    Oct 13, 2025
    Dataset provided by
    London Stock Exchange Group (http://www.londonstockexchangegroup.com/)
    Authors
    LSEG
    License

    https://www.lseg.com/en/policies/website-disclaimer

    Area covered
    United States
    Description

    Browse LSEG's US Company Filings Database, and find a range of filings content and history including annual reports, municipal bonds, and more.

  4. M-1 Filings Database

    • catalog.data.gov
    • datasets.ai
    • +2 more
    Updated Apr 8, 2025
    Cite
    Employee Benefits Security Administration (2025). M-1 Filings Database [Dataset]. https://catalog.data.gov/dataset/m-1-filings-database-0c56c
    Explore at:
    Dataset updated
    Apr 8, 2025
    Dataset provided by
    Employee Benefits Security Administration (https://www.dol.gov/agencies/ebsa)
    Description

    Electronic filing system for the Form M-1 annual report for multiple employer welfare arrangements

  5. Filings

    • lseg.com
    Updated Nov 19, 2023
    Cite
    LSEG (2023). Filings [Dataset]. https://www.lseg.com/en/data-analytics/financial-data/filings
    Explore at:
    Dataset updated
    Nov 19, 2023
    Dataset provided by
    London Stock Exchange Group (http://www.londonstockexchangegroup.com/)
    Authors
    LSEG
    License

    https://www.lseg.com/en/policies/website-disclaimer

    Description

    LSEG global Filings offers extensive coverage of developed and emerging markets, updated in real time. Discover the data.

  6. Data from: SEC Filings

    • kaggle.com
    zip
    Updated Jun 5, 2020
    Cite
    Google BigQuery (2020). SEC Filings [Dataset]. https://www.kaggle.com/datasets/bigquery/sec-filings
    Explore at:
    Available download formats: zip (0 bytes)
    Dataset updated
    Jun 5, 2020
    Dataset provided by
    BigQuery (https://cloud.google.com/bigquery)
    Authors
    Google BigQuery
    Description

    In the U.S., public companies, certain insiders, and broker-dealers are required to regularly file with the SEC. The SEC makes this data available online for anybody to view and use via its Electronic Data Gathering, Analysis, and Retrieval (EDGAR) database. The SEC updates this data every quarter going back to January 2009. For more information please see this site.

    To aid analysis a quick summary view of the data has been created that is not available in the original dataset. The quick summary view pulls together signals into a single table that otherwise would have to be joined from multiple tables and enables a more streamlined user experience.

    DISCLAIMER: The Financial Statement and Notes Data Sets contain information derived from structured data filed with the Commission by individual registrants as well as Commission-generated filing identifiers. Because the data sets are derived from information provided by individual registrants, we cannot guarantee the accuracy of the data sets. In addition, it is possible inaccuracies or other errors were introduced into the data sets during the process of extracting the data and compiling the data sets. Finally, the data sets do not reflect all available information, including certain metadata associated with Commission filings. The data sets are intended to assist the public in analyzing data contained in Commission filings; however, they are not a substitute for such filings. Investors should review the full Commission filings before making any investment decision.

  7. Daily Corporation and Other Entity Filing Data

    • catalog.data.gov
    • datasets.ai
    • +1 more
    Updated Nov 29, 2025
    Cite
    data.ny.gov (2025). Daily Corporation and Other Entity Filing Data [Dataset]. https://catalog.data.gov/dataset/daily-corporation-and-other-entity-filing-data
    Explore at:
    Dataset updated
    Nov 29, 2025
    Dataset provided by
    data.ny.gov
    Description

    This data contains Corporations and other Entities filing information that were processed in the previous thirty days. Each line contains the Department of State ID number, Film ID, Date Filed, Effective Date, Entity Name, the law under which the filing was made and other pertinent filing information.

  8. Filing Details

    • data.virginia.gov
    csv
    Updated Oct 30, 2025
    Cite
    State Corporate Commission (2025). Filing Details [Dataset]. https://data.virginia.gov/dataset/filing-details
    Explore at:
    Available download formats: csv (36970)
    Dataset updated
    Oct 30, 2025
    Dataset authored and provided by
    State Corporate Commission
    Description

    This dataset contains detailed information about UCC filings, typically related to secured transactions and the filing of UCC-1 financing statements. This file serves as a repository for the information related to each individual UCC filing, allowing users to efficiently access, track, and manage UCC filings within the public record system.

  9. Corporations and Other Entities: All Filings – Entity Status History

    • data.ny.gov
    • datasets.ai
    • +2 more
    csv, xlsx, xml
    Updated Dec 2, 2025
    + more versions
    Cite
    New York State Department of State (2025). Corporations and Other Entities: All Filings – Entity Status History [Dataset]. https://data.ny.gov/Economic-Development/Corporations-and-Other-Entities-All-Filings-Entity/3gg2-jgnp
    Explore at:
    Available download formats: xlsx, csv, xml
    Dataset updated
    Dec 2, 2025
    Dataset authored and provided by
    New York State Department of State (http://www.dos.ny.gov/)
    Description

    This data contains Corporations and Other Entity Status information for each entity in the NYS DOS Corporation and Other Entities electronic database. Each line contains the Department of State ID number, Date Filed, and Status of the entity after the filing of the document indicated.

  10. Sec_Filing

    • kaggle.com
    zip
    Updated Jul 10, 2024
    Cite
    KHARANSHU VALANGAR (2024). Sec_Filing [Dataset]. https://www.kaggle.com/datasets/kharanshuvalangar/sec-filings
    Explore at:
    Available download formats: zip (33886 bytes)
    Dataset updated
    Jul 10, 2024
    Authors
    KHARANSHU VALANGAR
    License

    Open Data Commons Attribution License (ODC-By) v1.0 (https://www.opendatacommons.org/licenses/by/1.0/)
    License information was derived automatically

    Description

    This dataset consists of information related to SEC (Securities and Exchange Commission) filings, primarily focusing on various exhibits and related forms filed by different companies. The dataset contains 10,000 entries (EX-10 filings), each representing a filing with associated details such as company information, filing type, and a direct URL link to the official document.

  11. A database for blockholders in US-listed firms including all Form 13D and...

    • dataverse.harvard.edu
    Updated Aug 17, 2021
    Cite
    Jan Philipp Harries (2021). A database for blockholders in US-listed firms including all Form 13D and Form 13G filings. [Dataset]. http://doi.org/10.7910/DVN/61Z64Q
    Explore at:
    Dataset updated
    Aug 17, 2021
    Dataset provided by
    Harvard Dataverse
    Authors
    Jan Philipp Harries
    License

    https://dataverse.harvard.edu/api/datasets/:persistentId/versions/1.0/customlicense?persistentId=doi:10.7910/DVN/61Z64Q

    Description

    The Dataset contains structured, parsed data from 758,666 Form 13D and Form 13G blockholder filings from November 1993 to May 2021, downloaded from SEC Edgar. The data is made available as a RAR-compressed CSV file, in which each row represents a single filing and the 76 columns contain parsed information for each filing. Please see the accompanying paper "Determinants of Blockholdership - A new Dataset for Blockholder Analysis" for more information and cite it when using the data for your research. As stated by the SEC, "Information presented on www.sec.gov is considered public information and may be copied or further distributed by users of the web site without the SEC’s permission." (see https://www.sec.gov/privacy.htm#dissemination).

  12. Campaign Finance Database

    • s.cnmilf.com
    • data.sfgov.org
    • +2 more
    Updated Mar 29, 2025
    Cite
    data.sfgov.org (2025). Campaign Finance Database [Dataset]. https://s.cnmilf.com/user74170196/https/catalog.data.gov/dataset/campaign-finance-database
    Explore at:
    Dataset updated
    Mar 29, 2025
    Dataset provided by
    data.sfgov.org
    Description

    The campaign finance database is the San Francisco Ethics Commission's repository for campaign finance filings. It can answer questions about who is contributing money, who is receiving money, and how it is being spent. Use the campaign finance database to research campaign contributions and expenditures reported on forms provided by the Fair Political Practices Commission. The database provides live access to the Ethics Commission's records. Filings are accessible once processed/posted by the Ethics Commission.

    Forms filed with the Ethics Commission can be downloaded in PDF format. Forms filed electronically can be searched and exported in Microsoft Excel format. The following Excel exports are available:

    • Excel file based on a search of itemized transactions up to 2,000 rows (Updated immediately, with the exception of FPPC filing deadlines -- within 48 hours);
    • Excel file by year or for entire life of a single committee (Updated immediately upon filing submission); and
    • Excel file by year for all committees in a single calendar year (Updated every 24 hours).

  13. Federal Court Cases: Integrated Database Series

    • s.cnmilf.com
    • catalog.data.gov
    Updated Mar 12, 2025
    + more versions
    Cite
    Bureau of Justice Statistics (2025). Federal Court Cases: Integrated Database Series [Dataset]. https://s.cnmilf.com/user74170196/https/catalog.data.gov/dataset/federal-court-cases-integrated-database-series-34e8a
    Explore at:
    Dataset updated
    Mar 12, 2025
    Dataset provided by
    Bureau of Justice Statistics (http://bjs.ojp.gov/)
    Description

    Investigator(s): Federal Judicial Center

    The purpose of this data collection is to provide an official public record of the business of the federal courts. The data originate from 100 court offices throughout the United States. Information was obtained at two points in the life of a case: filing and termination. The termination data contain information on both filing and terminations, while the pending data contain only filing information. For the appellate and civil data, the unit of analysis is a single case. The unit of analysis for the criminal data is a single defendant.

    Years Produced: Updated bi-annually with annual data.

  14. Common Ownership Data: Scraped SEC form 13F filings for 1999-2017

    • dataverse.harvard.edu
    bin, csv +3
    Updated Aug 17, 2020
    Cite
    Harvard Dataverse (2020). Common Ownership Data: Scraped SEC form 13F filings for 1999-2017 [Dataset]. http://doi.org/10.7910/DVN/ZRH3EU
    Explore at:
    Available download formats: txt (25964), bin (323182551), txt (14847), bin (2934960), text/x-perl-script (21999), csv (2363718396), bin (271859768), txt (3008286), txt (110929), bin (4653090), txt (303881), tsv (11192545), txt (156950), txt (196510)
    Dataset updated
    Aug 17, 2020
    Dataset provided by
    Harvard Dataverse
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Time period covered
    Jan 1, 1999 - Dec 31, 2017
    Description

    Introduction

    In the course of researching the common ownership hypothesis, we found a number of issues with the Thomson Reuters (TR) "S34" dataset used by many researchers and frequently accessed via Wharton Research Data Services (WRDS). WRDS has done extensive work to improve the database, working with other researchers that have uncovered problems, specifically fixing a lack of records of BlackRock holdings. However, even with the updated dataset posted in the summer of 2018, we discovered a number of discrepancies when accessing data for constituent firms of the S&P 500 Index. We therefore set out to separately create a dataset of 13(f) holdings from the source documents, which are all public and available electronically from the Securities and Exchange Commission (SEC) website. Coverage is good starting in 1999, when electronic filing became mandatory. However, the SEC's Inspector General issued a critical report in 2010 about the information contained in 13(f) filings.

    The process

    We gathered all 13(f) filings from 1999-2017. The corpus is over 318,000 filings and occupies ~25GB of space if unzipped. (We do not include the raw filings here as they can be downloaded from EDGAR.) We wrote code to parse the filings to extract holding information using regular expressions in Perl. Our target list of holdings was all public firms with a market capitalization of at least $10M. From the header of the file, we first extract the filing date, reporting date, and reporting entity (Central Index Key, or CIK, and CIKNAME). Beginning with the September 30 2013 filing date, all filings were in XML format, which made parsing fairly straightforward, as all values are contained in tags. Prior to that date, the filings are remarkable for the heterogeneity in formatting. Several examples are linked to below.

    Our approach was to look for any lines containing a CUSIP code that we were interested in, and then attempt to determine the "number of shares" field and the "value" field. To help validate the values we extracted, we downloaded stock price data from CRSP for the filing date, as that allows for a logic check of (price * shares) = value. We do not claim that this will exhaustively extract all holding information. We can provide examples of filings that are formatted in such a way that we are not able to extract the relevant information. In both XML and non-XML filings, we attempt to remove any derivative holdings by looking for phrases such as OPT, CALL, PUT, WARR, etc.

    We then perform some final data cleaning: in the case of amended filings, we keep an amended level of holdings if the amended report a) occurred within 90 days of the reporting date and b) the initial filing fails our logic check described above. The resulting dataset has around 48M reported holdings (CIK-CUSIP) for all 76 quarters and between 4,000 and 7,000 CUSIPs and between 1,000 and 4,000 investors per quarter. We do not claim that our dataset is perfect; there are undoubtedly errors. As documented elsewhere, there are often errors in the actual source documents as well. However, our method seemed to produce more reliable data in several cases than the TR dataset, as shown in Online Appendix B of the related paper linked above.

    Included Files

    Perl Parsing Code (find_holdings_snp.pl). For reference, only needed if you wish to re-parse original filings.

    Investor holdings for 1999-2017: lightly cleaned. Each CIK-CUSIP-rdate is unique. Over 47M records. The fields are:

    • CIK: the central index key assigned by the SEC for this investor. Mapping to names is available below.
    • CUSIP: the identity of the holdings. Consult the SEC's 13(f) listings to identify your CUSIPs of interest.
    • shares: the number of shares reportedly held. Merging in CRSP data on shares outstanding at the CUSIP-Month level allows one to construct \beta. We make no distinction for the sole/shared/none voting discretion fields. If a researcher is interested, we did collect that starting in mid-2013, when filings are in XML format.
    • rdate: reporting date (end of quarter). 8 digit, YYYYMMDD.
    • fdate: filing date. 8 digit, YYYYMMDD.
    • ftype: the form name.

    Notes: we did not consolidate separate BlackRock entities (or any other possibly related entities). If one wants to do so, use the CIK-CIKname mapping file below. We drop any CUSIP-rdate observation where any investor in that CUSIP reports owning greater than 50% of shares outstanding (even though legitimate cases exist - see, for example, Diamond Offshore and Loews Corporation). We also drop any CUSIP-rdate observation where greater than 120% of shares outstanding are reported to be held by 13(f) investors. Cases where the shares held are listed as zero likely mean the investor filing lists a holding for the firm but that our code could not find the number of shares due to the formatting of the file. We leave these in the data so that any researchers that find a zero know to go back to that source filing to manually gather the holdings for the securities they are interested in.

    Processed 13f holdings (airlines.parquet, cereal.parquet, out_scrape.parquet). These are used in our related AEJ:Microeconomics paper. The files contain all firms within the airline industry, RTE cereal industry, and all large cap firms (a superset of the S&P 500) respectively. These are a merged version of the scrape_parsed.csv file described above, that include the shares outstanding and percent ownership used to calculate measures of common ownership. These are distributed as brotli compressed Apache Parquet (binary) files. This preserves date information correctly. The fields are:

    • mgrno: manager number (which is actually CIK in the scraped data)
    • rdate: reporting date
    • ncusip: cusip
    • rrdate: reporting date in Stata format
    • mgrname: manager name
    • shares: shares
    • sole: shares with sole authority
    • shared: shares with shared authority
    • none: shares with no authority
    • isbr/isfi/iss/isba/isvg: is this BlackRock, State Street, Vanguard, Barclays, Fidelity
    • numowners: how many owners
    • prc: price at reporting date
    • shares_out: shares outstanding at reporting date
    • value: reported value in 13(f)
    • beta: shares/shares_out
    • permno: permno

    Profit weight values (i.e. \kappa) for all firms in the sample (public_scrape_kappas_XXXX.parquet). Each file represents one year of data and is around 200MB and distributed as a compressed (brotli) parquet file. Fields are simply CUSIP_FROM, CUSIP_TO, KAPPA, QUARTER. Note that these have not been adjusted for multi-class share firms, insider holdings, etc. If looking at a particular market, some additional data cleaning on the investor holdings (above) followed by recomputing profit weights is recommended. For this, we did merge the separate BlackRock entities prior to computing \kappa.

    CIKmap.csv (~250K observations). Mapping is from CIK-rdate to CIKname. Use this if you want to consolidate holdings across reporting entities or explore the identities of reporting firms. In the case of amended filings that use different names than original ones, we keep the earliest name.

    Example of Parsing Challenge

    Prior to the XML era, filings were far from uniform, which creates a notable challenge for parsing them for holdings. In the examples directory we include several example text files of raw 13f filings.

    • Example 1 is a "well behaved" filing, with CUSIP, followed by value, followed by number of shares, as recommended by the SEC.
    • Example 2 shows a case where the ordering is changed: CUSIP, then shares, then value. The column headers show "item 5" coming before "item 4".
    • Example 3 shows a case of a fixed width table, which in principle could be parsed very easily using the tags at the top, although not all filings consistently use these tags.
    • Example 4 shows a case with a fixed width table, with no tag for the CUSIP column. Also, notice that if the firm holds more than 10M shares of a firm, that number occupies the entire width of the column and there is no longer a column separator (i.e. Cisco Systems on line 374).
    • Example 5 shows a comma-separated table format.
    • Example 6 shows a case of changing the column ordering, but also adding an (unrequired) column for share price.
    • Example 7 shows a case where the table is split across subsequent pages, and so the CUSIP appears on a different line than the number of shares.
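
    The ownership share and the two drop rules described in the notes can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' code (their parser is written in Perl): the function name `apply_ownership_filters` is hypothetical, the toy records are invented, and the field names follow the processed-holdings field list (mgrno, ncusip, rdate, shares, shares_out). It assumes shares_out is never zero.

```python
from collections import defaultdict


def apply_ownership_filters(holdings, max_single=0.5, max_total=1.2):
    """Add beta = shares / shares_out, then drop whole CUSIP-rdate groups
    where any investor exceeds max_single or the group total exceeds max_total."""
    groups = defaultdict(list)
    for h in holdings:
        row = dict(h, beta=h["shares"] / h["shares_out"])
        groups[(h["ncusip"], h["rdate"])].append(row)

    kept = []
    for rows in groups.values():
        betas = [r["beta"] for r in rows]
        if max(betas) > max_single or sum(betas) > max_total:
            continue  # drop every observation for this CUSIP-rdate
        kept.extend(rows)
    return kept


# Invented toy records: three securities in one quarter, 1,000 shares outstanding each
toy = [
    {"mgrno": 1, "ncusip": "AAA", "rdate": 20171231, "shares": 100, "shares_out": 1000},
    {"mgrno": 2, "ncusip": "AAA", "rdate": 20171231, "shares": 200, "shares_out": 1000},
    {"mgrno": 1, "ncusip": "BBB", "rdate": 20171231, "shares": 600, "shares_out": 1000},
    {"mgrno": 1, "ncusip": "CCC", "rdate": 20171231, "shares": 400, "shares_out": 1000},
    {"mgrno": 2, "ncusip": "CCC", "rdate": 20171231, "shares": 450, "shares_out": 1000},
    {"mgrno": 3, "ncusip": "CCC", "rdate": 20171231, "shares": 400, "shares_out": 1000},
]

kept = apply_ownership_filters(toy)
print(sorted({r["ncusip"] for r in kept}))
```

    In the toy data, "BBB" is dropped because one investor reports more than 50% of shares outstanding, and "CCC" is dropped because reported holdings sum to more than 120%; only "AAA" survives.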

  15. United States Number of Bankruptcy Filings: Annual: Non Business

    • ceicdata.com
    + more versions
    Cite
    CEICdata.com, United States Number of Bankruptcy Filings: Annual: Non Business [Dataset]. https://www.ceicdata.com/en/united-states/number-of-bankruptcy-filings/number-of-bankruptcy-filings-annual-non-business
    Explore at:
    Dataset provided by
    CEICdata.com
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Time period covered
    Dec 1, 2006 - Dec 1, 2017
    Area covered
    United States
    Variables measured
    Enterprises Statistics
    Description

    United States Number of Bankruptcy Filings: Annual: Non Business data was reported at 765,863.000 Unit in 2017. This records a decrease from the previous number of 770,846.000 Unit for 2016. United States Number of Bankruptcy Filings: Annual: Non Business data is updated yearly, averaging 873,540.000 Unit from Dec 1980 (Median) to 2017, with 38 observations. The data reached an all-time high of 2,039,214.000 Unit in 2005 and a record low of 284,517.000 Unit in 1984. United States Number of Bankruptcy Filings: Annual: Non Business data remains active status in CEIC and is reported by Administrative Office of the United States Courts. The data is categorized under Global Database’s United States – Table US.O013: Number of Bankruptcy Filings.

  16. Online Business Filing Transactions

    • data.ok.gov
    • s.cnmilf.com
    • +2 more
    csv
    Updated Oct 31, 2019
    Cite
    OKStateStat (2019). Online Business Filing Transactions [Dataset]. https://data.ok.gov/dataset/online-business-filing-transactions
    Explore at:
    Available download formats: csv
    Dataset updated
    Oct 31, 2019
    Dataset authored and provided by
    OKStateStat
    Description

    Increase the percentage of business filing transactions processed online from 33% in 2014 to 50% by 2018.

  17. United States Number of Bankruptcy Filings: Annual

    • ceicdata.com
    Updated Oct 15, 2025
    + more versions
    Cite
    CEICdata.com (2025). United States Number of Bankruptcy Filings: Annual [Dataset]. https://www.ceicdata.com/en/united-states/number-of-bankruptcy-filings/number-of-bankruptcy-filings-annual
    Explore at:
    Dataset updated
    Oct 15, 2025
    Dataset provided by
    CEICdata.com
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Time period covered
    Dec 1, 2006 - Dec 1, 2017
    Area covered
    United States
    Variables measured
    Enterprises Statistics
    Description

    United States Number of Bankruptcy Filings: Annual data was reported at 789,020.000 Unit in 2017. This records a decrease from the previous number of 794,960.000 Unit for 2016. United States Number of Bankruptcy Filings: Annual data is updated yearly, averaging 931,698.000 Unit from Dec 1980 (Median) to 2017, with 38 observations. The data reached an all-time high of 2,078,415.000 Unit in 2005 and a record low of 331,264.000 Unit in 1980. United States Number of Bankruptcy Filings: Annual data remains active status in CEIC and is reported by Administrative Office of the United States Courts. The data is categorized under Global Database’s United States – Table US.O013: Number of Bankruptcy Filings.

  18. Data from: Russian Financial Statements Database: A firm-level collection of...

    • data.niaid.nih.gov
    Updated Mar 14, 2025
    + more versions
    Cite
    Bondarkov, Sergey; Ledenev, Victor; Skougarevskiy, Dmitriy (2025). Russian Financial Statements Database: A firm-level collection of the universe of financial statements [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_14622208
    Explore at:
    Dataset updated
    Mar 14, 2025
    Dataset provided by
    European University at St. Petersburg
    Authors
    Bondarkov, Sergey; Ledenev, Victor; Skougarevskiy, Dmitriy
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) (https://creativecommons.org/licenses/by-nc-sa/4.0/)
    License information was derived automatically

    Description

    The Russian Financial Statements Database (RFSD) is an open, harmonized collection of annual unconsolidated financial statements of the universe of Russian firms:

    • 🔓 First open data set with information on every active firm in Russia.

    • 🗂️ First open financial statements data set that includes non-filing firms.

    • 🏛️ Sourced from two official data providers: the Rosstat and the Federal Tax Service.

    • 📅 Covers 2011-2023 initially, will be continuously updated.

    • 🏗️ Restores as much data as possible through non-invasive data imputation, statement articulation, and harmonization.

    The RFSD is hosted on 🤗 Hugging Face and Zenodo and is stored in a structured, column-oriented, compressed binary format Apache Parquet with yearly partitioning scheme, enabling end-users to query only variables of interest at scale.

    The accompanying paper provides internal and external validation of the data: http://arxiv.org/abs/2501.05841.

    Here we present the instructions for importing the data in R or Python environment. Please consult with the project repository for more information: http://github.com/irlcode/RFSD.

    Importing The Data

    You have two options to ingest the data: download the .parquet files manually from Hugging Face or Zenodo or rely on 🤗 Hugging Face Datasets library.

    Python

    🤗 Hugging Face Datasets

    It is as easy as:

    from datasets import load_dataset
    import polars as pl

    # This line will download 6.6GB+ of all RFSD data and store it in a 🤗 cache folder
    RFSD = load_dataset('irlspbru/RFSD')

    # Alternatively, this will download ~540MB with all financial statements for 2023
    # to a Polars DataFrame (requires about 8GB of RAM)
    RFSD_2023 = pl.read_parquet('hf://datasets/irlspbru/RFSD/RFSD/year=2023/*.parquet')

    Please note that the data is not shuffled within year, meaning that streaming first n rows will not yield a random sample.
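Because the rows are not shuffled, one way to still draw a uniform random sample from a stream is reservoir sampling (Algorithm R). The sketch below is not part of the RFSD tooling; it uses only the Python standard library, the function name `reservoir_sample` is hypothetical, and the stand-in stream holds made-up records rather than real RFSD rows.

```python
import random
from typing import Iterable, List, TypeVar

T = TypeVar("T")


def reservoir_sample(rows: Iterable[T], k: int, seed: int = 0) -> List[T]:
    """Draw k items uniformly at random from a stream of unknown length."""
    if k < 0:
        raise ValueError("k must be non-negative")
    rng = random.Random(seed)
    reservoir: List[T] = []
    for i, row in enumerate(rows):
        if i < k:
            # Fill the reservoir with the first k rows.
            reservoir.append(row)
        else:
            # Row i replaces a random slot with probability k / (i + 1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                reservoir[j] = row
    return reservoir


# Stand-in for a streamed dataset: made-up records, not real RFSD data.
stream = ({"inn": f"{n:010d}", "year": 2023} for n in range(1000))
sample = reservoir_sample(stream, k=5, seed=42)
```

The function makes a single pass and keeps only `k` rows in memory, but it must read the whole stream, so it trades download time for an unbiased sample.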

    Local File Import

    Importing in Python requires the pyarrow package to be installed.

    import pyarrow.dataset as ds
    import polars as pl

    # Read RFSD metadata from local file
    RFSD = ds.dataset("local/path/to/RFSD")

    # Use RFSD.schema to glimpse the data structure and columns' classes
    print(RFSD.schema)

    # Load full dataset into memory
    RFSD_full = pl.from_arrow(RFSD.to_table())

    # Load only 2019 data into memory
    RFSD_2019 = pl.from_arrow(RFSD.to_table(filter=ds.field('year') == 2019))

    # Load only revenue for firms in 2019, identified by taxpayer id
    RFSD_2019_revenue = pl.from_arrow(
        RFSD.to_table(
            filter=ds.field('year') == 2019,
            columns=['inn', 'line_2110']
        )
    )

    # Give suggested descriptive names to variables
    renaming_df = pl.read_csv('local/path/to/descriptive_names_dict.csv')
    RFSD_full = RFSD_full.rename({item[0]: item[1] for item in zip(renaming_df['original'], renaming_df['descriptive'])})

    R

    Local File Import

    Importing in R requires the arrow package to be installed.

    library(arrow)
    library(data.table)

    # Read RFSD metadata from local file
    RFSD <- open_dataset("local/path/to/RFSD")

    # Use schema() to glimpse into the data structure and column classes
    schema(RFSD)

    # Load full dataset into memory
    scanner <- Scanner$create(RFSD)
    RFSD_full <- as.data.table(scanner$ToTable())

    # Load only 2019 data into memory
    scan_builder <- RFSD$NewScan()
    scan_builder$Filter(Expression$field_ref("year") == 2019)
    scanner <- scan_builder$Finish()
    RFSD_2019 <- as.data.table(scanner$ToTable())

    # Load only revenue for firms in 2019, identified by taxpayer id
    scan_builder <- RFSD$NewScan()
    scan_builder$Filter(Expression$field_ref("year") == 2019)
    scan_builder$Project(cols = c("inn", "line_2110"))
    scanner <- scan_builder$Finish()
    RFSD_2019_revenue <- as.data.table(scanner$ToTable())

    # Give suggested descriptive names to variables
    renaming_dt <- fread("local/path/to/descriptive_names_dict.csv")
    setnames(RFSD_full, old = renaming_dt$original, new = renaming_dt$descriptive)

    Use Cases

    🌍 For macroeconomists: Replication of a Bank of Russia study of the cost channel of monetary policy in Russia by Mogiliat et al. (2024) — interest_payments.md

    🏭 For IO: Replication of the total factor productivity estimation by Kaukin and Zhemkova (2023) — tfp.md

    🗺️ For economic geographers: A novel model-less house-level GDP spatialization that capitalizes on geocoding of firm addresses — spatialization.md

    FAQ

    Why should I use this data instead of Interfax's SPARK, Moody's Ruslana, or Kontur's Focus?

    To the best of our knowledge, the RFSD is the only open data set with up-to-date financial statements of Russian companies published under a permissive licence. Apart from being free-to-use, the RFSD benefits from data harmonization and error detection procedures unavailable in commercial sources. Finally, the data can be easily ingested in any statistical package with minimal effort.

    What is the data period?

    We provide financials for Russian firms in 2011-2023. We will add the data for 2024 by July 2025 (see Version and Update Policy below).

    Why are there no data for firm X in year Y?

    Although the RFSD strives to be an all-encompassing database of financial statements, end users will encounter data gaps:

    • We do not include financials for firms that we considered ineligible to submit financial statements to the Rosstat/Federal Tax Service by law: financial, religious, or state organizations (state-owned commercial firms are still in the data).

    • Eligible firms may enjoy the right not to disclose under certain conditions. For instance, Gazprom did not file in 2022 and we had to impute its 2022 data from 2023 filings. Sibur filed only in 2023, Novatek only in 2020 and 2021. Commercial data providers such as Interfax's SPARK enjoy dedicated access to the Federal Tax Service data and therefore are able to source this information elsewhere.

    • A firm may have submitted its annual statement but, according to the Uniform State Register of Legal Entities (EGRUL), was not active in that year. We remove those filings.

    Why is the geolocation of firm X incorrect?

    We use Nominatim to geocode structured addresses of incorporation of legal entities from the EGRUL. There may be errors in the original addresses that prevent us from geocoding firms to a particular house. Gazprom, for instance, is geocoded to house level in 2014 and 2021-2023, but only to street level for 2015-2020 due to improper handling of the house number by Nominatim. In such cases we have fallen back to street-level geocoding. Additionally, streets in different districts of one city may share identical names. We have ignored those problems in our geocoding and invite your submissions. Finally, the address of incorporation may not correspond to plant locations. For instance, Rosneft has 62 field offices in addition to the central office in Moscow. We ignore the location of such offices in our geocoding, but subsidiaries set up as separate legal entities are still geocoded.

    Why is the data for firm X different from https://bo.nalog.ru/?

    Many firms submit correcting statements after the initial filing. Although we downloaded the data well past the April 2024 deadline for 2023 filings, firms may have kept submitting correcting statements. We will capture them in future releases.

    Why is the data for firm X unrealistic?

    We provide the source data as is, with minimal changes. Consider a relatively unknown LLC Banknota. It reported 3.7 trillion rubles in revenue in 2023, or 2% of Russia's GDP. This is obviously an outlier firm with unrealistic financials. We manually reviewed the data and flagged such firms for user consideration (variable outlier), keeping the source data intact.

    Why is the data for groups of companies different from their IFRS statements?

    We should stress that we provide unconsolidated financial statements filed according to the Russian accounting standards, meaning that it would be wrong to infer financials for corporate groups from this data. Gazprom, for instance, had over 800 affiliated entities, so studying this corporate group in its entirety requires more than the financials of the parent company.

    Why is the data not in CSV?

    The data is provided in Apache Parquet format. This is a structured, column-oriented, compressed binary format allowing for conditional subsetting of columns and rows. In other words, you can easily query financials of companies of interest, keeping only variables of interest in memory, greatly reducing data footprint.

    Version and Update Policy

    Version (SemVer): 1.0.0.

    We intend to update the RFSD annually as the data becomes available, in other words when most firms have filed their statements with the Federal Tax Service. The official deadline for filing the previous year's statements is April 1. However, every year a portion of firms either fails to meet the deadline or submits corrections afterwards. Filing continues up to the very end of the year, but after the end of April this stream quickly thins out. Nevertheless, there is an obvious trade-off between data completeness and timely version availability. We find it a reasonable compromise to query new data in early June, since on average by the end of May 96.7% of statements are already filed, including 86.4% of all correcting filings. We plan to make a new version of the RFSD available by July.

    Licence

    Creative Commons License Attribution 4.0 International (CC BY 4.0).

    Copyright © the respective contributors.

    Citation

    Please cite as:

    @unpublished{bondarkov2025rfsd,
      title={{R}ussian {F}inancial {S}tatements {D}atabase},
      author={Bondarkov, Sergey and Ledenev, Victor and Skougarevskiy, Dmitriy},
      note={arXiv preprint arXiv:2501.05841},
      doi={https://doi.org/10.48550/arXiv.2501.05841},
      year={2025}
    }

    Acknowledgments and Contacts

    Data collection and processing: Sergey Bondarkov, sbondarkov@eu.spb.ru, Viktor Ledenev, vledenev@eu.spb.ru

    Project conception, data validation, and use cases: Dmitriy Skougarevskiy, Ph.D.,

  19. N

    DOB NOW: Build – Job Application Filings

    • data.cityofnewyork.us
    • datasets.ai
    • +3more
    csv, xlsx, xml
    Updated Dec 2, 2025
    + more versions
    Cite
    Department of Buildings (DOB) (2025). DOB NOW: Build – Job Application Filings [Dataset]. https://data.cityofnewyork.us/Housing-Development/DOB-NOW-Build-Job-Application-Filings/w9ak-ipjd
    Explore at:
    Available download formats: xml, csv, xlsx
    Dataset updated
    Dec 2, 2025
    Dataset authored and provided by
    Department of Buildings (DOB)
    Description

    List of most job filings filed in DOB NOW. This dataset does not include certain types of job. For Electrical jobs, use https://data.cityofnewyork.us/browse?Data-Collection_Data-Collection=DOB+NOW+Electrical+Permits+Data. Elevator and LAA jobs will also be published separately.

  20. v

    Global import data of Filing Cabinet

    • volza.com
    csv
    Updated Oct 11, 2025
    + more versions
    Cite
    Volza FZ LLC (2025). Global import data of Filing Cabinet [Dataset]. https://www.volza.com/p/filing-cabinet/import/import-in-united-states/
    Explore at:
    Available download formats: csv
    Dataset updated
    Oct 11, 2025
    Dataset authored and provided by
    Volza FZ LLC
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Count of importers, Sum of import value, 2014-01-01/2021-09-30, Count of import shipments
    Description

    13263 Global import shipment records of Filing Cabinet with prices, volume & current Buyer's suppliers relationships based on actual Global export trade database.
