100+ datasets found
  1. Financial Statements of Major Companies(2009-2023)

    • kaggle.com
    Updated Dec 1, 2023
    Cite
    Rishabh Patil (2023). Financial Statements of Major Companies(2009-2023) [Dataset]. https://www.kaggle.com/datasets/rish59/financial-statements-of-major-companies2009-2023
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Dec 1, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Rishabh Patil
    License

    http://opendatacommons.org/licenses/dbcl/1.0/

    Description

    This is a compiled dataset comprising data from various companies' 10-K annual reports and balance sheets. The data is longitudinal (panel) data covering 2009-2022(/23) and also includes a few bankrupt companies to help investigate the factors involved. Companies are named according to their stock symbols and are divided into specific categories.

  2. Financial Statement Data Sets

    • catalog.data.gov
    • s.cnmilf.com
    Updated Jul 9, 2025
    Cite
    Economic and Risk Analysis (2025). Financial Statement Data Sets [Dataset]. https://catalog.data.gov/dataset/financial-statement-data-sets
    Explore at:
    Dataset updated
    Jul 9, 2025
    Dataset provided by
    Economic and Risk Analysis
    Description

    The data sets below provide selected information extracted from exhibits to corporate financial reports filed with the Commission using eXtensible Business Reporting Language (XBRL).

  3. Data from: Company Financials Dataset

    • kaggle.com
    Updated Aug 1, 2023
    Cite
    Atharva Arya (2023). Company Financials Dataset [Dataset]. https://www.kaggle.com/datasets/atharvaarya25/financials
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Aug 1, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Atharva Arya
    License

    http://opendatacommons.org/licenses/dbcl/1.0/

    Description

    This dataset requires a lot of preprocessing and rewards it with useful EDA insights for a company. It consists of sales and profit data sorted by market segment and country/region.

    Tips for pre-processing:
    1. Check the column names and look for errors there first.
    2. Remove the '$' sign and '-' from all columns where they are present.
    3. Change the datatype from object to int after the two steps above.
    4. Challenge: try removing ',' (comma) from all numeric values.
    5. Try plotting sales and profit over time.
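
    The pre-processing tips above can be sketched in plain Python. This is a minimal sketch, not the dataset author's code: the helper names `parse_amount` and `clean_header` and the sample strings are hypothetical. It assumes that a lone '-' marks an empty cell and that accounting-style parentheses mark a negative amount, and it converts to float rather than int so that decimal amounts survive.

```python
from typing import Optional


def clean_header(name: str) -> str:
    """Tip 1: column names may carry stray leading/trailing whitespace."""
    return name.strip()


def parse_amount(raw: str) -> Optional[float]:
    """Tips 2-4: turn a currency-formatted string into a number."""
    # Drop the currency sign, thousands separators, and padding.
    text = raw.strip().replace("$", "").replace(",", "").strip()
    # Assumption: an empty cell or a lone dash means "no value".
    if text in ("", "-"):
        return None
    # Assumption: parentheses denote a negative amount.
    if text.startswith("(") and text.endswith(")"):
        return -float(text[1:-1])
    return float(text)


# Hypothetical raw cells, in the style the tips describe.
samples = [" $1,618.50 ", " $-   ", "$(4,533.75)", "32370"]
print([parse_amount(s) for s in samples])
```

    With pandas, the same function could be applied to a column via `Series.map`; that step is left out here to keep the sketch free of third-party dependencies.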

  4. Company Fundamentals (Company Financials)

    • lseg.com
    Updated Nov 25, 2024
    Cite
    LSEG (2024). Company Fundamentals (Company Financials) [Dataset]. https://www.lseg.com/en/data-analytics/financial-data/company-data/company-fundamentals-data
    Explore at:
    Available download formats: csv, html, json, pdf, python, sql, text, user interface, xml
    Dataset updated
    Nov 25, 2024
    Dataset provided by
    London Stock Exchange Group (http://www.londonstockexchangegroup.com/)
    Authors
    LSEG
    License

    https://www.lseg.com/en/policies/website-disclaimer

    Description

    Company fundamentals data provides the user with a company's current financial health and when combined historically, the financial 'life-story' of the company.

  5. Financial Statements API - 50,000+ Companies Covered

    • datarade.ai
    .json, .csv
    Updated Oct 28, 2022
    Cite
    Financial Modeling Prep (2022). Financial Statements API - 50,000+ Companies Covered [Dataset]. https://datarade.ai/data-products/financial-statements-api-50-000-companies-covered-financial-modeling-prep
    Explore at:
    Available download formats: .json, .csv
    Dataset updated
    Oct 28, 2022
    Dataset authored and provided by
    Financial Modeling Prep
    Area covered
    Thailand, Greece, Spain, Switzerland, Norway, United States of America, Colombia, Hungary, Germany, Singapore
    Description

    Our Financial API provides access to a vast collection of historical financial statements for more than 50,000 companies listed on major exchanges. With this powerful tool, you can easily retrieve balance sheets, income statements, and cash flow statements for any company in our extensive database. Stay informed about the financial health of various organizations and make data-driven decisions with confidence. Our API is designed to deliver accurate and up-to-date financial information, enabling you to gain valuable insights and streamline your analysis process. Experience the convenience and reliability of our company financial API today.

  6. Financial Statement Data Sets

    • kaggle.com
    Updated Jul 4, 2025
    Cite
    Vadim Vanak (2025). Financial Statement Data Sets [Dataset]. https://www.kaggle.com/datasets/vadimvanak/company-facts-2
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Jul 4, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Vadim Vanak
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This dataset offers a detailed collection of US-GAAP financial data extracted from the financial statements of exchange-listed U.S. companies, as submitted to the U.S. Securities and Exchange Commission (SEC) via the EDGAR database. Covering filings from January 2009 onwards, this dataset provides key financial figures reported by companies in accordance with U.S. Generally Accepted Accounting Principles (GAAP).

    Dataset Features:

    • Data Scope: The dataset is restricted to figures reported under US-GAAP standards, with the exception of EntityCommonStockSharesOutstanding and EntityPublicFloat.
    • Currency and Units: The dataset exclusively includes figures reported in USD or shares, ensuring uniformity and comparability. It excludes ratios and non-financial metrics to maintain focus on financial data.
    • Company Selection: The dataset is limited to companies with U.S. exchange tickers, providing a concentrated analysis of publicly traded firms within the United States.
    • Submission Types: The dataset only incorporates data from 10-Q, 10-K, 10-Q/A, and 10-K/A filings, ensuring consistency in the type of financial reports analyzed.

    Data Sources and Extraction:

    This dataset primarily relies on the SEC's Financial Statement Data Sets and EDGAR APIs:

    • SEC Financial Statement Data Sets
    • EDGAR Application Programming Interfaces

    In instances where specific figures were missing from these sources, data was directly extracted from the companies' financial statements to ensure completeness.

    Please note that the dataset presents financial figures exactly as reported by the companies, which may occasionally include errors. A common issue involves incorrect reporting of scaling factors in the XBRL format. XBRL supports two tag attributes related to scaling: 'decimals' and 'scale.' The 'decimals' attribute indicates the number of significant decimal places but does not affect the actual value of the figure, while the 'scale' attribute adjusts the value by a specific factor.

    However, there are several instances, numbering in the thousands, where companies have incorrectly used the 'decimals' attribute (e.g., 'decimals="-6"') under the mistaken assumption that it controls scaling. This is not correct, and as a result, some figures may be inaccurately scaled. This dataset does not attempt to detect or correct such errors; it aims to reflect the data precisely as reported by the companies. A future version of the dataset may be introduced to address and correct these issues.
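
    The scaling pitfall described above can be illustrated with a small sketch. It is a simplified model of the two attributes as this description characterizes them, not the dataset's extraction code. The function names are hypothetical, and the sketch assumes that 'scale' is a power-of-ten exponent applied to the displayed number, while 'decimals' only states precision.

```python
def reported_value(displayed: float, scale: int = 0) -> float:
    """Value of a fact: the displayed number multiplied by 10**scale.

    Assumption: 'scale' is a power-of-ten exponent; 'decimals' is
    deliberately not a parameter because it never changes the value.
    """
    return displayed * 10 ** scale


def precision_step(decimals: int) -> float:
    """Granularity implied by 'decimals' (-6 means 'to the nearest million')."""
    return 10.0 ** (-decimals)


# A filer wants to report 1,250 million USD and shows "1,250" in the statement.
correct = reported_value(1250, scale=6)  # scale does the multiplication
mistaken = reported_value(1250)          # only decimals="-6" was set, no scale

print(correct, mistaken, precision_step(-6))
```

    In the mistaken case the figure stays at 1,250 while claiming million-level precision, which is the kind of inaccurately scaled value the description warns about.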

    The source code for data extraction is available here

  7. European Financial Filings Database

    • financialreports.eu
    json
    Updated 2024
    Cite
    FinancialReports UG (2024). European Financial Filings Database [Dataset]. https://financialreports.eu/
    Explore at:
    Available download formats: json
    Dataset updated
    2024
    Dataset authored and provided by
    FinancialReports UG
    Time period covered
    2022 - 2024
    Area covered
    Europe
    Description

    Comprehensive database of over 100,000 financial filings from 8,000+ European companies

  8. S.Korea Financial statements datasets

    • aiceltech.com
    Updated Jun 23, 2024
    Cite
    KED Aicel (2024). S.Korea Financial statements datasets [Dataset]. https://www.aiceltech.com/datasets/financial-statements
    Explore at:
    Dataset updated
    Jun 23, 2024
    Dataset authored and provided by
    KED Aicel
    License

    https://www.aiceltech.com/terms

    Time period covered
    2016 - 2024
    Area covered
    South Korea
    Description

    Korean Companies’ Financial Data provides important information to analyze a company’s financial status and performance. This data includes financial indicators such as revenue, expenses, assets, and liabilities. Collected from corporate financial reports and stock market data, it helps investors evaluate financial health and discover investment opportunities, essential for valuing Korean companies.

  9. American Express Company Financial Statements

    • bullfincher.io
    Cite
    Bullfincher, American Express Company Financial Statements [Dataset]. https://bullfincher.io/companies/american-express-company/financial-statements
    Explore at:
    Dataset authored and provided by
    Bullfincher
    License

    https://bullfincher.io/privacy-policy

    Description

    Get detailed American Express Company Financial Statements 2020-2024. Find the income statements, balance sheet, cashflow, profitability, and other key ratios.

  10. Consolidated Financial Statements for Bank Holding Companies, Parent Company...

    • catalog.data.gov
    Updated Dec 18, 2024
    Cite
    Board of Governors of the Federal Reserve System (2024). Consolidated Financial Statements for Bank Holding Companies, Parent Company Only Financial Statements for Large Holding Companies, Parent Company Only Financial Statements for Small Holding Companies, Financial Statements Employee Stock Ownership Plan Holding Companies, Supplement to the Consolidated Financial Statements for Bank Holding Companies [Dataset]. https://catalog.data.gov/dataset/consolidated-financial-statements-for-bank-holding-companies-parent-company-only-financial
    Explore at:
    Dataset updated
    Dec 18, 2024
    Dataset provided by
    Federal Reserve System (http://www.federalreserve.gov/)
    Federal Reserve Board of Governors
    Description

    The Financial Statements of Holding Companies (FR Y-9 Reports) collect standardized financial statements from domestic holding companies (HCs). This collection is pursuant to the Bank Holding Company Act of 1956, as amended (BHC Act), and the Home Owners' Loan Act (HOLA). The FR Y-9C is used to identify emerging financial risks and monitor the safety and soundness of HC operations. HCs file the FR Y-9C and FR Y-9LP quarterly, the FR Y-9SP semiannually, the FR Y-9ES annually, and the FR Y-9CS on a schedule that is determined when this supplement is used.

  11. Data from: Company Data

    • lseg.com
    Updated Nov 19, 2023
    Cite
    LSEG (2023). Company Data [Dataset]. https://www.lseg.com/en/data-analytics/financial-data/company-data
    Explore at:
    Dataset updated
    Nov 19, 2023
    Dataset provided by
    London Stock Exchange Group (http://www.londonstockexchangegroup.com/)
    Authors
    LSEG
    License

    https://www.lseg.com/en/policies/website-disclaimer

    Description

    LSEG's Company Data offers an extensive portfolio of content about companies including estimates, filings and ESG. Browse the catalogue.

  12. Hewlett Packard Enterprise Company Financial Statements

    • bullfincher.io
    Updated Jun 11, 2025
    Cite
    Bullfincher (2025). Hewlett Packard Enterprise Company Financial Statements [Dataset]. https://bullfincher.io/companies/hewlett-packard-enterprise-company/financial-statements
    Explore at:
    Dataset updated
    Jun 11, 2025
    Dataset authored and provided by
    Bullfincher
    License

    https://bullfincher.io/privacy-policy

    Description

    Get detailed Hewlett Packard Enterprise Company Financial Statements 2020-2024. Find the income statements, balance sheet, cashflow, profitability, and other key ratios.

  13. Ford Motor Company Financial Statements

    • bullfincher.io
    Updated Aug 8, 2025
    Cite
    Bullfincher (2025). Ford Motor Company Financial Statements [Dataset]. https://bullfincher.io/companies/ford-motor-company/financial-statements
    Explore at:
    Dataset updated
    Aug 8, 2025
    Dataset authored and provided by
    Bullfincher
    License

    https://bullfincher.io/privacy-policy

    Description

    Get detailed Ford Motor Company Financial Statements 2020-2024. Find the income statements, balance sheet, cashflow, profitability, and other key ratios.

  14. US Company Filings Database

    • lseg.com
    Updated Feb 3, 2025
    Cite
    LSEG (2025). US Company Filings Database [Dataset]. https://www.lseg.com/en/data-analytics/financial-data/filings/company-filings-database
    Explore at:
    Available download formats: csv, html, json, pdf, python, text, user interface, xml
    Dataset updated
    Feb 3, 2025
    Dataset provided by
    London Stock Exchange Group (http://www.londonstockexchangegroup.com/)
    Authors
    LSEG
    License

    https://www.lseg.com/en/policies/website-disclaimer

    Description

    Browse LSEG's US Company Filings Database, and find a range of filings content and history including annual reports, municipal bonds, and more.

  15. Fastenal Company Financial Statements

    • bullfincher.io
    Updated Jul 31, 2025
    Cite
    Bullfincher (2025). Fastenal Company Financial Statements [Dataset]. https://bullfincher.io/companies/fastenal-company/financial-statements
    Explore at:
    Dataset updated
    Jul 31, 2025
    Dataset authored and provided by
    Bullfincher
    License

    https://bullfincher.io/privacy-policy

    Description

    Get detailed Fastenal Company Financial Statements 2020-2024. Find the income statements, balance sheet, cashflow, profitability, and other key ratios.

  16. FirstRate Data - US Fundamental Data (Historical Financial Data for 30 Years...

    • datarade.ai
    .xls
    Updated Dec 20, 2020
    Cite
    FirstRate Data (2020). FirstRate Data - US Fundamental Data (Historical Financial Data for 30 Years Quarterly Financials for 5500 Tickers) [Dataset]. https://datarade.ai/data-products/us-fundamental-data-30-years-quarterly-financials-for-5500-tickers-firstrate-data
    Explore at:
    Available download formats: .xls
    Dataset updated
    Dec 20, 2020
    Dataset authored and provided by
    FirstRate Data
    Area covered
    United States
    Description
    • Data from Dec 1989 to Dec 2020.
    • Includes Income Statement, Balance Sheet, and Cashflow statement.
    • Adjusted for restatements.
    • Includes valuation metrics such as enterprise valuation and market capitalization.
    • Over 30 ratios such as p/e ratio, EBITDA/sales, gross margin, etc.
    • Standardized categories for comparison between companies.
  17. S&P Compustat Database

    • lseg.com
    sql
    Updated Nov 25, 2024
    Cite
    LSEG (2024). S&P Compustat Database [Dataset]. https://www.lseg.com/en/data-analytics/financial-data/company-data/fundamentals-data/standardized-fundamentals/sp-compustat-database
    Explore at:
    Available download formats: sql
    Dataset updated
    Nov 25, 2024
    Dataset provided by
    London Stock Exchange Group (http://www.londonstockexchangegroup.com/)
    Authors
    LSEG
    License

    https://www.lseg.com/en/policies/website-disclaimer

    Description

    Access historical and point-in-time financial statements, ratios, multiples, and press releases, with LSEG's S&P Compustat Database.

  18. McCormick & Company, Incorporated Financial Statements

    • bullfincher.io
    Updated Jun 2, 2025
    Cite
    Bullfincher (2025). McCormick & Company, Incorporated Financial Statements [Dataset]. https://bullfincher.io/companies/mccormick-company-incorporated/financial-statements
    Explore at:
    Dataset updated
    Jun 2, 2025
    Dataset authored and provided by
    Bullfincher
    License

    https://bullfincher.io/privacy-policy

    Description

    Get detailed McCormick & Company, Incorporated Financial Statements 2020-2024. Find the income statements, balance sheet, cashflow, profitability, and other key ratios.

  19. Financial Services Commission_Corporate financial information

    • data.go.kr
    json+xml
    Updated Jul 17, 2025
    Cite
    (2025). Financial Services Commission_Corporate financial information [Dataset]. https://www.data.go.kr/en/data/15043459/openapi.do
    Explore at:
    Available download formats: json+xml
    Dataset updated
    Jul 17, 2025
    License

    https://data.go.kr/ugs/selectPortalPolicyView.do

    Description

    Corporate financial information is data that allows you to search for corporate financial statement items based on the corporate registration number and fiscal year. The items provided include not only summarized financial information such as the company's sales, operating profit, total assets, total liabilities, and capital, but also financial statements and income statement items by account subject. The data consists of three operations: summary financial statement inquiry, financial statement inquiry, and income statement inquiry, and comparative figures for the previous year, current year, and previous quarter are also provided for each item. This data can be used for various financial analyses such as corporate management performance analysis, financial soundness evaluation, investment risk analysis, and financial comparison between companies.

  20. Data from: Russian Financial Statements Database: A firm-level collection of...

    • data.niaid.nih.gov
    Updated Mar 14, 2025
    Cite
    Skougarevskiy, Dmitriy (2025). Russian Financial Statements Database: A firm-level collection of the universe of financial statements [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_14622208
    Explore at:
    Dataset updated
    Mar 14, 2025
    Dataset provided by
    Ledenev, Victor
    Skougarevskiy, Dmitriy
    Bondarkov, Sergey
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Area covered
    Russia
    Description

    The Russian Financial Statements Database (RFSD) is an open, harmonized collection of annual unconsolidated financial statements of the universe of Russian firms:

    • 🔓 First open data set with information on every active firm in Russia.

    • 🗂️ First open financial statements data set that includes non-filing firms.

    • 🏛️ Sourced from two official data providers: the Rosstat and the Federal Tax Service.

    • 📅 Covers 2011-2023 initially, will be continuously updated.

    • 🏗️ Restores as much data as possible through non-invasive data imputation, statement articulation, and harmonization.

    The RFSD is hosted on 🤗 Hugging Face and Zenodo and is stored in a structured, column-oriented, compressed binary format Apache Parquet with yearly partitioning scheme, enabling end-users to query only variables of interest at scale.

    The accompanying paper provides internal and external validation of the data: http://arxiv.org/abs/2501.05841.

    Here we present instructions for importing the data into an R or Python environment. Please consult the project repository for more information: http://github.com/irlcode/RFSD.

    Importing The Data

    You have two options to ingest the data: download the .parquet files manually from Hugging Face or Zenodo or rely on 🤗 Hugging Face Datasets library.

    Python

    🤗 Hugging Face Datasets

    It is as easy as:

    from datasets import load_dataset
    import polars as pl

    # This line will download 6.6GB+ of all RFSD data and store it in a 🤗 cache folder
    RFSD = load_dataset('irlspbru/RFSD')

    # Alternatively, this will download ~540MB with all financial statements for 2023
    # to a Polars DataFrame (requires about 8GB of RAM)
    RFSD_2023 = pl.read_parquet('hf://datasets/irlspbru/RFSD/RFSD/year=2023/*.parquet')

    Please note that the data is not shuffled within year, meaning that streaming first n rows will not yield a random sample.
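
    Because rows are not shuffled within a year, one way to obtain an unbiased sample while streaming is reservoir sampling. This is a standard technique swapped in here for illustration, not something the RFSD authors prescribe. In this minimal sketch the function name `reservoir_sample` is hypothetical, and any iterable of rows can stand in for the streamed dataset.

```python
import random
from typing import Iterable, List, TypeVar

T = TypeVar("T")


def reservoir_sample(rows: Iterable[T], k: int, seed: int = 0) -> List[T]:
    """Draw a uniform random sample of up to k rows in a single pass."""
    rng = random.Random(seed)
    sample: List[T] = []
    for i, row in enumerate(rows):
        if i < k:
            # Fill the reservoir with the first k rows.
            sample.append(row)
        else:
            # Keep the new row with probability k / (i + 1) by picking a
            # slot uniformly from 0..i and replacing only if it is in range.
            j = rng.randint(0, i)
            if j < k:
                sample[j] = row
    return sample


# Stand-in for a stream of rows; with real data this would be the row iterator.
print(reservoir_sample(range(1_000_000), 5))
```

    The sample is uniform over everything the iterator yields, so the whole stream still has to be read once. Memory use stays at k rows.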

    Local File Import

    Importing in Python requires the pyarrow package to be installed.

    import pyarrow.dataset as ds
    import polars as pl

    # Read RFSD metadata from local file
    RFSD = ds.dataset("local/path/to/RFSD")

    # Use RFSD.schema to glimpse the data structure and columns' classes
    print(RFSD.schema)

    # Load full dataset into memory
    RFSD_full = pl.from_arrow(RFSD.to_table())

    # Load only 2019 data into memory
    RFSD_2019 = pl.from_arrow(RFSD.to_table(filter=ds.field('year') == 2019))

    # Load only revenue for firms in 2019, identified by taxpayer id
    RFSD_2019_revenue = pl.from_arrow(
        RFSD.to_table(
            filter=ds.field('year') == 2019,
            columns=['inn', 'line_2110']
        )
    )

    # Give suggested descriptive names to variables
    renaming_df = pl.read_csv('local/path/to/descriptive_names_dict.csv')
    RFSD_full = RFSD_full.rename({item[0]: item[1] for item in zip(renaming_df['original'], renaming_df['descriptive'])})

    R

    Local File Import

    Importing in R requires the arrow package to be installed.

    library(arrow)
    library(data.table)

    # Read RFSD metadata from local file
    RFSD <- open_dataset("local/path/to/RFSD")

    # Use schema() to glimpse into the data structure and column classes
    schema(RFSD)

    # Load full dataset into memory
    scanner <- Scanner$create(RFSD)
    RFSD_full <- as.data.table(scanner$ToTable())

    # Load only 2019 data into memory
    scan_builder <- RFSD$NewScan()
    scan_builder$Filter(Expression$field_ref("year") == 2019)
    scanner <- scan_builder$Finish()
    RFSD_2019 <- as.data.table(scanner$ToTable())

    # Load only revenue for firms in 2019, identified by taxpayer id
    scan_builder <- RFSD$NewScan()
    scan_builder$Filter(Expression$field_ref("year") == 2019)
    scan_builder$Project(cols = c("inn", "line_2110"))
    scanner <- scan_builder$Finish()
    RFSD_2019_revenue <- as.data.table(scanner$ToTable())

    # Give suggested descriptive names to variables
    renaming_dt <- fread("local/path/to/descriptive_names_dict.csv")
    setnames(RFSD_full, old = renaming_dt$original, new = renaming_dt$descriptive)

    Use Cases

    🌍 For macroeconomists: Replication of a Bank of Russia study of the cost channel of monetary policy in Russia by Mogiliat et al. (2024) — interest_payments.md

    🏭 For IO: Replication of the total factor productivity estimation by Kaukin and Zhemkova (2023) — tfp.md

    🗺️ For economic geographers: A novel model-less house-level GDP spatialization that capitalizes on geocoding of firm addresses — spatialization.md

    FAQ

    Why should I use this data instead of Interfax's SPARK, Moody's Ruslana, or Kontur's Focus?

    To the best of our knowledge, the RFSD is the only open data set with up-to-date financial statements of Russian companies published under a permissive licence. Apart from being free-to-use, the RFSD benefits from data harmonization and error detection procedures unavailable in commercial sources. Finally, the data can be easily ingested in any statistical package with minimal effort.

    What is the data period?

    We provide financials for Russian firms in 2011-2023. We will add the data for 2024 by July, 2025 (see Version and Update Policy below).

    Why are there no data for firm X in year Y?

    Although the RFSD strives to be an all-encompassing database of financial statements, end users will encounter data gaps:

    • We do not include financials for firms that we considered ineligible to submit financial statements to the Rosstat/Federal Tax Service by law: financial, religious, or state organizations (state-owned commercial firms are still in the data).

    • Eligible firms may enjoy the right not to disclose under certain conditions. For instance, Gazprom did not file in 2022 and we had to impute its 2022 data from 2023 filings. Sibur filed only in 2023, and Novatek only in 2020 and 2021. Commercial data providers such as Interfax's SPARK enjoy dedicated access to the Federal Tax Service data and are therefore able to source this information elsewhere.

    • A firm may have submitted its annual statement but, according to the Uniform State Register of Legal Entities (EGRUL), was not active in that year. We remove those filings.

    Why is the geolocation of firm X incorrect?

    We use Nominatim to geocode structured addresses of incorporation of legal entities from the EGRUL. There may be errors in the original addresses that prevent us from geocoding firms to a particular house. Gazprom, for instance, is geocoded to house level in 2014 and 2021-2023, but only to street level for 2015-2020 due to improper handling of the house number by Nominatim; in such cases we fall back to street-level geocoding. Additionally, streets in different districts of one city may share identical names. We have ignored those problems in our geocoding and invite your submissions. Finally, the address of incorporation may not correspond to plant locations. For instance, Rosneft has 62 field offices in addition to the central office in Moscow. We ignore the location of such offices in our geocoding, but subsidiaries set up as separate legal entities are still geocoded.

    Why is the data for firm X different from https://bo.nalog.ru/?

    Many firms submit correcting statements after the initial filing. While we downloaded the data well past the April 2024 deadline for 2023 filings, firms may have continued to submit correcting statements. We will capture them in future releases.

    Why is the data for firm X unrealistic?

    We provide the source data as is, with minimal changes. Consider a relatively unknown LLC Banknota. It reported 3.7 trillion rubles in revenue in 2023, or 2% of Russia's GDP. This is obviously an outlier firm with unrealistic financials. We manually reviewed the data and flagged such firms for user consideration (variable outlier), keeping the source data intact.

    Why is the data for groups of companies different from their IFRS statements?

    We should stress that we provide unconsolidated financial statements filed according to the Russian accounting standards, meaning that it would be wrong to infer financials for corporate groups with this data. Gazprom, for instance, had over 800 affiliated entities and to study this corporate group in its entirety it is not enough to consider financials of the parent company.

    Why is the data not in CSV?

    The data is provided in Apache Parquet format. This is a structured, column-oriented, compressed binary format allowing for conditional subsetting of columns and rows. In other words, you can easily query financials of companies of interest, keeping only variables of interest in memory, greatly reducing data footprint.

    Version and Update Policy

    Version (SemVer): 1.0.0.

    We intend to update the RFSD annually as the data becomes available, in other words when most firms have filed their statements with the Federal Tax Service. The official deadline for filing the previous year's statements is April 1. However, every year a portion of firms either fails to meet the deadline or submits corrections afterwards. Filing continues up to the very end of the year, but after the end of April this stream quickly thins out. Nevertheless, there is obviously a trade-off between data completeness and version availability. We find it a reasonable compromise to query new data in early June, since on average 96.7% of statements are already filed by the end of May, including 86.4% of all correcting filings. We plan to make a new version of the RFSD available by July.

    Licence

    Creative Commons License Attribution 4.0 International (CC BY 4.0).

    Copyright © the respective contributors.

    Citation

    Please cite as:

    @unpublished{bondarkov2025rfsd,
      title={{R}ussian {F}inancial {S}tatements {D}atabase},
      author={Bondarkov, Sergey and Ledenev, Victor and Skougarevskiy, Dmitriy},
      note={arXiv preprint arXiv:2501.05841},
      doi={https://doi.org/10.48550/arXiv.2501.05841},
      year={2025}
    }

    Acknowledgments and Contacts

    Data collection and processing: Sergey Bondarkov, sbondarkov@eu.spb.ru, Viktor Ledenev, vledenev@eu.spb.ru

    Project conception, data validation, and use cases: Dmitriy Skougarevskiy, Ph.D.,
