http://opendatacommons.org/licenses/dbcl/1.0/
This is a compiled dataset comprising data from various companies' 10-K annual reports and balance sheets. It is longitudinal (panel) data covering the years 2009-2022(/23), and it includes a few bankrupt companies to help investigate contributing factors. Companies are named according to their stocks and are divided into specific categories.
https://creativecommons.org/publicdomain/zero/1.0/
This dataset offers a detailed collection of US-GAAP financial data extracted from the financial statements of exchange-listed U.S. companies, as submitted to the U.S. Securities and Exchange Commission (SEC) via the EDGAR database. Covering filings from January 2009 onwards, this dataset provides key financial figures reported by companies in accordance with U.S. Generally Accepted Accounting Principles (GAAP).
This dataset primarily relies on the SEC's Financial Statement Data Sets and EDGAR APIs:
- SEC Financial Statement Data Sets
- EDGAR Application Programming Interfaces
In instances where specific figures were missing from these sources, data was directly extracted from the companies' financial statements to ensure completeness.
Please note that the dataset presents financial figures exactly as reported by the companies, which may occasionally include errors. A common issue involves incorrect reporting of scaling factors in the XBRL format. XBRL supports two tag attributes related to scaling: 'decimals' and 'scale.' The 'decimals' attribute indicates the number of significant decimal places but does not affect the actual value of the figure, while the 'scale' attribute adjusts the value by a specific factor.
However, there are several instances, numbering in the thousands, where companies have incorrectly used the 'decimals' attribute (e.g., 'decimals="-6"') under the mistaken assumption that it controls scaling. This is not correct, and as a result, some figures may be inaccurately scaled. This dataset does not attempt to detect or correct such errors; it aims to reflect the data precisely as reported by the companies. A future version of the dataset may be introduced to address and correct these issues.
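The difference between the two attributes can be illustrated with a minimal Python sketch. It assumes the inline XBRL convention that a fact's value is the displayed number multiplied by 10 to the power of 'scale', while 'decimals' only states precision. The function name `fact_value` and the sample figure are hypothetical and are not taken from the dataset or its extraction code.

```python
from decimal import Decimal
from typing import Optional


def fact_value(displayed: str, scale: int = 0, decimals: Optional[int] = None) -> Decimal:
    """Return the numeric value of a reported fact (hypothetical helper).

    Only `scale` changes the value: displayed * 10**scale.
    `decimals` is accepted but deliberately ignored, because it
    describes precision, not magnitude.
    """
    return Decimal(displayed.replace(",", "")) * (Decimal(10) ** scale)


# A figure shown as "1,234" in a statement presented in millions.
correct = fact_value("1,234", scale=6, decimals=-6)

# The mistake described above: decimals="-6" is set, but no scale is given.
mistaken = fact_value("1,234", decimals=-6)

print(correct)   # 1234000000
print(mistaken)  # 1234
```

In the mistaken case the figure is off by a factor of one million, which is the kind of error the dataset leaves uncorrected.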
The source code for data extraction is available here
The data sets below provide selected information extracted from exhibits to corporate financial reports filed with the Commission using eXtensible Business Reporting Language (XBRL).
http://opendatacommons.org/licenses/dbcl/1.0/
This dataset requires a lot of preprocessing and yields interesting EDA insights for a company. It consists of sales and profit data sorted by market segment and country/region.
Tips for pre-processing:
1. Check the column names; you will find errors there already.
2. Remove the '$' sign and '-' from all columns where they are present.
3. Change the datatype from object to int after the above two steps.
4. Challenge: try removing the ',' (comma) from all numerical values.
5. Try plotting sales and profit over time.
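Tips 1 to 4 can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the column names " Sales " and " Profit " and the sample values are hypothetical, a lone '-' is assumed to be a placeholder for zero, and amounts are assumed to be whole numbers (values with decimals would need `float` instead of `int`).

```python
def clean_amount(raw: str) -> int:
    """Convert a formatted money string such as ' $1,234 ' to an int.

    Assumption: a lone '-' (or an empty cell) stands for zero.
    """
    text = raw.strip().replace("$", "").replace(",", "").strip()
    if text in ("", "-"):
        return 0
    return int(text)


# Hypothetical rows as they might come out of csv.DictReader:
# headers carry stray spaces, values carry '$', ',' and '-'.
rows = [
    {" Sales ": " $1,234 ", " Profit ": " $-   "},
    {" Sales ": " $56,789 ", " Profit ": " $4,321 "},
]

# Tip 1: strip the column names. Tips 2-4: clean and convert each value.
cleaned = [{key.strip(): clean_amount(value) for key, value in row.items()}
           for row in rows]

print(cleaned)
# [{'Sales': 1234, 'Profit': 0}, {'Sales': 56789, 'Profit': 4321}]
```

The same function could be applied column by column in a dataframe library; the standard-library version is used here only to keep the sketch self-contained.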
Our Financial API provides access to a vast collection of historical financial statements for more than 50,000 companies listed on major exchanges. With this tool, you can easily retrieve balance sheets, income statements, and cash flow statements for any company in our database. Stay informed about the financial health of various organizations and make data-driven decisions with confidence. Our API is designed to deliver accurate and up-to-date financial information, enabling you to gain valuable insights and streamline your analysis process. Experience the convenience and reliability of our company financial API today.
https://www.aiceltech.com/terms
Korean Companies’ Financial Data provides important information to analyze a company’s financial status and performance. This data includes financial indicators such as revenue, expenses, assets, and liabilities. Collected from corporate financial reports and stock market data, it helps investors evaluate financial health and discover investment opportunities, essential for valuing Korean companies.
The Financial Statements of Holding Companies (FR Y-9 Reports) collects standardized financial statements from domestic holding companies (HCs). This is pursuant to the Bank Holding Company Act of 1956, as amended (BHC Act), and the Home Owners Loan Act (HOLA). The FR Y-9C is used to identify emerging financial risks and monitor the safety and soundness of HC operations. HCs file the FR Y-9C and FR Y-9LP quarterly, the FR Y-9SP semiannually, the FR Y-9ES annually, and the FR Y-9CS on a schedule that is determined when this supplement is used.
https://creativecommons.org/publicdomain/zero/1.0/
The Financial Statement Data Sets below provide numeric information from the face financials of all financial statements. This data is extracted from exhibits to corporate financial reports filed with the Commission using eXtensible Business Reporting Language (XBRL). As compared to the more extensive Financial Statement and Notes Data Sets, which provide the numeric and narrative disclosures from all financial statements and their notes, the Financial Statement Data Sets are more compact.
The information is presented without change from the "as filed" financial reports submitted by each registrant. The data is presented in a flattened format to help users analyze and compare corporate disclosure information over time and across registrants. The data sets also contain additional fields including a company's Standard Industrial Classification to facilitate the data's use.
Each quarter's data is stored as a JSON version of the original text files. This was necessary to limit the overall number of files. The num.txt file will likely be of most interest.
This dataset was kindly made available by the SEC. You can find the original dataset, which is updated quarterly, here.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Annual reports Assessment Dataset
This dataset will help investors, merchant bankers, credit rating agencies, and the community of equity research analysts explore annual reports in a more automated way, saving them time.
The following sub-datasets are included:
a) PDFs and corresponding OCR text of 100 Indian annual reports. These 100 annual reports are for the 100 largest companies listed on the Bombay Stock Exchange. The total number of words in the OCRed text is 12.25 million.
b) A few examples of sentences with corresponding classes. The author defined 16 topics widely used in the investment community as classes:
Accounting Standards
Accounting for Revenue Recognition
Corporate Social Responsibility
Credit Ratings
Diversity Equity and Inclusion
Electronic Voting
Environment and Sustainability
Hedging Strategy
Intellectual Property Infringement Risk
Litigation Risk
Order Book
Related Party Transaction
Remuneration
Research and Development
Talent Management
Whistle Blower Policy
These classes should help generate ideas and investment decisions, as well as identify red flags and early warning signs of trouble when everything appears to be proceeding smoothly.
ABOUT DATA ::
"scrips.json" is a JSON file with the names of the companies:
- "SC_CODE" is the BSE scrip ID
- "SC_NAME" is the listed company's name
- "NET_TURNOV" is the turnover on the day of consideration
"source_pdf" is a folder containing both the PDFs and the OCR output from Tesseract:
- "raw_pdf.zip" contains the raw PDFs and can be used to try another OCR engine
- "ocr.zip" contains a JSON file (annual_report_content.json) with the OCR text for each PDF
- "annual_report_content.json" is an array of 100 elements, each with two keys, "file_name" and "content"
"classif_data_rank_freezed.json" is used for evaluating results and contains each "sentence" and its corresponding "class"
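As a rough illustration of the layout described above, the following Python sketch reads an array of objects with the keys "file_name" and "content" and counts the words in each report. The helper name `load_reports` and the two sample entries are hypothetical; in practice the JSON text would come from annual_report_content.json inside ocr.zip.

```python
import json
from typing import Dict


def load_reports(json_text: str) -> Dict[str, str]:
    """Map each report's file_name to its OCR text (hypothetical helper)."""
    return {item["file_name"]: item["content"] for item in json.loads(json_text)}


# Hypothetical stand-in for the contents of annual_report_content.json.
sample = json.dumps([
    {"file_name": "report_a.pdf", "content": "Related party transactions were disclosed"},
    {"file_name": "report_b.pdf", "content": "The order book stood at record levels"},
])

reports = load_reports(sample)

# Whitespace-based word count per report.
word_counts = {name: len(text.split()) for name, text in reports.items()}
print(word_counts)
# {'report_a.pdf': 5, 'report_b.pdf': 7}
```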
Every public company publishes a financial report to declare the financial activities and position of a business. This financial statement contains many tables to present the information. We classify these tables into predefined categories, such as below.
1) Income Statements
2) Balance Sheets
3) Cash Flows
4) Notes
5) Others
Datasets: Within the given dataset you will find 5 folders with the above category names. Every folder contains .html files with respective tabular data.
The expected outcome is a grouping of the documents in which the files are separated by category. The categories may be used only as a benchmark for later evaluation.
Data extracted: The data was taken from the publicly available Hexaware Technologies annual financial reports, which can be found at https://hexaware.com/investors/
Thank you for your Patience, Enjoy the dataset and Explore and learn more. Peace out✌️
https://www.lseg.com/en/policies/website-disclaimer
Browse LSEG's US Company Filings Database, and find a range of filings content and history including annual reports, municipal bonds, and more.
The Financial Statements of U.S. Nonbank Subsidiaries of U.S. Holding Companies (FR Y-11; FR Y-11S) reporting forms collect financial information for individual nonfunctionally regulated U.S. nonbank subsidiaries of domestic holding companies, which is essential for monitoring the subsidiaries' potential impact on the condition of the holding company or its subsidiary banks. Holding companies file the FR Y-11 on a quarterly or annual basis or the FR Y-11S on an annual basis, predominantly based on whether the organization meets certain asset size thresholds. The FR Y-11 data are used with other holding company data to assess the condition of holding companies that are heavily engaged in nonbanking activities and to monitor the volume, nature, and condition of their nonbanking operations.
Comprehensive database of over 100,000 financial filings from 8,000+ European companies
Our comprehensive database puts over 1.5 million company financial records at your disposal. This allows you to easily search company profiles and the company directory, with 99% coverage in Malaysia.
Our database also contains company profiles of private limited and limited companies globally, with information such as shareholders and financial accounts instantly accessible.
Use the Yahoo Finance Business Information dataset to access comprehensive details on companies, including financial data and business profiles. Popular use cases include market analysis, investment research, and competitive benchmarking.
Use our Yahoo Finance Business Information dataset to access comprehensive financial and corporate data, including company profiles, stock prices, market capitalization, revenue, and key performance metrics. This dataset is tailored for financial analysts, investors, and researchers to analyze market trends and evaluate company performance.
Popular use cases include investment research, competitor benchmarking, and trend forecasting. Leverage this dataset to make informed financial decisions, identify growth opportunities, and gain a deeper understanding of the business landscape.
In the U.S., public companies, certain insiders, and broker-dealers are required to file regularly with the SEC. The SEC makes this data available online for anybody to view and use via its Electronic Data Gathering, Analysis, and Retrieval (EDGAR) database. The SEC updates this data every quarter, and the data goes back to January 2009. For more information please see this site.
To aid analysis a quick summary view of the data has been created that is not available in the original dataset. The quick summary view pulls together signals into a single table that otherwise would have to be joined from multiple tables and enables a more streamlined user experience.
DISCLAIMER: The Financial Statement and Notes Data Sets contain information derived from structured data filed with the Commission by individual registrants as well as Commission-generated filing identifiers. Because the data sets are derived from information provided by individual registrants, we cannot guarantee the accuracy of the data sets. In addition, it is possible inaccuracies or other errors were introduced into the data sets during the process of extracting the data and compiling the data sets. Finally, the data sets do not reflect all available information, including certain metadata associated with Commission filings. The data sets are intended to assist the public in analyzing data contained in Commission filings; however, they are not a substitute for such filings. Investors should review the full Commission filings before making any investment decision.
The dataset included with this article contains three files describing and defining the sample and variables for VAT impact. Excel file 1 consists of all raw and filtered data for the variables of the panel data sample. Excel file 2 depicts time-series and cross-sectional data for nonfinancial firms listed on the Saudi market for the second and third quarters of 2019 and the third and fourth quarters of 2020. Excel file 3 presents the raw data for the variables used in measuring company profitability for the panel data sample.
https://www.lseg.com/en/policies/website-disclaimer
Access historical and point-in-time financial statements, ratios, multiples, and press releases, with LSEG's S&P Compustat Database.
Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Dataset Card for Financial Fraud Labeled Dataset
Dataset Details
This dataset collects financial filings from various companies submitted to the U.S. Securities and Exchange Commission (SEC). The dataset consists of 85 companies involved in fraudulent cases and an equal number of companies not involved in fraudulent activities. The Fillings column includes information such as the company's MD&A, and financial statement over the years the company stated on the SEC… See the full description on the dataset page: https://huggingface.co/datasets/amitkedia/Financial-Fraud-Dataset.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘📊 Financial market screener’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/pierrelouisdanieau/financial-market-screener on 28 January 2022.
--- Dataset description provided by original source is as follows ---
In this dataset you will find several characteristics on global companies listed on the stock exchange. These characteristics are analyzed by millions of investors before they invest their money.
Analyze the stock market performance of thousands of companies! This is the objective of this dataset!
Among these characteristics you will find:
All this data is public data, obtained from the annual financial reports of these companies. They have been retrieved from the Yahoo Finance API and have been checked beforehand.
This dataset has been designed so that it is possible to build a recommendation engine. For example, starting from an existing position in a portfolio, it could recommend an alternative with similar characteristics (sector, market capitalization, current ratio, ...) but more in line with an investor's expectations (perhaps with less risk or more dividends, etc.).
If you have questions about this dataset, you can contact me.
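One very simple way to approach such a recommendation is a nearest-neighbour search within the same sector. The Python sketch below is only an illustration of the idea, not the dataset author's method: the field names ("ticker", "sector", "market_cap", "current_ratio", "dividend_yield"), the tickers, and all figures are hypothetical, and the distance measure is an arbitrary choice.

```python
import math


def recommend(position, universe):
    """Return the most similar same-sector company with a higher dividend yield.

    Similarity is the Euclidean distance over log10(market cap) and
    current ratio. Returns None when no candidate qualifies.
    """
    def distance(a, b):
        return math.hypot(
            math.log10(a["market_cap"]) - math.log10(b["market_cap"]),
            a["current_ratio"] - b["current_ratio"],
        )

    candidates = [
        c for c in universe
        if c["ticker"] != position["ticker"]
        and c["sector"] == position["sector"]
        and c["dividend_yield"] > position["dividend_yield"]
    ]
    return min(candidates, key=lambda c: distance(position, c), default=None)


# Hypothetical companies; none of these values come from the dataset.
universe = [
    {"ticker": "AAA", "sector": "Tech", "market_cap": 5e10, "current_ratio": 1.8, "dividend_yield": 0.5},
    {"ticker": "BBB", "sector": "Tech", "market_cap": 4e10, "current_ratio": 1.7, "dividend_yield": 2.0},
    {"ticker": "CCC", "sector": "Tech", "market_cap": 1e9, "current_ratio": 0.9, "dividend_yield": 3.5},
    {"ticker": "DDD", "sector": "Energy", "market_cap": 5e10, "current_ratio": 1.8, "dividend_yield": 4.0},
]

alternative = recommend(universe[0], universe)
print(alternative["ticker"])  # BBB
```

Here "BBB" is chosen over "CCC" because it is much closer in size and liquidity to the existing position while still paying a higher dividend.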
--- Original source retains full ownership of the source dataset ---