The data sets below provide selected information extracted from exhibits to corporate financial reports filed with the Commission using eXtensible Business Reporting Language (XBRL).
https://www.usa.gov/government-works/
This dataset is from the SEC's Financial Statements and Notes Data Set.
It was a personal project to see if I could make the queries efficient.
It has been collecting dust ever since; maybe someone will make good use of it.
The data runs through approximately early 2024.
It does not differ from the source except that it is already compiled, so you can try it out and then compile your own from the source link below.
Dataset was created using SEC Files and SQL Server on Docker.
For details on the SQL Server database this came from, see the "dataset-previous-life-info" folder, which contains:
- Row Counts
- Primary/Foreign Keys
- SQL Statements to recreate database tables
- Example queries on how to join the data tables.
- A pretty picture of the table associations.
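As a rough illustration of the kind of join those example queries cover, here is a minimal sketch. It uses Python's built-in SQLite instead of SQL Server so it runs anywhere. The table names (`sub`, `num`), the column names, and the sample rows are assumptions, loosely modeled on the SEC's published layout in which filings and numeric facts share an accession number (`adsh`). The actual schema is the one documented in the folder above.

```python
import sqlite3

# Hypothetical, simplified schema: one row per filing (sub) and
# one row per reported numeric fact (num), linked by accession number.
SCHEMA = """
CREATE TABLE sub (
    adsh TEXT PRIMARY KEY,   -- accession number of the filing
    cik  INTEGER,            -- registrant identifier
    name TEXT,               -- registrant name
    form TEXT                -- form type, e.g. 10-K
);
CREATE TABLE num (
    adsh  TEXT REFERENCES sub(adsh),
    tag   TEXT,              -- reported concept, e.g. Revenues
    ddate TEXT,              -- period end date, YYYYMMDD
    value REAL
);
"""

def build_sample_db():
    """Create an in-memory database with made-up rows for illustration."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO sub VALUES (?, ?, ?, ?)",
        [
            ("0000000001-24-000001", 1, "Example Corp", "10-K"),
            ("0000000002-24-000001", 2, "Sample Inc", "10-K"),
        ],
    )
    conn.executemany(
        "INSERT INTO num VALUES (?, ?, ?, ?)",
        [
            ("0000000001-24-000001", "Revenues", "20231231", 1000.0),
            ("0000000001-24-000001", "Assets", "20231231", 5000.0),
            ("0000000002-24-000001", "Revenues", "20231231", 250.0),
        ],
    )
    return conn

def facts_for_tag(conn, tag):
    """Join numeric facts to their filings and return rows for one tag."""
    query = """
        SELECT s.name, n.tag, n.ddate, n.value
        FROM num AS n
        JOIN sub AS s ON s.adsh = n.adsh
        WHERE n.tag = ?
        ORDER BY s.name
    """
    return conn.execute(query, (tag,)).fetchall()

conn = build_sample_db()
rows = facts_for_tag(conn, "Revenues")
for row in rows:
    print(row)
```

Filtering on the tag before joining keeps the result small; on the full data set an index on the join key would likely matter more than the query shape.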
Source: https://www.sec.gov/data-research/financial-statement-notes-data-sets
Happy coding!
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Annual reports Assessment Dataset
This dataset will help investors, merchant bankers, credit rating agencies, and the community of equity research analysts explore annual reports in a more automated way, saving them time.
The dataset contains the following sub-datasets:
a) PDFs and corresponding OCR text of 100 Indian annual reports. These 100 annual reports are for the 100 largest companies listed on the Bombay Stock Exchange. The OCR text totals 12.25 million words.
b) A few examples of sentences with corresponding classes. The author defined 16 topics widely used in the investment community as classes:
Accounting Standards
Accounting for Revenue Recognition
Corporate Social Responsibility
Credit Ratings
Diversity Equity and Inclusion
Electronic Voting
Environment and Sustainability
Hedging Strategy
Intellectual Property Infringement Risk
Litigation Risk
Order Book
Related Party Transaction
Remuneration
Research and Development
Talent Management
Whistle Blower Policy
These classes should help generate ideas and investment decisions, as well as identify red flags and early warning signs of trouble when everything appears to be proceeding smoothly.
ABOUT DATA:
"scrips.json" is a JSON file listing the companies:
- "SC_CODE" is the BSE scrip ID
- "SC_NAME" is the listed company's name
- "NET_TURNOV" is the turnover on the day of consideration
"source_pdf" is a folder containing both the PDFs and the OCR output from Tesseract:
- "raw_pdf.zip" contains the raw PDFs and can be used to try another OCR engine.
- "ocr.zip" contains a JSON file (annual_report_content.json) with the OCR text for each PDF.
- "annual_report_content.json" is an array of 100 elements; each element has two keys, "file_name" and "content".
"classif_data_rank_freezed.json" is used for evaluating results; it contains each "sentence" and its corresponding "class".
Success.ai offers a cutting-edge solution for businesses and organizations seeking Company Financial Data on private and public companies. Our comprehensive database is meticulously crafted to provide verified profiles, including contact details for financial decision-makers such as CFOs, financial analysts, corporate treasurers, and other key stakeholders. This robust dataset is continuously updated and validated using AI technology to ensure accuracy and relevance, empowering businesses to make informed decisions and optimize their financial strategies.
Key Features of Success.ai's Company Financial Data:
Global Coverage: Access data from over 70 million businesses worldwide, including public and private companies across all major industries and regions. Our datasets span 250+ countries, offering extensive reach for your financial analysis and market research.
Detailed Financial Profiles: Gain insights into company financials, including revenue, profit margins, funding rounds, and operational costs. Profiles are enriched with key contact details, including work emails, phone numbers, and physical addresses, ensuring direct access to decision-makers.
Industry-Specific Data: Tailored datasets for sectors such as financial services, manufacturing, technology, healthcare, and energy, among others. Each dataset is customized to meet the unique needs of industry professionals and analysts.
Real-Time Accuracy: With continuous updates powered by AI-driven validation, our financial data maintains a 99% accuracy rate, ensuring you have access to the most reliable and up-to-date information available.
Compliance and Security: All data is collected and processed in strict adherence to global compliance standards, including GDPR, ensuring ethical and lawful usage.
Why Choose Success.ai for Company Financial Data?
Best Price Guarantee: We pride ourselves on offering the most competitive pricing in the industry, ensuring you receive unparalleled value for comprehensive financial data.
AI-Validated Accuracy: Our advanced AI algorithms meticulously verify every data point to ensure precision and reliability, helping you avoid costly errors in your financial decision-making.
Customized Data Solutions: Whether you need data for a specific region, industry, or type of business, we tailor our datasets to align perfectly with your requirements.
Scalable Data Access: From small startups to global enterprises, our platform caters to businesses of all sizes, delivering scalable solutions to suit your operational needs.
Comprehensive Use Cases for Financial Data:
Leverage our detailed financial profiles to create accurate budgets, forecasts, and strategic plans. Gain insights into competitors’ financial health and market positions to make data-driven decisions.
Access key financial details and contact information to streamline your M&A processes. Identify potential acquisition targets or partners with verified profiles and financial data.
Evaluate the financial performance of public and private companies for informed investment decisions. Use our data to identify growth opportunities and assess risk factors.
Enhance your sales outreach by targeting CFOs, financial analysts, and other decision-makers with verified contact details. Utilize accurate email and phone data to increase conversion rates.
Understand market trends and financial benchmarks with our industry-specific datasets. Use the data for competitive analysis, benchmarking, and identifying market gaps.
APIs to Power Your Financial Strategies:
Enrichment API: Integrate real-time updates into your systems with our Enrichment API. Keep your financial data accurate and current to drive dynamic decision-making and maintain a competitive edge.
Lead Generation API: Supercharge your lead generation efforts with access to verified contact details for key financial decision-makers. Perfect for personalized outreach and targeted campaigns.
Tailored Solutions for Industry Professionals:
Financial Services Firms: Gain detailed insights into revenue streams, funding rounds, and operational costs for competitor analysis and client acquisition.
Corporate Finance Teams: Enhance decision-making with precise data on industry trends and benchmarks.
Consulting Firms: Deliver informed recommendations to clients with access to detailed financial datasets and key stakeholder profiles.
Investment Firms: Identify potential investment opportunities with verified data on financial performance and market positioning.
What Sets Success.ai Apart?
Extensive Database: Access detailed financial data for 70M+ companies worldwide, including small businesses, startups, and large corporations.
Ethical Practices: Our data collection and processing methods are fully comp...
https://www.aiceltech.com/terms
Korean Companies’ Financial Data provides important information to analyze a company’s financial status and performance. This data includes financial indicators such as revenue, expenses, assets, and liabilities. Collected from corporate financial reports and stock market data, it helps investors evaluate financial health and discover investment opportunities, essential for valuing Korean companies.
https://www.lseg.com/en/policies/website-disclaimer
Browse LSEG's US Company Filings Database, and find a range of filings content and history including annual reports, municipal bonds, and more.
The Corporate Financial Fraud project is a study of company and top-executive characteristics of firms that ultimately violated Securities and Exchange Commission (SEC) financial accounting and securities fraud provisions, compared with a sample of public companies that did not. The fraud firm sample was identified through systematic review of SEC accounting enforcement releases from 2005 to 2010, which included administrative and civil actions, and referrals for criminal prosecution that were identified through mentions in enforcement releases, indictments, and news searches. The non-fraud firms were randomly selected from among nearly 10,000 US public companies censused and active during at least one year between 2005 and 2010 in Standard and Poor's Compustat data. The Company and Top-Executive (CEO) databases combine information from numerous publicly available sources, many in raw form that were hand-coded (e.g., for fraud firms: Accounting and Auditing Enforcement Releases (AAERs), investigation summaries, SEC-filed complaints, litigation proceedings and case outcomes). Financial and structural information on companies for the year leading up to the financial fraud (or around year 2000 for non-fraud firms) was collected from Compustat financial statement data on Form 10-Ks, and supplemented by hand-collected data from original company 10-Ks, proxy statements, or other financial reports accessed via Electronic Data Gathering, Analysis, and Retrieval (EDGAR), the SEC's data-gathering search tool. For CEOs, data on personal background characteristics were collected from Execucomp and BoardEx databases, supplemented by hand-collection from proxy-statement biographies.
https://brightdata.com/license
Stay informed with our comprehensive Financial News Dataset, designed for investors, analysts, and businesses to track market trends, monitor financial events, and make data-driven decisions.
Dataset Features
- Financial News Articles: Access structured financial news data, including headlines, summaries, full articles, publication dates, and source details.
- Market & Economic Indicators: Track financial reports, stock market updates, economic forecasts, and corporate earnings announcements.
- Sentiment & Trend Analysis: Analyze news sentiment, categorize articles by financial topics, and monitor emerging trends in global markets.
- Historical & Real-Time Data: Retrieve historical financial news archives or access continuously updated feeds for real-time insights.
Customizable Subsets for Specific Needs
Our Financial News Dataset is fully customizable, allowing you to filter data based on publication date, region, financial topics, sentiment, or specific news sources. Whether you need broad coverage for market research or focused data for investment analysis, we tailor the dataset to your needs.
Popular Use Cases
- Investment Strategy & Risk Management: Monitor financial news to assess market risks, identify investment opportunities, and optimize trading strategies.
- Market & Competitive Intelligence: Track industry trends, competitor financial performance, and economic developments.
- AI & Machine Learning Training: Use structured financial news data to train AI models for sentiment analysis, stock prediction, and automated trading.
- Regulatory & Compliance Monitoring: Stay updated on financial regulations, policy changes, and corporate governance news.
- Economic Research & Forecasting: Analyze financial news trends to predict economic shifts and market movements.
Whether you're tracking stock market trends, analyzing financial sentiment, or training AI models, our Financial News Dataset provides the structured data you need. Get started today and customize your dataset to fit your business objectives.
Problem Statement
A financial services firm faced inefficiencies in generating accurate and timely financial reports. The manual reporting process was labor-intensive, prone to errors, and delayed decision-making. With increasing data complexity and regulatory requirements, the firm sought an automated solution to streamline financial reporting while maintaining high accuracy.
Challenge
Implementing an automated financial reporting system involved addressing… See the full description on the dataset page: https://huggingface.co/datasets/globosetechnology12/Automated-Financial-Reporting.
LinkedIn company datasets provide access to public company data for machine learning, ecosystem mapping, and strategic decisions. Popular use cases include competitive analysis, CRM enrichment, and lead generation.
Use our LinkedIn Companies Information dataset to access comprehensive data on companies worldwide, including business size, industry, employee profiles, and corporate activity. This dataset provides key company insights, organizational structure, and competitive landscape, tailored for market researchers, HR professionals, business analysts, and recruiters.
Leverage the LinkedIn Companies dataset to track company growth, analyze industry trends, and refine your recruitment strategies. By understanding company dynamics and employee movements, you can optimize sourcing efforts, enhance business development opportunities, and gain a strategic edge in your market. Stay informed and make data-backed decisions with this essential resource for understanding global company ecosystems.
This dataset is ideal for:
- Market Research: Identifying key trends and patterns across different industries and geographies.
- Business Development: Analyzing potential partners, competitors, or customers.
- Investment Analysis: Assessing investment potential based on company size, funding, and industries.
- Recruitment & Talent Analytics: Understanding the workforce size and specialties of various companies.
CUSTOM
Please review the respective licenses below:
https://www.lseg.com/en/policies/website-disclaimer
Company fundamentals data gives the user a view of a company's current financial health and, when combined historically, the financial 'life story' of the company.
Comprehensive database of over 100,000 financial filings from 8,000+ European companies
The Financial Statements of Holding Companies (FR Y-9 Reports) collects standardized financial statements from domestic holding companies (HCs). This is pursuant to the Bank Holding Company Act of 1956, as amended (BHC Act), and the Home Owners Loan Act (HOLA). The FR Y-9C is used to identify emerging financial risks and monitor the safety and soundness of HC operations. HCs file the FR Y-9C and FR Y-9LP quarterly, the FR Y-9SP semiannually, the FR Y-9ES annually, and the FR Y-9CS on a schedule that is determined when this supplement is used.
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
The dataset provided includes information about various companies, their stock symbols, financial metrics such as price-to-book ratio and share price, as well as details about their origin countries. Additionally, the dataset contains frequency distribution information for certain ranges of price-to-book ratios and share prices.
The dataset appears to be a compilation of financial data for different companies, likely for investment analysis or comparison purposes. It includes the following key components:
This dataset can be utilized for various financial analyses such as company valuation, comparison of financial metrics across companies, and investment decision-making.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The ArGiMi Ardian datasets: Text only
The ArGiMi project is committed to open-source principles and data sharing. Thanks to our generous partners, we are releasing several valuable datasets to the public.
Dataset description
This dataset comprises 11,000 financial annual reports, written in English, meticulously extracted from their original PDF format to provide a valuable resource for researchers and developers in financial analysis and natural language… See the full description on the dataset page: https://huggingface.co/datasets/artefactory/Argimi-Ardian-Finance-10k-text.
Our comprehensive database holds more than 1.5 million company financial records. You can search company profiles and the company directory, with 99% coverage in Malaysia.
Our database also contains profiles of private limited and limited companies worldwide, with instant access to information such as shareholders and financial accounts.
See Readme.txt. Visit https://dataone.org/datasets/sha256%3A780638d72909580abd52dd18a16fa53928d07fc992dbc160bc66271c61cc1d9d for complete metadata about this dataset.
In the U.S., public companies, certain insiders, and broker-dealers are required to file regularly with the SEC. The SEC makes this data available online for anybody to view and use via its Electronic Data Gathering, Analysis, and Retrieval (EDGAR) database. The SEC updates this data every quarter, going back to January 2009. For more information please see this site.
To aid analysis a quick summary view of the data has been created that is not available in the original dataset. The quick summary view pulls together signals into a single table that otherwise would have to be joined from multiple tables and enables a more streamlined user experience.
DISCLAIMER: The Financial Statement and Notes Data Sets contain information derived from structured data filed with the Commission by individual registrants as well as Commission-generated filing identifiers. Because the data sets are derived from information provided by individual registrants, we cannot guarantee the accuracy of the data sets. In addition, it is possible inaccuracies or other errors were introduced into the data sets during the process of extracting the data and compiling the data sets. Finally, the data sets do not reflect all available information, including certain metadata associated with Commission filings. The data sets are intended to assist the public in analyzing data contained in Commission filings; however, they are not a substitute for such filings. Investors should review the full Commission filings before making any investment decision.
Open Database License (ODbL) v1.0: https://www.opendatacommons.org/licenses/odbl/1.0/
License information was derived automatically
This dataset contains annual financial statements for 604 public companies listed on the IDX (Bursa Efek Indonesia), the largest number I have seen on Kaggle :D. Companies not included in this dataset either did not report their financial statements or have irrelevant publishing dates.
Please leave a message with suggestions!
Type | Description | Translation (Indonesian) |
---|---|---|
BS | Balance Sheet/Statement of Financial Position | Laporan Posisi Neraca / Laporan Posisi Keuangan |
IS | (Consolidated) Income Statement | Laporan Laba/Rugi (Konsolidasian) |
CF | Statement of Cash Flow | Laporan Arus Kas |
Account | Type | Translation (Indonesian) |
---|---|---|
Accounts Payable | BS | Utang Usaha |
Accounts Receivable | BS | Piutang Usaha |
Accumulated Depreciation | BS | Akumulasi Penyusutan |
Additional Paid In Capital (PIC) / Share Premium | BS | Saham premium |
Allowance For Doubtful Accounts Receivable (AFDA) | BS | Cadangan Piutang Usaha |
Buildings And Improvements | BS | Bangunan dan Pengembangan |
Capital Stock | BS | Saham |
Cash And Cash Equivalents | BS | Kas dan Setara Kas |
Cash Cash Equivalents And Short Term Investments | BS | Kas, Setara Kas, dan Investasi Jangka Pendek |
Cash Equivalents | BS | Setara Kas |
Cash Financial | BS | Kas yang berhubungan dengan aktivitas keuangan |
Common Stock | BS | Saham Biasa |
Common Stock Equity | BS | Ekuitas Saham Biasa |
Construction In Progress | BS | Konstruksi yang Sedang Berlangsung |
Current Assets | BS | Aset Lancar |
Current Debt | BS | Utang Lancar |
Current Debt And Capital Lease Obligation | BS | Utang Lancar dan Kewajiban Sewa Kapital |
Current Liabilities | BS | Liabilitas Lancar |
Finished Goods | BS | Barang Jadi |
Goodwill | BS | Nilai Tambah (Goodwill) |
Goodwill And Other Intangible Assets | BS | Nilai Tambah (Goodwill) dan Aset Tidak Berwujud Lainnya |
Gross Accounts Receivable | BS | Piutang Usaha Bruto |
Gross PPE | BS | Aktiva Tetap Bruto (Properti, Pabrik, dan Peralatan) |
Inventory | BS | Persediaan |
Invested Capital | BS | Kapital yang Diinvestasikan |
Investmentsin Joint Venturesat Cost | BS | Investasi dalam Usaha Patungan dengan Harga Perolehan |
Land And Improvements | BS | Tanah dan Pengembangan |
Long Term Debt | BS | Utang Jangka Panjang |
Long Term Debt And Capital Lease Obligation | BS | Utang Jangka Panjang dan Kewajiban Sewa Kapital |
Long Term Equity Investment | BS | Investasi Ekuitas Jangka Panjang |
Machinery Furniture Equipment | BS | Mesin, Perabotan dan Perlengkapan |
Minority Interest | BS | Kepentingan Minoritas |
Net Debt | BS | Utang Bersih |
Net PPE | BS | Aktiva Tetap Bersih (Properti, Pabrik, dan Peralatan) |
Net Tangible Assets | BS | Aset Berwujud Bersih |
Non Current Deferred Taxes Assets | BS | Aset Pajak Tangguhan Non Lancar |
Non Current Deferred Taxes Liabilities | BS | Liabilitas Pajak Tangguhan Non Lancar |
Non Current Pension And Other Postretirement Benefit Plans | BS | Rencana Pensiun Non Lancar dan Manfaat Pasca Pensiun Lainnya |
Ordinary Shares Number | BS | Jumlah Saham Biasa |
Other Current Liabilities | BS | Liabilitas Lancar Lainnya |
Other Equity Interest | BS | Kepentingan Ekuitas Lainnya |
Other Inventories | BS | Persediaan Lainnya |
Other Non Current Assets | BS | Aset Non Lancar Lainnya |
Other Non Current Liabilities | BS | Liabilitas Non Lancar Lainnya |
Other Payable | BS | Hutang Lainnya |
Other Properties | BS | Properti Lainnya |
Other Receivables | BS | Piutang Lainnya |
Payables | BS | Utang |
Pensionand Other Post Retirement Benefit Plans Current | BS | Rencana Pensiun dan Manfaat Pasca Pensiun Lainnya Saat Ini |
Prepaid Assets | BS | Aset Dibayar Dimuka |
Properties | BS | Properti |
Raw Materials | BS | Bahan Baku |
Retained Earnings | BS | Laba Ditahan |
Share Issued | BS | Saham yang Diterbitkan |
Stockholders Equity | BS | Ekuitas Pemegang Saham |
Tangible Book Value | BS | Nilai Buku Berwujud |
Total Assets | BS | Total Aset |
Total Capitalization | BS | Total Kapitalisasi |
Total Debt | BS | Total Utang |
Total Equity Gross Minority Interest | BS | Total Ekuitas Bruto dengan Kepentingan Minoritas |
Total Liabilities Net Minority Interest | BS | Total Liabilitas Bersih dengan Kepentingan Minoritas |
Total Non Current Assets | BS | Total Aset Non Lancar |
Total Non Current Liabilities Net Minority Interest | BS | Total Liabilitas Non Lancar Bersih dengan Kepentingan Minoritas |
Total Tax Payable | BS | Total Utang Pajak |
Treasury Shares Number | BS | Jumlah Saham Treasuri |
Work In Process | BS | Pekerjaan dalam Proses |
Working Capital | BS | Modal Kerja / Kapital Jangka Pendek |
Beginning Cash Position | CF | Posisi Kas Awal |
Capital Expenditure | CF | Pengeluaran - Kapital |
Capital Expenditure Reported | CF | Pengeluaran - Kapital yang Dilaporkan |
Cash Dividends Paid | CF | Dividen Tunai yang Dibayarkan |
Cash Flowsfromusedin Operating Activities Direct | CF | Arus Kas yang Digunakan dalam Aktivitas Operasional Langsung |
Changes In Cash... |
Attribution 3.0 (CC BY 3.0): https://creativecommons.org/licenses/by/3.0/
License information was derived automatically
From 11 March 2025, the dataset will be updated to include 1 new field, Date of Deregistration (see help file for details).
From 7 August 2018, the Company dataset will be updated weekly every Tuesday. As a result, the information might not be accurate at the time you check the Company dataset.
ASIC-Connect updates information in real time; therefore, please consider accessing information on that platform if you need up-to-date information.
ASIC is Australia’s corporate, markets and financial services regulator. ASIC contributes to Australia’s economic reputation and wellbeing by ensuring that Australia’s financial markets are fair and transparent, supported by confident and informed investors and consumers.
Australian companies are required to keep their details up to date on ASIC's Company Register. Information contained in the register is made available to the public to search via ASIC's website.
Select data from the ASIC's Company Register will be uploaded each week to www.data.gov.au. The data made available will be a snapshot of the register at a point in time. Legislation prescribes the type of information ASIC is allowed to disclose to the public.
The information included in the downloadable dataset is:
* Company Name
* Australian Company Number (ACN)
* Type
* Class
* Sub Class
* Status
* Date of Registration
* Date of Deregistration (Available from 11 March 2025)
* Previous State of Registration (where applicable)
* State Registration Number (where applicable)
* Modified since last report – flag to indicate if data has been modified since last report
* Current Name Indicator
* Australian Business Number (ABN)
* Current Name
* Current Name Start Date
Additional information about companies can be found via ASIC's website. Accessing some information may attract a fee.
More information about searching ASIC's registers.