3 datasets found
  1. Center for Research in Security Prices (CRSP) Stock Files

    • archive.ciser.cornell.edu
    Updated Oct 4, 2023
    Cite
    Center for Research in Security Prices (2023). Center for Research in Security Prices (CRSP) Stock Files [Dataset]. https://archive.ciser.cornell.edu/studies/2191
    Dataset updated
    Oct 4, 2023
    Dataset authored and provided by
    Center for Research in Security Prices
    Description

    The Center for Research in Security Prices (CRSP) stock databases provide time-series and event data on individual stocks, augmented with market time-series. Daily and monthly time-series variables include returns, closing, low bid and high ask prices, and trading volume. Event data includes distributions, shares outstanding, names, etc.

    Dataset is an external database available here for Cornell affiliates: https://johnson.library.cornell.edu/database/wharton-research-data-services-wrds/

  2. Financial Database Report

    • marketreportanalytics.com
    doc, pdf, ppt
    Updated Apr 10, 2025
    Cite
    Market Report Analytics (2025). Financial Database Report [Dataset]. https://www.marketreportanalytics.com/reports/financial-database-75303
    Available download formats
    pdf, doc, ppt
    Dataset updated
    Apr 10, 2025
    Dataset authored and provided by
    Market Report Analytics
    License

    https://www.marketreportanalytics.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    The global financial database market is growing strongly, driven by rising demand for real-time data analytics and insights across financial sectors. The market, estimated at $15 billion in 2025, is projected to expand at a compound annual growth rate (CAGR) of 8% from 2025 to 2033, reaching approximately $28 billion by 2033.

    Several factors fuel this expansion. The rise of algorithmic trading and quantitative finance requires high-quality, comprehensive financial data, which drives demand for both real-time and historical databases. Regulatory compliance requirements are pushing financial institutions to invest in robust data management systems. The increasing adoption of cloud-based solutions and advanced analytical tools further accelerates growth.

    The market is segmented by application (personal and commercial use) and database type (real-time and historical). The commercial segment currently dominates, propelled by the needs of large financial institutions, investment banks, and asset management firms. The personal use segment is expected to grow significantly as financial data and analytical tools become more accessible to individual investors.

    Geographically, North America and Europe have a strong presence and are expected to remain the dominant markets because of their established financial infrastructure and advanced technological capabilities. Asia-Pacific is anticipated to grow fastest, driven by increasing economic activity and the expansion of financial markets in emerging economies.

    Competition is intense. Established players such as Bloomberg, Refinitiv (Thomson Reuters), and WRDS leverage their extensive data networks and brand reputation, but they are challenged by newer entrants offering innovative solutions and specialized datasets for niche markets.

    Ongoing technological advances, such as big data analytics and artificial intelligence, present both opportunities and challenges. AI-powered analytics unlock deeper insights from financial data, but adapting to evolving technologies and addressing data security concerns require substantial investment. Regulatory changes and data privacy concerns are also potential restraints, requiring continuous adaptation and compliance measures. The future of the market hinges on the ability of players to innovate, adapt to evolving regulations, and meet the increasing demand for speed, accuracy, and comprehensive financial data insights. The market's trajectory suggests a promising future for both established and emerging companies.
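The growth figures quoted in the description can be checked with the standard compound-growth formula. The sketch below is only an arithmetic check; the function name `project_value` is hypothetical, and the inputs are the $15 billion, 8%, and 2025 to 2033 figures stated in the report summary.

```python
def project_value(start_value: float, annual_rate: float, years: int) -> float:
    """Compound a starting value at a constant annual growth rate."""
    return start_value * (1 + annual_rate) ** years


# Inputs taken from the report summary: $15 billion in 2025, 8% CAGR to 2033.
projected_2033 = project_value(15.0, 0.08, 2033 - 2025)
print(f"Projected 2033 market size: ${projected_2033:.1f} billion")
```

Eight years of 8% growth take $15 billion to roughly $27.8 billion, which is consistent with the "approximately $28 billion" in the description.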

  3. Common Ownership Data: Scraped SEC form 13F filings for 1999-2017

    • dataverse.harvard.edu
    Updated Aug 17, 2020
    Cite
    Matthew Backus; Christopher T Conlon; Michael Sinkinson (2020). Common Ownership Data: Scraped SEC form 13F filings for 1999-2017 [Dataset]. http://doi.org/10.7910/DVN/ZRH3EU
    Dataset updated
    Aug 17, 2020
    Dataset provided by
    Harvard Dataverse
    Authors
    Matthew Backus; Christopher T Conlon; Michael Sinkinson
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Time period covered
    Jan 1, 1999 - Dec 31, 2017
    Description

    Introduction

    In the course of researching the common ownership hypothesis, we found a number of issues with the Thomson Reuters (TR) "S34" dataset used by many researchers and frequently accessed via Wharton Research Data Services (WRDS). WRDS has done extensive work to improve the database, working with other researchers who have uncovered problems, specifically fixing a lack of records of BlackRock holdings. However, even with the updated dataset posted in the summer of 2018, we discovered a number of discrepancies when accessing data for constituent firms of the S&P 500 Index. We therefore set out to separately create a dataset of 13(f) holdings from the source documents, which are all public and available electronically from the Securities and Exchange Commission (SEC) website. Coverage is good starting in 1999, when electronic filing became mandatory. However, the SEC's Inspector General issued a critical report in 2010 about the information contained in 13(f) filings.

    The process

    We gathered all 13(f) filings from 1999-2017 here. The corpus is over 318,000 filings and occupies ~25GB of space if unzipped. (We do not include the raw filings here as they can be downloaded from EDGAR.) We wrote code to parse the filings and extract holding information using regular expressions in Perl. Our target list of holdings was all public firms with a market capitalization of at least $10M.

    From the header of the file, we first extract the filing date, reporting date, and reporting entity (Central Index Key, or CIK, and CIKNAME). Beginning with the September 30 2013 filing date, all filings were in XML format, which made parsing fairly straightforward, as all values are contained in tags. Prior to that date, the filings are remarkable for the heterogeneity in formatting. Several examples are linked to below. Our approach was to look for any lines containing a CUSIP code that we were interested in, and then attempt to determine the "number of shares" field and the "value" field. To help validate the values we extracted, we downloaded stock price data from CRSP for the filing date, as that allows for a logic check of (price * shares) = value. We do not claim that this will exhaustively extract all holding information. We can provide examples of filings that are formatted in such a way that we are not able to extract the relevant information.

    In both XML and non-XML filings, we attempt to remove any derivative holdings by looking for phrases such as OPT, CALL, PUT, WARR, etc. We then perform some final data cleaning: in the case of amended filings, we keep an amended level of holdings if a) the amended report occurred within 90 days of the reporting date and b) the initial filing fails our logic check described above.

    The resulting dataset has around 48M reported holdings (CIK-CUSIP) for all 76 quarters, with between 4,000 and 7,000 CUSIPs and between 1,000 and 4,000 investors per quarter. We do not claim that our dataset is perfect; there are undoubtedly errors. As documented elsewhere, there are often errors in the actual source documents as well. However, our method seemed to produce more reliable data in several cases than the TR dataset, as shown in Online Appendix B of the related paper linked above.

    Included Files

    Perl Parsing Code (find_holdings_snp.pl). For reference, only needed if you wish to re-parse original filings.

    Investor holdings for 1999-2017: lightly cleaned. Each CIK-CUSIP-rdate is unique. Over 47M records. The fields are:

        CIK: the central index key assigned by the SEC for this investor. Mapping to names is available below.
        CUSIP: the identity of the holdings. Consult the SEC's 13(f) listings to identify your CUSIPs of interest.
        shares: the number of shares reportedly held. Merging in CRSP data on shares outstanding at the CUSIP-Month level allows one to construct \beta. We make no distinction for the sole/shared/none voting discretion fields. If a researcher is interested, we did collect that starting in mid-2013, when filings are in XML format.
        rdate: reporting date (end of quarter). 8 digit, YYYYMMDD.
        fdate: filing date. 8 digit, YYYYMMDD.
        ftype: the form name.

    Notes

    We did not consolidate separate BlackRock entities (or any other possibly related entities). If one wants to do so, use the CIK-CIKname mapping file below.

    We drop any CUSIP-rdate observation where any investor in that CUSIP reports owning greater than 50% of shares outstanding (even though legitimate cases exist - see, for example, Diamond Offshore and Loews Corporation). We also drop any CUSIP-rdate observation where greater than 120% of shares outstanding are reported to be held by 13(f) investors.

    Cases where the shares held are listed as zero likely mean the investor filing lists a holding for the firm but that our code could not find the number of shares due to the formatting of the file. We leave these in the data so that any researchers that find a zero know to go back to that source filing to manually gather the...
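The cleaning rules in this description (the derivative keyword filter, the price * shares logic check, the YYYYMMDD date fields, and the 50% / 120% ownership drops) can be illustrated with a short Python sketch. This is not the authors' Perl code. Every function name, the word-boundary matching, the 10% tolerance, the choice to keep groups with no shares-outstanding figure, and the sample rows are assumptions made purely for illustration.

```python
import re
from collections import defaultdict
from datetime import date, datetime

# Keywords named in the description; whole-word matching is an assumption.
DERIVATIVE_PATTERN = re.compile(r"\b(OPT|CALL|PUT|WARR)\b", re.IGNORECASE)


def is_derivative(line: str) -> bool:
    """Return True if a filing line contains an option- or warrant-style keyword."""
    return DERIVATIVE_PATTERN.search(line) is not None


def passes_logic_check(price: float, shares: float, value: float,
                       rel_tol: float = 0.10) -> bool:
    """Compare price * shares with the reported value.

    rel_tol is a hypothetical tolerance; `value` must already be in the same
    units as price * shares.
    """
    expected = price * shares
    if expected == 0:
        return value == 0
    return abs(value - expected) <= rel_tol * abs(expected)


def parse_yyyymmdd(raw: str) -> date:
    """Parse an 8-digit rdate or fdate string such as '20171231'."""
    return datetime.strptime(raw, "%Y%m%d").date()


def drop_suspect_groups(holdings, shares_outstanding):
    """Drop every CUSIP-rdate group that breaks the 50% or 120% rule.

    holdings: list of dicts with keys cik, cusip, rdate, shares.
    shares_outstanding: dict mapping (cusip, rdate) to shares outstanding.
    Groups with no shares-outstanding figure are kept (an assumption).
    """
    groups = defaultdict(list)
    for row in holdings:
        groups[(row["cusip"], row["rdate"])].append(row)

    kept = []
    for key, rows in groups.items():
        outstanding = shares_outstanding.get(key)
        if outstanding:
            fractions = [row["shares"] / outstanding for row in rows]
            if max(fractions) > 0.50 or sum(fractions) > 1.20:
                continue  # drop the whole CUSIP-rdate observation
        kept.extend(rows)
    return kept


# Made-up sample rows; "AAA", "BBB", "CCC" stand in for real CUSIPs.
sample = [
    {"cik": "1", "cusip": "AAA", "rdate": "20171231", "shares": 100},
    {"cik": "2", "cusip": "AAA", "rdate": "20171231", "shares": 200},
    {"cik": "1", "cusip": "BBB", "rdate": "20171231", "shares": 600},
    {"cik": "2", "cusip": "CCC", "rdate": "20171231", "shares": 450},
    {"cik": "3", "cusip": "CCC", "rdate": "20171231", "shares": 450},
    {"cik": "4", "cusip": "CCC", "rdate": "20171231", "shares": 400},
]
outstanding = {(c, "20171231"): 1000 for c in ("AAA", "BBB", "CCC")}
cleaned = drop_suspect_groups(sample, outstanding)
print(sorted({row["cusip"] for row in cleaned}))
```

In the sample, "BBB" is dropped because one investor reports 60% of shares outstanding, and "CCC" is dropped because reported holdings sum to 130%, so only the "AAA" rows survive.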


Center for Research in Security Prices (CRSP) Stock Files

17 scholarly articles cite this dataset (View in Google Scholar)