22 datasets found
  1. Stock Market Dataset

    • kaggle.com
    zip
    Updated Apr 2, 2020
    Cite
    Oleh Onyshchak (2020). Stock Market Dataset [Dataset]. http://doi.org/10.34740/kaggle/dsv/1054465
    Explore at:
    zip (547714524 bytes)
    Dataset updated
    Apr 2, 2020
    Authors
    Oleh Onyshchak
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Overview

    This dataset contains historical daily prices for all tickers currently trading on NASDAQ. The up-to-date list is available from nasdaqtrader.com. The historical data is retrieved from Yahoo Finance via the yfinance Python package.

    It contains prices up to April 1, 2020. If you need more recent data, fork and re-run the data collection script, which is also available from Kaggle.

    Data Structure

    The data for every symbol is saved in CSV format with common fields:

    • Date - specifies trading date
    • Open - opening price
    • High - maximum price during the day
    • Low - minimum price during the day
    • Close - close price adjusted for splits
    • Adj Close - close price adjusted for both dividends and splits
    • Volume - the number of shares that changed hands during a given day

    All ticker data is stored in either the ETFs or the stocks folder, depending on the type of security. Each filename is the corresponding ticker symbol. Finally, symbols_valid_meta.csv contains additional metadata for each ticker, such as its full name.
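
    As a minimal sketch of the layout described above, the following assumes the archive has been extracted to a local directory with `stocks` and `etfs` subfolders (the exact folder names and casing are an assumption) and that pandas is installed. `load_ticker` is a hypothetical helper, not part of the dataset.

```python
from pathlib import Path

import pandas as pd


def load_ticker(root, symbol, kind="stocks"):
    """Load one ticker's daily prices; `kind` is the subfolder name (assumed)."""
    path = Path(root) / kind / f"{symbol}.csv"
    # Parse the Date column and use it as a sorted index for time-series work.
    return pd.read_csv(path, parse_dates=["Date"], index_col="Date").sort_index()
```

    For example, `load_ticker("data", "AAPL")` would read `data/stocks/AAPL.csv`, provided that file exists in the extracted archive.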

  2. Integrando Google Colab e Yahoo Finance (compactação e download de cotações...

    • data.mendeley.com
    Updated Aug 26, 2021
    Cite
    Bernardo Mendes (2021). Integrando Google Colab e Yahoo Finance (compactação e download de cotações em formato CSV) published at the "Open Code Community" [Dataset]. http://doi.org/10.17632/r58pyjyvbx.1
    Explore at:
    Dataset updated
    Aug 26, 2021
    Authors
    Bernardo Mendes
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description
  3. Bitcoin Historical Data (2014-2025) Yahoo! Finance

    • kaggle.com
    Updated Feb 21, 2025
    Cite
    Eldintaro Farrandi (2025). Bitcoin Historical Data (2014-2025) Yahoo! Finance [Dataset]. https://www.kaggle.com/datasets/eldintarofarrandi/bitcoin-historical-data-2014-2025-yahoo-finance
    Explore at:
    Croissant, a format for machine-learning datasets (see mlcommons.org/croissant)
    Dataset updated
    Feb 21, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Eldintaro Farrandi
    Description

    This dataset includes daily historical price data for Bitcoin (BTC-USD) from 2014 to 2025, obtained by scraping the Yahoo Finance page with Selenium. The primary data source is Yahoo Finance - Bitcoin Historical Data. The dataset contains daily information such as opening price (Open), highest price (High), lowest price (Low), closing price (Close), adjusted closing price (Adj Close), and trading volume (Volume).

    About Bitcoin: Bitcoin (BTC) is the world's first decentralized digital currency, introduced in 2009 by an anonymous creator known as Satoshi Nakamoto. It operates on a peer-to-peer network powered by blockchain technology, enabling secure, transparent, and trustless transactions without the need for intermediaries like banks. Bitcoin's limited supply of 21 million coins and its growing adoption have made it a popular asset for investment, trading, and as a hedge against inflation.

    We are excited to share this dataset and look forward to seeing the insights it can provide. We hope it will inspire collaboration and innovation within the community. By leveraging this daily data, we can explore trends, develop predictive models, and design innovative trading strategies that deepen our understanding of Bitcoin's market behavior. Together, we can unlock new opportunities and contribute to the collective advancement of cryptocurrency research and analysis.

  4. TESLA STOCK PRICE HISTORY

    • kaggle.com
    Updated Jun 17, 2025
    Cite
    Adil Shamim (2025). TESLA STOCK PRICE HISTORY [Dataset]. https://www.kaggle.com/datasets/adilshamim8/tesla-stock-price-history
    Dataset updated
    Jun 17, 2025
    Dataset provided by
    Kaggle
    Authors
    Adil Shamim
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This dataset presents an extensive record of daily historical stock prices for Tesla, Inc. (TSLA), one of the world’s most innovative and closely watched electric vehicle and clean energy companies. The data was sourced from Yahoo Finance, a widely used and trusted provider of financial market data, and covers a significant period spanning from Tesla’s initial public offering (IPO) to the most recent date available at the time of extraction.

    The dataset includes critical trading metrics for each market day, such as the opening price, highest and lowest prices of the day, closing price, adjusted closing price (accounting for dividends and splits), and total trading volume. This rich dataset supports a variety of use cases, including financial market analysis, investment research, time series forecasting, development and backtesting of trading algorithms, and educational projects in data science and finance.

    Dataset Features

    • Date: The calendar date for each trading session (in YYYY-MM-DD format)
    • Open: The opening price of TSLA shares at the start of the trading day
    • High: The highest price reached during the trading session
    • Low: The lowest price reached during the trading session
    • Close: The last price at which the stock traded during the day
    • Adj Close: The closing price adjusted for corporate actions (splits, dividends, etc.)
    • Volume: The total number of TSLA shares traded on that day

    Source and Collection Details

    • Source: Yahoo Finance - Tesla (TSLA) Historical Data
    • Collection Method: Data was downloaded using Yahoo Finance's CSV export feature for accuracy and completeness.
    • Time Range: Covers from Tesla’s IPO (June 2010) to the most recent available trading day.
    • Data Integrity: Minimal cleaning was performed—dates were standardized, and any duplicate or empty rows were removed; all values remain as originally reported by Yahoo Finance.

    Example Use Cases

    • Stock Price Prediction: Train and test time series models (ARIMA, LSTM, Prophet, etc.) to forecast Tesla’s stock prices.
    • Algorithmic Trading: Backtest and evaluate trading strategies using historical price and volume data.
    • Market Trend Analysis: Analyze price trends, volatility, and return rates over different periods.
    • Event Study: Investigate the impact of major announcements (e.g., product launches, earnings releases) on TSLA stock price.
    • Educational Projects: Use as a hands-on resource for learning finance, statistics, or machine learning.
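
    The market trend analysis use case above (returns and volatility) can be sketched as follows. This is a minimal illustration, assuming the CSV has been loaded into a pandas DataFrame indexed by date with an "Adj Close" column. The 21-day window and the 252-trading-day annualization factor are common conventions chosen here, not something the dataset prescribes, and `returns_and_volatility` is a hypothetical helper.

```python
import numpy as np
import pandas as pd


def returns_and_volatility(prices, window=21):
    """Return daily simple returns and annualized rolling volatility.

    `prices` is a DataFrame with an "Adj Close" column, ordered by date.
    """
    out = pd.DataFrame(index=prices.index)
    out["return"] = prices["Adj Close"].pct_change()
    # Annualize the rolling standard deviation with 252 trading days per year.
    out["volatility"] = out["return"].rolling(window).std() * np.sqrt(252)
    return out
```

    The first `window` rows of the volatility column are NaN because the rolling window is not yet full.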

    License & Acknowledgments

    • Intended Use: This dataset is provided for academic, research, and personal projects. For commercial or investment use, please verify data accuracy and consult Yahoo Finance’s terms of use.
    • Acknowledgment: Data sourced from Yahoo Finance. All trademarks and copyrights belong to their respective owners.
  5. Low- and High-Dimensional Asset Prices Data

    • data.mendeley.com
    Updated Oct 18, 2017
    Cite
    Chi Seng Pun (2017). Low- and High-Dimensional Asset Prices Data [Dataset]. http://doi.org/10.17632/ndxfrshm74.2
    Explore at:
    Dataset updated
    Oct 18, 2017
    Authors
    Chi Seng Pun
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The data files contain seven low-dimensional financial research data sets (in .txt format) and four high-dimensional daily stock price data sets (in .csv format). The low-dimensional data sets are provided by Lorenzo Garlappi on his website, while the high-dimensional data sets were downloaded from Yahoo! Finance by the contributor. The description of the low-dimensional data sets can be found in DeMiguel et al. (2009, RFS).

  6. Integrated Cryptocurrency Historical Data for a Predictive Data-Driven...

    • cryptodata.center
    Updated Dec 4, 2024
    Cite
    cryptodata.center (2024). Integrated Cryptocurrency Historical Data for a Predictive Data-Driven Decision-Making Algorithm - Dataset - CryptoData Hub [Dataset]. https://cryptodata.center/dataset/integrated-cryptocurrency-historical-data-for-a-predictive-data-driven-decision-making-algorithm
    Explore at:
    Dataset updated
    Dec 4, 2024
    Dataset provided by
    CryptoDATA
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Cryptocurrency historical datasets from January 2012 (where available) to October 2021 were obtained and integrated from various sources and Application Programming Interfaces (APIs), including Yahoo Finance, Cryptodownload, CoinMarketCap, various Kaggle datasets, and multiple APIs. Because these datasets used different time granularities (e.g., minutes, hours, days), the daily format was used to integrate them in this research study. The integrated historical datasets for 80 cryptocurrencies, including but not limited to Bitcoin (BTC), Ethereum (ETH), Binance Coin (BNB), Cardano (ADA), Tether (USDT), Ripple (XRP), Solana (SOL), Polkadot (DOT), USD Coin (USDC), Dogecoin (DOGE), Tron (TRX), Bitcoin Cash (BCH), Litecoin (LTC), EOS (EOS), Cosmos (ATOM), Stellar (XLM), Wrapped Bitcoin (WBTC), Uniswap (UNI), Terra (LUNA), SHIBA INU (SHIB), and 60 more, were uploaded to this online Mendeley data repository. Although the primary criterion for including these cryptocurrencies was market capitalization, a subject matter expert (a professional trader) also guided the initial selection by analyzing various indicators such as Relative Strength Index (RSI), Moving Average Convergence/Divergence (MACD), MYC Signals, Bollinger Bands, Fibonacci Retracement, Stochastic Oscillator and Ichimoku Cloud. The primary features of this dataset that were used as the decision-making criteria of the CLUS-MCDA II approach are Timestamps, Open, High, Low, Closed, Volume (Currency), % Change (7 days and 24 hours), Market Cap and Weighted Price values.

    The available Excel and CSV files in this data set are only part of the integrated data. The other databases, datasets and API references that were used in this study are as follows:

    [1] https://finance.yahoo.com/
    [2] https://coinmarketcap.com/historical/
    [3] https://cryptodatadownload.com/
    [4] https://kaggle.com/philmohun/cryptocurrency-financial-data
    [5] https://kaggle.com/deepshah16/meme-cryptocurrency-historical-data
    [6] https://kaggle.com/sudalairajkumar/cryptocurrencypricehistory
    [7] https://min-api.cryptocompare.com/data/price?fsym=BTC&tsyms=USD
    [8] https://min-api.cryptocompare.com/
    [9] https://p.nomics.com/cryptocurrency-bitcoin-api
    [10] https://www.coinapi.io/
    [11] https://www.coingecko.com/en/api
    [12] https://cryptowat.ch/
    [13] https://www.alphavantage.co/

    This dataset is part of the CLUS-MCDA (Cluster analysis for improving Multiple Criteria Decision Analysis) and CLUS-MCDA II project:

    • https://aimaghsoodi.github.io/CLUSMCDA-R-Package/
    • https://github.com/Aimaghsoodi/CLUS-MCDA-II
    • https://github.com/azadkavian/CLUS-MCDA
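
    The description says that sources with minute, hour, and day granularity were integrated on a daily basis. One common way to do that with pandas is sketched below. It is not the authors' pipeline: the column names ("Open", "High", "Low", "Close", "Volume") and the helper `to_daily` are assumptions made for illustration.

```python
import pandas as pd


def to_daily(bars):
    """Aggregate intraday OHLCV bars (with a DatetimeIndex) into daily bars."""
    daily = bars.resample("1D").agg(
        {"Open": "first", "High": "max", "Low": "min", "Close": "last", "Volume": "sum"}
    )
    # Drop calendar days that had no bars at all.
    return daily.dropna(subset=["Open"])
```

    Each daily bar takes the first open, the highest high, the lowest low, the last close, and the summed volume of that day's intraday bars.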

  7. Stock Market Supplementary Data

    • kaggle.com
    Updated Jun 14, 2021
    Cite
    Paul Mooney (2021). Stock Market Supplementary Data [Dataset]. https://www.kaggle.com/datasets/paultimothymooney/stock-market-supplementary-data/data
    Dataset updated
    Jun 14, 2021
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Paul Mooney
    Description

    Context

    Stock Market Supplementary Data: Company Names and Ticker Symbols

    Content

    Company Names and Ticker Symbols

    nasdaq.csv
    
    nyse.csv
    
    sp500.csv
    
    forbes2000.csv
    
    yahootickers.xlsx
    
    • About
    • Currency
    • ETF
    • Future
    • Index
    • Mutual Fund
    • Stock

    Acknowledgements

    Data from:

    • https://datahub.io/core/nasdaq-listings (License)
    • https://datahub.io/core/s-and-p-500-companies (License)
    • https://datahub.io/core/nyse-other-listings (License)
    • https://investexcel.net/all-yahoo-finance-stock-tickers (open data)
    • https://www.kaggle.com/ash316/forbes-top-2000-companies (open data)

    Banner Photo: https://unsplash.com/photos/VP4WmibxvcY

  8. ESG rating of general stock indices

    • data.mendeley.com
    • narcis.nl
    Updated Oct 22, 2021
    Cite
    Szilárd Erhart (2021). ESG rating of general stock indices [Dataset]. http://doi.org/10.17632/58mwkj5pf8.1
    Explore at:
    Dataset updated
    Oct 22, 2021
    Authors
    Szilárd Erhart
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    THE FILES HAVE BEEN CREATED BY SZILÁRD ERHART FOR A RESEARCH: ERHART (2021): ESG RATINGS OF GENERAL STOCK EXCHANGE INDICES, INTERNATIONAL REVIEW OF FINANCIAL ANALYSIS

    USERS OF THE FILES AGREE TO QUOTE THE ABOVE PAPER

    THE PYTHON SCRIPT (PYTHONESG_ERHART.TXT) HELPS USERS TO GET TICKERS BY STOCK EXCHANGES AND EXTRACT ESG SCORES FOR THE UNDERLYING STOCKS FROM YAHOO FINANCE.

    THE R SCRIPT (ESG_UA.TXT) HELPS TO REPLICATE THE MONTE CARLO EXPERIMENT DETAILED IN THE STUDY.

    THE EXPORT_ALL CSV CONTAINS THE DOWNLOADED ESG DATA (SCORES, CONTROVERSIES, ETC) ORGANIZED BY STOCKS AND EXCHANGES.

    DISCLAIMER

    The author takes no responsibility for the timeliness, accuracy, completeness or quality of the information provided.

    The author is in no event liable for damages of any kind incurred or suffered as a result of the use or non-use of the information presented or the use of defective or incomplete information.

    The contents are subject to confirmation and not binding.

    The author expressly reserves the right to alter, amend, whole and in part, without prior notice or to discontinue publication for a period of time or even completely.

    ##############################READ ME

    BEFORE USING THE MONTE CARLO SIMULATIONS SCRIPT:

    (1) COPY THE goascores.csv and goalscores_alt.csv FILES ONTO YOUR OWN COMPUTER DRIVE. THE TWO FILES ARE IDENTICAL.

    (2) SET THE EXACT FILE LOCATION INFORMATION IN THE 'Read in data' SECTION OF THE MONTE CARLO SCRIPT AND FOR THE OUTPUT FILES AT THE END OF THE SCRIPT

    (3) LOAD THE miscTools AND matrixStats PACKAGES IN YOUR R APPLICATION

    (4) RUN THE CODE.

    ##############################READ ME
  9. NASDAQ Historical Prices (2014-2024)

    • kaggle.com
    Updated Apr 27, 2024
    Cite
    Arslanr369 (2024). NASDAQ Historical Prices (2014-2024) [Dataset]. https://www.kaggle.com/datasets/arslanr369/nasdaq-historical-prices-2014-2024
    Dataset updated
    Apr 27, 2024
    Dataset provided by
    Kaggle
    Authors
    Arslanr369
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Experience a decade of NASDAQ market dynamics with this comprehensive historical price dataset from 2014 to 2024.

    The NASDAQ Composite is a benchmark index representing the performance of more than 2,500 stocks listed on the NASDAQ stock exchange, encompassing various sectors including technology, healthcare, and finance. This dataset, sourced meticulously from Yahoo Finance, offers daily insights into the index's opening, highest, lowest, and closing prices, along with adjusted close prices and daily volume.

  10. Data from: Determining Multi-Class Trading Signals for Bitcoin: A...

    • repod.icm.edu.pl
    tsv, txt
    Updated May 5, 2025
    Cite
    Stawarz, Marcin (2025). Determining Multi-Class Trading Signals for Bitcoin: A Comparative Study of XGBoost, LightGBM, and Random Forest [Dataset]. http://doi.org/10.18150/FXSBZP
    Explore at:
    txt (1271), tsv (402980)
    Dataset updated
    May 5, 2025
    Dataset provided by
    RepOD
    Authors
    Stawarz, Marcin
    License

    CC0 1.0 Universal Public Domain Dedication, https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Dataset funded by
    National Science Centre (Poland)
    Description

    The file provides daily trading data for Bitcoin (BTC) in USD, covering the period from January 1, 2015, to June 30, 2024. The dataset includes key indicators such as Open Price, High Price, Low Price, Close Price, Adjusted Close Price, and Volume. This data originates from Yahoo Finance and serves as a foundation for time series analysis, forecasting, and machine learning models, focusing on identifying price patterns, volatility trends, and trading behaviors within the cryptocurrency market. Raw data (CSV file). Source: Yahoo Finance.

  11. Meta updated stocks complete dataset

    • kaggle.com
    Updated Mar 15, 2025
    Cite
    M Atif Latif (2025). Meta updated stocks complete dataset [Dataset]. https://www.kaggle.com/datasets/matiflatif/meta-stocks-complete-data-set
    Dataset updated
    Mar 15, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    M Atif Latif
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Context

    This dataset contains daily stock data for Meta Platforms, Inc. (META), formerly Facebook Inc., from May 19, 2012, to January 20, 2025. It offers a comprehensive view of Meta’s stock performance and market fluctuations during a period of significant growth, acquisitions, and technological advancements. This dataset is valuable for financial analysis, market prediction, machine learning projects, and evaluating the impact of Meta’s business decisions on its stock price.

    Content

    The dataset includes the following key features:

    • Open: Stock price at the start of the trading day.
    • High: Highest stock price during the trading day.
    • Low: Lowest stock price during the trading day.
    • Close: Stock price at the end of the trading day.
    • Adj Close: Adjusted closing price, accounting for corporate actions like stock splits, dividends, and other financial adjustments.
    • Volume: Total number of shares traded during the trading day.

    Variables

    • Date: The date of the trading day, formatted as YYYY-MM-DD.
    • Open: The stock price at the start of the trading day.
    • High: The highest price reached by the stock during the trading day.
    • Low: The lowest price reached by the stock during the trading day.
    • Close: The stock price at the end of the trading day.
    • Adj Close: The adjusted closing price, which reflects corporate actions like stock splits and dividend payouts.
    • Volume: The total number of shares traded on that specific day.

    Acknowledgements

    This dataset was sourced from reliable public APIs such as Yahoo Finance or Alpha Vantage. It is provided for educational and research purposes and is not affiliated with Meta Platforms, Inc. Users are encouraged to adhere to the terms of use of the original data provider.

  12. yahoo_finance_data_nse_2000_stocks

    • kaggle.com
    zip
    Updated Apr 11, 2025
    Cite
    Stormblessed_Ash (2025). yahoo_finance_data_nse_2000_stocks [Dataset]. https://www.kaggle.com/datasets/ashvinvinodh97/yahoo-finance-data-nse-2000-stocks
    Explore at:
    zip (198144682 bytes)
    Dataset updated
    Apr 11, 2025
    Authors
    Stormblessed_Ash
    License

    http://opendatacommons.org/licenses/dbcl/1.0/

    Description

    This dataset contains daily OHLCV data for ~2000 Indian stocks listed on the National Stock Exchange, covering their full available history. The columns are multi-index columns, so this needs to be taken into account when reading and using the data.

    • Source: Yahoo Finance
    • Type: All files are in CSV format.
    • Currency: INR

    All the tickers have been collected from here: https://www.nseindia.com/market-data/securities-available-for-trading

    If using pandas, the following function is a utility to read any of the CSV files:

```
import pandas as pd


def read_ohlcv(filename):
    """Read a given OHLCV data file downloaded from yfinance."""
    return pd.read_csv(
        filename,
        skiprows=[0, 1, 2],  # remove the multi-index header rows that cause trouble
        names=["Date", "Close", "High", "Low", "Open", "Volume"],
        index_col="Date",
        parse_dates=["Date"],
    )


dataset = read_ohlcv("ABCAPITAL.NS.csv")
```

  13. Stock market predictions

    • kaggle.com
    Updated Feb 18, 2024
    Cite
    Tanishq dublish (2024). Stock market predictions [Dataset]. https://www.kaggle.com/datasets/tanishqdublish/stock-market-predictions
    Dataset updated
    Feb 18, 2024
    Dataset provided by
    Kaggle
    Authors
    Tanishq dublish
    License

    Apache License, v2.0, https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    I prepared this dataset for students in my Deep Learning and NLP course.

    But I am also very happy to see kagglers play around with it.

    Have fun!

    Description:

    There are two channels of data provided in this dataset:

    News data: I crawled historical news headlines from Reddit WorldNews Channel (/r/worldnews). They are ranked by reddit users' votes, and only the top 25 headlines are considered for a single date. (Range: 2008-06-08 to 2016-07-01)

    Stock data: Dow Jones Industrial Average (DJIA) is used to "prove the concept". (Range: 2008-08-08 to 2016-07-01)

    I provided three data files in .csv format:

    RedditNews.csv: two columns. The first column is the "date", and the second column is the "news headlines". All news items are ranked from top to bottom based on how hot they are. Hence, there are 25 lines for each date.

    DJIA_table.csv: Downloaded directly from Yahoo Finance: check out the web page for more info.

    Combined_News_DJIA.csv: To make things easier for my students, I provide this combined dataset with 27 columns. The first column is "Date", the second is "Label", and the following ones are news headlines ranging from "Top1" to "Top25".
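
    The reshaping from RedditNews.csv (25 ranked rows per date) to the "Top1" through "Top25" columns of the combined file could be done roughly as follows. This is a sketch only: the column names "Date" and "News" and the helper `pivot_headlines` are assumptions, and the "Label" column and the join with DJIA_table.csv are left out because their exact definition is not given here.

```python
import pandas as pd


def pivot_headlines(news, top_n=25):
    """Turn one-row-per-headline news into one row per date with Top1..TopN columns.

    `news` has columns "Date" and "News" (assumed names), already ordered
    from hottest to least hot within each date.
    """
    news = news.copy()
    # Rank of each headline within its date: 1 for the first row, 2 for the next, ...
    news["rank"] = news.groupby("Date").cumcount() + 1
    news = news[news["rank"] <= top_n]
    wide = news.pivot(index="Date", columns="rank", values="News")
    wide.columns = [f"Top{r}" for r in wide.columns]
    return wide.reset_index()
```

    The result has one row per date, which can then be merged with a per-date price table on the "Date" column.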

  14. Full Nasdaq Stocks Data

    • kaggle.com
    Updated May 31, 2020
    Cite
    fjgonzalez (2020). Full Nasdaq Stocks Data [Dataset]. https://www.kaggle.com/gonzalezfrancisco/full-nasdaq-stocks-data
    Dataset updated
    May 31, 2020
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    fjgonzalez
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Context

    Predicting the stock market is one of the most commonly performed projects when someone is learning about ML and Data Science. After all, who wouldn't want to delegate the task of picking stocks to a model and reap the rewards for themselves? However, one of the most difficult and tedious steps to predict what stocks to invest in is actually gathering the data to use. There are so many options and it is important to get sufficient information for each. But, what if you can skip this step and just download a dataset that has all that information easily available for you? Look no further as this is the answer to this problem.

    Content

    This dataset contains information on 4447 stocks traded under Nasdaq across various exchanges. One file contains information for all 4447 stocks but also has several null fields, which is why I labeled it full_financial_stocks_raw.csv; its row values have only minimal modifications. The second file, dividend_stocks_only.csv, is still a raw-ish dataset, but it only contains stocks that pay out dividends to their shareholders. Interestingly, it seems dividend-paying stocks have more information about them, which explains why this file has significantly fewer rows with null values.
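
    To compare how many rows with null values each file has, a small pandas check like the one below may help. It is a sketch that works on any DataFrame; `null_summary` is a hypothetical helper and makes no assumption about the files' actual column names.

```python
import pandas as pd


def null_summary(df):
    """Count rows that contain at least one null, and nulls per column."""
    rows_with_null = int(df.isna().any(axis=1).sum())
    per_column = df.isna().sum().sort_values(ascending=False)
    return rows_with_null, per_column
```

    Calling it on both files after `pd.read_csv` gives a quick view of which columns account for most of the missing values.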

    Update: In the next 24 hours, I will be uploading an optimized, feature-engineered dataset that has fewer columns overall and fewer rows with null values. This dataset is intended to be a fully cleaned option to directly feed into ML/DL models.

    Acknowledgements

    I would like to thank the sources where I obtained my data, which are the FTP Nasdaq Trader website and the Yahoo Finance API.

    Inspiration

    Analyzing the stock market is one of the most intriguing endeavors I could think of as the ways it can be influenced are so broad and distinct from one another. A news article can influence how investors view a particular company, social media can directly fluctuate a company's share price, and there are numerous calculations and formulas that can show what stocks are worth investing in.

  15. ETH-USD

    • kaggle.com
    Updated Oct 25, 2024
    Cite
    rescue96 (2024). ETH-USD [Dataset]. https://www.kaggle.com/datasets/rescue96/eth-usd/versions/1
    Dataset updated
    Oct 25, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    rescue96
    License

    MIT License, https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    ETH-USD price CSV starting from 1 Jan 2021 through 24 Oct 2024. The information is shared as is, as imported from Yahoo Finance.

    Update Frequency

    Since new stock market data is generated and made available every day, the dataset will be updated once a month to keep the information current and useful.

    Acknowledgements

    Yahoo Finance: https://finance.yahoo.com/
    Teckgeekz

  16. Financial_Data

    • kaggle.com
    Updated Jan 27, 2024
    Cite
    willian oliveira gibin (2024). Financial_Data [Dataset]. http://doi.org/10.34740/kaggle/dsv/7494625
    Dataset updated
    Jan 27, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    willian oliveira gibin
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    The dataset comprises the historical stock prices of PT Bank Central Asia Tbk (BBCA.JK) retrieved from the Yahoo Finance website, spanning from January 2019 to the present date. This data provides valuable information for various analytical purposes, such as forecasting future stock prices, implementing machine learning models, and conducting data analysis or visualization tasks. By making this dataset available to the Kaggle community, contributors can explore and utilize it for research, modeling, and educational purposes. The dataset is regularly updated through an automated process scheduled on Kaggle, ensuring its reliability and relevance for ongoing projects and analyses.

  17. NASDAQ and NYSE stocks histories

    • kaggle.com
    Updated Nov 5, 2018
    Cite
    Jiun Yen (2018). NASDAQ and NYSE stocks histories [Dataset]. https://www.kaggle.com/qks1lver/nasdaq-and-nyse-stocks-histories/code
    Dataset updated
    Nov 5, 2018
    Dataset provided by
    Kaggle
    Authors
    Jiun Yen
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0), https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    NASDAQ and NYSE stocks histories

    Update every Saturday night because I'm too tired to do anything on Friday

    Full history of stock symbols on NASDAQ and NYSE:

    • Unzip fh_<version_date>.zip
    • Each stock symbol has a .csv file under full_history/
      • e.g. AMD.csv
    • Columns in .csv
      • date - year-month-day, 2018-08-08
      • volume - int, volume of the day
      • open - float, opening price of the day
      • close - float, closing price of the day
      • high - float, highest price of the day
      • low - float, lowest price of the day
      • adjclose - float, adjusted closing price of the day

    Other files:

    • all_symbols.txt - All the stock symbols with history
    • excluded_symbols.txt - All the ones that I couldn't retrieve data for
    • NASDAQ.txt - NASDAQ listing
    • NYSE.txt - NYSE listing

    All data compiled from Yahoo Finance

    If you have questions, e-mail me: jiunyyen@gmail.com

    Happy mining!

  18. Sentiment Analysis on Financial Tweets

    • kaggle.com
    zip
    Updated Sep 5, 2019
    Cite
    Vivek Rathi (2019). Sentiment Analysis on Financial Tweets [Dataset]. https://www.kaggle.com/datasets/vivekrathi055/sentiment-analysis-on-financial-tweets
    Explore at:
    zip (2538259 bytes)
    Dataset updated
    Sep 5, 2019
    Authors
    Vivek Rathi
    License

    http://opendatacommons.org/licenses/dbcl/1.0/

    Description

    Context

    The following information can also be found at https://www.kaggle.com/davidwallach/financial-tweets. Out of curiosity, I cleaned the .csv files to perform a sentiment analysis, so both .csv files in this dataset were created by me.

    Everything else in the description was written by David Wallach. Using this information, I performed my first ever sentiment analysis.

    "I have been interested in using public sentiment and journalism to gather sentiment profiles on publicly traded companies. I first developed a Python package (https://github.com/dwallach1/Stocker) that scrapes the web for articles written about companies, and then noticed the abundance of overlap with Twitter. I then developed a NodeJS project that I have been running on my RaspberryPi to monitor Twitter for all tweets coming from those mentioned in the content section. If one of them tweeted about a company in the stocks_cleaned.csv file, then it would write the tweet to the database. Currently, the file is only from earlier today, but after about a month or two, I plan to update the tweets.csv file (hopefully closer to 50,000 entries).

    I am not quite sure how this dataset will be relevant, but I hope to use these tweets and try to generate some sense of public sentiment score."

    Content

    This dataset has all the publicly traded companies (tickers and company names) that were used as input to fill the tweets.csv. The influencers whose tweets were monitored were: ['MarketWatch', 'business', 'YahooFinance', 'TechCrunch', 'WSJ', 'Forbes', 'FT', 'TheEconomist', 'nytimes', 'Reuters', 'GerberKawasaki', 'jimcramer', 'TheStreet', 'TheStalwart', 'TruthGundlach', 'Carl_C_Icahn', 'ReformedBroker', 'benbernanke', 'bespokeinvest', 'BespokeCrypto', 'stlouisfed', 'federalreserve', 'GoldmanSachs', 'ianbremmer', 'MorganStanley', 'AswathDamodaran', 'mcuban', 'muddywatersre', 'StockTwits', 'SeanaNSmith']
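
    A minimal Python sketch of the filtering step described in the quote above, where a tweet is kept only if it mentions a tracked company. The original project is written in NodeJS and its actual matching logic is not shown on this page. The sketch assumes, purely for illustration, that companies are mentioned as cashtags such as $AAPL. The function name mentioned_tickers and the sample ticker set are hypothetical.

```python
import re

# Assumption: a company mention looks like "$" followed by 1-5 letters.
CASHTAG = re.compile(r"\$([A-Za-z]{1,5})\b")

def mentioned_tickers(tweet_text, known_tickers):
    """Return the sorted list of known tickers that appear as cashtags in the tweet."""
    found = {symbol.upper() for symbol in CASHTAG.findall(tweet_text)}
    return sorted(found & known_tickers)

# Hypothetical stand-in for the ticker column of stocks_cleaned.csv.
known = {"AAPL", "TSLA"}
hits = mentioned_tickers("Watching $aapl and $XYZ today", known)
```

    A tweet would be stored only when the returned list is non-empty. Matching on company names rather than cashtags would need a different, fuzzier approach.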

    Acknowledgements

    The data used here is gathered from a project I developed : https://github.com/dwallach1/StockerBot

    Inspiration

    I hope to develop a financial sentiment text classifier that would be able to track Twitter's (and the entire public's) feelings about any publicly traded company (and cryptocurrency)

  19. Solana Price History (SOL-USD)

    • kaggle.com
    Updated May 8, 2024
    Cite
    Gokberk Kozak (2024). Solana Price History (SOL-USD) [Dataset]. https://www.kaggle.com/datasets/gokberkkozak/solana-price-history-sol-usd
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    May 8, 2024
    Dataset provided by
    Kaggle
    Authors
    Gokberk Kozak
    Description

    This dataset contains the price movements of the SOL cryptocurrency over the last four years. The data has been collected through the Yahoo Finance API. The dataset consists of the following columns:

    DATE: Date and time the price information pertains to.
    OPEN: Opening price on the specified date.
    HIGH: Highest price reached on the specified date.
    LOW: Lowest price reached on the specified date.
    CLOSE: Closing price on the specified date.
    VOLUME: Volume of transactions that occurred on the specified date.
    

    This dataset can be utilized to analyze recent price movements of the SOL cryptocurrency, identify trends, and make future price predictions. It can be used for various purposes including financial analysis, training machine learning models, and understanding market trends.
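
    As one illustration of the kind of analysis mentioned above, the sketch below computes simple day-over-day returns from the CLOSE column using only the standard library. The sample rows are made-up values, the function name daily_returns is hypothetical, and the header spelling is assumed to match the column list above.

```python
import csv
import io

# Made-up rows for illustration; header names follow the column list above.
SAMPLE = """DATE,OPEN,HIGH,LOW,CLOSE,VOLUME
2024-05-01,100.0,110.0,95.0,105.0,1000
2024-05-02,105.0,108.0,99.0,102.9,1200
2024-05-03,102.9,112.0,101.0,108.045,900
"""

def daily_returns(text):
    """Return (date, close-to-close return) pairs, starting from the second row."""
    rows = list(csv.DictReader(io.StringIO(text)))
    closes = [float(r["CLOSE"]) for r in rows]
    return [
        (row["DATE"], cur / prev - 1.0)
        for row, prev, cur in zip(rows[1:], closes, closes[1:])
    ]

returns = daily_returns(SAMPLE)
```

    The first row has no previous close, so the result has one entry fewer than the input.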

  20. US Stock Market Data

    • kaggle.com
    zip
    Updated Jan 14, 2023
    Cite
    Mohammed Obeidat (2023). US Stock Market Data [Dataset]. https://www.kaggle.com/mohammedobeidat/us-stock-market-data
    Explore at:
    zip (42432995 bytes)
    Dataset updated
    Jan 14, 2023
    Authors
    Mohammed Obeidat
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    The dataset contains the files required for training and testing, split accordingly.

    There are two groups of features that you can use for prediction:

    1. Fundamentals and ratios: Values collected from statements and balance sheets for each ticker
    2. Technical indicators and strategy flags: Technical indicators calculated on close value of each day and buy and sell signals generated using some commonly used trading strategies.

    Files found in the Fundamentals folder are a processed form of the files found in the raw folder. Ratios and other values are stretched to match the length of the closing price column, so that, for example, the value in the pe_ratio column is the PE ratio from the most recent quarter. The same applies to every column.
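
    The stretching described above amounts to a forward fill: each trading day takes the most recently reported quarterly value. The sketch below shows that idea with the standard library. It is not the dataset author's preprocessing code (that lives in the linked notebooks), and the function name stretch_to_daily and the sample figures are hypothetical.

```python
from bisect import bisect_right
from datetime import date

def stretch_to_daily(quarterly, trading_days):
    """Forward-fill quarterly values onto trading days.

    quarterly: list of (report_date, value) pairs sorted by report_date.
    Returns one value per trading day: the latest value reported on or
    before that day, or None if nothing has been reported yet.
    """
    report_dates = [d for d, _ in quarterly]
    out = []
    for day in trading_days:
        i = bisect_right(report_dates, day)  # number of reports on or before day
        out.append(quarterly[i - 1][1] if i else None)
    return out

# Made-up quarterly PE ratios and trading days, for illustration only.
quarterly_pe = [(date(2022, 3, 31), 18.0), (date(2022, 6, 30), 21.5)]
days = [date(2022, 3, 30), date(2022, 4, 1), date(2022, 6, 30), date(2022, 7, 5)]
pe_ratio = stretch_to_daily(quarterly_pe, days)
```

    Days before the first report get None here. Whether the actual files drop such rows or fill them differently is not stated in the description.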

    Technical indicators are calculated with the default parameters used in the Pandas_TA package.
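
    To make the idea of indicators and strategy flags concrete, here is a small sketch of a simple moving average (SMA) crossover computed on close values. It deliberately uses plain Python rather than Pandas_TA, so it does not reproduce that package's defaults. The window lengths, the function names, and the close prices are all hypothetical choices for illustration.

```python
def sma(values, length):
    """Simple moving average; None until a full window is available."""
    out = []
    for i in range(len(values)):
        if i + 1 < length:
            out.append(None)
        else:
            window = values[i + 1 - length : i + 1]
            out.append(sum(window) / length)
    return out

def crossover_flags(close, fast=2, slow=3):
    """Return 1 where the fast SMA crosses above the slow SMA (buy),
    -1 where it crosses below (sell), and 0 otherwise."""
    f, s = sma(close, fast), sma(close, slow)
    flags = [0] * len(close)
    for i in range(1, len(close)):
        if None in (f[i - 1], s[i - 1], f[i], s[i]):
            continue  # not enough history for both averages yet
        if f[i - 1] <= s[i - 1] and f[i] > s[i]:
            flags[i] = 1
        elif f[i - 1] >= s[i - 1] and f[i] < s[i]:
            flags[i] = -1
    return flags

# Made-up close prices for illustration.
close = [10.0, 9.0, 8.0, 9.0, 11.0, 10.0, 8.0]
flags = crossover_flags(close)
```

    The flags list lines up with the close list index by index, which mirrors how a signal column would sit next to the price column in a per-ticker file.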

    Data is collected from finance.yahoo.com and macrotrends.net. The timeframe for the given data differs from one ticker to another because some stocks are unavailable for a given time frame on either of the websites.

    All code required to collect the data and perform preprocessing and feature engineering to get the data in the given format can be found in the following notebooks:

    1. https://www.kaggle.com/code/mohammedobeidat/us-stocks-data-collection
    2. https://www.kaggle.com/code/mohammedobeidat/us-stocks-technicals-feature-engineering-and-eda
    3. https://www.kaggle.com/code/mohammedobeidat/us-stocks-fundamentals-preprocessing-and-eda

    Files

    • {<>_ticker_train}.csv - the training set
    • {<>_ticker_train}.csv - the test set

    Columns

    Column names are supposed to be self-explanatory, assuming you are familiar with the stock market. Some acronyms you may encounter:

    1. tmm is short for Trailing Twelve Months
    2. pe is short for Price to Earnings
    3. pb is short for Price to Book Value
    4. ps is short for Price to Sales
    5. fcf is short for Free Cash Flow
    6. eps is short for Earnings per Share
Oleh Onyshchak (2020). Stock Market Dataset [Dataset]. http://doi.org/10.34740/kaggle/dsv/1054465
Stock Market Dataset

Historical daily prices of Nasdaq-traded stocks and ETFs

Explore at:
4 scholarly articles cite this dataset (View in Google Scholar)
zip (547714524 bytes)
Dataset updated
Apr 2, 2020
Authors
Oleh Onyshchak
License

https://creativecommons.org/publicdomain/zero/1.0/

