CC0 1.0 Universal (Public Domain Dedication) https://creativecommons.org/publicdomain/zero/1.0/
Stock market data can be interesting to analyze, and as a further incentive, strong predictive models can have a large financial payoff. The amount of financial data on the web is seemingly endless, yet a large and well-structured dataset covering a wide array of companies can be hard to come by. Here I provide a dataset with historical stock prices (last 5 years) for all companies currently found on the S&P 500 index.
The script I used to acquire all of these .csv files can be found in this GitHub repository. In the future, if you wish for a more up-to-date dataset, this script can be used to acquire new versions of the .csv files.
Feb 2018 note: I have just updated the dataset to include data up to Feb 2018. I have also accounted for changes in the stocks on the S&P 500 index (RIP Whole Foods etc. etc.).
The data is presented in a couple of formats to suit different individuals' needs or computational limitations. I have included files containing 5 years of stock data (in the all_stocks_5yr.csv and corresponding folder).
The folder individual_stocks_5yr contains files of data for individual stocks, labelled by their stock ticker name. The all_stocks_5yr.csv contains the same data, presented in a merged .csv file. Depending on the intended use (graphing, modelling etc.) the user may prefer one of these given formats.
All the files have the following columns:
Date - in format: yy-mm-dd
Open - price of the stock at market open (this is NYSE data so all in USD)
High - highest price reached in the day
Low - lowest price reached in the day
Close - price of the stock at market close
Volume - number of shares traded
Name - the stock's ticker name
Due to volatility in Google Finance, for the newest version I have switched over to acquiring the data from The Investor's Exchange API; the simple script I use to do this is found here. Special thanks to Kaggle, GitHub, pandas_datareader and The Market.
This dataset lends itself to some very interesting visualizations. One can look at simple things like how prices change over time, graph and compare multiple stocks at once, or generate and graph new metrics from the data provided. From these data, informative stock stats such as volatility and moving averages can be easily calculated. The million dollar question is: can you develop a model that can beat the market and allow you to make statistically informed trades?
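As a rough illustration of the moving-average and volatility calculations mentioned above, here is a minimal sketch using only the Python standard library. The inline sample rows, the ticker XYZ, the capitalized header names, and the 252-trading-days annualization factor are all assumptions made for the example; they are not taken from the dataset itself.

```python
import csv
import io
import math
from statistics import mean, stdev

# Made-up rows in the column layout listed above (Date, Open, High, Low,
# Close, Volume, Name). Header spelling and all values are assumptions.
SAMPLE_CSV = """Date,Open,High,Low,Close,Volume,Name
18-01-02,10.0,10.5,9.8,10.0,1000,XYZ
18-01-03,10.0,11.2,10.0,11.0,1200,XYZ
18-01-04,11.0,12.3,10.9,12.1,900,XYZ
18-01-05,12.1,12.4,11.0,11.0,1500,XYZ
18-01-08,11.0,12.2,10.8,12.1,1100,XYZ
"""


def read_closes(text, ticker):
    """Return the closing prices for one ticker, in file order."""
    rows = csv.DictReader(io.StringIO(text))
    return [float(row["Close"]) for row in rows if row["Name"] == ticker]


def moving_average(values, window):
    """Simple moving average; the first value covers values[0:window]."""
    return [mean(values[i - window + 1:i + 1]) for i in range(window - 1, len(values))]


def daily_returns(values):
    """Fractional change between consecutive closes."""
    return [cur / prev - 1.0 for prev, cur in zip(values, values[1:])]


def annualized_volatility(values, periods_per_year=252):
    """Sample standard deviation of daily returns, scaled by sqrt(periods)."""
    return stdev(daily_returns(values)) * math.sqrt(periods_per_year)


closes = read_closes(SAMPLE_CSV, "XYZ")
sma3 = moving_average(closes, 3)
vol = annualized_volatility(closes)
print(sma3, vol)
```

To run this against the real data, the sample string would be replaced by the contents of the merged file or of one per-ticker file, with the header names adjusted to whatever the file actually uses.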

In 2025, ** percent of adults in the United States invested in the stock market. This figure has remained steady over the last few years and is still below the levels before the Great Recession, when it peaked in 2007 at ** percent.
What is the stock market?
The stock market can be defined as a group of stock exchanges where investors can buy shares in a publicly traded company. In more recent years, it is estimated that an increasing number of Americans are using neobrokers, making stock trading more accessible to investors.
Other investments
A significant number of people think stocks and bonds are the safest investments, while others point to real estate, gold, bonds, or a savings account. Since witnessing the significant one-day losses in the stock market during the financial crisis, many investors have turned towards these alternatives in hopes of more stability, particularly for investments with longer maturities. This could explain the decrease in this statistic since 2007. Nevertheless, some speculators enjoy chasing the short-run fluctuations, and others see value in choosing particular stocks.

The dataset contains prices and volumes for different stocks.
Here is an example:
cat 201801_Amsterdam_AALB_NoExpiry.txt
01/02/2018,09:01:00, 42.39, 42.39, 42.21, 42.21, 737
01/02/2018,09:02:00, 42.28, 42.28, 42.27, 42.27, 277
01/02/2018,09:04:00, 42.24, 42.24, 42.24, 42.24, 177
01/02/2018,09:05:00, 42.23, 42.23, 42.22, 42.22, 1543
01/02/2018,09:06:00, 42.23, 42.23, 42.23, 42.23, 241
The dataset contains trading data for 2182 unique stocks on 40 unique stock exchanges. The data is provided per stock and per month, with each stock associated with a specific stock exchange, and is initially stored in .txt format. Each file contains the trading history of one stock in a particular month and has the following schema.
The dataset is a zipped file of stocks from many stock markets and forex. It covers the whole of 2018. Notice the following:
1. All mentioned timestamps are CET.
2. There are missing records and irregularities in the updates (see the previous example). You need to decide how to handle the missing values/records.
3. Different stocks have different update frequencies.
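One way to approach the missing-record question is sketched below: parse each line, list the timestamps that are absent, and optionally fill them. The field names (date, time, open, high, low, close, volume) are an assumption, since the schema is not spelled out here; the month/day/year order is inferred from the 201801 file name; and carrying the last close forward with zero volume is only one possible policy, not the dataset author's recommendation.

```python
from datetime import datetime, timedelta

# Records copied from the example above. Field names are assumed:
# date, time, open, high, low, close, volume.
SAMPLE = """\
01/02/2018,09:01:00, 42.39, 42.39, 42.21, 42.21, 737
01/02/2018,09:02:00, 42.28, 42.28, 42.27, 42.27, 277
01/02/2018,09:04:00, 42.24, 42.24, 42.24, 42.24, 177
01/02/2018,09:05:00, 42.23, 42.23, 42.22, 42.22, 1543
01/02/2018,09:06:00, 42.23, 42.23, 42.23, 42.23, 241
"""


def parse_line(line):
    """Turn one comma-separated record into a dict with typed values."""
    date, time, o, h, l, c, vol = [field.strip() for field in line.split(",")]
    # Month/day/year is assumed because the file name says January 2018.
    ts = datetime.strptime(f"{date} {time}", "%m/%d/%Y %H:%M:%S")
    return {"ts": ts, "open": float(o), "high": float(h),
            "low": float(l), "close": float(c), "volume": int(vol)}


def missing_timestamps(records, step=timedelta(minutes=1)):
    """List every expected timestamp that has no record."""
    gaps = []
    for prev, cur in zip(records, records[1:]):
        t = prev["ts"] + step
        while t < cur["ts"]:
            gaps.append(t)
            t += step
    return gaps


def forward_fill(records, step=timedelta(minutes=1)):
    """Fill gaps by repeating the last close with zero volume."""
    if not records:
        return []
    filled = [records[0]]
    for cur in records[1:]:
        t = filled[-1]["ts"] + step
        while t < cur["ts"]:
            last = filled[-1]["close"]
            filled.append({"ts": t, "open": last, "high": last,
                           "low": last, "close": last, "volume": 0})
            t += step
        filled.append(cur)
    return filled


records = [parse_line(line) for line in SAMPLE.splitlines() if line.strip()]
gaps = missing_timestamps(records)
filled = forward_fill(records)
print(gaps)
```

Because different stocks update at different frequencies, the `step` argument would need to be chosen per stock rather than fixed at one minute.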

Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Investments Time Series for People.Cn Co Ltd.
People.cn CO., LTD provides advertising and publicity services in China. The company offers multi-dimensional advertising and publicity services through text links, pictures, multimedia, and other forms of expression. It also provides content technology services, such as content risk control services and aggregation distribution services; data and information services, which include news, public opinion, life, entertainment, big data storage, management and use services, and other information services; and network technology services comprising website construction, hosting, and network access to social users, as well as professional technical, software development, and software platform construction services. In addition, the company offers consulting services, training services, Internet service operations, and other service businesses. The company was founded in 1997 and is based in Beijing, the People's Republic of China.

CC0 1.0 Universal (Public Domain Dedication) https://creativecommons.org/publicdomain/zero/1.0/
This dataset contains Apple's (AAPL) stock data for the last 10 years (from 2010 to date). I believe insights from this data can be used to build useful price forecasting algorithms to aid investment. I would like to thank Nasdaq for providing access to this rich dataset. I will make sure I update this dataset every few months.

Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Liabilities-and-Stockholders-Equity Time Series for South Plains Financial Inc.
South Plains Financial, Inc. operates as a bank holding company for City Bank that provides commercial and consumer financial services to small and medium-sized businesses and individuals. It offers deposit products, including demand deposit accounts, interest-bearing products, savings accounts, and certificates of deposit. The company also provides traditional trust products and services; debit and credit cards; and retirement services and products, including real estate administration, family trust administration, revocable and irrevocable trusts, charitable trusts for individuals and corporations, and self-directed individual retirement accounts. In addition, it offers investment services, such as self-directed IRAs, money market funds, mutual funds, annuities and tax-deferred annuities, stocks and bonds, investments for non-U.S. residents, treasury bills, treasury notes and bonds, and tax-exempt municipal bonds. Further, the company provides commercial real estate loans; general and specialized commercial loans, including agricultural production and real estate, energy, finance, investment, and insurance loans, as well as loans to goods, services, restaurant and retail, construction, and other industries; residential construction loans; 1-4 family residential loans, auto loans, and other loans for recreational vehicles or other purposes; and mortgage banking services. The company was founded in 1941 and is headquartered in Lubbock, Texas.

Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Total-Stockholder-Equity Time Series for Angel One Limited.
Angel One Limited provides broking and advisory services, margin funding, and financial products to its clients in India and internationally. The company provides equity, commodities, derivatives, and currency derivative products. It also manages clients' securities in electronic form; helps clients apply for initial public offerings; and provides funds to investors for trading. In addition, the company offers insurance, mutual funds, sovereign gold bonds, and credit products; and investment advice and market research services, as well as educating clients on financial markets and investing strategies. Further, it provides portfolio management, trading account, initial public offering, and DEMAT account services. The company operates through the Angel One Super App, Angel One Trade, and Smart API investing and trading platforms. It serves resident and non-resident individuals, salaried professionals, high net worth individuals, Hindu Undivided Families, corporates and trusts, partnership firms and LLPs, and co-operative societies. The company was formerly known as Angel Broking Limited and changed its name to Angel One Limited in September 2021. Angel One Limited was incorporated in 1996 and is based in Mumbai, India.

This dataset combines Netflix stock market data with a sentiment analysis of tweets about Netflix. It covers the range from 1-1-2018 to 11-7-2022. The Twitter polarity (P_sum) column is the average of all tweets' sentiments (positive, negative, and neutral), and the p_mean column is the normalized version (p_sum / number of tweets on that day). The financial columns were downloaded from the Yahoo Finance website. If you have any questions about this dataset, please write a comment and I will reply.
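The normalization described above can be written out in a few lines. This is only a sketch: the rows are invented, `n_tweets` is an assumed name for the per-day tweet count, and returning None for days without tweets is a choice made here, not something the dataset documents.

```python
# Invented daily rows. "P_sum" mirrors the column named above; "n_tweets"
# is an assumed name for the number of tweets on that day.
days = [
    {"date": "1-1-2018", "P_sum": 12.0, "n_tweets": 40},
    {"date": "2-1-2018", "P_sum": -3.0, "n_tweets": 15},
    {"date": "3-1-2018", "P_sum": 0.0, "n_tweets": 0},
]


def p_mean(p_sum, n_tweets):
    """Polarity per tweet; None when there were no tweets that day."""
    return p_sum / n_tweets if n_tweets else None


for day in days:
    day["p_mean"] = p_mean(day["P_sum"], day["n_tweets"])

print([day["p_mean"] for day in days])
```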

The algorithmic trading space is buzzing with new strategies. Companies have spent billions on infrastructure and R&D to be able to jump ahead of the competition and beat the market. Still, it is well acknowledged that the buy & hold strategy is able to outperform many of the algorithmic strategies, especially in the long run. However, finding value in stocks is an art that very few have mastered. Can a computer do that?
This Data repo contains two datasets:
Example_2019_price_var.csv. I built this dataset thanks to the Financial Modeling Prep API and to pandas_datareader. Each row is a stock from the technology sector of the US stock market (that is available from the aforementioned API, which is free and highly recommended). The column contains the percent price variation of each stock for the year 2019. In other words, it collects the percent price variation of each stock from the first trading day of Jan 2019 to the last trading day of Dec 2019. To compute this price variation I decided to consider the Adjusted Close Price.
Example_DATASET.csv. I built this dataset thanks to the Financial Modeling Prep API. Each row is a stock from the technology sector of the US stock market (that is available from the aforementioned API). Each column is a financial indicator that can be found in the 2018 10-K filings of each company. There are no NaNs or empty cells. Furthermore, the last column is the CLASS of each stock, where:
In other words, the last column is used to classify each stock as buy-worthy or not, and this relationship is what should allow a machine learning model to learn to distinguish stocks that will increase in value from those that won't.
NOTE: the number of stocks does not match between the two datasets because the API did not have all the required financial indicators for some stocks. It is possible to remove from Example_2019_price_var.csv those rows that do not appear in Example_DATASET.csv.
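The row removal suggested in the note could look roughly like this. The column names (`ticker`, `price_var_pct`, `Revenue`) and the values are hypothetical stand-ins; the real files may label the stock identifier and indicators differently.

```python
import csv
import io

# Hypothetical miniature versions of the two files; "ticker" is an
# assumed name for the column that identifies each stock.
PRICE_VAR = """ticker,price_var_pct
AAA,12.5
BBB,-4.0
CCC,30.1
"""

DATASET = """ticker,Revenue
AAA,100
CCC,250
"""


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by header name."""
    return list(csv.DictReader(io.StringIO(text)))


def keep_common(price_rows, dataset_rows, key="ticker"):
    """Keep only the price rows whose stock also appears in the dataset."""
    known = {row[key] for row in dataset_rows}
    return [row for row in price_rows if row[key] in known]


filtered = keep_common(read_rows(PRICE_VAR), read_rows(DATASET))
print([row["ticker"] for row in filtered])
```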
I built this dataset during the 2019 winter holidays period, because I wanted to answer a simple question: is it possible to have a machine learning model learn the differences between stocks that perform well and those that don't, and then leverage this knowledge in order to predict which stock will be worth buying? Moreover, is it possible to achieve this simply by looking at financial indicators found in the 10-K filings?

This dataset contains stock prices of the Financial Times Stock Exchange (FTSE), NIKKEI (Japan), S&P, and DAX (Germany) indices since 1994.
Stock exchange values between 1994 and 2018.
This dataset is used for my time series analysis notebooks, but it does not belong to me.

Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Investments Time Series for Bank of Jiangsu Co Ltd.
Bank of Jiangsu Co., Ltd. provides various banking products and services in China. It operates through corporate finance business, personal finance business, capital business, and other business segments. The company's personal banking services include private banking; investment and financing; personal savings; investment and wealth management; personal exchange; personal loans; payment, and welfare and sport lottery agency; and consumer finance services. It also offers corporate banking services comprising time, call, structured, and current deposits; project financing and wealth management services; inward and outward remittance, export documentary collection and credit, import payment agency, and package loans, as well as L/C advice, review and negotiation, confirmation, and establishment services; money market, fixed income, and foreign exchange and derivatives; investment banking services, such as M&A and reorganization, structured financing, bond market, and securitization; and custody services. In addition, the company provides corporate loan, trade financing, finance leasing, guarantee, remittance and settlement, securities agency, credit card, repurchase transaction, and derivatives trading services; debt instrument investment services; foreign currency; and internet banking services, including direct, mobile, personal online, corporate online, telephone, SMS, and WeChat banking. The company serves individuals, corporations, government agencies, and financial institutions. Bank of Jiangsu Co., Ltd. was incorporated in 2006 and is based in Nanjing, China.

Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Title: Yahoo Finance Stock Dataset
Description:
The Yahoo Finance Stock Dataset is a comprehensive collection of stock market data pertaining to four prominent companies: AAPL (Apple Inc.), AMZN (Amazon.com Inc.), TSLA (Tesla Inc.), and GOOG (Alphabet Inc., Google's parent company). The dataset covers a period from November 20, 2018, onwards, providing daily records of stock prices and trading volume for these companies.
Attributes:
Purpose:
This dataset is suitable for various financial analyses, including but not limited to:
Data Source:
The dataset was obtained from Yahoo Finance, a widely-used platform for financial data and market analysis. It provides reliable and frequently updated information on stock prices and other financial metrics.
Preprocessing:
The dataset may require preprocessing steps such as handling missing values, checking for outliers, normalizing numerical values, and converting date formats for optimal analysis and model building.
Usage:
Data scientists, financial analysts, researchers, and machine learning practitioners can utilize this dataset for exploratory data analysis (EDA), modeling, and predicting future stock trends.

CC0 1.0 Universal (Public Domain Dedication) https://creativecommons.org/publicdomain/zero/1.0/
This dataset provides a comprehensive collection of historical stock data for all the companies listed on the S&P 500 index. It includes details such as daily open, close, high, low prices, and volume for each listed company. The dataset is intended to help researchers, investors, and data scientists gain insights into the performance of these companies and explore trends, patterns, or anomalies in the stock market.

Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Retained-Earnings Time Series for GF Securities Co Ltd.
GF Securities Co., Ltd., together with its subsidiaries, provides capital market services for corporations, individuals, institutional investors, financial institutions, and government clients in the People's Republic of China. It operates through four segments: Investment Banking; Wealth Management; Trading and Institution; and Investment Management. The company offers equity and debt finance, and sponsor and financial advisory services. It also provides wealth management services, including trading of equities, bonds, funds, futures and other tradable securities; sale of wealth management products; investment advisory; financial products, such as fund products, asset management schemes and trust products; and margin financing and securities lending, repurchase transactions, financial leasing, and management of settlement funds on behalf of clients. In addition, the company offers equity investment and trading, fixed income sales and trading; equity derivatives sales and trading; alternative investment, investment research, and asset custody; transaction consultation and execution; and broker services to institutional customers. Further, it provides asset management; public and private fund management; and alternative investment services. GF Securities Co., Ltd. was founded in 1991 and is headquartered in Guangzhou, the People's Republic of China.

Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Short-Term-Investments Time Series for Charter Hall Group.
Charter Hall is Australia's leading fully integrated diversified property investment and funds management group. We use our expertise to access, deploy, manage and invest equity to create value and generate superior returns for our investor customers. We've curated a diverse portfolio of high-quality properties across our core sectors: Office, Industrial & Logistics, Retail and Social Infrastructure. With partnerships and financial discipline at the heart of our approach, we create and invest in places that support our customers, people and communities to grow.

Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Total-Stockholder-Equity Time Series for China Construction Bank Co.
China Construction Bank Corporation engages in the provision of various banking and related financial services to individuals and corporate customers in the People's Republic of China and internationally. It operates through Corporate Finance Business, Personal Finance Business, Treasury and Asset Management Business, and Others segments. The company offers corporate and personal loans; trade financing; care business; deposit taking and wealth management, agency, financial consulting and advisory, cash management, remittance and settlement, guarantee, and investment banking services to individuals, corporations, government agencies, and financial institutions. It is also involved in inter-bank deposit and placement transactions, repurchase and resale transactions, and investment in debt securities; trading in derivatives and foreign currencies; precious metal trading; and custody services. In addition, the company provides finance leasing, transfer and purchase of finance lease assets, and fixed-income investment; motor vehicle, business and household property, construction and engineering, liability insurance, hull and cargo, and short-term health and accidental injury insurance, as well as reinsurance; cost consulting, whole-process engineering consulting, project management, investment consulting, and bidding agency services; and debt-to-equity swaps and relevant supporting businesses. Further, it engages in private equity investment and management of development funds and other private equity funds; investment banking related services, such as sponsoring and underwriting of public offerings, corporate merger and acquisition and restructuring, direct investment, asset management, and securities brokerage and market research; house rental business; raising and selling of funds; management of annuity and pension funds; and pension advisory service. China Construction Bank Corporation was founded in 1954 and is headquartered in Beijing, the People's Republic of China.

Our project involves creating a model using Multiple Linear Regression to analyze and predict the stock prices of PepsiCo. Multiple Linear Regression is a statistical technique that allows us to understand the relationship between multiple independent variables and a dependent variable, in this case, the stock price of PepsiCo. By considering various factors such as historical stock prices, market trends, and financial indicators, we aim to develop a robust model that can provide valuable insights and predictions for investors and analysts. Through the implementation of this model, we hope to uncover meaningful patterns and correlations within the PepsiCo share data, enabling more informed decision-making in the dynamic world of stock market investments.
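For readers unfamiliar with the technique, here is a minimal sketch of multiple linear regression fitted by ordinary least squares through the normal equations, in plain Python. It is not the project's own code: the project does not state which features or library it uses, so the two features and all numbers below are invented, with targets generated from y = 2 + 0.5*x1 + 3*x2 so the fit can be checked.

```python
def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system; features may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ols(features, targets):
    """Return [intercept, b1, ..., bk] minimizing squared error."""
    X = [[1.0] + list(row) for row in features]  # prepend intercept column
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(X, targets)) for i in range(k)]
    return solve(xtx, xty)


def predict(coefs, row):
    """Apply the fitted linear model to one feature row."""
    return coefs[0] + sum(b * x for b, x in zip(coefs[1:], row))


# Invented feature rows (x1, x2) and targets built from 2 + 0.5*x1 + 3*x2.
features = [(1.0, 2.0), (2.0, 1.0), (3.0, 4.0), (4.0, 3.0), (5.0, 5.0)]
targets = [8.5, 6.0, 15.5, 13.0, 19.5]

coefs = fit_ols(features, targets)
print(coefs, predict(coefs, (6.0, 2.0)))
```

In practice a numerical library would usually replace the hand-written solver; the explicit version is shown only to make the normal-equation step visible.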

Apple, Amazon, Microsoft, Google, and Berkshire Hathaway stocks (4/2/18-3/29/23), which I downloaded from Yahoo Finance to analyze alongside a gold trade dataset (https://www.kaggle.com/datasets/arslanr369/daily-gold-price-2018-2023). They are the top 5 companies by market cap according to (https://companiesmarketcap.com/).

Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Other-Cashflows-From-Investing-Activities Time Series for Alpha Services and Holdings S.A.
Alpha Services and Holdings S.A., together with its subsidiaries, provides various banking and financial products and services to individuals, professionals, and companies in Greece and internationally. It operates through Retail Banking, Corporate Banking, Asset Management and Insurance, Investment Banking and Treasury, South Eastern Europe, and Other segments. The company offers various deposit products, including deposits/savings accounts, working capital/current accounts, checking accounts, investment facilities/term deposits, repos, and swaps; loans comprising mortgage loans, consumer loans, working capital facilities, corporate loans, and letters of guarantee; and debit and credit cards. It also provides leasing and factoring services; asset management services; insurance products; stock exchange, advisory, and brokerage services relating to capital markets; investment banking facilities, as well as deals in interbank market activities and securitization transactions; and mobile and Web banking services. Further, the company provides real estate management and hotel services. The company was founded in 1879 and is based in Athens, Greece.