CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This dataset provides historical stock market performance data for specific companies. It enables users to analyze and understand the past trends and fluctuations in stock prices over time. This information can be utilized for various purposes such as investment analysis, financial research, and market trend forecasting.
https://creativecommons.org/publicdomain/zero/1.0/
Information about more than 4,600 companies tradable on the Robinhood website.
The information was scraped from the Robinhood website.
The dataset contains useful information about public companies traded on the US stock market. It includes company size, market cap, PE ratio, list date, and similar fields, which can provide some insight into the US stock market.
Join us at LechterVentures.com to explore other interesting topics in Data Science and marketplaces.
Numerous people have asked me to study the role retail trading plays in driving asset prices. Using this as my inspiration, I found a dataset with hourly tick data for ~9,000 stocks and another one with hourly Robinhood user participation data (that is, how many Robinhood users own a stock in a particular time period). Here you will find not only the data used to perform my research, but also a copy of the notebook I ended up using. Excited to see what the community does with this!
Two major sources were used to acquire this data:
- Stooq: While not written in English, this website hosts numerous free stock tick datasets. I was able to directionally confirm the accuracy of the data against what my personal brokerage account reported over this time period. I cannot speak to the precision of this data.
- RobinTrack: This website collects Robinhood user participation data for stocks that trade on their platform. Per Bloomberg, it appears Robinhood will stop providing access to this data in the near future (as of August 2020).
Additionally, you can find the notebook I used to prepare the research for my article here
The data covers the time period between September 2019 and July 2020.
I originally tried to input this information directly in the Data Explorer section but Kaggle kept bugging out.
Robinhood_Master_v1.csv
This is the master dataframe that includes hourly tick and Robinhood user participation data for ~9,000 stocks going back ~1 year.
- #: Index column; it can be ignored.
- Clean_Datetime: This column can also be ignored.
- Close: Closing price for the stock noted in the Ticker column during this row's time period.
- High: Highest price reached for the stock noted in the Ticker column during this row's time period.
- Low: Lowest price reached for the stock noted in the Ticker column during this row's time period.
- Open: Opening price for the stock noted in the Ticker column during this row's time period.
- OpenInt: This column can be ignored; it's almost all 0.
- Ticker: The stock ticker analyzed in a given row. For example, if this shows 'AAPL' then this row is reporting data on Apple stock.
- users_holding_first: The initial number of Robinhood users who owned the stock noted in the Ticker column during this row's time period.
- users_holding_last: The final number of Robinhood users who owned the stock noted in the Ticker column during this row's time period.
- users_holding_max: The highest number of Robinhood users who owned the stock noted in the Ticker column during this row's time period.
- users_holding_min: The lowest number of Robinhood users who owned the stock noted in the Ticker column during this row's time period.
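As a rough illustration of working with these columns, the sketch below pulls the rows for one ticker out of a master-style dataframe and drops the columns described as ignorable. It runs on a tiny synthetic table with made-up values rather than the real CSV; the helper name ticker_rows and the derived column users_holding_change are hypothetical and not part of the dataset, and the column names are assumed to match the list above exactly.

```python
import pandas as pd


def ticker_rows(master: pd.DataFrame, ticker: str) -> pd.DataFrame:
    """Return the rows for one ticker without the ignorable columns.

    Hypothetical helper: not part of the dataset or the author's notebook.
    """
    ignorable = ["#", "Clean_Datetime", "OpenInt"]
    rows = master.loc[master["Ticker"] == ticker]
    rows = rows.drop(columns=[c for c in ignorable if c in rows.columns])
    # Net change in Robinhood holders within each row's time period.
    rows = rows.assign(
        users_holding_change=rows["users_holding_last"] - rows["users_holding_first"]
    )
    return rows.reset_index(drop=True)


# Synthetic stand-in for Robinhood_Master_v1.csv (values are invented).
master = pd.DataFrame(
    {
        "#": [0, 1, 2],
        "Ticker": ["AAPL", "SPY", "AAPL"],
        "Open": [100.0, 300.0, 101.0],
        "High": [102.0, 303.0, 103.0],
        "Low": [99.0, 299.0, 100.5],
        "Close": [101.0, 302.0, 102.5],
        "OpenInt": [0, 0, 0],
        "users_holding_first": [1000, 500, 1010],
        "users_holding_last": [1010, 495, 1040],
    }
)

aapl = ticker_rows(master, "AAPL")
print(aapl)
```

With the real file, the synthetic table would presumably be replaced by a pd.read_csv call on Robinhood_Master_v1.csv, assuming the file sits next to the script.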
df_apple_final.csv
This is the pre-processed dataframe that includes the cleaned predictors I used for my Apple time series modeling. All columns (except "y", "Clean_Datetime_PST" and "ds") were shifted back 1 day. The idea here is that all predictors need to occur on or before the target date. Otherwise, you end up using future data to predict the past. I'll only describe columns below that are not also found in the master dataframe.
- users_holding_1D_change: the day-over-day change in Robinhood stock ownership for Apple
- users_holding_13D_change: the 13 day change in Robinhood stock ownership for Apple
- Open 6D_change: the 6 day change in Apple's stock market opening price
- Open 13D_change: the 13 day change in Apple's stock market opening price
- SPY users_holding_1D_change: the day-over-day change in Robinhood stock ownership for SPY
- SPY Open 1D_change: the day-over-day change in SPY's stock market opening price
- SPY Open 13D_change: the 13 day change in SPY's stock market opening price
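The one-day shift described above can be sketched with pandas as follows. This is only an illustration under stated assumptions, not the preprocessing from the notebook: it assumes "change" means a simple difference between daily values (it could just as well be a percentage), that "shifted back 1 day" means each row only sees the previous day's value, and the helper name add_lagged_change and the sample numbers are invented.

```python
import pandas as pd


def add_lagged_change(df: pd.DataFrame, column: str, days: int) -> pd.DataFrame:
    """Add an N-day change of `column`, lagged by one row.

    Hypothetical helper. Assumes one row per day, sorted by date.
    """
    out = df.copy()
    # Difference between today's value and the value `days` rows earlier.
    change = out[column].diff(days)
    # Lag by one row so each date only carries information from the day before.
    out[f"{column}_{days}D_change"] = change.shift(1)
    return out


# Synthetic daily series (values are invented).
daily = pd.DataFrame(
    {
        "ds": pd.date_range("2020-01-01", periods=5, freq="D"),
        "users_holding": [100, 110, 105, 120, 130],
    }
)

lagged = add_lagged_change(daily, "users_holding", 1)
print(lagged)
```

The first rows of the new column are empty (NaN) because there is no earlier data to difference or lag from; a model would typically drop those rows before fitting.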
custom_functions.py
In my notebook, I had to create a couple of custom functions to produce the graphs used there (this file is explicitly imported into my notebook along with the other Python libraries). If you want to run my notebook, make sure it can find this file so it can run these functions.