2 datasets found

Stock market predictions
kaggle.com
Updated Feb 18, 2024
Cite
Tanishq dublish (2024). Stock market predictions [Dataset]. https://www.kaggle.com/datasets/tanishqdublish/stock-market-predictions
Dataset provided by
Kaggle
Authors
Tanishq dublish
License
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
Description
I prepared this dataset for students in my Deep Learning and NLP course.

But I am also very happy to see kagglers play around with it.

Have fun!

Description:

There are two channels of data provided in this dataset:

News data: I crawled historical news headlines from the Reddit WorldNews channel (/r/worldnews). They are ranked by Reddit users' votes, and only the top 25 headlines are kept for each date. (Range: 2008-06-08 to 2016-07-01)

Stock data: Dow Jones Industrial Average (DJIA) is used to "prove the concept". (Range: 2008-08-08 to 2016-07-01)

I provided three data files in .csv format:

RedditNews.csv: two columns. The first column is the "date", and the second column is the "news headlines". All news items are ranked from top to bottom by how hot they are, so there are 25 lines for each date.

DJIA_table.csv: Downloaded directly from Yahoo Finance: check out the web page for more info.

Combined_News_DJIA.csv: To make things easier for my students, I provide this combined dataset with 27 columns. The first column is "Date", the second is "Label", and the following ones are news headlines ranging from "Top1" to "Top25".
Daily News for Stock Market Prediction
kaggle.com
zip
Updated Nov 13, 2019
Cite
Aaron7sun (2019). Daily News for Stock Market Prediction [Dataset]. https://www.kaggle.com/datasets/aaron7sun/stocknews/discussion/41925
Available download formats: zip (6097730 bytes)
Authors
Aaron7sun
License
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) (https://creativecommons.org/licenses/by-nc-sa/4.0/)
License information was derived automatically
Description
I prepared this dataset for students in my Deep Learning and NLP course.

But I am also very happy to see kagglers play around with it.

Have fun!

Description:

There are two channels of data provided in this dataset:

News data: I crawled historical news headlines from the Reddit WorldNews channel (/r/worldnews). They are ranked by Reddit users' votes, and only the top 25 headlines are kept for each date. (Range: 2008-06-08 to 2016-07-01)

Stock data: Dow Jones Industrial Average (DJIA) is used to "prove the concept". (Range: 2008-08-08 to 2016-07-01)

I provided three data files in .csv format:

RedditNews.csv: two columns. The first column is the "date", and the second column is the "news headlines". All news items are ranked from top to bottom by how hot they are, so there are 25 lines for each date.

DJIA_table.csv: Downloaded directly from Yahoo Finance: check out the web page for more info.

Combined_News_DJIA.csv: To make things easier for my students, I provide this combined dataset with 27 columns. The first column is "Date", the second is "Label", and the following ones are news headlines ranging from "Top1" to "Top25".
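
The layout of the combined file can be sketched as follows. This is only an illustration, not the author's actual script: it assumes the RedditNews.csv rows arrive as (date, headline) pairs already ordered from hottest to coldest within each date, and the helper names `combined_header` and `pivot_headlines`, the `labels` mapping, and the sample headlines are all hypothetical.

```python
from collections import defaultdict

TOP_N = 25  # headlines kept per date, per the description above


def combined_header(top_n=TOP_N):
    """Column names of the combined file: Date, Label, Top1..TopN."""
    return ["Date", "Label"] + [f"Top{i}" for i in range(1, top_n + 1)]


def pivot_headlines(news_rows, labels, top_n=TOP_N):
    """Turn long (date, headline) rows into one wide row per labeled date.

    news_rows: iterable of (date, headline), assumed ranked hottest first
               within each date.
    labels:    dict mapping date -> 0 or 1 (hypothetical input).
    Dates without a label (for example non-trading days) are skipped.
    """
    by_date = defaultdict(list)
    for date, headline in news_rows:
        by_date[date].append(headline)

    wide = []
    for date in sorted(by_date):
        if date not in labels:
            continue
        tops = by_date[date][:top_n]
        tops += [""] * (top_n - len(tops))  # pad short days with empty cells
        wide.append([date, labels[date]] + tops)
    return wide


if __name__ == "__main__":
    # Made-up rows for illustration only.
    rows = [
        ("2008-08-08", "headline a"),
        ("2008-08-08", "headline b"),
        ("2008-08-09", "weekend headline"),
    ]
    print(len(combined_header()))
    print(pivot_headlines(rows, {"2008-08-08": 0}, top_n=2))
```

With the default of 25 headlines, the header has 2 + 25 = 27 columns, which matches the column count stated above.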

=========================================

To my students:

I made this a binary classification task. Hence, there are only two labels:

"1" when DJIA Adj Close value rose or stayed the same;

"0" when DJIA Adj Close value decreased.

For task evaluation, please use data from 2008-08-08 to 2014-12-31 as the Training Set; the Test Set is then the following two years of data (from 2015-01-02 to 2016-07-01). This is roughly an 80%/20% split.
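
The labeling rule and the date split can be sketched together. This is a rough sketch under two assumptions that the description does not spell out: each Adj Close value is compared with the previous trading day's value, and the "Date" column holds ISO-style YYYY-MM-DD strings. The helpers `label_days` and `split_by_date` are hypothetical names, and the prices are made up.

```python
TRAIN_START, TRAIN_END = "2008-08-08", "2014-12-31"
TEST_START, TEST_END = "2015-01-02", "2016-07-01"


def label_days(adj_close):
    """Return one label per day, starting from the second day.

    1 if Adj Close rose or stayed the same versus the previous day, else 0.
    Assumption: the comparison is day over day.
    """
    return [1 if today >= prev else 0
            for prev, today in zip(adj_close, adj_close[1:])]


def split_by_date(rows):
    """Split rows into (train, test) using the date string in position 0.

    YYYY-MM-DD strings sort chronologically as plain text, so no date
    parsing is needed for the range checks.
    """
    train = [r for r in rows if TRAIN_START <= r[0] <= TRAIN_END]
    test = [r for r in rows if TEST_START <= r[0] <= TEST_END]
    return train, test


if __name__ == "__main__":
    # Made-up prices for illustration only.
    print(label_days([100.0, 101.5, 101.5, 99.0]))  # [1, 1, 0]

    rows = [("2014-12-31", 1), ("2015-01-02", 0)]
    train, test = split_by_date(rows)
    print(len(train), len(test))  # 1 1
```

Splitting by date rather than at random keeps every test day later than every training day, so the model is never evaluated on days that precede its training data.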

And, of course, use AUC as the evaluation metric.
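
For reference, AUC can be computed directly from its pairwise definition: the fraction of (positive, negative) pairs in which the positive example gets the higher score, with ties counted as half. The function below is a small pure-Python sketch of that definition, and `auc` is a hypothetical helper name. In practice a library routine such as scikit-learn's `roc_auc_score` is the usual choice.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: sequence of 0/1 ground-truth labels.
    scores: sequence of predicted scores, higher meaning "more likely 1".
    Runs in O(positives * negatives) time, which is fine for a dataset
    of a few thousand days.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")

    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Made-up scores for illustration only.
    print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

A score of 0.5 corresponds to random guessing, which is a useful baseline when the labels are close to balanced.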

=========================================

+++++++++++++++++++++++++++++++++++++++++

To all kagglers:

Please upvote this dataset if you like this idea for market prediction.

If you think you coded an amazing trading algorithm,

friendly advice

do play safe with your own money :)

+++++++++++++++++++++++++++++++++++++++++

Feel free to contact me if you have any questions~

And, remember me when you become a millionaire :P

Note: If you'd like to cite this dataset in your publications, please use:

Sun, J. (2016, August). Daily News for Stock Market Prediction, Version 1. Retrieved [Date You Retrieved This Data] from https://www.kaggle.com/aaron7sun/stocknews.
Stock market predictions

Contains daily news for stock market predictions
