Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The main stock market index of the United States, the US500, rose to 6173 points on June 27, 2025, gaining 0.52% from the previous session. Over the past month, the index has climbed 4.83% and is up 13.05% compared to the same time last year, according to trading on a contract for difference (CFD) that tracks this benchmark index from the United States. United States Stock Market Index - values, historical data, forecasts and news - updated in June 2025.
https://fred.stlouisfed.org/legal/#copyright-pre-approval
View data of the S&P 500, an index of the stocks of 500 leading companies in the US economy, which provides a gauge of the U.S. equity market.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
Interactive chart of the S&P 500 stock market index since 1927. Historical data is inflation-adjusted using the headline CPI and each data point represents the month-end closing value. The current month is updated on an hourly basis with today's latest value.
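The inflation adjustment described above can be sketched as follows. This is a minimal illustration, assuming the usual CPI rescaling (nominal value times latest CPI divided by CPI at the time); the index levels and CPI readings are made-up placeholders, not values from the chart.

```python
def to_real(nominal: float, cpi_then: float, cpi_now: float) -> float:
    """Express a past nominal value in today's dollars via a CPI ratio."""
    return nominal * cpi_now / cpi_then


# Hypothetical month-end closes and headline CPI readings (illustrative only).
month_end_close = {"1990-01": 330.0, "2005-01": 1180.0}
cpi = {"1990-01": 125.0, "2005-01": 190.0, "latest": 320.0}

real_close = {
    month: round(to_real(close, cpi[month], cpi["latest"]), 2)
    for month, close in month_end_close.items()
}
print(real_close)
```

Older data points are scaled up the most, because the CPI ratio grows with the distance from the latest reading.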
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
The United Kingdom's main stock market index, the GB100, rose to 8736 points on June 26, 2025, gaining 0.19% from the previous session. Over the past month, the index has declined 0.48%, though it remains 6.80% higher than a year ago, according to trading on a contract for difference (CFD) that tracks this benchmark index from the United Kingdom. United Kingdom Stock Market Index (GB100) - values, historical data, forecasts and news - updated in June 2025.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
France's main stock market index, the FR40, rose to 7692 points on June 27, 2025, gaining 1.78% from the previous session. Over the past month, the index has declined 1.24%, though it remains 2.84% higher than a year ago, according to trading on a contract for difference (CFD) that tracks this benchmark index from France. France Stock Market Index (FR40) - values, historical data, forecasts and news - updated in June 2025.
https://fred.stlouisfed.org/legal/#copyright-pre-approval
Graph and download economic data for Dow Jones Industrial Average (DJIA) from 2015-06-29 to 2025-06-27 about stock market, average, industry, and USA.
NIFTY 500 is India’s first broad-based stock market index. It contains the top 500 companies listed on the National Stock Exchange (NSE). The NIFTY 500 index represents about 96.1% of free-float market capitalization and 96.5% of the total turnover on the NSE.
NIFTY 500 companies are disaggregated into 72 industry indices. Industry weights in the index reflect industry weights in the market. For example, if the banking sector has a 5% weight in the universe of stocks traded on the NSE, banking stocks in the index would also have an approximate representation of 5% in the index. NIFTY 500 can be used for a variety of purposes such as benchmarking fund portfolios, launching index funds, ETFs, and other structured products.
The dataset comprises various parameters and features for each of the NIFTY 500 Stocks, including Company Name, Symbol, Industry, Series, Open, High, Low, Previous Close, Last Traded Price, Change, Percentage Change, Share Volume, Value in Indian Rupee, 52 Week High, 52 Week Low, 365 Day Percentage Change, and 30 Day Percentage Change.
Company Name: Name of the Company.
Symbol: A stock symbol is a unique series of letters assigned to a security for trading purposes.
Industry: Name of the industry to which the stock belongs.
Series: EQ stands for Equity; in this series, intraday trading is possible in addition to delivery. BE stands for Book Entry; shares falling in the Trade-to-Trade or T-segment are traded in this series, and no intraday trading is allowed. This means trades can only be settled by accepting or giving delivery of shares.
Open: It is the price at which the financial security opens in the market when trading begins. It may or may not be different from the previous day's closing price. The security may open at a higher price than the closing price due to excess demand for the security.
High: It is the highest price at which a stock is traded during the course of the trading day. By definition, it is greater than or equal to both the opening price and the closing price.
Low: Today's low is a security's intraday low trading price. Today's low is the lowest price at which a stock trades over the course of a trading day.
Previous Close: The previous close almost always refers to the prior day's final price of a security when the market officially closed for the day. It can apply to a stock, bond, commodity, futures or options contract, market index, or any other security.
Last Traded Price: The last traded price (LTP) usually differs from the closing price of the day. This is because the closing price of the day on the NSE is the weighted average price of the last 30 minutes of trading, whereas the LTP is the price of the final trade of the day.
Change: For a stock or bond quote, change is the difference between the current price and the last trade of the previous day. For interest rates, change is benchmarked against a major market rate (e.g., LIBOR) and may only be updated as infrequently as once a quarter.
Percentage Change: The change expressed as a percentage of the starting price. Subtract the starting price (for an investment, the purchase price) from the ending price (the selling price); the result is the gain or loss. Divide the gain or loss by the starting price, then multiply the result by 100 to arrive at the percentage change.
Share Volume: Volume is the total number of shares that have been bought or sold in a specific period of time, such as the trading day. Every share that changes hands during the period counts toward the volume.
Value (Indian Rupee): Market value—also known as market cap—is calculated by multiplying a company's outstanding shares by its current market price.
52-Week High: A 52-week high is the highest share price at which a stock has traded during the past year. Many market aficionados view the 52-week high as an important factor in determining a stock's current value and predicting future price movement. 52-week high prices are adjusted for bonus, split, and rights corporate actions.
52-Week Low: A 52-week low is the lowest ...
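The Change, Percentage Change, and closing-price definitions above can be illustrated with a short sketch. It assumes the closing price is a quantity-weighted average of the trades in the final 30 minutes; the trade list and function names are hypothetical and are not part of the dataset.

```python
def change(last_price: float, previous_close: float) -> float:
    """Absolute change versus the previous close."""
    return last_price - previous_close


def percentage_change(last_price: float, previous_close: float) -> float:
    """Change expressed as a percentage of the previous close."""
    return (last_price - previous_close) / previous_close * 100


def weighted_average_price(trades):
    """Quantity-weighted average price of (price, quantity) trades."""
    total_value = sum(price * qty for price, qty in trades)
    total_qty = sum(qty for _, qty in trades)
    return total_value / total_qty


# Hypothetical trades from the last 30 minutes: (price, quantity).
last_half_hour = [(100.0, 200), (101.0, 100), (102.0, 100)]

closing_price = weighted_average_price(last_half_hour)
last_traded_price = last_half_hour[-1][0]
print(closing_price, last_traded_price)
print(change(last_traded_price, 100.0), percentage_change(last_traded_price, 100.0))
```

In this example the weighted-average close and the last traded price differ, which is the situation the Last Traded Price entry describes.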
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
China's main stock market index, the SHANGHAI, fell to 3424 points on June 27, 2025, losing 0.70% from the previous session. Over the past month, the index has climbed 2.52% and is up 15.39% compared to the same time last year, according to trading on a contract for difference (CFD) that tracks this benchmark index from China. China Shanghai Composite Stock Market Index - values, historical data, forecasts and news - updated in June 2025.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
Sweden's main stock market index, the Stockholm, rose to 2506 points on June 27, 2025, gaining 2.26% from the previous session. Over the past month, the index has declined 0.16% and is down 2.46% compared to the same time last year, according to trading on a contract for difference (CFD) that tracks this benchmark index from Sweden. Sweden Stock Market Index - values, historical data, forecasts and news - updated in June 2025.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
Hong Kong's main stock market index, the HK50, fell to 24284 points on June 27, 2025, losing 0.17% from the previous session. Over the past month, the index has climbed 4.41% and is up 37.05% compared to the same time last year, according to trading on a contract for difference (CFD) that tracks this benchmark index from Hong Kong. Hong Kong Stock Market Index (HK50) - values, historical data, forecasts and news - updated in June 2025.
The stock market has become one of the most attractive places to make money, with many losses and many gains. Many have tried to predict the price of a stock and failed miserably, and those who say they can are often the ones hiding their biggest losses. If a stock's price cannot be predicted from price alone, there might be other ways to predict it, or at least to invest in a "better" way; otherwise Warren Buffett wouldn't be as rich as he is now by luck alone. But who says we cannot play around with the data and create our own standard for investing in stocks?
EDA; RNN to predict future price; Trend identifier; Classifier; Stock Recommendation
Feature | Description |
---|---|
Date | date of the price movement |
Open | the first price of security traded in a day |
High | highest price in a day |
Low | lowest price in a day |
Close | the last price of security traded in a day |
Adj Close | adjusted close: the stock's closing price adjusted to reflect its value after accounting for any corporate actions |
Volume | total stock traded in a day |
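To make the Adj Close column concrete, here is a minimal sketch of back-adjusting closing prices for a stock split. It assumes a single hypothetical 2-for-1 split and ignores dividends, which providers such as Yahoo Finance also factor into their adjusted close; the dates and prices are invented.

```python
from datetime import date


def adjust_for_split(rows, split_date, ratio):
    """Divide closes before split_date by ratio so the series is comparable.

    rows: list of (date, close) tuples in chronological order.
    """
    return [
        (day, close / ratio if day < split_date else close)
        for day, close in rows
    ]


# Hypothetical closes around a 2-for-1 split effective on 2024-01-04.
closes = [
    (date(2024, 1, 2), 1000.0),
    (date(2024, 1, 3), 1020.0),
    (date(2024, 1, 4), 515.0),
]
adjusted = adjust_for_split(closes, date(2024, 1, 4), 2.0)
print([close for _, close in adjusted])
```

Without the adjustment, the raw Close series would show a roughly 50% overnight drop that reflects the split rather than any change in the company's value.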
You could also use datasets other than this one. This dataset presents data for all public companies in Indonesia, which might be helpful for certain tasks, e.g. classification by industry.
Yahoo Finance
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
Canada's main stock market index, the TSX, fell to 26692 points on June 27, 2025, losing 0.22% from the previous session. Over the past month, the index has climbed 1.56% and is up 22.02% compared to the same time last year, according to trading on a contract for difference (CFD) that tracks this benchmark index from Canada. Canada Stock Market Index (TSX) - values, historical data, forecasts and news - updated in June 2025.
MIT Licensehttps://opensource.org/licenses/MIT
Dataset Description:
The myusabank.csv dataset contains daily financial data for a fictional bank (MyUSA Bank) over a two-year period. It includes various key financial metrics such as interest income, interest expense, average earning assets, net income, total assets, shareholder equity, operating expenses, operating income, market share, and stock price. The data is structured to simulate realistic scenarios in the banking sector, including outliers, duplicates, and missing values for educational purposes.
Potential Student Tasks:
Data Cleaning and Preprocessing:
Exploratory Data Analysis (EDA):
Calculating Key Performance Indicators (KPIs):
Building Tableau Dashboards:
Forecasting and Predictive Modeling:
Business Insights and Reporting:
Educational Goals:
The dataset aims to provide hands-on experience in data preprocessing, analysis, and visualization within the context of banking and finance. It encourages students to apply data science techniques to real-world financial data, enhancing their skills in data-driven decision-making and strategic analysis.
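As one possible starting point for the cleaning and KPI tasks, the sketch below drops duplicate and incomplete rows and then computes a few common banking ratios. The column names are assumptions based on the metrics listed in the description and may not match the headers in myusabank.csv; the figures are invented.

```python
def clean(rows):
    """Drop exact duplicate rows and rows with missing (None) values."""
    seen, kept = set(), []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key in seen or any(value is None for value in row.values()):
            continue
        seen.add(key)
        kept.append(row)
    return kept


def kpis(row):
    """Standard banking ratios computed from one day's figures."""
    return {
        "net_interest_margin": (row["interest_income"] - row["interest_expense"])
        / row["average_earning_assets"],
        "return_on_assets": row["net_income"] / row["total_assets"],
        "return_on_equity": row["net_income"] / row["shareholder_equity"],
        "efficiency_ratio": row["operating_expenses"] / row["operating_income"],
    }


# Hypothetical rows; column names are assumed, not taken from the file.
day = {
    "interest_income": 120.0, "interest_expense": 40.0,
    "average_earning_assets": 2000.0, "net_income": 30.0,
    "total_assets": 3000.0, "shareholder_equity": 300.0,
    "operating_expenses": 50.0, "operating_income": 100.0,
}
incomplete = dict(day, net_income=None)
cleaned = clean([day, dict(day), incomplete])
result = kpis(cleaned[0])
print(len(cleaned), result)
```

Outlier handling is left out on purpose, since the right treatment (capping, removal, or flagging) is part of what the exercise asks students to decide.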
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
In March 2024, Bitcoin (BTC) reached a new all-time high, with prices exceeding 73,000 USD, marking a milestone for the cryptocurrency market. This surge was due to the approval of Bitcoin exchange-traded funds (ETFs) in the United States, allowing investors to access Bitcoin without directly holding it. This development increased Bitcoin’s credibility and brought fresh demand from institutional investors, echoing previous price surges in 2021, when Tesla announced its 1.5 billion USD investment in Bitcoin and Coinbase was listed on the Nasdaq. By the end of 2022, Bitcoin prices had dropped sharply to 15,000 USD following the collapse of cryptocurrency exchange FTX and its bankruptcy, which caused a loss of confidence in the market. By August 2024, Bitcoin had rebounded to approximately 64,178 USD but remained volatile due to inflation and interest rate hikes.
Unlike fiat currency such as the US dollar, Bitcoin’s supply is finite, with 21 million coins as its maximum supply. By September 2024, over 92 percent of Bitcoin had been mined. Bitcoin’s value is tied to its scarcity, and its mining process is regulated through halving events, which cut the reward for mining roughly every four years. The halving event in 2024 reduced the reward to 3.125 BTC from 6.25 BTC. The final Bitcoin is expected to be mined around 2140. The energy required to mine Bitcoin has led to criticisms about its environmental impact, with estimates in 2021 suggesting that Bitcoin mining used about as much energy in a year as Argentina.
Bitcoin’s future price is difficult to predict due to the influence of large holders, known as whales, who own about 92 percent of all Bitcoin. These whales can cause dramatic market swings by making large trades, and many retail investors still dominate the market. While institutional interest has grown, it remains a small fraction compared to retail. Bitcoin is vulnerable to external factors like regulatory changes and economic crises, leading some to believe it is in a speculative bubble. However, others argue that Bitcoin is still in its early stages of adoption and will grow further as more institutions and governments recognize its potential as a hedge against inflation and a store of value.
2024 has also seen the rise of Bitcoin Layer 2 technologies like the Lightning Network, which improve scalability by enabling faster and cheaper transactions. These innovations are crucial for Bitcoin’s wider adoption, especially for day-to-day use and cross-border remittances. At the same time, central bank digital currencies (CBDCs) are gaining traction, as several governments, including China and the European Union, have accelerated the development of their own state-controlled digital currencies, while Bitcoin remains decentralized, offering financial sovereignty for those who prefer independence from government control. The rise of CBDCs is expected to increase interest in Bitcoin as a hedge against these centralized currencies.
Bitcoin’s journey in 2024 highlights its growing institutional acceptance alongside its inherent market volatility. While the approval of Bitcoin ETFs has significantly boosted interest, the market remains sensitive to events like exchange collapses and regulatory decisions. With the limited supply of Bitcoin and improvements in its transaction efficiency, it is expected to remain a key player in the financial world for years to come. Whether Bitcoin is currently in a speculative bubble or on a sustainable path to greater adoption will ultimately be revealed over time.
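The halving schedule and the 21 million cap mentioned above can be checked with a short calculation. This sketch relies on the widely documented protocol parameters (an initial reward of 50 BTC, a halving every 210,000 blocks, and amounts tracked in whole satoshis); the function names are illustrative only.

```python
HALVING_INTERVAL = 210_000          # blocks between halvings
SATS_PER_BTC = 100_000_000          # 1 BTC = 100 million satoshis
INITIAL_REWARD_SATS = 50 * SATS_PER_BTC


def block_reward_sats(halvings: int) -> int:
    """Block reward after a given number of halvings, rounded down to whole satoshis."""
    return INITIAL_REWARD_SATS >> halvings


def total_supply_sats() -> int:
    """Sum the rewards of every halving era until the reward reaches zero."""
    total, halvings = 0, 0
    while block_reward_sats(halvings) > 0:
        total += HALVING_INTERVAL * block_reward_sats(halvings)
        halvings += 1
    return total


print(block_reward_sats(3) / SATS_PER_BTC)   # reward before the 2024 halving
print(block_reward_sats(4) / SATS_PER_BTC)   # reward after the 2024 halving
print(total_supply_sats() / SATS_PER_BTC)    # just under 21 million BTC
```

Because each era pays half as much as the previous one, the total converges on 21 million BTC; the rounding to whole satoshis leaves it slightly below that figure.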
https://creativecommons.org/publicdomain/zero/1.0/
This dataset builds upon "Financial Statement Data Sets" by incorporating several key improvements to enhance the accuracy and usability of US-GAAP financial data from SEC filings of U.S. exchange-listed companies. Drawing on submissions from January 2009 onward, the enhanced dataset aims to provide analysts with a cleaner, more consistent dataset by addressing common challenges found in the original data.
The source code for data extraction is available here
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
Iran Market Capitalization: Tehran Stock Exchange (TSE) data was reported at 1,743.023 USD bn in Sep 2023. This records a decrease from the previous figure of 1,761.129 USD bn for Aug 2023. The data is updated monthly, with a median of 105.978 USD bn from Dec 2005 to Sep 2023 across 199 observations. The data reached an all-time high of 1,991.152 USD bn in May 2023 and a record low of 33.814 USD bn in Jun 2007. The series remains in active status in CEIC and is reported by Tehran Stock Exchange. The data is categorized under Global Database’s Iran – Table IR.Z002: Tehran Stock Exchange: Market Capitalization.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
South Korea's main stock market index, the KOSPI, fell to 3056 points on June 27, 2025, losing 0.77% from the previous session. Over the past month, the index has climbed 14.45% and is up 9.23% compared to the same time last year, according to trading on a contract for difference (CFD) that tracks this benchmark index from South Korea. South Korea Stock Market - values, historical data, forecasts and news - updated in June 2025.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
1 Introduction
There are several works based on Natural Language Processing on newspaper reports. Rameshbhai et al. [1] mined opinions from headlines using Stanford NLP and SVM, comparing several algorithms on a small and a large dataset. Rubin et al., in their paper [2], created a mechanism to differentiate fake news from real news by building a set of characteristics of news according to their types. The purpose was to contribute to the low resource data available for training machine learning algorithms. Doumit et al. [3] implemented LDA, a topic modeling approach, to study bias present in online news media.
However, not much NLP research has been invested in studying COVID-19. Most applications include classification of chest X-rays and CT scans to detect the presence of pneumonia in the lungs [4], a consequence of the virus. Other research areas include studying the genome sequence of the virus [5][6][7] and replicating its structure to fight the virus and find a vaccine. This research is crucial in battling the pandemic. The few NLP-based research publications include sentiment classification of online tweets by Samuel et al. [8] to understand the fear persisting in people due to the virus. Similar work has been done using an LSTM network to classify sentiments from online discussion forums by Jelodar et al. [9]. To the best of our knowledge, the NKK dataset is the first study on a comparatively larger dataset of newspaper reports on COVID-19, which contributed to awareness of the virus.
2 Data-set Introduction
2.1 Data Collection
We accumulated 1000 online newspaper reports from the United States of America (USA) on COVID-19. The newspapers include The Washington Post (USA) and StarTribune (USA). We have named this dataset “Covid-News-USA-NNK”. We also accumulated 50 online newspaper reports from Bangladesh on the issue and named this dataset “Covid-News-BD-NNK”. The newspapers include The Daily Star (BD) and Prothom Alo (BD). All these newspapers are among the top providers and the most read in their respective countries. The collection was done manually by 10 human data-collectors of age group 23- with university degrees. This approach was preferred over automation to ensure the news reports were highly relevant to the subject. The newspapers' online sites had dynamic content with advertisements in no particular order, so there was a high chance that online scrapers would collect inaccurate news reports. One of the challenges while collecting the data was the subscription requirement: each newspaper required $1 per subscription. Some criteria provided as guidelines to the human data-collectors for collecting the news reports were as follows:
The headline must have one or more words directly or indirectly related to COVID-19.
The content of each news must have 5 or more keywords directly or indirectly related to COVID-19.
The genre of the news can be anything as long as it is relevant to the topic. Political, social, and economic genres are to be prioritized.
Avoid taking duplicate reports.
Maintain a time frame for the above mentioned newspapers.
To collect these data, we used a Google Form for USA and BD. Two human editors went through each entry to check for spam or troll entries.
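Although the collection was manual, the headline, content, and duplicate criteria above could also be checked automatically as a sanity pass. The sketch below is only an illustration: the keyword list is hypothetical, and it assumes that "5 or more keywords" means five or more keyword occurrences in the body.

```python
import re

# Hypothetical keyword list; the collectors' actual list is not given.
COVID_KEYWORDS = {
    "covid", "coronavirus", "pandemic", "quarantine",
    "lockdown", "virus", "outbreak",
}


def keyword_hits(text: str) -> int:
    """Count keyword occurrences in the text (case-insensitive)."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return sum(1 for word in words if word in COVID_KEYWORDS)


def accept(headline: str, body: str, seen_headlines: set) -> bool:
    """Apply the headline, content, and duplicate criteria to one report."""
    key = headline.strip().lower()
    if key in seen_headlines:
        return False                      # avoid duplicate reports
    if keyword_hits(headline) < 1 or keyword_hits(body) < 5:
        return False                      # not relevant enough
    seen_headlines.add(key)
    return True


seen = set()
headline = "Coronavirus outbreak hits local economy"
body = ("The virus spread quickly. The pandemic forced a lockdown, "
        "and quarantine rules followed the outbreak.")
first = accept(headline, body, seen)
second = accept(headline, body, seen)
off_topic = accept("Local team wins title", body, seen)
print(first, second, off_topic)
```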
2.2 Data Pre-processing and Statistics
Some pre-processing steps performed on the newspaper report dataset are as follows:
Remove hyperlinks.
Remove non-English alphanumeric characters.
Remove stop words.
Lemmatize text.
While more pre-processing could have been applied, we tried to keep the data as unchanged as possible, since changing sentence structures could result in the loss of valuable information. While this was done with the help of a script, we also assigned the same human collectors to cross-check for any presence of the above-mentioned criteria.
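The first three pre-processing steps can be sketched with the standard library alone. This is not the authors' script: the stop-word list is a small hypothetical sample, "non-English alphanumeric characters" is interpreted here as anything other than ASCII letters, digits, and whitespace, and lemmatization is left out because it needs an external library such as NLTK or spaCy.

```python
import re

# Small illustrative stop-word list; a real run would use a fuller one.
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "to", "in", "and", "on", "for"}


def preprocess(text: str) -> list:
    """Remove hyperlinks, non-English characters, and stop words."""
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)   # remove hyperlinks
    text = re.sub(r"[^A-Za-z0-9\s]", " ", text)          # keep ASCII letters and digits
    # Original casing is preserved; lowercasing is used only for the stop-word test.
    return [token for token in text.split() if token.lower() not in STOP_WORDS]


tokens = preprocess(
    "The cases in Minnesota are rising: see https://example.com/news for details!"
)
print(tokens)
```

Keeping the original casing follows the stated aim of leaving the data as unchanged as possible.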
The primary data statistics of the two datasets are shown in Tables 1 and 2.
Table 1: Covid-News-USA-NNK data statistics
Statistic | Range |
---|---|
No. of words per headline | 7 to 20 |
No. of words per body content | 150 to 2100 |
Table 2: Covid-News-BD-NNK data statistics
Statistic | Range |
---|---|
No. of words per headline | 10 to 20 |
No. of words per body content | 100 to 1500 |
2.3 Dataset Repository
We used GitHub as our primary data repository under the account name NKK^1. Here, we created two repositories, USA-NKK^2 and BD-NNK^3. The dataset is available in both CSV and JSON formats. We regularly update the CSV files and regenerate the JSON using a Python script. We also provide a Python script file for essential operations. We welcome all outside collaboration to enrich the dataset.
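Regenerating JSON from the CSV files might look like the sketch below. It is not the repository's script; the column names headline and body are assumptions, and the example reads from an in-memory string so that it runs without the actual files.

```python
import csv
import io
import json


def csv_to_json(csv_text: str) -> str:
    """Convert CSV text with a header row into a JSON array of objects."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    return json.dumps(rows, ensure_ascii=False, indent=2)


# Hypothetical two-column sample standing in for the real CSV file.
sample_csv = "headline,body\nVirus spreads,Officials respond\n"
json_text = csv_to_json(sample_csv)
print(json_text)
```

For files on disk, the same function can be fed the result of reading the CSV file as text, with the output written to a .json file next to it.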
3 Literature Review
Natural Language Processing (NLP) deals with text (also known as categorical) data in computer science, utilizing numerous diverse methods like one-hot encoding, word embedding, etc., that transform text into numerical representations, which can be fed to multiple machine learning and deep learning algorithms.
Some well-known applications of NLP include fraud detection on online media sites [10], authorship attribution in fallback authentication systems [11], intelligent conversational agents or chatbots [12], and machine translation as used by Google Translate [13]. While these are all downstream tasks, several exciting developments have been made in algorithms designed solely for Natural Language Processing tasks. The two most prominent ones are BERT [14], which uses a bidirectional Transformer encoder and performs classification tasks with high accuracy after pre-training on masked-word and next-sentence prediction, and the GPT-3 models released by OpenAI [15], which can generate almost human-like text. However, these are typically used as pre-trained models, since training them carries a huge computation cost.
Information Extraction is a generalized concept of retrieving information from a dataset. Information extraction from an image could be retrieving vital feature spaces or targeted portions of an image; information extraction from speech could be retrieving information about names, places, etc. [16]. Information extraction in texts could be identifying named entities and locations or essential data. Topic modeling is a sub-task of NLP and also a process of information extraction. It clusters words and phrases of the same context together into groups. Topic modeling is an unsupervised learning method that gives us a brief idea about a set of texts. One commonly used topic modeling method is Latent Dirichlet Allocation, or LDA [17].
Keyword extraction is a process of information extraction and sub-task of NLP to extract essential words and phrases from a text. TextRank [ 18 ] is an efficient keyword extraction technique that uses graphs to calculate the weight of each word and pick the words with more weight to it.
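The graph-based idea behind TextRank can be sketched in a few lines. This is a simplified illustration rather than the published algorithm: it builds an unweighted co-occurrence graph over all words within a sliding window and runs a PageRank-style iteration, without the part-of-speech filtering and stop-word removal that a fuller implementation would apply. The parameter names and defaults are assumptions.

```python
import re
from collections import defaultdict


def textrank_keywords(text, window=2, damping=0.85, iterations=50, top_k=3):
    """Rank words by a PageRank-style score on a co-occurrence graph."""
    words = re.findall(r"[a-z]+", text.lower())

    # Link every pair of distinct words that appear within `window` positions.
    neighbors = defaultdict(set)
    for i, word in enumerate(words):
        for j in range(i + 1, min(i + window + 1, len(words))):
            if words[j] != word:
                neighbors[word].add(words[j])
                neighbors[words[j]].add(word)

    # Each word repeatedly receives an equal share of its neighbors' scores.
    scores = {word: 1.0 for word in neighbors}
    for _ in range(iterations):
        scores = {
            word: (1 - damping)
            + damping * sum(scores[n] / len(neighbors[n]) for n in neighbors[word])
            for word in neighbors
        }
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


top = textrank_keywords(
    "mask virus mask economy mask health mask jobs", window=1, top_k=1
)
print(top)
```

In the toy sentence, "mask" co-occurs with every other word, so it collects the most weight and is returned as the top keyword.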
Word clouds are a great visualization technique to understand the overall ’talk of the topic’. The clustered words give us a quick understanding of the content.
4 Our experiments and Result analysis
We used the wordcloud library^4 to create the word clouds. Figures 1 and 3 present the word clouds of the Covid-News-USA-NNK dataset by month from February to May. From Figures 1, 2, and 3, we can point out the following:
In February, both newspapers talked about China and the source of the outbreak.
StarTribune emphasized Minnesota as the most concerned state. In April, this concern seemed to grow.
Both newspapers talked about the virus impacting the economy, i.e., banks, elections, administrations, and markets.
Washington Post discussed global issues more than StarTribune.
In February, StarTribune mentioned the first precautionary measure, wearing masks, and the uncontrollable spread of the virus throughout the nation.
While both newspapers mentioned the outbreak in China in February, the spread in the United States is highlighted more heavily from March through May, displaying the critical impact caused by the virus.
We used a script to extract all numbers related to certain keywords like ‘Deaths’, ‘Infected’, ‘Died’, ‘Infections’, ‘Quarantined’, ‘Lock-down’, ‘Diagnosed’, etc. from the news reports and created a series of case counts for both newspapers. Figure 4 shows the statistics of this series. From this extraction technique, we can observe that April was the peak month for COVID cases, as the count rose gradually from February. Both newspapers clearly show that the rise in COVID cases from February to March was slower than the rise from March to April. This is an important indicator of possible recklessness in preparations to battle the virus. However, the steep fall from April to May also shows the positive response against the attack.
We used Vader Sentiment Analysis to extract the sentiment of the headlines and the body. On average, the sentiments were from -0.5 to -0.9. The Vader Sentiment scale ranges from -1 (highly negative) to 1 (highly positive). There were some cases where the sentiment scores of the headline and body contradicted each other, i.e., the sentiment of the headline was negative but the sentiment of the body was slightly positive. Overall, sentiment analysis can help us sort the most concerning (most negative) news from the positive ones, from which we can learn more about the indicators related to COVID-19 and the serious impact caused by it. Moreover, sentiment analysis can also provide us with information about how a state or country is reacting to the pandemic.
We used the PageRank algorithm to extract keywords from the headlines as well as the body content. PageRank efficiently highlights important relevant keywords in the text. Some frequently occurring important keywords extracted from both datasets are: ‘China’, ‘Government’, ‘Masks’, ‘Economy’, ‘Crisis’, ‘Theft’, ‘Stock market’, ‘Jobs’, ‘Election’, ‘Missteps’, ‘Health’, ‘Response’. Keyword extraction acts as a filter allowing quick searches for indicators in case of locating situations of the economy,
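The number-extraction step described in this section could be approximated with a regular expression, as in the sketch below. The authors' script is not shown, so this is only an assumed approach: it captures a number followed, within a few words, by one of the listed keywords, and it omits the hyphenated ‘Lock-down’ for simplicity.

```python
import re

KEYWORDS = ["deaths", "infected", "died", "infections", "quarantined", "diagnosed"]

# A number, then up to three intervening words, then one of the keywords.
PATTERN = re.compile(
    r"(\d[\d,]*)\s+(?:\w+\s+){0,3}?(" + "|".join(KEYWORDS) + r")\b",
    re.IGNORECASE,
)


def extract_counts(text: str) -> dict:
    """Map each keyword to the list of numbers reported next to it."""
    counts = {}
    for number, keyword in PATTERN.findall(text):
        value = int(number.replace(",", ""))
        counts.setdefault(keyword.lower(), []).append(value)
    return counts


report = "Officials reported 1,200 new infections and 35 deaths on Monday."
counts = extract_counts(report)
print(counts)
```

Summing these values per month and per newspaper would give a series comparable to the one the text describes, although a pattern like this will miss numbers written out in words.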
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
Interactive chart of historical daily platinum prices back to 1985. The price shown is in U.S. Dollars per troy ounce.
The initial sample of this study covers the A-share companies listed on the Shanghai and Shenzhen stock exchanges during the period 2008-2020. We then screened and processed the initial sample data as follows. (a) We screened for companies with both RepRisk's ESG rating and Bloomberg's ESG rating. Specifically, the selection is based on samples with the same ISIN code and the same English company name in the Bloomberg and RepRisk Index (RRI) databases. The ISIN code is a securities coding standard developed by the International Organization for Standardization (ISO) and is a unique code used to identify securities in each country or region around the world. We exclude samples that do not provide ISIN codes or have inconsistent English names. (b) We exclude observations with missing values for the main variables. (c) We exclude the ST, *ST and PT trading status samples during the observation period. Our final sample contains 1352 firm-year observations.
The ESG disclosure score data and ESG performance score data required for the ESG-washing construction are obtained, respectively, from the Bloomberg database and the RepRisk Index (RRI) database of Wharton Research Data Services (WRDS). Positive media coverage data is sourced from the China Research Data Services Platform (CNRDS), while the instrumental variable (IV_population) is obtained from the EPS database and Juhe Data (https://www.gotohui.com/). Unless otherwise stated, all other data in this study are from the China Stock Market and Accounting Research (CSMAR) database. Data on executive company changes were collected manually by the authors, working back-to-back and independently. We then compared and reconciled the data collected by each author, and where there were discrepancies, we collected and calibrated the data again to maximize their reliability. We first obtained executive biographies from the CSMAR database, and the missing values were retrieved from Sina Finance (https://finance.sina.com.cn/).
Due to the unstructured nature of the resume data, we manually processed more than 30,000 resumes of executives to get the data of executives' company changes, based on which we calculated the per capita number of job hops of all executives in each company. The number of part-time jobs held by executives also reflects their pursuit of career changes and development, so in the robustness test the per capita mean of the number of part-time jobs held by executives is used as a proxy variable for careerist orientation. These data can be obtained directly from the CSMAR database.
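The per capita job-hop measure can be illustrated with a small calculation. This sketch assumes one record per executive per firm, each carrying a count of that executive's company changes; the identifiers and counts are invented, and the study's actual data layout is not specified.

```python
from collections import defaultdict


def per_capita_job_hops(records):
    """Average number of company changes per executive, by firm.

    records: iterable of (firm_id, executive_id, company_changes) tuples.
    """
    total_hops = defaultdict(int)
    executives = defaultdict(set)
    for firm_id, executive_id, company_changes in records:
        total_hops[firm_id] += company_changes
        executives[firm_id].add(executive_id)
    return {
        firm_id: total_hops[firm_id] / len(executives[firm_id])
        for firm_id in total_hops
    }


# Hypothetical records: (firm, executive, number of company changes).
records = [("A", "e1", 2), ("A", "e2", 4), ("B", "e3", 1)]
hops = per_capita_job_hops(records)
print(hops)
```

The same function would apply to the robustness-test proxy by passing the number of part-time jobs in place of the number of company changes.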