Public Domain (CC0 1.0): https://creativecommons.org/publicdomain/zero/1.0/
The provided dataset was extracted from Yahoo Finance using pandas and the yfinance library in Python. It covers stock market indices of the world's top economies. The code generated data from Jan 01, 2003 to Jun 30, 2023, more than 20 years. There are 18 CSV files: 16 cover 16 different stock market indices from 7 countries, while the remaining two contain the annualized return and the compound annual growth rate (CAGR) computed from the extracted data. Below is the list of countries along with the number of indices extracted through the yfinance library.
[Image: Number of indices per country] https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F15657145%2F90ce8a986761636e3edbb49464b304d8%2FNumber%20of%20Index.JPG?generation=1688490342207096&alt=media
This dataset is useful for research purposes, particularly for comparative analyses of capital market performance, and can be combined with other economic indicators.
There are 18 distinct CSV files associated with this dataset. The first 16 contain the individual indices, and the last two contain the annualized return for each year and the CAGR of each index. Blank values in a column indicate that the index was launched in a later year: for instance, BSE 500 (India) was launched in 2007, so its earlier values are blank; similarly, China_Top300 was launched in 2021, so its earlier fields are blank too.
The extraction process applies different criteria: in the 16 index CSV files all columns are included, and Adj Close is used to calculate the annualized return. The algorithm extracts data based on the index name (the ticker code assigned by Yahoo Finance) and the start and end dates.
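As a rough illustration of this step (not the exact extraction script; the ticker symbols and output file names below are assumptions), the download can be reproduced with yfinance:

```python
# Minimal sketch of the extraction step: download daily data per index ticker
# between the start and end dates and write one CSV per index.
import yfinance as yf

# Illustrative subset of the 16 indices; the Yahoo Finance ticker codes are assumptions here.
indices = {"Sensex": "^BSESN", "Nasdaq": "^IXIC", "Nikkei225": "^N225"}
start, end = "2003-01-01", "2023-06-30"

for name, ticker in indices.items():
    df = yf.download(ticker, start=start, end=end, auto_adjust=False)  # keep the Adj Close column
    df["Year"], df["Month"], df["Day"] = df.index.year, df.index.month, df.index.day
    df.to_csv(f"{name}.csv")
```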
The annualized return and CAGR have been calculated and are illustrated in the images below; a machine-readable file (CSV) is attached for each.
To extract the data provided in the attachment, various criteria were applied:
Content Filtering: The data was filtered based on several attributes, including the index name and the start and end dates. This filtering ensured that only data meeting the specified criteria was retained.
Collaborative Filtering: A second technique, collaborative filtering via Yahoo Finance, relies on index similarity. This involves finding indices that are similar to a given index, or extending the dataset's scope to other countries or economies. Using this method, the algorithm identifies and extracts data based on similarities between indices.
Of the last two CSV files, one contains the annualized return, which was calculated from the Adj Close column and stored in a new DataFrame. Below is the image of annualized returns for all indices (if unreadable, the machine-readable CSV is attached to the dataset).
In terms of annualized rate of return, the Indian stock market indices lead most of the time, followed by the USA, Canadian and Japanese indices.
[Image: Annualized returns of all indices] https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F15657145%2F37645bd90623ea79f3708a958013c098%2FAnnualized%20Return.JPG?generation=1688525901452892&alt=media
The best performing index by compound growth is the Sensex (India), which comprises the top 30 companies, at 15.60%, followed by the Nifty 500 (India) at 11.34% and the Nasdaq (USA) at 10.60%.
The worst performing index is China Top 300; however, it was launched in 2021 (post-pandemic), so it cannot be fairly assessed yet due to limited data availability. UK and Russian indices are also among the five worst performers.
[Image: CAGR of all indices] https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F15657145%2F58ae33f60a8800749f802b46ec1e07e7%2FCAGR.JPG?generation=1688490409606631&alt=media
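For reference, a minimal sketch of how the yearly return and CAGR can be derived from the Adj Close column (the file name and the Date column are assumptions based on the variable list below):

```python
# Yearly return and CAGR from Adj Close for a single index CSV (illustrative file name).
import pandas as pd

df = pd.read_csv("Sensex.csv", parse_dates=["Date"], index_col="Date")

# Yearly return: percentage change of the last Adj Close of each calendar year
yearly_close = df["Adj Close"].groupby(df.index.year).last()
yearly_return = yearly_close.pct_change() * 100

# CAGR over the full period
n_years = (df.index[-1] - df.index[0]).days / 365.25
cagr = (df["Adj Close"].iloc[-1] / df["Adj Close"].iloc[0]) ** (1 / n_years) - 1

print(yearly_return.round(2))
print(f"CAGR: {cagr:.2%}")
```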
Geography: Stock Market Index of the World Top Economies
Time period: Jan 01, 2003 – June 30, 2023
Variables: Stock Market Index Title, Open, High, Low, Close, Adj Close, Volume, Year, Month, Day, Yearly_Return and CAGR
File Type: CSV file
This is not financial advice; due diligence is required for each investment decision.
Measuring the usage of informatics resources such as software tools and databases is essential to quantifying their impact, value and return on investment. We have developed a publicly available dataset of informatics resource publications and their citation network, along with an associated metric (u-Index) to measure informatics resources’ impact over time. Our dataset differentiates the context in which citations occur to distinguish between ‘awareness’ and ‘usage’, and uses a citing universe of open access publications to derive citation counts for quantifying impact. Resources with a high ratio of usage citations to awareness citations are likely to be widely used by others and have a high u-Index score. We have pre-calculated the u-Index for nearly 100,000 informatics resources. We demonstrate how the u-Index can be used to track informatics resource impact over time. The method of calculating the u-Index metric, the pre-computed u-Index values, and the dataset we compiled to calculate the u-Index are publicly available.
The Case Mix Index (CMI) is the average relative DRG weight of a hospital’s inpatient discharges, calculated by summing the Medicare Severity-Diagnosis Related Group (MS-DRG) weight for each discharge and dividing the total by the number of discharges. The CMI reflects the diversity, clinical complexity, and resource needs of all the patients in the hospital. A higher CMI indicates a more complex and resource-intensive case load. Although the MS-DRG weights, provided by the Centers for Medicare & Medicaid Services (CMS), were designed for the Medicare population, they are applied here to all discharges regardless of payer. Note: It is not meaningful to add the CMI values together.
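As a small worked illustration of the definition above (the weights are made-up numbers, not CMS data):

```python
# Case Mix Index = sum of MS-DRG relative weights over all discharges / number of discharges.
drg_weights = [0.9234, 1.8632, 3.1542, 0.7411]  # one MS-DRG weight per inpatient discharge (illustrative)

cmi = sum(drg_weights) / len(drg_weights)
print(f"Case Mix Index: {cmi:.4f}")  # ~1.6705: a relatively complex, resource-intensive case load
```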
Public Domain (CC0 1.0): https://creativecommons.org/publicdomain/zero/1.0/
Reference: https://www.zillow.com/research/zhvi-methodology/
In setting out to create a new home price index, a major problem Zillow sought to overcome in existing indices was their inability to deal with the changing composition of properties sold in one time period versus another time period. Both a median sale price index and a repeat sales index are vulnerable to such biases (see the analysis here for an example of how influential the bias can be). For example, if expensive homes sell at a disproportionately higher rate than less expensive homes in one time period, a median sale price index will characterize this market as experiencing price appreciation relative to the prior period of time even if the true value of homes is unchanged between the two periods.
The ideal home price index would be based on sale prices for the same set of homes in each time period, so there would never be an issue of the sales mix differing across periods. This approach of using a constant basket of goods is widely used, common examples being a commodity price index and a consumer price index. Unfortunately, unlike commodities and consumer goods, for which we can observe prices in all time periods, we can’t observe prices on the same set of homes in all time periods because not all homes are sold in every time period.
The innovation that Zillow developed in 2005 was a way of approximating this ideal home price index by leveraging the valuations Zillow creates on all homes (called Zestimates). Instead of actual sale prices on every home, the index is created from estimated sale prices on every home. While there is some estimation error associated with each estimated sale price (which we report here), this error is just as likely to be above the actual sale price of a home as below (in statistical terms, this is referred to as minimal systematic error). Because of this fact, the distribution of actual sale prices for homes sold in a given time period looks very similar to the distribution of estimated sale prices for this same set of homes. But, importantly, Zillow has estimated sale prices not just for the homes that sold, but for all homes even if they didn’t sell in that time period. From this data, a comprehensive and robust benchmark of home value trends can be computed which is immune to the changing mix of properties that sell in different periods of time (see Dorsey et al. (2010) for another recent discussion of this approach).
For an in-depth comparison of the Zillow Home Value Index to the Case-Shiller Home Price Index, please refer to the Zillow Home Value Index Comparison to Case-Shiller.
Each Zillow Home Value Index (ZHVI) is a time series tracking the monthly median home value in a particular geographical region. In general, each ZHVI time series begins in April 1996. We generate the ZHVI at eight geographic levels: neighborhood, ZIP code, city, congressional district, county, metropolitan area, state and the nation.
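A hedged sketch of the construction described above (column names and values are illustrative, not Zillow's internal schema): the index value for a region and month is the median of the estimated sale prices (Zestimates) of all homes in that region.

```python
# Monthly median of estimated sale prices (Zestimates) per region, the core of the ZHVI idea.
import pandas as pd

zestimates = pd.DataFrame({
    "region": ["94107", "94107", "94107", "10001", "10001"],
    "month": ["2023-06", "2023-06", "2023-06", "2023-06", "2023-06"],
    "zestimate": [1_250_000, 980_000, 1_430_000, 765_000, 812_000],
})

zhvi_like = (
    zestimates.groupby(["region", "month"])["zestimate"]
    .median()
    .rename("median_estimated_value")
)
print(zhvi_like)
```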
Estimated sale prices (Zestimates) are computed based on proprietary statistical and machine learning models. These models begin the estimation process by subdividing all of the homes in the United States into micro-regions, or subsets of homes either near one another or similar in physical attributes to one another. Within each micro-region, the models observe recent sale transactions and learn the relative contribution of various home attributes in predicting the sale price. These home attributes include physical facts about the home and land, prior sale transactions, tax assessment information and geographic location. Based on the patterns learned, these models can then estimate sale prices on homes that have not yet sold.
The sale transactions from which the models learn patterns include all full-value, arms-length sales that are not foreclosure resales. The purpose of the Zestimate is to give consumers an indication of the fair value of a home under the assumption that it is sold as a conventional, non-foreclosure sale. Similarly, the purpose of the Zillow Home Value Index is to give consumers insight into the home value trends for homes that are not being sold out of foreclosure status. Zillow research indicates that homes sold as foreclosures have typical discounts relative to non-foreclosure sales of between 20 and 40 percent, depending on the foreclosure saturation of the market. This is not to say that the Zestimate is not influenced by foreclosure resales. Zestimates are, in fact, influenced by foreclosure sales, but the pathway of this influence is through the downward pressure foreclosure sales put on non-foreclosure sale prices. It is the price signal observed in the latter that we are attempting to measure and, in turn, predict with the Zestimate.
Market Segments Within each region, we calculate the ZHVI for various subsets of homes (or mar...
Public Domain (CC0 1.0): https://creativecommons.org/publicdomain/zero/1.0/
The GDI measures gender gaps in human development achievements by accounting for disparities between women and men in three basic dimensions of human development (health, knowledge and living standards), using the same component indicators as in the HDI. The GDI is the ratio of the HDIs calculated separately for females and males using the same methodology as in the HDI. It is a direct measure of the gender gap, showing the female HDI as a percentage of the male HDI.
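In equation form (following the UNDP definition quoted above, where the HDI is the geometric mean of the three dimension indices):

```latex
\mathrm{GDI} = \frac{\mathrm{HDI}_{\mathrm{female}}}{\mathrm{HDI}_{\mathrm{male}}},
\qquad
\mathrm{HDI} = \left( I_{\mathrm{health}} \cdot I_{\mathrm{education}} \cdot I_{\mathrm{income}} \right)^{1/3}
```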
[Image: Gender Development Index calculation — https://hdr.undp.org/sites/default/files/gdi_2020.jpg]
Licence: https://eidc.ceh.ac.uk/licences/historic-SPI/plain
5km gridded Standardised Precipitation Index (SPI) data for Great Britain. SPI is a drought index based on the probability of precipitation for a given accumulation period, as defined by McKee et al. [1]. There are seven accumulation periods: 1, 3, 6, 9, 12, 18 and 24 months, and for each period SPI is calculated for each of the twelve calendar months. Note that values in the monthly (and, for longer accumulation periods, also annual) time series are therefore likely to be autocorrelated. The standard period used to fit the gamma distribution is 1961-2010. The dataset covers the period from 1862 to 2015. This version supersedes previous versions (versions 2 and 3) of the same dataset due to minor errors in the data files. NOTE: the difference between this dataset and the previously published dataset 'Gridded Standardized Precipitation Index (SPI) using gamma distribution with standard period 1961-2010 for Great Britain [SPIgamma61-10]' (Tanguy et al., 2015; https://doi.org/10.5285/94c9eaa3-a178-4de4-8905-dbfab03b69a0), apart from the temporal and spatial extent, is the underlying rainfall data from which SPI was calculated. In the previously published dataset, CEH-GEAR (Tanguy et al., 2014; https://doi.org/10.5285/5dc179dc-f692-49ba-9326-a6893a503f6e) was used, whereas in this new version, Met Office 5km rainfall grids were used (see supporting information for more details). The methodology to calculate SPI is the same in the two datasets. [1] McKee, T. B., Doesken, N. J., Kleist, J. (1993). The Relationship of Drought Frequency and Duration to Time Scales. Eighth Conference on Applied Climatology, 17-22 January 1993, Anaheim, California.
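A hedged sketch of the standard SPI transform (McKee et al., 1993) under simplifying assumptions: the zero-precipitation adjustment used in the full method is omitted, and the precipitation series below is synthetic, not the Met Office grids.

```python
# Fit a gamma distribution to one calendar month's accumulated precipitation over the
# 1961-2010 standard period, then map the cumulative probability to a standard normal
# quantile (the equiprobability transform that defines SPI).
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
reference_totals = rng.gamma(shape=2.0, scale=30.0, size=50)  # e.g. 3-month totals, one per year

shape, loc, scale = stats.gamma.fit(reference_totals, floc=0)  # location fixed at zero, as is conventional

new_total = 45.0  # accumulation value to standardise
spi = stats.norm.ppf(stats.gamma.cdf(new_total, shape, loc=loc, scale=scale))
print(f"SPI = {spi:.2f}")
```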
The Consumer Price Index (CPI) is a measure of the average change over time in the prices paid by urban consumers for a market basket of consumer goods and services. Indexes are available for the U.S. and various geographic areas. Average price data for select utility, automotive fuel, and food items are also available. Prices for the goods and services used to calculate the CPI are collected in 75 urban areas throughout the country and from about 23,000 retail and service establishments. Data on rents are collected from about 43,000 landlords or tenants. More information and details about the data provided can be found at http://www.bls.gov/cpi
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
We extend our previous work with the Yost Index by adding 90% confidence intervals to the index values. These were calculated using the variance replicate estimates published in association with the American Community Survey of the United States Census Bureau.
In the file yost-tract-2015-2019.csv, the data fields consist of an 11-digit geographic ID built from FIPS codes (2-digit state, 3-digit county, 6-digit census tract), the Yost index, the 90% lower confidence interval, and the 90% upper confidence interval. Data are provided for 72,793 census tracts for which sufficient data were available. The Yost Index ranges from 1 (lowest socioeconomic position) to 100 (highest socioeconomic position).
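A hedged loading sketch (the column names are assumptions; check the file header before use):

```python
# Read the Yost index file and split the 11-digit geographic ID into its FIPS parts.
import pandas as pd

# dtype=str preserves leading zeros in the geographic ID.
df = pd.read_csv("yost-tract-2015-2019.csv", dtype={"geoid": str})

df["state_fips"] = df["geoid"].str[:2]    # 2-digit state
df["county_fips"] = df["geoid"].str[2:5]  # 3-digit county
df["tract_fips"] = df["geoid"].str[5:]    # 6-digit census tract
print(df.head())
```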
For those only interested in using the index as we have calculated it, the file yost-tract-2015-2019.csv is the only file you need. The other 368 files here are provided for anyone who wishes to replicate our results using the R program yost-conf-intervals.R. The program presumes the user is running a Windows machine and that all files reside in a folder called C:/yostindex. The R program requires a number of packages, all of which are specified in lines 10-22 of the program.
Details of this project were published in Boscoe FP, Liu B, LaFantasie J, Niu L, Lee FF. Estimating uncertainty in a socioeconomic index derived from the American Community Survey. SSM-Population Health 2022; 18: 101078.
Additional years of data following this format are planned to be added to this repository in time.
The endpoints selected for evaluation of the HIINT formula were percent relative liver weight of mice (PcLiv) and the logarithm of ALT [Log(ALT)], where the log transformation was used to help stabilize the increases in variance with dose found in the ALT dataset.
The Superdiversity dataset includes the Superdiversity Index (SI), calculated on the diversity of the emotional content expressed in texts of different communities. The emotional valences of words used by a community are extracted from Twitter data produced by that specific community. The Superdiversity dataset includes the SI built on Twitter data and lexicon-based Sentiment Analysis. In addition, the dataset comprises other possible diversity measures calculated from the same data as the SI, such as the number of tweets in the community language, the Type-Token Ratio, and the number of languages in a community. The SI ranges in [0, 1]: a value of 0 means the computed valences are very close to a standard emotional lexicon; a value of 0.5 indicates no correlation between the emotional content of words used by the community on Twitter and the standard emotional content; a value of 1 would correspond to the use of terms with the opposite emotional content compared to the standard. Data is computed at three different geographical scales based on the Classification of Territorial Units for Statistics (NUTS), i.e., NUTS1, NUTS2, and NUTS3, for two nations, Italy and the United Kingdom. The untagged Twitter dataset is composed of just under 73,175,500 geolocalised tweets gathered over 3 months, from 1st August to 31st October 2015.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The Global Aridity Index (Global-AI) and Global Reference Evapo-Transpiration (Global-ET0) datasets provided in Version 3 of the Global Aridity Index and Potential Evapo-Transpiration (ET0) Database (Global-AI_PET_v3) provide high-resolution (30 arc-seconds) global raster data for the 1970-2000 period, related to evapotranspiration processes and rainfall deficit for potential vegetative growth, based upon implementation of the FAO-56 Penman-Monteith Reference Evapotranspiration (ET0) equation.
The Aridity Index represents the ratio between precipitation and ET0, i.e. rainfall over vegetation water demand (aggregated on an annual basis). Under this formulation, Aridity Index values increase for more humid conditions and decrease for more arid conditions. The Aridity Index values reported within the Global-AI geodataset have been multiplied by a factor of 10,000 to derive and distribute the data as integers (with 4-decimal accuracy); this multiplier increases the precision of the stored values without using decimals. A Readme file is provided with a detailed description of the dataset files, and an accompanying article describes the methodology and a technical validation. The Global-AI_PET_v3 datasets are provided for non-commercial use in standard GeoTIFF format, at 30 arc-seconds or ~1 km at the equator.
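A minimal sketch of undoing that scaling when reading the raster (the file name is an assumption; any Global-AI GeoTIFF from the archive works the same way):

```python
# Recover floating-point Aridity Index values from the integer GeoTIFF by dividing
# by the 10,000 scaling factor described above.
import numpy as np
import rasterio

with rasterio.open("global_ai_yearly.tif") as src:  # illustrative file name
    scaled = src.read(1).astype("float64")
    nodata = src.nodata

aridity_index = scaled / 10_000.0
if nodata is not None:
    aridity_index[scaled == nodata] = np.nan  # mask nodata cells
print(np.nanmean(aridity_index))
```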
This data release provides tabulated liquefaction potential index (LPI) values calculated for a standard set of magnitudes (M), peak ground accelerations (PGA), and groundwater depths (GWD), as described in detail in Engler and others (2025). We use these data to rapidly interpolate LPI values for any M-PGA-GWD combination. The LPI results are computed at cone penetration test (CPT) sites in the San Francisco Bay Area (Holzer and others, 2010). Additionally, the CPT sites are classified using surface geology maps (Wentworth and others, 2023; Wills and others, 2015; Witter and others, 2006).
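A hedged sketch of the rapid-interpolation idea described above (the grid and its values are placeholders; the published tables supply the real LPI values at each CPT site):

```python
# Interpolate LPI for an arbitrary magnitude (M), PGA, and groundwater depth (GWD)
# combination from a pre-computed table defined on a regular M x PGA x GWD grid.
import numpy as np
from scipy.interpolate import RegularGridInterpolator

magnitudes = np.array([6.0, 7.0, 8.0])
pgas = np.array([0.1, 0.3, 0.5])   # g
gwds = np.array([1.0, 3.0, 5.0])   # metres below ground surface
lpi_table = np.random.default_rng(1).uniform(0, 15, size=(3, 3, 3))  # placeholder LPI values

interp = RegularGridInterpolator((magnitudes, pgas, gwds), lpi_table)
print(interp([[7.2, 0.25, 2.0]]))  # LPI estimate for M = 7.2, PGA = 0.25 g, GWD = 2 m
```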
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The Global Aridity Index (Global-Aridity_ET0) and Global Reference Evapotranspiration (Global-ET0) Version 2 dataset provides high-resolution (30 arc-seconds) global raster climate data for the 1970-2000 period, related to evapotranspiration processes and rainfall deficit for potential vegetative growth, based upon the implementation of a Penman-Monteith evapotranspiration equation for a reference crop. The dataset follows the development of, and is based upon, WorldClim 2.0: http://worldclim.org/version2. The Aridity Index represents the ratio between precipitation and ET0, i.e. rainfall over vegetation water demand (aggregated on an annual basis). Under this formulation, Aridity Index values increase for more humid conditions and decrease for more arid conditions. The Aridity Index values reported within the Global Aridity Index_ET0 geodataset have been multiplied by a factor of 10,000 to derive and distribute the data as integers (with 4-decimal accuracy); this multiplier increases the precision of the stored values without using decimals. The Global-Aridity_ET0 and Global-ET0 datasets are provided for non-commercial use in standard GeoTIFF format, at 30 arc-seconds or ~1 km at the equator.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This ULF index is constructed from 1-minute ground-based magnetic field observations provided by SuperMAG. The index is derived for the north (N) and east (E) magnetic field components from magnetometers in the northern hemisphere between 65 and 70 degrees magnetic latitude, divided into four MLT sectors. The index is available from January 1995 to December 2023. The index is given in monthly files with the following naming convention: Pc5_MLTd_index_LAT_65_70_YYYYMM_COMPONENT.dat. Each file contains the columns listed below; a minimal loading sketch follows the MLT sector definitions.
date: YYYY-MM-DD
time: hh:mm:ss
P: Pc5 index in a given sector (nT^2)
N: number of SuperMAG stations used to calculate the index
Definition of MLT sectors:
Day: 6-18 MLT
Night: 18-6 MLT
Dawn: 3-9 MLT
Noon: 9-15 MLT
Dusk: 15-21 MLT
Midnight: 21-3 MLT
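A hedged loading sketch (the delimiter and column layout are assumptions; inspect a sample .dat file before relying on this):

```python
# Build a file name following the naming convention above and load one monthly file.
import pandas as pd

year, month, component = 2015, 3, "N"  # illustrative choices
fname = f"Pc5_MLTd_index_LAT_65_70_{year}{month:02d}_{component}.dat"

df = pd.read_csv(
    fname,
    sep=r"\s+",                        # assumed whitespace-delimited
    names=["date", "time", "P", "N"],  # columns as described above (assumed order)
    comment="#",
)
print(df.head())
```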
Database Contents License (DbCL) v1.0: http://opendatacommons.org/licenses/dbcl/1.0/
This document describes the data sources and variables used in the third Anthropic Economic Index (AEI) report.
The core dataset contains Claude AI usage metrics aggregated by geography and analysis dimensions (facets).
Source files:
- aei_raw_claude_ai_2025-08-04_to_2025-08-11.csv (pre-enrichment data in data/intermediate/)
- aei_enriched_claude_ai_2025-08-04_to_2025-08-11.csv (enriched data in data/output/)
Note on data sources: The AEI raw file contains raw counts and percentages. Derived metrics (indices, tiers, per capita calculations, automation/augmentation percentages) are calculated during the enrichment process in aei_report_v3_preprocessing_claude_ai.ipynb.
Each row represents one metric value for a specific geography and facet combination:
| Column | Type | Description |
|---|---|---|
| geo_id | string | Geographic identifier (ISO-2 country code for countries, US state code, or "GLOBAL"; ISO-3 country codes in enriched data) |
| geography | string | Geographic level: "country", "state_us", or "global" |
| date_start | date | Start of data collection period |
| date_end | date | End of data collection period |
| platform_and_product | string | "Claude AI (Free and Pro)" |
| facet | string | Analysis dimension (see Facets below) |
| level | integer | Sub-level within facet (0-2) |
| variable | string | Metric name (see Variables below) |
| cluster_name | string | Specific entity within facet (task, pattern, etc.). For intersections, format is "base::category" |
| value | float | Numeric metric value |
Variables follow the pattern {prefix}_{suffix} with specific meanings:
From AEI processing: *_count, *_pct
From enrichment: *_per_capita, *_per_capita_index, *_pct_index, *_tier, automation_pct, augmentation_pct, soc_pct
O*NET Task Metrics:
- onet_task_count: Number of conversations using this specific O*NET task
- onet_task_pct: Percentage of geographic total using this task
- onet_task_pct_index: Specialization index comparing task usage to baseline (global for countries, US for states)
- onet_task_collaboration_count: Number of conversations with both this task and collaboration pattern (intersection)
- onet_task_collaboration_pct: Percentage of the base task's total that has this collaboration pattern (sums to 100% within each task)
Request Metrics:
- request_count: Number of conversations in this request category level
- request_pct: Percentage of geographic total in this category
- request_pct_index: Specialization index comparing request usage to baseline
- request_collaboration_count: Number of conversations with both this request category and collaboration pattern (intersection)
- request_collaboration_pct: Percentage of the base request's total that has this collaboration pattern (sums to 100% within each request)
Collaboration Pattern Metrics:
- collaboration_count: Number of conversations with this collaboration pattern
- collaboration_pct: Percentage of geographic total with this pattern
- collaboration_pct_index: Specialization index comparing pattern to baseline
- automation_pct: Percentage of classifiable collaboration that is automation-focused (directive, feedback loop patterns)
- augmentation_pct: Percentage of classifiable collaboration that is augmentation-focused (validation, task iteration, learning patterns)
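A hedged loading sketch of the long format described above (the facet value "collaboration" and the relative path are assumptions inferred from the file names and variable list, not confirmed values):

```python
# Load the enriched long-format file and pull one metric slice:
# country-level collaboration-pattern shares.
import pandas as pd

df = pd.read_csv("data/output/aei_enriched_claude_ai_2025-08-04_to_2025-08-11.csv")

mask = (
    (df["geography"] == "country")
    & (df["facet"] == "collaboration")          # assumed facet name
    & (df["variable"] == "collaboration_pct")
)
top = (
    df.loc[mask, ["geo_id", "cluster_name", "value"]]
    .sort_values("value", ascending=False)
    .head(10)
)
print(top)
```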
Database Contents License (DbCL) v1.0: http://opendatacommons.org/licenses/dbcl/1.0/
The Dataset "AI Global index" includes The Global AI Index itself and seven indicators affecting the Index on 62 countries, as well as general information about the countries (region, cluster, income group and political regime).
The Global AI Index is the first index to benchmark nations on their level of investment, innovation and implementation of artificial intelligence.
Talent, Infrastructure and Operating Environment are the factors of the AI Implementation group of indicators, which represents the application of artificial intelligence by professionals in various sectors, such as businesses, governments, and communities.
- Talent focuses on the availability of skilled practitioners for the provision of artificial intelligence solutions.
- Infrastructure focuses on the reliability and scale of access infrastructure, from electricity and internet to supercomputing capabilities.
- Operating Environment focuses on the regulatory context and public opinion surrounding artificial intelligence.
Research and Development are the factors of the Innovation group of indicators, which reflects the progress made in technology and methodology that signifies the potential for artificial intelligence to evolve and improve.
- Research focuses on the extent of specialist research and researchers, investigating the number of publications and citations in credible academic journals.
- Development focuses on the development of fundamental platforms and algorithms upon which innovative artificial intelligence projects rely.
Government Strategy and Commercial are the factors of the Investment group of indicators, which reflects financial and procedural commitments to artificial intelligence.
- Government Strategy focuses on the depth of commitment from national government to artificial intelligence, investigating spending commitments and national strategies.
- Commercial focuses on the level of startup activity, investment and business initiatives based on artificial intelligence.
All seven indicators were calculated by Tortoise Media by weighting and summarizing 143 underlying indicators.
The dataset can be used for practicing data cleaning, data visualization, finding correlations between the indexes, Machine Learning (classification, regression, clustering).
The data was used in the analytical research article "Artificial Intelligence on the World Stage: Dominant Players and Aspiring Challengers".
This dataset includes soil wet aggregate stability measurements from the Upper Mississippi River Basin LTAR site in Ames, Iowa. Samples were collected in 2021 from this long-term tillage and cover crop trial in a corn-based agroecosystem. We measured wet aggregate stability using digital photography to quantify disintegration (slaking) of submerged aggregates over time, similar to the technique described by Fajardo et al. (2016) and Rieke et al. (2021). However, we adapted the technique to larger sample numbers by using a multi-well tray to submerge 20-36 aggregates simultaneously. We used this approach to measure the slaking index of 160 soil samples (2,120 aggregates). This dataset includes the slaking index calculated for each aggregate, and also summarized by sample. There were usually 10-12 aggregates measured per sample. We focused primarily on methodological issues, assessing the statistical power of the slaking index, the needed replication, sensitivity to cultural practices, and sensitivity to sample collection date. We found that small numbers of highly unstable aggregates lead to skewed distributions for the slaking index. We concluded that at least 20 aggregates per sample were preferred to provide confidence in measurement precision. However, the experiment had high statistical power with only 10-12 replicates per sample. The slaking index was not sensitive to the initial size of dry aggregates (3 to 10 mm diameter); therefore, pre-sieving soils was not necessary. The field trial showed greater aggregate stability under no-till than chisel plow practice, and changing stability over a growing season. These results will be useful to researchers and agricultural practitioners who want a simple, fast, low-cost method for measuring wet aggregate stability on many samples.
A quantitative basis for comparing, analyzing, and understanding environmental performance for 180 countries. We score and rank these countries on their environmental performance using the most recent year of data available and calculate how these scores have changed over the previous decade. Data are provided by country and can be filtered by region. Relevant metrics scored in the EPI include: access to sanitation and drinking water, unsafe sanitation, unsafe drinking water, water resources impact (based on wastewater discharge), and wastewater treatment.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Data on the climate-related financial policy index (CRFPI), comprising the climate-related financial policies adopted globally and the bindingness of each policy, are provided for 74 countries from 2000 to 2020. The data include the index values from four statistical models used to calculate the composite index, as described in D’Orazio and Thole (2022). The four alternative statistical approaches were designed to experiment with alternative weighting assumptions and to illustrate how sensitive the proposed index is to changes in the steps followed to construct it. The index data shed light on countries’ engagement in climate-related financial planning and highlight policy gaps in relevant policy sectors.
(i) The calculation of turnover indices answers a national and a European imperative. They are used to measure the monthly changes in sales of companies in the sectors concerned. As such, they are primary information for monitoring the business cycle in France. (ii) The turnover indices are calculated according to the nomenclature NAF rev. 2. The indices are calculated over all monthly VAT company returns. (iii) These indices cover the whole of France, including overseas departments (except French Guiana and Mayotte, which are not liable for VAT). (iv) Methodological documentation: https://www.insee.fr/en/metadonnees/source/indicateur/p1669/documentation-methodologique