100+ datasets found
  1. Data from: Disaster Scene Description and Indexing (DSDI) Dataset

    • catalog.data.gov
    • s.cnmilf.com
    Updated Feb 23, 2023
    + more versions
    Cite
    National Institute of Standards and Technology (2023). Disaster Scene Description and Indexing (DSDI) Dataset [Dataset]. https://catalog.data.gov/dataset/disaster-scene-description-and-indexing-dsdi-dataset
    Explore at:
    Dataset updated
    Feb 23, 2023
    Dataset provided by
    National Institute of Standards and Technology (http://www.nist.gov/)
    Description

    The testing dataset used at TRECVID for the DSDI task in 2020-2022. The dataset includes public videos, ground truth, and features of the DSDI task. As the task is continuing, the dataset will be continually updated. There are 32 features across 5 main categories (Environment, Vehicles, Water, Infrastructure, Damage). All videos are low-altitude airborne footage from natural disaster events.

  2. Dataset for Stock Market Index of 7 Economies

    • kaggle.com
    zip
    Updated Jul 4, 2023
    Cite
    Saad Aziz (2023). Dataset for Stock Market Index of 7 Economies [Dataset]. https://www.kaggle.com/datasets/saadaziz1985/dataset-for-stock-market-index-of-7-countries
    Explore at:
    Available download formats: zip (1917326 bytes)
    Dataset updated
    Jul 4, 2023
    Authors
    Saad Aziz
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Context:

    The dataset was extracted from Yahoo Finance using pandas and the Yahoo Finance library in Python. It covers stock market indices of the world's top economies. The code generated data from Jan 01, 2003 to Jun 30, 2023, a span of more than 20 years. There are 18 CSV files: 16 contain data for 16 different stock market indices across 7 countries, and the other two contain the annualized return and compound annual growth rate (CAGR) computed from the extracted data. The list of countries and the number of indices extracted for each is shown below.

    Number of Countries & Index:

    https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F15657145%2F90ce8a986761636e3edbb49464b304d8%2FNumber%20of%20Index.JPG?generation=1688490342207096&alt=media

    Content:

    Unit of analysis: Stock Market Index Analysis

    This dataset is useful for research purposes, particularly for conducting comparative analyses involving capital market performance and could be used along with other economic indicators.

    There are 18 distinct CSV files in this dataset. The first 16 CSV files each cover one index, and the last two cover the annualized return for each year and the CAGR of each index. A blank value in any column means the index was launched in a later year. For instance, the Bse500 (India) index launched in 2007, so earlier values are blank; similarly, the China_Top300 index launched in 2021, so its earlier fields are blank too.

    The extraction applies different criteria: in the 16 index CSV files all columns are included, and Adj Close is used to calculate the annualized return. The algorithm extracts data by index name (the code assigned by Yahoo Finance) for the given start and end dates.

    Annualized return and CAGR have been calculated and are illustrated in the images below; a machine-readable CSV file is attached for each.

    To extract the data provided in the attachment, various criteria were applied:

    1. Content Filtering: The data was filtered on several attributes, including the index name and the start and end dates. This filtering ensured that only relevant data meeting the specified criteria was extracted.

    2. Collaborative Filtering: Another filtering technique used was collaborative filtering using Yahoo Finance, which relies on index similarity. This approach involves finding indices that are similar to another index, or extending the dataset's scope to other countries or economies. By leveraging this method, the algorithm identifies and extracts data based on similarities between indices.

    Of the last two CSV files, one contains the annualized return, which was calculated from the Adj Close column and stored in a new DataFrame. Below is the image of the annualized returns of all indices (if it is unreadable, a machine-readable CSV file is attached to the dataset).

    Annualized Return:

    In terms of annualized rate of return, the Indian stock market indices lead most of the time, followed by the USA, Canada and Japan stock market indices.

    https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F15657145%2F37645bd90623ea79f3708a958013c098%2FAnnualized%20Return.JPG?generation=1688525901452892&alt=media
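
    The description says the yearly return is computed from the Adj Close column but does not give the formula. The sketch below assumes the simplest definition (last Adj Close of the year divided by the first, minus one); the function name `yearly_returns` and the sample prices are hypothetical and not taken from the dataset.

```python
from datetime import date


def yearly_returns(rows):
    """Return {year: simple return} from (date, adj_close) pairs.

    Assumption: a year's return is last Adj Close / first Adj Close - 1.
    The dataset author may have used a different definition.
    """
    first_last = {}
    for day, adj_close in sorted(rows):
        if day.year not in first_last:
            # First trading day seen for this year.
            first_last[day.year] = [adj_close, adj_close]
        else:
            # Keep overwriting so the final value is the year's last close.
            first_last[day.year][1] = adj_close
    return {year: last / first - 1.0 for year, (first, last) in first_last.items()}


# Illustrative prices only.
sample = [
    (date(2021, 1, 4), 100.0),
    (date(2021, 12, 31), 110.0),
    (date(2022, 1, 3), 110.0),
    (date(2022, 12, 30), 99.0),
]
returns = yearly_returns(sample)
```

    Years in which an index has no rows simply do not appear in the result, which mirrors the blank cells described for indices launched later.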

    Compound Annual Growth Rate (CAGR):

    The best-performing index by compound growth is Sensex (India), which comprises the top 30 companies, at 15.60%, followed by Nifty500 (India) at 11.34% and Nasdaq (USA) at 10.60%.

    The worst-performing index is China top300; however, it launched in 2021 (post-pandemic), so it cannot be meaningfully assessed at this stage because too little data is available. Furthermore, UK and Russia indices are also among the five worst performers.

    https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F15657145%2F58ae33f60a8800749f802b46ec1e07e7%2FCAGR.JPG?generation=1688490409606631&alt=media
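
    For reference, the standard CAGR formula is (end value / start value) ^ (1 / years) - 1. The sketch below applies it to made-up numbers; the function name `cagr` is hypothetical, and the exact start and end points used by the dataset author are not stated in the description.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate between two values `years` apart."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1.0 / years) - 1.0


# Illustrative only: an index level that doubles over ten years.
growth = cagr(100.0, 200.0, 10.0)
```

    A value that doubles over ten years grows at roughly 7.2% per year under this formula.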

    Geography: Stock Market Index of the World Top Economies

    Time period: Jan 01, 2003 – June 30, 2023

    Variables: Stock Market Index Title, Open, High, Low, Close, Adj Close, Volume, Year, Month, Day, Yearly_Return and CAGR

    File Type: CSV file

    Inspiration:

    • Time series prediction model
    • Investment opportunities in world best economies
    • Comparative Analysis of past data with other stock market indices or other indices

    Disclaimer:

    This is not financial advice; due diligence is required for each investment decision.

  3. Indexing Magic Cards Dataset

    • universe.roboflow.com
    zip
    Updated Oct 13, 2025
    Cite
    MagicAl Index (2025). Indexing Magic Cards Dataset [Dataset]. https://universe.roboflow.com/magical-index/indexing-magic-cards-gmo7a/dataset/1
    Explore at:
    Available download formats: zip
    Dataset updated
    Oct 13, 2025
    Dataset authored and provided by
    MagicAl Index
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Magic Cards Bounding Boxes
    Description

    Indexing Magic Cards

    ## Overview
    
    Indexing Magic Cards is a dataset for object detection tasks - it contains Magic Cards annotations for 297 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
    
    ## License
    
    This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
    
  4. MESINESP: Medical Semantic Indexing in Spanish - Development dataset

    • data-staging.niaid.nih.gov
    • live.european-language-grid.eu
    • +2more
    Updated Nov 5, 2022
    + more versions
    Cite
    Martin Krallinger; Aitor Gonzalez-Agirre; Alejandro Asensio (2022). MESINESP: Medical Semantic Indexing in Spanish - Development dataset [Dataset]. https://data-staging.niaid.nih.gov/resources?id=zenodo_3746595
    Explore at:
    Dataset updated
    Nov 5, 2022
    Dataset provided by
    Barcelona Supercomputing Center
    Authors
    Martin Krallinger; Aitor Gonzalez-Agirre; Alejandro Asensio
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Please use the MESINESP2 corpus (the second edition of the shared-task) since it has a higher level of curation, quality and is organized by document type (scientific articles, patents and clinical trials).

    Introduction

    The Mesinesp (Spanish BioASQ track, see https://temu.bsc.es/mesinesp) development set has a total of 750 records indexed manually by seven experienced medical literature indexers. Indexing is done using DeCS codes, a sort of Spanish equivalent to MeSH terms. Records were distributed in a way that each article was annotated, at least, by two different human indexers.

    The data annotation process consisted of two steps:

    Manual indexing step. DeCS codes were manually assigned to each record following the DeCS manual indexing guidelines.

    Manual validation and consensus. The combined set of manually indexed DeCS codes generated by both indexers was manually reviewed and corrected.

    Agreement between these annotations was then measured using the Jaccard index.
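
    As a reminder of what the Jaccard index measures, the sketch below computes it for two annotators' code sets on a single document: the size of the intersection divided by the size of the union. The function name `jaccard` is hypothetical, and the codes are used purely as sample values.

```python
def jaccard(codes_a, codes_b):
    """Jaccard index of two collections of DeCS codes for one document."""
    a, b = set(codes_a), set(codes_b)
    if not a and not b:
        # Convention chosen here: two empty annotations agree fully.
        return 1.0
    return len(a & b) / len(a | b)


# Two annotators share two codes out of four distinct codes overall.
score = jaccard(["6893", "4345", "28567"], ["4345", "28567", "15335"])
```

    How the per-document scores were aggregated into the reported macro and micro figures is not described here, so that step is left out.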

    Records consist mainly of medical literature abstracts and titles from the IBECS and LILACS databases.

    Zip structure

    The zip file contains two different development sets:

    Official development set, which has the union of the annotations, with an agreement of macro = 0.6568 and micro = 0.6819. This set is composed of all the different (unique) DeCS codes that have been added by any annotator for each document; and

    Core-descriptors development set, which has the intersection of the annotations, with an agreement of macro = 1.0 and micro = 1.0. This set is composed of the common DeCS codes that have been added by two or more annotators for each document.
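
    The two sets described above can be sketched as follows for a single document: the official set keeps every code added by any annotator, and the core set keeps codes added by two or more annotators. The function name `merge_annotations` and the sample codes are hypothetical.

```python
from collections import Counter


def merge_annotations(per_annotator_codes):
    """Return (official, core) code lists for one document.

    official: codes added by at least one annotator (union).
    core: codes added by two or more annotators.
    """
    # Count each code once per annotator, even if an annotator repeated it.
    counts = Counter(code for codes in per_annotator_codes for code in set(codes))
    official = sorted(counts)
    core = sorted(code for code, n in counts.items() if n >= 2)
    return official, core


official, core = merge_annotations([["6893", "4345"], ["4345", "28567"]])
```

    With exactly two annotators, the core set is the plain intersection of their codes.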

    Corpus format

    Each dataset is a JSON object with a single key named "articles", which contains a list of documents. The raw file therefore has one line per document plus two additional lines (the first and the last) that enclose the list. The expected data types are as follows:

    {"articles":[ {"abstractText":str,"db":str,"decsCodes":list,"id":str,"journal":str,"title":str,"year":int}, ... ]}

    To clarify, the fields in each document appear in the following order (note that this example is pretty-printed for readability):

    {
      "articles": [
        {
          "abstractText": "Content of the abstract",
          "db": "Name of the source database",
          "decsCodes": [
            "code1",
            "code2",
            "code3"
          ],
          "id": "Id of the document",
          "journal": "Name of the journal",
          "title": "Title of the document",
          "year": 2019
        }
      ]
    }

    Note: The fields "db", "journal" and "year" might be null.
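
    A minimal sketch of reading this format with Python's standard `json` module, taking care with the fields that may be null. The function name `load_articles` and the sample document are hypothetical; only the field names come from the format description above.

```python
import json

# One made-up document in the described format, with the nullable fields set to null.
RAW = (
    '{"articles": [{"abstractText": "Content of the abstract", "db": null, '
    '"decsCodes": ["code1", "code2"], "id": "doc-1", "journal": null, '
    '"title": "Title of the document", "year": null}]}'
)


def load_articles(text):
    """Parse the corpus JSON and return the list of document dicts."""
    articles = json.loads(text)["articles"]
    for article in articles:
        # "db", "journal" and "year" might be null; make sure the keys exist.
        for field in ("db", "journal", "year"):
            article.setdefault(field, None)
    return articles


articles = load_articles(RAW)
with_year = [a for a in articles if a["year"] is not None]
```

    JSON null values arrive as `None`, so any filtering by year or journal should check for `None` before comparing.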

    Copyright (c) 2020 Secretaría de Estado de Digitalización e Inteligencia Artificial

  5. Human Capital Index

    • kaggle.com
    zip
    Updated Oct 17, 2018
    Cite
    tjysdsg (2018). Human Capital Index [Dataset]. https://www.kaggle.com/datasets/tjysdsg/humancapitalindex
    Explore at:
    Available download formats: zip (1152109 bytes)
    Dataset updated
    Oct 17, 2018
    Authors
    tjysdsg
    Description

    This database is not owned by me; I uploaded it merely to make importing it into Kaggle kernels more convenient. I am not responsible for maintaining this dataset, and all rights are reserved by the original author(s).

    All data is downloaded from https://datacatalog.worldbank.org/dataset/human-capital-index

    For documentation, please see https://datacatalog.worldbank.org/dataset/human-capital-index

  6. PLANTAS Dataset

    • zenodo.org
    • data.niaid.nih.gov
    • +1more
    application/gzip
    Updated Jun 3, 2022
    Cite
    Alejandro Toselli; Enrique Vidal (2022). PLANTAS Dataset [Dataset]. http://doi.org/10.5281/zenodo.6608342
    Explore at:
    Available download formats: application/gzip
    Dataset updated
    Jun 3, 2022
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Alejandro Toselli; Enrique Vidal
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The source of the "PLANTAS" dataset (“Historia de las plantas”, Vol.1) was written with a quill pen by Bernardo de Cienfuegos, one of the most outstanding Spanish botanists of the 17th century. The book was written mainly in Spanish, but a significant number of words and full sentences are in Latin and many other languages. The originals of PLANTAS are held at the "Biblioteca Nacional de España", and a digital reproduction can be found at the "Biblioteca Digital Hispánica" (http://bdh-rd.bne.es/viewer.vm?id=0000140162). This dataset considers only the first volume of PLANTAS (Mss 3357, with 1,035 pages and around 20,000 handwritten text lines).

  7. City Happiness Index - 2024

    • kaggle.com
    zip
    Updated Jan 22, 2024
    Cite
    EMİRHAN BULUT (2024). City Happiness Index - 2024 [Dataset]. https://www.kaggle.com/datasets/emirhanai/city-happiness-index-2024
    Explore at:
    Available download formats: zip (7931 bytes)
    Dataset updated
    Jan 22, 2024
    Authors
    EMİRHAN BULUT
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    Dataset Name: City Happiness Index

    Dataset Description:

    This dataset and the related code were prepared entirely by Emirhan BULUT and are original and exclusive to him. The dataset includes key features and measurements from various cities around the world, focusing on factors that may affect the overall happiness score of each city. By analyzing these factors, we aim to gain insights into the living conditions and satisfaction of the population in urban environments.

    The dataset consists of the following features:

    • City: Name of the city.
    • Month: The month in which the data is recorded.
    • Year: The year in which the data is recorded.
    • Decibel_Level: Average noise levels in decibels, indicating the auditory comfort of the citizens.
    • Traffic_Density: Level of traffic density (Low, Medium, High, Very High), which might impact citizens' daily commute and stress levels.
    • Green_Space_Area: Percentage of green spaces in the city, positively contributing to the mental well-being and relaxation of the inhabitants.
    • Air_Quality_Index: Index measuring the quality of air, a crucial aspect affecting citizens' health and overall satisfaction.
    • Happiness_Score: The average happiness score of the city (on a 1-10 scale), representing the subjective well-being of the population.
    • Cost_of_Living_Index: Index measuring the cost of living in the city (relative to a reference city), which could impact the financial satisfaction of the citizens.
    • Healthcare_Index: Index measuring the quality of healthcare in the city, an essential component of the population's well-being and contentment.

    With these features, the dataset aims to analyze and understand the relationship between various urban factors and the happiness of a city's population. The developed Deep Q-Network model, PIYAAI_2, is designed to learn from this data to provide accurate predictions in future scenarios. Using Reinforcement Learning, the model is expected to improve its performance over time as it learns from new data and adapts to changes in the environment.

  8. Reference count CSV dataset of all bibliographic resources in OpenCitations...

    • figshare.com
    zip
    Updated Dec 11, 2023
    Cite
    OpenCitations (2023). Reference count CSV dataset of all bibliographic resources in OpenCitations Index [Dataset]. http://doi.org/10.6084/m9.figshare.24747498.v1
    Explore at:
    Available download formats: zip
    Dataset updated
    Dec 11, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    OpenCitations
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    A CSV dataset containing the number of references of each bibliographic entity identified by an OMID in the OpenCitations Index (https://opencitations.net/index). The dataset is based on the latest release of the OpenCitations Index (https://opencitations.net/download), November 2023. The size of the zipped archive is 0.35 GB, while the size of the unzipped CSV file is 1.7 GB. The CSV dataset contains the reference count of 71,805,806 bibliographic entities. The first column (omid) lists the entities, while the second column (references) indicates the corresponding number of references.
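
    Because the unzipped CSV is 1.7 GB, it is usually easier to stream it row by row than to load it whole. The sketch below assumes the file has a header row with the two column names given above (`omid`, `references`); the function name `top_by_references` and the sample identifiers are hypothetical.

```python
import csv
import heapq
import io


def top_by_references(lines, n=2):
    """Stream (omid, references) rows and return the n largest counts.

    Assumption: the first row is a header naming the columns
    `omid` and `references`.
    """
    reader = csv.DictReader(lines)
    rows = ((row["omid"], int(row["references"])) for row in reader)
    # nlargest keeps only n rows in memory at a time.
    return heapq.nlargest(n, rows, key=lambda item: item[1])


# Made-up sample standing in for the real file.
SAMPLE = "omid,references\nomid:br/1,12\nomid:br/2,0\nomid:br/3,45\n"
top = top_by_references(io.StringIO(SAMPLE), n=2)
```

    For the real file, an open file handle (for example `open(path, newline="")`) can be passed in place of the `io.StringIO` sample.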

  9. United States Dallas Fed Manufacturing Shipments Index

    • tradingeconomics.com
    • jp.tradingeconomics.com
    • +13more
    csv, excel, json, xml
    Cite
    TRADING ECONOMICS, United States Dallas Fed Manufacturing Shipments Index [Dataset]. https://tradingeconomics.com/united-states/dallas-fed-manufacturing-shipments-index
    Explore at:
    Available download formats: xml, excel, csv, json
    Dataset authored and provided by
    TRADING ECONOMICS
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jun 30, 2004 - Nov 30, 2025
    Area covered
    United States
    Description

    Dallas Fed Manufacturing Shipments Index in the United States increased to 15.10 points in November from 5.80 points in October of 2025. This dataset includes a chart with historical data for the United States Dallas Fed Manufacturing Shipments Index.

  10. services index - Dataset - Open Government Data Portal

    • opendata.gov.jo
    Updated Jul 16, 2025
    Cite
    (2025). services index - Dataset - Open Government Data Portal [Dataset]. https://opendata.gov.jo/dataset/services-index-1924-2022
    Explore at:
    Dataset updated
    Jul 16, 2025
    Description

    services index

  11. Environmental Quality Index

    • catalog.data.gov
    • s.cnmilf.com
    Updated Nov 12, 2020
    Cite
    U.S. EPA Office of Research and Development (ORD) (2020). Environmental Quality Index [Dataset]. https://catalog.data.gov/dataset/environmental-quality-index
    Explore at:
    Dataset updated
    Nov 12, 2020
    Dataset provided by
    United States Environmental Protection Agency (http://www.epa.gov/)
    Description

    An Environmental Quality Index (EQI) for all counties in the United States for the period 2000-2005 was developed, incorporating data from five environmental domains: air, water, land, built, and socio-demographic. The EQI was developed in four parts: domain identification; data source identification and review; variable construction; and data reduction using principal components analysis (PCA). The methods applied provide a reproducible approach that capitalizes almost exclusively on publicly available data sources. The primary goal in creating the EQI is to use it as a composite environmental indicator for research on human health. A series of peer-reviewed manuscripts used the EQI in examining health outcomes. This dataset is not publicly accessible because this series of papers is considered human health research, not to be loaded onto ScienceHub. The EQI data can be accessed at: https://edg.epa.gov/data/Public/ORD/NHEERL/EQI. Format: EQI data, metadata, formats, and data dictionary are all available at the website.

    This dataset is associated with the following publications:

    Gray, C., L. Messer, K. Rappazzo, J. Jagai, S. Grabich, and D. Lobdell. The association between physical inactivity and obesity is modified by five domains of environmental quality in U.S. adults: A cross-sectional study. PLoS ONE. Public Library of Science, San Francisco, CA, USA, 13(8): e0203301, (2018).

    Patel, A., J. Jagai, L. Messer, C. Gray, K. Rappazzo, S. DeflorioBarker, and D. Lobdell. Associations between environmental quality and infant mortality in the United States, 2000-2005. Archives of Public Health. BioMed Central Ltd, London, UK, 76(60): 1, (2018).

    Gray, C., D. Lobdell, K. Rappazzo, Y. Jian, J. Jagai, L. Messer, A. Patel, S. Deflorio-Barker, C. Lyttle, J. Solway, and A. Rzhetsky. Associations between environmental quality and adult asthma prevalence in medical claims data. Environmental Research. Elsevier B.V., Amsterdam, Netherlands, 166: 529-536, (2018).

  12. Neighborhood Index

    • catalog.data.gov
    • data.providenceri.gov
    • +2more
    Updated Nov 29, 2021
    + more versions
    Cite
    data.providenceri.gov (2021). Neighborhood Index [Dataset]. https://catalog.data.gov/dataset/neighborhood-index
    Explore at:
    Dataset updated
    Nov 29, 2021
    Dataset provided by
    data.providenceri.gov
    Description

    Providence Neighborhood Code with related Description. Useful to cross-reference tax assessment and collection data.

  13. Data from: chis_shore - Coastal Vulnerability Index (CVI) dataset for...

    • data.cnra.ca.gov
    • data.amerigeoss.org
    zip
    Updated May 8, 2019
    Cite
    Ocean Data Partners (2019). chis_shore - Coastal Vulnerability Index (CVI) dataset for Channel Islands National Park [Dataset]. https://data.cnra.ca.gov/dataset/chis_shore-coastal-vulnerability-index-cvi-dataset-for-channel-islands-national-park
    Explore at:
    Available download formats: zip
    Dataset updated
    May 8, 2019
    Dataset authored and provided by
    Ocean Data Partners
    Area covered
    Channel Islands of California
    Description

    A coastal vulnerability index (CVI) was used to map the relative vulnerability of the coast to future sea-level rise within Channel Islands National Park in California. The CVI ranks the following in terms of their physical contribution to sea-level rise-related coastal change: geomorphology, regional coastal slope, rate of relative sea-level rise, historical shoreline change rates, mean tidal range and mean significant wave height. The rankings for each input variable were combined and an index value calculated for 1-minute grid cells covering the park. The CVI highlights those regions where the physical effects of sea-level rise might be the greatest. This approach combines the coastal system's susceptibility to change with its natural ability to adapt to changing environmental conditions, yielding a quantitative, although relative, measure of the park's natural vulnerability to the effects of sea-level rise. The CVI and the data contained within this dataset provide an objective technique for evaluation and long-term planning by scientists and park managers.
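
    The description says the six variable rankings were combined into one index value but does not give the formula. The sketch below assumes the commonly cited formulation for this kind of CVI, the square root of the product of the ranks divided by the number of variables; that choice, the function name `cvi`, and the sample ranks are assumptions rather than details confirmed by this listing.

```python
import math


def cvi(ranks):
    """Combine per-variable vulnerability ranks into one index value.

    Assumed formula: sqrt(product of ranks / number of ranks).
    """
    if not ranks:
        raise ValueError("at least one rank is required")
    return math.sqrt(math.prod(ranks) / len(ranks))


# Illustrative ranks for one grid cell, in the order: geomorphology, coastal
# slope, relative sea-level rise, shoreline change, tidal range, wave height.
value = cvi([3, 2, 4, 1, 5, 3])
```

    Because the ranks are multiplied, a single low rank pulls the index down sharply, which is one reason the result is read as a relative rather than absolute measure.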

  14. Data from: U-Index, a dataset and an impact metric for informatics tools and...

    • datasetcatalog.nlm.nih.gov
    • data.niaid.nih.gov
    • +2more
    Updated Feb 22, 2019
    Cite
    Winnenburg, Rainer; Shah, Nigam H.; Callahan, Alison (2019). U-Index, a dataset and an impact metric for informatics tools and databases [Dataset]. http://doi.org/10.5061/dryad.gj651
    Explore at:
    Dataset updated
    Feb 22, 2019
    Authors
    Winnenburg, Rainer; Shah, Nigam H.; Callahan, Alison
    Description

    Measuring the usage of informatics resources such as software tools and databases is essential to quantifying their impact, value and return on investment. We have developed a publicly available dataset of informatics resource publications and their citation network, along with an associated metric (u-Index) to measure informatics resources’ impact over time. Our dataset differentiates the context in which citations occur to distinguish between ‘awareness’ and ‘usage’, and uses a citing universe of open access publications to derive citation counts for quantifying impact. Resources with a high ratio of usage citations to awareness citations are likely to be widely used by others and have a high u-Index score. We have pre-calculated the u-Index for nearly 100,000 informatics resources. We demonstrate how the u-Index can be used to track informatics resource impact over time. The method of calculating the u-Index metric, the pre-computed u-Index values, and the dataset we compiled to calculate the u-Index are publicly available.

  15. Historical Index of Ethnic Fractionalization Dataset (HIEF)

    • dataverse.harvard.edu
    Updated Sep 12, 2020
    Cite
    Lenka Drazanova (2020). Historical Index of Ethnic Fractionalization Dataset (HIEF) [Dataset]. http://doi.org/10.7910/DVN/4JQRCL
    Explore at:
    Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
    Dataset updated
    Sep 12, 2020
    Dataset provided by
    Harvard Dataverse
    Authors
    Lenka Drazanova
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    The Historical Index of Ethnic Fractionalization (HIEF) dataset contains an ethnic fractionalization index for 165 countries across all continents. The dataset provides annual values for the period 1945-2013. The ethnic fractionalization index corresponds to the probability that two randomly drawn individuals within a country are not from the same ethnic group. The new dataset is a natural extension of previous ethnic fractionalization indices and allows its users to compare developments in ethnic fractionalization over time. The applications of HIEF pertain to the pattern of ethnic diversity across countries and over time.
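
    The probability described above is conventionally computed as one minus the sum of squared group shares. The sketch below uses that standard form; whether HIEF applies any further adjustment is not stated here, and the function name `fractionalization` and the sample shares are hypothetical.

```python
def fractionalization(shares):
    """Probability that two random individuals belong to different groups.

    Computed as 1 - sum(s_i ** 2), where s_i is each group's population share.
    Shares are normalised first, so raw head counts also work.
    """
    total = sum(shares)
    if total <= 0:
        raise ValueError("shares must sum to a positive number")
    return 1.0 - sum((s / total) ** 2 for s in shares)


# Illustrative country with three groups holding 50%, 30% and 20%.
index = fractionalization([0.5, 0.3, 0.2])
```

    A country with a single group scores 0, and the score approaches 1 as the population splits into many small groups.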

  16. Sea Ice Index, Version 4 - Dataset - NASA Open Data Portal

    • data.nasa.gov
    Updated Aug 1, 2025
    Cite
    nasa.gov (2025). Sea Ice Index, Version 4 - Dataset - NASA Open Data Portal [Dataset]. https://data.nasa.gov/dataset/sea-ice-index-version-4
    Explore at:
    Dataset updated
    Aug 1, 2025
    Dataset provided by
    NASA (http://nasa.gov/)
    Description

    Notice: Due to funding limitations, this data set was recently changed to a “Basic” Level of Service. Learn more about what this means for users and how you can share your story here: Level of Service Update for Data Products.

    The Sea Ice Index provides a quick look at Arctic- and Antarctic-wide changes in sea ice. It is a source for consistent, up-to-date sea ice extent and concentration images, in PNG format, and data values, in GeoTIFF and ASCII text files, from November 1978 to the present. Sea Ice Index images also depict trends and anomalies in ice cover calculated using a 30-year reference period of 1981 through 2010. The images and data are produced in a consistent way that makes the Index time-series appropriate for use when looking at long-term trends in sea ice cover. Both monthly and daily products are available. However, monthly products are better to use for long-term trend analysis because errors in the daily product tend to be averaged out in the monthly product and because day-to-day variations are often the result of short-term weather.

  17. Dataset used for research on predicting next day closing price

    • figshare.com
    application/csv
    Updated Jul 3, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Ahmad Firdaus Cayzer (2024). Dataset used for research on predicting next day closing price [Dataset]. http://doi.org/10.6084/m9.figshare.26169403.v1
    Explore at:
    Available download formats: application/csv
    Dataset updated
    Jul 3, 2024
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Ahmad Firdaus Cayzer
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    There are a total of 5 datasets:

    • sp500_data
    • sp500_newFeatures_data
    • sp500_lagged_data
    • nasdaq_lagged_data
    • hsi_lagged_data

    The first dataset contains 34 years' worth of data, from 1990 to 2023, for the S&P500 stock index. This dataset has been preprocessed and is used for training and testing. The second dataset extends the first with new features derived from it. The third dataset is a different transformation of the first, in which the features are mostly lagged features. The fourth dataset contains 10 years of data for the NASDAQ index, from 2014 to 2023, in the same lagged-feature format as the third dataset. The fifth dataset has 10 years of data, from 2014 to 2023, for the HSI stock index, and follows the same feature format as the third dataset. All five datasets were used in research on predicting tomorrow's closing price from today's financial features.
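
    To illustrate what a lagged-feature layout generally looks like, the sketch below turns a series of closing prices into rows whose features are the most recent closes and whose target is the next day's close. The exact features in the published files are not described here, so the function name `lagged_rows`, the number of lags, and the sample prices are all assumptions.

```python
def lagged_rows(closes, n_lags=3):
    """Build (features, target) pairs from a list of daily closing prices.

    features: the n_lags most recent closes up to and including day t.
    target: the close on day t + 1.
    """
    rows = []
    for t in range(n_lags - 1, len(closes) - 1):
        features = closes[t - n_lags + 1 : t + 1]
        target = closes[t + 1]
        rows.append((features, target))
    return rows


# Illustrative prices only.
rows = lagged_rows([10.0, 11.0, 12.0, 13.0, 14.0], n_lags=3)
```

    Each row uses only information available up to day t, which keeps the next-day target out of the features.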

  18. MESINESP: Post-workshop datasets. Silver Standard and annotator records

    • data-staging.niaid.nih.gov
    • data.niaid.nih.gov
    Updated Nov 5, 2022
    Cite
    Martin Krallinger; Carlos Rodríguez-Penagos; Aitor Gonzalez-Agirre; Alejandro Asensio (2022). MESINESP: Post-workshop datasets. Silver Standard and annotator records [Dataset]. https://data-staging.niaid.nih.gov/resources?id=zenodo_3946557
    Explore at:
    Dataset updated
    Nov 5, 2022
    Dataset provided by
    Barcelona Supercomputing Center
    Authors
    Martin Krallinger; Carlos Rodríguez-Penagos; Aitor Gonzalez-Agirre; Alejandro Asensio
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Please use the MESINESP2 corpus (the second edition of the shared-task) since it has a higher level of curation, quality and is organized by document type (scientific articles, patents and clinical trials).

    The MESINESP (Spanish BioASQ track, see https://temu.bsc.es/mesinesp) Challenge was held in May-June 2020. Following strong participation and the manual annotation of an evaluation dataset, two additional datasets are now released:

    1) "all_annotations_withIDsv3.tsv" is a tab-separated file with all manual annotations (both validated and non-validated) of the evaluation dataset prepared for the competition. It contains the following fields:

    annotatorName: Human annotator id

    documentId: Document ID in the source database

    decsCode: A DeCS code added to it or validated

    timestamp: When it was added

    validated: if it was validated at that point by another annotator, or not yet

    SpanishTerm: The Spanish descriptor corresponding to the DeCS code

    mesinespId: The internal document id in the distributed evaluation file

    dataset: if part of the evaluation or the test sets

    source: which database it was taken from

    Example:

    annotatorName  documentId      decsCode  timestamp                 validated  SpanishTerm  mesinespId        dataset  source
    A7             biblio-1001069  6893      2020-01-17T11:27:07.000Z  false      caballos     mesinesp-dev-671  dev      LILACS
    A7             biblio-1001069  4345      2020-01-17T11:27:12.000Z  false      perros       mesinesp-dev-671  dev      LILACS
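    A minimal sketch of reading the annotation records with Python's standard csv module, grouping DeCS codes by mesinespId. It assumes the file starts with the header row shown in the example and that the validated column holds the literal strings true/false; the function name is hypothetical, and the sample text simply reproduces the two example rows above.

```python
import csv
import io
from collections import defaultdict

FIELDS = ["annotatorName", "documentId", "decsCode", "timestamp",
          "validated", "SpanishTerm", "mesinespId", "dataset", "source"]

# The two example rows from the description, as tab-separated text.
SAMPLE = "\n".join("\t".join(r) for r in [
    FIELDS,
    ["A7", "biblio-1001069", "6893", "2020-01-17T11:27:07.000Z",
     "false", "caballos", "mesinesp-dev-671", "dev", "LILACS"],
    ["A7", "biblio-1001069", "4345", "2020-01-17T11:27:12.000Z",
     "false", "perros", "mesinesp-dev-671", "dev", "LILACS"],
]) + "\n"


def codes_by_document(tsv_file):
    """Map each mesinespId to a list of (decsCode, validated) pairs."""
    codes = defaultdict(list)
    for row in csv.DictReader(tsv_file, delimiter="\t"):
        # Assumption: "validated" is stored as the text "true" or "false".
        is_validated = row["validated"].strip().lower() == "true"
        codes[row["mesinespId"]].append((row["decsCode"], is_validated))
    return dict(codes)


result = codes_by_document(io.StringIO(SAMPLE))
print(result)
```

    To run this against the real file, the StringIO object could be replaced with open("all_annotations_withIDsv3.tsv", newline="", encoding="utf-8"); the encoding is an assumption.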

    2) A "Silver Standard" created from the 24 system runs submitted by 6 participating teams. It contains each DeCS code submitted for each document in the test set, as well as other information that can help ascertain reliability and source for anyone who wants to use this dataset to enrich their training data. It contains more than 5.8 million datapoints and is structured as follows:

    SubmissionName: Alias of the team that submitted the run

    REALdocumentId: The real id of the document

    mesinespId: The mesinesp assigned id in the evaluation dataset

    docSource: The source database

    decsCode: the DeCS code assigned to it by the team's system

    SpanishTerm: The Spanish descriptor of the DeCS code

    MiF: The Micro-f1 scored by that system's run

    MiR: The Micro-Recall scored by that system's run

    MiP: The Micro-Precision scored by that system's run

    Acc: The Accuracy scored by that system's run

    consensus: The number of runs where that DeCS code was assigned to this document by the participating teams (max. is 24)

    Example:

    SubmissionName  REALdocumentId  mesinespId                 docSource  decsCode  SpanishTerm   MiF     MiR     MiP     Acc     consensus
    AN              ibc-177565      mesinesp-evaluation-00001  IBECS      28567     riesgo        0.2054  0.1930  0.2196  0.1198  4
    AN              ibc-177565      mesinesp-evaluation-00001  IBECS      15335     trabajo       0.2054  0.1930  0.2196  0.1198  4
    AN              ibc-177565      mesinesp-evaluation-00001  IBECS      33182     conocimiento  0.2054  0.1930  0.2196  0.1198  7
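    One way the consensus field might be used when enriching training data is to keep only the DeCS codes that enough runs agreed on. The sketch below does this with Python's standard csv module. The threshold of 5 is an arbitrary illustration, the function name is hypothetical, and the file is assumed to be tab-separated with the header row shown in the example; the sample text reproduces the three example rows above.

```python
import csv
import io

FIELDS = ["SubmissionName", "REALdocumentId", "mesinespId", "docSource",
          "decsCode", "SpanishTerm", "MiF", "MiR", "MiP", "Acc", "consensus"]

COMMON = ["AN", "ibc-177565", "mesinesp-evaluation-00001", "IBECS"]
SCORES = ["0.2054", "0.1930", "0.2196", "0.1198"]

# The three example rows from the description, as tab-separated text.
SAMPLE = "\n".join("\t".join(r) for r in [
    FIELDS,
    COMMON + ["28567", "riesgo"] + SCORES + ["4"],
    COMMON + ["15335", "trabajo"] + SCORES + ["4"],
    COMMON + ["33182", "conocimiento"] + SCORES + ["7"],
]) + "\n"


def high_consensus_codes(tsv_file, min_consensus):
    """Return the set of (mesinespId, decsCode) pairs whose consensus
    count is at least min_consensus (the maximum possible is 24)."""
    kept = set()
    for row in csv.DictReader(tsv_file, delimiter="\t"):
        if int(row["consensus"]) >= min_consensus:
            kept.add((row["mesinespId"], row["decsCode"]))
    return kept


# Hypothetical threshold: keep codes assigned in at least 5 runs.
kept = high_consensus_codes(io.StringIO(SAMPLE), min_consensus=5)
print(kept)
```

    Using a set collapses the repeated rows that arise when several runs assign the same code to the same document.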

    For citation and a detailed description of the Challenge, please cite: Anastasios, Nentidis and Anastasia, Krithara and Konstantinos, Bougiatiotis and Martin, Krallinger and Carlos, Rodriguez-Penagos and Marta, Villegas and Georgios, Paliouras. Overview of BioASQ 2020: The eighth BioASQ challenge on Large-Scale Biomedical Semantic Indexing and Question Answering (2020). Proceedings of the Eleventh International Conference of the CLEF Association (CLEF 2020). Thessaloniki, Greece, September 22--25

    Citation

    @inproceedings{durusan2019overview,
      title={Overview of BioASQ 2020: The eighth BioASQ challenge on Large-Scale Biomedical Semantic Indexing and Question Answering},
      author={Anastasios, Nentidis and Anastasia, Krithara and Konstantinos, Bougiatiotis and Martin, Krallinger and Carlos, Rodriguez-Penagos and Marta, Villegas and Georgios, Paliouras},
      booktitle={Experimental IR Meets Multilinguality, Multimodality, and Interaction Proceedings of the Eleventh International Conference of the CLEF Association (CLEF 2020), Thessaloniki, Greece, September 22--25, 2020, Proceedings},
      volume={12260},
      year={2020},
      organization={Springer}
    }

    Copyright (c) 2020 Secretaría de Estado de Digitalización e Inteligencia Artificial

  19. AI Global Index Dataset

    • cubig.ai
    zip
    Updated Jun 30, 2025
    + more versions
    CUBIG (2025). AI Global Index Dataset [Dataset]. https://cubig.ai/store/products/529/ai-global-index-dataset
    Available download formats: zip
    Dataset updated
    Jun 30, 2025
    Dataset authored and provided by
    CUBIG
    License

    https://cubig.ai/store/terms-of-service

    Measurement technique
    Privacy-preserving data transformation via differential privacy, Synthetic data generation using AI techniques for model training
    Description

    1) Data Introduction

    • The AI Global Index Dataset is a comprehensive index that benchmarks 62 countries on their level of AI investment, innovation, and implementation. It includes seven key indicators (human resources, infrastructure, operational environment, research, development, government strategy, commercialization) and general information by country (region, cluster, income group, political system).

    2) Data Utilization

    (1) Characteristics of the AI Global Index Dataset:

    • The dataset consists of 13 columns in total, with 5 categorical variables (regions, clusters, etc.) and 8 numerical variables (scores for each indicator), covering 62 countries.
    • The seven key indicators are grouped into three pillars: implementation (human resources, infrastructure, operational environment), innovation (R&D), and investment (government strategy, commercialization). Together they assess each country's overall AI ecosystem capabilities along multiple dimensions.

    (2) The AI Global Index Dataset can be used for:

    • Global AI leadership pattern analysis: correlation analysis between the seven indicators can identify AI strengths and weaknesses by country and support group comparisons by region and income level.
    • Machine learning-based predictive models: the dataset can be used for data science education and applications, such as predicting country-specific index values through regression analysis or classifying AI development types through clustering.

  20. Canadian Student-led Academic Journals - platforms and indexing data

    • borealisdata.ca
    • search.dataone.org
    Updated May 4, 2023
    Mariya Maistrovskaya (2023). Canadian Student-led Academic Journals - platforms and indexing data [Dataset]. http://doi.org/10.5683/SP3/QXEUVH
    Dataset updated
    May 4, 2023
    Dataset provided by
    Borealis
    Authors
    Mariya Maistrovskaya
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Canada
    Description

    This dataset was compiled as part of a study on Barriers and Opportunities in the Discoverability and Indexing of Student-led Academic Journals. The list of student journals and their details was compiled from public sources. The list is used to identify the presence of Canadian student journals in Google Scholar as well as in select indexes and databases: DOAJ, Scopus, Web of Science, Medline, Erudit, ProQuest, and HeinOnline. In addition, each journal's publishing platform is recorded for use in a correlational analysis against the Google Scholar indexing results. For further details, see the README.

