100+ datasets found
  1. MESINESP: Medical Semantic Indexing in Spanish - Development dataset

    • data-staging.niaid.nih.gov
    • live.european-language-grid.eu
    • +2more
    Updated Nov 5, 2022
    + more versions
    Cite
    Martin Krallinger; Aitor Gonzalez-Agirre; Alejandro Asensio (2022). MESINESP: Medical Semantic Indexing in Spanish - Development dataset [Dataset]. https://data-staging.niaid.nih.gov/resources?id=zenodo_3746595
    Dataset updated
    Nov 5, 2022
    Dataset provided by
    Barcelona Supercomputing Center
    Authors
    Martin Krallinger; Aitor Gonzalez-Agirre; Alejandro Asensio
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Please use the MESINESP2 corpus (the second edition of the shared-task) since it has a higher level of curation, quality and is organized by document type (scientific articles, patents and clinical trials).

    Introduction

    The Mesinesp (Spanish BioASQ track, see https://temu.bsc.es/mesinesp) development set has a total of 750 records indexed manually by seven experienced medical literature indexers. Indexing is done using DeCS codes, a sort of Spanish equivalent to MeSH terms. Records were distributed so that each article was annotated by at least two different human indexers.

    The data annotation process consisted of two steps:

    1. Manual indexing step. DeCS codes were manually assigned to each record following the DeCS manual indexing guidelines.

    2. Manual validation and consensus. The combined set of DeCS codes assigned by both indexers was manually revised and corrected.

    Agreement between these annotations was then measured using the Jaccard index.

    Records consisted mainly of medical literature abstracts and titles from the IBECS and LILACS databases.

    Zip structure

    The zip file contains two different development sets:

    Official development set, which has the union of the annotations, with an agreement of macro = 0.6568 and micro = 0.6819. This set is composed of all the different (unique) DeCS codes that have been added by any annotator for each document; and

    Core-descriptors development set, which has the intersection of the annotations, with an agreement of macro = 1.0 and micro = 1.0. This set is composed of the common DeCS codes that have been added by two or more annotators for each document.
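
    The union and intersection sets and the Jaccard agreement described above can be sketched as follows. This is a minimal illustration, not the organizers' evaluation script: the source does not define "macro" and "micro", so the sketch assumes macro is the mean of per-document Jaccard scores and micro is the pooled intersection size divided by the pooled union size. The DeCS codes shown are hypothetical placeholders.

```python
def jaccard(codes_a, codes_b):
    """Jaccard index |A & B| / |A | B| of two code lists (1.0 if both are empty)."""
    a, b = set(codes_a), set(codes_b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def agreement(pairs):
    """Return (macro, micro) agreement over (annotator_1, annotator_2) code lists.

    Assumed definitions (not confirmed by the source):
      macro = mean of the per-document Jaccard scores
      micro = total intersection size / total union size over all documents
    """
    macro = sum(jaccard(a, b) for a, b in pairs) / len(pairs)
    inter = sum(len(set(a) & set(b)) for a, b in pairs)
    union = sum(len(set(a) | set(b)) for a, b in pairs)
    micro = inter / union if union else 1.0
    return macro, micro


# Hypothetical annotations for two documents (placeholder codes).
pairs = [
    (["D001", "D002", "D003"], ["D002", "D003", "D004"]),
    (["D010"], ["D010"]),
]

macro, micro = agreement(pairs)

# "Official" set: union of the annotators' codes per document.
official = [sorted(set(a) | set(b)) for a, b in pairs]
# "Core-descriptors" set: codes shared by both annotators per document.
core = [sorted(set(a) & set(b)) for a, b in pairs]
```

    Under these assumed definitions, macro weights every document equally, while micro gives more weight to documents that carry many codes.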

    Corpus format

    Each dataset is a JSON object with a single key named "articles", which contains a list of documents. The raw file therefore has one line per document, plus two additional lines (the first and the last) that enclose the list. The expected data types are as follows:

    {"articles":[ {"abstractText":str,"db":str,"decsCodes":list,"id":str,"journal":str,"title":str,"year":int}, ... ]}

    To clarify, the fields in each document appear in the following order (note that this example is pretty-printed for readability):

    {
      "articles": [
        {
          "abstractText": "Content of the abstract",
          "db": "Name of the source database",
          "decsCodes": [
            "code1",
            "code2",
            "code3"
          ],
          "id": "Id of the document",
          "journal": "Name of the journal",
          "title": "Title of the document",
          "year": 2019
        }
      ]
    }

    Note: The fields "db", "journal" and "year" might be null.
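
    A minimal sketch of reading this format with Python's standard json module is shown below. The helper name load_articles and the inline sample string are illustrative assumptions; in practice the text would come from one of the files in the zip. The sketch checks that every documented field is present and treats "db", "journal" and "year" as possibly null, as noted above.

```python
import json

# Field names taken from the corpus format description.
REQUIRED = ("abstractText", "decsCodes", "id", "title")
NULLABLE = ("db", "journal", "year")


def load_articles(text):
    """Parse a corpus JSON string and return its list of documents.

    Raises ValueError if a document lacks one of the documented fields.
    """
    articles = json.loads(text)["articles"]
    for doc in articles:
        missing = [key for key in REQUIRED + NULLABLE if key not in doc]
        if missing:
            raise ValueError(f"document {doc.get('id')!r} is missing {missing}")
    return articles


# Illustrative one-document sample (placeholder values, not real corpus data).
sample = (
    '{"articles":[{"abstractText":"Content of the abstract","db":null,'
    '"decsCodes":["code1","code2"],"id":"doc-1","journal":null,'
    '"title":"Title of the document","year":2019}]}'
)

articles = load_articles(sample)
first = articles[0]

# Nullable fields arrive as None and need an explicit fallback.
year = first["year"] if first["year"] is not None else "unknown"
journal = first["journal"] if first["journal"] is not None else "unknown"
```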

    Copyright (c) 2020 Secretaría de Estado de Digitalización e Inteligencia Artificial

  2. Real-Time Index Database Report

    • marketreportanalytics.com
    doc, pdf, ppt
    Updated Apr 10, 2025
    Cite
    Market Report Analytics (2025). Real-Time Index Database Report [Dataset]. https://www.marketreportanalytics.com/reports/real-time-index-database-75396
    Available download formats: ppt, doc, pdf
    Dataset updated
    Apr 10, 2025
    Dataset authored and provided by
    Market Report Analytics
    License

    https://www.marketreportanalytics.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    Unlock the power of real-time data! Explore the booming real-time index database market, projected to reach $32 billion by 2033. Discover key trends, leading companies (Elastic, AWS, Splunk), and regional insights in this comprehensive market analysis.

  3. City Happiness Index - 2024

    • kaggle.com
    zip
    Updated Jan 22, 2024
    Cite
    EMİRHAN BULUT (2024). City Happiness Index - 2024 [Dataset]. https://www.kaggle.com/datasets/emirhanai/city-happiness-index-2024
    Available download formats: zip (7931 bytes)
    Dataset updated
    Jan 22, 2024
    Authors
    EMİRHAN BULUT
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    Dataset Name: City Happiness Index

    Dataset Description:

    This dataset and the related code were prepared entirely by Emirhan BULUT and are his original, exclusive work. The dataset includes crucial features and measurements from various cities around the world, focusing on factors that may affect the overall happiness score of each city. By analyzing these factors, we aim to gain insights into the living conditions and satisfaction of the population in urban environments.

    The dataset consists of the following features:

    • City: Name of the city.
    • Month: The month in which the data is recorded.
    • Year: The year in which the data is recorded.
    • Decibel_Level: Average noise levels in decibels, indicating the auditory comfort of the citizens.
    • Traffic_Density: Level of traffic density (Low, Medium, High, Very High), which might impact citizens' daily commute and stress levels.
    • Green_Space_Area: Percentage of green spaces in the city, positively contributing to the mental well-being and relaxation of the inhabitants.
    • Air_Quality_Index: Index measuring the quality of air, a crucial aspect affecting citizens' health and overall satisfaction.
    • Happiness_Score: The average happiness score of the city (on a 1-10 scale), representing the subjective well-being of the population.
    • Cost_of_Living_Index: Index measuring the cost of living in the city (relative to a reference city), which could impact the financial satisfaction of the citizens.
    • Healthcare_Index: Index measuring the quality of healthcare in the city, an essential component of the population's well-being and contentment.

    With these features, the dataset aims to analyze and understand the relationship between various urban factors and the happiness of a city's population. The developed Deep Q-Network model, PIYAAI_2, is designed to learn from this data to provide accurate predictions in future scenarios. Using Reinforcement Learning, the model is expected to improve its performance over time as it learns from new data and adapts to changes in the environment.

  4. Index match, Index match Advance

    • kaggle.com
    zip
    Updated Mar 15, 2024
    Cite
    Sanjana Murthy (2024). Index match, Index match Advance [Dataset]. https://www.kaggle.com/datasets/sanjanamurthy392/index-match-index-match-advance
    Available download formats: zip (10258 bytes)
    Dataset updated
    Mar 15, 2024
    Authors
    Sanjana Murthy
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    This data contains Index match, index match Advance

  5. Historical S&P 500 (^GSPC) Index Data (1927–2025)

    • kaggle.com
    zip
    Updated Aug 31, 2025
    Cite
    Reza Nematpour (2025). Historical S&P 500 (^GSPC) Index Data (1927–2025) [Dataset]. https://www.kaggle.com/datasets/rezanematpour/historical-s-and-p-500-gspc-index-data-19272025
    Available download formats: zip (350147 bytes)
    Dataset updated
    Aug 31, 2025
    Authors
    Reza Nematpour
    Description

    This dataset contains the full historical record of the S&P 500 index (^GSPC), downloaded via the Yahoo Finance API using the yfinance Python library.

    The dataset includes:

    - Date: Trading date
    - Open, High, Low, Close: Daily price levels
    - Volume: Daily trading volume

    Period covered: Dec 30, 1927 – Aug 31, 2025
    Frequency: Daily

    ⚠️ Disclaimer: This dataset is provided for educational and research purposes only. Redistribution or commercial use may be subject to Yahoo Finance’s Terms of Service.

    License

    Data sourced from Yahoo Finance. Provided for educational and research purposes only. Redistribution may be restricted.

  6. United States Dallas Fed Manufacturing Shipments Index

    • tradingeconomics.com
    • jp.tradingeconomics.com
    • +13more
    csv, excel, json, xml
    Cite
    TRADING ECONOMICS, United States Dallas Fed Manufacturing Shipments Index [Dataset]. https://tradingeconomics.com/united-states/dallas-fed-manufacturing-shipments-index
    Available download formats: xml, excel, csv, json
    Dataset authored and provided by
    TRADING ECONOMICS
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jun 30, 2004 - Nov 30, 2025
    Area covered
    United States
    Description

    Dallas Fed Manufacturing Shipments Index in the United States increased to 15.10 points in November from 5.80 points in October of 2025. This dataset includes a chart with historical data for the United States Dallas Fed Manufacturing Shipments Index.

  7. Indexing Magic Cards Dataset

    • universe.roboflow.com
    zip
    Updated Oct 13, 2025
    Cite
    MagicAl Index (2025). Indexing Magic Cards Dataset [Dataset]. https://universe.roboflow.com/magical-index/indexing-magic-cards-gmo7a/dataset/1
    Available download formats: zip
    Dataset updated
    Oct 13, 2025
    Dataset authored and provided by
    MagicAl Index
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Magic Cards Bounding Boxes
    Description

    Indexing Magic Cards

    ## Overview
    
    Indexing Magic Cards is a dataset for object detection tasks - it contains Magic Cards annotations for 297 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
    
    ## License
    
    This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
    
  8. Transportation Services Index - Passenger

    • catalog.data.gov
    • data.virginia.gov
    • +1more
    Updated Jan 2, 2025
    + more versions
    Cite
    Bureau of Transportation Statistics (2025). Transportation Services Index - Passenger [Dataset]. https://catalog.data.gov/dataset/transportation-services-index-passenger
    Dataset updated
    Jan 2, 2025
    Dataset provided by
    Bureau of Transportation Statistics (http://www.rita.dot.gov/bts)
    Description

    A monthly measure of the volume of services performed by the for-hire transportation sector. The index covers the activities of local mass transit, intercity passenger rail, and passenger air transportation.

  9. Environmental Quality Index

    • catalog.data.gov
    • s.cnmilf.com
    Updated Nov 12, 2020
    Cite
    U.S. EPA Office of Research and Development (ORD) (2020). Environmental Quality Index [Dataset]. https://catalog.data.gov/dataset/environmental-quality-index
    Dataset updated
    Nov 12, 2020
    Dataset provided by
    United States Environmental Protection Agency (http://www.epa.gov/)
    Description

    An Environmental Quality Index (EQI) for all counties in the United States for the time period 2000-2005 was developed which incorporated data from five environmental domains: air, water, land, built, and socio-demographic. The EQI was developed in four parts: domain identification; data source identification and review; variable construction; and data reduction using principal components analysis (PCA). The methods applied provide a reproducible approach that capitalizes almost exclusively on publicly available data sources. The primary goal in creating the EQI is to use it as a composite environmental indicator for research on human health. A series of peer-reviewed manuscripts utilized the EQI in examining health outcomes.

    This dataset is not publicly accessible because: This series of papers is considered human health research, not to be loaded onto ScienceHub. It can be accessed through the following means: The EQI data can be accessed at: https://edg.epa.gov/data/Public/ORD/NHEERL/EQI. Format: EQI data, metadata, formats, and data dictionary are all available at the website.

    This dataset is associated with the following publications:

    Gray, C., L. Messer, K. Rappazzo, J. Jagai, S. Grabich, and D. Lobdell. The association between physical inactivity and obesity is modified by five domains of environmental quality in U.S. adults: A cross-sectional study. PLoS ONE. Public Library of Science, San Francisco, CA, USA, 13(8): e0203301, (2018).

    Patel, A., J. Jagai, L. Messer, C. Gray, K. Rappazzo, S. DeflorioBarker, and D. Lobdell. Associations between environmental quality and infant mortality in the United States, 2000-2005. Archives of Public Health. BioMed Central Ltd, London, UK, 76(60): 1, (2018).

    Gray, C., D. Lobdell, K. Rappazzo, Y. Jian, J. Jagai, L. Messer, A. Patel, S. Deflorio-Barker, C. Lyttle, J. Solway, and A. Rzhetsky. Associations between environmental quality and adult asthma prevalence in medical claims data. ENVIRONMENTAL RESEARCH. Elsevier B.V., Amsterdam, NETHERLANDS, 166: 529-536, (2018).

  10. Data from: chis_shore - Coastal Vulnerability Index (CVI) dataset for...

    • data.cnra.ca.gov
    • data.amerigeoss.org
    zip
    Updated May 8, 2019
    Cite
    Ocean Data Partners (2019). chis_shore - Coastal Vulnerability Index (CVI) dataset for Channel Islands National Park [Dataset]. https://data.cnra.ca.gov/dataset/chis_shore-coastal-vulnerability-index-cvi-dataset-for-channel-islands-national-park
    Available download formats: zip
    Dataset updated
    May 8, 2019
    Dataset authored and provided by
    Ocean Data Partners
    Area covered
    Channel Islands of California
    Description

    A coastal vulnerability index (CVI) was used to map the relative vulnerability of the coast to future sea-level rise within Channel Islands National Park in California. The CVI ranks the following in terms of their physical contribution to sea-level rise-related coastal change: geomorphology, regional coastal slope, rate of relative sea-level rise, historical shoreline change rates, mean tidal range and mean significant wave height. The rankings for each input variable were combined and an index value calculated for 1-minute grid cells covering the park. The CVI highlights those regions where the physical effects of sea-level rise might be the greatest. This approach combines the coastal system's susceptibility to change with its natural ability to adapt to changing environmental conditions, yielding a quantitative, although relative, measure of the park's natural vulnerability to the effects of sea-level rise. The CVI and the data contained within this dataset provide an objective technique for evaluation and long-term planning by scientists and park managers.

  11. Cyanobacteria Index (MERIS)

    • catalog.data.gov
    • s.cnmilf.com
    • +2more
    Updated Nov 12, 2020
    Cite
    U.S. EPA Office of Research and Development (ORD) (2020). Cyanobacteria Index (MERIS) [Dataset]. https://catalog.data.gov/dataset/cyanobacteria-index-meris
    Dataset updated
    Nov 12, 2020
    Dataset provided by
    United States Environmental Protection Agency (http://www.epa.gov/)
    Description

    This dataset shows the concentration of cyanobacteria cells/ml in fresh water bodies and estuaries of Ohio and Florida derived from 300x300 meter MEdium Resolution Imaging Spectrometer (MERIS) satellite imagery. This dataset was produced through partnership with the National Oceanic and Atmospheric Administration (NOAA), the National Aeronautics and Space Administration (NASA), the United States Geological Survey (USGS), and the United States Environmental Protection Agency (USEPA). This cyanobacteria dataset was derived using the European Space Agency (ESA) Envisat satellite and MERIS instrument. MERIS is a 68.5 degree field-of-view nadir-pointing imaging spectrometer which measures the solar radiation reflected by the Earth in 15 spectral bands (visible and near-infrared). MERIS imagery was used to identify long-wavelength spectral bands (from the red through near-infrared portion of the spectrum) to locate algal blooms within freshwaters and estuaries of the continental United States.

    This dataset is associated with the following publication:

    Urquhart, E., B. Schaeffer, R. Stumpf, K. Loftin, and J. Wedell. A method for examining temporal changes in cyanobacterial harmful algal bloom spatial extent using satellite remote sensing. Harmful Algae. Elsevier B.V., Amsterdam, NETHERLANDS, 67: 144-152, (2017).

  12. Dataset used for research on predicting next day closing price

    • figshare.com
    application/csv
    Updated Jul 3, 2024
    Cite
    Ahmad Firdaus Cayzer (2024). Dataset used for research on predicting next day closing price [Dataset]. http://doi.org/10.6084/m9.figshare.26169403.v1
    Available download formats: application/csv
    Dataset updated
    Jul 3, 2024
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Ahmad Firdaus Cayzer
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    There are a total of 5 datasets:

    • sp500_data
    • sp500_newFeatures_data
    • sp500_lagged_data
    • nasdaq_lagged_data
    • hsi_lagged_data

    The first dataset contains 34 years' worth of data, from 1990 to 2023, for the S&P 500 stock index. This dataset has been preprocessed and is used for training and testing. The second dataset transforms the initial dataset by adding new features derived from the first dataset. The third dataset is a different transformation of the first dataset in which the features consist mostly of lagged features. The fourth dataset contains 10 years of data for the NASDAQ index, from 2014 to 2023, following the same lagged-feature format as the third dataset. The fifth dataset has 10 years of data, from 2014 to 2023, for the HSI stock index. This dataset also follows the same feature format as the third dataset.

    All five datasets were used in research on predicting tomorrow's closing price based on today's financial features.
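
    The idea of lagged features paired with a next-day closing price can be sketched as follows. This is only an illustration of the general technique: the function name make_lagged, the choice of lagging the closing price alone, and the sample prices are assumptions, and the actual columns of the lagged datasets are not described in the source.

```python
def make_lagged(closes, n_lags):
    """Build (features, target) rows from a list of daily closing prices.

    For each day t, the features are the closes at t, t-1, ..., t-(n_lags-1)
    and the target is the close at t+1 (the next trading day).
    """
    rows = []
    for t in range(n_lags - 1, len(closes) - 1):
        features = [closes[t - k] for k in range(n_lags)]
        target = closes[t + 1]
        rows.append((features, target))
    return rows


# Hypothetical closing prices for five consecutive trading days.
closes = [100.0, 101.0, 103.0, 102.0, 105.0]
rows = make_lagged(closes, n_lags=2)

for features, target in rows:
    print(features, "->", target)
```

    The first n_lags - 1 days have no complete history and the final day has no next-day target, so a series of N prices yields N - n_lags rows.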

  13. US House Price Index Prediction Dataset

    • kaggle.com
    zip
    Updated Jun 8, 2024
    Cite
    Mohit Gupta (2024). US House Price Index Prediction Dataset [Dataset]. https://www.kaggle.com/datasets/mohitgupta12/us-house-price-index-prediction-dataset
    Available download formats: zip (26547 bytes)
    Dataset updated
    Jun 8, 2024
    Authors
    Mohit Gupta
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Area covered
    United States
    Description

    Dataset

    This dataset was created by Mohit Gupta

    Released under CC0: Public Domain

    Contents

  14. United States CFNAI Employment, Unemployment and Hours Index

    • tradingeconomics.com
    • ru.tradingeconomics.com
    • +13more
    csv, excel, json, xml
    Updated Aug 15, 2025
    Cite
    TRADING ECONOMICS (2025). United States CFNAI Employment, Unemployment and Hours Index [Dataset]. https://tradingeconomics.com/united-states/cfnai-employment-index
    Available download formats: csv, json, xml, excel
    Dataset updated
    Aug 15, 2025
    Dataset authored and provided by
    TRADING ECONOMICS
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Mar 31, 1967 - Aug 31, 2025
    Area covered
    United States
    Description

    CFNAI Employment Index in the United States increased to -0.07 points in August from -0.10 points in July of 2025. This dataset includes a chart with historical data for the United States CFNAI Employment Index.

  15. faiss-512-wikipedia-202308

    • kaggle.com
    zip
    Updated Sep 16, 2023
    Cite
    averagemn (2023). faiss-512-wikipedia-202308 [Dataset]. https://www.kaggle.com/datasets/donkeys/faiss-512-wikipedia-202308
    Available download formats: zip (8678597313 bytes)
    Dataset updated
    Sep 16, 2023
    Authors
    averagemn
    Description

    A FAISS index for Wikipedia documents chunked from the early August 2023 Wikipedia dump, with FAISS doc IDs matching the doc IDs in these two pre-chunked databases:

    https://www.kaggle.com/datasets/donkeys/wikipedia-202308-64tk/data
    https://www.kaggle.com/datasets/donkeys/wikipedia-202308-chunks-256tk-sqlite

    See the accompanying notebook for example code. The index can be used to look up similarities to given indices, and the returned ID values can be used to retrieve the closest matching documents along with their document chunks, which can in turn be used for finer-grained similarity search.

    The embedding model used to build the index: https://www.kaggle.com/datasets/donkeys/bge-small-en/data

  16. AI Global Index Dataset

    • cubig.ai
    zip
    Updated Jun 30, 2025
    + more versions
    Cite
    CUBIG (2025). AI Global Index Dataset [Dataset]. https://cubig.ai/store/products/529/ai-global-index-dataset
    Available download formats: zip
    Dataset updated
    Jun 30, 2025
    Dataset authored and provided by
    CUBIG
    License

    https://cubig.ai/store/terms-of-service

    Measurement technique
    Privacy-preserving data transformation via differential privacy, Synthetic data generation using AI techniques for model training
    Description

    1) Data Introduction

    • The AI Global Index Dataset is a comprehensive index that benchmarks 62 countries based on the level of AI investment, innovation, and implementation, including seven key indicators (human resources, infrastructure, operational environment, research, development, government strategy, commercialization) and general information by country (region, cluster, income group, political system).

    2) Data Utilization

    (1) AI Global Index Dataset has characteristics that:

    • This dataset consists of a total of 13 columns with 5 categorical variables (regions, clusters, etc.) and 8 numerical variables (scores for each indicator), covering 62 countries.
    • The seven key indicators are classified into three pillars: △ implementation (human resources/infrastructure/operational environment) △ innovation (R&D) △ investment (government strategy/commercialization), and assess each country's overall AI ecosystem capabilities in multiple dimensions.

    (2) AI Global Index Dataset can be used to:

    • Global AI leadership pattern analysis: Correlation analysis between seven indicators can identify AI strengths and weaknesses by country and perform group comparisons by region and income level.
    • Machine learning-based predictive model: It can be used for data science education and application, such as country-specific index prediction through regression analysis or classification of AI development types through clustering.

  17. Alabama ESI: INDEX (Index Polygons)

    • catalog.data.gov
    • datasets.ai
    • +1more
    Updated May 29, 2025
    + more versions
    Cite
    (Point of Contact, Custodian) (2025). Alabama ESI: INDEX (Index Polygons) [Dataset]. https://catalog.data.gov/dataset/alabama-esi-index-index-polygons1
    Dataset updated
    May 29, 2025
    Dataset provided by
    (Point of Contact, Custodian)
    Area covered
    Alabama
    Description

    This data set contains vector polygons representing the boundaries of all hardcopy cartographic products produced as part of the Environmental Sensitivity Index (ESI) for Alabama. This data set comprises a portion of the ESI data for Alabama. ESI data characterize the marine and coastal environments and wildlife by their sensitivity to spilled oil. The ESI data include information for three main components: shoreline habitats, sensitive biological resources, and human-use resources.

  18. Index Index dataset

    • data.mendeley.com
    Updated Feb 16, 2023
    Cite
    Ivan Olier (2023). Index Index dataset [Dataset]. http://doi.org/10.17632/8ypy94frxg.1
    Dataset updated
    Feb 16, 2023
    Authors
    Ivan Olier
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This is the data used for the development of the Index Index model.

  19. Data from: A dataset of spatiotemporally sampled MODIS Leaf Area Index with...

    • agdatacommons.nal.usda.gov
    application/csv
    Updated Nov 22, 2025
    Cite
    Yanghui Kang; Mutlu Ozdogan; Feng Gao; Martha C. Anderson; William A. White; Yun Yang; Yang Yang; Tyler A. Erickson (2025). A dataset of spatiotemporally sampled MODIS Leaf Area Index with corresponding Landsat surface reflectance over the contiguous US [Dataset]. http://doi.org/10.15482/USDA.ADC/1521097
    Available download formats: application/csv
    Dataset updated
    Nov 22, 2025
    Dataset provided by
    Ag Data Commons
    Authors
    Yanghui Kang; Mutlu Ozdogan; Feng Gao; Martha C. Anderson; William A. White; Yun Yang; Yang Yang; Tyler A. Erickson
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    United States
    Description

    Leaf Area Index (LAI) is a fundamental vegetation structural variable that drives energy and mass exchanges between the plant and the atmosphere. Moderate-resolution (300m – 7km) global LAI data products have been widely applied to track global vegetation changes, drive Earth system models, monitor crop growth and productivity, etc. Yet, cutting-edge applications in climate adaptation, hydrology, and sustainable agriculture require LAI information at higher spatial resolution (< 100m) to model and understand heterogeneous landscapes. This dataset was built to assist a machine-learning-based approach for mapping LAI from 30m-resolution Landsat images across the contiguous US (CONUS). The data was derived from the Moderate Resolution Imaging Spectroradiometer (MODIS) Version 6 LAI/FPAR, Landsat Collection 1 surface reflectance, and NLCD Land Cover datasets over 2006 – 2018 using Google Earth Engine. Each record/sample/row includes a MODIS LAI value, corresponding Landsat surface reflectance in green, red, NIR, SWIR1 bands, a land cover (biome) type, geographic location, and other auxiliary information. Each sample represents a MODIS LAI pixel (500m) within which a single biome type dominates 90% of the area. The spatial homogeneity of the samples was further controlled by a screening process based on the coefficient of variation of the Landsat surface reflectance. In total, there are approximately 1.6 million samples, stratified by biome, Landsat sensor, and saturation status from the MODIS LAI algorithm. This dataset can be used to train machine learning models and generate LAI maps for Landsat 5, 7, 8 surface reflectance images within CONUS. Detailed information on the sample generation and quality control can be found in the related journal article.

    Resources in this dataset:

    Resource Title: README.
    File Name: LAI_train_samples_CONUS_README.txt
    Resource Description: Description and metadata of the main dataset
    Resource Software Recommended: Notepad, url: https://www.microsoft.com/en-us/p/windows-notepad/9msmlrh6lzf3?activetab=pivot:overviewtab

    Resource Title: LAI_training_samples_CONUS.
    File Name: LAI_train_samples_CONUS_v0.1.1.csv
    Resource Description: This CSV file consists of the training samples for estimating Leaf Area Index based on Landsat surface reflectance images (Collection 1 Tier 1). Each sample has a MODIS LAI value and corresponding surface reflectance derived from Landsat pixels within the MODIS pixel. Contact: Yanghui Kang (kangyanghui@gmail.com)

    Column description

    UID: Unique identifier. Format: LATITUDE_LONGITUDE_SENSOR_PATHROW_DATE
    Landsat_ID: Landsat image ID
    Date: Landsat image date in "YYYYMMDD"
    Latitude: Latitude (WGS84) of the MODIS LAI pixel center
    Longitude: Longitude (WGS84) of the MODIS LAI pixel center
    MODIS_LAI: MODIS LAI value in "m2/m2"
    MODIS_LAI_std: MODIS LAI standard deviation in "m2/m2"
    MODIS_LAI_sat: 0 - MODIS Main (RT) method used no saturation; 1 - MODIS Main (RT) method with saturation
    NLCD_class: Majority class code from the National Land Cover Dataset (NLCD)
    NLCD_frequency: Percentage of the area cover by the majority class from NLCD
    Biome: Biome type code mapped from NLCD (see below for more information)
    Blue: Landsat surface reflectance in the blue band
    Green: Landsat surface reflectance in the green band
    Red: Landsat surface reflectance in the red band
    Nir: Landsat surface reflectance in the near infrared band
    Swir1: Landsat surface reflectance in the shortwave infrared 1 band
    Swir2: Landsat surface reflectance in the shortwave infrared 2 band
    Sun_zenith: Solar zenith angle from the Landsat image metadata. This is a scene-level value.
    Sun_azimuth: Solar azimuth angle from the Landsat image metadata. This is a scene-level value.
    NDVI: Normalized Difference Vegetation Index computed from Landsat surface reflectance
    EVI: Enhanced Vegetation Index computed from Landsat surface reflectance
    NDWI: Normalized Difference Water Index computed from Landsat surface reflectance
    GCI: Green Chlorophyll Index = Nir/Green - 1
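
    As a small worked example of the index columns, the sketch below computes GCI with the formula given above (Nir/Green - 1) and NDVI with its standard definition, (Nir - Red) / (Nir + Red). The source does not state the exact NDVI expression the authors used, so that formula is an assumption here, and the reflectance values are hypothetical rather than taken from the CSV.

```python
def gci(nir, green):
    """Green Chlorophyll Index as defined in the column description: Nir/Green - 1."""
    return nir / green - 1.0


def ndvi(nir, red):
    """Standard NDVI definition (assumed, not confirmed by the source)."""
    return (nir - red) / (nir + red)


# Hypothetical surface reflectance values for one sample row.
row = {"Green": 0.08, "Red": 0.05, "Nir": 0.40}

row_gci = gci(row["Nir"], row["Green"])    # 0.40 / 0.08 - 1, about 4.0
row_ndvi = ndvi(row["Nir"], row["Red"])    # 0.35 / 0.45, about 0.778

print(f"GCI={row_gci:.3f} NDVI={row_ndvi:.3f}")
```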

    Biome code

    1 - Deciduous Forest
    2 - Evergreen Forest
    3 - Mixed Forest
    4 - Shrubland
    5 - Grassland/Pasture
    6 - Cropland
    7 - Woody Wetland
    8 - Herbaceous Wetland

    Reference Dataset: All data was accessed through Google Earth Engine.
    Gorelick, N., Hancher, M., Dixon, M., Ilyushchenko, S., Thau, D., & Moore, R. (2017). Google Earth Engine: Planetary-scale geospatial analysis for everyone. Remote Sensing of Environment.

    MODIS Version 6 Leaf Area Index/FPAR 4-day L4 Global 500m
    Myneni, R., Y. Knyazikhin, T. Park. MOD15A2H MODIS/Terra Leaf Area Index/FPAR 8-Day L4 Global 500m SIN Grid V006. 2015, distributed by NASA EOSDIS Land Processes DAAC, https://doi.org/10.5067/MODIS/MOD15A2H.006

    Landsat 5/7/8 Collection 1 Surface Reflectance
    Landsat Level-2 Surface Reflectance Science Product courtesy of the U.S. Geological Survey.
    Masek, J.G., Vermote, E.F., Saleous N.E., Wolfe, R., Hall, F.G., Huemmrich, K.F., Gao, F., Kutler, J., and Lim, T-K. (2006). A Landsat surface reflectance dataset for North America, 1990–2000. IEEE Geoscience and Remote Sensing Letters 3(1):68-72. http://dx.doi.org/10.1109/LGRS.2005.857030.
    Vermote, E., Justice, C., Claverie, M., & Franch, B. (2016). Preliminary analysis of the performance of the Landsat 8/OLI land surface reflectance product. Remote Sensing of Environment. http://dx.doi.org/10.1016/j.rse.2016.04.008.

    National Land Cover Dataset (NLCD)
    Yang, Limin, Jin, Suming, Danielson, Patrick, Homer, Collin G., Gass, L., Bender, S.M., Case, Adam, Costello, C., Dewitz, Jon A., Fry, Joyce A., Funk, M., Granneman, Brian J., Liknes, G.C., Rigge, Matthew B., Xian, George, A new generation of the United States National Land Cover Database—Requirements, research priorities, design, and implementation strategies: ISPRS Journal of Photogrammetry and Remote Sensing, v. 146, p. 108–123, at https://doi.org/10.1016/j.isprsjprs.2018.09.006

    Resource Software Recommended: Microsoft Excel, url: https://www.microsoft.com/en-us/microsoft-365/excel

  20. T

    United States CFNAI Sales, Orders and Inventories Index

    • tradingeconomics.com
    • fa.tradingeconomics.com
    • +13more
    csv, excel, json, xml
    Updated Aug 15, 2025
    Cite
    TRADING ECONOMICS (2025). United States CFNAI Sales, Orders and Inventories Index [Dataset]. https://tradingeconomics.com/united-states/cfnai-sales-orders-and-inventories-index
    Available download formats: xml, json, excel, csv
    Dataset updated
    Aug 15, 2025
    Dataset authored and provided by
    TRADING ECONOMICS
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Mar 31, 1967 - Aug 31, 2025
    Area covered
    United States
    Description

    CFNAI Sales Orders and Inventories Index in the United States increased to 0 percent in August from -0.02 percent in July of 2025. This dataset includes a chart with historical data for the United States CFNAI Sales, Orders and Inventories Index.

Cite
Martin Krallinger; Aitor Gonzalez-Agirre; Alejandro Asensio (2022). MESINESP: Medical Semantic Indexing in Spanish - Development dataset [Dataset]. https://data-staging.niaid.nih.gov/resources?id=zenodo_3746595

MESINESP: Medical Semantic Indexing in Spanish - Development dataset

Dataset updated
Nov 5, 2022
Dataset provided by
Barcelona Supercomputing Center
Authors
Martin Krallinger; Aitor Gonzalez-Agirre; Alejandro Asensio
License

Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Description

Please use the MESINESP2 corpus (the second edition of the shared task), since it has a higher level of curation and quality and is organized by document type (scientific articles, patents and clinical trials).

Introduction

The Mesinesp (Spanish BioASQ track, see https://temu.bsc.es/mesinesp) development set has a total of 750 records indexed manually by seven experienced medical literature indexers. Indexing is done using DeCS codes, a sort of Spanish equivalent to MeSH terms. Records were distributed so that each article was annotated by at least two different human indexers.

The data annotation process consisted of two steps:

Manual indexing step. DeCS codes were manually assigned to each record following the DeCS manual indexing guidelines.

Manual validation and consensus. The combined set of manually indexed DeCS codes generated by both indexers was manually revised and corrected.

Agreement between the annotations was then measured using the Jaccard index.

Records consisted mainly of medical literature abstracts and titles from the IBECS and LILACS databases.

Zip structure

The zip file contains two different development sets:

Official development set, which has the union of the annotations, with an agreement of macro = 0.6568 and micro = 0.6819. This set is composed of all the different (unique) DeCS codes that have been added by any annotator for each document; and

Core-descriptors development set, which has the intersection of the annotations, with an agreement of macro = 1.0 and micro = 1.0. This set is composed of the common DeCS codes that have been added by two or more annotators for each document.
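
The relationship between the two sets and the agreement figures can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the document identifiers and codes are made up, and the macro and micro definitions used here (mean of per-document Jaccard scores versus pooled intersection over pooled union) are a common convention, not something the dataset description confirms.

```python
def jaccard(a: set, b: set) -> float:
    """Jaccard index: size of the intersection over size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Hypothetical codes assigned by two indexers to two documents.
annotations = {
    "doc1": ({"code1", "code2", "code3"}, {"code2", "code3", "code4"}),
    "doc2": ({"code5", "code6"}, {"code5", "code6"}),
}

# Official set: every code added by any annotator (union).
official = {doc: a | b for doc, (a, b) in annotations.items()}
# Core-descriptors set: only codes added by both annotators (intersection).
core = {doc: a & b for doc, (a, b) in annotations.items()}

# Macro (assumed definition): average the per-document scores.
macro = sum(jaccard(a, b) for a, b in annotations.values()) / len(annotations)
# Micro (assumed definition): pool the counts over all documents, then divide.
micro = (sum(len(a & b) for a, b in annotations.values())
         / sum(len(a | b) for a, b in annotations.values()))

print(sorted(official["doc1"]), sorted(core["doc1"]), macro, micro)
```

With only two annotators per document, the codes shared by "two or more annotators" are exactly the intersection, which is why the sketch uses `a & b` for the core set.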

Corpus format

Each dataset is a JSON object with a single key named "articles", which contains a list of documents. The raw file has one line per document, plus two additional lines (the first and the last) that enclose the list of documents. The expected data types are as follows:

{"articles":[ {"abstractText":str,"db":str,"decsCodes":list,"id":str,"journal":str,"title":str,"year":int}, ... ]}

To clarify, the order of appearance of the fields in each document is as follows (note that this example is pretty-printed for readability):

{
  "articles": [
    {
      "abstractText": "Content of the abstract",
      "db": "Name of the source database",
      "decsCodes": [
        "code1",
        "code2",
        "code3"
      ],
      "id": "Id of the document",
      "journal": "Name of the journal",
      "title": "Title of the document",
      "year": 2019
    }
  ]
}

Note: The fields "db", "journal" and "year" might be null.
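
A minimal sketch of reading this format with Python's standard json module is shown below. The inline sample string stands in for a real file and its values are made up; the helper name load_articles is hypothetical. JSON null values are returned as None, so the three nullable fields are checked before use.

```python
import json

# A tiny in-memory example following the documented layout (values are made up).
raw = """{"articles":[
{"abstractText":"Content of the abstract","db":null,"decsCodes":["code1","code2"],"id":"doc-1","journal":null,"title":"Title of the document","year":null}
]}"""


def load_articles(text: str) -> list:
    """Parse the corpus text and return the list stored under "articles"."""
    return json.loads(text)["articles"]


for article in load_articles(raw):
    # "db", "journal" and "year" may be null, which json maps to None.
    year = article["year"] if article["year"] is not None else "unknown"
    print(article["id"], year, article["decsCodes"])
```

For a file on disk, the same function can be fed the result of reading the file as UTF-8 text.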

Copyright (c) 2020 Secretaría de Estado de Digitalización e Inteligencia Artificial
