100+ datasets found

MESINESP: Medical Semantic Indexing in Spanish - Development dataset
data-staging.niaid.nih.gov
live.european-language-grid.eu
+2 more
Updated Nov 5, 2022
Cite
Martin Krallinger; Aitor Gonzalez-Agirre; Alejandro Asensio (2022). MESINESP: Medical Semantic Indexing in Spanish - Development dataset [Dataset]. https://data-staging.niaid.nih.gov/resources?id=zenodo_3746595
Dataset updated
Nov 5, 2022
Dataset provided by
Barcelona Supercomputing Center
Authors
Martin Krallinger; Aitor Gonzalez-Agirre; Alejandro Asensio
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Please use the MESINESP2 corpus (the second edition of the shared task), since it has a higher level of curation and quality and is organized by document type (scientific articles, patents and clinical trials).

Introduction

The Mesinesp (Spanish BioASQ track, see https://temu.bsc.es/mesinesp) development set contains 750 records indexed manually by seven experienced medical literature indexers. Indexing uses DeCS codes, a rough Spanish equivalent of MeSH terms. Records were distributed so that each article was annotated by at least two different human indexers.

The data annotation process consisted of two steps:

Manual indexing step. DeCS codes were manually assigned to each record following the DeCS manual indexing guidelines.

Manual validation and consensus. The combined set of DeCS codes assigned by both indexers was manually reviewed and corrected.

Agreement between the annotations was then measured using the Jaccard index.

Records consisted mainly of titles and abstracts of medical literature from the IBECS and LILACS databases.

Zip structure

The zip file contains two different development sets:

Official development set, which has the union of the annotations, with an agreement of macro = 0.6568 and micro = 0.6819. This set is composed of all the different (unique) DeCS codes that have been added by any annotator for each document; and

Core-descriptors development set, which has the intersection of the annotations, with an agreement of macro = 1.0 and micro = 1.0. This set is composed of the common DeCS codes that have been added by two or more annotators for each document.
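The relationship between the two sets and the Jaccard index can be illustrated with a small sketch. The DeCS-style codes below are made up for illustration, and the function names are hypothetical; the source does not say how the macro and micro figures were aggregated, so only the per-document computation is shown.

```python
def jaccard(a, b):
    """Jaccard index |A ∩ B| / |A ∪ B|; treated as 1.0 when both sets are empty."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def merge_annotations(code_sets):
    """Return (union, core) for one document.

    union: every code assigned by any annotator (official set).
    core:  codes assigned by two or more annotators (core-descriptors set).
    """
    sets = [set(s) for s in code_sets]
    union = set().union(*sets)
    core = {code for code in union if sum(code in s for s in sets) >= 2}
    return union, core


# Hypothetical annotations of one document by two indexers.
annotator_a = ["D000001", "D000002", "D000003"]
annotator_b = ["D000002", "D000003", "D000004"]

union, core = merge_annotations([annotator_a, annotator_b])
agreement = jaccard(annotator_a, annotator_b)  # 2 shared codes out of 4 distinct
```

With exactly two annotators, the core set is simply the intersection of their codes, which matches the description of the core-descriptors set above.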

Corpus format

Each dataset is a JSON object with a single key, "articles", which contains a list of documents. The raw file therefore has one line per document plus two additional lines (the first and the last) that enclose the list. The expected data types are as follows:

{"articles":[ {"abstractText":str,"db":str,"decsCodes":list,"id":str,"journal":str,"title":str,"year":int}, ... ]}

To clarify, the fields in each document appear in the following order (this example is pretty-printed for readability):

{
  "articles": [
    {
      "abstractText": "Content of the abstract",
      "db": "Name of the source database",
      "decsCodes": [
        "code1",
        "code2",
        "code3"
      ],
      "id": "Id of the document",
      "journal": "Name of the journal",
      "title": "Title of the document",
      "year": 2019
    }
  ]
}

Note: The fields "db", "journal" and "year" might be null.
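A minimal sketch of reading this format with Python's standard json module is shown below. It parses an in-memory string so that it runs without the actual zip file; the function name parse_articles and the sample record are illustrative assumptions, not part of the corpus tooling.

```python
import json

REQUIRED_FIELDS = ("abstractText", "db", "decsCodes", "id", "journal", "title", "year")
NULLABLE_FIELDS = {"db", "journal", "year"}  # per the note above


def parse_articles(raw):
    """Parse the corpus JSON text and check each document against the described schema."""
    articles = json.loads(raw)["articles"]
    for doc in articles:
        for field in REQUIRED_FIELDS:
            if field not in doc:
                raise ValueError(f"document {doc.get('id')!r} is missing {field!r}")
            if doc[field] is None and field not in NULLABLE_FIELDS:
                raise ValueError(f"field {field!r} must not be null")
    return articles


# Hypothetical one-document sample in the described layout.
sample = """{"articles":[
{"abstractText":"Content of the abstract","db":null,"decsCodes":["code1","code2"],"id":"doc-1","journal":null,"title":"Title of the document","year":null}
]}"""

articles = parse_articles(sample)
codes_by_id = {doc["id"]: doc["decsCodes"] for doc in articles}
```

To read a real file, the same function could be applied to the text returned by open(path, encoding="utf-8").read(), where path points to the extracted development set.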

Copyright (c) 2020 Secretaría de Estado de Digitalización e Inteligencia Artificial
Real-Time Index Database Report
marketreportanalytics.com
doc, pdf, ppt
Updated Apr 10, 2025
Cite
Market Report Analytics (2025). Real-Time Index Database Report [Dataset]. https://www.marketreportanalytics.com/reports/real-time-index-database-75396
Available download formats: ppt, doc, pdf
Dataset updated
Apr 10, 2025
Dataset authored and provided by
Market Report Analytics
License
https://www.marketreportanalytics.com/privacy-policy
Time period covered
2025 - 2033
Area covered
Global
Variables measured
Market Size
Description
Unlock the power of real-time data! Explore the booming real-time index database market, projected to reach $32 billion by 2033. Discover key trends, leading companies (Elastic, AWS, Splunk), and regional insights in this comprehensive market analysis.
City Happiness Index - 2024
kaggle.com
zip
Updated Jan 22, 2024
Cite
EMİRHAN BULUT (2024). City Happiness Index - 2024 [Dataset]. https://www.kaggle.com/datasets/emirhanai/city-happiness-index-2024
Available download formats: zip (7931 bytes)
Dataset updated
Jan 22, 2024
Authors
EMİRHAN BULUT
License
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Description
Dataset Name: City Happiness Index

Dataset Description:

This dataset and the related code are original work prepared entirely by Emirhan BULUT. The dataset includes key features and measurements from various cities around the world, focusing on factors that may affect the overall happiness score of each city. By analyzing these factors, we aim to gain insights into the living conditions and satisfaction of the population in urban environments.

The dataset consists of the following features:

City: Name of the city.

Month: The month in which the data is recorded.

Year: The year in which the data is recorded.

Decibel_Level: Average noise levels in decibels, indicating the auditory comfort of the citizens.

Traffic_Density: Level of traffic density (Low, Medium, High, Very High), which might impact citizens' daily commute and stress levels.

Green_Space_Area: Percentage of green spaces in the city, positively contributing to the mental well-being and relaxation of the inhabitants.

Air_Quality_Index: Index measuring the quality of air, a crucial aspect affecting citizens' health and overall satisfaction.

Happiness_Score: The average happiness score of the city (on a 1-10 scale), representing the subjective well-being of the population.

Cost_of_Living_Index: Index measuring the cost of living in the city (relative to a reference city), which could impact the financial satisfaction of the citizens.

Healthcare_Index: Index measuring the quality of healthcare in the city, an essential component of the population's well-being and contentment.

With these features, the dataset aims to analyze and understand the relationship between various urban factors and the happiness of a city's population. The developed Deep Q-Network model, PIYAAI_2, is designed to learn from this data to provide accurate predictions in future scenarios. Using Reinforcement Learning, the model is expected to improve its performance over time as it learns from new data and adapts to changes in the environment.
Index match, Index match Advance
kaggle.com
zip
Updated Mar 15, 2024
Cite
Sanjana Murthy (2024). Index match, Index match Advance [Dataset]. https://www.kaggle.com/datasets/sanjanamurthy392/index-match-index-match-advance
Available download formats: zip (10258 bytes)
Dataset updated
Mar 15, 2024
Authors
Sanjana Murthy
License
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Description
This data contains Index match, index match Advance
Historical S&P 500 (^GSPC) Index Data (1927–2025)
kaggle.com
zip
Updated Aug 31, 2025
Cite
Reza Nematpour (2025). Historical S&P 500 (^GSPC) Index Data (1927–2025) [Dataset]. https://www.kaggle.com/datasets/rezanematpour/historical-s-and-p-500-gspc-index-data-19272025
Available download formats: zip (350147 bytes)
Dataset updated
Aug 31, 2025
Authors
Reza Nematpour
Description
This dataset contains the full historical record of the S&P 500 index (^GSPC), downloaded via the Yahoo Finance API using the yfinance Python library.

The dataset includes:
- Date: Trading date
- Open, High, Low, Close: Daily price levels
- Volume: Daily trading volume

Period covered: Dec 30, 1927 – Aug 31, 2025
Frequency: Daily

⚠️ Disclaimer: This dataset is provided for educational and research purposes only. Redistribution or commercial use may be subject to Yahoo Finance’s Terms of Service.

License

Data sourced from Yahoo Finance. Provided for educational and research purposes only. Redistribution may be restricted.
United States Dallas Fed Manufacturing Shipments Index
tradingeconomics.com
jp.tradingeconomics.com
+13 more
csv, excel, json, xml
Cite
TRADING ECONOMICS, United States Dallas Fed Manufacturing Shipments Index [Dataset]. https://tradingeconomics.com/united-states/dallas-fed-manufacturing-shipments-index
Available download formats: xml, excel, csv, json
Dataset authored and provided by
TRADING ECONOMICS
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
Jun 30, 2004 - Nov 30, 2025
Area covered
United States
Description
Dallas Fed Manufacturing Shipments Index in the United States increased to 15.10 points in November from 5.80 points in October of 2025. This dataset includes a chart with historical data for the United States Dallas Fed Manufacturing Shipments Index.
Indexing Magic Cards Dataset
universe.roboflow.com
zip
Updated Oct 13, 2025
Cite
MagicAl Index (2025). Indexing Magic Cards Dataset [Dataset]. https://universe.roboflow.com/magical-index/indexing-magic-cards-gmo7a/dataset/1
Available download formats: zip
Dataset updated
Oct 13, 2025
Dataset authored and provided by
MagicAl Index
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Variables measured
Magic Cards Bounding Boxes
Description
Indexing Magic Cards

## Overview
Indexing Magic Cards is a dataset for object detection tasks - it contains Magic Cards annotations for 297 images.

## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.

## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/CC BY 4.0).
Transportation Services Index - Passenger
catalog.data.gov
data.virginia.gov
+1 more
Updated Jan 2, 2025
Cite
Bureau of Transportation Statistics (2025). Transportation Services Index - Passenger [Dataset]. https://catalog.data.gov/dataset/transportation-services-index-passenger
Dataset updated
Jan 2, 2025
Dataset provided by
Bureau of Transportation Statistics http://www.rita.dot.gov/bts
Description
A monthly measure of the volume of services performed by the for-hire transportation sector. The index covers the activities of local mass transit, intercity passenger rail, and passenger air transportation.
Environmental Quality Index
catalog.data.gov
s.cnmilf.com
Updated Nov 12, 2020
Cite
U.S. EPA Office of Research and Development (ORD) (2020). Environmental Quality Index [Dataset]. https://catalog.data.gov/dataset/environmental-quality-index
Dataset updated
Nov 12, 2020
Dataset provided by
United States Environmental Protection Agency http://www.epa.gov/
Description
An Environmental Quality Index (EQI) for all counties in the United States for the time period 2000-2005 was developed which incorporated data from five environmental domains: air, water, land, built, and socio-demographic. The EQI was developed in four parts: domain identification; data source identification and review; variable construction; and data reduction using principal components analysis (PCA). The methods applied provide a reproducible approach that capitalizes almost exclusively on publicly available data sources. The primary goal in creating the EQI is to use it as a composite environmental indicator for research on human health. A series of peer-reviewed manuscripts utilized the EQI in examining health outcomes.

This dataset is not publicly accessible because this series of papers is considered human health research and is not to be loaded onto ScienceHub. The EQI data can be accessed at: https://edg.epa.gov/data/Public/ORD/NHEERL/EQI. Format: EQI data, metadata, formats, and data dictionary are all available at the website.

This dataset is associated with the following publications:

Gray, C., L. Messer, K. Rappazzo, J. Jagai, S. Grabich, and D. Lobdell. The association between physical inactivity and obesity is modified by five domains of environmental quality in U.S. adults: A cross-sectional study. PLoS ONE. Public Library of Science, San Francisco, CA, USA, 13(8): e0203301, (2018).

Patel, A., J. Jagai, L. Messer, C. Gray, K. Rappazzo, S. DeflorioBarker, and D. Lobdell. Associations between environmental quality and infant mortality in the United States, 2000-2005. Archives of Public Health. BioMed Central Ltd, London, UK, 76(60): 1, (2018).

Gray, C., D. Lobdell, K. Rappazzo, Y. Jian, J. Jagai, L. Messer, A. Patel, S. Deflorio-Barker, C. Lyttle, J. Solway, and A. Rzhetsky. Associations between environmental quality and adult asthma prevalence in medical claims data. ENVIRONMENTAL RESEARCH. Elsevier B.V., Amsterdam, NETHERLANDS, 166: 529-536, (2018).
Data from: chis_shore - Coastal Vulnerability Index (CVI) dataset for...
data.cnra.ca.gov
data.amerigeoss.org
zip
Updated May 8, 2019
Cite
Ocean Data Partners (2019). chis_shore - Coastal Vulnerability Index (CVI) dataset for Channel Islands National Park [Dataset]. https://data.cnra.ca.gov/dataset/chis_shore-coastal-vulnerability-index-cvi-dataset-for-channel-islands-national-park
Available download formats: zip
Dataset updated
May 8, 2019
Dataset authored and provided by
Ocean Data Partners
Area covered
Channel Islands of California
Description
A coastal vulnerability index (CVI) was used to map the relative vulnerability of the coast to future sea-level rise within Channel Islands National Park in California. The CVI ranks the following in terms of their physical contribution to sea-level rise-related coastal change: geomorphology, regional coastal slope, rate of relative sea-level rise, historical shoreline change rates, mean tidal range and mean significant wave height. The rankings for each input variable were combined and an index value calculated for 1-minute grid cells covering the park. The CVI highlights those regions where the physical effects of sea-level rise might be the greatest. This approach combines the coastal system's susceptibility to change with its natural ability to adapt to changing environmental conditions, yielding a quantitative, although relative, measure of the park's natural vulnerability to the effects of sea-level rise. The CVI and the data contained within this dataset provide an objective technique for evaluation and long-term planning by scientists and park managers.
Cyanobacteria Index (MERIS)
catalog.data.gov
s.cnmilf.com
+2 more
Updated Nov 12, 2020
Cite
U.S. EPA Office of Research and Development (ORD) (2020). Cyanobacteria Index (MERIS) [Dataset]. https://catalog.data.gov/dataset/cyanobacteria-index-meris
Dataset updated
Nov 12, 2020
Dataset provided by
United States Environmental Protection Agency http://www.epa.gov/
Description
This dataset shows the concentration of cyanobacteria (cells/ml) in freshwater bodies and estuaries of Ohio and Florida, derived from 300x300 meter MEdium Resolution Imaging Spectrometer (MERIS) satellite imagery. This dataset was produced through partnership with the National Oceanic and Atmospheric Administration (NOAA), the National Aeronautics and Space Administration (NASA), the United States Geological Survey (USGS), and the United States Environmental Protection Agency (USEPA). This cyanobacteria dataset was derived using the European Space Agency (ESA) Envisat satellite and MERIS instrument. MERIS is a 68.5 degree field-of-view nadir-pointing imaging spectrometer which measures the solar radiation reflected by the Earth in 15 spectral bands (visible and near-infrared). MERIS imagery was used to identify long-wavelength spectral bands (from red through near-infrared portion of the spectrum) to locate algal blooms within freshwaters and estuaries of the continental United States. This dataset is associated with the following publication: Urquhart, E., B. Schaeffer, R. Stumpf, K. Loftin, and J. Wedell. A method for examining temporal changes in cyanobacterial harmful algal bloom spatial extent using satellite remote sensing. Harmful Algae. Elsevier B.V., Amsterdam, NETHERLANDS, 67: 144-152, (2017).
Dataset used for research on predicting next day closing price
figshare.com
application/csv
Updated Jul 3, 2024
Cite
Ahmad Firdaus Cayzer (2024). Dataset used for research on predicting next day closing price [Dataset]. http://doi.org/10.6084/m9.figshare.26169403.v1
Available download formats: application/csv
Unique identifier
https://doi.org/10.6084/m9.figshare.26169403.v1
Dataset updated
Jul 3, 2024
Dataset provided by
Figshare http://figshare.com/
Authors
Ahmad Firdaus Cayzer
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
There are a total of 5 datasets: sp500_data, sp500_newFeatures_data, sp500_lagged_data, nasdaq_lagged_data and hsi_lagged_data.

The first dataset contains 34 years of data, from 1990 to 2023, for the S&P 500 stock index. This dataset has been preprocessed and is used for training and testing. The second dataset extends the first with new features derived from it. The third dataset is a different transformation of the first dataset in which the features are mostly lagged features. The fourth dataset contains 10 years of data for the NASDAQ index, from 2014 to 2023, in the same lagged-feature format as the third dataset. The fifth dataset contains 10 years of data, from 2014 to 2023, for the HSI stock index, also in the same format as the third dataset.

All five datasets were used in research on predicting tomorrow's closing price from today's financial features.
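As a rough illustration of what lagged features for next-day prediction can look like, the sketch below pairs the most recent closing prices with the following day's close. It is a generic construction under assumed inputs (a plain list of closes and a hypothetical function name), not the authors' actual feature set, which the description does not specify.

```python
def make_lagged_rows(closes, n_lags):
    """Build (features, target) pairs for next-day prediction.

    features: the n_lags most recent closes, newest first (lag 0, lag 1, ...).
    target:   the close on the following day.
    """
    rows = []
    for t in range(n_lags - 1, len(closes) - 1):
        features = [closes[t - k] for k in range(n_lags)]
        rows.append((features, closes[t + 1]))
    return rows


# Made-up closing prices, oldest first.
closes = [10.0, 11.0, 12.0, 13.0, 14.0]
rows = make_lagged_rows(closes, n_lags=2)
```

Each row uses only information available up to day t, so the target (day t + 1) never leaks into the features.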
US House Price Index Prediction Dataset
kaggle.com
zip
Updated Jun 8, 2024
Cite
Mohit Gupta (2024). US House Price Index Prediction Dataset [Dataset]. https://www.kaggle.com/datasets/mohitgupta12/us-house-price-index-prediction-dataset
Available download formats: zip (26547 bytes)
Dataset updated
Jun 8, 2024
Authors
Mohit Gupta
License
https://creativecommons.org/publicdomain/zero/1.0/
Area covered
United States
Description
Dataset

This dataset was created by Mohit Gupta

Released under CC0: Public Domain

United States CFNAI Employment, Unemployment and Hours Index
tradingeconomics.com
ru.tradingeconomics.com
+13 more
csv, excel, json, xml
Updated Aug 15, 2025
Cite
TRADING ECONOMICS (2025). United States CFNAI Employment, Unemployment and Hours Index [Dataset]. https://tradingeconomics.com/united-states/cfnai-employment-index
Available download formats: csv, json, xml, excel
Dataset updated
Aug 15, 2025
Dataset authored and provided by
TRADING ECONOMICS
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
Mar 31, 1967 - Aug 31, 2025
Area covered
United States
Description
CFNAI Employment Index in the United States increased to -0.07 points in August from -0.10 points in July of 2025. This dataset includes a chart with historical data for the United States CFNAI Employment Index.
faiss-512-wikipedia-202308
kaggle.com
zip
Updated Sep 16, 2023
Cite
averagemn (2023). faiss-512-wikipedia-202308 [Dataset]. https://www.kaggle.com/datasets/donkeys/faiss-512-wikipedia-202308
Available download formats: zip (8678597313 bytes)
Dataset updated
Sep 16, 2023
Authors
averagemn
Description
A FAISS index for Wikipedia documents chunked from the early August 2023 Wikipedia dump, with FAISS doc IDs matching the doc IDs in these two pre-chunked databases:

https://www.kaggle.com/datasets/donkeys/wikipedia-202308-64tk/data
https://www.kaggle.com/datasets/donkeys/wikipedia-202308-chunks-256tk-sqlite

See the notebook that uses this dataset for example code. The index can be used to look up similarities to given indices, and the returned ID values can be used to retrieve the closest matching documents along with their document chunks, which can in turn be used for finer-grained similarity search.

The embedding model used to build the index: https://www.kaggle.com/datasets/donkeys/bge-small-en/data
AI Global Index Dataset
cubig.ai
zip
Updated Jun 30, 2025
Cite
CUBIG (2025). AI Global Index Dataset [Dataset]. https://cubig.ai/store/products/529/ai-global-index-dataset
Available download formats: zip
Dataset updated
Jun 30, 2025
Dataset authored and provided by
CUBIG
License
https://cubig.ai/store/terms-of-service
Measurement technique
Privacy-preserving data transformation via differential privacy, Synthetic data generation using AI techniques for model training
Description
1) Data Introduction
• The AI Global Index Dataset is a comprehensive index that benchmarks 62 countries on their level of AI investment, innovation, and implementation. It includes seven key indicators (human resources, infrastructure, operational environment, research, development, government strategy, commercialization) and general information by country (region, cluster, income group, political system).

2) Data Utilization
(1) Characteristics of the AI Global Index Dataset:
• The dataset consists of 13 columns, with 5 categorical variables (regions, clusters, etc.) and 8 numerical variables (scores for each indicator), covering 62 countries.
• The seven key indicators are classified into three pillars: △ implementation (human resources/infrastructure/operational environment) △ innovation (R&D) △ investment (government strategy/commercialization), and assess each country's overall AI ecosystem capabilities along multiple dimensions.
(2) Uses of the AI Global Index Dataset:
• Global AI leadership pattern analysis: Correlation analysis between the seven indicators can identify AI strengths and weaknesses by country and support group comparisons by region and income level.
• Machine learning-based predictive modeling: The dataset can be used for data science education and application, such as country-specific index prediction through regression analysis or classification of AI development types through clustering.
Alabama ESI: INDEX (Index Polygons)
catalog.data.gov
datasets.ai
+1 more
Updated May 29, 2025
Cite
(Point of Contact, Custodian) (2025). Alabama ESI: INDEX (Index Polygons) [Dataset]. https://catalog.data.gov/dataset/alabama-esi-index-index-polygons1
Dataset updated
May 29, 2025
Dataset provided by
(Point of Contact, Custodian)
Area covered
Alabama
Description
This data set contains vector polygons representing the boundaries of all hardcopy cartographic products produced as part of the Environmental Sensitivity Index (ESI) for Alabama. This data set comprises a portion of the ESI data for Alabama. ESI data characterize the marine and coastal environments and wildlife by their sensitivity to spilled oil. The ESI data include information for three main components: shoreline habitats, sensitive biological resources, and human-use resources.
Index Index dataset
data.mendeley.com
Updated Feb 16, 2023
Cite
Ivan Olier (2023). Index Index dataset [Dataset]. http://doi.org/10.17632/8ypy94frxg.1
Unique identifier
https://doi.org/10.17632/8ypy94frxg.1
Dataset updated
Feb 16, 2023
Authors
Ivan Olier
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This is the data used for the development of the Index Index model.
Data from: A dataset of spatiotemporally sampled MODIS Leaf Area Index with...
agdatacommons.nal.usda.gov
application/csv
Updated Nov 22, 2025
Cite
Yanghui Kang; Mutlu Ozdogan; Feng Gao; Martha C. Anderson; William A. White; Yun Yang; Yang Yang; Tyler A. Erickson (2025). A dataset of spatiotemporally sampled MODIS Leaf Area Index with corresponding Landsat surface reflectance over the contiguous US [Dataset]. http://doi.org/10.15482/USDA.ADC/1521097
Available download formats: application/csv
Unique identifier
https://doi.org/10.15482/USDA.ADC/1521097
Dataset updated
Nov 22, 2025
Dataset provided by
Ag Data Commons
Authors
Yanghui Kang; Mutlu Ozdogan; Feng Gao; Martha C. Anderson; William A. White; Yun Yang; Yang Yang; Tyler A. Erickson
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
United States
Description
Leaf Area Index (LAI) is a fundamental vegetation structural variable that drives energy and mass exchanges between the plant and the atmosphere. Moderate-resolution (300m – 7km) global LAI data products have been widely applied to track global vegetation changes, drive Earth system models, monitor crop growth and productivity, etc. Yet, cutting-edge applications in climate adaptation, hydrology, and sustainable agriculture require LAI information at higher spatial resolution (< 100m) to model and understand heterogeneous landscapes.

This dataset was built to assist a machine-learning-based approach for mapping LAI from 30m-resolution Landsat images across the contiguous US (CONUS). The data was derived from the Moderate Resolution Imaging Spectroradiometer (MODIS) Version 6 LAI/FPAR, Landsat Collection 1 surface reflectance, and NLCD Land Cover datasets over 2006 – 2018 using Google Earth Engine. Each record/sample/row includes a MODIS LAI value, corresponding Landsat surface reflectance in green, red, NIR, SWIR1 bands, a land cover (biome) type, geographic location, and other auxiliary information. Each sample represents a MODIS LAI pixel (500m) within which a single biome type dominates 90% of the area. The spatial homogeneity of the samples was further controlled by a screening process based on the coefficient of variation of the Landsat surface reflectance. In total, there are approximately 1.6 million samples, stratified by biome, Landsat sensor, and saturation status from the MODIS LAI algorithm. This dataset can be used to train machine learning models and generate LAI maps for Landsat 5, 7, 8 surface reflectance images within CONUS. Detailed information on the sample generation and quality control can be found in the related journal article.

Resources in this dataset:

Resource Title: README.
File Name: LAI_train_samples_CONUS_README.txt
Resource Description: Description and metadata of the main dataset
Resource Software Recommended: Notepad, url: https://www.microsoft.com/en-us/p/windows-notepad/9msmlrh6lzf3?activetab=pivot:overviewtab

Resource Title: LAI_training_samples_CONUS.
File Name: LAI_train_samples_CONUS_v0.1.1.csv
Resource Description: This CSV file consists of the training samples for estimating Leaf Area Index based on Landsat surface reflectance images (Collection 1 Tier 1). Each sample has a MODIS LAI value and corresponding surface reflectance derived from Landsat pixels within the MODIS pixel. Contact: Yanghui Kang (kangyanghui@gmail.com)
Column description

UID: Unique identifier. Format: LATITUDE_LONGITUDE_SENSOR_PATHROW_DATE
Landsat_ID: Landsat image ID
Date: Landsat image date in "YYYYMMDD"
Latitude: Latitude (WGS84) of the MODIS LAI pixel center
Longitude: Longitude (WGS84) of the MODIS LAI pixel center
MODIS_LAI: MODIS LAI value in "m2/m2"
MODIS_LAI_std: MODIS LAI standard deviation in "m2/m2"
MODIS_LAI_sat: 0 - MODIS Main (RT) method used no saturation; 1 - MODIS Main (RT) method with saturation
NLCD_class: Majority class code from the National Land Cover Dataset (NLCD)
NLCD_frequency: Percentage of the area covered by the majority class from NLCD
Biome: Biome type code mapped from NLCD (see below for more information)
Blue: Landsat surface reflectance in the blue band
Green: Landsat surface reflectance in the green band
Red: Landsat surface reflectance in the red band
Nir: Landsat surface reflectance in the near infrared band
Swir1: Landsat surface reflectance in the shortwave infrared 1 band
Swir2: Landsat surface reflectance in the shortwave infrared 2 band
Sun_zenith: Solar zenith angle from the Landsat image metadata. This is a scene-level value.
Sun_azimuth: Solar azimuth angle from the Landsat image metadata. This is a scene-level value.
NDVI: Normalized Difference Vegetation Index computed from Landsat surface reflectance
EVI: Enhanced Vegetation Index computed from Landsat surface reflectance
NDWI: Normalized Difference Water Index computed from Landsat surface reflectance
GCI: Green Chlorophyll Index = Nir/Green - 1
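Two of the derived columns can be recomputed from the reflectance columns, as sketched below. GCI follows the formula given in the column description; NDVI uses its standard definition, (Nir - Red) / (Nir + Red). The sample rows and UID values are made up, and the sketch assumes the Nir, Red and Green column names listed above and nonzero denominators.

```python
import csv
import io


def ndvi(nir, red):
    """Normalized Difference Vegetation Index: (Nir - Red) / (Nir + Red)."""
    return (nir - red) / (nir + red)


def gci(nir, green):
    """Green Chlorophyll Index as given in the column description: Nir / Green - 1."""
    return nir / green - 1.0


# Hypothetical rows using a subset of the documented column names.
sample_csv = """UID,Green,Red,Nir
sample_a,0.10,0.10,0.40
sample_b,0.08,0.05,0.45
"""

rows = list(csv.DictReader(io.StringIO(sample_csv)))
indices = {
    row["UID"]: (
        ndvi(float(row["Nir"]), float(row["Red"])),
        gci(float(row["Nir"]), float(row["Green"])),
    )
    for row in rows
}
```

Both indices are ratios of reflectances, so they give the same result whether the reflectance values are stored as fractions or as scaled integers, provided all bands share the same scale.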

Biome code

1 - Deciduous Forest
2 - Evergreen Forest
3 - Mixed Forest
4 - Shrubland
5 - Grassland/Pasture
6 - Cropland
7 - Woody Wetland
8 - Herbaceous Wetland

Reference Dataset: All data was accessed through Google Earth Engine.
Gorelick, N., Hancher, M., Dixon, M., Ilyushchenko, S., Thau, D., & Moore, R. (2017). Google Earth Engine: Planetary-scale geospatial analysis for everyone. Remote Sensing of Environment.

MODIS Version 6 Leaf Area Index/FPAR 4-day L5 Global 500m
Myneni, R., Y. Knyazikhin, T. Park. MOD15A2H MODIS/Terra Leaf Area Index/FPAR 8-Day L4 Global 500m SIN Grid V006. 2015, distributed by NASA EOSDIS Land Processes DAAC, https://doi.org/10.5067/MODIS/MOD15A2H.006

Landsat 5/7/8 Collection 1 Surface Reflectance
Landsat Level-2 Surface Reflectance Science Product courtesy of the U.S. Geological Survey.
Masek, J.G., Vermote, E.F., Saleous N.E., Wolfe, R., Hall, F.G., Huemmrich, K.F., Gao, F., Kutler, J., and Lim, T-K. (2006). A Landsat surface reflectance dataset for North America, 1990–2000. IEEE Geoscience and Remote Sensing Letters 3(1):68-72. http://dx.doi.org/10.1109/LGRS.2005.857030.
Vermote, E., Justice, C., Claverie, M., & Franch, B. (2016). Preliminary analysis of the performance of the Landsat 8/OLI land surface reflectance product. Remote Sensing of Environment. http://dx.doi.org/10.1016/j.rse.2016.04.008.

National Land Cover Dataset (NLCD)
Yang, Limin, Jin, Suming, Danielson, Patrick, Homer, Collin G., Gass, L., Bender, S.M., Case, Adam, Costello, C., Dewitz, Jon A., Fry, Joyce A., Funk, M., Granneman, Brian J., Liknes, G.C., Rigge, Matthew B., Xian, George, A new generation of the United States National Land Cover Database—Requirements, research priorities, design, and implementation strategies: ISPRS Journal of Photogrammetry and Remote Sensing, v. 146, p. 108–123, at https://doi.org/10.1016/j.isprsjprs.2018.09.006

Resource Software Recommended: Microsoft Excel, url: https://www.microsoft.com/en-us/microsoft-365/excel
United States CFNAI Sales, Orders and Inventories Index
tradingeconomics.com
fa.tradingeconomics.com
+13more
csv, excel, json, xml
Updated Aug 15, 2025
Cite
TRADING ECONOMICS (2025). United States CFNAI Sales, Orders and Inventories Index [Dataset]. https://tradingeconomics.com/united-states/cfnai-sales-orders-and-inventories-index
Available download formats: xml, json, excel, csv
Dataset updated
Aug 15, 2025
Dataset authored and provided by
TRADING ECONOMICS
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
Mar 31, 1967 - Aug 31, 2025
Area covered
United States
Description
The CFNAI Sales, Orders and Inventories Index in the United States increased to 0 percent in August 2025 from -0.02 percent in July 2025. This dataset includes a chart with historical data for the United States CFNAI Sales, Orders and Inventories Index.

Martin Krallinger; Aitor Gonzalez-Agirre; Alejandro Asensio (2022). MESINESP: Medical Semantic Indexing in Spanish - Development dataset [Dataset]. https://data-staging.niaid.nih.gov/resources?id=zenodo_3746595

MESINESP: Medical Semantic Indexing in Spanish - Development dataset

Dataset updated

Nov 5, 2022

Dataset provided by

Barcelona Supercomputing Center

Authors

Martin Krallinger; Aitor Gonzalez-Agirre; Alejandro Asensio

License

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Description

Please use the MESINESP2 corpus (the second edition of the shared task), since it has a higher level of curation and quality and is organized by document type (scientific articles, patents and clinical trials).

Introduction

The Mesinesp (Spanish BioASQ track, see https://temu.bsc.es/mesinesp) development set has a total of 750 records indexed manually by seven experienced medical literature indexers. Indexing is done using DeCS codes, a sort of Spanish equivalent to MeSH terms. Records were distributed so that each article was annotated by at least two different human indexers.

The data annotation process consisted of two steps:

Manual indexing step. DeCS codes were manually assigned to each record following the DeCS manual indexing guidelines.

Manual validation and consensus. The combined set of DeCS codes assigned by both indexers was manually reviewed and corrected.

Agreement between these annotations was then measured using the Jaccard index.

Records consisted mainly of titles and abstracts of medical literature from the IBECS and LILACS databases.

Zip structure

The zip file contains two different development sets:

Official development set, which has the union of the annotations, with an agreement of macro = 0.6568 and micro = 0.6819. This set is composed of all the different (unique) DeCS codes that have been added by any annotator for each document; and

Core-descriptors development set, which has the intersection of the annotations, with an agreement of macro = 1.0 and micro = 1.0. This set is composed of the common DeCS codes that have been added by two or more annotators for each document.
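
The relationship between the two sets and the agreement figures can be sketched as follows. This is a minimal illustration, not the organizers' evaluation script: it assumes two annotators per document, takes "macro" to mean the average of per-document Jaccard scores and "micro" to mean the pooled intersection size divided by the pooled union size, and uses made-up document ids and codes.

```python
from typing import Dict, Set, Tuple

Annotations = Dict[str, Set[str]]  # document id -> set of DeCS codes


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard index |a & b| / |a | b|; two empty sets count as full agreement."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def agreement(ann_a: Annotations, ann_b: Annotations) -> Tuple[float, float]:
    """Return (macro, micro) agreement over the documents both annotators indexed."""
    docs = sorted(ann_a.keys() & ann_b.keys())
    if not docs:
        raise ValueError("no documents in common")
    macro = sum(jaccard(ann_a[d], ann_b[d]) for d in docs) / len(docs)
    inter = sum(len(ann_a[d] & ann_b[d]) for d in docs)
    union = sum(len(ann_a[d] | ann_b[d]) for d in docs)
    micro = inter / union if union else 1.0
    return macro, micro


def official_set(ann_a: Annotations, ann_b: Annotations) -> Annotations:
    """Union of the annotations per document (official development set)."""
    return {d: ann_a[d] | ann_b[d] for d in ann_a.keys() & ann_b.keys()}


def core_set(ann_a: Annotations, ann_b: Annotations) -> Annotations:
    """Intersection of the annotations per document (core-descriptors set)."""
    return {d: ann_a[d] & ann_b[d] for d in ann_a.keys() & ann_b.keys()}


# Hypothetical annotations from two indexers.
indexer_1 = {"doc1": {"D1", "D2", "D3"}, "doc2": {"D4"}}
indexer_2 = {"doc1": {"D2", "D3", "D5"}, "doc2": {"D4"}}

macro, micro = agreement(indexer_1, indexer_2)
print(macro, micro)
print(official_set(indexer_1, indexer_2)["doc1"])
print(core_set(indexer_1, indexer_2)["doc1"])
```

Under these assumptions, restricting each annotator's codes to the intersection makes the two code sets identical for every document, which is consistent with the reported agreement of 1.0 for the core-descriptors set.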

Corpus format

Each dataset is a JSON object with a single key named "articles", which contains a list of documents. The raw file therefore has one line per document plus two additional lines (the first and the last) that enclose the list of documents. The expected data types are as follows:

{"articles":[ {"abstractText":str,"db":str,"decsCodes":list,"id":str,"journal":str,"title":str,"year":int}, ... ]}

To clarify, the order of appearance of the fields in each document is as follows (note that this example is pretty-printed for readability):

{
  "articles": [
    {
      "abstractText": "Content of the abstract",
      "db": "Name of the source database",
      "decsCodes": [
        "code1",
        "code2",
        "code3"
      ],
      "id": "Id of the document",
      "journal": "Name of the journal",
      "title": "Title of the document",
      "year": 2019
    }
  ]
}

Note: The fields "db", "journal" and "year" might be null.
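
A minimal sketch of reading this format with Python's standard json module is shown below. It parses an inline sample rather than a real file, and it assumes that only "db", "journal" and "year" may be null, as the note above states; treating the remaining fields as required is an assumption, and the function name and sample values are illustrative.

```python
import json
from typing import Any, Dict, List

# Fields assumed to be always present and non-null (an assumption of this sketch).
REQUIRED_FIELDS = ("abstractText", "decsCodes", "id", "title")
# Fields the description says might be null.
NULLABLE_FIELDS = ("db", "journal", "year")


def load_articles(text: str) -> List[Dict[str, Any]]:
    """Parse a corpus string and return its list of documents after basic checks."""
    data = json.loads(text)
    articles = data["articles"]
    for article in articles:
        for key in REQUIRED_FIELDS:
            if article.get(key) is None:
                raise ValueError(f"document is missing required field {key!r}")
        if not isinstance(article["decsCodes"], list):
            raise ValueError("decsCodes must be a list")
        for key in NULLABLE_FIELDS:
            article.setdefault(key, None)  # normalize absent keys to None
    return articles


# Inline sample following the documented layout; the values are placeholders.
sample_text = """
{"articles":[
{"abstractText":"Content of the abstract","db":null,"decsCodes":["code1","code2"],"id":"doc-1","journal":null,"title":"Title of the document","year":2019}
]}
"""

articles = load_articles(sample_text)
for article in articles:
    year = article["year"] if article["year"] is not None else "unknown year"
    print(article["id"], year, len(article["decsCodes"]))
```

For a real file, the same function could be applied to the text returned by reading the file with UTF-8 encoding.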

MESINESP: Medical Semantic Indexing in Spanish - Development dataset

Real-Time Index Database Report

City Happiness Index - 2024

Index match, Index match Advance

Historical S&P 500 (^GSPC) Index Data (1927–2025)

License

United States Dallas Fed Manufacturing Shipments Index

Indexing Magic Cards Dataset

Indexing Magic Cards

Transportation Services Index - Passenger

Environmental Quality Index

Data from: chis_shore - Coastal Vulnerability Index (CVI) dataset for...

Cyanobacteria Index (MERIS)

Dataset used for research on predicting next day closing price

US House Price Index Prediction Dataset

Dataset

Contents

United States CFNAI Employment, Unemployment and Hours Index

faiss-512-wikipedia-202308

AI Global Index Dataset

Alabama ESI: INDEX (Index Polygons)

Index Index dataset

Data from: A dataset of spatiotemporally sampled MODIS Leaf Area Index with...

United States CFNAI Sales, Orders and Inventories Index
