8 datasets found
  1. Data from: Satellite remote sensing dataset of Sentinel-2 for phenology metrics extraction from sites in Bulgaria and France

    • zenodo.org
    • producciocientifica.uv.es
    • +1 more
    txt
    Updated Apr 28, 2023
    Cite
    Dessislava Ganeva; Lukas Graf Valentin; Egor Prikaziuk; Gerbrand Koren; Enrico Tomelleri; Jochem Verrelst; Katja Berger; Santiago Belda; Zhanzhang Cai; Cláudio Silva Figueira (2023). Satellite remote sensing dataset of Sentinel-2 for phenology metrics extraction from sites in Bulgaria and France [Dataset]. http://doi.org/10.5281/zenodo.7825727
    Explore at:
    Available download formats: txt
    Dataset updated
    Apr 28, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Dessislava Ganeva; Lukas Graf Valentin; Egor Prikaziuk; Gerbrand Koren; Enrico Tomelleri; Jochem Verrelst; Katja Berger; Santiago Belda; Zhanzhang Cai; Cláudio Silva Figueira
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Site Description:

    This dataset covers seventeen production crop fields in Bulgaria, where winter rapeseed and wheat were grown, and two research fields in France, where winter wheat – rapeseed – barley – sunflower and winter wheat – irrigated maize crop rotations are used. The full description of those fields is in the database "In-situ crop phenology dataset from sites in Bulgaria and France" (doi.org/10.5281/zenodo.7875440).

    Methodology and Data Description:

    Remote sensing data were extracted from Sentinel-2 tiles 35TNJ (Bulgarian sites) and 31TCJ (French sites) on the day of overpass, starting in September 2015 for the Sentinel-2 derived vegetation indices and in October 2016 for the HR-VPP products. To suppress spectral mixing effects at the parcel boundaries, as highlighted by Meier et al. (2020), the values from all datasets were subgrouped per field and then aggregated to a single median value for further analysis.

    Sentinel-2 data were downloaded for all test sites from CREODIAS (https://creodias.eu/) at L2A processing level using a maximum scene-wide cloud cover threshold of 75%. Scenes before 2017 were available at L1C processing level only; these were corrected for atmospheric effects after download using Sen2Cor (v2.9) with default settings, the same version used for the L2A scenes obtained directly from CREODIAS.

    Next, the data were extracted from the Sentinel-2 scenes for each field parcel, keeping only pixels of SCL classes 4 (vegetation) and 5 (bare soil). We resampled the 20 m band B8A to match the spatial resolution of the green and red bands (10 m) using nearest neighbor interpolation. The entire image processing chain was carried out using the open-source Python Earth Observation Data Analysis Library (EOdal) (Graf et al., 2022).
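    The EOdal-based extraction itself is not reproduced here, but the two operations just described are simple array transforms. A minimal sketch, assuming the SCL layer and the reflectance bands are already loaded as NumPy arrays on matching grids (all variable names below are illustrative):

    import numpy as np

    def mask_scl(band, scl):
        # Keep only vegetation (SCL class 4) and bare soil (SCL class 5); NaN elsewhere.
        keep = np.isin(scl, [4, 5])
        return np.where(keep, band.astype(float), np.nan)

    def upsample_nearest_2x(band_20m):
        # Nearest-neighbor resampling from the 20 m grid to the 10 m grid:
        # every 20 m pixel is duplicated into a 2x2 block of 10 m pixels.
        return np.repeat(np.repeat(band_20m, 2, axis=0), 2, axis=1)

    # b8a_10m = upsample_nearest_2x(b8a_20m)
    # b8a_valid = mask_scl(b8a_10m, scl_10m)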

    Apart from the widely used Normalized Difference Vegetation Index (NDVI) and Enhanced Vegetation Index (EVI), we included two recently proposed indices that were reported to have a higher correlation with photosynthesis and the drought response of vegetation: the Near-Infrared Reflectance of Vegetation (NIRv) (Badgley et al., 2017) and the Kernel NDVI (kNDVI) (Camps-Valls et al., 2021). We calculated the vegetation indices in two different ways (see the sketch after the two variants below):

    First, we used B08 as near-infrared (NIR) band which comes in a native spatial resolution of 10 m. B08 (central wavelength 833 nm) has a relatively coarse spectral resolution with a bandwidth of 106 nm.

    Second, we used B8A which is available at 20 m spatial resolution. B8A differs from B08 in its central wavelength (864 nm) and has a narrower bandwidth (21 nm or 22 nm in the case of Sentinel-2A and 2B, respectively) compared to B08.
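    As an illustration, the four indices can be computed from reflectance arrays with their standard formulas (NDVI, EVI, NIRv = NDVI x NIR, kNDVI = tanh(NDVI^2)); this is a sketch using the textbook definitions rather than the exact EOdal implementation, and the band variable names are placeholders:

    import numpy as np

    def vegetation_indices(nir, red, blue):
        # Inputs: surface reflectance arrays scaled to [0, 1].
        nir, red, blue = (np.asarray(a, dtype=float) for a in (nir, red, blue))
        ndvi = (nir - red) / (nir + red)
        evi = 2.5 * (nir - red) / (nir + 6.0 * red - 7.5 * blue + 1.0)
        nirv = ndvi * nir               # Badgley et al. (2017)
        kndvi = np.tanh(ndvi ** 2)      # simplified kernel NDVI (Camps-Valls et al., 2021)
        return ndvi, evi, nirv, kndvi

    # Variant 1: ndvi, evi, nirv, kndvi = vegetation_indices(b08, b04, b02)
    # Variant 2: ndvi, evi, nirv, kndvi = vegetation_indices(b8a_10m, b04, b02)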

    The High Resolution Vegetation Phenology and Productivity (HR-VPP) dataset from the Copernicus Land Monitoring Service (CLMS) comprises three sets of 10 m Sentinel-2 products: vegetation indices, vegetation phenology and productivity parameters, and seasonal trajectories (Tian et al., 2021). Both vegetation indices, the Normalized Difference Vegetation Index (NDVI) and the Plant Phenology Index (PPI), and both plant parameters, the Fraction of Absorbed Photosynthetically Active Radiation (FAPAR) and the Leaf Area Index (LAI), were computed for the time of Sentinel-2 overpass by the data provider.

    NDVI is computed directly from B04 and B08, and PPI is computed using the Difference Vegetation Index (DVI = B08 - B04) and its seasonal maximum value per pixel. FAPAR and LAI are retrieved from B03, B04, and B08 with a neural network trained on PROSAIL model simulations. The dataset has a quality flag product (QFLAG2), a 16-bit flag that extends the scene classification band (SCL) of the Sentinel-2 Level-2 products. A “medium” filter was used to mask out QFLAG2 values from 2 to 1022, leaving land pixels (bit 1) within or outside cloud proximity (bits 11 and 13) or cloud shadow proximity (bits 12 and 14).
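    One possible reading of this “medium” filter, expressed as a bitmask sketch; the exact QFLAG2 bit semantics should be checked against the HR-VPP user manual, so treat this as an assumption-labeled illustration:

    import numpy as np

    # Accept the land bit (1) and the cloud/cloud-shadow proximity bits (11-14);
    # any other set bit in the 2..1022 range masks the pixel out (assumed reading).
    ALLOWED_BITS = (1 << 1) | (1 << 11) | (1 << 12) | (1 << 13) | (1 << 14)

    def medium_quality_mask(qflag2):
        q = np.asarray(qflag2).astype(np.uint16)
        is_land = (q & np.uint16(1 << 1)) != 0
        only_allowed = (q & np.uint16(~ALLOWED_BITS & 0xFFFF)) == 0
        return is_land & only_allowed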

    The HR-VPP daily raw vegetation indices products are described in detail in the user manual (Smets et al., 2022), and the computation details of PPI are given by Jin and Eklundh (2014). Seasonal trajectories refer to the 10-daily smoothed time series of PPI used for vegetation phenology and productivity parameter retrieval with TIMESAT (Jönsson and Eklundh 2002, 2004).

    HR-VPP data were downloaded through the WEkEO Copernicus Data and Information Access Services (DIAS) system using Python 3.8.10 and the harmonized data access (HDA) API 0.2.1. Zonal statistics (’min’, ’max’, ’mean’, ’median’, ’count’, ’std’, ’majority’) were computed on non-masked pixel values within field boundaries with the rasterstats Python package 0.17.0.
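    For reference, this zonal statistics step maps directly onto the rasterstats API; a minimal sketch with placeholder file paths:

    from rasterstats import zonal_stats

    STATS = ["min", "max", "mean", "median", "count", "std", "majority"]

    # "fields.geojson" and "ppi.tif" are placeholder inputs: field boundary
    # polygons and one HR-VPP raster layer, respectively.
    results = zonal_stats("fields.geojson", "ppi.tif", stats=STATS)
    for field in results:
        print(field["median"])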

    The start of season date (SOSD), end of season date (EOSD), and length of season (LENGTH) were extracted from the annual Vegetation Phenology and Productivity Parameters (VPP) dataset as an additional source for comparison. See https://land.copernicus.eu/pan-european/biophysical-parameters/high-resolution-vegetation-phenology-and-productivity/vegetation-phenology-and-productivity for detailed information on this product.

    File Description:

    4 datasets:

    1_senseco_data_S2_B08_Bulgaria_France; 1_senseco_data_S2_B8A_Bulgaria_France; 1_senseco_data_HR_VPP_Bulgaria_France; 1_senseco_data_phenology_VPP_Bulgaria_France

    3 metadata:

    2_senseco_metadata_S2_B08_B8A_Bulgaria_France; 2_senseco_metadata_HR_VPP_Bulgaria_France; 2_senseco_metadata_phenology_VPP_Bulgaria_France

    The dataset files “1_senseco_data_S2_B08_Bulgaria_France” and “1_senseco_data_S2_B8A_Bulgaria_France” contain all vegetation index (EVI, NDVI, kNDVI, NIRv) data values and related information, and the metadata file “2_senseco_metadata_S2_B08_B8A_Bulgaria_France” describes all the existing variables. Both data files have the same column variable names and therefore share the same metadata file.

    The dataset file “1_senseco_data_HR_VPP_Bulgaria_France” contains the vegetation index (NDVI, PPI) and plant parameter (LAI, FAPAR) data values and related information, and the metadata file “2_senseco_metadata_HR_VPP_Bulgaria_France” describes all the existing variables.

    The dataset file “1_senseco_data_phenology_VPP_Bulgaria_France” contains the vegetation phenology and productivity parameter (LENGTH, SOSD, EOSD) values and related information, and the metadata file “2_senseco_metadata_phenology_VPP_Bulgaria_France” describes all the existing variables.

    Bibliography

    G. Badgley, C.B. Field, J.A. Berry, Canopy near-infrared reflectance and terrestrial photosynthesis, Sci. Adv. 3 (2017) e1602244. https://doi.org/10.1126/sciadv.1602244.

    G. Camps-Valls, M. Campos-Taberner, Á. Moreno-Martínez, S. Walther, G. Duveiller, A. Cescatti, M.D. Mahecha, J. Muñoz-Marí, F.J. García-Haro, L. Guanter, M. Jung, J.A. Gamon, M. Reichstein, S.W. Running, A unified vegetation index for quantifying the terrestrial biosphere, Sci. Adv. 7 (2021) eabc7447. https://doi.org/10.1126/sciadv.abc7447.

    L.V. Graf, G. Perich, H. Aasen, EOdal: An open-source Python package for large-scale agroecological research using Earth Observation and gridded environmental data, Comput. Electron. Agric. 203 (2022) 107487. https://doi.org/10.1016/j.compag.2022.107487.

    H. Jin, L. Eklundh, A physically based vegetation index for improved monitoring of plant phenology, Remote Sens. Environ. 152 (2014) 512–525. https://doi.org/10.1016/j.rse.2014.07.010.

    P. Jönsson, L. Eklundh, Seasonality extraction by function fitting to time-series of satellite sensor data, IEEE Trans. Geosci. Remote Sens. 40 (2002) 1824–1832. https://doi.org/10.1109/TGRS.2002.802519.

    P. Jönsson, L. Eklundh, TIMESAT—a program for analyzing time-series of satellite sensor data, Comput. Geosci. 30 (2004) 833–845. https://doi.org/10.1016/j.cageo.2004.05.006.

    J. Meier, W. Mauser, T. Hank, H. Bach, Assessments on the impact of high-resolution-sensor pixel sizes for common agricultural policy and smart farming services in European regions, Comput. Electron. Agric. 169 (2020) 105205. https://doi.org/10.1016/j.compag.2019.105205.

    B. Smets, Z. Cai, L. Eklundh, F. Tian, K. Bonte, R. Van Hoolst, R. Van De Kerchove, S. Adriaensen, B. De Roo, T. Jacobs, F. Camacho, J. Sánchez-Zapero, S. Else, H. Scheifinger, K. Hufkens, P. Jönsson, HR-VPP Product User Manual Vegetation Indices, 2022.

    F. Tian, Z. Cai, H. Jin, K. Hufkens, H. Scheifinger, T. Tagesson, B. Smets, R. Van Hoolst, K. Bonte, E. Ivits, X. Tong, J. Ardö, L. Eklundh, Calibrating vegetation phenology from Sentinel-2 using eddy covariance, PhenoCam, and PEP725 networks across Europe, Remote Sens. Environ. 260 (2021) 112456. https://doi.org/10.1016/j.rse.2021.112456.

  2. HLS-CMDS: Heart and Lung Sounds Dataset Recorded from a Clinical Manikin using Digital Stethoscope

    • data.mendeley.com
    • dataverse.harvard.edu
    Updated May 7, 2025
    + more versions
    Cite
    Yasaman Torabi (2025). HLS-CMDS: Heart and Lung Sounds Dataset Recorded from a Clinical Manikin using Digital Stethoscope [Dataset]. http://doi.org/10.17632/8972jxbpmp.3
    Explore at:
    Dataset updated
    May 7, 2025
    Authors
    Yasaman Torabi
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    ** Accepted in IEEE Data Descriptions Journal ** This dataset contains 535 recordings of heart and lung sounds captured using a digital stethoscope from a clinical manikin, including both individual and mixed recordings. It includes 50 heart sounds, 50 lung sounds, and 145 mixed sounds; for each mixed sound, the corresponding source heart sound (145 recordings) and source lung sound (145 recordings) were also recorded. The recordings cover different anatomical chest locations, with normal and abnormal sounds. Each recording has been filtered to highlight specific sound types, making the dataset valuable for artificial intelligence (AI) research in automated cardiopulmonary disease detection, sound classification, and deep learning for audio signal processing. If you use this dataset in your research, please cite the following paper:

    Y. Torabi, S. Shirani and J. P. Reilly, "Descriptor: Heart and Lung Sounds Dataset Recorded from a Clinical Manikin using Digital Stethoscope (HLS-CMDS)," in IEEE Data Descriptions, https://doi.org/10.1109/IEEEDATA.2025.3566012.

    Data Type: Audio files (.wav), Comma Separated Values (.CSV)

    Each category is accompanied by a CSV file (HS.csv, LS.csv, and Mix.csv) providing metadata for the respective audio files, including the file name, gender, heart and lung sound type, and the anatomical location where the sound was recorded.

    Sound Types: Normal Heart, Late Diastolic Murmur, Mid Systolic Murmur, Late Systolic Murmur, Atrial Fibrillation, Fourth Heart Sound, Early Systolic Murmur, Third Heart Sound, Tachycardia, Atrioventricular Block, Normal Lung, Wheezing, Fine Crackles, Rhonchi, Pleural Rub, and Coarse Crackles.

    Auscultation Landmarks: Right Upper Sternal Border, Left Upper Sternal Border, Lower Left Sternal Border, Right Costal Margin, Left Costal Margin, Apex, Right Upper Anterior, Left Upper Anterior, Right Mid Anterior, Left Mid Anterior, Right Lower Anterior, and Left Lower Anterior.
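    A minimal sketch of querying this metadata with pandas; the exact CSV column names are an assumption based on the description above:

    import pandas as pd

    hs = pd.read_csv("HS.csv")  # heart-sound metadata

    # Hypothetical column names: "file_name", "sound_type", "location".
    murmurs = hs[(hs["sound_type"] == "Mid Systolic Murmur")
                 & (hs["location"] == "Apex")]
    print(murmurs["file_name"].tolist())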

    Applications: AI-based cardiopulmonary disease detection, unsupervised sound separation techniques, and deep learning for audio signal processing.

  3. GLOBMAP global Leaf Area Index since 1981

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jul 9, 2024
    Cite
    Liu, Ronggao; Liu, Yang; Chen, Jingming (2024). GLOBMAP global Leaf Area Index since 1981 [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4700263
    Explore at:
    Dataset updated
    Jul 9, 2024
    Dataset provided by
    Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences
    Department of Geography and Program in Planning, University of Toronto
    Authors
    Liu, Ronggao; Liu, Yang; Chen, Jingming
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    GLOBMAP LAI (Version 3) provides a consistent long-term global leaf area index (LAI) product (1981-2020, continuously updated) at 8 km resolution on a geographic grid, generated by quantitative fusion of Moderate Resolution Imaging Spectroradiometer (MODIS) and historical Advanced Very High Resolution Radiometer (AVHRR) data. The long-term LAI series combines AVHRR LAI (1981–2000) and MODIS LAI (2001–). The MODIS LAI series was generated from MODIS land surface reflectance data (MOD09A1 C6) based on the GLOBCARBON LAI algorithm (Deng et al., 2006). Relationships between AVHRR observations (GIMMS NDVI; Tucker et al., 2005) and MODIS LAI were established pixel by pixel using the two data series during the overlap period (2000–2006); AVHRR LAI back to 1981 was then estimated from historical AVHRR observations based on these pixel-level relationships. For a detailed description and evaluation of the algorithm, see Liu et al. (2012).
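    To make the fusion idea concrete, here is a simplified sketch of a per-pixel linear fit between the two series over the overlap period; the actual GLOBMAP relationships may take a different functional form, so treat this purely as an illustration:

    import numpy as np

    def fit_pixel_relationship(avhrr_ndvi, modis_lai):
        # avhrr_ndvi, modis_lai: (time, rows, cols) arrays on a common grid,
        # covering the 2000-2006 overlap period.
        x_mean = avhrr_ndvi.mean(axis=0)
        y_mean = modis_lai.mean(axis=0)
        cov = ((avhrr_ndvi - x_mean) * (modis_lai - y_mean)).sum(axis=0)
        var = ((avhrr_ndvi - x_mean) ** 2).sum(axis=0)
        slope = cov / var
        intercept = y_mean - slope * x_mean
        return slope, intercept

    # Back-estimation for the pre-MODIS era:
    # lai_1981 = slope * ndvi_1981 + intercept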

    Several changes have been made compared with the JGR paper:

    The MODIS C6 land surface reflectance product MOD09A1 was used to generate MODIS LAI in the GLOBMAP V3 product instead of the C5 product.

    Clumping effects were considered at the pixel level by employing a global clumping index map at 500 m resolution (He et al., 2012), instead of land-cover-specific clumping indices, in the generation of MODIS LAI. New pixel-based AVHRR SR–MODIS LAI relationships were established from these MODIS LAI series and used for AVHRR LAI retrieval.

    The cloud masks for MOD09A1 data were generated by a new cloud detection algorithm based on time-series surface reflectance observations (Liu and Liu, 2013), and contaminated pixels were filled by the locally adjusted cubic-spline capping approach (Chen et al., 2006).

    Dataset Characteristics:

    Spatial Coverage: 180ºW~180ºE, 63ºS~90ºN;

    Temporal Coverage: July 1981 - Dec. 2020 (continuously updated);

    Spatial Resolution: 0.0727273º;

    Temporal Resolution: Half month (1981-2000), 8-day (2001-);

    Projection: Geographic;

    Data Format: HDF/Geotiff;

    Scale: 0.01;

    Valid Range: 0-1000.
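    Given the scale factor and valid range above, converting a raw file to physical LAI values is straightforward; a sketch for the GeoTIFF variant with a placeholder filename:

    import numpy as np
    import rasterio

    with rasterio.open("GLOBMAP_LAI_V3_example.tif") as src:  # placeholder path
        raw = src.read(1).astype(float)

    raw[(raw < 0) | (raw > 1000)] = np.nan  # enforce the documented valid range
    lai = raw * 0.01                        # apply the documented scale factor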

    Citation (Please cite this paper whenever these data are used):

    Liu, Y., R. Liu, and J. M. Chen (2012), Retrospective retrieval of long-term consistent global leaf area index (1981–2011) from combined AVHRR and MODIS data, J. Geophys. Res., 117, G04003, doi:10.1029/2012JG002084.

    If you have any questions, please contact Prof. Ronggao Liu (liurg@igsnrr.ac.cn) or Dr. Yang Liu (liuyang@igsnrr.ac.cn).

    Related publications with this dataset:

    Chen, J. M., F. Deng, and M. Chen (2006), Locally adjusted cubic-spline capping for reconstructing seasonal trajectories of a satellite-derived surface parameter, IEEE Trans. Geosci. Remote Sens., 44, 2230-2238

    Deng, F., J. M. Chen, S. Plummer, M. Z. Chen, and J. Pisek (2006), Algorithm for global leaf area index retrieval using satellite imagery, IEEE Trans. Geosci. Remote Sens., 44(8), 2219–2229.

    He, L. M., J. M. Chen, J. Pisek, C. B. Schaaf, and A. H. Strahler (2012), Global clumping index map derived from the MODIS BRDF product, Remote Sens. Environ., 119, 118-130.

    Liu, R. G., and Y. Liu (2013), Generation of new cloud masks from MODIS land surface reflectance products, Remote Sens. Environ., 133, 21-37.

    Tucker, C. J., J. E. Pinzon, M. E. Brown, D. A. Slayback, E. W. Pak, R. Mahoney, E. F. Vermote, and N. El Saleous (2005), An extended AVHRR 8-km NDVI dataset compatible with MODIS and SPOT vegetation NDVI data, Int. J. Remote Sens., 26(20), 4485–4498.

  4. TED dataset

    • data.niaid.nih.gov
    Updated Oct 6, 2020
    Cite
    Popescu-Belis, Andrei (2020). TED dataset [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4061423
    Explore at:
    Dataset updated
    Oct 6, 2020
    Dataset provided by
    Pappas, Nikolaos
    Popescu-Belis, Andrei
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    A dataset for recommendations collected from ted.com, containing metadata fields for TED talks and user profiles with rating and commenting transactions.

    The TED dataset contains all the audio-video recordings of the TED talks downloaded from the official TED website, http://www.ted.com, on April 27th 2012 (first version) and on September 10th 2012 (second version). No processing has been done on any of the metadata fields. The metadata was obtained by crawling the HTML source of the list of talks and users, as well as talk and user webpages using scripts written by Nikolaos Pappas at the Idiap Research Institute, Martigny, Switzerland. The dataset is shared under the Creative Commons license (the same as the content of the TED talks) which is stored in the COPYRIGHT file. The dataset is shared for research purposes which are explained in detail in the following papers. The dataset can be used to benchmark systems that perform two tasks, namely personalized recommendations and generic recommendations. Please check the CBMI 2013 paper for a detailed description of each task.

    Nikolaos Pappas, Andrei Popescu-Belis, "Combining Content with User Preferences for TED Lecture Recommendation", 11th International Workshop on Content Based Multimedia Indexing, Veszprém, Hungary, IEEE, 2013 PDF document, Bibtex citation

    Nikolaos Pappas, Andrei Popescu-Belis, Sentiment Analysis of User Comments for One-Class Collaborative Filtering over TED Talks, 36th ACM SIGIR Conference on Research and Development in Information Retrieval, Dublin, Ireland, ACM, 2013 PDF document, Bibtex citation

    If you use the TED dataset for your research please cite one of the above papers (specifically the 1st paper for the April 2012 version and the 2nd paper for the September 2012 version of the dataset).

    TED website

    The TED website is a popular online repository of audiovisual recordings of public lectures given by prominent speakers, under a Creative Commons non-commercial license (see www.ted.com). The site provides extended metadata and user-contributed material. The speakers are scientists, writers, journalists, artists, and businesspeople from all over the world who are generally given a maximum of 18 minutes to present their ideas. The talks are given in English and are usually transcribed and then translated into several other languages by volunteer users. The quality of the talks has made TED one of the most popular online lecture repositories, as each talk was viewed on average almost 500,000 times.

    Metadata

    The dataset contains two main entry types: talks and users. The talks have the following data fields: identifier, title, description, speaker name, TED event at which they were given, transcript, publication date, filming date, and number of views. Each talk has a variable number of user comments, organized in threads. In addition, three fields were assigned by TED editorial staff: related tags, related themes, and related talks. Each talk generally has three related talks, and 95% of them have a high-quality transcript available. The dataset includes 1,149 talks from 960 speakers and 69,023 registered users that have made about 100,000 favorites and 200,000 comments.
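    As an illustration of the generic recommendation task mentioned above, a simple content-based baseline can rank talks by description similarity. This sketch uses TF-IDF and cosine similarity; it is not the method of the cited papers, and the field names are placeholders:

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.metrics.pairwise import cosine_similarity

    # `talks` stands in for the parsed talk metadata (identifier, description, ...).
    talks = [
        {"id": 1, "description": "a talk about education and creativity"},
        {"id": 2, "description": "a talk about machine learning and society"},
        {"id": 3, "description": "creativity in schools and learning"},
    ]

    tfidf = TfidfVectorizer(stop_words="english")
    matrix = tfidf.fit_transform(t["description"] for t in talks)
    similarity = cosine_similarity(matrix)

    # Talks most similar to the first talk, excluding itself:
    ranked = similarity[0].argsort()[::-1][1:]
    print([talks[i]["id"] for i in ranked])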

  5. FSD50K

    • zenodo.org
    • opendatalab.com
    • +2 more
    bin, zip
    Updated Apr 24, 2022
    + more versions
    Cite
    Eduardo Fonseca; Xavier Favory; Jordi Pons; Frederic Font; Xavier Serra (2022). FSD50K [Dataset]. http://doi.org/10.5281/zenodo.4060432
    Explore at:
    Available download formats: zip, bin
    Dataset updated
    Apr 24, 2022
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Eduardo Fonseca; Xavier Favory; Jordi Pons; Frederic Font; Xavier Serra
    Description

    FSD50K is an open dataset of human-labeled sound events containing 51,197 Freesound clips unequally distributed in 200 classes drawn from the AudioSet Ontology. FSD50K has been created at the Music Technology Group of Universitat Pompeu Fabra.

    Citation

    If you use the FSD50K dataset, or part of it, please cite our TASLP paper (available from [arXiv] [TASLP]):

    @article{fonseca2022FSD50K,
     title={{FSD50K}: an open dataset of human-labeled sound events},
     author={Fonseca, Eduardo and Favory, Xavier and Pons, Jordi and Font, Frederic and Serra, Xavier},
     journal={IEEE/ACM Transactions on Audio, Speech, and Language Processing},
     volume={30},
     pages={829--852},
     year={2022},
     publisher={IEEE}
    }
    

    Paper update: This paper has been published in TASLP at the beginning of 2022. The accepted camera-ready version includes a number of improvements with respect to the initial submission. The main updates include: estimation of the amount of label noise in FSD50K, SNR comparison between FSD50K and AudioSet, improved description of evaluation metrics including equations, clarification of experimental methodology and some results, some content moved to Appendix for readability. The TASLP-accepted camera-ready version is available from arXiv (in particular, it is v2 in arXiv, displayed by default).

    Data curators

    Eduardo Fonseca, Xavier Favory, Jordi Pons, Mercedes Collado, Ceren Can, Rachit Gupta, Javier Arredondo, Gary Avendano and Sara Fernandez

    Contact

    You are welcome to contact Eduardo Fonseca should you have any questions, at efonseca@google.com.

    ABOUT FSD50K

    Freesound Dataset 50k (or FSD50K for short) is an open dataset of human-labeled sound events containing 51,197 Freesound clips unequally distributed in 200 classes drawn from the AudioSet Ontology [1]. FSD50K has been created at the Music Technology Group of Universitat Pompeu Fabra.

    What follows is a brief summary of FSD50K's most important characteristics. Please have a look at our paper (especially Section 4), which extends the basic information provided here with relevant details on usage, as well as discussion, limitations, applications and more.

    Basic characteristics:

    • FSD50K contains 51,197 audio clips from Freesound, totalling 108.3 hours of multi-labeled audio
    • The dataset encompasses 200 sound classes (144 leaf nodes and 56 intermediate nodes) hierarchically organized with a subset of the AudioSet Ontology.
    • The audio content is composed mainly of sound events produced by physical sound sources and production mechanisms, including human sounds, sounds of things, animals, natural sounds, musical instruments and more. The vocabulary can be inspected in vocabulary.csv (see Files section below).
    • The acoustic material has been manually labeled by humans following a data labeling process using the Freesound Annotator platform [2].
    • Clips are of variable length from 0.3 to 30s, due to the diversity of the sound classes and the preferences of Freesound users when recording sounds.
    • All clips are provided as uncompressed PCM 16 bit 44.1 kHz mono audio files.
    • Ground truth labels are provided at the clip-level (i.e., weak labels).
    • The dataset poses mainly a large-vocabulary multi-label sound event classification problem, but also allows development and evaluation of a variety of machine listening approaches (see Sec. 4D in our paper).
    • In addition to audio clips and ground truth, additional metadata is made available (including raw annotations, sound predominance ratings, Freesound metadata, and more), allowing a variety of analyses and sound event research tasks (see Files section below).
    • The audio clips are grouped into a development (dev) set and an evaluation (eval) set such that they do not have clips from the same Freesound uploader.

    Dev set:

    • 40,966 audio clips totalling 80.4 hours of audio
    • Avg duration/clip: 7.1s
    • 114,271 smeared labels (i.e., labels propagated in the upwards direction to the root of the ontology)
    • Labels are correct but could be occasionally incomplete
    • A train/validation split is provided (Sec. 3H). If a different split is used, it should be specified for reproducibility and fair comparability of results (see Sec. 5C of our paper)

    Eval set:

    • 10,231 audio clips totalling 27.9 hours of audio
    • Avg duration/clip: 9.8s
    • 38,596 smeared labels
    • Eval set is labeled exhaustively (labels are correct and complete for the considered vocabulary)

    Note: All classes in FSD50K are represented in AudioSet, except Crash cymbal, Human group actions, Human voice, Respiratory sounds, and Domestic sounds, home sounds.

    LICENSE

    All audio clips in FSD50K are released under Creative Commons (CC) licenses. Each clip has its own license as defined by the clip uploader in Freesound, some of them requiring attribution to their original authors and some forbidding further commercial reuse. Specifically:

    The development set consists of 40,966 clips with the following licenses:

    • CC0: 14,959
    • CC-BY: 20,017
    • CC-BY-NC: 4,616
    • CC Sampling+: 1,374

    The evaluation set consists of 10,231 clips with the following licenses:

    • CC0: 4,914
    • CC-BY: 3,489
    • CC-BY-NC: 1,425
    • CC Sampling+: 403

    For attribution purposes and to facilitate attribution of these files to third parties, we include a mapping from the audio clips to their corresponding licenses. The licenses are specified in the files dev_clips_info_FSD50K.json and eval_clips_info_FSD50K.json.
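    A sketch of building that per-clip license mapping; the JSON layout (clip ids as keys, each entry with a license field) is assumed from the description, not verified:

    import json

    with open("FSD50K.metadata/dev_clips_info_FSD50K.json") as f:
        dev_info = json.load(f)

    # Assumed structure: {"<freesound_id>": {..., "license": "<url or name>"}, ...}
    licenses = {clip_id: info.get("license") for clip_id, info in dev_info.items()}
    print(licenses.get("64760"))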

    In addition, FSD50K as a whole is the result of a curation process and it has an additional license: FSD50K is released under CC-BY. This license is specified in the LICENSE-DATASET file downloaded with the FSD50K.doc zip file. We note that the choice of one license for the dataset as a whole is not straightforward as it comprises items with different licenses (such as audio clips, annotations, or data split). The choice of a global license in these cases may warrant further investigation (e.g., by someone with a background in copyright law).

    Usage of FSD50K for commercial purposes:

    If you'd like to use FSD50K for commercial purposes, please contact Eduardo Fonseca and Frederic Font at efonseca@google.com and frederic.font@upf.edu.

    Also, if you are interested in using FSD50K for machine learning competitions, please contact Eduardo Fonseca and Frederic Font at efonseca@google.com and frederic.font@upf.edu.

    FILES

    FSD50K can be downloaded as a series of zip files with the following directory structure:

    root
    │ 
    └───FSD50K.dev_audio/          Audio clips in the dev set
    │ 
    └───FSD50K.eval_audio/         Audio clips in the eval set
    │  
    └───FSD50K.ground_truth/        Files for FSD50K's ground truth
    │  │  
    │  └─── dev.csv               Ground truth for the dev set
    │  │    
    │  └─── eval.csv               Ground truth for the eval set      
    │  │      
    │  └─── vocabulary.csv            List of 200 sound classes in FSD50K 
    │  
    └───FSD50K.metadata/          Files for additional metadata
    │  │      
    │  └─── class_info_FSD50K.json        Metadata about the sound classes
    │  │      
    │  └─── dev_clips_info_FSD50K.json      Metadata about the dev clips
    │  │      
    │  └─── eval_clips_info_FSD50K.json     Metadata about the eval clips
    │  │      
    │  └─── pp_pnp_ratings_FSD50K.json      PP/PNP ratings  
    │  │      
    │  └─── collection/             Files for the *sound collection* format  
    │  
    └───FSD50K.doc/
      │      
      └───README.md               The dataset description file that you are reading
      │      
      └───LICENSE-DATASET            License of the FSD50K dataset as an entity  
    

    Each row (i.e. audio clip) of dev.csv contains the following information:

    • fname: the file name without the .wav extension, e.g., the fname 64760 corresponds to the file 64760.wav on disk. This number is the Freesound id. We always use Freesound ids as filenames.
    • labels: the class labels (i.e., the ground truth). Note these
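    A minimal sketch of loading the dev ground truth into a multi-label form, assuming the labels column holds comma-separated class names and that vocabulary.csv lists the class names in its second column (both assumptions, to be checked against the files):

    import pandas as pd

    dev = pd.read_csv("FSD50K.ground_truth/dev.csv")
    vocab = pd.read_csv("FSD50K.ground_truth/vocabulary.csv", header=None)
    classes = vocab[1].tolist()  # assumed layout: index, class name, ...

    dev["label_list"] = dev["labels"].str.split(",")
    # Multi-hot targets for a 200-class, weakly labeled classification setup:
    targets = dev["label_list"].apply(lambda ls: [int(c in ls) for c in classes])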

  6. Pretraining data of SkySense++

    • zenodo.org
    bin
    Updated Mar 23, 2025
    Cite
    Kang Wu (2025). Pretraining data of SkySense++ [Dataset]. http://doi.org/10.5281/zenodo.14994430
    Explore at:
    Available download formats: bin
    Dataset updated
    Mar 23, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Kang Wu
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Mar 9, 2024
    Description

    This repository contains the data description and processing for the paper titled "SkySense++: A Semantic-Enhanced Multi-Modal Remote Sensing Foundation Model for Earth Observation." The code is available from the linked project repository.

    📢 Latest Updates

    🔥🔥🔥 Last Updated on 2025.03.23 🔥🔥🔥

    • [2025.3.23] updated pretraining data list of representation-enhanced pretraining
    • [2025.3.14] updated optical images of JL-16 dataset (https://huggingface.co/datasets/KKKKKKang/JL-16)
    • [2025.3.12] updated sentinel-1 images and labels of JL-16 dataset
    • [2025.3.9] updated pretrain and evaluation data

    Pretrain Data

    RS-Semantic Dataset

    We conduct semantic-enhanced pretraining on the RS-Semantic dataset, which consists of 13 datasets with pixel-level annotations. Below are the specifics of these datasets.

    | Dataset | Modalities | GSD (m) | Size | Categories | Download Link |
    |---|---|---|---|---|---|
    | Five Billion Pixels | Gaofen-2 | 4 | 6800x7200 | 24 | Download |
    | Potsdam | Airborne | 0.05 | 6000x6000 | 5 | Download |
    | Vaihingen | Airborne | 0.05 | 2494x2064 | 5 | Download |
    | Deepglobe | WorldView | 0.5 | 2448x2448 | 6 | Download |
    | iSAID | Multiple Sensors | - | 800x800 to 4000x13000 | 15 | Download |
    | LoveDA | Spaceborne | 0.3 | 1024x1024 | 7 | Download |
    | DynamicEarthNet | WorldView | 0.3 | 1024x1024 | 7 | Download |
    | | Sentinel-2* | 10 | 32x32 | | |
    | | Sentinel-1* | 10 | 32x33 | | |
    | Pastis-MM | WorldView | 0.3 | 1024x1024 | 18 | Download |
    | | Sentinel-2* | 10 | 32x32 | | |
    | | Sentinel-1* | 10 | 32x33 | | |
    | C2Seg-AB | Sentinel-2* | 10 | 128x128 | 13 | Download |
    | | Sentinel-1* | 10 | 128x128 | | |
    | FLAIR | Spot-5 | 0.2 | 512x512 | 12 | Download |
    | | Sentinel-2* | 10 | 40x40 | | |
    | DFC20 | Sentinel-2 | 10 | 256x256 | 9 | Download |
    | | Sentinel-1 | 10 | 256x256 | | |
    | S2-naip | NAIP | 1 | 512x512 | 32 | Download |
    | | Sentinel-2* | 10 | 64x64 | | |
    | | Sentinel-1* | 10 | 64x64 | | |
    | JL-16 | Jilin-1 | 0.72 | 512x512 | 16 | Download |
    | | Sentinel-1* | 10 | 40x40 | | |

    * for time-series data.

    EO Benchmark

    We evaluate our SkySense++ on 12 typical Earth Observation (EO) tasks across 7 domains: agriculture, forestry, oceanography, atmosphere, biology, land surveying, and disaster management. The detailed information about the datasets used for evaluation is as follows.

    | Domain | Task type | Dataset | Modalities | GSD | Image size | Download Link | Notes |
    |---|---|---|---|---|---|---|---|
    | Agriculture | Crop classification | Germany | Sentinel-2* | 10 | 24x24 | Download | |
    | Forestry | Tree species classification | TreeSatAI-Time-Series | Airborne | 0.2 | 304x304 | Download | |
    | | | | Sentinel-2* | 10 | 6x6 | | |
    | | | | Sentinel-1* | 10 | 6x6 | | |
    | | Deforestation segmentation | Atlantic | Sentinel-2 | 10 | 512x512 | Download | |
    | Oceanography | Oil spill segmentation | SOS | Sentinel-1 | 10 | 256x256 | Download | |
    | Atmosphere | Air pollution regression | 3pollution | Sentinel-2 | 10 | 200x200 | Download | |
    | | | | Sentinel-5P | 2600 | 120x120 | | |
    | Biology | Wildlife detection | Kenya | Airborne | - | 3068x4603 | Download | |
    | Land surveying | LULC mapping | C2Seg-BW | Gaofen-6 | 10 | 256x256 | Download | |
    | | | | Gaofen-3 | 10 | 256x256 | | |
    | | Change detection | dsifn-cd | GoogleEarth | 0.3 | 512x512 | Download | |
    | Disaster management | Flood monitoring | Flood-3i | Airborne | 0.05 | 256x256 | Download | |
    | | | C2SMSFloods | Sentinel-2, Sentinel-1 | 10 | 512x512 | Download | |
    | | Wildfire monitoring | CABUAR | Sentinel-2 | 10 | 5490x5490 | Download | |
    | | Landslide mapping | GVLM | GoogleEarth | 0.3 | 1748x1748 ~ 10808x7424 | Download | |
    | | Building damage assessment | xBD | WorldView | 0.3 | 1024x1024 | Download | |

    * for time-series data.

  7. BinMov2023: Binaural Dataset for Source Position Estimation with Head Rotation and Moving Listeners

    • zenodo.org
    • data.niaid.nih.gov
    bin, txt, zip
    Updated Jan 2, 2024
    Cite
    Archontis Politis; Guillermo Garcia Barrios; Daniel Aleksander Krause; Annamaria Mesaros (2024). BinMov2023: Binaural Dataset for Source Position Estimation with Head Rotation and Moving Listeners [Dataset]. http://doi.org/10.5281/zenodo.7689063
    Explore at:
    Available download formats: zip, bin, txt
    Dataset updated
    Jan 2, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Archontis Politis; Guillermo Garcia Barrios; Daniel Aleksander Krause; Annamaria Mesaros
    Description

    DESCRIPTION

    BinMov2023: Binaural Dataset for Source Position Estimation with Head Rotation and Moving Listeners is a binaural dataset containing synthetic single-source speech signals reverberated with simulated room impulse responses. The data supports experiments on the audio tasks of sound source localization and sound distance estimation.
    The dataset consists of three subsets, corresponding to three different scenarios:

    - static: a static sound source and a static listener
    - rotation: a static sound source and a static listener with a head rotating in the azimuth plane
    - walking: a static sound source and a listener moving in space

    Each sound file contains a unique combination of a simulated room and source and receiver positions. The walking scenario contains simulations of 2500 different rooms, whereas the static and rotation scenarios contain 5000 rooms.

    REPORT AND REFERENCE

    A detailed description of the dataset and the data generation process can be found in:

    D. A. Krause, G. García-Barrios, A. Politis and A. Mesaros, "Binaural Sound Source Distance Estimation and Localization for a Moving Listener," in IEEE/ACM Transactions on Audio, Speech, and Language Processing, doi: 10.1109/TASLP.2023.3346297.

    The supplementary material describing the data simulation is provided alongside the paper.

    If you use the dataset, please consider citing the abovementioned paper.

    METADATA

    Each sound file has a separate metadata file assigned. The information in the metadata comes per frame in the following format:

    [nb_frame (int)], [x (float)], [y (float)], [z (float)], [a (float)], [b (float)], [c (float)], [d (float)]

    where nb_frame is the frame index, {x, y, z} are the unnormalized Cartesian coordinates of the sound source, and {a, b, c, d} are the quaternion values describing the rotation of the listener's head.
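    A sketch of parsing one metadata file and recovering the head yaw per frame; it assumes plain comma-separated values and a scalar-first (a = w) quaternion convention, both of which should be verified against the data:

    import numpy as np
    from scipy.spatial.transform import Rotation

    def load_metadata(path):
        data = np.loadtxt(path, delimiter=",")
        frames = data[:, 0].astype(int)      # nb_frame
        source_xyz = data[:, 1:4]            # unnormalized source coordinates
        quats_abcd = data[:, 4:8]            # head-rotation quaternion (a, b, c, d)
        return frames, source_xyz, quats_abcd

    def head_yaw_degrees(quats_abcd):
        # SciPy expects scalar-last (x, y, z, w); reorder assuming a is the scalar.
        rot = Rotation.from_quat(quats_abcd[:, [1, 2, 3, 0]])
        return rot.as_euler("zyx", degrees=True)[:, 0]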

    LICENSE

    The database is published under a custom open non-commercial with attribution license. It can be found in the LICENSE.txt file that accompanies the data.

  8. DCASE 2020 Challenge Task 2 Development Dataset

    • zenodo.org
    • data.niaid.nih.gov
    zip
    Updated May 24, 2022
    Cite
    Yuma Koizumi; Yohei Kawaguchi; Keisuke Imoto; Toshiki Nakamura; Yuki Nikaido; Ryo Tanabe; Harsh Purohit; Kaori Suefusa; Takashi Endo; Masahito Yasuda; Noboru Harada (2022). DCASE 2020 Challenge Task 2 Development Dataset [Dataset]. http://doi.org/10.5281/zenodo.3678171
    Explore at:
    Available download formats: zip
    Dataset updated
    May 24, 2022
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Yuma Koizumi; Yohei Kawaguchi; Keisuke Imoto; Toshiki Nakamura; Yuki Nikaido; Ryo Tanabe; Harsh Purohit; Kaori Suefusa; Takashi Endo; Masahito Yasuda; Noboru Harada
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    This dataset is the "development dataset" for the DCASE 2020 Challenge Task 2 "Unsupervised Detection of Anomalous Sounds for Machine Condition Monitoring" [task description].

    The data comprises parts of ToyADMOS and the MIMII Dataset consisting of the normal/anomalous operating sounds of six types of toy/real machines. Each recording is a single-channel, approximately 10-second audio clip that includes both a target machine's operating sound and environmental noise. The following six types of toy/real machines are used in this task:

    • Toy-car (ToyADMOS)
    • Toy-conveyor (ToyADMOS)
    • Valve (MIMII Dataset)
    • Pump (MIMII Dataset)
    • Fan (MIMII Dataset)
    • Slide rail (MIMII Dataset)

    Recording procedure

    The ToyADMOS consists of normal/anomalous operating sounds of miniature machines (toys) collected with four microphones, and the MIMII dataset consists of those of real machines collected with eight microphones. Anomalous sounds in these datasets were collected by deliberately damaging target machines. To simplify the task, we used only the first channel of multi-channel recordings; all recordings are regarded as single-channel recordings of a fixed microphone. The sampling rate of all signals has been downsampled to 16 kHz. From ToyADMOS, we used only IND-type data that contain the operating sounds of the entire operation (i.e., from start to stop) in a recording. We mixed a target machine sound with environmental noise, and only noisy recordings are provided as training/test data. For the details of the recording procedure, please refer to the papers of ToyADMOS and MIMII Dataset.

    Data

    We first define two important terms in this task: Machine Type and Machine ID. Machine Type means the kind of machine, which in this task can be one of six: toy-car, toy-conveyor, valve, pump, fan, and slide rail. Machine ID is the identifier of each individual machine of the same type; the training dataset contains three or four Machine IDs per Machine Type. Each Machine ID's dataset consists of (i) around 1,000 samples of normal sounds for training and (ii) 100-200 samples each of normal and anomalous sounds for the test. The given labels for each training/test sample are Machine Type, Machine ID, and condition (normal/anomaly). Machine Type information is given by directory name, and Machine ID and condition information are given by the respective file names.

    Directory structure

    When you unzip the downloaded files from Zenodo, you can see the following directory structure. As described in the previous section, Machine Type information is given by directory name, and Machine ID and condition information are given by file name, as:

    /dev_data

    • /ToyCar
      • /train (Only normal data for all Machine IDs are included.)
        • /normal_id_01_00000000.wav
        • ...
        • /normal_id_01_00000999.wav
        • /normal_id_02_00000000.wav
        • ...
        • /normal_id_04_00000999.wav
      • /test (Normal and anomaly data for all Machine IDs are included.)
        • /normal_id_01_00000000.wav
        • ...
        • /normal_id_01_00000349.wav
        • /anomaly_id_01_00000000.wav
        • ...
        • /anomaly_id_01_00000263.wav
        • /normal_id_02_00000000.wav
        • ...
        • /anomaly_id_04_00000264.wav
    • /ToyConveyor (The other Machine Types have the same directory structure as ToyCar.)
    • /fan
    • /pump
    • /slider
    • /valve

    The paths of audio files follow the pattern (placeholders in brackets):

    • "/dev_data/[Machine Type]/train/normal_id_[Machine ID]_[serial number].wav"
    • "/dev_data/[Machine Type]/test/normal_id_[Machine ID]_[serial number].wav"
    • "/dev_data/[Machine Type]/test/anomaly_id_[Machine ID]_[serial number].wav"

    For example, the Machine Type and Machine ID of "/ToyCar/train/normal_id_01_00000000.wav" are "ToyCar" and "01", respectively, and its condition is normal. The Machine Type and Machine ID of "/fan/test/anomaly_id_00_00000000.wav" are "fan" and "00", respectively, and its condition is anomalous.
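    Since the labels are encoded entirely in the paths, a small parser is enough to build a labeled file list; a sketch consistent with the naming scheme above:

    import re
    from pathlib import Path

    NAME = re.compile(r"(normal|anomaly)_id_(\d+)_(\d+)\.wav")

    def parse_clip(path):
        p = Path(path)
        machine_type = p.parent.parent.name       # e.g. "ToyCar" or "fan"
        m = NAME.match(p.name)
        if m is None:
            raise ValueError(f"unexpected file name: {p.name}")
        condition, machine_id, _serial = m.groups()
        return machine_type, machine_id, condition

    # parse_clip("/dev_data/fan/test/anomaly_id_00_00000000.wav")
    # -> ("fan", "00", "anomaly")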

    Baseline system

    A simple baseline system is available on the GitHub repository [URL]. The baseline system provides a simple entry-level approach that gives reasonable performance on the Task 2 dataset. It is a good starting point, especially for entry-level researchers who want to get familiar with the anomalous-sound-detection task.

    Conditions of use

    This dataset was created jointly by NTT Corporation and Hitachi, Ltd. and is available under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) license.

    Publication

    If you use this dataset, please cite all the following three papers:

    Yuma Koizumi, Shoichiro Saito, Noboru Harada, Hisashi Uematsu, and Keisuke Imoto, "ToyADMOS: A Dataset of Miniature-Machine Operating Sounds for Anomalous Sound Detection," in Proc of Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), 2019. [pdf]

    Harsh Purohit, Ryo Tanabe, Kenji Ichige, Takashi Endo, Yuki Nikaido, Kaori Suefusa, and Yohei Kawaguchi, “MIMII Dataset: Sound Dataset for Malfunctioning Industrial Machine Investigation and Inspection,” in Proc. 4th Workshop on Detection and Classification of Acoustic Scenes and Events (DCASE), 2019. [pdf]

    Yuma Koizumi, Yohei Kawaguchi, Keisuke Imoto, Toshiki Nakamura, Yuki Nikaido, Ryo Tanabe, Harsh Purohit, Kaori Suefusa, Takashi Endo, Masahiro Yasuda, and Noboru Harada, "Description and Discussion on DCASE2020 Challenge Task2: Unsupervised Anomalous Sound Detection for Machine Condition Monitoring," in Proc. 5th Workshop on Detection and Classification of Acoustic Scenes and Events (DCASE), 2020. [pdf]


    Feedback

    If there is any problem, please contact us:

