89 datasets found
  1. Descriptive statistics, mean ± SD, range, median and interquartile range...

    • plos.figshare.com
    xls
    Updated May 31, 2023
    Cite
    Hélène Follet; Delphine Farlay; Yohann Bala; Stéphanie Viguet-Carrin; Evelyne Gineyts; Brigitte Burt-Pichat; Julien Wegrzyn; Pierre Delmas; Georges Boivin; Roland Chapurlat (2023). Descriptive statistics, mean ± SD, range, median and interquartile range (IQR). [Dataset]. http://doi.org/10.1371/journal.pone.0055232.t001
    Explore at:
    Available download formats: xls
    Dataset updated
    May 31, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Hélène Follet; Delphine Farlay; Yohann Bala; Stéphanie Viguet-Carrin; Evelyne Gineyts; Brigitte Burt-Pichat; Julien Wegrzyn; Pierre Delmas; Georges Boivin; Roland Chapurlat
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Descriptive statistics, mean ± SD, range, median and interquartile range (IQR).

  2. North America Boundaries

    • home-pugonline.hub.arcgis.com
    Updated Oct 23, 2023
    Cite
    The PUG User Group (2023). North America Boundaries [Dataset]. https://home-pugonline.hub.arcgis.com/datasets/north-america-boundaries
    Explore at:
    Dataset updated
    Oct 23, 2023
    Dataset authored and provided by
    The PUG User Group
    Area covered
    North America
    Description

    The Precipitation Estimation from Remotely Sensed Information using an Artificial Neural Network-Climate Data Record (PERSIANN-CDR) is a new, retrospective satellite-based precipitation dataset, constructed as a climate data record for hydrological and climate studies. The PERSIANN-CDR is available from 1983 to the present, making the dataset the longest satellite-based precipitation data record available. The precipitation maps are available at daily temporal resolution for the latitude band 60°S–60°N at 0.25 degrees. The maps shown here represent 30-year annual and seasonal median and interquartile range (IQR) of the PERSIANN-CDR dataset from 1984 to 2014. In the median precipitation maps, the mid-point value (or 50th percentile) for each pixel is computed and plotted for the study area. The range of the data about the median is represented by the interquartile range (IQR), which shows the variability of the dataset. For these maps, winter = December – February, spring = March – May, summer = June – August, fall = September – November.

  3. Simulation Data Set

    • catalog.data.gov
    • s.cnmilf.com
    Updated Nov 12, 2020
    Cite
    U.S. EPA Office of Research and Development (ORD) (2020). Simulation Data Set [Dataset]. https://catalog.data.gov/dataset/simulation-data-set
    Explore at:
    Dataset updated
    Nov 12, 2020
    Dataset provided by
    United States Environmental Protection Agency (http://www.epa.gov/)
    Description

    These are simulated data without any identifying information or informative birth-level covariates. We also standardize the pollution exposures on each week by subtracting off the median exposure amount on a given week and dividing by the interquartile range (IQR) (as in the actual application to the true NC birth records data). The dataset that we provide includes weekly average pregnancy exposures that have already been standardized in this way while the medians and IQRs are not given. This further protects against identification of the spatial locations used in the analysis.

    This dataset is not publicly accessible because: EPA cannot release personally identifiable information regarding living individuals, according to the Privacy Act and the Freedom of Information Act (FOIA). This dataset contains information about human research subjects. Because there is potential to identify individual participants and disclose personal information, either alone or in combination with other datasets, individual level data are not appropriate to post for public access. Restricted access may be granted to authorized persons by contacting the party listed. It can be accessed through the following means:

    File format: R workspace file; “Simulated_Dataset.RData”.

    Metadata (including data dictionary)
    • y: Vector of binary responses (1: adverse outcome, 0: control)
    • x: Matrix of covariates; one row for each simulated individual
    • z: Matrix of standardized pollution exposures
    • n: Number of simulated individuals
    • m: Number of exposure time periods (e.g., weeks of pregnancy)
    • p: Number of columns in the covariate design matrix
    • alpha_true: Vector of “true” critical window locations/magnitudes (i.e., the ground truth that we want to estimate)

    Code Abstract

    We provide R statistical software code (“CWVS_LMC.txt”) to fit the linear model of coregionalization (LMC) version of the Critical Window Variable Selection (CWVS) method developed in the manuscript. We also provide R code (“Results_Summary.txt”) to summarize/plot the estimated critical windows and posterior marginal inclusion probabilities.

    Description

    “CWVS_LMC.txt”: This code is delivered to the user in the form of a .txt file that contains R statistical software code. Once the “Simulated_Dataset.RData” workspace has been loaded into R, the text in the file can be used to identify/estimate critical windows of susceptibility and posterior marginal inclusion probabilities.

    “Results_Summary.txt”: This code is also delivered to the user in the form of a .txt file that contains R statistical software code. Once the “CWVS_LMC.txt” code is applied to the simulated dataset and the program has completed, this code can be used to summarize and plot the identified/estimated critical windows and posterior marginal inclusion probabilities (similar to the plots shown in the manuscript).

    Optional Information (complete as necessary)

    Required R packages:
    • For running “CWVS_LMC.txt”:
      • msm: Sampling from the truncated normal distribution
      • mnormt: Sampling from the multivariate normal distribution
      • BayesLogit: Sampling from the Polya-Gamma distribution
    • For running “Results_Summary.txt”:
      • plotrix: Plotting the posterior means and credible intervals

    Instructions for Use

    Reproducibility (Mandatory)

    What can be reproduced: The data and code can be used to identify/estimate critical windows from one of the actual simulated datasets generated under setting E4 from the presented simulation study.

    How to use the information:
    • Load the “Simulated_Dataset.RData” workspace
    • Run the code contained in “CWVS_LMC.txt”
    • Once the “CWVS_LMC.txt” code is complete, run “Results_Summary.txt”.

    Format: Below is the replication procedure for the attached data set for the portion of the analyses using a simulated data set:

    Data

    The data used in the application section of the manuscript consist of geocoded birth records from the North Carolina State Center for Health Statistics, 2005-2008. In the simulation study section of the manuscript, we simulate synthetic data that closely match some of the key features of the birth certificate data while maintaining confidentiality of any actual pregnant women.

    Availability

    Due to the highly sensitive and identifying information contained in the birth certificate data (including latitude/longitude and address of residence at delivery), we are unable to make the data from the application section publicly available. However, we will make one of the simulated datasets available for any reader interested in applying the method to realistic simulated birth records data. This will also allow the user to become familiar with the required inputs of the model, how the data should be structured, and what type of output is obtained. While we cannot provide the application data here, access to the North Carolina birth records can be requested through the North Carolina State Center for Health Statistics, and requires an appropriate data use agreement.

    Description

    Permissions: These are simulated data without any identifying information or informative birth-level covariates. The pollution exposures are standardized by the weekly median and IQR as described above, and the medians and IQRs are not given.

    This dataset is associated with the following publication: Warren, J., W. Kong, T. Luben, and H. Chang. Critical Window Variable Selection: Estimating the Impact of Air Pollution on Very Preterm Birth. Biostatistics. Oxford University Press, OXFORD, UK, 1-30, (2019).
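
    The median/IQR standardization mentioned in this description (subtract the weekly median, divide by the weekly IQR) can be illustrated with a minimal Python sketch. This is not the authors' R code: the function name `standardize_week`, the toy exposure values, and the choice of the "inclusive" quartile method are all assumptions made for illustration.

```python
from statistics import median, quantiles

def standardize_week(exposures):
    """Center one week's exposures on the median and scale by the IQR."""
    # Quartile convention is an assumption; other methods give slightly
    # different Q1/Q3 values on small samples.
    q1, _, q3 = quantiles(exposures, n=4, method="inclusive")
    center = median(exposures)
    return [(x - center) / (q3 - q1) for x in exposures]

# Hypothetical exposures for a single week across five individuals.
week = [1.0, 2.0, 3.0, 4.0, 5.0]
z_week = standardize_week(week)
print(z_week)
```

    Because only the standardized values are released, the original medians and IQRs cannot be recovered from output like `z_week` alone, which is the privacy property the description relies on.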

  4. Precipitation Interquartile Range Spring Estimation (PERSIANN) 1984-2014

    • noaa.hub.arcgis.com
    Updated Dec 18, 2024
    + more versions
    Cite
    NOAA GeoPlatform (2024). Precipitation Interquartile Range Spring Estimation (PERSIANN) 1984-2014 [Dataset]. https://noaa.hub.arcgis.com/maps/c06721acf213414191847347fcbdff3b
    Explore at:
    Dataset updated
    Dec 18, 2024
    Dataset provided by
    National Oceanic and Atmospheric Administration (http://www.noaa.gov/)
    Authors
    NOAA GeoPlatform
    Area covered
    Description

    The Precipitation Estimation from Remotely Sensed Information using an Artificial Neural Network-Climate Data Record (PERSIANN-CDR) is a satellite-based precipitation dataset for hydrological and climate studies, spanning from 1983 to present. It is the longest satellite-based precipitation record available, with daily data at 0.25° resolution for the 60°S–60°N latitude band.

    PERSIANN rain rate estimates are generated at 0.25° resolution and calibrated to a monthly merged in-situ and satellite product from the Global Precipitation Climatology Project (GPCP). The model uses Gridded Satellite (GridSat-B1) infrared data at 3-hourly time steps, with the raw output (PERSIANN-B1) bias-corrected and accumulated to produce the daily PERSIANN-CDR.

    The maps show 31 years (1984–2014) of annual and seasonal median and interquartile range (IQR) data. The median represents the 50th percentile of precipitation, and the IQR reflects the range between the 75th and 25th percentiles, showing data variability. Median and IQR are preferred over mean and standard deviation as they are less influenced by extreme values and better represent non-normally distributed data, such as precipitation, which is skewed and zero-limited.

    Data and Metadata: NCEI

    This is a component of the Gulf Data Atlas (V1.0) for the Physical topic area.
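
    As a rough illustration of the per-pixel statistics described here, the following Python sketch computes a median and an IQR (75th minus 25th percentile) for each pixel from a stack of yearly values. The pixel coordinates, the precipitation totals, and the helper name `pixel_median_and_iqr` are hypothetical, and the quartile convention is an assumption; the actual maps were not necessarily produced this way.

```python
from statistics import median, quantiles

def pixel_median_and_iqr(yearly_values):
    """Return (median, IQR) for one pixel's series of yearly values."""
    q1, _, q3 = quantiles(yearly_values, n=4, method="inclusive")
    return median(yearly_values), q3 - q1

# Hypothetical annual totals (mm) for two pixels over five years.
stack = {
    (0, 0): [800.0, 950.0, 700.0, 1200.0, 900.0],
    (0, 1): [10.0, 0.0, 250.0, 5.0, 20.0],  # skewed: one very wet year
}
maps = {pixel: pixel_median_and_iqr(vals) for pixel, vals in stack.items()}
print(maps)
```

    In the second pixel a single wet year pulls the mean up to 57 mm while the median stays at 10 mm, which is the robustness argument the description makes for skewed precipitation data.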

  5. Data from: GEOMACS (Geological and Oceanographic Model of Australias...

    • data.wu.ac.at
    • researchdata.edu.au
    • +1more
    zip
    Updated Jun 24, 2017
    + more versions
    Cite
    CSIRO Oceans and Atmosphere - Information and Data Centre (2017). GEOMACS (Geological and Oceanographic Model of Australias Continental Shelf) Interquartile range [Dataset]. https://data.wu.ac.at/schema/data_gov_au/ZGRmZGQyYjktMjEwNC00OWUxLTk4OTQtNTM3OWQyY2YyNmU0
    Explore at:
    Available download formats: zip
    Dataset updated
    Jun 24, 2017
    Dataset provided by
    CSIRO Oceans and Atmosphere - Information and Data Centre
    License

    Attribution 3.0 (CC BY 3.0): https://creativecommons.org/licenses/by/3.0/
    License information was derived automatically

    Area covered
    Australia
    Description

    Geoscience Australia's GEOMACS model was utilised to produce hindcast hourly time series of continental shelf (~20 to 300 m depth) bed shear stress (unit of measure: Pascal, Pa) on a 0.1 degree grid covering the period March 1997 to February 2008 (inclusive). The hindcast data represent the combined contribution to the bed shear stress by waves, tides, wind and density-driven circulation. The parameters calculated to represent the magnitude of the bulk of the data include the quartiles of the distribution: Q25, Q50 and Q75 (i.e. the values below which 25, 50 and 75 percent of the observations fall). The interquartile range of the GEOMACS output takes the observations from between Q25 and Q75 to provide an accurate representation of the spread of observations. The interquartile range was shown to provide a more robust representation of the observations than the standard deviation, which produced highly skewed observations (Hughes and Harris 2008). This dataset is a contribution to the CERF Marine Biodiversity Hub and is hosted temporarily by CMAR on behalf of Geoscience Australia.

  6. Median and interquartile range of the actual fall in eGFR from the baseline...

    • plos.figshare.com
    • figshare.com
    xls
    Updated Jun 3, 2023
    Cite
    Takeshi Nishijima; Hirokazu Komatsu; Hiroyuki Gatanaga; Takahiro Aoki; Koji Watanabe; Ei Kinai; Haruhito Honda; Junko Tanuma; Hirohisa Yazaki; Kunihisa Tsukada; Miwako Honda; Katsuji Teruya; Yoshimi Kikuchi; Shinichi Oka (2023). Median and interquartile range of the actual fall in eGFR from the baseline to 24, 48, and 96 weeks, according to body weight. [Dataset]. http://doi.org/10.1371/journal.pone.0022661.t005
    Explore at:
    Available download formats: xls
    Dataset updated
    Jun 3, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Takeshi Nishijima; Hirokazu Komatsu; Hiroyuki Gatanaga; Takahiro Aoki; Koji Watanabe; Ei Kinai; Haruhito Honda; Junko Tanuma; Hirohisa Yazaki; Kunihisa Tsukada; Miwako Honda; Katsuji Teruya; Yoshimi Kikuchi; Shinichi Oka
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    eGFR: estimated glomerular filtration rate, IQR: interquartile range.

  7. Median, interquartile range (IQR) and significance level of the difference...

    • plos.figshare.com
    xls
    Updated Jun 1, 2023
    Cite
    Matthias Gilgien; Philip Crivelli; Jörg Spörri; Josef Kröll; Erich Müller (2023). Median, interquartile range (IQR) and significance level of the difference between discipline medians and distributions for all parameters, and percentage of DH for GS and SG. [Dataset]. http://doi.org/10.1371/journal.pone.0118119.t001
    Explore at:
    Available download formats: xls
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Matthias Gilgien; Philip Crivelli; Jörg Spörri; Josef Kröll; Erich Müller
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    DH represents 100% for the relative measure. Differences between medians and distributions were significant between all disciplines if indicated with *, significantly different between GS and SG if marked with 1, significantly different between GS and DH if marked with 2, and significantly different between SG and DH if marked with 3. If no parameter was significantly different, the column is empty. Columns marked with — indicate that the measure was not calculated.

    Median, interquartile range (IQR) and significance level of the difference between discipline medians and distributions for all parameters, and percentage of DH for GS and SG.

  8. Precipitation Interquartile Range Winter Estimation (PERSIANN) 1984-2014

    • noaa.hub.arcgis.com
    Updated Dec 18, 2024
    + more versions
    Cite
    NOAA GeoPlatform (2024). Precipitation Interquartile Range Winter Estimation (PERSIANN) 1984-2014 [Dataset]. https://noaa.hub.arcgis.com/maps/c38031dd1db6491d837e3b5e58c628d5
    Explore at:
    Dataset updated
    Dec 18, 2024
    Dataset provided by
    National Oceanic and Atmospheric Administration (http://www.noaa.gov/)
    Authors
    NOAA GeoPlatform
    Area covered
    Description

    The Precipitation Estimation from Remotely Sensed Information using an Artificial Neural Network-Climate Data Record (PERSIANN-CDR) is a satellite-based precipitation dataset for hydrological and climate studies, spanning from 1983 to present. It is the longest satellite-based precipitation record available, with daily data at 0.25° resolution for the 60°S–60°N latitude band.

    PERSIANN rain rate estimates are generated at 0.25° resolution and calibrated to a monthly merged in-situ and satellite product from the Global Precipitation Climatology Project (GPCP). The model uses Gridded Satellite (GridSat-B1) infrared data at 3-hourly time steps, with the raw output (PERSIANN-B1) bias-corrected and accumulated to produce the daily PERSIANN-CDR.

    The maps show 31 years (1984–2014) of annual and seasonal median and interquartile range (IQR) data. The median represents the 50th percentile of precipitation, and the IQR reflects the range between the 75th and 25th percentiles, showing data variability. Median and IQR are preferred over mean and standard deviation as they are less influenced by extreme values and better represent non-normally distributed data, such as precipitation, which is skewed and zero-limited.

    Data and Metadata: NCEI

    This is a component of the Gulf Data Atlas (V1.0) for the Physical topic area.

  9. UDED Dataset

    • paperswithcode.com
    Cite
    Xavier Soria; Yachuan Li; Mohammad Rouhani; Angel D. Sappa, UDED Dataset [Dataset]. https://paperswithcode.com/dataset/uded
    Explore at:
    Authors
    Xavier Soria; Yachuan Li; Mohammad Rouhani; Angel D. Sappa
    Description

    This dataset is a collection of 1, 2, or 3 images from each of: BIPED, BSDS500, BSDS300, DIV2K, WIRE-FRAME, CID, CITYSCAPES, ADE20K, MDBD, NYUD, THANGKA, PASCAL-Context, SET14, URBAN10, and the camera-man image. The image selection process consists of computing the Inter-Quartile Range (IQR) intensity value on all the images; images larger than 720×720 pixels were not considered. In datasets whose images are in HR, the images were cut. We thank all the dataset owners for making them public. This dataset is intended only for edge detection, not for contour or boundary tasks.

  10. Data from: Prioritization of barriers that hinders Local Flexibility Market...

    • explore.openaire.eu
    • research.science.eus
    Updated May 31, 2020
    Cite
    Koldo Zabaleta; Diego Casado-Mansilla; Cruz E. Borges; Evgenia Kapassa; Guntram Preßmair; Marilena Stathopoulou; Diego López-de-Ipiña (2020). Prioritization of barriers that hinders Local Flexibility Market proliferation [Dataset]. http://doi.org/10.5281/zenodo.3855545
    Explore at:
    Dataset updated
    May 31, 2020
    Authors
    Koldo Zabaleta; Diego Casado-Mansilla; Cruz E. Borges; Evgenia Kapassa; Guntram Preßmair; Marilena Stathopoulou; Diego López-de-Ipiña
    Description

    This dataset contains the prioritization provided by a panel of 15 experts to a set of 28 barrier categories for 8 different roles of the future energy system. A Delphi method was followed and the scores provided in the three rounds carried out are included. The dataset also contains the scripts used to assess the results and the output of this assessment. The file contains the following:

    data folder: this folder includes the scores given by the 15 experts in the 3 rounds. Every round is in an individual folder. There is a file per expert that has the scores from -5 (not relevant at all) to 5 (completely relevant) per barrier (rows) and actor (columns). There is also a file with the description of the experts in terms of their position in the company, the type of company and the country.

    fig folder: this folder includes the figures created to assess the information provided by the experts. For each round, the following figures are created (in each respective folder):
    • Boxplot with the distribution of scores per barrier and role.
    • Heatmap with the mean scores per barrier and role.
    • Boxplots with the comparison of the different distributions provided by the experts of each group (depending on the keywords) per barrier and role.
    • Heatmap with the mean score per barrier, weighted depending on the importance of the role in each use case, and the final prioritization.
    Finally, bar plots with the mean score differences between rounds and boxplots comparing the score distributions are also provided.

    stat folder: this folder includes the files with the results of the different statistical assessments carried out. For each round, the following files are created (in each respective folder):
    • The statistics used to assess the scores (intraclass correlation coefficient, inter-rater agreement, inter-rater agreement p-value, homogeneity of variances, average interquartile range, standard deviation of interquartile ranges, Friedman test p-value, average power post hoc) per barrier and per role.
    • The results of the post hoc of the Friedman test per barrier and per role.
    • The average score per barrier and per role.
    • The mean value of the scores provided by the experts grouped by the keywords per barrier and role.
    • P-value of the comparison of these two values.
    • The end prioritization of the barrier for the use case (averaging the scores or fuzzy merging of the critical sets).
    Finally, the differences between the mean and standard deviations of the scores between two consecutive rounds are provided.
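
    Two of the statistics listed for the stat folder, the average interquartile range and the standard deviation of interquartile ranges, could be computed along the lines of the sketch below. The dataset's own scripts may define them differently. Here it is assumed that one IQR is taken across experts for each role and that these IQRs are then summarized for a barrier; the role names and scores are invented.

```python
from statistics import mean, quantiles, stdev

def iqr(values):
    """Interquartile range (Q3 - Q1) of a list of scores."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return q3 - q1

# Hypothetical expert scores (scale -5 to 5) for one barrier, keyed by role.
scores_by_role = {
    "role_a": [-5, -1, 0, 2, 5],   # wide disagreement
    "role_b": [1, 2, 3, 4, 5],
    "role_c": [3, 3, 3, 3, 3],     # full consensus
}

iqrs = [iqr(scores) for scores in scores_by_role.values()]
average_iqr = mean(iqrs)   # lower values suggest stronger consensus
sd_of_iqrs = stdev(iqrs)   # how unevenly consensus is spread across roles
print(iqrs, average_iqr, sd_of_iqrs)
```

    In a Delphi study a shrinking IQR from one round to the next is commonly read as convergence of expert opinion, which is presumably why these summaries sit alongside the agreement statistics.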

  11. United States Climate Reference Network (USCRN) Standardized Soil Moisture...

    • catalog.data.gov
    • s.cnmilf.com
    • +1more
    Updated Sep 19, 2023
    Cite
    NOAA National Centers for Environmental Information (Point of Contact) (2023). United States Climate Reference Network (USCRN) Standardized Soil Moisture and Soil Moisture Climatology [Dataset]. https://catalog.data.gov/dataset/united-states-climate-reference-network-uscrn-standardized-soil-moisture-and-soil-moisture-clim2
    Explore at:
    Dataset updated
    Sep 19, 2023
    Dataset provided by
    National Centers for Environmental Information (https://www.ncei.noaa.gov/)
    National Oceanic and Atmospheric Administration (http://www.noaa.gov/)
    Area covered
    United States
    Description

    The U.S. Climate Reference Network (USCRN) was designed to monitor the climate of the United States using research quality instrumentation located within representative pristine environments. This Standardized Soil Moisture (SSM) and Soil Moisture Climatology (SMC) product set is derived using the soil moisture observations from the USCRN. The hourly soil moisture anomaly (SMANOM) is derived by subtracting the MEDIAN from the soil moisture volumetric water content (SMVWC) and dividing the difference by the interquartile range (IQR = 75th percentile - 25th percentile) for that hour: SMANOM = (SMVWC - MEDIAN) / (IQR). The soil moisture percentile (SMPERC) is derived by taking all the values that were used to create the empirical cumulative distribution function (ECDF) that yielded the hourly MEDIAN and adding the current observation to the set, recalculating the ECDF, and determining the percentile value of the current observation. Finally, the soil temperature for the individual layers is provided for the dataset user's convenience. The SMC files contain the MEAN, MEDIAN, IQR, and decimal fraction of available data that are valid for each hour of the year at 5, 10, 20, 50, and 100 cm depth soil layers as well as for a top soil layer (TOP) and column soil layer (COLUMN). The TOP layer consists of an average of the 5 and 10 cm depths, while the COLUMN layer includes all available depths at a location, either two layers or five layers depending on soil depth. The SSM files contain the mean VWC, SMANOM, SMPERC, and TEMPERATURE for each of the depth layers described above. File names are structured as CRNSSM0101-STATIONNAME.csv and CRNSMC0101-STATIONNAME.csv. SSM stands for Standardized Soil Moisture and SMC for Soil Moisture Climatology. The first two digits of the trailing integer indicate the major version and the second two digits the minor version of the product.
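
    The two derived quantities in this description, SMANOM = (SMVWC - MEDIAN) / IQR and the ECDF-based SMPERC, can be sketched in Python as follows. This is an illustration only: the function names, the historical values, the quartile method, and the "fraction of values less than or equal to the observation" percentile convention are assumptions, not the product's documented algorithm.

```python
from statistics import median, quantiles

def soil_moisture_anomaly(vwc, history):
    """SMANOM = (SMVWC - MEDIAN) / IQR for one hour of the year."""
    q1, _, q3 = quantiles(history, n=4, method="inclusive")
    return (vwc - median(history)) / (q3 - q1)

def soil_moisture_percentile(vwc, history):
    """Percentile of the current observation after adding it to the set."""
    pooled = list(history) + [vwc]
    # Assumed ECDF convention: share of pooled values <= the observation.
    return 100.0 * sum(v <= vwc for v in pooled) / len(pooled)

# Hypothetical historical volumetric water content values for one hour.
history = [0.10, 0.15, 0.20, 0.25, 0.30]
anomaly = soil_moisture_anomaly(0.30, history)
percentile = soil_moisture_percentile(0.20, history)
print(anomaly, percentile)
```

    An anomaly near 1 means the observation sits about one IQR above the hourly median, so values are comparable across stations with very different soil moisture ranges.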

  12. The median (and interquartile range) of the individuals’ median and inter...

    • plos.figshare.com
    xls
    Updated May 30, 2023
    Cite
    Karlijn Sporrel; Simone R. Caljouw; Rob Withagen (2023). The median (and interquartile range) of the individuals’ median and inter quartile range (range) of both the time on the stone and the number of steps on the stone in the standardized and nonstandardized configuration. [Dataset]. http://doi.org/10.1371/journal.pone.0176165.t003
    Explore at:
    Available download formats: xls
    Dataset updated
    May 30, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Karlijn Sporrel; Simone R. Caljouw; Rob Withagen
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The median (and interquartile range) of the individuals’ median and inter quartile range (range) of both the time on the stone and the number of steps on the stone in the standardized and nonstandardized configuration.

  13. Perturbed Synthetic SWOT Datasets for Testing and Development of a Kalman...

    • zenodo.org
    zip
    Updated May 21, 2025
    + more versions
    Cite
    Siqi Ke; Mohammad J. Tourian; Renato Prata de Moraes Frasson (2025). Perturbed Synthetic SWOT Datasets for Testing and Development of a Kalman Filter Approach to Estimate Daily Discharge [Dataset]. http://doi.org/10.5281/zenodo.15482735
    Explore at:
    Available download formats: zip
    Dataset updated
    May 21, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Siqi Ke; Mohammad J. Tourian; Renato Prata de Moraes Frasson
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    1. Introduction

    Datasets are used to evaluate the performance of a Kalman filter approach to estimate daily discharge. This is a perturbed version of synthetic SWOT datasets consisting of 15 river sections, which are commonly agreed datasets for evaluating the performance of SWOT discharge algorithms (Frasson et al., 2020, 2021). The benchmarking manuscript entitled “A Kalman Filter Approach for Estimating Daily Discharge Using Space-based Discharge Estimates” is currently under review at Water Resources Research. Once the manuscript is accepted, its DOI will be included here.

    2. File description

    The datasets are divided into two categories: river information (River_Info) and time series data (Timeseries_Data). River information provides fundamental and general river characteristics, whereas time series data offers daily reach-averaged data for each reach. The time series data contain three main components: true data, perturbed measurements, and true and perturbed flow law parameters (A0, na, and b). For each reach, there are 10000 realizations of perturbed measurements per time step and 100 realizations of time-invariant perturbed flow law parameters, generated through a Monte Carlo simulation (Frasson et al., 2023). Moreover, to support our proposed Kalman filter approach to estimate daily discharge, the datasets provide the median of the perturbed discharge, river width, water surface slope, and change in the cross-sectional area, as well as the uncertainty of the perturbed discharge and change in the cross-sectional area based on the interquartile range (Fox, 2015).

    To support reproducibility and facilitate example usage, we now include a MATLAB code package (KalmanFilter_Code.zip) that demonstrates how to run the Kalman filter approach using the Missouri Downstream case as an example.

    Datasets are contained in a .mat file per river. The detailed groups and variables are in the following:

    River_Info

    Name: River name, data type: char

    QWBM: Mean annual discharge from the water balance model WBMsed (Cohen et al., 2014)

    rch_bnd: Reach boundaries measured in meters from the upstream end of the model

    gdrch: Good reaches in the study. They were used to exclude small reaches defined around low-head dams and other obstacles where Manning’s equation should not be applied.

    Timeseries_Data

    t: Time measured in days since the first day or “0-January-0000” for cases when specific dates were available. Dimension: 1, time step.

    A: Reach-averaged cross-sectional area of flow in m2. Dimension: Reach, time step.

    Q_true: True reach-averaged discharge (m3/s). Dimension: Reach, time step.

    Q_ptb: Perturbed discharge (m3/s), including 10000 realizations for each measurement. Dimension: Good reach, time step, 10000.

    med_Q_ptb: Median perturbed discharge (m3/s) across the 10000 realizations. Dimension: Good reach, time step.

    sigma_Q_ptb: Uncertainty of the perturbed discharge (m3/s), calculated based on the interquartile range. Dimension: Good reach, time step.

    W_true: True reach-averaged river width (m). Dimension: Reach, time step.

    W_ptb: Perturbed river width (m), including 10000 realizations for each measurement. Dimension: Good reach, time step, 10000.

    med_W_ptb: Median perturbed river width (m) across the 10000 realizations. Dimension: Good reach, time step.

    H_true: True reach-averaged water surface elevation (m). Dimension: Reach, time step.

    H_ptb: Perturbed water surface elevation (m), including 10000 realizations for each measurement. Dimension: Good reach, time step, 10000.

    S_true: True reach-averaged water surface slope (m/m). Dimension: Reach, time step.

    S_ptb: Perturbed water surface slope (m/m), including 10000 realizations for each measurement. Dimension: Good reach, time step, 10000.

    med_S_ptb: Median perturbed water surface slope (m/m) across the 10000 realizations. Dimension: Good reach, time step.

    dA_true: True reach-averaged change in the cross-sectional area (m2). Dimension: Good reach, time step.

    dA_ptb: Perturbed change in the cross-sectional area (m2), including 10000 realizations for each measurement. Dimension: Good reach, time step, 10000.

    med_dA_ptb: Median perturbed change in the cross-sectional area (m2) across the 10000 realizations. Dimension: Good reach, time step.

    sigma_dA_ptb: Uncertainty of the perturbed change in the cross-sectional area (m2), calculated based on the interquartile range. Dimension: Good reach, time step.

    A0_true: True baseline cross-sectional area (m2). Dimension: Good reach, 1.

    A0: Perturbed baseline cross-sectional area (m2), including 100 realizations for each parameter. Dimension: Good reach, 100.

    na_true: True friction coefficient. Dimension: Good reach, 1.

    na: Perturbed friction coefficient, including 100 realizations for each parameter. Dimension: Good reach, 100.

    b_true: True exponent coefficient. Dimension: Good reach, 1.

    b: Perturbed exponent coefficient, including 100 realizations for each parameter. Dimension: Good reach, 100.
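The med_* and sigma_* variables above summarize the realizations of each perturbed measurement. The following is a minimal sketch of how such summaries could be computed for one reach and time step. The description only says the uncertainty is "calculated based on the interquartile range", so the scaling IQR / 1.349 (which equals the standard deviation for normally distributed data) is an assumption, as are the function name and the sample values.

```python
import statistics

def summarize_realizations(realizations):
    """Return (median, IQR-based sigma) for one measurement's realizations."""
    q1, _, q3 = statistics.quantiles(realizations, n=4, method="inclusive")
    iqr = q3 - q1
    # Assumption: rescale the IQR so that it matches the standard deviation
    # of a normal distribution (IQR ~= 1.349 * sigma). The dataset's exact
    # formula is not given in the description.
    sigma = iqr / 1.349
    return statistics.median(realizations), sigma

# Hypothetical perturbed discharge values (m3/s) for one reach and time step;
# the real Q_ptb holds 10000 realizations per measurement.
q_ptb = [98.0, 101.5, 99.2, 103.8, 100.0, 97.4, 102.1, 100.9]
med_q, sigma_q = summarize_realizations(q_ptb)
print(f"median = {med_q:.2f} m3/s, sigma = {sigma_q:.2f} m3/s")
```

The same function could be applied to W_ptb, S_ptb, or dA_ptb, one (reach, time step) cell at a time.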

  14. A Decade of Reddit Politics: Comprehensive Dataset on User Political...

    • zenodo.org
    zip
    Updated Feb 27, 2024
    Cite
    Valentina Pansanella; Giulio Rossetti; Virginia Morini (2024). A Decade of Reddit Politics: Comprehensive Dataset on User Political Leanings and Interaction Networks (2011-2021) [Dataset]. http://doi.org/10.5281/zenodo.10715427
    Explore at:
    Available download formats: zip
    Dataset updated
    Feb 27, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Valentina Pansanella; Giulio Rossetti; Virginia Morini
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Data in brief

    A brief description of the 10-year Reddit Politics dataset (starting from 2011-01).

    Political leanings

    node_month_leaning_full_headings.zip

    The zip file contains a CSV file in which each row identifies a piece of user content (either a post or a comment), with the following structure:

    node_id,month_progressive_id,leaning_lable,leaning_score,post_comment

    where

    node_id: the unique user identifier
    month_progressive_id: a numeric value from 0 to 100 identifying the month in which the post/comment was published
    leaning_label: a discrete variable identifying left/right/moderates (it is based on leaning_score and can be re-binned if needed)
    leaning_score: the continuous score describing the political leaning (range [0,1])
    post_comment: a flag P/C to differentiate the submission type

    monthly_scores_json.zip

    This archive contains 3 JSON files:
    monthly_scores.json: a dictionary month->node_id->{post: [list political leanings], comments: [list political leanings]};
    monthly_scores_post_agg.json: a dictionary month->node_id->political_leaning, where the aggregated score is the average of the interquartile range of the political leaning of the user's posts only;
    monthly_scores_agg.json: a dictionary month->node_id->political_leaning, where the aggregated score is the weighted(*) average of (i) the mean value of the interquartile range of the political leaning of the user's posts and (ii) the mean value of the interquartile range of the political leaning of the user's comments.

    (*) Because the annotations of posts are more reliable than those of comments, posts are weighted 10 times as heavily as comments when aggregating.
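The aggregation described for monthly_scores_agg.json can be sketched as follows. This is an illustration under assumptions, not the authors' code: the "average of the interquartile range" is read here as the mean of the scores left after trimming the lowest and highest quarter of values by count, and the function names and sample scores are hypothetical. Only the 10:1 weighting and the month->node_id->{post, comments} layout come from the description.

```python
import statistics

def interquartile_mean(scores):
    """Mean of the middle half of the scores (simplified interquartile mean)."""
    ordered = sorted(scores)
    k = len(ordered) // 4  # number of values trimmed from each tail
    return statistics.fmean(ordered[k:len(ordered) - k])

def aggregate_leaning(posts, comments, post_weight=10.0, comment_weight=1.0):
    """Weighted average of the post and comment interquartile means.

    Returns None when the user has neither posts nor comments.
    """
    parts = []
    if posts:
        parts.append((post_weight, interquartile_mean(posts)))
    if comments:
        parts.append((comment_weight, interquartile_mean(comments)))
    if not parts:
        return None
    total_weight = sum(weight for weight, _ in parts)
    return sum(weight * value for weight, value in parts) / total_weight

# Hypothetical excerpt shaped like monthly_scores.json.
monthly_scores = {
    "0": {"user_a": {"post": [0.1, 0.2, 0.3, 0.4, 0.9], "comments": [0.6, 0.8]}}
}
monthly_scores_agg = {
    month: {node: aggregate_leaning(s["post"], s["comments"])
            for node, s in nodes.items()}
    for month, nodes in monthly_scores.items()
}
print(monthly_scores_agg)
```

Trimming by count keeps the sketch robust for users with only one or two scores; a quantile-based cut would need a fallback in those cases.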

    monthly_networks_full.zip

    This archive contains all the monthly undirected, unweighted interaction networks (each row identifying an edge between two node ids). The networks cover all users having a political leaning computed (using *both* posts and comments).

    monthly_networks_posts.zip

    This archive contains all the monthly undirected, unweighted interaction networks (each row identifying an edge between two node ids). The networks cover all users having a political leaning computed considering *only* posts.

  15. Italy: Mobility COVID-19

    • kaggle.com
    Updated Mar 26, 2021
    Cite
    Mr. Rahman (2021). Italy: Mobility COVID-19 [Dataset]. https://www.kaggle.com/motiurse/italy-mobility-covid19/discussion
    Explore at:
    Dataset updated
    Mar 26, 2021
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Mr. Rahman
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Italy
    Description

    A live version of the data record, which will be kept up-to-date with new estimates, can be downloaded from the Humanitarian Data Exchange: https://data.humdata.org/dataset/covid-19-mobility-italy.

    If you find the data helpful or you use the data for your research, please cite our work:

    Pepe, E., Bajardi, P., Gauvin, L., Privitera, F., Lake, B., Cattuto, C., & Tizzoni, M. (2020). COVID-19 outbreak response, a dataset to assess mobility changes in Italy following national lockdown. Scientific Data 7, 230 (2020).

    The data record is structured into 4 comma-separated value (CSV) files, as follows:

    id_provinces_IT.csv. Table of the administrative codes of the 107 Italian provinces. The fields of the table are:

    COD_PROV is an integer field that is used to identify a province in all other data records;

    SIGLA is a two-letter code that identifies the province according to the ISO_3166-2 standard (https://en.wikipedia.org/wiki/ISO_3166-2:IT);

    DEN_PCM is the full name of the province.

    OD_Matrix_daily_flows_norm_full_2020_01_18_2020_04_17.csv. The file contains the daily fraction of users moving between Italian provinces. Each line corresponds to an entry (i, j) of the matrix. The fields of the table are:

    p1: COD_PROV of origin,

    p2: COD_PROV of destination,

    day: in the format yyyy-mm-dd.

    median_q1_q3_rog_2020_01_18_2020_04_17.csv. The file contains the median and interquartile range (IQR) of users’ radius of gyration in a province by week. The fields of the table are:

    COD_PROV of the province;

    SIGLA of the province;

    DEN_PCM of the province;

    week: median value of the radius of gyration on week week, with week in the format dd/mm-DD/MM where dd/mm and DD/MM are the first and the last day of the week, respectively.

    week Q1 first quartile (Q1) of the distribution of the radius of gyration on week week,

    week Q3 third quartile (Q3) of the distribution of the radius of gyration on week week,
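Since the file stores Q1 and Q3 rather than the IQR itself, the IQR has to be derived as Q3 - Q1. A minimal sketch follows, assuming the column headers are the week string followed by " Q1" or " Q3" as the field list suggests; the exact header spelling, the sample rows, and the function name are hypothetical.

```python
import csv
import io

# Hypothetical excerpt; the real file has one column triple per week.
SAMPLE = """COD_PROV,SIGLA,DEN_PCM,18/01-24/01,18/01-24/01 Q1,18/01-24/01 Q3
1,TO,Torino,5.2,1.1,14.8
15,MI,Milano,6.0,1.4,16.9
"""

def weekly_median_and_iqr(row, week):
    """Return (median, IQR) of the radius of gyration for one province and week."""
    median = float(row[week])
    q1 = float(row[f"{week} Q1"])
    q3 = float(row[f"{week} Q3"])
    return median, q3 - q1

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
summary = {row["SIGLA"]: weekly_median_and_iqr(row, "18/01-24/01") for row in rows}
for sigla, (median, iqr) in summary.items():
    print(f"{sigla}: median = {median}, IQR = {iqr:.1f}")
```

To read the actual file, replace io.StringIO(SAMPLE) with an open file handle and check the header row for the real week labels.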

    average_network_degree_2020_01_18_2020_04_17.csv. The file contains daily time-series of the average degree 〈k〉 of the proximity network. Each entry of the table is a value of 〈k〉 on a given day. The fields of the table are:

    COD_PROV of the province;

    SIGLA of the province;

    DEN_PCM of the province;

    day in the format yyyy-mm-dd.

    ESRI shapefiles of the Italian provinces updated to the most recent definition are available from the website of the Italian National Office of Statistics (ISTAT): https://www.istat.it/it/archivio/222527.

  16. Gender, Age, and Emotion Detection from Voice

    • kaggle.com
    Updated May 29, 2021
    Cite
    Rohit Zaman (2021). Gender, Age, and Emotion Detection from Voice [Dataset]. https://www.kaggle.com/datasets/rohitzaman/gender-age-and-emotion-detection-from-voice/suggestions
    Explore at:
    Dataset updated
    May 29, 2021
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Rohit Zaman
    Description

    Context

    Our target was to predict gender, age and emotion from audio. We found labeled audio datasets from Mozilla and RAVDESS. Using the R programming language, we extracted 20 statistical features and then added the labels to form these datasets. Audio files were collected from "Mozilla Common Voice" and the “Ryerson AudioVisual Database of Emotional Speech and Song (RAVDESS)”.

    Content

    The datasets contain 20 feature columns and 1 column denoting the label. The 20 statistical features were extracted through frequency spectrum analysis using the R programming language. They are:

    1) meanfreq - The mean frequency (in kHz) is a pitch measure that assesses the center of the distribution of power across frequencies.
    2) sd - The standard deviation of frequency is a statistical measure that describes a dataset’s dispersion relative to its mean and is calculated as the variance’s square root.
    3) median - The median frequency (in kHz) is the middle number in the sorted (ascending or descending) list of numbers.
    4) Q25 - The first quartile (in kHz), referred to as Q1, is the median of the lower half of the data set. This means that about 25 percent of the data set numbers are below Q1, and about 75 percent are above Q1.
    5) Q75 - The third quartile (in kHz), referred to as Q3, is the central point between the median and the highest value of the distribution.
    6) IQR - The interquartile range (in kHz) is a measure of statistical dispersion, equal to the difference between the 75th and 25th percentiles, that is, between the upper and lower quartiles.
    7) skew - The skewness is the degree of distortion from the normal distribution. It measures the lack of symmetry in the data distribution.
    8) kurt - The kurtosis is a statistical measure that determines how much the tails of a distribution differ from the tails of a normal distribution. It is effectively a measure of the outliers present in the data distribution.
    9) sp.ent - The spectral entropy is a measure of signal irregularity that sums up the normalized signal’s spectral power.
    10) sfm - The spectral flatness or tonality coefficient, also known as Wiener entropy, is a measure used in digital signal processing to characterize an audio spectrum. Spectral flatness is usually measured in decibels and offers a way to quantify how noise-like, as opposed to tone-like, a sound is.
    11) mode - The mode frequency is the most frequently observed value in a data set.
    12) centroid - The spectral centroid is a metric used to describe a spectrum in digital signal processing. It indicates where the spectrum’s center of mass is located.
    13) meanfun - The meanfun is the average of the fundamental frequency measured across the acoustic signal.
    14) minfun - The minfun is the minimum fundamental frequency measured across the acoustic signal.
    15) maxfun - The maxfun is the maximum fundamental frequency measured across the acoustic signal.
    16) meandom - The meandom is the average of the dominant frequency measured across the acoustic signal.
    17) mindom - The mindom is the minimum of the dominant frequency measured across the acoustic signal.
    18) maxdom - The maxdom is the maximum of the dominant frequency measured across the acoustic signal.
    19) dfrange - The dfrange is the range of the dominant frequency measured across the acoustic signal.
    20) modindx - The modindx is the modulation index, which expresses the degree of frequency modulation numerically as the ratio of the frequency deviation to the frequency of the modulating signal for a pure tone modulation.

    Acknowledgements

    Gender and Age Audio Data Source: https://commonvoice.mozilla.org/en
    Emotion Audio Data Source: https://smartlaboratory.org/ravdess/

  17. Data from: Prioritization of barriers that hinders Local Flexibility Market...

    • explore.openaire.eu
    Updated May 31, 2020
    + more versions
    Cite
    Koldo Salabarrieta; Cruz E. Borges; Diego Casado-Mansilla; Evgenia Kapassa; Guntram Preßmair; Diego López-de-Ipiña (2020). Prioritization of barriers that hinders Local Flexibility Market proliferation [Dataset]. http://doi.org/10.5281/zenodo.3855546
    Explore at:
    Dataset updated
    May 31, 2020
    Authors
    Koldo Salabarrieta; Cruz E. Borges; Diego Casado-Mansilla; Evgenia Kapassa; Guntram Preßmair; Diego López-de-Ipiña
    Description

    This dataset contains the prioritization provided by a panel of 15 experts for a set of 28 barrier categories across 8 different roles of the future energy system. A Delphi method was followed, and the scores provided in the three rounds carried out are included. The dataset also contains the scripts used to assess the results and the output of this assessment. The information contained in this file is as follows:

    data folder: this folder includes the scores given by the 15 experts in the 3 rounds. Every round is in an individual folder. There is one file per expert containing the scores, from -5 (not relevant at all) to 5 (completely relevant), per barrier (rows) and actor (columns). There is also a file describing the experts in terms of their position in the company, the type of company, and the country.

    fig folder: this folder includes the figures created to assess the information provided by the experts. For each round, the following figures are created (in each respective folder): a boxplot with the distribution of scores per barrier and role; a heatmap with the mean scores per barrier and role; boxplots comparing the distributions provided by the experts of each group (depending on the keywords) per barrier and role; and a heatmap with the mean score per barrier and use case and with the prioritization per barrier and use case. Finally, bar plots with the differences in mean scores between rounds and boxplots comparing the score distributions are also provided.

    stat folder: this folder includes the files with the results of the different statistical assessments carried out. For each round, the following files are created (in each respective folder): the statistics used to assess the scores (intraclass correlation coefficient, inter-rater agreement, inter-rater agreement p-value, homogeneity of variances, average interquartile range, standard deviation of interquartile ranges, Friedman test p-value, average power post hoc) per barrier and per role; the results of the post hoc of the Friedman test per barrier and per role; the average score per barrier and per role; the mean value of the scores provided by the experts grouped by the keywords per barrier and role; the p-value of the comparison of these two values; and the final prioritization of the barriers for the use case (averaging the scores or merging the critical sets). Finally, the differences between the means and standard deviations of the scores between two consecutive rounds are provided.

  18. A complementary dataset of open-eyes EEG recordings in a photo-stimulation...

    • openneuro.org
    Updated May 13, 2025
    Cite
    Aimilia Ntetska; Andreas Miltiadous; Alexandros T. Tzallas; Katerina D. Tzimourta; Theodora Afrantou; Panagiotis Ioannidis; Dimitrios G. Tsalikakis; Nikolaos Grigoriadis; Pantelis Angelidis; Konstantinos Sakkas; Emmanouil D. Oikonomou; Nikolaos Giannakeas; Markos G. Tsipouras (2025). A complementary dataset of open-eyes EEG recordings in a photo-stimulation setting from: Alzheimer's disease, Frontotemporal dementia and Healthy subjects [Dataset]. http://doi.org/10.18112/openneuro.ds006036.v1.0.5
    Explore at:
    Dataset updated
    May 13, 2025
    Dataset provided by
    OpenNeuro (https://openneuro.org/)
    Authors
    Aimilia Ntetska; Andreas Miltiadous; Alexandros T. Tzallas; Katerina D. Tzimourta; Theodora Afrantou; Panagiotis Ioannidis; Dimitrios G. Tsalikakis; Nikolaos Grigoriadis; Pantelis Angelidis; Konstantinos Sakkas; Emmanouil D. Oikonomou; Nikolaos Giannakeas; Markos G. Tsipouras
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    This dataset provides complementary material to the previously published dataset named “A dataset of EEG recordings from: Alzheimer's disease, Frontotemporal dementia and Healthy subjects” with doi:10.18112/openneuro.ds004504.v1.0.8. It consists of eyes-open EEG recordings in multiple photic stimulation settings, according to the clinical protocol of the 2nd department of Neurology, AHEPA University of Thessaloniki, Greece. The participant numbers match the respective participant numbers of the aforementioned dataset. In the clinical protocol, the recordings of the first dataset came first, followed by the recordings of this dataset. The dataset is designed to complement a previously published dataset in which the same cohort underwent EEG recordings with their eyes closed. During the recordings, participants were seated with their eyes open while being exposed to photic stimulation. The stimulation was administered at incremental frequencies, beginning at 5 Hz, progressing to 10 Hz, 15 Hz, and in some cases, extending up to 30 Hz, with increments of 5 Hz at each level. This study compared cognitive function in 36 individuals with Alzheimer's disease (AD), 23 with Frontotemporal Dementia (FTD), and 29 healthy controls (CN). Cognitive function was measured using the Mini-Mental State Examination (MMSE), where lower scores indicate greater cognitive impairment. The AD group had an average MMSE score of 17.75 (standard deviation of 4.5), the FTD group averaged 22.17 (standard deviation of 8.22), and the CN group scored 30. The average age was 66.4 (standard deviation of 7.9) for the AD group, 63.6 (standard deviation of 8.2) for the FTD group, and 67.9 (standard deviation of 5.4) for the CN group. The median disease duration was 25 months, with an interquartile range of 24 to 28.5 months. Notably, the AD group had no reported dementia-related comorbidities.
Recordings: Recordings were acquired from the 2nd Department of Neurology of AHEPA General Hospital of Thessaloniki by an experienced team of neurologists. For recording, a Nihon Kohden EEG 2100 clinical device was used, with 19 scalp electrodes (Fp1, Fp2, F7, F3, Fz, F4, F8, T3, C3, Cz, C4, T4, T5, P3, Pz, P4, T6, O1, and O2) according to the 10-20 international system and 2 additional electrodes (A1 and A2) placed on the mastoids for impedance check, according to the manual of the device. Each recording was performed according to the clinical protocol with participants being in a sitting position having their eyes closed. Before the initialization of each recording, the skin impedance value was ensured to be below 5 kΩ. The sampling rate was 500 Hz with 10 µV/mm resolution. The recording montages were anterior-posterior bipolar and referential montage using Cz as the common reference. The referential montage was included in this dataset. The recordings were obtained with the following amplifier parameters: sensitivity 10 µV/mm, time constant 0.3 s, and high-frequency filter at 70 Hz. Each recording lasted approximately 4.86 minutes for the AD group (min = 1.30 minutes, max = 8.77 minutes), 4.42 minutes for the FTD group (min = 1.25 minutes, max = 10.05 minutes) and 6.43 minutes for the CN group (min = 3.17 minutes, max = 9.17 minutes). In total, 174.94 minutes of AD, 101.56 minutes of FTD and 186.50 minutes of CN recordings were collected and are included in the dataset.
Preprocessing: The EEG recordings were exported in .eeg format and were transformed to the BIDS-accepted .set format for inclusion in the dataset. Automatic annotations of the Nihon Kohden EEG device marking artifacts (muscle activity, blinking, swallowing) have not been included for language compatibility purposes (if this is an issue, please use the preprocessed dataset in Folder: derivatives). The unprocessed EEG recordings are included in folders named: sub-0XX.
Folders named sub-0XX in the subfolder derivatives contain the preprocessed and denoised EEG recordings. The preprocessing pipeline of the EEG signals is as follows. First, a Butterworth band-pass filter 0.5-45 Hz was applied and the signals were re-referenced to A1-A2. Then, the Artifact Subspace Reconstruction routine (ASR) which is an EEG artifact correction method included in the EEGLab Matlab software was applied to the signals, removing bad data periods which exceeded the max acceptable 0.5 second window standard deviation of 15, which is considered a conservative window. Next, the Independent Component Analysis (ICA) method (RunICA algorithm) was performed, transforming the 19 EEG signals to 19 ICA components. ICA components that were classified as “eye artifacts” or “jaw artifacts” by the automatic classification routine “ICLabel” in the EEGLAB platform were automatically rejected. It should be noted that, even though the recording was performed in a resting state, eyes-closed condition, eye artifacts of eye movement were still found at some EEG recordings.

  19. ELITE Emissivity: MODIS Mid-Infrared NBE in Qinghai Provence at 0.01 degree

    • zenodo.org
    tiff, zip
    Updated Dec 22, 2024
    Cite
    Weihan Liu; Jie Cheng (2024). ELITE Emissivity: MODIS Mid-Infrared NBE in Qinghai Provence at 0.01 degree [Dataset]. http://doi.org/10.5281/zenodo.14385169
    Explore at:
    Available download formats: tiff, zip
    Dataset updated
    Dec 22, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Weihan Liu; Jie Cheng
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Qinghai
    Description

    The Essential thermal Infrared remoTe sEnsing (ELITE) product suite currently has four types of products, including land surface temperature (LST: clear-sky and all-sky), emissivity (NBE: narrowband emissivity; BBE: broadband emissivity; and spectral emissivity), the components of the surface radiation and energy budget (SLUR: surface longwave upwelling radiation; SLDR: surface longwave downward radiation; SLNR: surface longwave net radiation), and the components of Earth’s radiation budget (OLR: outgoing longwave radiation; RSR: reflected solar radiation). The spatial-temporal resolutions of the ELITE products are mainly determined by the employed satellite data sources. For more information about ELITE products, please refer to the website (https://elite.bnu.edu.cn).

    This dataset is the ELITE MODIS Mid-Infrared (MIR) narrowband emissivity (NBE) product (bands 20, 22, and 23; 3.6–4.1 µm) at 0.01-degree spatial resolution over Qinghai Province, China, for January and July 2024. It includes instantaneous retrievals based on MODIS nighttime swath data and monthly composites derived by averaging the instantaneous retrievals after outlier removal using the interquartile range (IQR) method.
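The monthly compositing step (outlier removal with the IQR method, then averaging) can be illustrated for a single pixel as follows. This is a sketch under assumptions: the description does not give the fence multiplier, so the common Tukey choice of 1.5 x IQR is assumed, and treating the fill value 0 as "no retrieval" is inferred from the dataset characteristics. The function name and the emissivity values are hypothetical.

```python
import statistics

def monthly_composite(retrievals, k=1.5, fill_value=0.0):
    """Average one pixel's instantaneous retrievals after IQR outlier removal."""
    # Assumption: the fill value marks missing retrievals and is skipped.
    valid = [v for v in retrievals if v != fill_value]
    if len(valid) < 2:
        return valid[0] if valid else fill_value
    q1, _, q3 = statistics.quantiles(valid, n=4, method="inclusive")
    iqr = q3 - q1
    # Assumption: Tukey fences at k = 1.5 times the IQR.
    lower, upper = q1 - k * iqr, q3 + k * iqr
    kept = [v for v in valid if lower <= v <= upper]
    # Fall back to the median if rounding leaves nothing inside the fences.
    return statistics.fmean(kept) if kept else statistics.median(valid)

# Hypothetical nighttime retrievals for one pixel in one month;
# 0.0 is a fill value and 0.650 is an outlier.
pixel = [0.912, 0.915, 0.0, 0.910, 0.918, 0.650, 0.914]
composite = monthly_composite(pixel)
print(f"monthly mean emissivity = {composite:.4f}")
```

In this example the 0.650 retrieval falls below the lower fence and is dropped before the remaining five values are averaged.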

    Dataset Characteristics:

    • Spatial Coverage: Qinghai Province, China
    • Temporal Coverage: 2024-01-01 ~ 2024-01-31; 2024-07-01 ~ 2024-07-31
    • Spatial Resolution: 0.01 degree
    • Temporal Resolution: instantaneous and monthly mean
    • Data Format: GeoTIFF
    • Scale: 1
    • Offset: 0
    • Fill value: 0
  20. Data release for solar-sensor angle analysis subset associated with the...

    • catalog.data.gov
    • data.usgs.gov
    Updated Jul 6, 2024
    + more versions
    Cite
    U.S. Geological Survey (2024). Data release for solar-sensor angle analysis subset associated with the journal article "Solar and sensor geometry, not vegetation response, drive satellite NDVI phenology in widespread ecosystems of the western United States" [Dataset]. https://catalog.data.gov/dataset/data-release-for-solar-sensor-angle-analysis-subset-associated-with-the-journal-article-so
    Explore at:
    Dataset updated
    Jul 6, 2024
    Dataset provided by
    United States Geological Survey (http://www.usgs.gov/)
    Area covered
    Western United States, United States
    Description

    This dataset provides geospatial location data and scripts used to analyze the relationship between MODIS-derived NDVI and solar and sensor angles in a pinyon-juniper ecosystem in Grand Canyon National Park. The data are provided in support of the following publication: "Solar and sensor geometry, not vegetation response, drive satellite NDVI phenology in widespread ecosystems of the western United States". The data and scripts allow users to replicate, test, or further explore results. The file GrcaScpnModisCellCenters.csv contains locations (latitude-longitude) of all the 250-m MODIS (MOD09GQ) cell centers associated with the Grand Canyon pinyon-juniper ecosystem that the Southern Colorado Plateau Network (SCPN) is monitoring through its land surface phenology and integrated upland monitoring programs. The file SolarSensorAngles.csv contains MODIS angle measurements for the pixel at the phenocam location plus a random 100 point subset of pixels within the GRCA-PJ ecosystem. The script files (folder: 'Code') consist of 1) a Google Earth Engine (GEE) script used to download MODIS data through the GEE javascript interface, and 2) a script used to calculate derived variables and to test relationships between solar and sensor angles and NDVI using the statistical software package 'R'. The file Fig_8_NdviSolarSensor.JPG shows NDVI dependence on solar and sensor geometry demonstrated for both a single pixel/year and for multiple pixels over time. (Left) MODIS NDVI versus solar-to-sensor angle for the Grand Canyon phenocam location in 2018, the year for which there is corresponding phenocam data. (Right) Modeled r-squared values by year for 100 randomly selected MODIS pixels in the SCPN-monitored Grand Canyon pinyon-juniper ecosystem. The model for forward-scatter MODIS-NDVI is log(NDVI) ~ solar-to-sensor angle. The model for back-scatter MODIS-NDVI is log(NDVI) ~ solar-to-sensor angle + sensor zenith angle. 
Boxplots show interquartile ranges; whiskers extend to the 10th and 90th percentiles. The horizontal line marking the average median value for forward-scatter r-squared (0.835) is nearly indistinguishable from the back-scatter line (0.833). The dataset folder also includes supplemental R-project and packrat files that allow the user to apply the workflow by opening a project that will use the same package versions used in this study (e.g., the folders .Rproj.user and packrat, and the files .RData and PhenocamPR.Rproj). The empty folder GEE_DataAngles is included so that the user can save the data files from the Google Earth Engine scripts to this location, where they can then be incorporated into the R processing scripts without needing to change folder names. To successfully use the packrat information to replicate the exact processing steps that were used, the user should refer to the packrat documentation available at https://cran.r-project.org/web/packages/packrat/index.html and at https://www.rdocumentation.org/packages/packrat/versions/0.5.0. Alternatively, the user may use the descriptive documentation, the phenopix package documentation, and the description/references provided in the associated journal article to process the data and achieve the same results using newer packages or other software programs.
