21 datasets found
  1. Data from: Error and anomaly detection for intra-participant time-series...

    • tandf.figshare.com
    xlsx
    Updated Jun 1, 2023
    Cite
    David R. Mullineaux; Gareth Irwin (2023). Error and anomaly detection for intra-participant time-series data [Dataset]. http://doi.org/10.6084/m9.figshare.5189002
    Dataset provided by
    Taylor & Francis
    Authors
    David R. Mullineaux; Gareth Irwin
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Identifying errors or anomalous values, collectively considered outliers, assists in exploring data and, when the outliers are removed, improves statistical analysis. In biomechanics, outlier detection methods have explored the ‘shape’ of entire cycles, although exploring fewer points using a ‘moving-window’ may be advantageous. Hence, the aim was to develop a moving-window method for detecting trials with outliers in intra-participant time-series data. Outliers were detected through two stages for the strides (mean 38 cycles) from treadmill running. Cycles were removed in stage 1 for one-dimensional (spatial) outliers at each time point using the median absolute deviation, and in stage 2 for two-dimensional (spatial–temporal) outliers using a moving-window standard deviation. Significance levels of the t-statistic were used for scaling. Fewer cycles were removed with smaller scaling and smaller window size, requiring more stringent scaling at stage 1 (mean 3.5 cycles removed for 0.0001 scaling) than at stage 2 (mean 2.6 cycles removed for 0.01 scaling with a window size of 1). Settings in the supplied Matlab code should be customised to each data set, and outliers assessed to justify whether to retain or remove those cycles. The method is effective in identifying trials with outliers in intra-participant time-series data.
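
    The two-stage idea described above can be sketched in Python as follows. This is a minimal illustration and not the authors' supplied Matlab code: the function names, the fixed `threshold` multipliers (the paper scales by significance levels of the t-statistic instead), the 1.4826 MAD-to-SD constant, and the pooling of all cycles inside each window are assumptions made for the example.

```python
from statistics import mean, median, stdev


def mad_outlier_cycles(cycles, threshold=3.0):
    """Stage 1 sketch: flag cycles that are spatial outliers at any single time point.

    `cycles` is a list of equal-length lists (one list per cycle).
    """
    flagged = set()
    for t in range(len(cycles[0])):
        values = [cycle[t] for cycle in cycles]
        med = median(values)
        mad = median(abs(v - med) for v in values)
        if mad == 0:
            continue  # no spread at this time point, nothing to flag
        for i, v in enumerate(values):
            # 1.4826 rescales the MAD so it is comparable to a standard deviation
            if abs(v - med) / (1.4826 * mad) > threshold:
                flagged.add(i)
    return flagged


def window_sd_outlier_cycles(cycles, window=1, threshold=3.0):
    """Stage 2 sketch: flag cycles that leave a mean +/- threshold*SD band
    computed over a moving window of consecutive time points."""
    flagged = set()
    n_points = len(cycles[0])
    for start in range(n_points - window + 1):
        span = range(start, start + window)
        pooled = [cycle[t] for cycle in cycles for t in span]
        centre, spread = mean(pooled), stdev(pooled)
        if spread == 0:
            continue
        for i, cycle in enumerate(cycles):
            if any(abs(cycle[t] - centre) > threshold * spread for t in span):
                flagged.add(i)
    return flagged
```

    In use, stage 2 would be run on the cycles that survive stage 1. A single extreme value inflates the standard deviation, so a small sample can need a lower `threshold` at stage 2 than at stage 1.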

  2. Data from: Methodology to filter out outliers in high spatial density data...

    • scielo.figshare.com
    jpeg
    Updated Jun 4, 2023
    Cite
    Leonardo Felipe Maldaner; José Paulo Molin; Mark Spekken (2023). Methodology to filter out outliers in high spatial density data to improve maps reliability [Dataset]. http://doi.org/10.6084/m9.figshare.14305658.v1
    Dataset provided by
    SciELO journals
    Authors
    Leonardo Felipe Maldaner; José Paulo Molin; Mark Spekken
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    ABSTRACT The considerable volume of data generated by sensors in the field presents systematic errors; thus, it is extremely important to exclude these errors to ensure mapping quality. The objective of this research was to develop and test a methodology to identify and exclude outliers in high-density spatial data sets, determine whether the developed filter process could help decrease the nugget effect and improve the spatial variability characterization of high sampling data. We created a filter composed of a global, anisotropic, and an anisotropic local analysis of data, which considered the respective neighborhood values. For that purpose, we used the median to classify a given spatial point into the data set as the main statistical parameter and took into account its neighbors within a radius. The filter was tested using raw data sets of corn yield, soil electrical conductivity (ECa), and the sensor vegetation index (SVI) in sugarcane. The results showed an improvement in accuracy of spatial variability within the data sets. The methodology reduced RMSE by 85 %, 97 %, and 79 % in corn yield, soil ECa, and SVI respectively, compared to interpolation errors of raw data sets. The filter excluded the local outliers, which considerably reduced the nugget effects, reducing estimation error of the interpolated data. The methodology proposed in this work had a better performance in removing outlier data when compared to two other methodologies from the literature.
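
    The local step of such a filter, comparing each point with the median of its neighbours within a radius, might look like the sketch below. It is a simplified illustration only: it uses an isotropic (circular) neighbourhood rather than the anisotropic analysis described in the abstract, and the function name and the `max_rel_dev` tolerance are hypothetical.

```python
from math import hypot
from statistics import median


def local_median_filter(points, radius, max_rel_dev=0.5):
    """Keep points whose value stays close to the median of their neighbours.

    `points` is a list of (x, y, value) tuples in a projected coordinate system.
    """
    kept = []
    for i, (x, y, value) in enumerate(points):
        neighbours = [
            other_value
            for j, (ox, oy, other_value) in enumerate(points)
            if j != i and hypot(ox - x, oy - y) <= radius
        ]
        if not neighbours:
            kept.append((x, y, value))  # nothing to compare against; keep by default
            continue
        local = median(neighbours)
        # A zero local median gives no relative scale, so the point is kept.
        if local == 0 or abs(value - local) / abs(local) <= max_rel_dev:
            kept.append((x, y, value))
    return kept
```

    Using the median rather than the mean keeps a single extreme neighbour from dragging the reference value, which is why an isolated spike is removed while the points around it survive.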

  3. Predictive Validity Data Set

    • figshare.com
    txt
    Updated Dec 18, 2022
    Cite
    Antonio Abeyta (2022). Predictive Validity Data Set [Dataset]. http://doi.org/10.6084/m9.figshare.17030021.v1
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Antonio Abeyta
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Verbal and Quantitative Reasoning GRE scores and percentiles were collected by querying the student database for the appropriate information. Any student records that were missing data such as GRE scores or grade point average were removed from the study before the data were analyzed. The GRE Scores of entering doctoral students from 2007-2012 were collected and analyzed. A total of 528 student records were reviewed. Ninety-six records were removed from the data because of a lack of GRE scores. Thirty-nine of these records belonged to MD/PhD applicants who were not required to take the GRE to be reviewed for admission. Fifty-seven more records were removed because they did not have an admissions committee score in the database. After 2011, the GRE’s scoring system was changed from a scale of 200-800 points per section to 130-170 points per section. As a result, 12 more records were removed because their scores were representative of the new scoring system and therefore were not able to be compared to the older scores based on raw score. After removal of these 96 records from our analyses, a total of 420 student records remained which included students that were currently enrolled, left the doctoral program without a degree, or left the doctoral program with an MS degree. To maintain consistency in the participants, we removed 100 additional records so that our analyses only considered students that had graduated with a doctoral degree. In addition, thirty-nine admissions scores were identified as outliers by statistical analysis software and removed for a final data set of 286 (see Outliers below).

    Outliers

    We used the automated ROUT method included in the PRISM software to test the data for the presence of outliers which could skew our data. The false discovery rate for outlier detection (Q) was set to 1%. After removing the 96 students without a GRE score, 432 students were reviewed for the presence of outliers. ROUT detected 39 outliers that were removed before statistical analysis was performed.

    Sample

    See detailed description in the Participants section. Linear regression analysis was used to examine potential trends between GRE scores, GRE percentiles, normalized admissions scores or GPA and outcomes between selected student groups. The D’Agostino & Pearson omnibus and Shapiro-Wilk normality tests were used to test for normality regarding outcomes in the sample. The Pearson correlation coefficient was calculated to determine the relationship between GRE scores, GRE percentiles, admissions scores or GPA (undergraduate and graduate) and time to degree. Candidacy exam results were divided into students who either passed or failed the exam. A Mann-Whitney test was then used to test for statistically significant differences between mean GRE scores, percentiles, and undergraduate GPA and candidacy exam results. Other variables were also observed such as gender, race, ethnicity, and citizenship status within the samples.

    Predictive Metrics. The input variables used in this study were GPA and scores and percentiles of applicants on both the Quantitative and Verbal Reasoning GRE sections. GRE scores and percentiles were examined to normalize variances that could occur between tests.

    Performance Metrics. The output variables used in the statistical analyses of each data set were either the amount of time it took for each student to earn their doctoral degree, or the student’s candidacy examination result.
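
    For readers who want to reproduce the correlation step outside PRISM, the Pearson coefficient mentioned above can be computed with a few lines of Python. The sketch below is generic: the function name is hypothetical and no values from the data set are used.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squares of deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)
```

    Here `xs` could hold a predictive metric (for example GRE percentiles) and `ys` a performance metric (for example time to degree), one pair per student.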

  4. Extended 1.0 Dataset of "Concentration and Geospatial Modelling of Health...

    • zenodo.org
    bin, csv, pdf
    Updated Sep 23, 2024
    Cite
    Peter Domjan; Viola Angyal; Istvan Vingender (2024). Extended 1.0 Dataset of "Concentration and Geospatial Modelling of Health Development Offices' Accessibility for the Total and Elderly Populations in Hungary" [Dataset]. http://doi.org/10.5281/zenodo.13826993
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Peter Domjan; Viola Angyal; Istvan Vingender
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Sep 23, 2024
    Area covered
    Hungary
    Description

    Introduction

    We are enclosing the database used in our research titled "Concentration and Geospatial Modelling of Health Development Offices' Accessibility for the Total and Elderly Populations in Hungary", along with our statistical calculations. For the sake of reproducibility, further information can be found in the file Short_Description_of_Data_Analysis.pdf and Statistical_formulas.pdf

    The sharing of data is part of our aim to strengthen the base of our scientific research. As of March 7, 2024, the detailed submission and analysis of our research findings to a scientific journal has not yet been completed.

    The dataset was expanded on 23rd September 2024 to include SPSS statistical analysis data, a heatmap, and buffer zone analysis around the Health Development Offices (HDOs) created in QGIS software.

    Short Description of Data Analysis and Attached Files (datasets):

    Our research used data from 2022, which served as the basis for statistical standardisation. The 2022 Hungarian census provided an objective basis for our analysis, with age group data available at the county level from the Hungarian Central Statistical Office (KSH) website. The 2022 demographic data provided an accurate picture compared to the data available from the 2023 microcensus. The calculation used is based on our standardisation of the 2022 data. For xlsx files, we used MS Excel 2019 (version: 1808, build: 10406.20006) with the SOLVER add-in.

    Hungarian Central Statistical Office served as the data source for population by age group, county, and regions: https://www.ksh.hu/stadat_files/nep/hu/nep0035.html, (accessed 04 Jan. 2024.) with data recorded in MS Excel in the Data_of_demography.xlsx file.

    In 2022, 108 Health Development Offices (HDOs) were operational, and it's noteworthy that no developments have occurred in this area since 2022. The availability of these offices and the demographic data from the Central Statistical Office in Hungary are considered public interest data, freely usable for research purposes without requiring permission.

    The contact details for the Health Development Offices were sourced from the following page (Hungarian National Population Centre (NNK)): https://www.nnk.gov.hu/index.php/efi (n=107). The Semmelweis University Health Development Centre was not listed by NNK, hence it was separately recorded as the 108th HDO. More information about the office can be found here: https://semmelweis.hu/egeszsegfejlesztes/en/ (n=1). (accessed 05 Dec. 2023.)

    Geocoordinates were determined using Google Maps (N=108): https://www.google.com/maps. (accessed 02 Jan. 2024.) Recording of geocoordinates (latitude and longitude according to WGS 84 standard), address data (postal code, town name, street, and house number), and the name of each HDO was carried out in the: Geo_coordinates_and_names_of_Hungarian_Health_Development_Offices.csv file.

    The foundational software for geospatial modelling and display (QGIS 3.34), an open-source software, can be downloaded from:

    https://qgis.org/en/site/forusers/download.html. (accessed 04 Jan. 2024.)

    The HDOs_GeoCoordinates.gpkg QGIS project file contains Hungary's administrative map and the recorded addresses of the HDOs, imported from the Geo_coordinates_and_names_of_Hungarian_Health_Development_Offices.csv file.

    The OpenStreetMap tileset is directly accessible from www.openstreetmap.org in QGIS. (accessed 04 Jan. 2024.)

    The Hungarian county administrative boundaries were downloaded from the following website: https://data2.openstreetmap.hu/hatarok/index.php?admin=6 (accessed 04 Jan. 2024.)

    HDO_Buffers.gpkg is a QGIS project file that includes the administrative map of Hungary, the county boundaries, as well as the HDO offices and their corresponding buffer zones with a radius of 7.5 km.
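
    Outside QGIS, a rough check of whether a location falls inside a 7.5 km buffer can be made from the WGS 84 latitude and longitude with the haversine great-circle distance. The sketch below is an approximation under stated assumptions: the function names are hypothetical, a spherical Earth with a mean radius of 6371.0088 km is assumed, and the result may differ slightly from buffers built in a projected coordinate system.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0088  # assumed mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two WGS 84 points (degrees)."""
    phi1, phi2 = radians(lat1), radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = radians(lon2 - lon1)
    a = sin(d_phi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(d_lambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def within_buffer(point, office, radius_km=7.5):
    """True if `point` (lat, lon) lies within `radius_km` of `office` (lat, lon)."""
    return haversine_km(point[0], point[1], office[0], office[1]) <= radius_km
```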

    Heatmap.gpkg is a QGIS project file that includes the administrative map of Hungary, the county boundaries, as well as the HDO offices and their corresponding heatmap (Kernel Density Estimation).

    A brief description of the statistical formulas applied is included in the Statistical_formulas.pdf.

    Recording of our base data for statistical concentration and diversification measurement was done using MS Excel 2019 (version: 1808, build: 10406.20006) in .xlsx format.

    • Aggregated number of HDOs by county: Number_of_HDOs.xlsx
    • Standardised data (Number of HDOs per 100,000 residents): Standardized_data.xlsx
    • Calculation of the Lorenz curve: Lorenz_curve.xlsx
    • Calculation of the Gini index: Gini_Index.xlsx
    • Calculation of the LQ index: LQ_Index.xlsx
    • Calculation of the Herfindahl-Hirschman Index: Herfindahl_Hirschman_Index.xlsx
    • Calculation of the Entropy index: Entropy_Index.xlsx
    • Regression and correlation analysis calculation: Regression_correlation.xlsx
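
    The concentration measures listed above are calculated in the Excel files; as a cross-check, textbook forms of three of them can be sketched in Python. The exact variants the authors applied are documented in Statistical_formulas.pdf and may differ from these, and the function names here are hypothetical.

```python
def gini_index(values):
    """Gini index of non-negative values (0 = perfectly even, near 1 = concentrated)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    # G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n, with ranks i = 1..n
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


def herfindahl_hirschman_index(values):
    """Sum of squared shares; ranges from 1/n (even) to 1 (all in one unit)."""
    total = sum(values)
    return sum((v / total) ** 2 for v in values)


def location_quotient(local_count, local_population, total_count, total_population):
    """Ratio of the local per-capita rate to the overall per-capita rate."""
    return (local_count / local_population) / (total_count / total_population)
```

    With the number of HDOs per county as `values`, a Gini index near 0 would indicate that offices are spread evenly across counties, and a location quotient above 1 would indicate a county with more offices per resident than the national rate.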

    Using the SPSS 29.0.1.0 program, we performed the following statistical calculations with the databases Data_HDOs_population_without_outliers.sav and Data_HDOs_population.sav:

    • Regression curve estimation with elderly population and number of HDOs, excluding outlier values (Types of analyzed equations: Linear, Logarithmic, Inverse, Quadratic, Cubic, Compound, Power, S, Growth, Exponential, Logistic, with summary and ANOVA analysis table): Curve_estimation_elderly_without_outlier.spv
    • Pearson correlation table between the total population, elderly population, and number of HDOs per county, excluding outlier values such as Budapest and Pest County: Pearson_Correlation_populations_HDOs_number_without_outliers.spv.
    • Dot diagram including total population and number of HDOs per county, excluding outlier values such as Budapest and Pest Counties: Dot_HDO_total_population_without_outliers.spv.
    • Dot diagram including elderly (64<) population and number of HDOs per county, excluding outlier values such as Budapest and Pest Counties: Dot_HDO_elderly_population_without_outliers.spv
    • Regression curve estimation with total population and number of HDOs, excluding outlier values (Types of analyzed equations: Linear, Logarithmic, Inverse, Quadratic, Cubic, Compound, Power, S, Growth, Exponential, Logistic, with summary and ANOVA analysis table): Curve_estimation_without_outlier.spv
    • Dot diagram including elderly (64<) population and number of HDOs per county: Dot_HDO_elderly_population.spv
    • Dot diagram including total population and number of HDOs per county: Dot_HDO_total_population.spv
    • Pearson correlation table between the total population, elderly population, and number of HDOs per county: Pearson_Correlation_populations_HDOs_number.spv
    • Regression curve estimation with total population and number of HDOs, (Types of analyzed equations: Linear, Logarithmic, Inverse, Quadratic, Cubic, Compound, Power, S, Growth, Exponential, Logistic, with summary and ANOVA analysis table): Curve_estimation_total_population.spv

    For easier readability, the files have been provided in both SPV and PDF formats.

    The translation of these supplementary files into English was completed on 23rd Sept. 2024.

    If you have any further questions regarding the dataset, please contact the corresponding author: domjan.peter@phd.semmelweis.hu

  5. Rate of change of coastlines in Africa

    • africageoportal.com
    • deafrica.africageoportal.com
    • +2 more
    Updated Oct 26, 2022
    Cite
    Africa GeoPortal (2022). Rate of change of coastlines in Africa [Dataset]. https://www.africageoportal.com/maps/africageoportal::rate-of-change-of-coastlines-in-africa
    Dataset authored and provided by
    Africa GeoPortal
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The Digital Earth Africa Rates of change of coastlines dataset is a point dataset providing robust rates of coastal change (in metres per year) for every 30 m along Africa’s non-rocky (e.g. sandy and muddy) coastlines. These rates are calculated by linearly regressing annual shoreline positions against time, using the most recent shoreline as a baseline. Negative values (red points) indicate retreat (e.g. erosion), and positive values indicate growth (e.g. progradation) over time. By default, rates of change are shown for points with a statistically significant trend over time only.

    Key Properties

    • Geographic Coverage: Continental Africa - approximately 37° North to 35° South
    • Temporal Coverage: 2000 to Present
    • Spatial Resolution: 30 x 30 meter
    • Update Frequency: Annual from 2000 - Present; 6 months from end of previous year
    • Parent Dataset: Landsat Collection 2 Surface Reflectance
    • Source Data Coordinate System: WGS 84 / NSIDC EASE-Grid 2.0 Global (EPSG:6933)
    • Service Coordinate System: WGS 84 / NSIDC EASE-Grid 2.0 Global (EPSG:6933)

    Rates of change statistics attributes:

    Attribute

    rate_time

    Annual rates of change (in metres per year) calculated by linearly regressing annual shoreline distances against time (excluding outliers). Negative values indicate retreat and positive values indicate growth.

    sig_time

    Significance (p-value) of the linear relationship between annual shoreline distances and time. Small values (e.g. p-value < 0.01) may indicate a coastline is undergoing consistent coastal change through time.

    se_time

    Standard error (in metres) of the linear relationship between annual shoreline distances and time. This can be used to generate confidence intervals around the rate of change given by rate_time (e.g. 95% confidence interval = se_time * 1.96).

    outl_time

    Individual annual shorelines are noisy estimators of coastline position that can be influenced by environmental conditions (e.g. clouds, breaking waves, sea spray) or modelling issues (e.g. poor tidal modelling results or limited clear satellite observations). To obtain reliable rates of change, outlier shorelines are excluded using a robust Median Absolute Deviation outlier detection algorithm, and recorded in this column.

    sce

    Shoreline Change Envelope (SCE). A measure of the maximum change or variability across all annual shorelines, calculated by computing the maximum distance between any two annual shorelines (excluding outliers). This statistic excludes sub-annual shoreline variability.

    nsm

    Net Shoreline Movement (NSM). The distance between the oldest (2000) and most recent annual shoreline (excluding outliers). Negative values indicate the coastline retreated between the oldest and most recent shoreline; positive values indicate growth. This statistic does not reflect sub-annual shoreline variability, so will underestimate the full extent of variability at any given location.

    max_year, min_year

    The year that annual shorelines were at their maximum (i.e. located furthest towards the ocean) and their minimum (i.e. located furthest inland) respectively (excluding outliers). This statistic excludes sub-annual shoreline variability.

    More details on this dataset can be found here.
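
    To make the rate_time, se_time and outl_time attributes concrete, the sketch below regresses annual shoreline distances against time after excluding Median Absolute Deviation outliers. It is an illustration under assumptions rather than the Digital Earth Africa implementation: the function names, the modified z-score form (0.6745 constant) and the 3.5 threshold are assumed conventions, se_time is taken to be the standard error of the slope, and the p-value (sig_time) is omitted.

```python
from math import sqrt
from statistics import median


def mad_outlier_flags(values, threshold=3.5):
    """Flag values whose modified z-score exceeds `threshold` (assumed convention)."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return [False] * len(values)
    return [0.6745 * abs(v - med) / mad > threshold for v in values]


def rate_of_change(years, distances):
    """Least-squares slope of distance against year, ignoring flagged outliers."""
    flags = mad_outlier_flags(distances)
    xs = [y for y, flagged in zip(years, flags) if not flagged]
    ys = [d for d, flagged in zip(distances, flags) if not flagged]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residual_ss = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    se_slope = sqrt(residual_ss / (n - 2) / sxx)  # standard error of the slope
    return {
        "rate_time": slope,                    # metres per year
        "se_time": se_slope,
        "ci95_half_width": 1.96 * se_slope,    # rate_time +/- this value
        "outl_time": [y for y, flagged in zip(years, flags) if flagged],
    }
```

    The 95% confidence interval for a point is then rate_time minus and plus `ci95_half_width`, which matches the "se_time * 1.96" rule given in the attribute description.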

  6. Prevalence of anomalies from 2003 to 2012, the annual proportional change...

    • datasetcatalog.nlm.nih.gov
    Updated Apr 5, 2018
    Cite
    Pierini, Anna; Greenlees, Ruth; Bergman, Jorieke E. H.; Tucker, David; Gatt, Miriam; Addor, Marie-Claude; Lynch, Catherine; Dolk, Helen; Kurinczuk, Jennifer; Arriola, Larraitz; Morris, Joan K.; Barisic, Ingeborg; Randrianaivo, Hanitra; Garne, Ester; Dias, Carlos; O'Mahony, Mary; McDonnell, Robert; Loane, Maria; Verellen-Dumoulin, Christine; Queisser-Luft, Annette; Rissmann, Anke; Draper, Elizabeth S.; Rankin, Judith; Csaky-Szunyogh, Melinda; Nelen, Vera; Wellesley, Diana; Neville, Amanda J.; Khoshnood, Babak; Springett, Anna L.; Klungsoyr, Kari (2018). Prevalence of anomalies from 2003 to 2012, the annual proportional change during this period and the adjusted annual proportional change after excluding outliers for the 17 anomaly subgroups with statistically significant trends identified in Fig 1. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000709144
    Authors
    Pierini, Anna; Greenlees, Ruth; Bergman, Jorieke E. H.; Tucker, David; Gatt, Miriam; Addor, Marie-Claude; Lynch, Catherine; Dolk, Helen; Kurinczuk, Jennifer; Arriola, Larraitz; Morris, Joan K.; Barisic, Ingeborg; Randrianaivo, Hanitra; Garne, Ester; Dias, Carlos; O'Mahony, Mary; McDonnell, Robert; Loane, Maria; Verellen-Dumoulin, Christine; Queisser-Luft, Annette; Rissmann, Anke; Draper, Elizabeth S.; Rankin, Judith; Csaky-Szunyogh, Melinda; Nelen, Vera; Wellesley, Diana; Neville, Amanda J.; Khoshnood, Babak; Springett, Anna L.; Klungsoyr, Kari
    Description

    Prevalence of anomalies from 2003 to 2012, the annual proportional change during this period and the adjusted annual proportional change after excluding outliers for the 17 anomaly subgroups with statistically significant trends identified in Fig 1.

  7. Data from: On Tracer Breakthrough Curve Dataset Size, Shape, and Statistical...

    • res1catalogd-o-tdatad-o-tgov.vcapture.xyz
    • catalog.data.gov
    • +1 more
    Updated Dec 14, 2020
    Cite
    U.S. EPA Office of Research and Development (ORD) (2020). On Tracer Breakthrough Curve Dataset Size, Shape, and Statistical Distribution [Dataset]. https://res1catalogd-o-tdatad-o-tgov.vcapture.xyz/dataset/on-tracer-breakthrough-curve-dataset-size-shape-and-statistical-distribution
    Dataset provided by
    United States Environmental Protection Agency (http://www.epa.gov/)
    Description

    A tracer breakthrough curve (BTC) for each sampling station is the ultimate goal of every quantitative hydrologic tracing study, and dataset size can critically affect the BTC. Groundwater-tracing data obtained using in situ automatic sampling or detection devices may result in very high-density data sets. Data-dense tracer BTCs obtained using in situ devices and stored in dataloggers can result in visually cluttered overlapping data points. The relatively large amounts of data detected by high-frequency settings available on in situ devices and stored in dataloggers ensure that important tracer BTC features, such as data peaks, are not missed. However, such dense datasets can also be difficult to interpret. Even more difficult is the application of such dense data sets in solute-transport models that may not be able to adequately reproduce tracer BTC shapes due to the overwhelming mass of data. One solution to the difficulties associated with analyzing, interpreting, and modeling dense data sets is the selective removal of blocks of the data from the total dataset. Although it is possible to arrange to skip blocks of tracer BTC data in a periodic sense (data decimation) so as to lessen the size and density of the dataset, skipping or deleting blocks of data also may result in missing the important features that the high-frequency detection setting efforts were intended to detect. Rather than removing, reducing, or reformulating data overlap, signal filtering and smoothing may be utilized, but smoothing errors (e.g., averaging errors, outliers, and potential time shifts) need to be considered. Probability distributions appropriate to tracer BTCs may be used to describe typical tracer BTC shapes, which usually include long tails. Recognizing appropriate probability distributions applicable to tracer BTCs can help in understanding some aspects of the tracer migration.

    This dataset is associated with the following publications:

    • Field, M. Tracer-Test Results for the Central Chemical Superfund Site, Hagerstown, Md. May 2014 -- December 2015. U.S. Environmental Protection Agency, Washington, DC, USA, 2017.
    • Field, M. On Tracer Breakthrough Curve Dataset Size, Shape, and Statistical Distribution. ADVANCES IN WATER RESOURCES. Elsevier Science Ltd, New York, NY, USA, 141: 1-19, (2020).
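
    The trade-off between decimation and smoothing discussed in the description can be illustrated with a toy series. The sketch below is illustrative only: the function names and the concentration values are invented for the example and are not taken from the dataset.

```python
def decimate(samples, step):
    """Keep every `step`-th sample, starting with the first."""
    return samples[::step]


def moving_average(samples, window):
    """Centred moving average; the window is truncated at the series ends."""
    half = window // 2
    smoothed = []
    for i in range(len(samples)):
        lo = max(0, i - half)
        hi = min(len(samples), i + half + 1)
        smoothed.append(sum(samples[lo:hi]) / (hi - lo))
    return smoothed


# Toy breakthrough curve with a narrow peak of 10 at index 4
btc = [0, 0, 1, 2, 10, 2, 1, 0, 0, 0]
decimated = decimate(btc, 3)       # keeps indices 0, 3, 6, 9 and skips the peak
smoothed = moving_average(btc, 3)  # keeps the peak location but lowers its height
```

    Here decimation drops the peak sample entirely, while the moving average preserves where the peak occurs at the cost of an averaging error in its magnitude, which is the kind of smoothing error the description warns about.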

  8. Digital Earth Australia Coastlines

    • digital.atlas.gov.au
    Updated Mar 13, 2025
    Cite
    Digital Atlas of Australia (2025). Digital Earth Australia Coastlines [Dataset]. https://digital.atlas.gov.au/maps/36b0acf3d8a5439199b9a42a06011d20
    Dataset authored and provided by
    Digital Atlas of Australia
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Abstract

    Digital Earth Australia Coastlines is a continental dataset that includes annual shorelines and rates of coastal change along the entire Australian coastline from 1988 to the present. The product combines satellite data from Geoscience Australia's Digital Earth Australia program with tidal modelling to map the most representative location of the shoreline at mean sea level for each year. The product enables trends of coastal retreat and growth to be examined annually at both a local and continental scale, and for patterns of coastal change to be mapped historically and updated regularly as data continues to be acquired. This allows current rates of coastal change to be compared with that observed in previous years or decades. The ability to map shoreline positions for each year provides valuable insights into whether changes to our coastline are the result of particular events or actions, or a process of more gradual change over time. This information can enable scientists, managers and policy makers to assess impacts from the range of drivers impacting our coastlines and potentially assist planning and forecasting for future scenarios.

    The DEA Coastlines product contains five layers:

    • Annual shorelines
    • Rates of change points
    • Coastal change hotspots (1 km)
    • Coastal change hotspots (5 km)
    • Coastal change hotspots (10 km)

    Annual shorelines

    Annual shoreline vectors that represent the median or ‘most representative’ position of the shoreline at approximately 0 m Above Mean Sea Level for each year since 1988. Dashed shorelines have low certainty.

    Rates of change points

    A point dataset providing robust rates of coastal change for every 30 m along Australia’s non-rocky coastlines. The most recent annual shoreline is used as a baseline for measuring rates of change. Points are shown for locations with statistically significant rates of change (p-value <= 0.01; see sig_time below) and good quality data (certainty = "good"; see certainty below) only. Each point shows annual rates of change (in metres per year; see rate_time below), and an estimate of uncertainty in brackets (95% confidence interval; see se_time). For example, there is a 95% chance that a point with a label -10.0 m (±1.0 m) is retreating at a rate of between -9.0 and -11.0 metres per year.

    Coastal change hotspots (1 km, 5 km, 10 km)

    Three points layers summarising coastal change within moving 1 km, 5 km and 10km windows along the coastline. These layers are useful for visualising regional or continental-scale patterns of coastal change.

    Currency

    Date modified: August 2023
    Modification frequency: Annually

    Data extent

    Spatial extent: North: -9° South: -44° East: 154° West: 112°
    Temporal extent: From 1988 to Present

    Source information

    • Product description and metadata
    • Digital Earth Australia Coastlines catalog entry
    • Data download
    • Interactive Map

    Lineage statement

    The DEA Coastlines product is under active development. A full and current product description is best sourced from the DEA Coastlines website. For a full summary of changes made in previous versions, refer to Github.

    Data dictionary

    Layer attribute columns

    Annual shorelines

    Attribute name Description

    OBJECTID Automatically generated system ID

    year The year of each annual shoreline

    certainty A column providing important data quality flags for each annual shoreline (see the Quality assurance section of the product description and metadata page for more detail about each data quality flag)

    tide_datum The tide datum of each annual shoreline (e.g. "0 m AMSL")

    id_primary The name of the annual shoreline's Primary sediment compartment from the Australian Coastal Sediment Compartments framework

    Rates of change points and Coastal change hotspots

    Attribute name Description

    OBJECTID Automatically generated system ID

    uid A unique geohash identifier for each point

    rate_time Annual rates of change (in metres per year) calculated by linearly regressing annual shoreline distances against time (excluding outliers). Negative values indicate retreat and positive values indicate growth

    sig_time Significance (p-value) of the linear relationship between annual shoreline distances and time. Small values (e.g. p-value < 0.01 or 0.05) may indicate a coastline is undergoing consistent coastal change through time

    se_time Standard error (in metres) of the linear relationship between annual shoreline distances and time. This can be used to generate confidence intervals around the rate of change given by rate_time (e.g. 95% confidence interval = se_time * 1.96).

    outl_time Individual annual shorelines are noisy estimators of coastline position that can be influenced by environmental conditions (e.g. clouds, breaking waves, sea spray) or modelling issues (e.g. poor tidal modelling results or limited clear satellite observations). To obtain reliable rates of change, outlier shorelines are excluded using a robust Median Absolute Deviation outlier detection algorithm, and recorded in this column

    dist_1990, dist_1991, etc Annual shoreline distances (in metres) relative to the most recent baseline shoreline. Negative values indicate that an annual shoreline was located inland of the baseline shoreline. By definition, the most recent baseline column will always have a distance of 0 m

    angle_mean, angle_std The mean and standard deviation of the angles from the baseline point to all annual shorelines. This data is used to calculate how well shorelines fall along a consistent line; high angular standard deviation indicates that derived rates of change are unlikely to be correct

    valid_obs, valid_span The total number of valid (i.e. non-outliers, non-missing) annual shoreline observations, and the maximum number of years between the first and last valid annual shoreline

    sce Shoreline Change Envelope (SCE). A measure of the maximum change or variability across all annual shorelines, calculated by computing the maximum distance between any two annual shorelines (excluding outliers). This statistic excludes sub-annual shoreline variability like tides, storms and seasonal effects

    nsm Net Shoreline Movement (NSM). The distance between the oldest (1988) and most recent annual shoreline (excluding outliers). Negative values indicate the coastline retreated between the oldest and most recent shoreline; positive values indicate growth. This statistic does not reflect sub-annual shoreline variability, so will underestimate the full extent of variability at any given location

    max_year, min_year The year that annual shorelines were at their maximum (i.e. located furthest towards the ocean) and their minimum (i.e. located furthest inland) respectively (excluding outliers). This statistic excludes sub-annual shoreline variability

    certainty A column providing important data quality flags for each annual shoreline (see the Quality assurance section of the product description and metadata page for more detail about each data quality flag)

    id_primary The name of the point's Primary sediment compartment from the Australian Coastal Sediment Compartments framework
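As a rough illustration of how the rates-of-change attributes relate to one another, the sketch below recomputes rate_time, se_time, outl_time, sce and nsm for a single point from its dist_YYYY values. It is not the DEA Coastlines implementation: the MAD cutoff of 3.0, the 1.4826 scaling factor, and the function names are assumptions made for this example, and sig_time is omitted because its p-value needs a t-distribution that the Python standard library does not provide.

```python
import math
import statistics


def mad_outliers(values, threshold=3.0):
    """Flag values whose robust z-score exceeds `threshold` (assumed cutoff)."""
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    if mad == 0:
        return [False] * len(values)
    # 1.4826 rescales the MAD so it is comparable with a standard deviation.
    return [abs(v - med) / (1.4826 * mad) > threshold for v in values]


def shoreline_stats(years, dists, threshold=3.0):
    """Summarise one point from its annual distances (metres, relative to baseline)."""
    flags = mad_outliers(dists, threshold)
    xs = [y for y, f in zip(years, flags) if not f]
    ys = [d for d, f in zip(dists, flags) if not f]
    n = len(xs)

    # Ordinary least-squares fit of distance against year, outliers excluded.
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    se = math.sqrt(rss / (n - 2) / sxx)

    return {
        "rate_time": slope,                      # metres per year
        "se_time": se,
        "ci95": (slope - 1.96 * se, slope + 1.96 * se),
        "outl_time": [y for y, f in zip(years, flags) if f],
        "valid_obs": n,
        "sce": max(ys) - min(ys),                # largest spread between shorelines
        "nsm": ys[-1] - ys[0],                   # most recent minus oldest
    }


# Made-up example: steady 1 m/yr retreat with one anomalous year (2004).
years = list(range(2000, 2010))
dists = [9, 8, 7, 6, 50, 4, 3, 2, 1, 0]
stats = shoreline_stats(years, dists)
print(stats["rate_time"], stats["outl_time"], stats["sce"], stats["nsm"])
```

With the anomalous year excluded, the remaining points fall on a straight line, so the fitted rate is negative (retreat) and the confidence interval is very narrow.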

    Contact Geoscience Australia, clientservices@ga.gov.au

  9. High-quality RNA residues: RNA2023

    • zenodo.org
    • data.niaid.nih.gov
    csv, txt, zip
    Updated Jul 28, 2023
    Christopher Williams; Jane Richardson (2023). High-quality RNA residues: RNA2023 [Dataset]. http://doi.org/10.5281/zenodo.8103014
    Available download formats: zip, txt, csv
    Dataset updated
    Jul 28, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Christopher Williams; Jane Richardson
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Introduction
    --------------------------------------------------------------------------------
    This is the RNA2023 dataset by the Richardson Lab at Duke University

    These are high-quality residues from high-quality, low-redundancy RNA chains in the PDB.

    For a similar set of quality-filtered protein residues, see the top2018 datasets at:
    https://doi.org/10.5281/zenodo.4626149
    https://doi.org/10.5281/zenodo.5115232

    Corresponding authors
    --------------------------------------------------------------------------------
    dcrjsr at kinemage.biochem.duke.edu
    christopher.sci.williams at gmail.com


    Usage recommendations
    --------------------------------------------------------------------------------
    RNA residues that fail the filtering criteria described below have been removed from the files. As a result, these files can be considered pre-filtered and will return only results for residues of good model quality with supporting experimental data.

    Files already contain hydrogens added by Reduce in the context of the original full models.

    Two datasets are provided. The standard dataset is rna2023_pruned. We recommend this version as the default. The RNA backbone conformational space is highly diverse, and some real conformations fall below the statistical threshold for recognition as suites. Therefore we do not recommend excluding suite outliers from the dataset except in specialty cases. We also provide an rna2023_nosuiteout dataset. In this case, no residues having "!!" outlier suite identifications are permitted. This set may be useful in specialist cases where suite outliers are undesirable or where losing some real conformations is an acceptable sacrifice for maximal filtering.

    Each dataset also has a mmCIF version.

    Note: Chains are named based on author chain ids, except for 8b0x, chain a. To avoid conflicts with 8b0x chain A in file systems that do not support case-sensitive file names, 8b0x chain a has been renamed to chain AB, matching its PDB/mmCIF designation.


    Additional files
    --------------------------------------------------------------------------------
    rna2023_pdbmetadata.csv contains information on release date, resolution, title, authors, etc for each source pdb.

    rna2023_chain_list contains a list of all included chains, plus statistics on the number of residues from each original chain that passed the quality filters.

    rna2023_suitename_table.csv and rna2023_suitename_table_nosuiteout.csv contain suitename identifications of rotameric RNA backbone conformations (1a, 1c, 2u, 6d, etc) precomputed for convenience.


    Filtering criteria: Chain level
    --------------------------------------------------------------------------------
    The chain list was derived from http://rna.bgsu.edu/rna3dhub/nrlist, version 3.150 as of 2020/10/28, with a 1.9Å resolution cutoff.

    We added 6ugg chain A and two recent EM ribosome structures: 8a3d and 8b0x.

    After residue-level filtering, chains with no complete suites were removed.


    Filtering criteria: Residue level
    --------------------------------------------------------------------------------
    Even excellent structures usually contain some poorly-resolved regions. Residue-level filtering helps avoid including these regions in otherwise high-quality data.

    Residues are required to meet the following validation quality criteria:
    No sugar pucker outliers
    No steric overlaps or "clashes", as per Probe >= 0.5Å
    No covalent bond or angle geometry outliers
    Optionally, no !! suite outliers

    Residues from X-ray structures are required to meet the following fit-to-map criteria:
    Average of worst 2 atoms' 2Fo-Fc map values >= 1.2
    Average of worst 2 atoms' RSCC scores >= 0.7
    No atoms modeled at partial occupancy

    Residues from EM structures are required to meet the following fit-to-map criteria:
    RSCC >= 0.7
    Residue inclusion fraction = 1.0 or >= 0.95, depending on structure
    No atoms modeled at partial occupancy
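
The residue-level criteria above amount to a single pass/fail test per residue. The sketch below expresses that test in Python. The `Residue` record and all of its field names are hypothetical and invented for this example (the dataset files do not expose such a structure); only the thresholds are taken from the criteria listed above.

```python
from dataclasses import dataclass


@dataclass
class Residue:
    """Hypothetical per-residue validation record (field names are assumptions)."""
    method: str                       # "xray" or "em"
    pucker_outlier: bool
    geometry_outlier: bool            # covalent bond or angle outlier
    suite_outlier: bool               # "!!" suite identification
    worst_clash: float                # largest Probe overlap in Å, 0.0 if none
    partial_occupancy: bool
    rscc: float                       # X-ray: average of the worst 2 atoms' RSCC
    map_value: float = 0.0            # X-ray only: average of worst 2 atoms' 2Fo-Fc
    inclusion_fraction: float = 1.0   # EM only


def passes_filters(res, allow_suite_outliers=True, min_inclusion=1.0):
    """Return True if the residue meets the listed criteria.

    allow_suite_outliers=True mirrors rna2023_pruned; False mirrors
    rna2023_nosuiteout. min_inclusion is 1.0 or 0.95 depending on structure.
    """
    if res.pucker_outlier or res.geometry_outlier:
        return False
    if res.worst_clash >= 0.5:        # steric overlap of 0.5 Å or more
        return False
    if res.suite_outlier and not allow_suite_outliers:
        return False
    if res.partial_occupancy:
        return False
    if res.rscc < 0.7:
        return False
    if res.method == "xray":
        return res.map_value >= 1.2
    if res.method == "em":
        return res.inclusion_fraction >= min_inclusion
    return False                      # unknown experimental method


good = Residue("xray", False, False, True, 0.2, False, rscc=0.85, map_value=1.6)
print(passes_filters(good))                              # suite outliers allowed
print(passes_filters(good, allow_suite_outliers=False))  # stricter variant
```

Keeping the suite-outlier check behind a flag reflects the two published variants: the same residue can be present in one dataset and absent from the other.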

    Filtering is documented in each pruned file. See USER DOC lines in .pdb and data_rna2023_dataset loops in .cif


    Version history
    --------------------------------------------------------------------------------
    Version 1.0 Jun 30, 2023
    Initial version

  10. Stone House Value Statistics and Reports

    • m0ve.com
    csv
    Updated May 1, 2025
    M0VE (2025). Stone House Value Statistics and Reports [Dataset]. https://m0ve.com/house-prices/stone/
    Available download formats: csv
    Dataset updated
    May 1, 2025
    Dataset authored and provided by
    M0VE
    License

    https://m0ve.com/terms-of-use

    Time period covered
    Jan 1, 2021 - May 1, 2025
    Area covered
    Stone, Staffordshire
    Measurement technique
    Historic records are steadily inflated using postcode-led movement in price per square foot
    GOV UK records are paired with Land Registry data to create richer, more truthful property definitions (particularly in terms of size, form, and structure)
    Carefully adjusting prices based on how each energy score compares to similar homes nearby, bringing sharp consistency across a postcode
    Homes flagged as new builds are treated using postcode-based pricing models drawn from comparable local listings
    Irregular transactions are removed using patterns in cost-per-foot, sale clustering, or unusual data pairings
    Description

    This Stone housing dataset, produced by M0VE.com, spans the period from 2021 to 2025 and delivers a focused view of residential pricing based on real, publicly recorded data. Land Registry sales, EPC classifications, and council-sourced records form the backbone of this dataset, but what sets it apart is how the raw inputs are refined. Each entry is rebalanced for inflation, filtered to exclude outliers, and modified to reflect nuanced variables like energy efficiency, build type, and whether the home is newly constructed or long established. Pricing is tracked annually and organised by property category, with helpful comparisons to nearby areas to frame each figure in meaningful context. The dataset is structured for clarity, not clutter, and is intended for use by people who need grounded, reliable pricing.

  11. Monthly OpenET Image Collections (v2.0) Summarized by 12-Digit Hydrologic...

    • catalog.data.gov
    • data.usgs.gov
    Updated Nov 23, 2024
    + more versions
    U.S. Geological Survey (2024). Monthly OpenET Image Collections (v2.0) Summarized by 12-Digit Hydrologic Unit Codes, 2008-2023 [Dataset]. https://catalog.data.gov/dataset/monthly-openet-image-collections-v2-0-summarized-by-12-digit-hydrologic-unit-codes-2008-20
    Dataset updated
    Nov 23, 2024
    Dataset provided by
    U.S. Geological Survey
    Description

    This dataset provides monthly summaries of evapotranspiration (ET) data from OpenET v2.0 image collections for the period 2008-2023 for all National Watershed Boundary Dataset subwatersheds (12-digit hydrologic unit codes [HUC12s]) in the US that overlap the spatial extent of OpenET datasets. For each HUC12, this dataset contains spatial aggregation statistics (minimum, mean, median, and maximum) for each of the ET variables from each of the publicly available image collections from OpenET for the six available models (DisALEXI, eeMETRIC, geeSEBAL, PT-JPL, SIMS, SSEBop) and the Ensemble image collection, which is a pixel-wise ensemble of all 6 individual models after filtering and removal of outliers according to the median absolute deviation approach (Melton and others, 2022).

    Data are available in this data release in two different formats: comma-separated values (CSV) and parquet, a high-performance format that is optimized for storage and processing of columnar data. CSV files containing data for each 4-digit HUC are grouped by 2-digit HUCs for easier access of regional data, and the single parquet file provides convenient access to the entire dataset.

    For each of the ET models (DisALEXI, eeMETRIC, geeSEBAL, PT-JPL, SIMS, SSEBop), variables in the model-specific CSV data files include:
    - huc12: The 12-digit hydrologic unit code
    - ET: Actual evapotranspiration (in millimeters) over the HUC12 area in the month calculated as the sum of daily ET interpolated between Landsat overpasses
    - statistic: Max, mean, median, or min. Statistic used in the spatial aggregation within each HUC12. For example, maximum ET is the maximum monthly pixel ET value occurring within the HUC12 boundary after summing daily ET in the month
    - year: 4-digit year
    - month: 2-digit month
    - count: Number of Landsat overpasses included in the ET calculation in the month
    - et_coverage_pct: Integer percentage of the HUC12 with ET data, which can be used to determine how representative the ET statistic is of the entire HUC12
    - count_coverage_pct: Integer percentage of the HUC12 with count data, which can be different than the et_coverage_pct value because the “count” band in the source image collection extends beyond the “et” band in the eastern portion of the image collection extent

    For the Ensemble data, these additional variables are included in the CSV files:
    - et_mad: Ensemble ET value, computed as the mean of the ensemble after filtering outliers using the median absolute deviation (MAD)
    - et_mad_count: The number of models used to compute the ensemble ET value after filtering for outliers using the MAD
    - et_mad_max: The maximum value in the ensemble range, after filtering for outliers using the MAD
    - et_mad_min: The minimum value in the ensemble range, after filtering for outliers using the MAD
    - et_sam: A simple arithmetic mean (across the 6 models) of actual ET average without outlier removal

    Below are the locations of each OpenET image collection used in this summary:
    DisALEXI: https://developers.google.com/earth-engine/datasets/catalog/OpenET_DISALEXI_CONUS_GRIDMET_MONTHLY_v2_0
    eeMETRIC: https://developers.google.com/earth-engine/datasets/catalog/OpenET_EEMETRIC_CONUS_GRIDMET_MONTHLY_v2_0
    geeSEBAL: https://developers.google.com/earth-engine/datasets/catalog/OpenET_GEESEBAL_CONUS_GRIDMET_MONTHLY_v2_0
    PT-JPL: https://developers.google.com/earth-engine/datasets/catalog/OpenET_PTJPL_CONUS_GRIDMET_MONTHLY_v2_0
    SIMS: https://developers.google.com/earth-engine/datasets/catalog/OpenET_SIMS_CONUS_GRIDMET_MONTHLY_v2_0
    SSEBop: https://developers.google.com/earth-engine/datasets/catalog/OpenET_SSEBOP_CONUS_GRIDMET_MONTHLY_v2_0
    Ensemble: https://developers.google.com/earth-engine/datasets/catalog/OpenET_ENSEMBLE_CONUS_GRIDMET_MONTHLY_v2_0
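
To make the relationship between the Ensemble variables concrete, the sketch below derives et_mad, et_mad_count, et_mad_min, et_mad_max and et_sam from six model values for one location and month. It is only an illustration of MAD-based filtering: the cutoff of 2.0 MADs, the function name, and the sample numbers are assumptions, and the exact rule used by OpenET is the one described in Melton and others (2022), not this code.

```python
import statistics


def ensemble_et(model_values, threshold=2.0):
    """Summarise model ET values (mm) with and without MAD outlier filtering.

    `threshold` is an assumed cutoff, expressed in multiples of the MAD.
    """
    med = statistics.median(model_values)
    mad = statistics.median([abs(v - med) for v in model_values])
    # Keep only values within `threshold` MADs of the median.
    kept = [v for v in model_values if abs(v - med) <= threshold * mad]
    return {
        "et_mad": statistics.mean(kept),
        "et_mad_count": len(kept),
        "et_mad_min": min(kept),
        "et_mad_max": max(kept),
        "et_sam": statistics.mean(model_values),  # no outlier removal
    }


# Made-up monthly ET values (mm) for six models; the last one is an outlier.
summary = ensemble_et([100, 102, 98, 101, 99, 160])
print(summary)
```

Comparing et_mad with et_sam shows how much a single divergent model can shift a plain average, which is the motivation for reporting both.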

  12. Quantitative Characterization of Cellular Membrane-Receptor Heterogeneity...

    • figshare.com
    tiff
    Updated Jun 5, 2023
    Jared C. Weddell; P. I. Imoukhuede (2023). Quantitative Characterization of Cellular Membrane-Receptor Heterogeneity through Statistical and Computational Modeling [Dataset]. http://doi.org/10.1371/journal.pone.0097271
    Available download formats: tiff
    Dataset updated
    Jun 5, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Jared C. Weddell; P. I. Imoukhuede
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Cell population heterogeneity can affect cellular response and is a major factor in drug resistance. However, there are few techniques available to represent and explore how heterogeneity is linked to population response. Recent high-throughput genomic, proteomic, and cellomic approaches offer opportunities for profiling heterogeneity on several scales. We have recently examined heterogeneity in vascular endothelial growth factor receptor (VEGFR) membrane localization in endothelial cells. We and others processed the heterogeneous data through ensemble averaging and integrated the data into computational models of anti-angiogenic drug effects in breast cancer. Here we show that additional modeling insight can be gained when cellular heterogeneity is considered. We present comprehensive statistical and computational methods for analyzing cellomic data sets and integrating them into deterministic models. We present a novel method for optimizing the fit of statistical distributions to heterogeneous data sets to preserve important data and exclude outliers. We compare methods of representing heterogeneous data and show methodology can affect model predictions up to 3.9-fold. We find that VEGF levels, a target for tuning angiogenesis, are more sensitive to VEGFR1 cell surface levels than VEGFR2; updating VEGFR1 levels in the tumor model gave a 64% change in free VEGF levels in the blood compartment, whereas updating VEGFR2 levels gave a 17% change. Furthermore, we find that subpopulations of tumor cells and tumor endothelial cells (tEC) expressing high levels of VEGFR (>35,000 VEGFR/cell) negate anti-VEGF treatments. We show that lowering the VEGFR membrane insertion rate for these subpopulations recovers the anti-angiogenic effect of anti-VEGF treatment, revealing new treatment targets for specific tumor cell subpopulations. 
This novel method of characterizing heterogeneous distributions shows for the first time how different representations of the same data set lead to different predictions of drug efficacy.

  13. Analysis of covariance.

    • datasetcatalog.nlm.nih.gov
    Updated May 24, 2017
    The citation is currently not available for this dataset.
    Dataset updated
    May 24, 2017
    Authors
    Chen, Chii-Shiarng; Mayfield, Anderson B.; Dempsey, Alexandra C.
    Description

    Since MANOVA revealed that a negative relationship between biological composition (the RNA/DNA ratio and the Symbiodinium genome copy proportion [GCP]) and gene expression significantly distinguished outliers from non-outliers (Fig 4C), multivariate analysis of covariance (MANCOVA) and univariate ANCOVA were performed on the multivariate means (excluding size data) and individual gene expression means, respectively. Only genes for which statistically significant interaction effects were documented between a biological composition parameter and outlier status (analyzed as a categorical variable: outlier [yes] vs. non-outlier [no]) have been presented. For MANCOVA, Wilks’ lambda values are shown for comparisons between a continuous variable and a categorical one, while Exact F values are shown between two continuous variables. For the multivariate data, individual correlations were tested between canonical scores (first axis only) and 1) the Symbiodinium GCP and 2) the RNA/DNA ratio. t = linear regression test statistic. *p<0.05. **p<0.01. ***p<0.0001.

  14. Evesham Housing Trends and Property Statistics

    • m0ve.com
    csv
    Updated May 1, 2025
    M0VE (2025). Evesham Housing Trends and Property Statistics [Dataset]. https://m0ve.com/house-prices/evesham/
    Available download formats: csv
    Dataset updated
    May 1, 2025
    Dataset authored and provided by
    M0VE
    License

    https://m0ve.com/terms-of-use

    Time period covered
    Jan 1, 2021 - May 1, 2025
    Area covered
    Worcestershire, Evesham
    Measurement technique
    Homes flagged as new builds are treated using postcode-based pricing models drawn from comparable local listings
    Energy scores are handled with calm moderation when they differ from the expected local performance range
    Quietly removing quirky and problematic sales (including auctions, commercial listings and multi-unit clusters) that don’t follow standard patterns
    Older transactions are brought forward using soft inflation curves tailored to each region’s historical profile
    Combining records from GOV UK with Land Registry sales to give every home a far more accurate shape (covering size, subtype and construction era)
    Description

    Compiled by M0VE.com and covering the years 2021 to 2025, this Evesham residential dataset is structured to remove common distortions and present a clear, honest picture of the local market. Source data includes Land Registry entries, EPC documentation, and council-maintained housing records. Every datapoint is passed through a deliberate refinement system that accounts for inflation, excludes statistical outliers, and adjusts pricing according to building age, energy classification, and home category. With year-on-year changes mapped out and prices grouped by property type, the dataset offers practical insights backed by comparable information from surrounding areas. It’s straightforward, consistent, and designed for actual use.

  15. Robustness test by sequentially removing the outliers.

    • plos.figshare.com
    xls
    Updated Jun 7, 2023
    Zhi Jiang; Longhai Tian; Wei Liu; Bo Song; Chao Xue; Tianzong Li; Jin Chen; Fang Wei (2023). Robustness test by sequentially removing the outliers. [Dataset]. http://doi.org/10.1371/journal.pone.0268757.t003
    Available download formats: xls
    Dataset updated
    Jun 7, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Zhi Jiang; Longhai Tian; Wei Liu; Bo Song; Chao Xue; Tianzong Li; Jin Chen; Fang Wei
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Robustness test by sequentially removing the outliers.

  16. Fitted proper motions for the DR solution - Dataset - B2FIND

    • b2find.eudat.eu
    Updated Aug 9, 2024
    (2024). Fitted proper motions for the DR solution - Dataset - B2FIND [Dataset]. https://b2find.eudat.eu/dataset/7d9147aa-767d-5218-8494-3b33322093d8
    Dataset updated
    Aug 9, 2024
    Description

    We propose new estimates of the secular aberration drift, mainly due to the rotation of the Solar System about the Galactic center, based on up-to-date VLBI observations and an improved method of outlier elimination. We fit degree-2 vector spherical harmonics to the extragalactic radio source proper motion field derived from geodetic VLBI observations spanning 1979-2013. We pay particular attention to the outlier elimination procedure to remove outliers from (i) radio source coordinate time series and (ii) the proper motion sample. We obtain more accurate values of the Solar System acceleration compared to those in our previous paper. The acceleration vector is oriented towards the Galactic center within ~7°. The component perpendicular to the Galactic plane is statistically insignificant. We show that an insufficient cleaning of the data set can lead to strong variations in the dipole amplitude and orientation, and statistically biased results.

  17. R code

    • figshare.com
    txt
    Updated Jun 5, 2017
    Christine Dodge (2017). R code [Dataset]. http://doi.org/10.6084/m9.figshare.5021297.v1
    Available download formats: txt
    Dataset updated
    Jun 5, 2017
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Christine Dodge
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    R code used for each data set to perform negative binomial regression, calculate the overdispersion statistic, generate summary statistics, and remove outliers.

  18. Other analyzed exposure nonpositive datasets.

    • plos.figshare.com
    xlsx
    Updated May 1, 2024
    Tongyi Li; Liangliang Geng; Yunjiao Yang; Guannan Liu; Haichen Li; Cong Long; Qiu Chen (2024). Other analyzed exposure nonpositive datasets. [Dataset]. http://doi.org/10.1371/journal.pone.0302485.s001
    Available download formats: xlsx
    Dataset updated
    May 1, 2024
    Dataset provided by
    PLOS ONE
    Authors
    Tongyi Li; Liangliang Geng; Yunjiao Yang; Guannan Liu; Haichen Li; Cong Long; Qiu Chen
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Background: The etiology of diabetic kidney disease is complex, and the role of lipoproteins and their lipid components in the development of the disease cannot be ignored. However, phospholipids are an essential component, and no Mendelian randomization studies have yet been conducted to examine potential causal associations between phospholipids and diabetic kidney disease.

    Methods: Relevant exposure and outcome datasets were obtained through the GWAS public database. The exposure datasets included various phospholipids, including those in LDL, IDL, VLDL, and HDL. IVW methods were the primary analytical approach. The accuracy of the results was validated by conducting heterogeneity, MR pleiotropy, and F-statistic tests. MR-PRESSO analysis was utilized to identify and exclude outliers.

    Results: Phospholipids in intermediate-density lipoprotein (OR: 0.8439; 95% CI: 0.7268–0.9798), phospholipids in large low-density lipoprotein (OR: 0.7913; 95% CI: 0.6703–0.9341), phospholipids in low-density lipoprotein (after removing outliers, OR: 0.788; 95% CI: 0.6698–0.9271), phospholipids in medium low-density lipoprotein (OR: 0.7682; 95% CI: 0.634–0.931), and phospholipids in small low-density lipoprotein (after removing outliers, OR: 0.8044; 95% CI: 0.6952–0.9309) were found to be protective factors.

    Conclusions: This study found that a higher proportion of phospholipids in intermediate-density lipoprotein and the various subfractions of low-density lipoprotein, including large LDL, medium LDL, and small LDL, is associated with a lower risk of developing diabetic kidney disease.
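
For readers unfamiliar with how an IVW analysis yields an odds ratio and confidence interval of the kind reported above, the following sketch applies the standard fixed-effect inverse-variance-weighted estimator to per-variant summary statistics. It is a generic textbook formulation, not the authors' pipeline; the function name and the input numbers are hypothetical, and no MR-PRESSO outlier step is included.

```python
import math


def ivw_odds_ratio(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW estimate from per-variant summary statistics.

    beta_exposure: variant effects on the exposure
    beta_outcome:  variant effects on the outcome (log-odds scale)
    se_outcome:    standard errors of beta_outcome
    Returns (odds_ratio, (ci_low, ci_high)) with a 95% interval.
    """
    # Per-variant Wald ratio and its inverse-variance weight.
    ratios = [by / bx for bx, by in zip(beta_exposure, beta_outcome)]
    weights = [bx ** 2 / se ** 2 for bx, se in zip(beta_exposure, se_outcome)]

    total = sum(weights)
    beta = sum(w * r for w, r in zip(weights, ratios)) / total
    se = math.sqrt(1.0 / total)

    low, high = beta - 1.96 * se, beta + 1.96 * se
    return math.exp(beta), (math.exp(low), math.exp(high))


# Hypothetical summary statistics for two genetic variants.
odds_ratio, (ci_low, ci_high) = ivw_odds_ratio(
    beta_exposure=[0.1, 0.2],
    beta_outcome=[-0.02, -0.04],
    se_outcome=[0.01, 0.01],
)
print(round(odds_ratio, 4), round(ci_low, 4), round(ci_high, 4))
```

An odds ratio below 1 whose interval also stays below 1 is what the results above describe as a protective factor.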

  19. Appendix A. Full model output for statistical analyses (including and...

    • wiley.figshare.com
    • figshare.com
    html
    Updated Jun 5, 2023
    Elsa E. Cleland; Jenica M. Allen; Theresa M. Crimmins; Jennifer A. Dunne; Stephanie Pau; Steven E. Travers; Erika S. Zavaleta; Elizabeth M. Wolkovich (2023). Appendix A. Full model output for statistical analyses (including and excluding outliers) and figures showing the relationship between phenological sensitivity and performance for flowering and vegetative phenology and proportional increase in dbh for species in the Harvard Forest Warming Experiment. [Dataset]. http://doi.org/10.6084/m9.figshare.3553812.v1
    Available download formats: html
    Dataset updated
    Jun 5, 2023
    Dataset provided by
    Wiley
    Authors
    Elsa E. Cleland; Jenica M. Allen; Theresa M. Crimmins; Jennifer A. Dunne; Stephanie Pau; Steven E. Travers; Erika S. Zavaleta; Elizabeth M. Wolkovich
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Full model output for statistical analyses (including and excluding outliers) and figures showing the relationship between phenological sensitivity and performance for flowering and vegetative phenology and proportional increase in dbh for species in the Harvard Forest Warming Experiment.

  20. Validation of the GPR model: Comparison between the estimates obtained with...

    • plos.figshare.com
    xls
    Updated Jun 2, 2023
    + more versions
    Marius Marinescu; Alberto Olivares; Ernesto Staffetti; Junzi Sun (2023). Validation of the GPR model: Comparison between the estimates obtained with the GPR method, with and without outliers, and the ECMWF ERA5 reanalysis data. [Dataset]. http://doi.org/10.1371/journal.pone.0276185.t007
    Available download formats: xls
    Dataset updated
    Jun 2, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Marius Marinescu; Alberto Olivares; Ernesto Staffetti; Junzi Sun
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Validation of the GPR model: Comparison between the estimates obtained with the GPR method, with and without outliers, and the ECMWF ERA5 reanalysis data.

David R. Mullineaux; Gareth Irwin (2023). Error and anomaly detection for intra-participant time-series data [Dataset]. http://doi.org/10.6084/m9.figshare.5189002

Data from: Error and anomaly detection for intra-participant time-series data

Available download formats: xlsx
Dataset updated
Jun 1, 2023
Dataset provided by
Taylor & Francis
Authors
David R. Mullineaux; Gareth Irwin
License

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Description

Identification of errors or anomalous values, collectively considered outliers, assists in exploring data or through removing outliers improves statistical analysis. In biomechanics, outlier detection methods have explored the ‘shape’ of the entire cycles, although exploring fewer points using a ‘moving-window’ may be advantageous. Hence, the aim was to develop a moving-window method for detecting trials with outliers in intra-participant time-series data. Outliers were detected through two stages for the strides (mean 38 cycles) from treadmill running. Cycles were removed in stage 1 for one-dimensional (spatial) outliers at each time point using the median absolute deviation, and in stage 2 for two-dimensional (spatial–temporal) outliers using a moving window standard deviation. Significance levels of the t-statistic were used for scaling. Fewer cycles were removed with smaller scaling and smaller window size, requiring more stringent scaling at stage 1 (mean 3.5 cycles removed for 0.0001 scaling) than at stage 2 (mean 2.6 cycles removed for 0.01 scaling with a window size of 1). Settings in the supplied Matlab code should be customised to each data set, and outliers assessed to justify whether to retain or remove those cycles. The method is effective in identifying trials with outliers in intra-participant time series data.
