28 datasets found
  1. LOF calculation time (seconds) comparison.

    • plos.figshare.com
    xls
    Updated May 31, 2023
    Cite
    Jihwan Lee; Nam-Wook Cho (2023). LOF calculation time (seconds) comparison. [Dataset]. http://doi.org/10.1371/journal.pone.0165972.t003
    Available download formats
    xls
    Dataset updated
    May 31, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Jihwan Lee; Nam-Wook Cho
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    LOF calculation time (seconds) comparison.

  2. Anomaly Detection in High-Dimensional Data

    • tandf.figshare.com
    txt
    Updated May 30, 2023
    Cite
    Priyanga Dilini Talagala; Rob J. Hyndman; Kate Smith-Miles (2023). Anomaly Detection in High-Dimensional Data [Dataset]. http://doi.org/10.6084/m9.figshare.12844508.v2
    Available download formats
    txt
    Dataset updated
    May 30, 2023
    Dataset provided by
    Taylor & Francis
    Authors
    Priyanga Dilini Talagala; Rob J. Hyndman; Kate Smith-Miles
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The HDoutliers algorithm is a powerful unsupervised algorithm for detecting anomalies in high-dimensional data, with a strong theoretical foundation. However, it suffers from some limitations that significantly hinder its performance level, under certain circumstances. In this article, we propose an algorithm that addresses these limitations. We define an anomaly as an observation where its k-nearest neighbor distance with the maximum gap is significantly different from what we would expect if the distribution of k-nearest neighbors with the maximum gap is in the maximum domain of attraction of the Gumbel distribution. An approach based on extreme value theory is used for the anomalous threshold calculation. Using various synthetic and real datasets, we demonstrate the wide applicability and usefulness of our algorithm, which we call the stray algorithm. We also demonstrate how this algorithm can assist in detecting anomalies present in other data structures using feature engineering. We show the situations where the stray algorithm outperforms the HDoutliers algorithm both in accuracy and computational time. This framework is implemented in the open source R package stray. Supplementary materials for this article are available online.

  3. Replication dataset and calculations for PIIE PB 17-29, United States Is...

    • piie.com
    Updated Nov 2, 2017
    Cite
    Simeon Djankov (2017). Replication dataset and calculations for PIIE PB 17-29, United States Is Outlier in Tax Trends in Advanced and Large Emerging Economies, by Simeon Djankov. (2017). [Dataset]. https://www.piie.com/publications/policy-briefs/united-states-outlier-tax-trends-advanced-and-large-emerging-economies
    Dataset updated
    Nov 2, 2017
    Dataset provided by
    Peterson Institute for International Economics http://www.piie.com/
    Authors
    Simeon Djankov
    Area covered
    United States
    Description

    This data package includes the underlying data and files to replicate the calculations, charts, and tables presented in United States Is Outlier in Tax Trends in Advanced and Large Emerging Economies, PIIE Policy Brief 17-29. If you use the data, please cite as: Djankov, Simeon. (2017). United States Is Outlier in Tax Trends in Advanced and Large Emerging Economies. PIIE Policy Brief 17-29. Peterson Institute for International Economics.

  4. Effect sizes calculated using MD and MC, excluding outliers

    • dro.deakin.edu.au
    • researchdata.edu.au
    txt
    Updated Nov 7, 2024
    Cite
    Don Driscoll (2024). Effect sizes calculated using MD and MC, excluding outliers [Dataset]. http://doi.org/10.26187/deakin.26264351.v1
    Available download formats
    txt
    Dataset updated
    Nov 7, 2024
    Dataset provided by
    Deakin University http://www.deakin.edu.au/
    Authors
    Don Driscoll
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Effect sizes calculated using mean difference for burnt-unburnt study designs and mean change for before-after designs. Outliers, as defined in the methods section of the paper, were excluded prior to calculating effect sizes.

  5. The 12 outliers identified in the Tonga dataset.

    • plos.figshare.com
    • figshare.com
    xls
    Updated Jun 1, 2023
    Cite
    Anderson B. Mayfield; Chii-Shiarng Chen; Alexandra C. Dempsey (2023). The 12 outliers identified in the Tonga dataset. [Dataset]. http://doi.org/10.1371/journal.pone.0185857.t004
    Available download formats
    xls
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Anderson B. Mayfield; Chii-Shiarng Chen; Alexandra C. Dempsey
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Tonga
    Description

    Gene expression data have been presented as non-normalized (2^-Ct × 10^9) in all but the last six rows; this allows for the back-calculation of the raw threshold cycle (Ct) values so that interested individuals can readily estimate the typical range of expression of each gene. Values representing aberrant levels for a particular parameter (z-score>2.5) have been highlighted in bold. When there was a statistically significant difference (Student’s t-test, p0.05). SA = surface area. GCP = genome copy proportion. Ma Dis = Mahalanobis distance. “.” = missing data.
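
    The back-calculation and the z-score flag mentioned in this description can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the non-normalized values are stored as 2^(-Ct) × 10^9, and the function names and sample numbers are hypothetical.

```python
import math
import statistics


def back_calculate_ct(value, scale=1e9):
    """Recover Ct from a value assumed to be stored as 2**(-Ct) * scale."""
    return -math.log2(value / scale)


def flag_aberrant(values, threshold=2.5):
    """Flag values whose z-score exceeds the threshold (z-score > 2.5 as in the text)."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (an assumption)
    return [(v - mean) / sd > threshold for v in values]


# Hypothetical expression value corresponding to Ct = 20
stored = 2 ** -20 * 1e9
ct = back_calculate_ct(stored)

# Hypothetical parameter column with one clearly aberrant value
flags = flag_aberrant([1.0] * 19 + [10.0])
```

    Whether the original table used the sample or the population standard deviation is not stated, so the choice above is only one reasonable reading.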

  6. Water quality test data

    • scidb.cn
    Updated Oct 26, 2022
    Cite
    HuiyunFeng; JingangJiang (2022). Water quality test data [Dataset]. http://doi.org/10.57760/sciencedb.05375
    Dataset updated
    Oct 26, 2022
    Dataset provided by
    Science Data Bank
    Authors
    HuiyunFeng; JingangJiang
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Outliers are often present in large datasets of water quality monitoring time series data. A method of combining the sliding window technique with Dixon detection criterion for the automatic detection of outliers in time series data is limited by the empirical determination of sliding window sizes. The scientific determination of the optimal sliding window size is very meaningful research work. This paper presents a new Monte Carlo Search Method (MCSM) based on random sampling to optimize the size of the sliding window, which fully takes advantage of computers and statistics. The MCSM was applied in a case study to automatic monitoring data of water quality factors in order to test its validity and usefulness. The results of comparing the accuracy and efficiency of the MCSM show that the new method in this paper is scientific and effective. The experimental results show that, at different sample sizes, the average accuracy is between 58.70% and 75.75%, and the average computation time increase is between 17.09% and 45.53%. In the era of big data in environmental monitoring, the proposed new methods can meet the required accuracy of outlier detection and improve the efficiency of calculation.

  7. Capital Ratios For Acute Care Hospitals

    • johnsnowlabs.com
    csv
    Cite
    John Snow Labs, Capital Ratios For Acute Care Hospitals [Dataset]. https://www.johnsnowlabs.com/marketplace/capital-ratios-for-acute-care-hospitals/
    Available download formats
    csv
    Dataset authored and provided by
    John Snow Labs
    Area covered
    United States
    Description

    This dataset is used to determine whether a case qualifies for outlier payments under the hospital inpatient prospective payment system (IPPS). Hospital-specific cost-to-charge ratios are applied to the total covered charges for the case. Operating and capital costs for the case are calculated separately by applying separate operating and capital cost-to-charge ratios; these costs are then combined and compared with the fixed-loss outlier threshold.
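
    The comparison described above can be sketched as follows. This is a simplified illustration of the description only, not the full CMS payment rule: the function names, the ratios, and the threshold value are hypothetical, and any further adjustments made in practice are left out.

```python
def estimated_case_cost(covered_charges, operating_ccr, capital_ccr):
    """Apply separate operating and capital cost-to-charge ratios and combine the costs."""
    operating_cost = covered_charges * operating_ccr
    capital_cost = covered_charges * capital_ccr
    return operating_cost + capital_cost


def exceeds_outlier_threshold(covered_charges, operating_ccr, capital_ccr, threshold):
    """Return True when the combined estimated cost is above the given threshold."""
    return estimated_case_cost(covered_charges, operating_ccr, capital_ccr) > threshold


# Hypothetical case: 100,000 in covered charges, ratios of 0.30 and 0.02
cost = estimated_case_cost(100_000, 0.30, 0.02)
qualifies = exceeds_outlier_threshold(100_000, 0.30, 0.02, threshold=30_000)
```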

  8. 11: Streamwater sample constituent concentration outliers from 15 watersheds...

    • catalog.data.gov
    • data.usgs.gov
    • +1 more
    Updated Jul 6, 2024
    Cite
    U.S. Geological Survey (2024). 11: Streamwater sample constituent concentration outliers from 15 watersheds in Gwinnett County, Georgia for water years 2003-2020 [Dataset]. https://catalog.data.gov/dataset/11-streamwater-sample-constituent-concentration-outliers-from-15-watersheds-in-gwinne-2003
    Dataset updated
    Jul 6, 2024
    Dataset provided by
    United States Geological Survey http://www.usgs.gov/
    Area covered
    Gwinnett County, Georgia
    Description

    This dataset contains a list of outlier sample concentrations identified for 17 water quality constituents from streamwater samples collected at 15 study watersheds in Gwinnett County, Georgia for water years 2003 to 2020. The 17 water quality constituents are: biochemical oxygen demand (BOD), chemical oxygen demand (COD), total suspended solids (TSS), suspended sediment concentration (SSC), total nitrogen (TN), total nitrate plus nitrite (NO3NO2), total ammonia plus organic nitrogen (TKN), dissolved ammonia (NH3), total phosphorus (TP), dissolved phosphorus (DP), total organic carbon (TOC), total calcium (Ca), total magnesium (Mg), total copper (TCu), total lead (TPb), total zinc (TZn), and total dissolved solids (TDS). 885 outlier concentrations were identified. Outliers were excluded from model calibration datasets used to estimate streamwater constituent loads for 12 of these constituents. Outlier concentrations were removed because they had a high influence on the model fits of the concentration relations, which could substantially affect model predictions. Identified outliers were also excluded from loads that were calculated using the Beale ratio estimator. Notes on reason(s) for considering a concentration as an outlier are included.

  9. Goodness-of-fit filtering in classical metric multidimensional scaling with...

    • tandf.figshare.com
    pdf
    Updated Jun 1, 2023
    Cite
    Jan Graffelman (2023). Goodness-of-fit filtering in classical metric multidimensional scaling with large datasets [Dataset]. http://doi.org/10.6084/m9.figshare.11389830.v1
    Available download formats
    pdf
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    Taylor & Francis
    Authors
    Jan Graffelman
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Metric multidimensional scaling (MDS) is a widely used multivariate method with applications in almost all scientific disciplines. Eigenvalues obtained in the analysis are usually reported in order to calculate the overall goodness-of-fit of the distance matrix. In this paper, we refine MDS goodness-of-fit calculations, proposing additional point and pairwise goodness-of-fit statistics that can be used to filter poorly represented observations in MDS maps. The proposed statistics are especially relevant for large data sets that contain outliers, with typically many poorly fitted observations, and are helpful for improving MDS output and emphasizing the most important features of the dataset. Several goodness-of-fit statistics are considered, and both Euclidean and non-Euclidean distance matrices are considered. Some examples with data from demographic, genetic and geographic studies are shown.

  10. R code

    • figshare.com
    txt
    Updated Jun 5, 2017
    Cite
    Christine Dodge (2017). R code [Dataset]. http://doi.org/10.6084/m9.figshare.5021297.v1
    Available download formats
    txt
    Dataset updated
    Jun 5, 2017
    Dataset provided by
    Figshare http://figshare.com/
    Authors
    Christine Dodge
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    R code used for each data set to perform negative binomial regression, calculate overdispersion statistic, generate summary statistics, remove outliers

  11. DVF statistics

    • gimi9.com
    Cite
    DVF statistics [Dataset]. https://gimi9.com/dataset/eu_64998de5926530ebcecc7b15
    Description

    DVF data statistics, available on explore.data.gouv.fr/immobilier. The files contain the number of sales and the average and median prices per m2.

    • Total DVF statistics: statistics by geographical scale, over the 10 semesters available.
    • Monthly DVF statistics: statistics by geographical scale and by month.

    Description of treatment

    The code generates statistics from the land value request (DVF) data, aggregated at different scales, and tracks their evolution over time (monthly). The following indicators have been calculated on a monthly basis and over the entire period available (10 semesters):

    • number of mutations (property transfers)
    • average price per m2
    • median price per m2
    • breakdown of sales prices by tranche

    for each type of property:

    • houses
    • apartments
    • houses + apartments
    • commercial premises

    and for each scale:

    • nation
    • department
    • EPCI
    • municipality
    • cadastral section

    The source data contain the following types of mutations: sale, sale in the future state of completion, sale of building land, auction, expropriation and exchange. We have chosen to keep only sales, sales in the future state of completion and auctions for statistics*. In addition, for the sake of simplicity, we have chosen to keep only mutations that concern a single asset (excluding dependencies)*. Our reasoning is as follows:

    1. For a transfer that includes assets of several types (e.g. a house + a commercial premises), it is not possible to reconstitute the share of the land value allocated to each of the assets included.
    2. For a transfer that includes several assets of the same type (e.g. X apartments), the total value of the transfer is not necessarily equal to X times the value of an apartment, especially when the assets are very different (area, work to be carried out, floor, etc.).

    We had initially kept these assets, calculating the price per m2 of the mutation by treating its assets as a single asset with an area equal to the sum of their surface areas. This method, which ultimately concerned only a marginal number of assets, did not convince us for the final version.

    The price per m2 is then calculated by dividing the land value of the mutation by the built surface area of the property concerned. We finally exclude mutations for which we could not calculate the price per m2, as well as those whose price per m2 is more than € 100k (arbitrary choice)*. We have not incorporated any other outlier restrictions in order to maintain fidelity to the original data and to report potential anomalies. Displaying the median on the site reduces the impact of outliers on color scales.

    *: The filters mentioned are applied only when calculating the statistics; all mutations in the source files are still displayed in the application at the plot level.
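
    The price-per-m2 calculation and the exclusion rule described above can be sketched as follows. This is a minimal illustration, not the published code: the function names and the sample records are hypothetical, and each record is assumed to be a (land value, built surface in m2) pair for a single-asset mutation.

```python
from statistics import mean, median

MAX_PRICE_PER_M2 = 100_000  # the "arbitrary choice" cap stated in the description


def price_per_m2(land_value, built_surface_m2):
    """Return the price per m2, or None when it cannot be calculated."""
    if land_value is None or not built_surface_m2:
        return None
    return land_value / built_surface_m2


def summarize(mutations):
    """Count, mean and median price per m2 after applying the exclusion rules."""
    prices = []
    for land_value, surface in mutations:
        price = price_per_m2(land_value, surface)
        if price is None or price > MAX_PRICE_PER_M2:
            continue  # excluded from the statistics only
        prices.append(price)
    if not prices:
        return {"count": 0, "mean": None, "median": None}
    return {"count": len(prices), "mean": mean(prices), "median": median(prices)}


# Hypothetical records: three usable sales, one above the cap, two not computable
stats = summarize([
    (200_000, 100), (300_000, 100), (90_000, 30),
    (5_000_000, 10), (100_000, 0), (None, 50),
])
```

    Reporting the median alongside the mean follows the description's remark that the median reduces the impact of the outliers that remain after the cap.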

  12. Weekly United States COVID-19 Hospitalization Metrics by County – ARCHIVED

    • data.virginia.gov
    • healthdata.gov
    • +1 more
    csv, json, xsl
    Updated Feb 23, 2025
    Cite
    Centers for Disease Control and Prevention (2025). Weekly United States COVID-19 Hospitalization Metrics by County – ARCHIVED [Dataset]. https://data.virginia.gov/dataset/weekly-united-states-covid-19-hospitalization-metrics-by-county-archived
    Available download formats
    csv, xsl, json
    Dataset updated
    Feb 23, 2025
    Dataset provided by
    Centers for Disease Control and Prevention http://www.cdc.gov/
    Description

    Note: After May 3, 2024, this dataset will no longer be updated because hospitals are no longer required to report data on COVID-19 hospital admissions, hospital capacity, or occupancy data to HHS through CDC’s National Healthcare Safety Network (NHSN). The related CDC COVID Data Tracker site was revised or retired on May 10, 2023.

    Note: May 3, 2024: Due to incomplete or missing hospital data received for the April 21, 2024 through April 27, 2024 reporting period, the COVID-19 Hospital Admissions Level could not be calculated for CNMI and will be reported as “NA” or “Not Available” in the COVID-19 Hospital Admissions Level data released on May 3, 2024.

    This dataset represents COVID-19 hospitalization data and metrics aggregated to county or county-equivalent, for all counties or county-equivalents (including territories) in the United States. COVID-19 hospitalization data are reported to CDC’s National Healthcare Safety Network, which monitors national and local trends in healthcare system stress, capacity, and community disease levels for approximately 6,000 hospitals in the United States. Data reported by hospitals to NHSN and included in this dataset represent aggregated counts and include metrics capturing information specific to COVID-19 hospital admissions, and inpatient and ICU bed capacity occupancy.

    Reporting information:

    • As of December 15, 2022, COVID-19 hospital data are required to be reported to NHSN, which monitors national and local trends in healthcare system stress, capacity, and community disease levels for approximately 6,000 hospitals in the United States. Data reported by hospitals to NHSN represent aggregated counts and include metrics capturing information specific to hospital capacity, occupancy, hospitalizations, and admissions. Prior to December 15, 2022, hospitals reported data directly to the U.S. Department of Health and Human Services (HHS) or via a state submission for collection in the HHS Unified Hospital Data Surveillance System (UHDSS).
    • While CDC reviews these data for errors and corrects those found, some reporting errors might still exist within the data. To minimize errors and inconsistencies in data reported, CDC removes outliers before calculating the metrics. CDC and partners work with reporters to correct these errors and update the data in subsequent weeks.
    • Many hospital subtypes, including acute care and critical access hospitals, as well as Veterans Administration, Defense Health Agency, and Indian Health Service hospitals, are included in the metric calculations provided in this report. Psychiatric, rehabilitation, and religious non-medical hospital types are excluded from calculations.
    • Data are aggregated and displayed for hospitals with the same Centers for Medicare and Medicaid Services (CMS) Certification Number (CCN), which are assigned by CMS to counties based on the CMS Provider of Services files.
    • Full details on COVID-19 hospital data reporting guidance can be found here: https://www.hhs.gov/sites/default/files/covid-19-faqs-hospitals-hospital-laboratory-acute-care-facility-data-reporting.pdf
    Calculation of county-level hospital metrics:
    • County-level hospital data are derived using calculations performed at the Health Service Area (HSA) level. An HSA is defined by CDC’s National Center for Health Statistics as a geographic area containing at least one county which is self-contained with respect to the population’s provision of routine hospital care. Every county in the United States is assigned to an HSA, and each HSA must contain at least one hospital. Therefore, use of HSAs in the calculation of local hospital metrics allows for more accurate characterization of the relationship between health care utilization and health status at the local level.
    • Data presented at the county-level represent admissions, hospital inpatient and ICU bed capacity and occupancy among hosp

  13. HGW: Copper, Average total content (surface)

    • data.europa.eu
    Updated Aug 28, 2024
    Cite
    (2024). HGW: Copper, Average total content (surface) [Dataset]. https://data.europa.eu/88u/dataset/e3ae07e5-e8b8-3d7a-6f1d-365a83b520cc
    Dataset updated
    Aug 28, 2024
    Description

    The median (synonym: 50th percentile, central value) is used as the mean value. It is the value above or below which 50% of all cases in a data group are located. The calculation is carried out on outlier-adjusted data sets. The total content is determined from the aqua regia extract (according to DIN ISO 11466 (1997)). The concentration is given in mg/kg. The content classes take into account, among other things, the precautionary values of the BBodSchV (1999). These are 20 mg/kg for sand, 40 mg/kg for loam, silt and very silty sand and 60 mg/kg for clay. According to LABO (2003), a sample count of >=20 is required for the calculation of background values. However, the map also shows groups with a sample count >= 10. This information is then only informal and not representative.

  14. Probability-Density-Ranking (PDR) outliers and Most Probable Range (MPR) of...

    • springernature.figshare.com
    application/gzip
    Updated May 30, 2023
    Cite
    Chenghua Shao; Huanwang Yang; Sijiang Wang; Zonghong Liu; Stephen K. Burley (2023). Probability-Density-Ranking (PDR) outliers and Most Probable Range (MPR) of PDB data [Dataset]. http://doi.org/10.6084/m9.figshare.7150124.v1
    Available download formats
    application/gzip
    Dataset updated
    May 30, 2023
    Dataset provided by
    Figshare http://figshare.com/
    Authors
    Chenghua Shao; Huanwang Yang; Sijiang Wang; Zonghong Liu; Stephen K. Burley
    License

    CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Data and code to calculate Probability-Density-Ranking (PDR) outliers and Most Probable Range (MPR)

  15. Data from: Drivers of contemporary and future changes in Arctic seasonal...

    • data.niaid.nih.gov
    • zenodo.org
    • +1 more
    zip
    Updated Dec 30, 2023
    Cite
    Yijing Liu; Peiyan Wang; Bo Elberling; Andreas Westergaard-Nielsen (2023). Drivers of contemporary and future changes in Arctic seasonal transition dates for a tundra site in coastal Greenland [Dataset]. http://doi.org/10.5061/dryad.jsxksn0hp
    Available download formats
    zip
    Dataset updated
    Dec 30, 2023
    Dataset provided by
    University of Copenhagen
    Institute of Geographic Sciences and Natural Resources Research
    Authors
    Yijing Liu; Peiyan Wang; Bo Elberling; Andreas Westergaard-Nielsen
    License

    https://spdx.org/licenses/CC0-1.0.html

    Area covered
    Arctic, Greenland
    Description

    Climate change has had a significant impact on the seasonal transition dates of Arctic tundra ecosystems, causing diverse variations between distinct land surface classes. However, the combined effect of multiple controls as well as their individual effects on these dates remains unclear at various scales and across diverse land surface classes. Here we quantified spatiotemporal variations of three seasonal transition dates (start of spring, maximum Normalized Difference Vegetation Index (NDVImax) day, end of fall) for five dominant land surface classes in the ice-free Greenland and analyzed their drivers for current and future climate scenarios, respectively.

    Methods

    To quantify the seasonal transition dates, we used NDVI derived from Sentinel-2 MultiSpectral Instrument (Level-1C) images during 2016–2020 based on Google Earth Engine (https://developers.google.com/earth-engine/datasets/catalog/COPERNICUS_S2). We performed an atmospheric correction (Yin et al., 2019) on the images before calculating NDVI. The months from May to October were set as the study period each year. The quality control process includes 3 steps: (i) the cloud was masked according to the QA60 band; (ii) images were removed if the number of pixels with NDVI values outside the range of -1–1 exceeds 30% of the total pixels while extracting the median value of each date; (iii) NDVI outliers resulting from cloud mask errors (Coluzzi et al., 2018) and sporadic snow were deleted pixel by pixel. NDVI outliers mentioned here appear as a sudden drop to almost zero in the growing season and do not form a sequence in this study (Komisarenko et al., 2022). To identify outliers, we iterated through every two consecutive NDVI values in the time series and calculated the difference between the second and first values for each pixel every year.
We defined anomalous NDVI differences as points outside of the percentiles threshold [10 90], and if the NDVI difference is positive, then the first NDVI value used to calculate the difference will be the outlier, otherwise, the second one will be the outlier. Finally, 215 images were used to reflect seasonal transition dates in all 5 study periods of 2016–2020 after the quality control. Each image was resampled with 32 m spatial resolution to match the resolution of the ArcticDEM data and SnowModel outputs. To detect seasonal transition dates, we used a double sigmoid model to fit the NDVI changes on time series, and points where the curvature changes most rapidly on the fitted curve, appear at the beginning, middle, and end of each season (Klosterman et al., 2014). The applicability of this phenology method in the Arctic has been demonstrated (Ma et al., 2022; Westergaard-Nielsen et al., 2013; Westergaard-Nielsen et al., 2017). We focused on 3 seasonal transition dates, i.e., SOS, NDVImax day, and EOF. The NDVI values for some pixels are still below zero in spring and summer due to topographical shadow. We, therefore, set a quality control rule before calculating seasonal transition dates for each pixel, i.e., if the number of days with positive NDVI values from June to September is less than 60% of the total number of observed days, the pixel will not be considered for subsequent calculations. As verification of fitted dates, the seasonal transition dates in dry heaths and corresponding time-lapse photos acquired from the snow fence area are shown in Fig. 2. Snow cover extent is greatly reduced and vegetation is exposed with lower NDVI values on the SOS. All visible vegetation is green on the NDVImax day. On EOF, snow cover distributes partly, and NDVI decreases to a value close to zero.
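
    The consecutive-difference outlier rule in the methods above can be sketched for a single pixel and year as follows. This is an illustrative reading of the text, not the authors' code: the function name and the sample series are hypothetical, and the percentile interpolation is whatever Python's statistics.quantiles uses by default, which may differ from the original implementation.

```python
from statistics import quantiles


def ndvi_outlier_indices(ndvi, lower_pct=10, upper_pct=90):
    """Return indices of NDVI values flagged by the consecutive-difference rule."""
    # Difference between the second and first value of every consecutive pair
    diffs = [ndvi[i + 1] - ndvi[i] for i in range(len(ndvi) - 1)]
    if len(diffs) < 2:
        return set()
    cuts = quantiles(diffs, n=100)  # cuts[k - 1] is the k-th percentile
    low, high = cuts[lower_pct - 1], cuts[upper_pct - 1]
    outliers = set()
    for i, diff in enumerate(diffs):
        if diff < low or diff > high:
            # Positive difference: the first value is the outlier; otherwise the second
            outliers.add(i if diff > 0 else i + 1)
    return outliers


# Hypothetical growing-season series with one sudden drop to almost zero
series = [0.50, 0.52, 0.54, 0.56, 0.02, 0.58, 0.60, 0.62, 0.64, 0.66, 0.68, 0.70]
flagged = ndvi_outlier_indices(series)
```

    In this example both the drop into the low value and the recovery out of it point at the same position, so only index 4 is flagged.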

  16. Lead, mean total content (topsoil)

    • ckan.mobidatalab.eu
    html, karte +2
    Updated Jun 13, 2023
    Cite
    Landesamt für Geologie und Bergbau (2023). Lead, mean total content (topsoil) [Dataset]. https://ckan.mobidatalab.eu/dataset/blei-mittlerer-gesamtgehalt-oberboden
    Available download formats
    karte, webanwendung, wms, html
    Dataset updated
    Jun 13, 2023
    Dataset provided by
    Landesamt für Geologie und Bergbau
    License

    Data licence Germany – Attribution – Version 2.0 https://www.govdata.de/dl-de/by-2-0
    License information was derived automatically

    Description

    The median (synonym: 50th percentile, central value) is used as the mean value. It is the value above or below which 50% of all cases in a data group are. The calculation is carried out on outlier-free data sets. The total content is determined from the aqua regia extract (according to DIN ISO 11466 (1997)). The concentration is given in mg/kg. The content classes take into account, among other things, the precautionary values of the BBodSchV (1999). These are 40 mg/kg for the soil type sand, 70 mg/kg for loam, silt and very silty sand and 100 mg/kg for clay. According to LABO (2003), a sample number of >=20 is required for the calculation of background values. However, groups with a number of samples >= 10 are also shown on the map. This information is then only informal and not representative.

  17. Arsenic, mean total content (topsoil)

    • ckan.mobidatalab.eu
    html, karte +2
    Updated Jun 13, 2023
    Cite
    Landesamt für Geologie und Bergbau (2023). Arsenic, mean total content (topsoil) [Dataset]. https://ckan.mobidatalab.eu/dataset/arsen-mittlerer-gesamtgehalt-oberboden
    Available download formats
    karte, html, wms, webanwendung
    Dataset updated
    Jun 13, 2023
    Dataset provided by
    Landesamt für Geologie und Bergbau
    License

    Data licence Germany – Attribution – Version 2.0 https://www.govdata.de/dl-de/by-2-0
    License information was derived automatically

    Description

    The median (synonym: 50th percentile, central value) is used as the mean value. It is the value above or below which 50% of all cases in a data group are. The calculation is carried out on outlier-free data sets. The total content is determined from the aqua regia extract (according to DIN ISO 11466 (1997)). The concentration is given in mg/kg. The BBodSchV (1999) does not set any precautionary values for arsenic. According to LABO (2003), a sample number of >=20 is required for the calculation of background values. However, groups with a number of samples >= 10 are also shown on the map. This information is then only informal and not representative. Further information on definitions of terms, horizon grouping and statistical evaluation: (http://mapserver.lgb-rlp.de/php_hgw_bod/meta/Background values_Hinweise.pdf) Terms of use see: http://www.lgb-rlp.de/karten-und- products/online-maps/terms-of-use-for-online-maps.html

  18. Data from: Benchmarking Basis Sets for Density Functional Theory...

    • acs.figshare.com
    xlsx
    Updated Nov 20, 2023
    Cite
    Samuel J. Pitman; Alicia K. Evans; Robbie T. Ireland; Felix Lempriere; Laura K. McKemmish (2023). Benchmarking Basis Sets for Density Functional Theory Thermochemistry Calculations: Why Unpolarized Basis Sets and the Polarized 6‑311G Family Should Be Avoided [Dataset]. http://doi.org/10.1021/acs.jpca.3c05573.s001
    Explore at:
    Available download formats: xlsx
    Dataset updated
    Nov 20, 2023
    Dataset provided by
    ACS Publications
    Authors
    Samuel J. Pitman; Alicia K. Evans; Robbie T. Ireland; Felix Lempriere; Laura K. McKemmish
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0), https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    Basis sets are a crucial but often largely overlooked choice in setting up quantum chemistry calculations. The choice of the basis set can be critical in determining the accuracy and calculation time of your quantum chemistry calculations. Clear recommendations based on thorough benchmarking are essential but not readily available currently. This study investigates the relative quality of basis sets for general properties by benchmarking basis set performance for a diverse set of 139 reactions (from the diet-150-GMTKN55 data set). In our analysis, we find the distributions of errors are often significantly non-Gaussian, meaning that the joint consideration of median errors, mean absolute errors, and outlier statistics is helpful to provide a holistic understanding of basis set performance. Our direct comparison of performance between most modern basis sets provides quantitative evidence for basis set recommendations that broadly align with the established understanding of basis set experts and is evident in the design of modern basis sets. For example, while zeta is a good measure of quality, it is not the only determining factor for an accurate calculation with unpolarized double- and triple-ζ basis sets (like 6-31G and 6-311G) having very poor performance. Appropriate use of polarization functions (e.g., 6-31G*) is essential to obtain the accuracy offered by double- or triple-ζ basis sets. In our study, the best performances for double- and triple-ζ basis sets are 6-31++G** and pcseg-2, respectively. However, the performances of singly polarized double-ζ and doubly polarized triple-ζ basis sets are quite similar with one key exception: the polarized 6-311G basis set family has poor parametrization, which means its performance is more like a double-ζ than a triple-ζ basis set. All versions of the 6-311G basis set family should be avoided entirely for valence chemistry calculations moving forward.

  19. f

    Multivariate outlier test results.

    • figshare.com
    xls
    Updated May 2, 2024
    Cite
    A. Yuspahruddin; Hafid Abbas; Indra Pahala; Anis Eliyana; Zaleha Yazid (2024). Multivariate outlier test results. [Dataset]. http://doi.org/10.1371/journal.pone.0298936.t004
    Explore at:
    Available download formats: xls
    Dataset updated
    May 2, 2024
    Dataset provided by
    PLOS ONE
    Authors
    A. Yuspahruddin; Hafid Abbas; Indra Pahala; Anis Eliyana; Zaleha Yazid
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This study underscores the significance of assessing the capabilities of rehabilitation officers in navigating challenges, devising innovative work methods, and successfully executing the rehabilitation process. This is particularly crucial amid the dual challenges of overcapacity and the repercussions of the Covid-19 pandemic, making it an essential area for research. To be specific, it aims to obtain empirical evidence about the influence of proactive personality and supportive supervision on proactive work behavior, as well as the mediating role of Role Breadth Self-efficacy and Change Orientation. This research was conducted on all rehabilitation officers at the Narcotics Penitentiary in Sumatra, totaling 272 respondents. This study employs a quantitative method via a questionnaire using a purposive sampling technique. The data was subsequently examined using the Lisrel 8.70 software and Structural Equation Modeling (SEM). It can be concluded from the results that the rehabilitation officers for narcotics addicts at the Narcotics Penitentiary can create and improve proactive work behavior properly through the influence of proactive personality, supportive supervision, role breadth self-efficacy, and change orientation. The study may suggest new ways of working and generate new ideas to increase initiative, encourage feedback, and voice employee concerns. Furthermore, this research has the potential to pinpoint deficiencies in proactive work behavior, serving as a foundation for designing interventions or training programs. These initiatives aim to enhance the innovative and creative contributions of rehabilitation officers in the rehabilitation process.

  20. f

    Pairwise FST values calculated using neutral (above diagonal) and outlier...

    • figshare.com
    odt
    Updated Jun 8, 2023
    Cite
    Francesco Maroso; Konstantinos Gkagkavouzis; Sabina De Innocentiis; Jasmien Hillen; Fernanda do Prado; Nikoleta Karaiskou; John Bernard Taggart; Adrian Carr; Einar Nielsen; Alexandros Triantafyllidis; Luca Bargelloni (2023). Pairwise FST values calculated using neutral (above diagonal) and outlier (below diagonal) SNPs. [Dataset]. http://doi.org/10.1371/journal.pone.0236230.s008
    Explore at:
    Available download formats: odt
    Dataset updated
    Jun 8, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Francesco Maroso; Konstantinos Gkagkavouzis; Sabina De Innocentiis; Jasmien Hillen; Fernanda do Prado; Nikoleta Karaiskou; John Bernard Taggart; Adrian Carr; Einar Nielsen; Alexandros Triantafyllidis; Luca Bargelloni
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description
    * when p-value < 0.05; ** when p-value < 0.01. (ODT)
