62 datasets found

Data from: Directional Quantile Classifiers
tandf.figshare.com
txt
Updated Jun 4, 2023
Cite
Alessio Farcomeni; Marco Geraci; Cinzia Viroli (2023). Directional Quantile Classifiers [Dataset]. http://doi.org/10.6084/m9.figshare.17711340.v2
Available download formats: txt
Unique identifier
https://doi.org/10.6084/m9.figshare.17711340.v2
Dataset updated
Jun 4, 2023
Dataset provided by
Taylor & Francis (https://taylorandfrancis.com/)
Authors
Alessio Farcomeni; Marco Geraci; Cinzia Viroli
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
We introduce classifiers based on directional quantiles. We derive theoretical results for selecting optimal quantile levels given a direction, and, conversely, an optimal direction given a quantile level. We also show that the probability of correct classification of the proposed classifier converges to one if population distributions differ by at most a location shift and if the number of directions is allowed to diverge at the same rate as the problem’s dimension. We illustrate the satisfactory performance of our proposed classifiers in both small- and high-dimensional settings via a simulation study and a real data example. The code implementing the proposed methods is publicly available in the R package Qtools. Supplementary materials for this article are available online.
Population Density in Tioga County NY
tiogatells-tiogacountyny.hub.arcgis.com
Updated Jun 14, 2019
Cite
Tioga County NY (2019). Population Density in Tioga County NY [Dataset]. https://tiogatells-tiogacountyny.hub.arcgis.com/maps/ae0a6e1e4f8144079ba29ed97cb6125c
Dataset updated
Jun 14, 2019
Dataset authored and provided by
Tioga County NY
Description
The map shows population density in Tioga County, NY, using a quantile classification with 5 data breaks, each rounded to the nearest 10 people. The population data are census block-level data from the 2010 U.S. Census.
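A quantile classification of the kind described above can be sketched as follows. This is only an illustration under stated assumptions: the density values are randomly generated (hypothetical, not the census data), and the helper names `quantile_breaks` and `classify` are made up for this example rather than taken from the map's actual workflow.

```python
import numpy as np

def quantile_breaks(values, n_classes=5, round_to=10):
    """Upper class limits that place roughly equal counts in each class,
    rounded to the nearest multiple of `round_to`."""
    probs = np.linspace(0, 1, n_classes + 1)[1:]  # 0.2, 0.4, ..., 1.0
    breaks = np.quantile(np.asarray(values, dtype=float), probs)
    return np.round(breaks / round_to) * round_to

def classify(values, breaks):
    """Assign each value a class index from 0 to len(breaks) - 1."""
    idx = np.searchsorted(breaks, values, side="left")
    # Rounding can push the top break below the maximum value,
    # so clip anything above it into the last class.
    return np.minimum(idx, len(breaks) - 1)

# Hypothetical block-level densities (people per unit area).
rng = np.random.default_rng(0)
density = rng.lognormal(mean=4.0, sigma=1.0, size=500)

breaks = quantile_breaks(density)
classes = classify(density, breaks)
print(breaks)
print(np.bincount(classes, minlength=5))
```

Because the breaks are rounded after the quantiles are computed, the class counts are only approximately equal.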
Quantiles of sensitivity, specificity and log posterior for training and...
figshare.com
datasetcatalog.nlm.nih.gov
+1 more
xls
Updated Jun 4, 2023
Cite
Wenbiao Hu; Rebecca A. O'Leary; Kerrie Mengersen; Samantha Low Choy (2023). Quantiles of sensitivity, specificity and log posterior for training and validation datasets over all accepted trees, for Bayesian classification trees. [Dataset]. http://doi.org/10.1371/journal.pone.0023903.t003
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0023903.t003
Dataset updated
Jun 4, 2023
Dataset provided by
PLOS ONE
Authors
Wenbiao Hu; Rebecca A. O'Leary; Kerrie Mengersen; Samantha Low Choy
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Quantiles of sensitivity, specificity and log posterior for training and validation datasets over all accepted trees, for Bayesian classification trees.
Kansas Population 1890-2020
hub.arcgis.com
Updated Mar 15, 2013
+ more versions
Cite
Kansas State University (2013). Kansas Population 1890-2020 [Dataset]. https://hub.arcgis.com/maps/kstate::kansas-population-1890-2020/explore?path=
Dataset updated
Mar 15, 2013
Dataset authored and provided by
Kansas State University
Description
U.S. Census population data for Kansas counties from 1890 through 2010. The choropleth map shows 2010 population based on a quantile classification. Click on any county to see additional information about historic maximums, population loss, and trend in population since 1890.
Data from: Dataset from : Browsing is a strong filter for savanna tree...
data.niaid.nih.gov
Updated Oct 1, 2021
Cite
Archibald, Sally; Wayne Twine; Craddock Mthabini; Nicola Stevens (2021). Dataset from : Browsing is a strong filter for savanna tree seedlings in their first growing season [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4972083
Dataset updated
Oct 1, 2021
Dataset provided by
School of Animal Plant and Environmental Sciences, University of Witwatersrand, Johannesburg, South Africa
Centre for African Ecology, School of Animal Plant and Environmental Sciences, University of Witwatersrand, Johannesburg, South Africa
Centre for African Ecology, School of Animal Plant and Environmental Sciences, University of Witwatersrand, Johannesburg, South Africa AND Environmental Change Institute, School of Geography and the Environment, University of Oxford, Oxford OX1 3QY, United Kingdom
Authors
Archibald, Sally; Wayne Twine; Craddock Mthabini; Nicola Stevens
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The data presented here were used to produce the following paper:

Archibald, Twine, Mthabini, Stevens (2021) Browsing is a strong filter for savanna tree seedlings in their first growing season. J. Ecology.

The project under which these data were collected is: Mechanisms Controlling Species Limits in a Changing World. NRF/SASSCAL Grant number 118588

For information on the data or analysis please contact Sally Archibald: sally.archibald@wits.ac.za

Description of file(s):

File 1: cleanedData_forAnalysis.csv (required to run the R code: "finalAnalysis_PostClipResponses_Feb2021_requires_cleanData_forAnalysis_.R")

The data represent monthly survival and growth data for ~740 seedlings from 10 species under various levels of clipping.

The data consist of one .csv file with the following column names:

treatment: Clipping treatment (1 - 5 months clip plus control unclipped)
plot_rep: One of three randomised plots per treatment
matrix_no: Where in the plot the individual was placed
species_code: First three letters of the genus name, and first three letters of the species name uniquely identifies the species
species: Full species name
sample_period: Classification of sampling period into time since clip.
status: Alive or Dead
standing.height: Vertical height above ground (in mm)
height.mm: Length of the longest branch (in mm)
total.branch.length: Total length of all the branches (in mm)
stemdiam.mm: Basal stem diameter (in mm)
maxSpineLength.mm: Length of the longest spine
postclipStemNo: Number of resprouting stems (only recorded AFTER clipping)
date.clipped: date.clipped
date.measured: date.measured
date.germinated: date.germinated
Age.of.plant: Date measured - Date germinated
newtreat: Treatment as a numeric variable, with 8 being the control plot (for plotting purposes)

File 2: Herbivory_SurvivalEndofSeason_march2017.csv (required to run the R code: "FinalAnalysisResultsSurvival_requires_Herbivory_SurvivalEndofSeason_march2017.R")

The data consist of one .csv file with the following column names:

treatment: Clipping treatment (1 - 5 months clip plus control unclipped)
plot_rep: One of three randomised plots per treatment
matrix_no: Where in the plot the individual was placed
species_code: First three letters of the genus name, and first three letters of the species name uniquely identifies the species
species: Full species name
sample_period: Classification of sampling period into time since clip.
status: Alive or Dead
standing.height: Vertical height above ground (in mm)
height.mm: Length of the longest branch (in mm)
total.branch.length: Total length of all the branches (in mm)
stemdiam.mm: Basal stem diameter (in mm)
maxSpineLength.mm: Length of the longest spine
postclipStemNo: Number of resprouting stems (only recorded AFTER clipping)
date.clipped: date.clipped
date.measured: date.measured
date.germinated: date.germinated
Age.of.plant: Date measured - Date germinated
newtreat: Treatment as a numeric variable, with 8 being the control plot (for plotting purposes)
genus: Genus
MAR: Mean Annual Rainfall for that Species distribution (mm)
rainclass: High/medium/low

File 3: allModelParameters_byAge.csv (required to run the R code: "FinalModelSeedlingSurvival_June2021_.R")

Consists of a .csv file with the following column headings

Age.of.plant: Age in days
species_code: Species
pred_SD_mm: Predicted stem diameter in mm
pred_SD_up: top 75th quantile of stem diameter in mm
pred_SD_low: bottom 25th quantile of stem diameter in mm
treatdate: date when clipped
pred_surv: Predicted survival probability
pred_surv_low: Predicted 25th quantile survival probability
pred_surv_high: Predicted 75th quantile survival probability
species_code: species code
Bite.probability: Daily probability of being eaten
max_bite_diam_duiker_mm: Maximum bite diameter of a duiker for this species
duiker_sd: standard deviation of bite diameter for a duiker for this species
max_bite_diameter_kudu_mm: Maximum bite diameter of a kudu for this species
kudu_sd: standard deviation of bite diameter for a kudu for this species
mean_bite_diam_duiker_mm: mean etc
duiker_mean_sd: standard deviation etc
mean_bite_diameter_kudu_mm: mean etc
kudu_mean_sd: standard deviation etc
genus: genus
rainclass: low/med/high

File 4: EatProbParameters_June2020.csv (required to run the R code: "FinalModelSeedlingSurvival_June2021_.R")

Consists of a .csv file with the following column headings

shtspec: species name
species_code: species code
genus: genus
rainclass: low/medium/high
seed mass: mass of seed (g per 1000 seeds)
Surv_intercept: coefficient of the model predicting survival from age of clip for this species
Surv_slope: coefficient of the model predicting survival from age of clip for this species
GR_intercept: coefficient of the model predicting stem diameter from seedling age for this species
GR_slope: coefficient of the model predicting stem diameter from seedling age for this species
species_code: species code
max_bite_diam_duiker_mm: Maximum bite diameter of a duiker for this species
duiker_sd: standard deviation of bite diameter for a duiker for this species
max_bite_diameter_kudu_mm: Maximum bite diameter of a kudu for this species
kudu_sd: standard deviation of bite diameter for a kudu for this species
mean_bite_diam_duiker_mm: mean etc
duiker_mean_sd: standard deviation etc
mean_bite_diameter_kudu_mm: mean etc
kudu_mean_sd: standard deviation etc
AgeAtEscape_duiker[t]: age of plant when its stem diameter is larger than a mean duiker bite
AgeAtEscape_duiker_min[t]: age of plant when its stem diameter is larger than a min duiker bite
AgeAtEscape_duiker_max[t]: age of plant when its stem diameter is larger than a max duiker bite
AgeAtEscape_kudu[t]: age of plant when its stem diameter is larger than a mean kudu bite
AgeAtEscape_kudu_min[t]: age of plant when its stem diameter is larger than a min kudu bite
AgeAtEscape_kudu_max[t]: age of plant when its stem diameter is larger than a max kudu bite
Support vector machine with quantile hyper-spheres for pattern...
plos.figshare.com
zip
Updated Jun 1, 2023
Cite
Maoxiang Chu; Xiaoping Liu; Rongfen Gong; Jie Zhao (2023). Support vector machine with quantile hyper-spheres for pattern classification [Dataset]. http://doi.org/10.1371/journal.pone.0212361
Available download formats: zip
Unique identifier
https://doi.org/10.1371/journal.pone.0212361
Dataset updated
Jun 1, 2023
Dataset provided by
PLOS (http://plos.org/)
Authors
Maoxiang Chu; Xiaoping Liu; Rongfen Gong; Jie Zhao
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This paper formulates a support vector machine with quantile hyper-spheres (QHSVM) for pattern classification. The idea of QHSVM is to build two quantile hyper-spheres with the same center for positive or negative training samples. Every quantile hyper-sphere is constructed by using pinball loss instead of hinge loss, which makes the new classification model insensitive to noise, especially the feature noise around the decision boundary. Moreover, the robustness and generalization of QHSVM are strengthened through maximizing the margin between two quantile hyper-spheres, maximizing the inner-class clustering of samples and optimizing the independent quadratic programming for a target class. In addition, this paper proposes a novel local center-based density estimation method. Based on it, ρ-QHSVM with surrounding and clustering samples is given. Under the premise of high accuracy, the execution speed of ρ-QHSVM can be adjusted. The experimental results on artificial, benchmark and strip steel surface defects datasets show that the QHSVM model has distinct advantages in accuracy and the ρ-QHSVM model is fit for large-scale datasets.
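To make the contrast between hinge loss and pinball loss concrete, the sketch below compares the two on a few margin residuals. It uses one common form of pinball loss from the SVM literature and is an assumption, not necessarily the exact loss in the QHSVM formulation; the parameter name `tau` and the sample values are illustrative.

```python
import numpy as np

def hinge_loss(u):
    """Hinge loss on the residual u = 1 - y * f(x): zero once the margin is met."""
    return np.maximum(0.0, np.asarray(u, dtype=float))

def pinball_loss(u, tau=0.5):
    """Pinball loss: like hinge for u >= 0, but also penalises u < 0 with slope tau."""
    u = np.asarray(u, dtype=float)
    return np.where(u >= 0, u, -tau * u)

# Illustrative residuals on both sides of the margin.
u = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(hinge_loss(u))              # 0, 0, 0, 0.5, 2
print(pinball_loss(u, tau=0.5))   # 1, 0.25, 0, 0.5, 2
```

Setting `tau=0` recovers hinge loss. With `tau > 0`, every sample contributes to the objective, not only those near the boundary, which is the usual argument for why pinball loss is less sensitive to feature noise there.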
Data from: Quantile regression of nonlinear models to describe different...
scielo.figshare.com
jpeg
Updated Jun 5, 2023
Cite
Guilherme Alves Puiatti; Paulo Roberto Cecon; Moysés Nascimento; Ana Carolina Campana Nascimento; Antônio Policarpo Souza Carneiro; Fabyano Fonseca e Silva; Mário Puiatti; Ana Carolina Ribeiro de Oliveira (2023). Quantile regression of nonlinear models to describe different levels of dry matter accumulation in garlic plants [Dataset]. http://doi.org/10.6084/m9.figshare.5907898.v1
Available download formats: jpeg
Unique identifier
https://doi.org/10.6084/m9.figshare.5907898.v1
Dataset updated
Jun 5, 2023
Dataset provided by
SciELO journals
Authors
Guilherme Alves Puiatti; Paulo Roberto Cecon; Moysés Nascimento; Ana Carolina Campana Nascimento; Antônio Policarpo Souza Carneiro; Fabyano Fonseca e Silva; Mário Puiatti; Ana Carolina Ribeiro de Oliveira
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
ABSTRACT: Plant growth analyses are important because they generate information on the demand and necessary care for each development stage of a plant. Nonlinear regression models are appropriate for the description of curves of growth, since they include parameters with practical biological interpretation. However, these models present information in terms of the conditional mean, and they are subject to problems in the adjustment caused by possible outliers or asymmetry in the distribution of the data. Quantile regression can solve these problems, and it allows the estimation of different quantiles, generating more complete and robust results. The objective of this research was to adjust a nonlinear quantile regression model for the study of dry matter accumulation in garlic plants (Allium sativum L.) over time, estimating parameters at three different quantiles and classifying each garlic accession according to its growth rate and asymptotic weight. The nonlinear regression model fitted was a Logistic model, and 30 garlic accessions were evaluated. These 30 accessions were divided based on the model with the closest quantile estimates; 12 accessions were classified as of lesser interest for planting, 6 were classified as intermediate, and 12 were classified as of greater interest for planting.
Worldwide CO2 Emissions 2007
umn.hub.arcgis.com
Updated Apr 14, 2022
Cite
University of Minnesota (2022). Worldwide CO2 Emissions 2007 [Dataset]. https://umn.hub.arcgis.com/maps/8ca5236b4b2444718c4cc9ab824f8962
Dataset updated
Apr 14, 2022
Dataset authored and provided by
University of Minnesota
Description
Quantile classification rounded to 100,000. Pop-up graphs show CO2 emissions over time since 1961. Data from the World Bank.
Depth (Standard Deviation) Layer used to identify, delineate and classify...
catalog.data.gov
s.cnmilf.com
+1 more
Updated Mar 22, 2025
+ more versions
Cite
Department of Commerce (DOC), National Oceanic and Atmospheric Administration (NOAA), National Ocean Service (NOS), Center for Coastal Monitoring and Assessment (CCMA), Biogeography Branch (Point of Contact) (2025). Depth (Standard Deviation) Layer used to identify, delineate and classify moderate-depth benthic habitats around St. John, USVI [Dataset]. https://catalog.data.gov/dataset/depth-standard-deviation-layer-used-to-identify-delineate-and-classify-moderate-depth-benthic-h4
Dataset updated
Mar 22, 2025
Dataset provided by
National Oceanic and Atmospheric Administration (http://www.noaa.gov/)
United States Department of Commerce (http://commerce.gov/)
Area covered
Saint John, U.S. Virgin Islands
Description
Standard deviation of depth was calculated from the bathymetry surface for each cell using the ArcGIS Spatial Analyst Focal Statistics "STD" parameter. Standard deviation of depth represents the dispersion of depth values (in meters) around the mean depth within a square 3x3 cell window. The 2x2 meter resolution standard deviation of depth GeoTIFF was exported and added as a new map layer to aid in benthic habitat classification. Acoustic imagery was acquired for the VICRNM on two separate missions onboard the NOAA ship, Nancy Foster. The first mission took place from 2/18/04 to 3/5/04. The second mission took place from 2/1/05 to 2/12/05. On both missions, seafloor depths between 14 to 55 m were mapped using a RESON SeaBat 8101 ER (240 kHz) MBES sensor. This pole-mounted system measured water depths across a 150 degree swath consisting of 101 individual 1.5 degree x 1.5 degree beams. The beams to the port and starboard of nadir (i.e., directly underneath the ship) overlapped adjacent survey lines by approximately 10 m. The vessel survey speed was between 5 and 8 kn. In 2004, the ship's location was determined by a Trimble DSM 132 DGPS system, which provided a RTCM differential data stream from the U.S. Coast Guard Continually Operating Reference Station (CORS) at Port Isabel, Puerto Rico. Gyro, heave, pitch and roll correctors were acquired using an Ixsea Octans gyrocompass. In 2005, the ship's positioning and orientation were determined by the Applanix POS/MV 320 V4, which is a GPS aided Inertial Motion Unit (IMU) providing measurements of roll, pitch and heading. The POS/MV obtained its positions from two dual frequency Trimble Zephyr GPS antennae. An auxiliary Trimble DSM 132 DGPS system provided a RTCM differential data stream from the U.S. Coast Guard CORS at Port Isabel, Puerto Rico. 
For both years, CTD (conductivity, temperature and depth) measurements were taken approximately every 4 hours using a Seabird Electronics SBE-19 to correct for the changing sound velocities in the water column. In 2004, raw data were logged in .xtf (extended triton format) using Triton ISIS software 6.2. In 2005, raw data were logged in .gsf (generic sensor format) using SAIC ISS 2000 software. Data from 2004 were referenced to the WGS84 UTM 20 N horizontal coordinate system, and data from 2005 were referenced to the NAD83 UTM 20 N horizontal coordinate system. Data from both projects were referenced to the Mean Lower Low Water (MLLW) vertical tidal coordinate system. The 2004 and 2005 MBES bathymetric data were both corrected for sensor offsets, latency, roll, pitch, yaw, static draft, the changing speed of sound in the water column and the influence of tides in CARIS Hips & Sips 5.3 and 5.4, respectively. The 2004 data was then binned to create a 1 x 1 m raster surface, and the 2005 data was binned to create a 2 x 2 m raster surface. After these final surfaces were created, the datum for the 2004 bathymetric surfaces was transformed from WGS84 to NAD83 using the "Project Raster" function in ArcGIS 9.1. The 2004 surface was transformed so that it would have the same datum as the 2005 surface. The 2004 bathymetric surface was then downsampled from 1 x 1 to 2 x 2 m using the "Resample" function in ArcGIS 9.1. The 2004 surface was resampled so it would have the same spatial resolution as the 2005 surface. Having the same coordinate systems and spatial resolutions, the final 2004 and 2005 bathymetry rasters were then merged using the Raster Calculator function "Merge" in ArcGIS's Spatial Analyst Extension to create a seamless bathymetry surface for the entire VICRNM area south of St. John.
For a complete description of the data acquisition and processing parameters, please see the data acquisition and processing reports (DAPRs) for projects: NF-04-06-VI and NF-05-05-VI (Monaco & Rooney, 2004; Battista & Lazar, 2005).
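The standard deviation of depth within a 3x3 cell window, as described at the start of this entry, can be approximated outside ArcGIS with a short NumPy sketch. This is not the Focal Statistics implementation: the edge handling (NaN padding, so border cells use only the neighbours that exist), the use of the population standard deviation, and the function name `focal_std` are all assumptions made for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def focal_std(depth, size=3):
    """Standard deviation of each cell's size x size neighbourhood.

    The grid is padded with NaN so that edge cells are computed from the
    neighbours that actually exist (an assumed edge rule).
    """
    pad = size // 2
    padded = np.pad(np.asarray(depth, dtype=float), pad,
                    mode="constant", constant_values=np.nan)
    windows = sliding_window_view(padded, (size, size))
    return np.nanstd(windows, axis=(-2, -1))

# Hypothetical 3x3 bathymetry grid (depths in metres).
grid = np.arange(9, dtype=float).reshape(3, 3)
result = focal_std(grid)
print(result.shape)   # same shape as the input grid
print(result[1, 1])   # std of all nine cells
```

A flat seafloor gives zero everywhere, while steep or rugged areas give large values, which is why such a layer can help in separating habitat classes.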
Results of two group classification methods.
plos.figshare.com
xls
Updated Jun 4, 2023
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Shesh N. Rai; Sudhir Srivastava; Jianmin Pan; Xiaoyong Wu; Somesh P. Rai; Chongkham S. Mekmaysy; Lynn DeLeeuw; Jonathan B. Chaires; Nichola C. Garbett (2023). Results of two group classification methods. [Dataset]. http://doi.org/10.1371/journal.pone.0220765.t004
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0220765.t004
Dataset updated
Jun 4, 2023
Dataset provided by
PLOS ONE
Authors
Shesh N. Rai; Sudhir Srivastava; Jianmin Pan; Xiaoyong Wu; Somesh P. Rai; Chongkham S. Mekmaysy; Lynn DeLeeuw; Jonathan B. Chaires; Nichola C. Garbett
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Results of two group classification methods.
HUN SW Potentially Impacted Reaches by Quantile v01
researchdata.edu.au
data.gov.au
+1 more
Updated Oct 9, 2018
Cite
Bioregional Assessment Program (2018). HUN SW Potentially Impacted Reaches by Quantile v01 [Dataset]. https://researchdata.edu.au/hun-sw-potentially-quantile-v01/2986501
Dataset updated
Oct 9, 2018
Dataset provided by
Data.gov (https://data.gov/)
Authors
Bioregional Assessment Program
Description
Abstract

The dataset was derived by the Bioregional Assessment Programme from multiple source datasets. The source datasets are identified in the Lineage field in this metadata statement. The processes undertaken to produce this derived dataset are described in the History field in this metadata statement.

This dataset is a subset of the Hunter Riverine landscapes classes to be shown as an augmentation to the modelled river impacts layer.

It contains non-ephemeral landscape classes (low to mod intermittent, mod to highly intermittent and perennial) which are deemed to be potentially subject to hydrological change due to having their headwaters in areas subject to ACRD induced drawdown.

Potential impact is flagged at Q05, Q50 and Q95 levels in the attribute table.

Purpose

For use in map reports.

Dataset History

Non-ephemeral stream landscape classes were compared with footprints of 0.2 m groundwater ACRD drawdown at the Q05, Q50 and Q95 levels. Streams rising out of and/or intersecting the footprints at the respective quantiles were selected out and tagged accordingly in the attribute table.

Dataset Citation

Bioregional Assessment Programme (2017) HUN SW Potentially Impacted Reaches by Quantile v01. Bioregional Assessment Derived Dataset. Viewed 13 March 2019, http://data.bioregionalassessments.gov.au/dataset/55c568ce-ec90-40ca-9fd6-6c8fa58519e7.

Dataset Ancestors

Derived From National Groundwater Dependent Ecosystems (GDE) Atlas (including WA)

Derived From HUN River Perenniality v01

Derived From HUN GW Model code v01

Derived From HUN Landscape Classification v02

Derived From Travelling Stock Route Conservation Values

Derived From HUN GW Model v01

Derived From NSW Wetlands

Derived From Climate Change Corridors Coastal North East NSW

Derived From NSW Office of Water Surface Water Licences Processed for Hunter v1 20140516

Derived From Climate Change Corridors for Nandewar and New England Tablelands

Derived From National Groundwater Dependent Ecosystems (GDE) Atlas

Derived From HUN GW Quantiles Interpolation for IMIA Database v01

Derived From BA ALL Assessment Units 1000m Reference 20160516_v01

Derived From Asset database for the Hunter subregion on 27 August 2015

Derived From Birds Australia - Important Bird Areas (IBA) 2009

Derived From Groundwater Economic Assets Hunter NSW 20150331 PersRem

Derived From Geofabric Surface Network - V2.1.1

Derived From Hunter CMA GDEs (DRAFT DPI pre-release)

Derived From Camerons Gorge Grassy White Box Endangered Ecological Community (EEC) 2008

Derived From Atlas of Living Australia NSW ALA Portal 20140613

Derived From Spatial Threatened Species and Communities (TESC) NSW 20131129

Derived From Estuarine Macrophytes of Hunter Subregion NSW DPI Hunter 2004

Derived From Asset database for the Hunter subregion on 24 February 2016

Derived From Natural Resource Management (NRM) Regions 2010

Derived From Gosford Council Endangered Ecological Communities (Umina woodlands) EEC3906

Derived From NSW Office of Water Surface Water Offtakes - Hunter v1 24102013

Derived From NSW Office of Water Surface Water Entitlements Locations v1_Oct2013

Derived From Australia - Species of National Environmental Significance Database

Derived From Asset list for Hunter - CURRENT

Derived From Species Profile and Threats Database (SPRAT) - Australia - Species of National Environmental Significance Database (BA subset - RESTRICTED - Metadata only)

Derived From Northern Rivers CMA GDEs (DRAFT DPI pre-release)

Derived From GEODATA TOPO 250K Series 3, File Geodatabase format (.gdb)

Derived From Ramsar Wetlands of Australia

Derived From Bioregional_Assessment_Programme_Catchment Scale Land Use of Australia - 2014

Derived From GEODATA TOPO 250K Series 3

Derived From NSW Catchment Management Authority Boundaries 20130917

Derived From Geological Provinces - Full Extent

Derived From Hunter subregion boundary

Derived From Commonwealth Heritage List Spatial Database (CHL)

Derived From Groundwater Economic Elements Hunter NSW 20150520 PersRem v02

Derived From Greater Hunter Native Vegetation Mapping with Classification for Mapping

Derived From Native Vegetation Management (NVM) - Manage Benefits

Derived From Bioregional Assessment areas v03

Derived From HUN Groundwater tables 20170421

Derived From HUN Assessment Units 1000m 20160725 v02

Derived From HUN Landscape Classification v03

Derived From National Heritage List Spatial Database (NHL) (v2.1)

Derived From GW Element Bores with Unknown FTYPE Hunter NSW Office of Water 20150514

Derived From Climate Change Corridors (Dry Habitat) for North East NSW

Derived From Groundwater Entitlement Hunter NSW Office of Water 20150324

Derived From Asset database for the Hunter subregion on 20 July 2015

Derived From Fauna Corridors for North East NSW

Derived From NSW Office of Water combined geodatabase of regulated rivers and water sharing plan regions

Derived From BA ALL Assessment Units 1000m 'super set' 20160516_v01

Derived From NSW Office of Water GW licence extract linked to spatial locations for NorthandSouthSydney v3 13032014

Derived From Asset database for the Hunter subregion on 16 June 2015

Derived From Australia World Heritage Areas

Derived From Asset database for the Hunter subregion on 12 February 2015

Derived From [Lower Hunter
The average classification accuracy (%), standard error, and standard...
datasetcatalog.nlm.nih.gov
plos.figshare.com
Updated Apr 10, 2025
Cite
Sagastume, Giancarlo K.; Schofield, Jonathon S.; Whittle, Richard S.; Hong, Kihun; Young, Peyton R.; Battraw, Marcus A.; Winslow, Eden J. (2025). The average classification accuracy (%), standard error, and standard deviation for each sensing modality for test 1. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0002061744
Dataset updated
Apr 10, 2025
Authors
Sagastume, Giancarlo K.; Schofield, Jonathon S.; Whittle, Richard S.; Hong, Kihun; Young, Peyton R.; Battraw, Marcus A.; Winslow, Eden J.
Description
The average classification accuracy (%), standard error, and standard deviation for each sensing modality for test 1.
Simple Classification Playground
kaggle.com
zip
Updated Aug 22, 2023
Cite
Marcin Wierzbiński (2023). Simple Classification Playground [Dataset]. https://www.kaggle.com/datasets/martininf1n1ty/simple-classification-playground
Available download formats: zip (6023 bytes)
Dataset updated
Aug 22, 2023
Authors
Marcin Wierzbiński
License
http://www.gnu.org/licenses/old-licenses/gpl-2.0.en.html
Description
This dataset represents synthetic cell data with distinct clusters. The data simulates the characteristics of cells, with each cell being described by its coordinates on a 2D plane (X-axis and Y-axis) and assigned to a specific cluster.

Dataset Characteristics:

Number of Cells: 300
Number of Genes: 2
Number of Clusters: 3

Cluster Characteristics:

Cluster 1: X Mean: 2, X Standard Deviation: 0.5; Y Mean: 3, Y Standard Deviation: 0.4
Cluster 2: X Mean: 6, X Standard Deviation: 0.7; Y Mean: 7, Y Standard Deviation: 0.8
Cluster 3: X Mean: 10, X Standard Deviation: 0.6; Y Mean: 11, Y Standard Deviation: 0.5

Cluster Proportions:

Cluster 1: 40%
Cluster 2: 30%
Cluster 3: 30%

Visualization: The dataset is visualized on a scatter plot where each point represents a cell. The X-axis and Y-axis represent the coordinates of each cell, and different colors are used to distinguish cells belonging to different clusters. The legend indicates the corresponding cluster for each color.

This synthetic dataset is created for illustrative purposes and showcases distinct clusters with varying characteristics.
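A dataset with the characteristics listed above could be regenerated along these lines. This is a sketch, not the author's generator: it assumes independent Gaussian draws for X and Y within each cluster (the description gives only means and standard deviations), and the names `CLUSTERS` and `make_cells` are hypothetical.

```python
import numpy as np

# Per-cluster (mean, standard deviation) for X and Y, plus the share of cells,
# taken from the dataset description.
CLUSTERS = [
    {"x": (2.0, 0.5), "y": (3.0, 0.4), "share": 0.40},
    {"x": (6.0, 0.7), "y": (7.0, 0.8), "share": 0.30},
    {"x": (10.0, 0.6), "y": (11.0, 0.5), "share": 0.30},
]

def make_cells(n_cells=300, seed=0):
    """Draw X/Y coordinates and cluster labels (1, 2, 3) for n_cells cells.

    Assumes independent normal distributions per axis (not stated in the source).
    """
    rng = np.random.default_rng(seed)
    xs, ys, labels = [], [], []
    for k, c in enumerate(CLUSTERS, start=1):
        n = int(round(c["share"] * n_cells))
        xs.append(rng.normal(c["x"][0], c["x"][1], n))
        ys.append(rng.normal(c["y"][0], c["y"][1], n))
        labels.append(np.full(n, k))
    return np.concatenate(xs), np.concatenate(ys), np.concatenate(labels)

x, y, label = make_cells()
print(len(x), np.bincount(label)[1:])  # total cells and cells per cluster
```

With 300 cells and the stated proportions, the three clusters contain 120, 90 and 90 cells respectively.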
Mean and standard deviation (SD) of the best classification accuracy...
datasetcatalog.nlm.nih.gov
plos.figshare.com
Updated Jun 10, 2014
Cite
Cappello, Angelo; Mangia, Anna Lisa; Simoncini, Laura; Pirini, Marco (2014). Mean and standard deviation (SD) of the best classification accuracy obtained for the healthy subjects and the patients for each cardinality in the Imagery Trial and in the pre-Communication Trial. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001191782
Dataset updated
Jun 10, 2014
Authors
Cappello, Angelo; Mangia, Anna Lisa; Simoncini, Laura; Pirini, Marco
Description
Mean and standard deviation (SD) of the best classification accuracy obtained for the healthy subjects and the patients for each cardinality in the Imagery Trial and in the pre-Communication Trial.
Multi-group diagnostic classification of high-dimensional data using...
plos.figshare.com
tiff
Updated Jun 4, 2023
Cite
Shesh N. Rai; Sudhir Srivastava; Jianmin Pan; Xiaoyong Wu; Somesh P. Rai; Chongkham S. Mekmaysy; Lynn DeLeeuw; Jonathan B. Chaires; Nichola C. Garbett (2023). Multi-group diagnostic classification of high-dimensional data using differential scanning calorimetry plasma thermograms [Dataset]. http://doi.org/10.1371/journal.pone.0220765
Available download formats: tiff
Unique identifier
https://doi.org/10.1371/journal.pone.0220765
Dataset updated
Jun 4, 2023
Dataset provided by
PLOS (http://plos.org/)
Authors
Shesh N. Rai; Sudhir Srivastava; Jianmin Pan; Xiaoyong Wu; Somesh P. Rai; Chongkham S. Mekmaysy; Lynn DeLeeuw; Jonathan B. Chaires; Nichola C. Garbett
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The thermoanalytical technique differential scanning calorimetry (DSC) has been applied to characterize protein denaturation patterns (thermograms) in blood plasma samples and relate these to a subject’s health status. The analysis and classification of thermograms is challenging because of the high dimensionality of the dataset. There are various methods for group classification using high-dimensional data sets; however, the impact of using high-dimensional data sets for cancer classification has been poorly understood. In the present article, we proposed a statistical approach for data reduction and a parametric method (PM) for modeling of high-dimensional data sets for two- and three-group classification using DSC and demographic data. We compared the PM to the non-parametric classification method K-nearest neighbors (KNN) and the semi-parametric classification method KNN with dynamic time warping (DTW). We evaluated the performance of these methods for multiple two-group classifications: (i) normal versus cervical cancer, (ii) normal versus lung cancer, (iii) normal versus cancer (cervical + lung), (iv) lung cancer versus cervical cancer, as well as for three-group classification: normal versus cervical cancer versus lung cancer. In general, performance for two-group classification was high, whereas three-group classification was more challenging, with all three methods predicting normal samples more accurately than cancer samples. Moreover, the specificity of the PM was mostly higher than or equal to that of KNN and DTW-KNN, with lower sensitivity. The performance of KNN and DTW-KNN decreased with the inclusion of demographic data, whereas the PM performed similarly, which could be explained by the fact that the PM uses fewer parameters than KNN and DTW-KNN and is thus less susceptible to overfitting. More importantly, the accuracy of the PM can be increased by using a greater number of quantile data points and by the inclusion of additional demographic and clinical data, providing a substantial advantage over KNN and DTW-KNN methods.
Mean and standard deviation values of operating characteristics (OC), for...
datasetcatalog.nlm.nih.gov
plos.figshare.com
Updated Jun 24, 2022
Cite
Antunes, Marília; Medeiros, Ana Margarida; Alves, Ana Catarina; Bourbon, Mafalda; Albuquerque, João (2022). Mean and standard deviation values of operating characteristics (OC), for different classification algorithms and techniques to cope with data imbalance, and values obtained with SB criteria. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000295569
Dataset updated
Jun 24, 2022
Authors
Antunes, Marília; Medeiros, Ana Margarida; Alves, Ana Catarina; Bourbon, Mafalda; Albuquerque, João
Description
Mean and standard deviation values of operating characteristics (OC), for different classification algorithms and techniques to cope with data imbalance, and values obtained with SB criteria.
Sport and leisure facilities
data.opendatascience.eu
Updated Jan 2, 2021
Cite
(2021). Sport and leisure facilities [Dataset]. https://data.opendatascience.eu/geonetwork/srv/search?type=dataset
Dataset updated
Jan 2, 2021
Description
Overview: 142: Areas used for sports, leisure and recreation purposes.

Traceability (lineage): This dataset was produced with a machine learning framework with several input datasets, specified in detail in Witjes et al., 2022 (in review, preprint available at https://doi.org/10.21203/rs.3.rs-561383/v3).

Scientific methodology: The single-class probability layers were generated with a spatiotemporal ensemble machine learning framework detailed in Witjes et al., 2022 (in review, preprint available at https://doi.org/10.21203/rs.3.rs-561383/v3). The single-class uncertainty layers were calculated by taking the standard deviation of the three single-class probabilities predicted by the three components of the ensemble. The HCL (hard class) layers represent the class with the highest probability as predicted by the ensemble.

Usability: The HCL layers have a decreasing average accuracy (weighted F1-score) at each subsequent level in the CLC hierarchy. These metrics are 0.83 at level 1 (5 classes), 0.63 at level 2 (14 classes), and 0.49 at level 3 (43 classes). This means that the hard-class maps are more reliable when aggregating classes to a higher level in the hierarchy (e.g. 'Discontinuous Urban Fabric' and 'Continuous Urban Fabric' to 'Urban Fabric'). Some single-class probabilities may more closely represent actual patterns for some classes that were overshadowed by unequal sample point distributions. Users are encouraged to set their own thresholds when postprocessing these datasets to optimize the accuracy for their specific use case.

Uncertainty quantification: Uncertainty is quantified by taking the standard deviation of the probabilities predicted by the three components of the spatiotemporal ensemble model.

Data validation approaches: The LULC classification was validated through spatial 5-fold cross-validation as detailed in the accompanying publication.

Completeness: The dataset has chunks of empty predictions in regions with complex coastlines (e.g. the Zeeland province in the Netherlands and the Mar da Palha bay area in Portugal). These are artifacts that will be avoided in subsequent versions of the LULC product.

Consistency: The accuracy of the predictions was compared per year and per 30 km × 30 km tile across Europe to derive temporal and spatial consistency by calculating the standard deviation. The standard deviation of annual weighted F1-score was 0.135, while the standard deviation of weighted F1-score per tile was 0.150. This means the dataset is more consistent through time than through space: predictions are notably less accurate along the Mediterranean coast. The accompanying publication contains additional information and visualisations.

Positional accuracy: The raster layers have a resolution of 30 m, identical to that of the Landsat data cube used as input features for the machine learning framework that predicted them.

Temporal accuracy: The dataset contains predictions and uncertainty layers for each year between 2000 and 2019.

Thematic accuracy: The maps reproduce the Corine Land Cover classification system, a hierarchical legend that consists of 5 classes at the highest level, 14 classes at the second level, and 44 classes at the third level. Class 523: Oceans was omitted due to computational constraints.
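As a rough illustration of the uncertainty and hard-class definitions in this description, the sketch below works on a single pixel in plain Python. The function name `summarize_pixel`, the input layout, the use of the mean to combine the three components, and the choice of the population standard deviation are all assumptions made for illustration; the description does not specify them.

```python
from statistics import mean, pstdev

def summarize_pixel(component_probs):
    """Summarize one pixel.

    component_probs maps each class name to the three probabilities
    predicted by the three ensemble components (hypothetical layout).
    """
    # Assumption: the ensemble probability is the mean of its components.
    ensemble = {cls: mean(p) for cls, p in component_probs.items()}
    # Uncertainty: standard deviation of the three component probabilities
    # (population form assumed here).
    uncertainty = {cls: pstdev(p) for cls, p in component_probs.items()}
    # Hard class (HCL): the class with the highest ensemble probability.
    hard_class = max(ensemble, key=ensemble.get)
    return ensemble, uncertainty, hard_class

# Made-up probabilities for two classes at one pixel.
pixel = {
    "Sport and leisure facilities": [0.6, 0.7, 0.8],
    "Discontinuous Urban Fabric": [0.2, 0.2, 0.2],
}
ensemble, uncertainty, hard_class = summarize_pixel(pixel)
```

In this toy input the three components disagree about the first class and agree exactly about the second, so only the first class receives a non-zero uncertainty.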
The median of the classification accuracies from the constant position and...
datasetcatalog.nlm.nih.gov
plos.figshare.com
Updated Apr 10, 2025
Cite
Hong, Kihun; Schofield, Jonathon S.; Sagastume, Giancarlo K.; Young, Peyton R.; Whittle, Richard S.; Battraw, Marcus A.; Winslow, Eden J. (2025). The median of the classification accuracies from the constant position and varied grasped loads tests reported with the interquartile range and the standard deviation. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0002061782
Dataset updated
Apr 10, 2025
Authors
Hong, Kihun; Schofield, Jonathon S.; Sagastume, Giancarlo K.; Young, Peyton R.; Whittle, Richard S.; Battraw, Marcus A.; Winslow, Eden J.
Description
The median of the classification accuracies from the constant position and varied grasped loads tests reported with the interquartile range and the standard deviation.
Educational Time Series Data
kaggle.com
zip
Updated Nov 29, 2025
Cite
Ayman M. (2025). Educational Time Series Data [Dataset]. https://www.kaggle.com/datasets/csmohamedayman/educational-time-series-data
Available download formats: zip (3322790 bytes)
Dataset updated
Nov 29, 2025
Authors
Ayman M.
License
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Description
This dataset is a feature-engineered time series dataset created from the Tutorial: Tutorial-TSA-EDA-Time Series Data notebook. It includes a wide range of engineered temporal, rolling, statistical, and lag-based features suitable for time-series forecasting, anomaly detection, and exploratory data analysis.

The dataset contains:

Original target variable transformations (lags, differences, rolling statistics, exponential moving averages, etc.)

Date-based features (year, month, day, day of year, weekend flags, leap year, season, etc.)

Advanced statistical features (volatility, skewness, kurtosis, entropy, Sharpe ratio, drawdown)

Trend and detrended components

Multiple target encodings (binary_target, multiclass_target)

This dataset is ideal for practicing:

Feature selection for time series

Forecasting model training

EDA on engineered features

Multi-output regression and classification tasks

Columns Description:

Primary Core Features

date: Daily timestamp (1995-01-01 onward). Primary index.

ts-1: Time series feature #1 (T-1 period). Core time series.

ts-2: Time series feature #2 (T-2 period). Core time series.

ts-3: Time series feature #3 (T-3 period). Core time series.

ts-4: Time series feature #4 (T-4 period). Core time series.

numerical_target: Primary regression target. Sum of all 4 time series features (ts-1 + ts-2 + ts-3 + ts-4). Calculated target.

multiclass_target: Multiclass classification target. Quantile-based discretization of numerical_target into 4 equal groups (quartiles). Multiclass classification.

binary_target: Binary classification target. Derived from multiclass_target: classes 0-1 -> 0, classes 2-3 -> 1. Binary classification.
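The three target definitions above can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the function name `build_targets` is hypothetical, and the quartile cut points use the standard library's "inclusive" method with right-closed bins, which may differ from whatever the original notebook used.

```python
from bisect import bisect_left
from statistics import quantiles

def build_targets(rows):
    """rows: list of (ts-1, ts-2, ts-3, ts-4) tuples, one per day."""
    # numerical_target: sum of the four core time series.
    numerical = [sum(row) for row in rows]
    # Three cut points splitting the values into quartiles
    # (interpolation method is an assumption).
    cuts = quantiles(numerical, n=4, method="inclusive")
    # multiclass_target: quartile index 0..3; a value equal to a cut
    # point falls into the lower class (right-closed bins, assumed).
    multiclass = [bisect_left(cuts, value) for value in numerical]
    # binary_target: classes 0-1 -> 0, classes 2-3 -> 1.
    binary = [0 if cls <= 1 else 1 for cls in multiclass]
    return numerical, multiclass, binary

# Toy data: eight days whose sums are 1, 2, ..., 8.
rows = [(i, 0, 0, 0) for i in range(1, 9)]
numerical, multiclass, binary = build_targets(rows)
```

With evenly spread toy sums, each quartile class receives two rows and the binary target splits the data in half.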

Lag & Difference Features

numerical_target_lag_1: Target value from 1 period ago (lag 1)

numerical_target_lag_7: Target value from 7 periods ago (lag 7)

numerical_target_diff1: Difference between current and previous target (1-period)

numerical_target_diff7: Difference between current target and target 7 periods ago (7-period)

numerical_target_pct_change_1: Percentage change from previous period (1-period)
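A minimal sketch of how lag, difference, and percentage-change features like these could be computed, in plain Python. The helper names are hypothetical, positions without enough history are filled with `None`, and the percentage change is expressed as a fraction (0.2 for +20%), which is an assumption about the dataset's convention.

```python
def lag(values, k):
    """Value from k periods ago; None where no history exists."""
    return [values[i - k] if i >= k else None for i in range(len(values))]

def diff(values, k):
    """Current value minus the value k periods ago."""
    return [None if prev is None else cur - prev
            for cur, prev in zip(values, lag(values, k))]

def pct_change(values, k=1):
    """Fractional change relative to the value k periods ago."""
    return [None if prev is None or prev == 0 else (cur - prev) / prev
            for cur, prev in zip(values, lag(values, k))]

target = [10.0, 12.0, 9.0]
lag_1 = lag(target, 1)
diff_1 = diff(target, 1)
pct_1 = pct_change(target, 1)
```

The 7-period variants follow by passing `k=7` to the same helpers.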

Rolling Statistics Features

numerical_target_roll_mean7: Rolling mean (7 periods)

numerical_target_roll_mean30: Rolling mean (30 periods)

numerical_target_roll_std7: Rolling standard deviation (7 periods)

numerical_target_roll_min7: Rolling minimum (7 periods)

numerical_target_roll_min30: Rolling minimum (30 periods)

numerical_target_roll_max7: Rolling maximum (7 periods)

numerical_target_roll_max30: Rolling maximum (30 periods)
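The rolling statistics listed above share one pattern: apply a summary function to a trailing window. The sketch below shows that pattern in plain Python; the helper name `rolling` is hypothetical, and leaving positions with an incomplete window as `None` and using the sample standard deviation are assumptions.

```python
from statistics import mean, stdev

def rolling(values, window, func):
    """Apply func to each trailing window; None until the window is full."""
    return [func(values[i - window + 1 : i + 1]) if i >= window - 1 else None
            for i in range(len(values))]

series = [1.0, 2.0, 3.0, 4.0, 5.0]
# A window of 3 keeps the toy example short; the dataset uses 7 and 30.
roll_mean = rolling(series, 3, mean)
roll_std = rolling(series, 3, stdev)
roll_min = rolling(series, 3, min)
roll_max = rolling(series, 3, max)
```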

Volatility & Risk Metrics

numerical_target_volatility_7: Rolling volatility (7 periods)

numerical_target_volatility_30: Rolling volatility (30 periods)

numerical_target_sharpe_7: Sharpe ratio, risk-adjusted return (7 periods)

numerical_target_sharpe_30: Sharpe ratio (30 periods)

numerical_target_drawdown: Maximum drawdown from peak

Statistical Distribution Features

numerical_target_var_7: Variance (7 periods)

numerical_target_var_30: Variance (30 periods)

numerical_target_skewness: Distribution skewness

numerical_target_kurtosis: Distribution kurtosis (tail heaviness)

numerical_target_entropy: Information entropy

Trend & Seasonality Features

numerical_target_trend_7: Linear trend component (7 periods)

numerical_target_trend_30: Linear trend component (30 periods)

numerical_target_detrended_7: Original minus trend, detrended (7 periods)

numerical_target_detrended_30: Original minus trend, detrended (30 periods)

numerical_target_vs_seasonal_30: Seasonal component (30 periods)

Date-Based Features

date_year: Year (encoded), numerical

date_month: Month (encoded), 0-11

date_day: Day of month, 1-31

date_dayofyear: Day of year, 1-365

date_weekofyear: Week number, 1-52

date_quarter: Quarter, 1-4

date_semester: Semester, 1-2

date_season: Season, 1-4

date_isweekend: Weekend flag, 0/1

date_isleapyear: Leap year flag, 0/1

Smoothing & Transformation Features

numerical_target_ewm_0.3: Exponential moving average (alpha = 0.3)

numerical_target_ewm_0.7: Exponential moving average (alpha = 0.7)

numerical_target_ratio_lag_1: Ratio to lag-1 value

numerical_target_ratio_lag_7: Ratio to lag-7 value
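The smoothing and ratio features above could be computed along these lines. This is a sketch, not the notebook's code: the helper names are hypothetical, and the exponential moving average uses the simple recursive form seeded with the first observation, which is one of several common conventions.

```python
def ewm(values, alpha):
    """Recursive exponential moving average, seeded with the first value."""
    smoothed = []
    for x in values:
        if not smoothed:
            smoothed.append(x)
        else:
            # New value weighted by alpha, previous average by (1 - alpha).
            smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
    return smoothed

def ratio_to_lag(values, k):
    """Current value divided by the value k periods ago; None if undefined."""
    return [values[i] / values[i - k] if i >= k and values[i - k] != 0 else None
            for i in range(len(values))]

series = [1.0, 2.0, 3.0]
# alpha = 0.5 keeps the toy arithmetic exact; the dataset uses 0.3 and 0.7.
ewm_half = ewm(series, 0.5)
ratio_1 = ratio_to_lag(series, 1)
```

A larger alpha tracks recent values more closely, while a smaller alpha produces a smoother series.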

Dataset Structure Summary

Target Variables (3):

numerical_target - Continuous target for regression

multiclass_target - 4-class classification (quartiles of numerical_target)

binary_target - 2-class classification (first 2 vs last 2 quartiles)

Feature Categories:

Core Time Series (5): ts-1 through ts-4 + date

Engineered Features (39): Various transformations of numerical_target

Total Columns: 47
Grid files from two AUV missions in the DISCOL area during the SONNE cruise...
service.tib.eu
Updated Nov 30, 2024
Cite
(2024). Grid files from two AUV missions in the DISCOL area during the SONNE cruise SO242/1 - Vdataset - LDM [Dataset]. https://service.tib.eu/ldmservice/dataset/png-doi-10-1594-pangaea-892662
Dataset updated
Nov 30, 2024
Description
The zip file contains grid files in UTM 16S resulting from AUV multibeam data processing and a table with descriptions of these grid files. AUV bathymetry data resulted from interpolation of multibeam depth measurements using the IDW algorithm in SAGA GIS. The AUV bathymetric derivatives (Bathymetric Position Index, Concavity, LS factor, and Terrain Ruggedness Index) were calculated in SAGA GIS. The slope derivative was calculated in ArcMap. The AUV backscatter statistics (10th quantile, 90th quantile, mean and mode) were calculated in FMGT Geocoder. The Bayesian classification map was created in SAGA GIS using data from Bayesian classification in Matlab. The ISODATA classification map was created in SAGA GIS using the AUV backscatter statistics, and the Random Forest predictive map was created using the MGET toolbox in ArcMap and the AUV bathymetry, bathymetric derivatives and backscatter statistics data.
