Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The aim of this study is to synthesise the strategies implemented for teaching the Decimal Metric System in the Escuela Nueva pedagogical model, through a systematic review of the literature. The review focuses on scientific publications found in high-impact indexed journals, as endorsed by the H-index. To this end, a qualitative methodology was used, following the methodological guidelines established by the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA). Inclusion criteria, exclusion criteria, bibliometric variables, and content variables of interest were taken into account. Likewise, search strategies based on Boolean operators and key terms were used to retrieve research articles. One of the findings is the use of contexts that are familiar or real to students so that they can internalise the mathematical object addressed in this research.
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
In the last decade, a plethora of algorithms has been developed for spatial ecology studies. In our case, we use some of these codes for underwater research in applied ecology, analysing threatened endemic fishes and their natural habitat. For this, we developed scripts in the RStudio® environment to run spatial and statistical analyses for ecological response and spatial distribution models (e.g., Hijmans & Elith, 2017; Den Burg et al., 2020). The R packages employed are as follows: caret (Kuhn et al., 2020), corrplot (Wei & Simko, 2017), devtools (Wickham, 2015), dismo (Hijmans & Elith, 2017), gbm (Freund & Schapire, 1997; Friedman, 2002), ggplot2 (Wickham et al., 2019), lattice (Sarkar, 2008), lattice (Musa & Mansor, 2021), maptools (Hijmans & Elith, 2017), ModelMetrics (Hvitfeldt & Silge, 2021), pander (Wickham, 2015), plyr (Wickham & Wickham, 2015), pROC (Robin et al., 2011), raster (Hijmans & Elith, 2017), RColorBrewer (Neuwirth, 2014), Rcpp (Eddelbeuttel & Balamura, 2018), rgdal (Verzani, 2011), sdm (Naimi & Araujo, 2016), sf (e.g., Zainuddin, 2023), sp (Pebesma, 2020) and usethis (Gladstone, 2022).
It is important to run all the codes in sequence in order to obtain results from the ecological response and spatial distribution models. In particular, for the ecological scenario we selected the Generalized Linear Model (GLM), and for the geographic scenario we selected DOMAIN, also known as Gower's metric (Carpenter et al., 1993). We chose this regression method and this distance-similarity metric because of their adequacy and robustness for studies with endemic or threatened species (e.g., Naoki et al., 2006). Next, we explain the statistical parameterization of the code used to run the GLM and DOMAIN models:
In the first instance, we generated the background points and extracted the values of the variables (Code2_Extract_values_DWp_SC.R). Barbet-Massin et al. (2012) recommend using 10,000 background points with regression methods (e.g., Generalized Linear Models) or distance-based models (e.g., DOMAIN). However, we also considered factors such as the extent of the study area and the type of study species when selecting the number of points (pers. obs.). We then extracted the values of the predictor variables (e.g., bioclimatic, topographic, demographic, habitat) at the presence and background points (e.g., Hijmans and Elith, 2017).
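A minimal sketch of this background-generation and extraction step in R; the object names (preds, presences), file paths and column names are illustrative, not those of the original Code2 script:

library(dismo)   # randomPoints()
library(raster)  # stack(), extract()

# hypothetical predictor stack and presence coordinates (columns: lon, lat)
preds <- stack(list.files("predictors", pattern = "\\.tif$", full.names = TRUE))
presences <- read.csv("presences.csv")

set.seed(1)
bg <- randomPoints(preds, n = 10000)    # 10,000 background points (Barbet-Massin et al., 2012)

pres_vals <- extract(preds, presences)  # predictor values at presence points
bg_vals   <- extract(preds, bg)         # predictor values at background points

dat <- rbind(data.frame(presence = 1, pres_vals),
             data.frame(presence = 0, bg_vals))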
Subsequently, we subdivided both the presence and background point groups into 75% training data and 25% test data, following Soberón & Nakamura (2009) and Hijmans & Elith (2017). For training control, we selected the 10-fold cross-validation method, with the response variable (presence) assigned as a factor. If another variable is important for the study species, it should also be assigned as a factor (Kim, 2009).
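Continuing the sketch above, the split and training control could look as follows with caret; the 75/25 proportion and the 10 folds follow the text, everything else is illustrative:

library(caret)

dat$presence <- factor(dat$presence)    # response assigned as a factor

set.seed(1)
idx   <- createDataPartition(dat$presence, p = 0.75, list = FALSE)
train <- dat[idx, ]                     # 75% training data
test  <- dat[-idx, ]                    # 25% test data

ctrl <- trainControl(method = "cv", number = 10)   # 10-fold cross-validation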
After that, we ran the code for the GBM method (Gradient Boosting Machine; Code3_GBM_Relative_contribution.R and Code4_Relative_contribution.R), which yields the relative contribution of the variables used in the model. We parameterized the code with a Gaussian distribution and 5,000 iterations (e.g., Friedman, 2002; Kim, 2009; Hijmans and Elith, 2017). In addition, we selected a validation interval of 4 random training points (personal test). The resulting plots show partial dependence as a function of each predictor variable.
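A hedged sketch of this step with the gbm package; the Gaussian distribution and the 5,000 iterations follow the text, while cv.folds = 4 is only an illustrative stand-in for the 4-point validation setting mentioned above:

library(gbm)

# gbm() with a Gaussian distribution needs a numeric response
train_num <- transform(train, presence = as.numeric(as.character(presence)))

gbm_fit <- gbm(presence ~ ., data = train_num,
               distribution = "gaussian",
               n.trees = 5000,          # 5,000 iterations
               cv.folds = 4)            # illustrative stand-in (assumption)

summary(gbm_fit)                        # relative contribution of each variable
plot(gbm_fit, i.var = 1)                # partial dependence on the first predictor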
Subsequently, we computed the correlation of the variables using Pearson's method (Code5_Pearson_Correlation.R) to evaluate multicollinearity between variables (Guisan & Hofer, 2003). A bivariate correlation threshold of ±0.70 is recommended for discarding highly correlated variables (e.g., Awan et al., 2021).
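The correlation screening could be sketched as follows, assuming the predictors in the training table are continuous; findCorrelation() from caret flags variables above the ±0.70 cutoff:

library(corrplot)
library(caret)

pred_cols <- setdiff(names(train), "presence")
cor_mat <- cor(train[, pred_cols], method = "pearson", use = "pairwise.complete.obs")

corrplot(cor_mat, method = "number")               # visual check of pairwise correlations
high_cor <- findCorrelation(cor_mat, cutoff = 0.70)
pred_cols[high_cor]                                # candidates to discard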
Once the above codes were run, we uploaded the same subgroups (i.e., presence and background groups with 75% training and 25% testing) (Code6_Presence&backgrounds.R) for the GLM method code (Code7_GLM_model.R). Here, we first ran one GLM per variable to obtain each variable's p-value (alpha ≤ 0.05); we selected the value one (i.e., presence) as the likelihood factor. The models include polynomial terms to capture linear and quadratic responses (e.g., Fielding and Bell, 1997; Allouche et al., 2006). From these results, we ran ecological response curve models, where the resulting plots show the probability of occurrence against the values of continuous variables or the categories of discrete variables. The points of the presence and background training group are also included.
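A minimal sketch of the per-variable GLM and its response curve; bio1 is a hypothetical predictor name, and the quadratic term reproduces the linear-plus-quadratic response described above:

# one GLM per predictor; presence (1) is the modelled level
glm_bio1 <- glm(presence ~ poly(bio1, 2, raw = TRUE),
                data = train, family = binomial(link = "logit"))
summary(glm_bio1)                       # p-values per term (alpha <= 0.05)

# ecological response curve: probability of occurrence along the predictor gradient
newdat <- data.frame(bio1 = seq(min(train$bio1), max(train$bio1), length.out = 100))
newdat$p_occ <- predict(glm_bio1, newdata = newdat, type = "response")
plot(newdat$bio1, newdat$p_occ, type = "l",
     xlab = "bio1", ylab = "Probability of occurrence")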
On the other hand, a global GLM was also run, and the generalized model was evaluated by means of a 2 x 2 contingency matrix including both observed and predicted records. A representation of this is shown in Table 1 (adapted from Allouche et al., 2006). In this process we selected an arbitrary boundary of 0.5 to obtain better modeling performance and to avoid a high percentage of type I (omission) or type II (commission) errors (e.g., Carpenter et al., 1993; Fielding and Bell, 1997; Allouche et al., 2006; Kim, 2009; Hijmans and Elith, 2017).
Table 1. Example of 2 x 2 contingency matrix for calculating performance metrics for GLM models. A represents true presence records (true positives), B represents false presence records (false positives - error of commission), C represents true background points (true negatives) and D represents false backgrounds (false negatives - errors of omission).
|            | Validation set |       |
| Model      | True           | False |
| Presence   | A              | B     |
| Background | C              | D     |
We then calculated the Overall accuracy and True Skill Statistic (TSS) metrics. The first assesses the proportion of correctly predicted cases (Olden and Jackson, 2002). The TSS corrects prediction success for random performance and gives equal weight to the prediction of presences and absences, i.e., to sensitivity and specificity (Fielding and Bell, 1997; Allouche et al., 2006).
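Using the definitions of Table 1, the global GLM, the 0.5 boundary and the two metrics can be sketched as follows (train and test are the illustrative subsets from the earlier sketches, not the original scripts):

glm_full <- glm(presence ~ ., data = train, family = binomial)
p_test   <- predict(glm_full, newdata = test, type = "response")
pred_cls <- ifelse(p_test >= 0.5, 1, 0)           # arbitrary 0.5 boundary

obs <- as.numeric(as.character(test$presence))
A <- sum(pred_cls == 1 & obs == 1)   # true presences
B <- sum(pred_cls == 1 & obs == 0)   # false presences (commission)
C <- sum(pred_cls == 0 & obs == 0)   # true backgrounds
D <- sum(pred_cls == 0 & obs == 1)   # false backgrounds (omission)

overall <- (A + C) / (A + B + C + D)              # proportion correctly predicted
TSS     <- A / (A + D) + C / (B + C) - 1          # sensitivity + specificity - 1 (Allouche et al., 2006)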
The last code (i.e., Code8_DOMAIN_SuitHab_model.R) performs species distribution modelling with the DOMAIN algorithm (Carpenter et al., 1993). Here, we loaded the variable stack and the presence and background groups, each subdivided into 75% training and 25% test data. Only the presence training subset and the predictor variable stack were included in the calculation of the DOMAIN metric, as well as in the evaluation and validation of the model.
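A sketch of this step with dismo; pres_train is a hypothetical two-column matrix of training presence coordinates, and preds is the illustrative predictor stack from the earlier sketch:

library(dismo)

dm   <- domain(preds, p = pres_train)   # DOMAIN (Gower's metric) fitted on presence training data
suit <- predict(preds, dm)              # habitat suitability surface
plot(suit)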
Regarding model evaluation and estimation, we selected the following estimators (a validation sketch follows the list):
1) Partial ROC, which evaluates the separation between the curves of positive (i.e., correctly predicted presence) and negative (i.e., correctly predicted absence) cases. The farther apart these curves are, the better the model predicts the correct spatial distribution of the species (Manzanilla-Quiñones, 2020).
2) The ROC/AUC curve for model validation, where an optimal performance threshold is estimated so as to have an expected confidence of 75% to 99% probability (DeLong et al., 1988).
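A minimal validation sketch under the same illustrative objects (dm and suit from the DOMAIN sketch; pres_test_xy and bg_test_xy are hypothetical coordinate matrices of the 25% test points); the partial ROC itself is usually computed with a dedicated routine and is not reproduced here:

library(dismo)
library(pROC)

ev <- evaluate(p = pres_test_xy, a = bg_test_xy, model = dm, x = preds)
ev@auc                                            # area under the ROC curve

scores  <- c(extract(suit, pres_test_xy), extract(suit, bg_test_xy))
labels  <- c(rep(1, nrow(pres_test_xy)), rep(0, nrow(bg_test_xy)))
roc_obj <- roc(labels, scores)
ci.auc(roc_obj, method = "delong")                # DeLong confidence interval on the AUC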
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Note: For each categorical variable, one category was chosen as a reference category (RC, e.g., RC = Social Sciences for the categorical variable discipline). For categorical variables, the effect of each predictor variable (a dummy variable representing one of the categories) is a regression coefficient (Coeff) that should be interpreted in relation to its standard error (SE) and the effect of the reference category. Variance components for level 1 are derived from the data, but variance components at levels 2 and 3 indicate the amount of variance that can be explained by differences between studies (level 3) and differences between single reliability coefficients nested within studies (level 2). The log-likelihood test provided by SAS/proc mixed (−2LL) can be used to compare different models, as can the Bayesian Information Criterion (BIC). The smaller the BIC, the better the model. *p
This study developed interval-level measurement scales for evaluating police officer performance during real or simulated deadly force situations. Through a two-day concept mapping focus group, statements were identified to describe two sets of dynamics: the difficulty (D) of a deadly force situation and the performance (P) of a police officer in that situation. These statements were then operationalized into measurable Likert-scale items that were scored by 291 use of force instructors from more than 100 agencies across the United States using an online survey instrument. The dataset resulting from this process contains a total of 685 variables, comprised of 312 difficulty statement items, 278 performance statement items, and 94 variables that measure the demographic characteristics of the scorers.
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset is the result of test case and developer metrics extraction from Honfi's experiment at https://zenodo.org/record/2596044#.Xnm4sS2B1QJ. Details of the test case extraction are attached. The dataset contains 20 metrics from the generated test cases and six metrics from the developers' profiles; these 26 metrics act as independent variables. There are two dependent variables: ABU (Actual Binary Understandability) and TAU (Timed Actual Understandability).
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This paper derives a method for estimating and testing the Linear Quadratic Adjustment Cost (LQAC) model when the target variable and some of the forcing variables follow I(2) processes. Based on a forward-looking error-correction formulation of the model, it is shown how to obtain strongly consistent estimates of the structural parameters from both a linear and a non-linear cointegrating regression where first differences of the I(2) variables are included as regressors (multicointegration). Further, based on the estimated parameter values, it is shown how to test and evaluate the LQAC model using a VAR approach. A simple, easily interpretable metric for measuring model fit is suggested. In an empirical application using UK money demand data, the non-linear multicointegrating regression delivers an economically plausible estimate of the adjustment cost parameter. However, the restrictions implied by the exact LQAC model under rational expectations are strongly rejected, and the metric for model fit indicates a substantial noise component in the model.
CC0 1.0 Universal Public Domain Dedication, https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This map service represents modeled streamflow metrics from the historical time period (1977-2006) in the United States. In addition to standard NHD attributes, the streamflow datasets include metrics on mean daily flow (annual and seasonal); flood levels associated with 1.5-year, 10-year, and 25-year floods; annual and decadal minimum weekly flows and date of minimum weekly flow; center of flow mass date; baseflow index; and average number of winter floods. These files and additional information are available on the project website, https://www.fs.usda.gov/rm/boise/AWAE/projects/modeled_stream_flow_metrics.shtml. Streams without flow metrics (null values) were removed from this dataset to improve display speed; to see all stream lines, use an NHD flowline dataset. Hydro flow metrics data can be downloaded from here.
This dataset includes integrated freshwater abundance and connectivity cluster output, principal component scores, and lake, wetland, and stream abundance and connectivity metrics measured at the Hydrologic Unit 8 (HU8) scale for 17 U.S. states in the Midwest and Northeast regions (appr. 1,800,000 km2). The intent of the cluster analysis is to characterize the macroscale patterns of the integrated freshwater landscape that includes lakes, wetlands, and streams and their surface connectivity attributes. We define freshwater connectivity as the permanent surface hydrologic connections that link lakes, wetlands, and streams and measure connectivity as the landscape position of systems within stream networks. Geographic data used in the analysis are in LAGOS-NE-GEO database v. 1.03 (Lake multi-scaled geospatial and temporal database), an integrated, multi-thematic geographic database (Soranno et al. 2015). The integrated freshwater clusters were created through a multi-step process as follows: 1) we quantified multiple freshwater connectivity metrics for lakes, streams, and wetlands separately, 2) we performed principal components analysis (PCA) on the connectivity metric values for each freshwater type to reduce collinearity, and 3) we performed k-means cluster analysis to group spatial units with similar freshwater connectivity characteristics. The resulting freshwater clusters are representations of the macroscale patterns of freshwater abundance and connectivity in the landscape.
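As an illustration of steps 2 and 3, a PCA plus k-means workflow in R could look like the sketch below; the file name, the number of retained components and the number of clusters are assumptions, not the values used for LAGOS-NE:

# hypothetical table: one row per HU8, columns = connectivity metrics for one freshwater type
conn_metrics <- read.csv("hu8_connectivity_metrics.csv", row.names = 1)

pca    <- prcomp(conn_metrics, center = TRUE, scale. = TRUE)  # step 2: reduce collinearity
scores <- pca$x[, 1:3]                                        # retain leading components (illustrative)

set.seed(1)
clusters <- kmeans(scores, centers = 5, nstart = 25)          # step 3: k = 5 is illustrative
table(clusters$cluster)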
CC0 1.0 Universal Public Domain Dedication, https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
The 30-year average values of the climate variables for the first day and the length in days of episodes that meet certain conditions. Ordered by climate variable.
A dataset of mentions, growth rate, and total volume of the keyphrase 'Continuous Variable' over time.
This metadata record describes 99 streamflow (referred to as flow) metrics calculated using the observed flow records at 1851 streamflow gauges across the conterminous United States from 1950 to 2018. These metrics are often used as dependent variables in statistical models to make predictions of flow metrics at ungaged locations. Specifically, this record describes (1) the U.S. Geological Survey streamgauge identification number, (2) the 1-, 7-, and 30-day consecutive minimum flow normalized by drainage area, DA (Q1/DA, Q7/DA, and Q30/DA [cfs/sq km]), (3) the 1st, 10th, 25th, 50th, 75th, 90th, and 99th nonexceedence flows normalized by DA (P01/DA, P10/DA, P25/DA, P50/DA, P75/DA, P90/DA, P99/DA [cfs/sq km]), (4) the annual mean flows normalized by DA (Mean/DA [cfs/sq km]), (5) the coefficient of variation of the annual minimum and maximum flows (Vmin and Vmax [dimensionless]), the average annual duration of flow pulses less than P10 and greater than P90 (Dl and Dh [number of days]), (6) the average annual number of flow pulses less than P10 and greater than P90 (Fl and Fh [number of events]), (7) the average annual skew of daily flows (Skew [dimensionless]), (8) the number of days where flow is greater than the previous day divided by the total number of days (daily rises [dimensionless]), (9) the low- and high-flow timing metrics for winter, spring, summer, and fall (Winter_Tl, Spring_Tl, Summer_Tl, Fall_Tl, Winter_Th, Spring_Th, Summer_Th, and Fall_Th [dimensionless]), (10) the monthly nonexceedence flows normalized by DA (JAN, FEB, MAR, APR, MAY, JUN, JUL, AUG, SEP, OCT, NOV, and DEC P'X'/DA where 'X' = 10, 20, 50, 80, and 90 [cfs/sq km]), and (11) monthly mean flow normalized by DA (JAN, FEB, MAR, APR, MAY, JUN, JUL, AUG, SEP, OCT, NOV, and DEC mean/DA [cfs/sq km]). For more details on flow metrics related to (2) through (8) and (11), please see Eng, K., Grantham, T.E., Carlisle, D.M., and Wolock, D.M., 2017, Predictability and selection of hydrologic metrics in riverine ecohydrology: Freshwater Science, v. 36(4), p. 915-926 [Also available at https://doi.org/10.1086/694912]. For more details on (9), please see Eng, K., Carlisle, D.M., Grantham, T.E., Wolock, D.M., and Eng, R.L., 2019, Severity and extent of alterations to natural streamflow regimes based on hydrologic metrics in the conterminous United States, 1980-2014: U.S. Geological Survey Scientific Investigations Report 2019-5001, 25 p. [Also available at https://doi.org/10.3133/sir20195001]. For (10), all daily flow values for the month of interest across all years are ranked in descending order, and the flow values associated with 10, 20, 50, 80, and 90 percent of all flow values are assigned as the monthly percent values. The data are in a tab-delimited text format.
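A small sketch of the procedure described in (10), assuming a data frame flow with a Date column (class Date) and daily discharge Q in cfs, and a drainage area DA in square kilometres; the names are illustrative, not from the original processing code:

flow$month <- format(flow$Date, "%m")

monthly_pct <- sapply(split(flow$Q, flow$month), function(q) {
  q_sorted <- sort(q, decreasing = TRUE)                  # rank daily values in descending order
  idx <- ceiling(c(0.10, 0.20, 0.50, 0.80, 0.90) * length(q_sorted))
  q_sorted[idx] / DA                                      # values at 10, 20, 50, 80, 90 percent, per unit area
})
rownames(monthly_pct) <- c("P10", "P20", "P50", "P80", "P90")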
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains files that show the climate change velocity metrics calculated for three climate variables across Finland. The climate velocities were used to study the magnitude of projected climatic changes in the nation-wide Natura 2000 protected area (PA) network (Heikkinen et al., 2020). Using fine-resolution climate data that describe present-day and future topoclimates and their spatio-temporal variation, the study explored the rate of climatic change in protected areas at an ecologically relevant but still poorly explored scale.

The velocities for the three climate variables were developed in the following work, which provides an in-depth description of the velocity metric calculation steps and a number of visualisations of their spatial variation across Finland: Risto K. Heikkinen 1, Niko Leikola 1, Juha Aalto 2,3, Kaisu Aapala 1, Saija Kuusela 1, Miska Luoto 2 & Raimo Virkkala 1, 2020: Fine-grained climate velocities reveal vulnerability of protected areas to climate change. Scientific Reports 10:1678. https://doi.org/10.1038/s41598-020-58638-8. (1 Finnish Environment Institute, Biodiversity Centre, Latokartanonkaari 11, FI-00790 Helsinki, Finland; 2 Department of Geosciences and Geography, University of Helsinki, FI-00014, Helsinki, Finland; 3 Finnish Meteorological Institute, FI-00101, Helsinki, Finland.)

The dataset includes GIS-compatible geotiff files describing the nine spatial climate velocity surfaces calculated across the whole of Finland at 50 m × 50 m spatial resolution. These nine velocity surfaces consist of velocity metric values measured for each 50-m grid cell, separately for the three climate variables and in relation to the three future climate scenarios (RCP2.6, RCP4.5 and RCP8.5).

The baseline climate data for the study were the monthly temperature and precipitation data averaged for the period 1981-2010 and modelled at a resolution of 50 m, from which estimates for the annual temperature sum above 5 °C (growing degree days, GDD, °C), the mean January temperature (TJan, °C) and the annual climatic water balance (WAB, the difference between annual precipitation and potential evapotranspiration; mm) were calculated. Corresponding future climate surfaces were produced using an ensemble of 23 global climate models for the years 2070–2099 (Taylor et al. 2012) and the three RCPs. The data for the three climate variables for 1981–2010 and under the three RCPs will be made available separately via METIS - FMI's Research Data repository service (Aalto et al., in prep.).

The climate velocity surfaces included in the present data repository were developed using the climate-analog approach (Hamann et al. 2015; Batllori et al. 2017; Brito-Morales et al. 2018), whereby velocity metrics for the 50-m grid cells were measured based on the distance between climatically similar cells under the baseline and the future climates, calculated separately for the three climate variables.

In Heikkinen et al. (2020), spatial data for the Natura 2000 protected areas were used to assess their exposure to climate change. The full data on N2K areas can be downloaded from the following link: https://ckan.ymparisto.fi/dataset/%7BED80465E-135B-4391-AA8A-FE2038FB224D%7D. Note, however, that N2K areas consisting of multiple physically separate patches were treated as separate polygons in Heikkinen et al. (2020), and a minimum size requirement of 2 hectares was applied.
Moreover, the digital elevation model (DEM) data for Finland (which were dissected to Natura 2000 polygons to examine their elevational variation and its relationship to topoclimatic variation) can be downloaded from the following link: https://ckan.ymparisto.fi/en/dataset/dem25_astergdem25. The coordinate system for the climate velocity data files is ETRS-TM35FIN (EPSG: 3067) (or YKJ Finland/Finnish Uniform Coordinate System (EPSG: 2393)).

A summary of the key settings and elements of the study is provided below; a detailed treatment is provided in Heikkinen et al. (2020). Key to the files (four files per velocity layer: *.tif, *.tfw, *.ovr and *.tif.aux.xml) in the dataset:
(a) Velocity of GDD with respect to the RCP2.6 future climate (Fig. 2a in Heikkinen et al. 2020). Name of the file: GDDRCP26.*
(b) Velocity of GDD with respect to the RCP4.5 future climate (Fig. 2b in Heikkinen et al. 2020). Name of the file: GDDRCP45.*
(c) Velocity of GDD with respect to the RCP8.5 future climate (Fig. 2c in Heikkinen et al. 2020). Name of the file: GDDRCP85.*
(d) Velocity of mean January temperature with respect to the RCP2.6 future climate (Fig. 2d in Heikkinen et al. 2020). Name of the file: TJanRCP26.*
(e) Velocity of mean January temperature with respect to the RCP4.5 future climate (Fig. 2e in Heikkinen et al. 2020). Name of the file: TJanRCP45.*
(f) Velocity of mean January temperature with respect to the RCP8.5 future climate (Fig. 2f in Heikkinen et al. 2020). Name of the file: TJanRCP85.*
(g) Velocity of climatic water balance with respect to the RCP2.6 future climate (Fig. 2g in Heikkinen et al. 2020). Name of the file: WABRCP26.*
(h) Velocity of climatic water balance with respect to the RCP4.5 future climate (Fig. 2h in Heikkinen et al. 2020). Name of the file: WABRCP45.*
(i) Velocity of climatic water balance with respect to the RCP8.5 future climate (Fig. 2i in Heikkinen et al. 2020). Name of the file: WABRCP85.*
Note that velocity surfaces e and f include disappearing climate conditions.

Summary of the study: Climate velocity is a generic metric that provides useful information for climate-wise conservation planning by identifying regions and protected areas where climate conditions are changing most rapidly, exposing them to high rates of climate displacement (Batllori et al. 2017) and potential carry-over impacts on community structure and ecosystem functions (Ackerly et al. 2010). Climate velocity has typically been used to assess the climatic risks for species and their populations, but velocity metrics can also be used to identify protected areas that face overall difficulties in retaining the ecological conditions that promote present-day biodiversity. Earlier climate velocity assessments have focussed on the domains of the mesoclimate (resolutions of 1–100 km) or macroclimate (>100 km scales), and fine-grained (<100 m) local climatic conditions created by variation in topography ('topoclimate'; Ackerly et al. 2010; 2020) have largely been overlooked (Heikkinen et al. 2020). This omission may lead to biased exposure assessments, especially in rugged terrain (Dobrowski et al. 2013; Franklin et al. 2013), as well as a limited ability to detect sites decoupled from the regional climate (Aalto et al. 2017; Lenoir et al. 2017). This study provided the first assessment of climatic exposure risks across a national PA (Natura 2000) network based on very fine-grained velocities of three established drivers of high-latitude biodiversity.
To produce fine-grained climate velocity measures, 50-m resolution monthly temperature and precipitation data averaged for 1981–2010 were first developed, and based on these, the three bioclimatic variables (growing degree days, mean January temperature and annual climatic water balance) were calculated for the whole study domain. In the next phase, similar future climate surfaces were produced based on data from an ensemble of 23 global climate models, extracted from the CMIP5 archives for the years 2070–2099 and the three RCP scenarios (RCP2.6, RCP4.5 and RCP8.5). In the final step, climate velocities for each 50 × 50 m grid cell were measured using the climate-analog velocity method (Hamann et al. 2015), based on the distance between climatically similar cells under the baseline and future climates.

The results revealed notable spatial differences in the high-velocity areas for the three bioclimatic variables, indicating contrasting exposure risks for protected areas situated in different regions. Moreover, comparisons of the 50-m baseline and future climate surfaces revealed a potential wholesale disappearance of current topoclimatic temperature conditions from almost all the studied PAs by the end of this century.

Calculation of climate change velocity metrics for the three climate variables: The overall calculation of climate velocities included three main steps. (1) In the first step, we developed high-resolution monthly average temperature and precipitation data averaged over the years 1981–2010 across the study domain at a spatial resolution of 50 × 50 m. This was done by building topoclimatic models based on climate data sourced from 313 meteorological stations (European Climate Assessment and Dataset [ECA&D]) (Klok et al. 2009). Our station network and modelling domain covered the whole of Finland with an additional 100 km buffer, and it was also extended to cover large parts of northern Sweden and Norway for areas >66.5°N, as well as selected adjacent areas in Russia (for details see Heikkinen et al. 2020). This was done to capture the present-day climate spaces in Finland that are projected to move beyond the country borders in the future but have analogous climate areas in neighbouring regions, thereby avoiding a large number of velocity values being deemed infinite or unknown in the data for Finland. The 50-m resolution average air temperature data were developed for the study domain using generalized additive modelling (GAM), as implemented in the R package mgcv version 1.8–7 (R Development Core Team 2011; Wood 2011). In this modelling we utilised variables of geographical location (latitude and longitude, included as an anisotropic interaction), topography (elevation, potential incoming solar radiation, relative elevation) and water cover (sea and lake proximity), with subsequent leave-one-out cross-validation tests to assess model performance (for the full process description, see Aalto et al. 2017; Heikkinen et
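The climate-analog idea can be sketched in R as below; this is a brute-force illustration of the principle (find the nearest cell whose future climate matches the baseline value within a tolerance, then divide the distance by the length of the period), not the authors' implementation, and the tolerance, the year separation and the object names are assumptions:

library(raster)

base_xy <- rasterToPoints(base)      # baseline raster of one variable: columns x, y, value
fut_xy  <- rasterToPoints(fut)       # future raster of the same variable
years   <- 2085 - 1995               # separation of period midpoints (illustrative)
tol     <- 100                       # climatic tolerance in the variable's units (illustrative)

velocity <- apply(base_xy, 1, function(cell) {
  analog <- fut_xy[abs(fut_xy[, 3] - cell[3]) <= tol, , drop = FALSE]
  if (nrow(analog) == 0) return(NA)  # no analogous future climate: disappearing conditions
  d <- sqrt((analog[, 1] - cell[1])^2 + (analog[, 2] - cell[2])^2)
  min(d) / years                     # distance to nearest analog per year
})

For a 50-m national grid this naive search is far too slow; efficient nearest-analog searches (e.g., over binned climate values) are used in practice.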
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The monitoring of surface-water quality followed by water-quality modeling and analysis is essential for generating effective strategies in water resource management. However, water-quality studies are limited by the lack of complete and reliable data sets on surface-water-quality variables. These deficiencies are particularly noticeable in developing countries.
This work focuses on surface-water-quality data from the Santa Lucía Chico river (Uruguay), a mixed lotic and lentic river system. Data collected at six monitoring stations are publicly available at https://www.dinama.gub.uy/oan/datos-abiertos/calidad-agua/. The high temporal and spatial variability that characterizes water-quality variables and the high rate of missing values (between 50% and 70%) raise significant challenges.
To deal with missing values, we applied several statistical and machine-learning imputation methods. The competing algorithms implemented include both univariate and multivariate imputation methods: inverse distance weighting (IDW), Random Forest Regressor (RFR), Ridge (R), Bayesian Ridge (BR), AdaBoost (AB), Huber Regressor (HR), Support Vector Regressor (SVR), and K-nearest neighbors Regressor (KNNR).
IDW outperformed the others, achieving a very good performance (NSE greater than 0.8) in most cases.
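For reference, the Nash-Sutcliffe efficiency (NSE) used to score the imputations can be written as a one-line R function (obs are the withheld observed values, imp the corresponding imputed values; this is the standard formula, not code from the paper):

nse <- function(obs, imp) 1 - sum((obs - imp)^2) / sum((obs - mean(obs))^2)

nse(obs, imp)   # values above 0.8 were considered very good performance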
In this dataset, we include the original and imputed values for the following variables:
Water temperature (Tw)
Dissolved oxygen (DO)
Electrical conductivity (EC)
pH
Turbidity (Turb)
Nitrite (NO2-)
Nitrate (NO3-)
Total Nitrogen (TN)
Each variable is identified as [STATION] VARIABLE FULL NAME (VARIABLE SHORT NAME) [UNIT METRIC].
More details about the study area, the original datasets, and the methodology adopted can be found in our paper https://www.mdpi.com/2071-1050/13/11/6318.
If you use this dataset in your work, please cite our paper:
Rodríguez, R.; Pastorini, M.; Etcheverry, L.; Chreties, C.; Fossati, M.; Castro, A.; Gorgoglione, A. Water-Quality Data Imputation with a High Percentage of Missing Values: A Machine Learning Approach. Sustainability 2021, 13, 6318. https://doi.org/10.3390/su13116318
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The 30-year average values of various climate variables per 10-day period. Ordered by climate variable. Traditionally, each month is divided into two 10-day periods and the remainder of the month.
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Wilson et al. SOM Tables.xlsx
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The 30-year average values of various climate variables per month, season or year. Ordered by climate variable.
This dataset is a compilation of 13 thermal response metrics for 834 freshwater fish species across the conterminous United States (CONUS). The data were extracted from six published sources, many of which are compilations of data from other sources. The data were harmonized for comparison, and additional variables were added to summarize the metrics. The dataset is presented as a spreadsheet containing 17 sheets. The first sheet (datasets) describes the data sources. Other sheets describe the source and compilation variables in detail.
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The 30-year average values of various climate variables, determined per month, per season and per year. Ordered by climate variable.
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The dataset includes intensity bin composites of column-integrated moist static energy (MSE) spatial variance budget feedback terms for GCMs, reanalyses, and CloudSat for:
Starr, J. C., A. A. Wing, S. J. Camargo, D. Kim, T. Y. Lee, and J. Moon: Using the moist static energy variance budget to evaluate tropical cyclones in climate models against reanalyses and satellite observations. Journal of Climate, In Review.
Description of Files for GCMs and Reanalyses
For each of the GCMs and reanalyses used in this study, there are 4 saved netCDF files: 2 for intensity bin composites with maximum wind speed (Vmax) as the binning metric and 2 with minimum mean sea level pressure (MSLP). Since each of the GCMs and reanalyses has the same file format, the AM4 model is used as an example of what each file contains and how it is organized. Each reanalysis and GCM has its own .tar file containing the four netCDF files mentioned.
AM4_Binned_Composites_V2.nc contains the Vmax-binned intensity bin composite means of all the variables. The first dimension of each of these variables is "bin", which represents the bin mean value; for example, the first bin value is 1.5, representing the 0-3 m/s bin, and values increase by 3 m/s from there. The spatial composites, which are 2-dimensional variables, have dimensions "lat" and "lon" ranging from -5 degrees to 5 degrees, since the center of each spatial intensity bin composite is at 0 degrees, 0 degrees. The azimuthal mean variables have dimension "nr", which represents the radial increments.
AM4_Binned_STDEVS_of_BoxAvgs_V2.nc contains the Vmax-binned intensity bin composite standard deviations of the box-averaged variables as well as the azimuthal mean feedback variables. This file is used in calculating the 5 to 95% confidence intervals for the azimuthal mean and box average plots.
AM4_Binned_Composites_MSLP.nc contains the minimum MSLP-binned intensity bin composite means of all the variables. This file is set up the same as the Vmax-binned file, but now the first dimension "bin" represents the bin mean value using minimum MSLP as the intensity metric. For example, the first bin of this dimension is 882.5 hPa, which is the mean value of the 880-885 hPa bin. These mean values then increase by 5 hPa to the weakest bin of 1020-1025 hPa.
AM4_Binned_STDEVS_of_BoxAvgs_MSLP.nc is set up identically to the Vmax-binned version of the standard deviation file, just now with minimum MSLP as the binning metric.
These files contain all the variables that are pertinent to the MSE spatial variance budget, but also some that were not utilized in this study. The variables listed below are those that were utilized in this study.
"bincounts": the number of snapshots in each intensity bin
3-D Variables (Spatial composites (bin,lat,lon)):
"hanom": anomaly of column-integrated MSE from the domain-mean column-integrated MSE
"hanom_SEFanom": the surface enthalpy flux (SEF) feedback
"hanom_LWanom": the longwave (LW) feedback
"hanom_SWanom": the shortwave (SW) feedback
2-D Variables (Azimuthal mean composites (bin,nr)):
"Azmean_hSEF": Azimuthal mean SEF feedback
"Azmean_hLW": Azimuthal mean LW feedback
"Azmean_hSW": Azimuthal mean SW feedback
1-D Variables (Box-averaged composites (bin)):
"new_boxav_hvar": the box-averaged variance of column-integrated MSE
"new_boxav_hanom_SEFanom": the box-averaged SEF feedback
"new_boxav_hanom_LWanom": the box-averaged LW feedback
"new_boxav_hanom_SWanom": the box-averaged SW feedback
"new_boxav_norm_hanom_SEFanom": the normalized box-averaged SEF feedback
"new_boxav_norm_hanom_LWanom": the normalized box-averaged LW feedback
"new_boxav_norm_hanom_SWanom": the normalized box-averaged SW feedback
To get the standard deviations of the azimuthal mean and box-averaged feedbacks for each intensity bin, the same variable names as above are used in the standard deviation file.
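A minimal sketch for reading one of these composites in R with the ncdf4 package (the file and variable names follow the description above; the package choice is an assumption):

library(ncdf4)

nc   <- nc_open("AM4_Binned_Composites_V2.nc")
bins <- ncvar_get(nc, "bin")           # bin mean Vmax values (1.5, 4.5, ... m/s)
lw   <- ncvar_get(nc, "Azmean_hLW")    # azimuthal mean LW feedback, dimensions (bin, nr)
nc_close(nc)

dim(lw)                                # one radial profile per intensity bin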
Description of File for CloudSat
This file was provided by work done in:
Lee, T.-Y., and A. Wing, 2024: Satellite-based estimation on the role of cloud-radiative interaction in accelerating tropical cyclone development. Journal of the Atmospheric Sciences, 64 (81), 959-982, https://doi.org/10.1175/JAS-D-23-0142.1.
CloudSat_Composite_IR_RRTMGclimlab_vi4_IR_Vmax999_000_R3.nc contains the Vmax-binned intensity bin composites of the MSE variance budget feedback variables. Each of the CloudSat variables is provided as a radial profile with dimensions like those in the reanalyses and GCMs: intensity bin and then radius. The variables from this file that were utilized in this study are listed below.
"RadFB_LW_ALL_500": radial composite of the LW feedback
"RadFB_SWDAY_ALL_500": radial composite of the SW feedback
"RadFB_Net_ALL_500": radial composite of the total radiative feedback
"RadFB_LW_CLEARSKY_500": radial composite of the clear-sky LW feedback
"RadFB_SWDAY_CLEARSKY_500": radial composite of the clear-sky SW feedback
"RadFB_Net_CLEARSKY_500": radial composite of the clear-sky total radiative feedback