Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
A database with R-groups frequently used in medicinal chemistry and their preferred replacements is provided. For frequently used R-groups, replacements are organized in hierarchies as specified in the readme.txt file. The data deposition accompanies a forthcoming publication by the authors in which the database will be described in detail.
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
This publication contains the raw data as well as the evaluation scripts (written in R) of the paper "Formation of Study Groups: Exploring Students' Needs and Practical Challenges".
The evaluation data was collected using the software that is published in https://doi.org/10.5281/zenodo.10678081.
https://spdx.org/licenses/CC0-1.0.html
Biological processes exhibit complex temporal dependencies due to the sequential nature of allocation decisions in organisms’ life-cycles, feedback loops, and two-way causality. Consequently, longitudinal data often contain cross-lags: the predictor variable depends on the response variable of the previous time-step. Although statisticians have warned that regression models that ignore such covariate endogeneity in time series are likely to be inappropriate, this has received relatively little attention in biology. Furthermore, the resulting degree of estimation bias remains largely unexplored.
We use a graphical model and numerical simulations to understand why and how regression models that ignore cross-lags can be biased, and how this bias depends on the length and number of time series. Ecological and evolutionary examples are provided to illustrate that cross-lags may be more common than is typically appreciated and that they occur in functionally different ways.
We show that routinely used regression models that ignore cross-lags are asymptotically unbiased. However, this offers little relief, as for most realistically feasible lengths of time series conventional methods are biased. Furthermore, collecting time series on multiple subjects (such as populations, groups or individuals) does not help to overcome this bias when the analysis focusses on within-subject patterns (often the pattern of interest). Simulations (R tutorials 1 and 2), a literature search and a real-world empirical example on fairy wrens (data archived here with analyses presented in R tutorial 3) together suggest that approaches that ignore cross-lags are likely biased in the direction opposite to the sign of the cross-lag (e.g. towards detecting density-dependence of vital rates and against detecting life history trade-offs and benefits of group living). Next, we show that multivariate (e.g. structural equation) models can dynamically account for cross-lags, and simultaneously address additional bias induced by measurement error, but only if the analysis considers multiple time series.
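The finite-sample bias described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the authors' R tutorials: the true effect of x on y is set to zero, x at each time-step depends on the previous y through a hypothetical `cross_lag` coefficient, all noise is standard normal, and the function names are invented for illustration.

```python
import random
import statistics


def ols_slope(x, y):
    """Ordinary least-squares slope of y on x (with an intercept)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx


def simulate_series(length, cross_lag, rng):
    """One time series in which x_t depends on y_{t-1} (assumed model)."""
    x, y = [], []
    y_prev = rng.gauss(0.0, 1.0)
    for _ in range(length):
        x_t = cross_lag * y_prev + rng.gauss(0.0, 1.0)
        y_t = rng.gauss(0.0, 1.0)  # true effect of x on y is zero
        x.append(x_t)
        y.append(y_t)
        y_prev = y_t
    return x, y


def mean_estimated_slope(length, cross_lag, replicates, seed=1):
    """Average within-series OLS slope over many simulated series."""
    rng = random.Random(seed)
    slopes = [
        ols_slope(*simulate_series(length, cross_lag, rng))
        for _ in range(replicates)
    ]
    return statistics.fmean(slopes)


# Positive cross-lag, short versus long series.
short_bias = mean_estimated_slope(length=5, cross_lag=0.8, replicates=20000)
long_bias = mean_estimated_slope(length=200, cross_lag=0.8, replicates=1000)
print(f"mean slope, T=5:   {short_bias:.3f}")
print(f"mean slope, T=200: {long_bias:.3f}")
```

Because the true slope is zero here, any average departure from zero is estimation bias. With a positive cross-lag the short series should give a negative average slope, which shrinks towards zero as the series lengthens.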
We provide guidance on how to identify a cross-lag and subsequently specify it in a multivariate model, which can be far from trivial. Our tutorials with data and R code of the worked examples provide step‐by‐step instructions on how to perform such analyses.
Our study offers insights into situations in which cross-lags can bias analysis of ecological and evolutionary time series and suggests that adopting dynamical models can be important, as this directly affects our understanding of population regulation, the evolution of life histories and cooperation, and possibly many other topics. Determining how strong estimation bias due to ignoring covariate endogeneity has been in the ecological literature requires further study, also because it may interact with other sources of bias.
Methods: The data were part of a long-term study on red-winged fairy wrens (Malurus elegans) in south-west Australia (Pemberton) from 2008 to 2016. In each year, data were collected on group size, offspring production and survival of all group members. See the description in Box 4 of the associated paper, and references therein.
This lesson helps students know some of the options for how to graph grouped continuous data (such as those involved in doing a t-test or ANOVA) and how to choose the best option.
Fo Usa R Group Company Export Import Records. Follow the Eximpedia platform for HS code, importer-exporter records, and customs shipment details.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Pooled within-group correlations (r) between functions and variables.
https://hedgefollow.com/license.php
A list of the top 50 Paul R Ried Financial Group LLC holdings showing which stocks are owned by Paul R Ried Financial Group LLC's hedge fund.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
In the last decade, a plethora of algorithms have been developed for spatial ecology studies. In our case, we use some of these algorithms for underwater research in applied ecology, analysing threatened endemic fishes and their natural habitat. For this, we developed scripts in the RStudio® environment to run spatial and statistical analyses for ecological response and spatial distribution models (e.g., Hijmans & Elith, 2017; Den Burg et al., 2020). The employed R packages are as follows: caret (Kuhn et al., 2020), corrplot (Wei & Simko, 2017), devtools (Wickham, 2015), dismo (Hijmans & Elith, 2017), gbm (Freund & Schapire, 1997; Friedman, 2002), ggplot2 (Wickham et al., 2019), lattice (Sarkar, 2008), lattice (Musa & Mansor, 2021), maptools (Hijmans & Elith, 2017), modelmetrics (Hvitfeldt & Silge, 2021), pander (Wickham, 2015), plyr (Wickham & Wickham, 2015), pROC (Robin et al., 2011), raster (Hijmans & Elith, 2017), RColorBrewer (Neuwirth, 2014), Rcpp (Eddelbeuttel & Balamura, 2018), rgdal (Verzani, 2011), sdm (Naimi & Araujo, 2016), sf (e.g., Zainuddin, 2023), sp (Pebesma, 2020) and usethis (Gladstone, 2022).
It is important to run all the codes in order to obtain results from the ecological response and spatial distribution models. In particular, for the ecological scenario, we selected the Generalized Linear Model (GLM) and for the geographic scenario we selected DOMAIN, also known as Gower's metric (Carpenter et al., 1993). We selected this regression method and this distance similarity metric because of their adequacy and robustness for studies with endemic or threatened species (e.g., Naoki et al., 2006). Next, we explain the statistical parameterization of the codes used to run GLM and DOMAIN:
In the first instance, we generated the background points and extracted the values of the variables (Code2_Extract_values_DWp_SC.R). Barbet-Massin et al. (2012) recommend the use of 10,000 background points when using regression methods (e.g., Generalized Linear Model) or distance-based models (e.g., DOMAIN). However, we also considered factors such as the extent of the area and the type of study species to be important when selecting the number of points (Pers. Obs.). Then, we extracted the values of the predictor variables (e.g., bioclimatic, topographic, demographic, habitat) at the presence and background points (e.g., Hijmans and Elith, 2017).
Subsequently, we subdivided both the presence and background point groups into 75% training data and 25% test data each, following the method of Soberón & Nakamura (2009) and Hijmans & Elith (2017). For training control, the 10-fold cross-validation method is selected, where the response variable presence is assigned as a factor. If some other variable is important for the study species, it should also be assigned as a factor (Kim, 2009).
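The 75%/25% subdivision and the 10-fold training control can be sketched as follows. This is an illustration in Python, not the authors' R code; the record identifiers, group sizes, seed and function names are all hypothetical.

```python
import random


def train_test_split(records, train_fraction=0.75, seed=42):
    """Shuffle the records and split them into training and test subsets."""
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def k_fold_indices(n_records, k=10):
    """Yield (training, validation) index lists for k-fold cross-validation."""
    indices = list(range(n_records))
    for fold in range(k):
        validation = indices[fold::k]  # every k-th record, offset by fold
        held_out = set(validation)
        training = [i for i in indices if i not in held_out]
        yield training, validation


# Hypothetical record identifiers standing in for presence and background points.
presence = [f"presence_{i}" for i in range(40)]
background = [f"background_{i}" for i in range(400)]

presence_train, presence_test = train_test_split(presence)
background_train, background_test = train_test_split(background)

n_training = len(presence_train) + len(background_train)
folds = list(k_fold_indices(n_training, k=10))
```

Each group is split separately so that both the training and the test data keep the same ratio of presence to background points.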
After that, we ran the code for the GBM method (Gradient Boosting Machine; Code3_GBM_Relative_contribution.R and Code4_Relative_contribution.R), from which we obtained the relative contribution of the variables used in the model. We parameterized the code with a Gaussian distribution and cross iteration of 5,000 repetitions (e.g., Friedman, 2002; Kim, 2009; Hijmans and Elith, 2017). In addition, we selected a validation interval of 4 random training points (Personal test). The resulting plots were the partial dependence blocks as a function of each predictor variable.
Subsequently, the correlation between variables is computed with Pearson's method (Code5_Pearson_Correlation.R) to evaluate multicollinearity (Guisan & Hofer, 2003). It is recommended to use a bivariate correlation threshold of ±0.70 to discard highly correlated variables (e.g., Awan et al., 2021).
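As a rough illustration of this screening step, the sketch below computes Pearson's r for every pair of predictors and flags pairs whose absolute correlation reaches 0.70. It is written in Python rather than taken from Code5_Pearson_Correlation.R, and the variable names and values are invented.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def highly_correlated_pairs(variables, threshold=0.70):
    """Return (name_a, name_b, r) for every pair with |r| >= threshold."""
    flagged = []
    for (name_a, a), (name_b, b) in combinations(variables.items(), 2):
        r = pearson(a, b)
        if abs(r) >= threshold:
            flagged.append((name_a, name_b, r))
    return flagged


# Hypothetical predictor values extracted at six points.
predictors = {
    "depth": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "temperature": [20.1, 18.9, 18.2, 16.8, 16.1, 15.0],
    "slope": [2.0, 5.0, 1.0, 4.0, 2.0, 4.0],
}

flagged = highly_correlated_pairs(predictors)
for name_a, name_b, r in flagged:
    print(f"{name_a} ~ {name_b}: r = {r:.2f}")
```

For each flagged pair, one of the two variables would normally be dropped before fitting. Which one to keep is a modelling decision that the threshold itself does not settle.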
Once the above codes were run, we loaded the same subgroups (i.e., presence and background groups with 75% training and 25% testing) (Code6_Presence&backgrounds.R) for the GLM method code (Code7_GLM_model.R). Here, we first ran the GLM models per variable to obtain the p-value of each variable (alpha ≤ 0.05); we selected the value one (i.e., presence) as the likelihood factor. The generated models include polynomial terms to obtain linear and quadratic responses (e.g., Fielding and Bell, 1997; Allouche et al., 2006). From these results, we ran ecological response curve models, where the resulting plots included the probability of occurrence and values for continuous variables or categories for discrete variables. The points of the presence and background training group are also included.
In addition, a global GLM was run and evaluated by means of a 2 x 2 contingency matrix that includes both observed and predicted records. A representation of this is shown in Table 1 (adapted from Allouche et al., 2006). In this process we selected an arbitrary threshold of 0.5 to obtain better modeling performance and avoid a high percentage of bias in type I (commission) or type II (omission) errors (e.g., Carpenter et al., 1993; Fielding and Bell, 1997; Allouche et al., 2006; Kim, 2009; Hijmans and Elith, 2017).
Table 1. Example of 2 x 2 contingency matrix for calculating performance metrics for GLM models. A represents true presence records (true positives), B represents false presence records (false positives - error of commission), C represents true background points (true negatives) and D represents false backgrounds (false negatives - errors of omission).
| Model \ Validation set | True | False |
|---|---|---|
| Presence | A | B |
| Background | C | D |
We then calculated the Overall accuracy and True Skill Statistic (TSS) metrics. The first assesses the proportion of correctly predicted cases, while the second combines the rate of correctly predicted presences (sensitivity) with the rate of correctly predicted backgrounds (specificity) (Olden and Jackson, 2002). TSS also corrects for random performance while giving equal importance to the prediction of presences and backgrounds (Fielding and Bell, 1997; Allouche et al., 2006).
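A minimal sketch of these two metrics follows. It uses the cell labels of Table 1 (A true presences, B false presences, C true backgrounds, D false backgrounds) and the 0.5 threshold mentioned earlier. The TSS is computed with the standard definition, sensitivity + specificity - 1. The observed values and predicted probabilities are invented, and the function names are hypothetical.

```python
def contingency_counts(observed, predicted_prob, threshold=0.5):
    """Count the cells A, B, C, D of the 2 x 2 contingency matrix."""
    a = b = c = d = 0
    for obs, prob in zip(observed, predicted_prob):
        predicted_presence = prob >= threshold
        if predicted_presence and obs == 1:
            a += 1  # true presence (true positive)
        elif predicted_presence and obs == 0:
            b += 1  # false presence (commission error)
        elif not predicted_presence and obs == 0:
            c += 1  # true background (true negative)
        else:
            d += 1  # false background (omission error)
    return a, b, c, d


def overall_accuracy(a, b, c, d):
    """Proportion of all cases that were predicted correctly."""
    return (a + c) / (a + b + c + d)


def true_skill_statistic(a, b, c, d):
    """TSS = sensitivity + specificity - 1."""
    sensitivity = a / (a + d)
    specificity = c / (b + c)
    return sensitivity + specificity - 1.0


# Hypothetical test data: 1 = presence, 0 = background.
observed = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted_prob = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1, 0.3, 0.2]

counts = contingency_counts(observed, predicted_prob)
accuracy = overall_accuracy(*counts)
tss = true_skill_statistic(*counts)
print(counts, round(accuracy, 3), round(tss, 3))
```

TSS ranges from -1 to 1, and values near zero indicate performance no better than random.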
The last code (i.e., Code8_DOMAIN_SuitHab_model.R) is for species distribution modelling using the DOMAIN algorithm (Carpenter et al., 1993). Here, we loaded the variable stack and the presence and background groups, each subdivided into 75% training and 25% test data. We only included the presence training subset and the predictor variables stack in the calculation of the DOMAIN metric, as well as in the evaluation and validation of the model.
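The idea behind the DOMAIN metric can be sketched as follows: a candidate location is scored by its highest Gower similarity to any training presence point, where the Gower distance is the mean absolute difference per variable, standardised by that variable's range. This Python sketch assumes continuous predictors only and ranges taken over the presence points. It is not the dismo implementation, and the example values are invented.

```python
def gower_similarity(point, reference, ranges):
    """One minus the mean range-standardised absolute difference."""
    distance = sum(
        abs(p - r) / span for p, r, span in zip(point, reference, ranges)
    ) / len(ranges)
    return 1.0 - distance


def domain_score(point, presence_points):
    """Highest Gower similarity between a point and any presence point."""
    # Assumption: each variable's range is taken over the presence points
    # and is non-zero for every variable.
    ranges = [max(column) - min(column) for column in zip(*presence_points)]
    return max(
        gower_similarity(point, reference, ranges)
        for reference in presence_points
    )


# Hypothetical presence training points: (variable 1, variable 2).
presence_points = [(10.0, 100.0), (20.0, 300.0), (30.0, 200.0)]

score_at_presence = domain_score((20.0, 300.0), presence_points)
score_nearby = domain_score((25.0, 250.0), presence_points)
print(score_at_presence, score_nearby)
```

A score of 1 means the location matches a presence point exactly. Lower scores indicate environments less similar to any known presence.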
Regarding the model evaluation and estimation, we selected the following estimators:
1) partial ROC, which evaluates the distance between the curves of positive (i.e., correctly predicted presence) and negative (i.e., correctly predicted absence) cases. The farther apart these curves are, the better the model predicts the correct spatial distribution of the species (Manzanilla-Quiñones, 2020).
2) ROC/AUC curve for model validation, where an optimal performance threshold is estimated to have an expected confidence of 75% to 99% probability (De Long et al., 1988).
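To make the second estimator concrete, the sketch below computes the area under the ROC curve as the probability that a randomly chosen presence point receives a higher model score than a randomly chosen background point, with ties counted as one half. This rank-based formulation is equivalent to the AUC. It is not the pROC implementation used by the authors, it does not cover the partial ROC, and the scores are invented.

```python
def roc_auc(presence_scores, background_scores):
    """AUC as the share of presence/background pairs ranked correctly."""
    wins = 0.0
    for p in presence_scores:
        for b in background_scores:
            if p > b:
                wins += 1.0
            elif p == b:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(presence_scores) * len(background_scores))


# Hypothetical model scores for test presence and background points.
presence_scores = [0.9, 0.8, 0.6, 0.3]
background_scores = [0.7, 0.4, 0.2, 0.1, 0.3, 0.2]

auc = roc_auc(presence_scores, background_scores)
print(round(auc, 3))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of presence from background scores.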
Subscribers can find out export and import data of 23 countries by HS code or product’s name. This demo is helpful for market analysis.
Data licence Germany – Attribution – Version 2.0: https://www.govdata.de/dl-de/by-2-0
License information was derived automatically
Table 1.7.2: R & D staff by gender, sectors and staff groups (full-time equivalent)
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Overview
The NSW Ecosystem Offset Trading Group Map is a regional scale map of candidate non-threatened Ecosystem Offset Trading Groups as defined under the NSW Biodiversity Offset Scheme.
The map covers all of NSW. The mapped Offset Trading Groups are generated by a direct translation of the NSW State Vegetation Type Map (SVTM vC2.0.M2.1) according to the Biodiversity Assessment Method Table 5.
This map is updated in response to annual updates to the NSW State Vegetation Type Map or other relevant information.
Key Fields:
* 'OTG_Name' – Offset Trading Group
* 'SUB_NAME_7' – Sub IBRA Bioregion
* 'OTGSUBIBRA' – A concatenation of the Offset Trading Group and the Sub IBRA Bioregion
New version C2.0M2.1OTG1.0 (not yet available for public release - expected March 2025)
This release includes removal of areas previously mapped as "Not Offset Trading Group", defined as:
* parts of NSW currently considered ineligible for Biodiversity Stewardship Agreements,
* areas mapped as "Not classified" under the SVTM.
Limitations on Use
* The map may be used as a guide to the occurrence and distribution of NSW Ecosystem Offset Trading Groups.
* The map is derived from regional scale mapping of Plant Community Types and should not be used for local or property scale decisions. Planning or investment decisions at the property scale should be supported by ground assessment using the NSW Biodiversity Assessment Method (BAM).
* The map does not include Offset Trading Groups for threatened ecological community (TEC) Ecosystem credit types.
* The map does not describe which Offset Trading Groups are currently in demand.
Data Access
Map data may be downloaded as a zipped GeoTIFF file at the link below. The map may also be viewed on the SEED Map Viewer, or accessed via the underlying ArcGIS REST Services or WMS for integration in GIS or business applications.
Map Data Type
The map is supplied as a 5m GeoTIFF raster, and can be viewed and analysed in most commercial and open-source spatial software packages. No preferred symbology is provided.
Feedback and Support
We welcome your feedback to assist us in continuously improving our products. To help us track and process your feedback, please use the SEED Data Feedback tool available via the SEED map viewer.
Useful Related Data
* Understanding NSW Offset Trading Groups
* The NSW State Vegetation Type map
* Plant Community type to Offset Trading Group lookup tool
* The NSW Biodiversity Offset Scheme like-for-like rules
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This map provides an estimation of Hydrologic Groups of soils in NSW according to the four-class system (A-D):
* A: soils having high infiltration rates, even when thoroughly wetted and consisting chiefly of deep, well to excessively-drained sands or gravels. These soils have a high rate of water transmission and have low water run-off potential.
* B: soils having moderate infiltration rates when thoroughly wetted and consisting chiefly of moderately deep to deep, moderately fine to moderately coarse textures. These soils have a moderate rate of water transmission.
* C: soils having slow infiltration rates when thoroughly wetted and consisting chiefly of soils with a layer that impedes downward movement of water, or soils with moderately fine to fine texture. These soils have a slow rate of water transmission.
* D: soils having very slow infiltration rates when thoroughly wetted and consisting chiefly of clay soils with a high swelling potential, soils with a permanent high water table, soils with a claypan or clay layer at or near the surface, and shallow soils over nearly impervious material. These soils have a very slow rate of water transmission.
The map uses the best available soils mapping coverage and was derived from a lookup table system linking a Hydrologic Group class to a particular soil type using the Great Soil Group (GSG) classification. Each dominant GSG has been assigned a Hydrologic Soil Group.
The classification is based on the United States' Hydrologic Soil Group system published within the National Engineering Handbook (2007).
Online Maps: This dataset can be viewed using eSPADE (NSW’s soil spatial viewer), which contains a suite of soil and landscape information including soil profile data. Many of these datasets have hot-linked soil reports. An alternative viewer is the SEED Map, an ideal way to see what other natural resources datasets (e.g. vegetation) are available for this map area.
Reference: Department of Planning, Industry and Environment, 2021, Hydrologic Soil Groups of NSW, Version 4.5, NSW Department of Planning, Industry and Environment, Parramatta.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘R & D personnel by personnel groups and sectors (full-time equivalent)’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from http://data.europa.eu/88u/dataset/https-www-datenportal-bmbf-de-portal-1-7-1 on 16 January 2022.
--- Dataset description provided by original source is as follows ---
Table 1.7.1: R & D personnel by personnel groups and sectors (full-time equivalent)
--- Original source retains full ownership of the source dataset ---
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
While groups have been central to thinking about partisan identity and choices, there has been surprisingly little attention paid to the role of perceptions of the group composition of the parties. We explore this critical linking information in the context of religious groups, some of the chief pivots around which the parties have been sorting. Using three national samples, we show that perceptions of the religious group composition of the parties are often biased – evangelicals overestimate the presence of evangelicals within the Republican Party and the irreligious within the Democratic Party. The key finding is that individuals are far more likely to identify with the party in which they believe their group is well represented – a finding which clarifies the role of party image shifts in constructing partisanship, the limits of the culture war motif, and the importance of social perception in shaping beliefs about party representation.
Eximpedia export-import trade data lets you search trade data and active exporters, importers, buyers, suppliers and manufacturers from over 209 countries.
Financial overview and grant giving statistics of Triple R Sports Group
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
An Open Context "predicates" dataset item. Open Context publishes structured data as granular, URL identified Web resources. This "Variables" record is part of the "Pyla-Koutsopetria Archaeological Project I: Pedestrian Survey" data publication.
Steel Group R Llp Company Export Import Records. Follow the Eximpedia platform for HS code, importer-exporter records, and customs shipment details.