Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Ecological theories often encompass multiple levels of biological organization, such as genes, individuals, populations, and communities. Despite substantial progress toward ecological theory spanning multiple levels, ecological data rarely are connected in this way. This is unfortunate because different types of ecological data often emerge from the same underlying processes and, therefore, are naturally connected among levels. Here, we describe an approach to integrate data collected at multiple levels (e.g., individuals, populations) in a single statistical analysis. The resulting integrated models make full use of existing data and might strengthen links between statistical ecology and ecological models and theories that span multiple levels of organization. Integrated models are increasingly feasible due to recent advances in computational statistics, which allow fast calculations of multiple likelihoods that depend on complex mechanistic models. We discuss recently developed integrated models and outline a simple application using data on freshwater fishes in south-eastern Australia. Available data on freshwater fishes include population survey data, mark-recapture data, and individual growth trajectories. We use these data to estimate age-specific survival and reproduction from size-structured data, accounting for imperfect detection of individuals. Given that such parameter estimates would be infeasible without an integrated model, we argue that integrated models will strengthen ecological theory by connecting theoretical and mathematical models directly to empirical data. Although integrated models remain conceptually and computationally challenging, integrating ecological data among levels is likely to be an important step toward unifying ecology among levels.
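The idea of combining several data types in one analysis can be sketched as a joint likelihood in which different data sources share a parameter. The toy example below is not the authors' model: it assumes a single shared survival probability, a known-fate tagging data set, and a pair of population counts with no recruitment and perfect detection. All function names and numbers are hypothetical.

```python
import math

def binomial_loglik(successes, trials, prob):
    """Log-likelihood of a binomial observation."""
    log_choose = (math.lgamma(trials + 1) - math.lgamma(successes + 1)
                  - math.lgamma(trials - successes + 1))
    return (log_choose + successes * math.log(prob)
            + (trials - successes) * math.log(1.0 - prob))

def joint_loglik(survival, tagged, tagged_survivors, count_t0, count_t1):
    """Sum of two component log-likelihoods that share one survival parameter.

    Assumption (illustrative only): individual-level tagging data and
    population-level counts are independent given survival.
    """
    individual_level = binomial_loglik(tagged_survivors, tagged, survival)
    population_level = binomial_loglik(count_t1, count_t0, survival)
    return individual_level + population_level

def grid_mle(tagged, tagged_survivors, count_t0, count_t1, steps=1000):
    """Maximise the joint log-likelihood over a grid of survival values."""
    grid = [i / steps for i in range(1, steps)]
    return max(grid, key=lambda s: joint_loglik(
        s, tagged, tagged_survivors, count_t0, count_t1))

# Hypothetical data: 30 of 50 tagged fish survive; a count falls from 200 to 130.
estimate = grid_mle(tagged=50, tagged_survivors=30, count_t0=200, count_t1=130)
print(estimate)
```

Because both components are binomial in the same parameter, the joint estimate pools the two data sets; a realistic integrated model would replace each component with a mechanistic likelihood that accounts for detection.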
Ecological flow (EFlow) statistics have been designated to characterize the magnitude, frequency, and duration of extreme high- and low-flows, the timing of seasonal flows, and the consistency of the historic regime. This Child Item contains a table of 178 EFlows for the time periods 1940-1969, 1970-1999, and 2000-2018, with absolute and percent change between periods, where applicable. Statistics were computed by Water Year (WY) for all 178 metrics and absolute and percent change were calculated by comparing metrics between combinations of two of the three time periods (1940-1969 and 1970-1999; 1940-1969 and 2000-2018; 1970-1999 and 2000-2018). Streamgages from the original dataset (n = 409) were excluded from one or more time periods of analysis because of extensive data gaps that would yield incomplete EFlows; therefore, stations were indexed into the earliest possible time period relative to their installation date (for example, a streamgage with an operating start year of 1958 would be included in the analysis for the time periods 1970-1999 and 2000-2018), which resulted in different sample sizes for each period: 1940-1969 (n = 90), 1970-1999 (n = 167), and 2000-2018 (n = 243). Similarly, multiple stations were wholly excluded because of frequent discontinuities in the daily mean streamflow through all three time periods. Finally, a streamgage must have fallen within at least two time periods to have a change value. As such, not all stations are represented in the change analysis (change between 1940-1969 and 1970-1999 [n = 90]; change between 1940-1969 and 2000-2018 [n = 90]; change between 1970-1999 and 2000-2018 [n = 167]).
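The period-indexing rule and the change calculation described above can be sketched as follows. This is a minimal illustration, assuming that a streamgage contributes to a period only if it was operating from the first year of that period, and that percent change is expressed relative to the earlier period; the function names are hypothetical and data-gap screening is not modelled.

```python
PERIODS = [(1940, 1969), (1970, 1999), (2000, 2018)]

def eligible_periods(start_year):
    """Periods a streamgage can contribute to, given its first operating year.

    Assumption: a gage must be operating from the first year of a period.
    """
    return [period for period in PERIODS if start_year <= period[0]]

def change(earlier, later):
    """Absolute and percent change of a metric between two periods.

    Percent change is relative to the earlier period and is None when the
    earlier value is zero (change is then not applicable).
    """
    absolute = later - earlier
    percent = None if earlier == 0 else 100.0 * absolute / earlier
    return absolute, percent

print(eligible_periods(1958))   # gage installed in 1958
print(change(10.0, 12.5))       # hypothetical metric values
```

A gage needs at least two eligible periods before any change value can be computed, which is why the change analysis has fewer stations than the per-period tables.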
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
Statistical decision theory (SDT) is a sub-field of decision theory that formally incorporates statistical investigation into a decision-theoretic framework to account for uncertainties in a decision problem. SDT provides a unifying analysis of three types of information: statistical results from a data set, knowledge of the consequences of potential choices (i.e., loss), and prior beliefs about a system. SDT links the theoretical development of a large body of statistical methods including point estimation, hypothesis testing, and confidence interval estimation. The theory and application of SDT have mainly been developed and published in the fields of mathematics, statistics, operations research, and other decision sciences, but have had limited exposure in ecology. Thus, we provide an introduction to SDT for ecologists and describe its utility for linking the conventionally separate tasks of statistical investigation and decision making in a single framework. We describe the basic framework of both Bayesian and frequentist SDT, its traditional use in statistics, and discuss its application to decision problems that occur in ecology. We demonstrate SDT with two types of decisions: Bayesian point estimation, and an applied management problem of selecting a prescribed fire rotation for managing a grassland bird species. Central to SDT, and decision theory in general, are loss functions. Thus, we also provide basic guidance and references for constructing loss functions for an SDT problem.
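As a small illustration of how a loss function turns a posterior distribution into a decision, the sketch below picks the point estimate that minimises posterior expected loss. The discrete posterior, the candidate actions, and the asymmetric loss are all hypothetical and are not taken from the paper.

```python
def expected_loss(action, support, probs, loss):
    """Posterior expected loss of one action over a discrete posterior."""
    return sum(p * loss(theta, action) for theta, p in zip(support, probs))

def bayes_action(actions, support, probs, loss):
    """Action with the smallest posterior expected loss."""
    return min(actions, key=lambda a: expected_loss(a, support, probs, loss))

# Hypothetical discrete posterior for a survival probability.
support = [0.1, 0.2, 0.3, 0.4, 0.5]
probs = [0.1, 0.2, 0.4, 0.2, 0.1]

def squared_loss(theta, action):
    return (theta - action) ** 2

def asymmetric_loss(theta, action):
    # Assumption: underestimating theta is five times as costly as overestimating.
    if theta > action:
        return 5.0 * (theta - action)
    return action - theta

print(bayes_action(support, support, probs, squared_loss))
print(bayes_action(support, support, probs, asymmetric_loss))
```

Under squared loss the chosen estimate coincides with the posterior mean of this symmetric posterior, while the asymmetric loss shifts the estimate upward. The same posterior can therefore support different decisions depending on the consequences encoded in the loss.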
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
All statistical analyses were performed in RStudio.
In 1991, the U.S. Geological Survey (USGS) began a study of more than 50 major river basins across the Nation as part of the National Water-Quality Assessment (NAWQA) project of the National Water-Quality Program. One of the major goals of the NAWQA project is to determine how water-quality and ecological conditions change over time. To support that goal, long-term consistent and comparable ecological monitoring has been conducted on streams and rivers throughout the Nation. Fish, invertebrate, and diatom data collected as part of the NAWQA program were retrieved from the USGS Aquatic Bioassessment database for use in trend analysis. Ultimately, these data will provide insight into how natural features and human activities have contributed to changes in ecological condition over time in the Nation’s streams and rivers. This USGS data release contains all of the input and output files necessary to reproduce the results of the ecological trend analysis described in the associated U.S. Geological Survey Scientific Investigations Report. Data preparation for input to the model is also fully described in the above mentioned report.
Biomass data for Arabidopsis thaliana populations. Each row refers to a pot in the pot experiment and shows: the population code; treatment (competition, drought, their combination, or control); diversity, i.e. the number of different TE lines that a population was composed of; number of individuals at time of biomass harvest; total pot biomass; average individual biomass (pot biomass/number of individuals); weight of competitors in the pot (0 if no competitors were present).
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
In the last decade, a plethora of algorithms have been developed for spatial ecology studies. We applied some of them to underwater research in applied ecology, analysing threatened endemic fishes and their natural habitat. For this, we developed scripts in the RStudio® environment to run spatial and statistical analyses for ecological response and spatial distribution models (e.g., Hijmans & Elith, 2017; Den Burg et al., 2020). The R packages employed are as follows: caret (Kuhn et al., 2020), corrplot (Wei & Simko, 2017), devtools (Wickham, 2015), dismo (Hijmans & Elith, 2017), gbm (Freund & Schapire, 1997; Friedman, 2002), ggplot2 (Wickham et al., 2019), lattice (Sarkar, 2008; Musa & Mansor, 2021), maptools (Hijmans & Elith, 2017), modelmetrics (Hvitfeldt & Silge, 2021), pander (Wickham, 2015), plyr (Wickham & Wickham, 2015), pROC (Robin et al., 2011), raster (Hijmans & Elith, 2017), RColorBrewer (Neuwirth, 2014), Rcpp (Eddelbuettel & Balamuta, 2018), rgdal (Verzani, 2011), sdm (Naimi & Araujo, 2016), sf (e.g., Zainuddin, 2023), sp (Pebesma, 2020) and usethis (Gladstone, 2022).
All scripts must be run in order to obtain results from the ecological response and spatial distribution models. For the ecological scenario, we selected the Generalized Linear Model (GLM), and for the geographic scenario we selected DOMAIN, also known as Gower's metric (Carpenter et al., 1993). We selected this regression method and this distance similarity metric because of their adequacy and robustness for studies with endemic or threatened species (e.g., Naoki et al., 2006). Next, we explain the statistical parameterization of the scripts used to run GLM and DOMAIN:
First, we generated the background points and extracted the values of the variables (Code2_Extract_values_DWp_SC.R). Barbet-Massin et al. (2012) recommend using 10,000 background points with regression methods (e.g., Generalized Linear Model) or distance-based models (e.g., DOMAIN). However, we also took into account factors such as the extent of the area and the type of study species when selecting the number of points (Pers. Obs.). Then, we extracted the values of the predictor variables (e.g., bioclimatic, topographic, demographic, habitat) at the presence and background points (e.g., Hijmans and Elith, 2017).
Subsequently, we subdivided both the presence and the background point groups into 75% training data and 25% test data each, following the method of Soberón & Nakamura (2009) and Hijmans & Elith (2017). For training control, we selected 10-fold cross-validation, with the response variable (presence) assigned as a factor. Any other variable that is important for the study species should also be assigned as a factor (Kim, 2009).
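The 75%/25% partition and the 10-fold assignment can be sketched in a few lines. The snippet below is a language-neutral illustration in Python rather than the R scripts named in the text; the function names, the seed, and the toy record lists are hypothetical.

```python
import random

def train_test_split(records, train_fraction=0.75, seed=42):
    """Shuffle records and split them into training and test subsets."""
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

def k_fold_indices(n, k=10, seed=42):
    """Assign n training records to k folds of near-equal size."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

# Hypothetical record identifiers; each group is split separately.
presence = list(range(40))
background = list(range(1000, 1400))

presence_train, presence_test = train_test_split(presence)
background_train, background_test = train_test_split(background)
folds = k_fold_indices(len(presence_train), k=10)

print(len(presence_train), len(presence_test))
print(len(background_train), len(background_test))
print([len(fold) for fold in folds])
```

Splitting presence and background points separately keeps the same 75%/25% proportion in each group, so the test set retains both classes.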
After that, we ran the code for the GBM method (Gradient Boosting Machine; Code3_GBM_Relative_contribution.R and Code4_Relative_contribution.R) to obtain the relative contribution of the variables used in the model. We parameterized the code with a Gaussian distribution and 5,000 iterations (e.g., Friedman, 2002; Kim, 2009; Hijmans and Elith, 2017). In addition, we selected a validation interval of 4 random training points (Personal test). The resulting plots show the partial dependence for each predictor variable.
Subsequently, Pearson's correlation between variables was computed (Code5_Pearson_Correlation.R) to evaluate multicollinearity (Guisan & Hofer, 2003). A bivariate correlation threshold of ±0.70 is recommended for discarding highly correlated variables (e.g., Awan et al., 2021).
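The multicollinearity screen can be illustrated with a short sketch that computes Pearson's r for every pair of predictors and flags pairs at or beyond the 0.70 threshold in absolute value. The variable names and values are hypothetical, and treating the threshold as inclusive is an assumption.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def correlated_pairs(variables, threshold=0.70):
    """Return (name_a, name_b, r) for pairs with |r| >= threshold."""
    flagged = []
    for (name_a, x), (name_b, y) in combinations(variables.items(), 2):
        r = pearson(x, y)
        if abs(r) >= threshold:
            flagged.append((name_a, name_b, round(r, 3)))
    return flagged

# Hypothetical predictor values extracted at five points.
variables = {
    "depth": [1, 2, 3, 4, 5],
    "temperature": [10, 8, 6, 4, 2],
    "slope": [2, 1, 2, 1, 2],
}
print(correlated_pairs(variables))
```

For each flagged pair, one of the two variables would then be dropped; which one to keep is an ecological judgement that the sketch does not attempt.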
Once the above scripts had been run, we loaded the same subgroups (i.e., presence and background groups with 75% training and 25% testing) (Code6_Presence&backgrounds.R) for the GLM script (Code7_GLM_model.R). Here, we first ran a GLM for each variable to obtain its p-value (significant at alpha ≤ 0.05); we selected the value one (i.e., presence) as the likelihood factor. The models are polynomial, so that they capture linear and quadratic responses (e.g., Fielding and Bell, 1997; Allouche et al., 2006). From these results, we ran ecological response curve models, where the resulting plots show the probability of occurrence against values for continuous variables or categories for discrete variables. The points of the presence and background training groups are also included.
In addition, a global GLM was run and evaluated by means of a 2 x 2 contingency matrix of observed and predicted records. A representation of this is shown in Table 1 (adapted from Allouche et al., 2006). In this process we selected an arbitrary threshold of 0.5 to obtain better modeling performance and avoid a high rate of type I (commission) or type II (omission) errors (e.g., Carpenter et al., 1993; Fielding and Bell, 1997; Allouche et al., 2006; Kim, 2009; Hijmans and Elith, 2017).
Table 1. Example of 2 x 2 contingency matrix for calculating performance metrics for GLM models. A represents true presence records (true positives), B represents false presence records (false positives - error of commission), C represents true background points (true negatives) and D represents false backgrounds (false negatives - errors of omission).
              Validation set
Model         True    False
Presence      A       B
Background    C       D
We then calculated the overall accuracy and the True Skill Statistic (TSS). The first assesses the proportion of correctly predicted cases, while the second assesses the prevalence of correctly predicted cases (Olden and Jackson, 2002). TSS also gives equal importance to the prevalence of presence predictions and to the correction for random performance (Fielding and Bell, 1997; Allouche et al., 2006).
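Using the cell labels of Table 1 (A true positives, B false positives, C true negatives, D false negatives), both metrics can be computed directly from the contingency matrix. The sketch below uses the standard definitions, overall accuracy = (A + C) / N and TSS = sensitivity + specificity - 1, together with the 0.5 threshold mentioned above; the predicted probabilities and observations are hypothetical.

```python
def confusion_counts(probabilities, observed, threshold=0.5):
    """Count A, B, C, D (Table 1 labels) after thresholding predictions."""
    a = b = c = d = 0
    for prob, obs in zip(probabilities, observed):
        predicted_presence = prob >= threshold
        if predicted_presence and obs == 1:
            a += 1  # true positive
        elif predicted_presence and obs == 0:
            b += 1  # false positive (commission)
        elif not predicted_presence and obs == 0:
            c += 1  # true negative
        else:
            d += 1  # false negative (omission)
    return a, b, c, d

def overall_accuracy(a, b, c, d):
    """Proportion of correctly predicted cases."""
    return (a + c) / (a + b + c + d)

def true_skill_statistic(a, b, c, d):
    """Sensitivity plus specificity minus one."""
    sensitivity = a / (a + d)
    specificity = c / (c + b)
    return sensitivity + specificity - 1.0

# Hypothetical GLM predictions (1 = presence, 0 = background).
probabilities = [0.9, 0.8, 0.3, 0.6, 0.2, 0.1]
observed = [1, 1, 1, 0, 0, 0]
counts = confusion_counts(probabilities, observed)
print(counts, overall_accuracy(*counts), true_skill_statistic(*counts))
```

Note that Table 1 assigns C to true negatives and D to false negatives, which differs from the lettering in some published versions of this matrix, so the formulas should always be checked against the table in use.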
The last script (i.e., Code8_DOMAIN_SuitHab_model.R) performs species distribution modelling using the DOMAIN algorithm (Carpenter et al., 1993). Here, we loaded the variable stack and the presence and background groups, each subdivided into 75% training and 25% test data. We only included the presence training subset and the predictor variables stack in the calculation of the DOMAIN metric, as well as in the evaluation and validation of the model.
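The DOMAIN metric can be sketched as follows: the Gower distance between a candidate site and a presence point is the mean of the range-standardised absolute differences across predictors, similarity is one minus that distance, and the score of the site is its maximum similarity to any presence point. This is a minimal illustration, assuming the ranges are taken from the presence training points and applying no truncation of values outside those ranges; the data and function names are hypothetical and this is not the code in Code8_DOMAIN_SuitHab_model.R.

```python
def variable_ranges(presences):
    """Range (max - min) of each predictor across the presence points."""
    columns = list(zip(*presences))
    return [max(col) - min(col) for col in columns]

def gower_similarity(site, presence, ranges):
    """One minus the mean range-standardised absolute difference."""
    distance = sum(abs(s - p) / r
                   for s, p, r in zip(site, presence, ranges)) / len(site)
    return 1.0 - distance

def domain_score(site, presences, ranges):
    """Maximum Gower similarity between a site and any presence point."""
    return max(gower_similarity(site, p, ranges) for p in presences)

# Hypothetical presence training points: (temperature, depth).
presences = [(10.0, 100.0), (20.0, 300.0)]
ranges = variable_ranges(presences)

print(domain_score((10.0, 100.0), presences, ranges))
print(domain_score((15.0, 200.0), presences, ranges))
```

Because only presence points enter the calculation, the background points play no role in fitting and are used only later for evaluation, which matches the description above.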
Regarding the model evaluation and estimation, we selected the following estimators:
1) partial ROC, which evaluates the separation between the curves of positive (i.e., correctly predicted presence) and negative (i.e., correctly predicted absence) cases. The farther apart these curves are, the better the model predicts the correct spatial distribution of the species (Manzanilla-Quiñones, 2020).
2) ROC/AUC curve for model validation, where an optimal performance threshold is estimated to have an expected confidence of 75% to 99% probability (DeLong et al., 1988).
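For orientation, the full-curve AUC can be computed without plotting, as the probability that a randomly chosen presence point receives a higher score than a randomly chosen background point (ties count one half). The sketch below implements this standard rank-based AUC only; it is not the partial ROC procedure and does not estimate thresholds or confidence intervals. The scores are hypothetical.

```python
def auc(presence_scores, background_scores):
    """Rank-based AUC: share of presence/background pairs ranked correctly."""
    wins = 0.0
    for p in presence_scores:
        for b in background_scores:
            if p > b:
                wins += 1.0
            elif p == b:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(presence_scores) * len(background_scores))

# Hypothetical suitability scores for test presence and background points.
presence_scores = [0.9, 0.8, 0.4]
background_scores = [0.5, 0.3, 0.2]
print(auc(presence_scores, background_scores))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of presence from background scores.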
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
This dataset contains adult mosquito surveys and sampling data by species and date, as well as anonymized neighborhood and environmental sensor data for research conducted in neighborhoods in Baltimore, Maryland, USA (2012-2017).
This dataset includes the following files:
Metadata_Updated_for_Public_Archive.xlsx. Descriptive metadata for all sheets including redacted (confidential) data files.
Mastercontainer_Public_Archive.xlsx. These are data from container surveys conducted between 2012 and 2016.
Masterkapsurvey_Public_Archive_v20220928.xlsx. Anonymized surveys of residents in sample neighborhood blocks in years 2012, 2013, 2014, and 2016.
MASTERadult.xlsx. Counts and identification of (primarily) female adult mosquitoes that were actively trapped over years and block clusters.
MASTER_Bloodfed.xlsx. Includes identification of blood meal source of 2015 and 2016 subsamples of adult female mosquitoes.
Master_iButton.xlsx. Raw relative humidity and temperature data recorded by iButton during 2015-2017 as well as year-long summaries by sensor for 2016 and 2017.
The Cary Institute of Ecosystem Studies furnishes data under the following conditions: The data have received quality assurance scrutiny, and, although we are confident of the accuracy of these data, Cary Institute will not be held liable for errors in these data. Data are subject to change resulting from updates in data screening or models used. To cite these data, click on the Cite button on this page. Metadata associated with the confidential data from this study may be found here.
Questions about these data should be directed to Dr. Shannon LaDeau, ladeaus@caryinstitute.org or Cary Data Management, datamanagement@caryinstitute.org.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
An analysis of the tweets from the ESA2016 meeting. Used the R package 'tm'. All code is on GitHub: https://cjlortie.github.io/esa2016.tweets/
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
This record represents near real time River Ecology Monitoring Results. National surveys of Irish rivers have taken place on a continuous basis since 1971, when 2,900 km of river channel was surveyed. The National Rivers Monitoring Programme was replaced by the Water Framework Monitoring Programme from 22 December 2006. As part of the Water Framework Directive (WFD) Monitoring Programme approximately one third of our major rivers and their more important tributaries are surveyed and assessed each year by EPA ecologists. A complete survey cycle is completed every three years. The sites are scored on a five point system developed by the EPA called the Biological Q rating system.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
These data include the abundance, emergence, and persistence of puncturevine (Tribulus terrestris), a harmful invasive species in Western North America. We mapped the demography and distribution of this plant in Boise, ID, United States in summer 2020. These data include both the mapped plots and puncturevine points as well as .csv files with spatial covariates related to puncturevine outbreaks.
Environmental Radiation Data (ERD) is an electronic and print journal compiled and distributed quarterly by the Office of Radiation and Indoor Air's National Air and Radiation Environmental Laboratory (NAREL) in Montgomery, Alabama. It contains data from RadNet (previously known as ERAMS).
Public Domain Mark 1.0 https://creativecommons.org/publicdomain/mark/1.0/
Bio-ORACLE is a set of GIS rasters providing geophysical, biotic and environmental data for surface and benthic marine realms. The data are available for global-scale applications at a spatial resolution of 5 arcmin (approximately 9.2 km at the equator).
Linking biodiversity occurrence data to the physical and biotic environment provides a framework to formulate hypotheses about the ecological processes governing spatial and temporal patterns in biodiversity, which can be useful for marine ecosystem management and conservation.
Bio-ORACLE offers a user-friendly solution to accomplish this task by providing 18 global geophysical, biotic and climate layers at a common spatial resolution (5 arcmin) and a uniform landmask.
The data available in Bio-ORACLE are documented in two peer-reviewed articles that you should cite: Tyberghein, L., Verbruggen, H., Pauly, K., Troupin, C., Mineur, F., & De Clerck, O. (2012). Bio-ORACLE: A global environmental dataset for marine species distribution modelling. Global Ecology and Biogeography, 21, 272–281. Assis, J., Tyberghein, L., Bosch, S., Verbruggen, H., Serrão, E. A., & De Clerck, O. (2017). Bio-ORACLE v2.0: Extending marine data layers for bioclimatic modelling. Global Ecology and Biogeography.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
Animals make feeding decisions to simultaneously maximise fitness traits that often require different nutrients. Recent quantitative methods have been developed to characterise these nutritional trade-offs from performance landscapes, on which traits are mapped onto a nutrient space defined by two nutrients. This restriction to two nutrients limits the application of previous methods to more complex data, and a generalised framework is needed. Here, we build upon previous methods and introduce a generalised vector-based approach, the Vector of Position approach, to study nutritional trade-offs in complex multi-dimensional spaces. The Vector of Position approach allows performance variations to be estimated across entire landscapes (peaks and valleys) and compared between animals. Using landmark published datasets on lifespan and reproduction landscapes, we illustrate how our approach gives accurate quantifications of nutritional trade-offs in two- and three-dimensional spaces, and can bring new insights into the underlying nutritional differences in trait expression between species. The Vector of Position approach provides a generalised framework for investigating nutritional differences in life-history trait expression within and between species, an essential step for the development of comparative research on the evolution of animal nutritional strategies.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
Data from underwater video cameras and underwater visual censuses, used to estimate true fish densities while accounting for the effect of habitat characteristics on individual detectability. Also included is a simulation demonstrating (1) how to calibrate the cameras to account for the effect of an "external" continuous variable on detectability and (2) how to apply such a camera calibration to estimate fish density at new sites.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
Understanding how nutrients flow through food webs is central in ecosystem ecology. Tracer addition experiments are powerful tools to reconstruct nutrient flows by adding an isotopically enriched element into an ecosystem, and tracking its fate through time. Historically, the design and analysis of tracer studies have varied widely, ranging from descriptive studies to modeling approaches of varying complexity. Increasingly, isotope tracer data are being used to compare ecosystems and analyze experimental manipulations. Currently, a formal statistical framework for analyzing such experiments is lacking, making it impossible to calculate the estimation errors associated with the model fit, the interdependence of compartments, or the uncertainty in the diet of consumers. In this paper we develop a method based on Bayesian Hidden Markov Models, and apply it to the analysis of $^{15}$N-NH$_4^+$ tracer additions in two Trinidadian streams in which light was experimentally manipulated. Through this case study, we illustrate how to estimate N fluxes between ecosystem compartments, turnover rates of N within those compartments and the associated uncertainty. We also show how the method can be used to compare alternative models of food web structure, calculate the error around derived parameters, and make statistical comparisons between sites or treatments.
https://eidc.ceh.ac.uk/licences/OGL/plain
http://inspire.ec.europa.eu/metadata-codelist/LimitationsOnPublicAccess/noLimitations
This dataset comprises linear and areal habitat information and vegetation species recorded during an ecological survey of stratified random 1km square sites across England in 1992 and 1993. The survey was carried out by the Institute of Terrestrial Ecology (which later became part of the Centre for Ecology & Hydrology (CEH)) and was commissioned in order to carry out survey work into habitats which were perceived to be under threat or which represented areas of concern to the Department for the Environment (named as 'Key Habitats'). The habitats (or landscape types) in question were: Lowland heaths, chalk and limestone grasslands, coasts and uplands. The survey was designed to complement CEH's national ecological survey, the 'Countryside Survey'. Full details about this dataset can be found at https://doi.org/10.5285/7aefe6aa-0760-4b6d-9473-fad8b960abd4
U.S. Government Works https://www.usa.gov/government-works
In 1991, the U.S. Geological Survey (USGS) began a study of more than 50 major river basins across the Nation as part of the National Water-Quality Assessment (NAWQA) project of the National Water Quality Program. One of the major goals of the NAWQA project is to determine how water quality and ecological conditions change over time. To support that goal, long-term consistent and comparable ecological monitoring has been conducted on streams and rivers throughout the Nation. Fish, invertebrate, and algae data collected as part of the NAWQA program were retrieved from the USGS Aquatic Bioassessment database for use in trend analysis. Ultimately, these data will provide insight into how natural features and human activities have contributed to changes in ecological condition over time in the Nation’s streams and rivers. This USGS data release contains all of the input and output files necessary to reproduce the results of the ecological trend analysis described in the associated U.S ...
Financial overview and grant giving statistics of Omni Center for Peace Justice and Ecology
Open Government Licence - Canada 2.0 https://open.canada.ca/en/open-government-licence-canada
The ECOSYS (Ecological Information System) database is a provincial database that stores over 26,000 vegetation and soil plots described in the province of Alberta. This information is used in the development of management tools (plant community guides, ecosite guides, natural subregion maps, range health tools etc.) to ensure Alberta’s public lands are being managed sustainably. ECOSYS also summarizes the raw plot information into Ecosite Guides for each subregion in the province.