Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Temporally ordered multi-neuron patterns likely encode information in the brain. We introduce an unsupervised method, SPOTDisClust (Spike Pattern Optimal Transport Dissimilarity Clustering), for their detection from high-dimensional neural ensembles. SPOTDisClust measures similarity between two ensemble spike patterns by determining the minimum transport cost of transforming their corresponding normalized cross-correlation matrices into each other (SPOTDis). Then, it performs density-based clustering based on the resulting inter-pattern dissimilarity matrix. SPOTDisClust does not require binning and can detect complex patterns (beyond sequential activation) even when high levels of out-of-pattern “noise” spiking are present. Our method efficiently handles the additional information from increasingly large neuronal ensembles and can detect a number of patterns that far exceeds the number of recorded neurons. In an application to neural ensemble data from macaque monkey V1 cortex, SPOTDisClust can identify different moving stimulus directions on the sole basis of temporal spiking patterns.
https://spdx.org/licenses/CC0-1.0.html
Aim: Fish assemblages, whether defined by taxonomy or functional traits, respond to regional and local habitat variation. We sampled rivers of Mongolia and the western United States (US) to determine the scale at which habitat could predict variation in fish assemblages classified by taxonomy or functional traits. Our hypothesis was that fish assemblages could be predicted using valley-scale hydrogeomorphology and reach-scale hydrology. We further predicted that if valley-scale variables explained high variation in fish assemblages, then reach-scale variables would explain additional dimensions.
Location: Mongolia, United States
Methods: We evaluated reach- and valley-scale hydrogeomorphology of rivers in the US and Mongolia in each of three ecoregions: grassland, forest, and endorheic. Fishes were collected using a backpack electrofisher following standard protocols.
Results: Ordinations resulted in distinct assemblage patterns that corresponded with habitat variables at both valley- and reach-scales. Hydrogeomorphology differed for Mongolia and US rivers and likely contributed to different patterns that explained fish assemblage variation classified by taxonomy vs. traits. Ecoregions differed in factors contributing to fish assemblage patterns, likely a result of differences in hydrogeomorphology and historical influences, as well as effects of introduced species in the US.
Main Conclusions: We found that fish assemblages were structured by hydrogeomorphic processes occurring at valley- and reach-scales, and that variables predicting fish assemblages vary with scale, ecoregion, and continent. We found a common pattern: where valley-scale variables explained much of the variation in fish assemblages, reach-scale variables frequently explained more ordination dimensions than valley-scale variables.
This implies that reach-scale hydrology variables are always strong predictors of fish assemblage variation, and valley-scale geomorphology variables are sometimes strong predictors. We found evidence that introduced species or anthropogenic impacts modified our analyses predicting fish assemblage variation of Mongolia and US mountain steppe rivers. Although anthropogenic impacts were substantially higher for western US rivers than for Mongolia rivers, we were unable to detect strong differences in our ability to predict fish assemblage variation from reach- and valley-scale habitat variables.
Methods
2.1 Study area and valley-scale habitat assessment
We identified rivers in the US and Mongolia in three ecoregions: grassland (G), forest-steppe (F), and endorheic (E; Figures 1, 2) (Olson et al., 2001). Unique hydrogeomorphic patches were delineated into Functional Process Zones (FPZs; Thorp et al., 2006; 2008) using the GIS-based program RESonate (Williams et al., 2013) to extract valley-scale hydrogeomorphic and environmental variables from existing geospatial data. Maasri et al. (2019; 2021a) described details for data extraction using RESonate. We used the ten most influential variables for valley-scale hydrogeomorphology to delineate FPZs. These variables, which were extracted at 10 km stream intervals because of the size of these rivers, included elevation, mean annual precipitation, geology, valley width, valley floor (floodplain) width, valley width-to-valley width ratio, river channel sinuosity, right valley slope, left valley slope, and down valley slope. Data were normalized to a 0 to 1 scale for each river network, and a dissimilarity matrix was generated using a Gower dissimilarity transformation (Gower, 1971). A Gower transformation is recommended for non-biological data with range-standardization (Thoms & Parsons, 2003).
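The range-standardization and Gower step can be illustrated with a minimal sketch. It assumes numeric variables only (a categorical variable such as geology would need a simple-matching term instead), and the function names and segment values are hypothetical, not taken from RESonate or from the study data.

```python
import numpy as np

def range_standardize(x):
    """Rescale each column to the 0-1 range (constant columns map to 0)."""
    x = np.asarray(x, dtype=float)
    col_min = x.min(axis=0)
    span = x.max(axis=0) - col_min
    span[span == 0] = 1.0  # avoid division by zero for constant columns
    return (x - col_min) / span

def gower_dissimilarity(x):
    """Gower dissimilarity for numeric variables: the mean absolute
    difference between range-standardized values of two rows."""
    z = range_standardize(x)
    diff = np.abs(z[:, None, :] - z[None, :, :])
    return diff.mean(axis=2)

# Hypothetical valley segments: elevation (m), valley width (m), sinuosity
segments = np.array([[1200.0, 300.0, 1.1],
                     [1500.0, 900.0, 1.6],
                     [1800.0, 600.0, 2.1]])
d = gower_dissimilarity(segments)
print(np.round(d, 3))
```

The resulting square matrix is symmetric with zeros on the diagonal, which is the form a hierarchical clustering routine expects after conversion to its input format.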
The dissimilarity matrix was used in a hierarchical clustering following the Ward linkage method, as it resulted in the best partitioning of clusters (Murtagh & Legendre, 2014). We then used a Principal Components Analysis (PCA) to identify important contributive variables for group partitioning, and to describe cluster groups based on the ten variables described above. Cluster groups were later mapped to allow identification of sampling sites. We performed the clustering of FPZ groups using the cluster package (version 2.1.0) (Maechler et al., 2018) and the PCA using the FactoMineR package (version 1.42) (Lê et al., 2008) in R version 3.6.3 (R Core Team, 2020). We mapped the resulting groups using ArcGIS (version 10.5). We examined gradients in hydrogeomorphology with PCA using the ten influential valley-scale variables (above) in Minitab 18.1 (minitab.com) for all sites and by continent and ecoregion.
2.2 Reach-scale habitat assessment
Each selected site was sampled following the Physical Habitat protocols from Environmental Monitoring and Assessment Program section 7 (Lazorchak et al., 1998) to provide a characterization of hydrogeomorphology at the reach-scale. Field measurements were summarized into seven metric sections (channel geometry, bank geometry, substrate, fish cover, human influence, riparian cover, flow) representing the habitat and dominant processes in the reach (Kaufmann et al., 1999). Sampling was conducted over a total reach length of 40 times the average wetted width, except where total reach length would have exceeded 5 km, where length was halved. Transects were taken at 0.1 intervals of the total reach length, while half transects were taken at 0.05 of the total reach length. Visual estimates of riparian cover were recorded as the amount and type of cover provided in a 10 m by 10 m area on the left and right banks centered on each transect.
Visual estimates of the amount and type of fish cover were recorded for an area extending 5 m upstream and 5 m downstream, in and over the water, at each transect. Human influence data were collected using a “presence metric” that also indicated closeness to the river at a given transect (P: present > 10 m away; C: present within 10 m; B: present on the bank; 0: not present). Channel geometry data included five depth measurements across each transect, and wetted width at each transect and half transect. Bank geometry data were collected at each transect on both banks and included top-of-bank elevations and distances, bankfull elevations and distances, and bank angles. Substrate data were collected at the same spot as depth at transects, as well as at half transects. Additional reach-scale data were collected remotely in ArcGIS using digital elevation models and aerial photography to extract slope and sinuosity. In total, 120 characteristics and metrics were collected for each sampled site (Appendix 1). These variables were reduced by selecting only characteristics that were aggregates of multiple similar characteristics (e.g., PCT_FAST sums the percentages of falls, cascades, rapids, and riffles). FPZ segment data were linked to sampled reaches through a spatial join in GIS. The spatial join was conducted on the most downstream GPS point of a sampled site, joining it one-to-one with the closest FPZ segment within a search radius so that a single FPZ line was paired with a single sampled site. We manually confirmed that the spatial join gave each sampled site an associated FPZ segment. In cases where reaches did not pair with an FPZ segment, manual connections were made by identifying the closest FPZ segment downstream. The FPZ dataset, originally representing over 4300 valley segments across FPZs (Costello et al., in review), was reduced to 95 segments representing the FPZs that were sampled.
The most downstream point of a sampled reach was used to delineate a contributing watershed area boundary. Using the watershed boundary, data were extracted from DEMs (SRTM 30-m Mongolia, 10-m US), land cover (IGBP Land Cover Classification, Mongolia; National Land Cover Dataset, US), and climate (WorldClim, 30 arc-second). Land cover characteristics were combined to provide consistent classifications between the United States and Mongolia. Land cover characteristics were divided by the contributing watershed area to allow for relative comparison across differing watershed sizes. A total of 29 characteristics were collected for each of the 96 contributing watersheds (Appendix 1).
2.3 Fish collections and traits
We collected fishes from 94 sites that were identified as described above. Site distances for fish collections were 20 times the mean wetted width. We collected fishes by single-pass backpack electrofishing supplemented with angling (Ball State University IACUC #126193) following the American Fisheries Society standard collection protocols (Bonar et al., 2009). Fish abundances across sampled areas were standardized as catch per unit effort (CPUE), expressed as fish per m. Fishes were collected during one-month expeditions in each of the river networks during summer or fall seasons from 2017-2019 (Maasri et al., 2021b). Species identifications and ecological and biological traits were taken from Mendsaikhan et al. (2017) for Mongolia, and from state fish guides for the US. Reproductive traits were reduced to four categories based on Balon (1975): nonguarder open substratum, nonguarder brood hiders, guarders, and viviparous.
2.4 Analyses
Valley-scale geomorphology data and reach-scale hydrology data were reduced separately into fewer variables with minimal collinearity using Principal Components Analysis (PCA) in Minitab version 18. We used three PCA axes for the valley-scale data and five PCA axes for the reach-scale data.
We then evaluated fish assemblage responses to valley-scale geomorphology and reach-scale hydrology variables using constrained ordinations with forward selection of environmental variables in CANOCO 5 software (canoco5.com). CANOCO evaluates length of the first ordination axis and recommends either a linear method (Redundancy Analysis, RDA) or a nonlinear method (Canonical Correspondence Analysis, CCA). RDA is a direct gradient technique for multifactorial analysis-of-variance models using ecologically relevant distance measures and
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
ABSTRACT Precision agriculture can improve the decision-making process in agricultural production, as it gathers, processes and analyzes spatial data, allowing, for example, specific fertilizer application in each location. One of the proposals to deal with spatial heterogeneity of the soil or the distribution of chemical properties is to define application zones (homogeneous subareas). These zones make it possible to reduce both the spatial variability of the yield of the crop under study and the environmental impacts. Considering the soil data, application zones can also represent strata or indicators to direct future soil sampling, for example to reduce the sample size. This study aimed to obtain an optimized sampling redesign using application zones generated from the assessment of five clustering methods (Fuzzy C-means, Fanny, K-means, McQuitty and Ward). Soil samples were collected in an agricultural area located in the city of Cascavel-Paraná-Brazil, and analyzed in the laboratory to determine the soil chemical properties, referring to four soybean harvest years (2013-2014, 2014-2015, 2015-2016 and 2016-2017). The application zones were obtained through a dissimilarity matrix that aggregates information about the Euclidean distance between the sample elements and the spatial dependence structure of the properties. Subsequently, an optimized sampling redesign, with reduction of the initial sample points, was obtained in these application zones. For the harvest years under study, the K-means and Ward clustering methods efficiently defined the application zones, dividing the study area into two or three application zones. Among the reduced sample configurations obtained by the optimization process, the one reduced by 25 % (retaining 75 % of the initial configuration points, which corresponds to 76 sample points) was the most effective in terms of the accuracy indices (overall accuracy, Kappa, Tau) when compared with the initial sample configuration.
This fact indicates greater similarity between the thematic maps of these sample configurations. In this way, the reduced sample configurations could be used to generate the application zones and reduce the costs regarding the laboratory analyses involved in the study.
https://spdx.org/licenses/CC0-1.0.html
Aim: Existing phytogeographic frameworks for tropical Africa lack either spatial completeness, unit definitions smaller than the regional scale, or a quantitative approach. We investigate whether physical environmental variables can be used to interpolate floristically defined vegetation units, presenting an interpolated, hierarchical, quantitative phytogeographic framework for tropical Africa, which is compared to previously defined regions.
Location: Tropical mainland Africa 24°N to 24°S.
Taxon: 31,046 vascular plant species and infraspecific taxa.
Methods: We calculate a betasim dissimilarity matrix from a comprehensive whole-flora database of plant species distributions. We investigate environmental correlates of floristic turnover with local non-metric multidimensional scaling. We derive a hierarchical biogeographic framework by clustering the dissimilarity matrix. The framework is modelled using a classification decision tree method and 12 physical environmental variables to interpolate and downscale the framework across the study region.
Results: Floristic turnover is related strongly to water availability and temperature, with smaller contributions from land cover, topographic ruggedness and lithology. Region can be predicted with 90% accuracy by the model. We define 19 regions and 99 districts. We find a novel arrangement of the arid regions. Regional subdivision within the savanna biome is supported with minor variation to borders. Within the forests of west and central Africa, our whole-flora gridded regionalisation supports the divisions identified by a previous analysis of trees only.
Main conclusions: Physical environmental variables can be used to predict floristically defined vegetation units with very high accuracy, and the approach could be pursued for other incompletely sampled taxa and areas outside of tropical Africa. Geographic coherence is higher than in previous quantitative phytoregional definitions. For most tropical African vascular plant species, we provide predictions of which species will occur within each mapped district and region of tropical Africa. The framework should be useful for future studies in ecology, evolution and conservation.
Methods
Plant species records were summarised uniquely at degree square resolution to produce 533,383 records of 31,046 tropical African species and infraspecific taxa in 1,197 degree squares of tropical mainland Africa between 24°N and 24°S. Contributing datasets are cited in the dataset ReadMe and Appendix S1 of the manuscript. Larger datasets with DOI links have been included as cited works with the Dryad submission. Data cleaning, georeferencing and synonymy of the compiled data set are described in the dataset ReadMe and Appendix S1 of the associated manuscript.
Environmental data were summarised at one degree square and half degree square resolution. We summarised: mean altitude from GMTED2010 at 30 arc-second resolution (Danielson & Gesch, 2011); topographic ruggedness from GMTED2010 using the GDAL Terrain Ruggedness Index tool via QGIS; climatic variables Bio1 to Bio35 at 30-minute resolution for the years 1961-1990 from the CliMond database (Kriticos et al., 2012); the surficial lithology classification of Sayre et al. (2013); and the majority land cover class from GlobCover 2009 (Arino et al., 2012). We estimated completeness of taxon sampling for each degree square by comparing the number of species recorded as present with the richness estimates of Barthlott, Mutke, Rafiqpoor, Kier, & Kreft (2005).
A betasim dissimilarity matrix was created from these summarised data and a local NMDS performed. The same betasim dissimilarity matrix was clustered using Ward’s algorithm. The 19 cluster solution was defined as the regional level, and the 99 cluster solution as the district level. Random Forest classification models were built using the summarised environmental data as predictors, using the R package randomForest (Liaw & Wiener, 2002): we trained one model on the 19 regions to predict the regional framework. We subsequently trained 19 models to predict the distribution of the 99 districts within each of the 19 regions, using the same selection of predictor variables. The interpolated regions, and districts, constitute the biogeographic framework presented here.
The biogeographic framework was characterised by the number of taxa, number of endemic taxa, percent endemism, percent sampling completeness, number of indicator species and number of significant indicator species. Continuous environmental data used in the Random Forest model were summarised by their mean and standard deviation, minimum, median, maximum, interquartile range; lower and upper confidence intervals of the median are calculated using +/-1.58 IQR/sqrt(n). Categorical data were summarised by their majority class.
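The median interval quoted above, median +/- 1.58 IQR/sqrt(n), can be written out as a short sketch. The function name and sample values are illustrative assumptions, and quartiles are computed with NumPy's default linear interpolation, which may differ slightly from the quantile rule used in the original analysis.

```python
import numpy as np

def median_interval(values):
    """Return (lower, upper) limits of the median as
    median +/- 1.58 * IQR / sqrt(n)."""
    values = np.asarray(values, dtype=float)
    q1, median, q3 = np.percentile(values, [25, 50, 75])
    half_width = 1.58 * (q3 - q1) / np.sqrt(values.size)
    return median - half_width, median + half_width

# Hypothetical environmental values for the grid cells of one district
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9]
lower, upper = median_interval(sample)
print(round(lower, 3), round(upper, 3))  # 2.893 7.107
```

Here the quartiles are 3 and 7, so the half-width is 1.58 x 4 / 3, about 2.107 either side of the median of 5.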
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
In studies of cognitive neuroscience, multivariate pattern analysis (MVPA) is widely used as it offers richer information than traditional univariate analysis. Representational similarity analysis (RSA), as one method of MVPA, has become an effective decoding method based on neural data by calculating the similarity between different representations in the brain under different conditions. Moreover, RSA is suitable for researchers to compare data from different modalities and even bridge data from different species. However, previous toolboxes have been made to fit specific datasets. Here, we develop NeuroRA, a novel and easy-to-use toolbox for representational analysis. Our toolbox aims at conducting cross-modal data analysis from multi-modal neural data (e.g., EEG, MEG, fNIRS, fMRI, and other sources of neuroelectrophysiological data), behavioral data, and computer-simulated data. Compared with previous software packages, our toolbox is more comprehensive and powerful. Using NeuroRA, users can not only calculate the representational dissimilarity matrix (RDM), which reflects the representational similarity among different task conditions, but also conduct a representational analysis among different RDMs to achieve a cross-modal comparison. In addition, users can calculate neural pattern similarity (NPS), spatiotemporal pattern similarity (STPS), and inter-subject correlation (ISC) with this toolbox. NeuroRA also provides users with functions performing statistical analysis, storage, and visualization of results. We introduce the structure, modules, features, and algorithms of NeuroRA in this paper, as well as examples applying the toolbox in published datasets.
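The RDM idea can be illustrated with a generic NumPy sketch. It does not use NeuroRA's own API; the function names are hypothetical, and using one minus the Pearson correlation as the dissimilarity measure is an assumption made for illustration.

```python
import numpy as np

def correlation_rdm(patterns):
    """Build an RDM from an (n_conditions, n_features) array:
    entry (i, j) is 1 - Pearson r between the patterns of conditions i and j."""
    patterns = np.asarray(patterns, dtype=float)
    return 1.0 - np.corrcoef(patterns)

def compare_rdms(rdm_a, rdm_b):
    """Correlate the upper triangles of two RDMs (diagonal excluded)."""
    upper = np.triu_indices_from(rdm_a, k=1)
    return np.corrcoef(rdm_a[upper], rdm_b[upper])[0, 1]

# Hypothetical response patterns for three task conditions
patterns = np.array([[1.0, 2.0, 3.0, 4.0],
                     [2.0, 4.0, 6.0, 8.0],   # same shape as condition 0
                     [4.0, 3.0, 2.0, 1.0]])  # reversed shape
rdm = correlation_rdm(patterns)
print(np.round(rdm, 3))
```

Conditions 0 and 1 have perfectly correlated patterns, so their dissimilarity is 0, while condition 2 is anti-correlated with both and sits at the maximum of 2. Comparing RDMs from two modalities then reduces to `compare_rdms(rdm_eeg, rdm_fmri)` on matrices built over the same conditions.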
https://spdx.org/licenses/CC0-1.0.html
Aim: Environmental and spatial factors are broadly recognized as important predictors of beta diversity patterns. However, the scale at which beta diversity patterns are evaluated affects the results. For example, studies at larger scales will usually find spatial processes as the main predictor of beta diversity patterns. In this study we evaluate how beta diversity patterns change when analyses are conducted at different scales by reducing the scale of analysis in a hierarchical manner.
Taxon: Chiroptera.
Location: Atlantic Forest biome.
Methods: Information on the occurrence of 59 bat species was obtained from the Atlantic Bats and speciesLink databases. We partitioned beta diversity into its two components (nestedness and turnover), and calculated these indices hierarchically: for the biome in its entirety (all ecoregions); between larger regions (north, central and south); and between ecoregions within each region. We fitted a Generalized Dissimilarity Model (GDM) to identify and predict the turnover of bat species in the Atlantic Forest based on geo-climatic predictors. We obtained 19 geo-climatic variables from AMBDATA, an environmental dataset based on different data sources commonly used in species distribution modeling.
Results: We found that turnover was the main component influencing a latitudinal gradient when the biome was analysed in its entirety. However, when the scale of the analysis was reduced, we found that species loss (nestedness component) had a large effect in determining beta diversity dissimilarity. We also found that nestedness was the main pattern explaining beta diversity dissimilarity along a longitudinal gradient.
Main conclusions: Beta diversity patterns changed with the scale of analysis, which indicates that bat species composition does not follow the same pattern throughout the Atlantic Forest. This corroborates the importance of analysing beta diversity patterns at different scales in order to understand how environmental dissimilarity across geographic space can influence species distribution patterns.
Methods Occurrence and geo-climatic data
Our study was based on 2,626 bat occurrence data points for 525 sites (coordinates) within the Atlantic Forest. Our data came from two sources. We extracted 1,795 occurrence data points for 160 sites from Muylaert et al. (2017). This dataset was compiled by bat specialists who also reviewed the taxonomy of the species and the coordinates of sampling sites. We then used the species list provided in Muylaert et al. (2017) to search for other occurrence records in speciesLink (data downloaded from http://splink.cria.org.br/). We obtained 830 occurrence records of bat species for 364 sites. We reviewed the dataset obtained from speciesLink according to reliability of information regarding: i) coordinates and site correspondence (we used Google Maps to check whether the coordinates referred to the places indicated), ii) correct taxonomy (we excluded species with “sp”, “ssp”, “cf” and “aff”), and iii) voucher specimens (we only considered records with specimens that were deposited in a museum). We also included a single occurrence record of Natalus macrourus (Trajano, 1984) from Parque Estadual Turístico do Alto Ribeira (PETAR), which was not considered by Muylaert et al. (2017) or speciesLink. Occurrence records belonging to the bat families Molossidae, Vespertilionidae and Emballonuridae were not included in our study because they are seldom captured in mist-nets (Nogueira, Pol, & Peracchi, 1999; Nogueira, Pol, Monteiro, & Peracchi, 2008), which was the predominant method used for sampling bat species represented in our data sources.
We obtained geo-climatic data from AMBDATA (available at http://www.dpi.inpe.br/Ambdata/index.php). The AMBDATA is an environmental dataset systematized from different data sources and commonly used in species distribution modelling. It consists of 19 bioclimatic variables at 30 arc-sec resolution (approx. 1 km). These are: 1) annual mean temperature (ºC); 2) mean diurnal range (ºC); 3) isothermality (mean diurnal range divided by annual temperature range, and multiplied by 100); 4) temperature seasonality (standard deviation *100); 5) maximum temperature of warmest month (ºC); 6) minimum temperature of coldest month (ºC); 7) temperature annual range (ºC); 8) mean temperature of wettest quarter (ºC); 9) mean temperature of driest quarter (ºC); 10) mean temperature of warmest quarter (ºC); 11) mean temperature of coldest quarter (ºC); 12) annual precipitation (mm); 13) precipitation of wettest month (mm); 14) precipitation of driest month (mm); 15) precipitation seasonality (coefficient of variation); 16) precipitation of wettest quarter (mm); 17) precipitation of driest quarter (mm); 18) precipitation of warmest quarter (mm); and 19) precipitation of coldest quarter (mm). We also included three non-climatic environmental variables from AMBDATA: 1) tree cover at a 500 m resolution (percentage); 2) elevation (m) at 3 arc-sec horizontal resolution (about 90 m) and a vertical resolution of 1 m and lastly, 3) declivity (degrees) generated from the elevation grid.
Beta diversity and the Generalized Dissimilarity Model (GDM)
There are various dissimilarity indices to measure changes in species composition between assemblages. We used the Sørensen index (βsor) as implemented in the R package ‘betapart’ (Baselga & Orme, 2012). The input data table consists of the presence and absence of bat species for each study site (latitude and longitude). The package computes the total dissimilarity across all sites, and calculates the turnover (Simpson’s index, βsim) and nestedness (the difference between the Sørensen and Simpson indices, βsne) components. ‘betapart’ returns cluster and dissimilarity matrices (between pairs of sites, and pairwise matrices of shared and non-shared species between sites) of turnover and nestedness.
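For a single pair of sites, the partition described above can be sketched in a few lines. This is a minimal illustration in Python rather than the ‘betapart’ implementation; the function name and the presence/absence vectors are hypothetical, and both sites are assumed to hold at least one species.

```python
import numpy as np

def beta_pair(site1, site2):
    """Partition pairwise Sorensen dissimilarity into turnover and nestedness.

    a = species shared by both sites, b and c = species unique to each site.
    beta_sor = (b + c) / (2a + b + c)
    beta_sim = min(b, c) / (a + min(b, c))   (turnover)
    beta_sne = beta_sor - beta_sim           (nestedness)
    """
    s1 = np.asarray(site1, dtype=bool)
    s2 = np.asarray(site2, dtype=bool)
    if not s1.any() or not s2.any():
        raise ValueError("each site must contain at least one species")
    a = int(np.sum(s1 & s2))
    b = int(np.sum(s1 & ~s2))
    c = int(np.sum(~s1 & s2))
    beta_sor = (b + c) / (2 * a + b + c)
    beta_sim = min(b, c) / (a + min(b, c))
    return beta_sor, beta_sim, beta_sor - beta_sim

# Hypothetical presence/absence vectors over six species
rich = [1, 1, 1, 1, 0, 0]
subset = [1, 1, 0, 0, 0, 0]      # a strict subset of `rich`
replaced = [1, 1, 0, 0, 1, 1]    # two species swapped for two others

nested = beta_pair(rich, subset)
turnover = beta_pair(rich, replaced)
print(nested, turnover)
```

When one site is a strict subset of the other, all dissimilarity falls in the nestedness component; when species are replaced one for one, it all falls in the turnover component.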
First, we computed total beta diversity and its two components, nestedness and turnover, among the ten Atlantic Forest ecoregions proposed by Olson et al. (2001). Then we split the Atlantic Forest into three larger regions (southern, central and northern). Lastly, nestedness and turnover were calculated among the ecoregions making up each of the three regions. Each region was treated separately.
We used a species presence data frame with the coordinates of the occurrence sites to perform Generalized Dissimilarity Modeling (GDM), which analyses spatial patterns of pairwise dissimilarity in species composition between sites using nonlinear matrix regression. GDM quantifies dissimilarity using the Sørensen index (total beta diversity), then associates the turnover component (βsim) with biological distance (predictor variables) between sites (Fitzpatrick et al., 2013). The GDM procedure was used to predict bat species turnover across the Atlantic Forest based on environmental data. We used the R package ‘gdm’ (Fitzpatrick & Lisk, 2016) to fit a GDM with the 22 environmental variables and the geographical distance (decimal degrees) between occurrence sites. The latter was calculated using the option ‘geo=T’ in the function ‘gdm’ of the gdm package. We used the parameter ‘weightType=richness’ to weight sites relative to the number of species to minimize sampling bias. We chose not to exclude sites with few species (i.e. fewer than five) because in our database over 360 occurrence sites had five or fewer species, and 90 sites had 10 species or fewer. Fewer than 11 sites recorded 50% or more of the total number of species (59), so excluding sites with few species would lead to a significant loss of data. In addition, the average number of species per site was five, and sites with few species are evenly distributed throughout the biome. Therefore, retaining all sites, including those with few species, while correcting for species richness does not weaken our model (see Fig. S1.2). Patterns of species turnover can be visualized on a raster with RGB colour standards; areas with similar colours contain similar assemblages. The GDM matrix regression used I-splines with three basis functions, meaning that we used three points (the minimum) to form the I-spline curve (Fitzpatrick & Lisk, 2016).
I-Splines can be visualized in a graph showing the relationship of predicted biological distance versus observed biological distance, providing an indication of how species composition changes along each environmental gradient (Fitzpatrick & Lisk, 2016). The selection of the best subset of predictors for our model followed Williams, Belbin, Austin, Stein, & Ferrier (2012): the initial model included all predictors; variables that contributed less than 2% to model explanation were iteratively removed. Variable removal was done on a stepwise basis beginning with the elimination of the variable that contributed the least to model explanation. Variables were reassessed regarding their importance and significance during each step of model reduction (i.e., backward elimination). Our model started with 23 predictor variables and ended with 11.
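The backward elimination loop can be sketched generically. The sketch below is not the gdm workflow itself: `contribution_fn` is a hypothetical stand-in for refitting the model and returning each predictor's percentage contribution, and the toy weights are invented for illustration.

```python
def backward_eliminate(predictors, contribution_fn, threshold=2.0):
    """Iteratively drop the weakest predictor until every remaining
    predictor contributes at least `threshold` percent.

    contribution_fn(kept) must return {predictor: percent contribution}
    for a model refitted with only the predictors in `kept`.
    """
    kept = list(predictors)
    while len(kept) > 1:
        contribution = contribution_fn(kept)  # reassess at every step
        weakest = min(kept, key=lambda name: contribution[name])
        if contribution[weakest] >= threshold:
            break
        kept.remove(weakest)
    return kept

# Toy stand-in for model refitting: fixed raw weights renormalized to 100 %
RAW = {"temperature": 50.0, "precipitation": 30.0, "elevation": 18.0,
       "slope": 1.5, "tree_cover": 0.5}

def toy_contribution(kept):
    total = sum(RAW[name] for name in kept)
    return {name: 100.0 * RAW[name] / total for name in kept}

selected = backward_eliminate(RAW.keys(), toy_contribution)
print(selected)  # ['temperature', 'precipitation', 'elevation']
```

In the toy run, tree cover (0.5 %) is removed first, slope (about 1.5 % after renormalization) second, and the loop stops once the weakest remaining predictor exceeds 2 %.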
Non-rigid 3D objects are commonly seen in our surroundings. However, previous efforts have been mainly devoted to the retrieval of rigid 3D models, and thus comparing non-rigid 3D shapes is still a challenging problem in content-based 3D object retrieval. Therefore, we organize this track to promote the development of non-rigid 3D shape retrieval. The objective of this track is to evaluate the performance of 3D shape retrieval approaches on a large-scale database of non-rigid 3D watertight meshes generated by our group.
Task description: The task is to evaluate the dissimilarity between every two objects in the database and then output the dissimilarity matrix.
Data set: Our large-scale database consists of 600 non-rigid 3D objects (see the figure for some examples) created by our group using modeling software and our own code. We classified the models so that every class contains an equal number of models. The models are represented as watertight triangle meshes and the file format is the ASCII Object File Format (*.off). (Note: some of these models, which we recreated and modified with permission, originally come from several publicly available databases, such as the McGill database, TOSCA shapes, and the Princeton Shape Benchmark.)
Evaluation Methodology: We will employ the following evaluation measures: Precision-Recall curve; E-Measure; Discounted Cumulative Gain; Nearest Neighbor, First-Tier (Tier1) and Second-Tier (Tier2).
Please cite the paper: SHREC'11 Track: Shape Retrieval on Non-rigid 3D Watertight Meshes, Z. Lian, A. Godil, B. Bustos, M. Daoudi, J. Hermans, S. Kawamura, Y. Kurita, G. Lavoué, H.V. Nguyen, R. Ohbuchi, Y. Ohkita, Y. Ohishi, F. Porikli, M. Reuter, I. Sipiran, D. Smeets, P. Suetens, H. Tabia, and D. Vandermeulen, In: H. Laga and T. Schreck, A. Ferreira, A. Godil, I. Pratikakis, R. Veltkamp (eds.), Proceedings of the Eurographics/ACM SIGGRAPH Symposium on 3D Object Retrieval, 2011.
http://dx.doi.org/10.2312/3DOR/3DOR11/079-088
https://spdx.org/licenses/CC0-1.0.html
Aim: to analyse temporal metacommunity dynamics in river networks in relation to hydrological conditions and dispersal.
Location: 15 river reaches from the Llobregat, Besòs and Foix catchments in the North-Eastern Iberian Peninsula.
Taxon: aquatic macroinvertebrates belonging to 99 different families.
Methods: we sampled aquatic macroinvertebrate communities during spring in 20 consecutive years. We built two environmental distances (one related to water chemistry and the other to river flow regime) and two spatial distances (network distance and topographic distance). Then we used Mantel tests (accounting for spatial autocorrelation) to relate macroinvertebrate dissimilarity to environmental and spatial distances. Additionally, we determined the dry and wet years using the Standardised Precipitation Index (SPI) and we classified macroinvertebrate families based on their ability to fly and to drift. Finally, we ran a linear regression model including the correlation value (r) of each Mantel test as response variable and distance type (environmental or spatial), SPI, dispersal mode, their pairwise interactions and a three-way interaction as predictor variables.
Results: metacommunity organization varied over time and it was significantly affected by precipitation, which can be related to river network connectivity. The environmental filters, mainly the flow regime, were generally more important than the spatial filters in explaining community dissimilarity over the study period. However, this depended on the dispersal abilities of the organisms. Network fragmentation due to flow intermittence during the dry years significantly reduced the dispersal capacity of strong aerial dispersers, leading to spatially structured metacommunities. For strong drift dispersers, community dissimilarity patterns were generally best explained by environmental filters regardless of SPI.
Main conclusions: a significant temporal variation in metacommunity organization can be expected in highly dynamic systems (e.g. Mediterranean rivers) and it might depend on the dispersal modes and abilities of the organisms, since they determine the response to changes in environmental and landscape filters.
Methods
We studied 15 river reaches from the Llobregat, Besòs and Foix catchments in the North-Eastern Iberian Peninsula. The annual precipitation ranged from 500 to 1400 mm, and the average annual temperature from 7 to 15 °C. The mean flow ranged from 0 to 12,000 l/s, thereby including both perennial (i.e. surface flow is maintained throughout the year) and temporary (i.e. surface flow ceases during dry periods) river reaches. All river reaches were under low human pressure as they fulfilled most of the 20 criteria to meet reference conditions in Mediterranean rivers. These criteria include a wide range of human uses and disturbances on rivers and streams (e.g., diffuse sources of pollution, invasive species, land use intensity, riparian vegetation, river geomorphology, habitat conditions and hydrological alterations) and some general aspects of naturalness, and they have already been used to assess the impact of stressors in Mediterranean rivers.
At each site and sampling date, we recorded water temperature, conductivity, oxygen (concentration and percentage of saturation) and pH using a YSI® Pro Plus multi-parametric digital probe. We calculated river flow (l/s) by measuring the river cross-section (width × depth) and the water velocity with a Schiltknecht® MiniAir2 digital anemometer. Additionally, a water sample was collected, filtered through glass fibre filters (GF/F; Whatman, Maidstone, UK), transported to the laboratory on ice, and frozen for water chemistry analysis. At the laboratory, we analyzed major anions (chloride, sulphates, nitrites and nitrates) by high-pressure liquid chromatography and estimated soluble reactive phosphorus and ammonium concentrations using standard colorimetric methods.
We collected aquatic macroinvertebrates using a circular hand net of 250 µm mesh size. Sampling consisted of an initial 3-min kick sample from all available habitats. We examined the initial kick sample in the field and we collected successive samples until no additional macroinvertebrate families were found. Since samples were originally collected for biomonitoring purposes, family-level resolution was used. However, families should be good surrogates of species-level assemblage patterns. Samples were preserved in 10% formaldehyde or 70% ethanol solution following the ECOSTRIMED protocol. We sorted the samples and identified macroinvertebrates to family level in the laboratory. The abundance of each occurring family was quantified and ranked as follows: 1 for 1 to 3 individuals, 2 for 4 to 10 individuals, 3 for 11 to 100 individuals and 4 for more than 100 individuals. This ranking followed the application of biological indices for assessing water quality, which was the initial purpose of this database.
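The abundance ranking described above can be sketched as a small lookup. This is only an illustration of the four classes given in the text; the function name and the example counts are hypothetical and do not come from the dataset.

```python
def abundance_rank(count: int) -> int:
    """Map a raw family count to the ordinal rank described in the text.

    1: 1-3 individuals, 2: 4-10, 3: 11-100, 4: more than 100.
    """
    if count < 1:
        # The ranking is only defined for families that occur in the sample.
        raise ValueError("count must be at least 1")
    if count <= 3:
        return 1
    if count <= 10:
        return 2
    if count <= 100:
        return 3
    return 4


# Hypothetical counts for one sample (illustrative values only).
sample_counts = {"Baetidae": 57, "Gammaridae": 240, "Perlidae": 2, "Elmidae": 9}
sample_ranks = {family: abundance_rank(n) for family, n in sample_counts.items()}
```

Because the ranks are ordinal, any downstream dissimilarity computed on them reflects abundance classes rather than raw counts.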
We collected all samples during the spring season (mostly in May) in 20 consecutive years (1997-2017).
Standardised Precipitation Index (SPI)
We determined the dry and wet years using the Standardised Precipitation Index (SPI). SPI represents the standardized deviation from a reference series. Positive and negative SPI values indicate precipitation greater or lower than the mean of the reference series, respectively. It is usually computed using a three-, nine- or 12-month span, but we chose the three-month SPI (i.e. accumulated precipitation of the last three months before each sampling date) as it reflects short- to medium-term soil moisture characteristics and it has been successfully used as a surrogate of river flow in our study region. Therefore, we assumed that lower SPI values (dry years) corresponded to a higher probability of dry river reaches, whereas higher SPI values (wet years) corresponded to a higher probability of high flows (supporting information). We calculated the three-month SPI for each year from 1997 to 2017 using available monthly rainfall data measured in six weather stations of the Meteorological Service of Catalonia from 1950 to 2017.
According to the SPI values, the study periods could be classified as extremely dry (SPI ≤ -2), severely dry (-2 < SPI ≤ -1.5), moderately dry (-1.5 < SPI ≤ -1), neutral (-1 < SPI < 1), moderately wet (1 ≤ SPI < 1.5), severely wet (1.5 ≤ SPI < 2), extremely wet (2 ≤ SPI).
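The class boundaries listed above translate directly into a chain of comparisons. The sketch below only restates those thresholds; the function name is an assumption made for illustration.

```python
def classify_spi(spi: float) -> str:
    """Return the dryness/wetness class for an SPI value, using the
    thresholds listed in the text."""
    if spi <= -2.0:
        return "extremely dry"
    if spi <= -1.5:
        return "severely dry"
    if spi <= -1.0:
        return "moderately dry"
    if spi < 1.0:
        return "neutral"
    if spi < 1.5:
        return "moderately wet"
    if spi < 2.0:
        return "severely wet"
    return "extremely wet"
```

Note that the boundaries are asymmetric in the text: the dry classes include their upper bound (e.g. SPI = -1 is moderately dry), whereas the wet classes include their lower bound (e.g. SPI = 1 is moderately wet).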
The SPI values are standardised and normalised with the following formula:
SPI = (P_i − P_m) / S_p
where P_i is the accumulated precipitation of a defined time span (i), P_m is the mean of rainfall of the period analysed, and S_p is the standard deviation of the series of precipitation from the same period.
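As a minimal sketch of the formula above, the following computes the standardised deviation of one accumulated-precipitation value from a reference series. It assumes the simple z-score form given in the text and uses the sample standard deviation; whether the original analysis used the sample or population form is not stated. The reference values are invented for illustration.

```python
from statistics import mean, stdev


def spi(p_i: float, reference: list[float]) -> float:
    """Standardised deviation of p_i from a reference precipitation series.

    p_i: accumulated precipitation for the time span of interest.
    reference: accumulated precipitation for the same span in each
               reference year (at least two values).
    """
    p_m = mean(reference)   # mean rainfall of the reference period
    s_p = stdev(reference)  # sample standard deviation (an assumption)
    if s_p == 0:
        raise ValueError("reference series has no variance")
    return (p_i - p_m) / s_p


# Hypothetical three-month precipitation totals (mm) for five reference years.
reference_totals = [100.0, 120.0, 140.0, 160.0, 180.0]
example_spi = spi(180.0, reference_totals)
```

A value equal to the reference mean gives SPI = 0, and values above or below the mean give positive or negative SPI, matching the interpretation in the text.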
Classification of the stream flow regime
We used the TREHS (Temporary Rivers Ecological and Hydrological Status) open access software to classify each site as perennial or temporary according to its natural hydrological regime. We calculated the coefficients Mf (% of months in a year with flow), Mp (% of months in a year with isolated pools) and Md (% of months in a year with a dry riverbed), which integrate information about the flow regime of the rivers over the years. According to this information, 8 river reaches were perennial and 7 were temporary.
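To make the three coefficients concrete, the sketch below computes Mf, Mp and Md from a list of monthly aquatic states. It is not the TREHS implementation; the state labels and the function name are assumptions, and the rule that TREHS uses to separate perennial from temporary reaches is not reproduced here.

```python
def regime_coefficients(monthly_states: list[str]) -> dict[str, float]:
    """Percentage of months with flow (Mf), isolated pools (Mp) and a dry
    riverbed (Md). Each entry must be 'flow', 'pools' or 'dry'."""
    allowed = {"flow", "pools", "dry"}
    unknown = set(monthly_states) - allowed
    if unknown:
        raise ValueError(f"unknown states: {sorted(unknown)}")
    n = len(monthly_states)
    if n == 0:
        raise ValueError("at least one month is required")
    return {
        "Mf": 100.0 * monthly_states.count("flow") / n,
        "Mp": 100.0 * monthly_states.count("pools") / n,
        "Md": 100.0 * monthly_states.count("dry") / n,
    }


# Hypothetical year: nine months of flow, two of isolated pools, one dry.
example_year = ["flow"] * 9 + ["pools"] * 2 + ["dry"]
example_coefficients = regime_coefficients(example_year)
```

The three percentages always sum to 100, so any two of them determine the third.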
Classifying macroinvertebrates into dispersal groups
We classified macroinvertebrate families according to their ability to fly and to drift using information from a set of biological traits and expert opinion as described in the DISPERSE database. For the ability to fly, we considered traits that indicate dispersal mode (passive/active in the aquatic and the aerial environment), but also other related traits from the larvae and adult stages (e.g., female wing length, adult life span). Female wing length included 8 categories from <5 mm to >50 mm, whereas adult life span included 4 categories from < 1 week to > 1 year. For the ability to drift, we considered propensity to drift, which included three categories from rare to frequent intentional drift. Finally, considering these traits and expert opinion, we assigned an affinity to fly and to drift between 1 and 3 (1 for low affinity to 3 for strong affinity) to each family, and created two dispersal groups: strong flyers (i.e. strong fly affinity) and strong drifters (i.e. medium or strong drift affinity). This affinity accounted for intrafamily trait variability (i.e. associated to different genera or species) following the fuzzy coding approach. We considered taxa with medium drift affinity as strong drifters because only three families had the maximum drift affinity. The strong flyers group comprised 23 families (mainly Odonata, Trichoptera, Heteroptera and some Coleoptera and Diptera) and the strong drifters group comprised 20 families (mainly Ephemeroptera, Trichoptera, Plecoptera and some Diptera, plus the Crustacea Gammaridae).
Calculation of distances
We built a Euclidean matrix (water chemistry distance matrix) based on log-transformed and standardised values (mean=0, SD=1) of river flow, oxygen, ammonium, nitrites, nitrates, soluble reactive phosphorus, sulphates and chloride to characterise local environmental conditions. We also built a hydrological dissimilarity matrix through a Gower index based on the hydrological regime (Mf, Mp, Md), river flow of each sample and river flow of the previous summer. The latter was intended to include information on preceding hydrological conditions that can be relevant for aquatic macroinvertebrates (i.e. the macroinvertebrate community can be different if it comes from a wet or dry year). This matrix expressed the differences in local hydrological conditions between sites and it was termed “flow regime difference”.
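The two environmental matrices described above can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the authors' code: log(x + 1) is used so that zero flows can be transformed, the sample standard deviation is used for standardisation, and the Gower index is written in its form for quantitative variables (mean of range-scaled absolute differences). All variable names and values are hypothetical.

```python
import math
from statistics import mean, stdev


def standardise(values):
    """Centre to mean 0 and scale to SD 1 (sample SD; fails if constant)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def euclidean_matrix(table):
    """table maps variable name -> list of per-site values (same site order).
    Values are log(x + 1)-transformed, standardised, then compared."""
    columns = [standardise([math.log1p(v) for v in vals]) for vals in table.values()]
    sites = list(zip(*columns))  # one tuple of variables per site
    return [[math.dist(a, b) for b in sites] for a in sites]


def gower_matrix(table):
    """Gower dissimilarity for quantitative variables: the mean over
    variables of |x_i - x_j| divided by that variable's range."""
    columns = list(table.values())
    n = len(columns[0])
    usable = [(c, max(c) - min(c)) for c in columns if max(c) > min(c)]
    if not usable:
        return [[0.0] * n for _ in range(n)]
    return [
        [sum(abs(c[i] - c[j]) / r for c, r in usable) / len(usable) for j in range(n)]
        for i in range(n)
    ]


# Hypothetical values for three sites.
chemistry = {"chloride": [10.0, 20.0, 80.0], "nitrates": [1.0, 2.0, 8.0]}
hydrology = {"Mf": [100.0, 50.0, 0.0], "flow": [10.0, 10.0, 0.0]}
chem_dist = euclidean_matrix(chemistry)
flow_diff = gower_matrix(hydrology)
```

The Gower form keeps every variable on a 0 to 1 scale, which is convenient when percentages (Mf, Mp, Md) are combined with flows measured in l/s.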
The dispersal-based distances included two physical distances (geographical and network distance) and a landscape resistance distance (topographic distance), which were used to approximate
https://spdx.org/licenses/CC0-1.0.html
In this study, we compared neural song circuit morphology to singing behavior recorded in the field for 17 male and 18 female house wrens. The acoustic complexity of house wren songs was quantified using a recently published machine learning approach. This data set includes recordings of all house wren songs used in this analysis along with Raven selection tables defining the boundaries of each syllable. This includes 109 female songs. R code used to extract acoustic features and estimate element diversity and our proxy for song acoustic complexity are included. Summaries of acoustic variables for each song and each element are provided as well as files necessary to replicate the analysis. For each bird, we measured volume, cell number, cell density, and neuron soma size for three song circuits, Area X, HVC (used as a proper name), and the robust nucleus of the arcopallium (RA), and one control region, the nucleus rotundus (Rt). This data set includes these neural morphology measurements for each bird as well as R code used to (1) compare males and females for each neural measurement and (2) explore the relationship between acoustic complexity and neural morphology within each sex.
Methods
Wild house wrens were recorded in the field singing spontaneously or in response to playback recordings of male or female house wren songs. Songs were clipped from much longer song recordings with 1 second before the start and 1 second after the end of the song. No further processing occurred. All songs used in this analysis can be found in the "songs.zip" file. The start and end of each element in the song were defined manually in Raven using both the spectrogram and waveform. These boundaries can be found in the .txt file associated with each sound file (.wav file) in the "songs.zip" folder.
Signal-to-noise ratios (SNR) were used to select songs of suitable quality for the rest of the analysis. Users can use the "snr.and.automatic.frequency.detection.r" script to replicate this calculation for all sounds in the "songs.zip" file. When songs with a suitable SNR were selected, we used this same R script to automatically detect the frequency boundaries of each element. These were then viewed in Raven and corrected for any obvious deviations driven by interfering background noise. These final values are included in the .txt file for each sound.
We then used a machine-learning approach to quantify the acoustic complexity of each song. After transforming the variables and removing any collinear ones, an unsupervised random forest was used to determine which variables best divide the data. This results in a dissimilarity matrix for each syllable, which was then transformed into vectors using classical multidimensional scaling. These vectors define the "acoustic space" occupied by house wren song elements. A 95% minimum convex polygon was then used to determine how much acoustic space the elements within a single song occupy. Songs that occupy more space have a larger range of signal types. This final calculation is referred to as element diversity and is our measure of acoustic complexity. The "snr.and.automatic.frequency.detection.r" script provides the workflow to replicate this acoustic complexity calculation starting with the songs and .txt files in the "songs.zip" file. Users may also skip earlier steps of this analysis by using the files "acoustic.parameters.csv", "Transformed.non-colinear.acous.meas.csv" or "mds.acoustic.area.points.csv" as described in the "README.md" document.
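The final step, the area of a 95% minimum convex polygon around a song's elements in two-dimensional acoustic space, can be sketched as below. The R scripts in the dataset remain the reference workflow; this Python version is only an illustration. In particular, dropping the 5% of points farthest from the centroid is an assumption about how the 95% polygon is defined, and the coordinates are invented.

```python
import math


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def polygon_area(vertices):
    """Shoelace formula; zero for fewer than three vertices."""
    n = len(vertices)
    if n < 3:
        return 0.0
    twice = sum(
        vertices[i][0] * vertices[(i + 1) % n][1]
        - vertices[(i + 1) % n][0] * vertices[i][1]
        for i in range(n)
    )
    return abs(twice) / 2.0


def mcp_area(points, percent=95.0):
    """Area of the convex hull of the `percent` of points closest to the
    centroid (assumed definition of the minimum convex polygon)."""
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    keep = min(len(points), max(3, math.floor(len(points) * percent / 100.0)))
    ranked = sorted(points, key=lambda p: math.hypot(p[0] - cx, p[1] - cy))
    return polygon_area(convex_hull(ranked[:keep]))


# Hypothetical song: 19 elements inside a unit square plus one outlier.
elements = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
elements += [(0.1 * i, 0.5) for i in range(1, 10)]
elements += [(0.5, 0.1 * j) for j in (1, 2, 3, 4, 6, 7)]
elements.append((10.0, 10.0))
element_diversity = mcp_area(elements, 95.0)
```

With 20 elements, the 95% polygon discards the single most distant point, so one atypical element does not inflate the estimated acoustic space.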
Seventeen male and 18 female house wrens were collected; their brains were removed, frozen, sectioned, and stained; and neural morphology was measured under brightfield microscopy in three song control regions (Area X, HVC, and RA) and one control region (Rt). Further detailed methods can be found in the associated manuscript. All neural morphology measurements can be found in "all.bird.neural.data.csv". The "statistics.and.figs.r" script provides the workflow to replicate all statistics and figures in the manuscript. Here we compare males and females for each morphology metric and investigate how song acoustic complexity relates to neural morphology for each sex separately.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
Which socioeconomic factor as a basis for grouping population yields the highest intergroup, but the lowest intragroup, heterogeneity in cooking-fuel choice? In this paper, using post-earthquake data on 747,137 households from Nepal, we construct a Euclidean dissimilarity matrix that exhibits the link between the households’ cooking-fuel choice and their socioeconomic group identities. We then employ PERMANOVA, a distance-based multivariate semiparametric method, and find that ethnicity as a grouping factor leads to about 39.1% of intergroup variance in cooking fuel choice, followed by income (26.3%), education (12.6%), and location (4.1%). We also find two distinct clusters of ethnic groups exhibiting similar fuel-choice behaviors. These findings underscore the importance of ethnic-group specific policies in promoting clean cooking in post-earthquake Nepal.
https://spdx.org/licenses/CC0-1.0.html
The rise of jawed vertebrates (gnathostomes) and extinction of nearly all jawless vertebrates (agnathans) is one of the most important transitions in vertebrate evolution, but the causes are poorly understood. Competition between agnathans and gnathostomes during the Devonian period is the most commonly hypothesized cause; however, no formal attempts to test this hypothesis have been made. Generally, competition between species increases as morphological similarity increases; therefore, this study uses the largest-to-date morphometric comparison of Silurian and Devonian agnathan and gnathostome groups to determine which groups were most and least likely to have competed. Five agnathan groups (Anaspida, Heterostraci, Osteostraci, Thelodonti, and Furcacaudiformes) were compared with five gnathostome groups (Acanthodii, Actinopterygii, Chondrichthyes, Placodermi, Sarcopterygii) including taxa from most major orders. Morphological dissimilarity was measured by Gower’s dissimilarity coefficient, and the differences between agnathan and gnathostome body forms across early vertebrate morphospace were compared using principal coordinate analysis. Our results indicate competition between some agnathans and gnathostomes is plausible, but not all agnathan groups were similar to gnathostomes. Furcacaudiformes (fork-tailed thelodonts) are distinct from other early vertebrate groups and the least likely to have competed with other groups. Methods Supplementary table T3: 29 measurements of body form and the size, shape, and position of eyes and fins taken from fossil specimens of Silurian and Devonian vertebrates using calipers and, in rare cases for large specimens, tailor tape. Categorized by Class. Proportions of each measurement relative to standard length (tip of the rostrum to the caudal peduncle) calculated for each specimen. For taxa with multiple specimens, averages were taken. 
Lengths or leading edges of structures that were absent were recorded as "0"; distances (from the rostrum) and missing measurements (from incomplete specimens) were recorded as "-" or not applicable. Proportions for the 120 taxa used in the associated article were z-transformed. A supplementary file containing additional supplementary tables and figures, with results, is included, as are the R script and the files needed to calculate Gower's dissimilarity matrix and the principal coordinate analysis, as well as correlation coefficients and the coefficient of determination for each variable on the first four axes.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
We consider clustering in group decision making where the opinions are given by pairwise comparison matrices. In particular, the k-medoids model is suggested to classify the matrices since it has a linear programming problem formulation that may contain any condition on the properties of the cluster centres. Its objective function depends on the measure of dissimilarity between the matrices but not on the weights derived from them. Our methodology provides a convenient tool for decision support, for instance, it can be used to quantify the reliability of the aggregation. The proposed theoretical framework is applied to a large-scale experimental dataset, on which it is able to automatically detect some mistakes made by the decision-makers, as well as to identify a common source of inconsistency.
Fish distribution database
This file contains the freshwater fish species presence/absence data used in our analysis. The first column lists each of the rivers for which we had community data, while the remaining columns indicate the presence/absence of individual species coded as 1 = present, 0 = absent.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
Upper numbers in cells of the lower triangle of the matrix depict the percentage dissimilarity of pair-wise comparisons according to a SIMPER analysis based on stable carbon and nitrogen isotope data. The letter and numbers in the lower part of the same cell depict the element that is most responsible for the dissimilarity and the percentage contribution of this element to the dissimilarity between pairs. Numbers in the upper triangle of the matrix depict the R- and P-values of pair-wise comparisons according to an ANOSIM. R-values range between 0 and 1, with values above 0.75 indicating well-separated species pairs and values below 0.25 indicating species that are barely separable based on stable isotope ratios [34]. Pairs of species with similar stable isotope signatures are highlighted in bold (see text for exceptions) and prey categories are highlighted with colours: browser in green, grazer/high δ15N in yellow and grazer/low δ15N in blue.
Dissimilarity matrix of potential prey species according to an analysis of similarity (ANOSIM).
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
Biogeographical studies have traditionally focused on readily visible organisms, but recent technological advances are enabling analyses of the large-scale distribution of microscopic organisms, whose biogeographical patterns have long been debated. Here we assessed the global structure of plankton geography and its relation to the biological, chemical and physical context of the ocean (the 'seascape') by analyzing metagenomes of plankton communities sampled across oceans during the Tara Oceans expedition, in light of environmental data and ocean current transport. Using a consistent approach across organismal sizes that provides unprecedented resolution to measure changes in genomic composition between communities, we report a pan-ocean, size-dependent plankton biogeography overlying regional heterogeneity. We found robust evidence for a basin-scale impact of transport by ocean currents on plankton biogeography, and on a characteristic timescale of community dynamics going beyond simple seasonality or life history transitions of plankton.
Supplementary Table 1. List of Tara Oceans samples sequenced with a metabarcoding (18S V9) approach and with a metagenomic approach, including identifiers for sequencing reads deposited in the DDBJ/ENA/GenBank Short Read Archives (SRA). [This Table is identical in version 2.]
Supplementary Table 2. Table of environmental parameters for each sample. [This Table is identical in version 2.]
Supplementary Table 3. Matrix of metagenomic dissimilarity for the 0-0.22 μm size fraction. [This Table is identical in version 2.]
Supplementary Table 4. Matrix of metagenomic dissimilarity for the 0.22-1.6/3 μm size fraction. [This Table is identical in version 2.]
Supplementary Table 5. Matrix of metagenomic dissimilarity for the 0.8-5 μm size fraction. [This Table is identical in version 2.]
Supplementary Table 6. Matrix of metagenomic dissimilarity for the 5-20 μm size fraction. [This Table is identical in version 2.]
Supplementary Table 7. Matrix of metagenomic dissimilarity for the 20-180 μm size fraction. [This Table is identical in version 2.]
Supplementary Table 8. Matrix of metagenomic dissimilarity for the 180-2000 μm size fraction. [This Table is identical in version 2.]
Supplementary Table 9. Matrix of OTU dissimilarity for the 0-0.22 μm size fraction. [This Table is identical in version 2.]
Supplementary Table 10. Matrix of OTU dissimilarity for the 0.22-1.6/3 μm size fraction. [This Table is identical in version 2.]
Supplementary Table 11. Matrix of OTU dissimilarity for the 0.8-5 μm size fraction. [This Table is identical in version 2.]
Supplementary Table 12. Matrix of OTU dissimilarity for the 5-20 μm size fraction. [This Table is identical in version 2.]
Supplementary Table 13. Matrix of OTU dissimilarity for the 20-180 μm size fraction. [This Table is identical in version 2.]
Supplementary Table 14. Matrix of OTU dissimilarity for the 180-2000 μm size fraction. [This Table is identical in version 2.]
Supplementary Table 15. Matrix of minimum travel time, in years. [This Table is identical in version 2.]
Supplementary Table 16. Matrix of minimum geographic distance (without traversing land), in kilometers. [This Table is identical in version 2.]
Supplementary Table 17. Matrix of imaging-based dissimilarity. [This Table is identical in version 2.]
Supplementary Table 18. Matrix of metagenome-assembled genome (MAG)-based dissimilarity for the 20-180 μm size fraction. [The filename of this Table was modified from version 2. The contents of the Table are identical.]
Supplementary Table 19. The cophenetic correlation coefficient for different methods of clustering metagenomic dissimilarity. [This Table is identical in version 2.]
Supplementary Table 20. Baker's Gamma index comparing clustering results within size fractions. [This Table is identical in version 2.]
Supplementary Table 21. Rand Index for K-means and spectral clustering, and multivariate ANOVA calculated by the adonis function. [This Table is identical in version 2.]
Dataset 1. Reference database (in FASTA format) used to perform taxonomic assignment of metabarcodes. The header line of each reference V9 rDNA barcode (with a > sign) contains a unique identifier derived from GenBank accession number, followed by the taxonomic path associated to the reference barcode. [This Dataset is identical in version 2.]
Dataset 2. V9 rDNA abundance at the metabarcode level. md5sum = unique identifier; totab = total abundance across all samples; cid = identifier of the OTU to which the barcode belongs (see Dataset 3); pid = best percentage identity to a barcode in Dataset 1; refs = identifier(s) of the best matching barcode(s) in Dataset 1; lineage = taxonomic lineage of the best match in Dataset 1; taxogroup = high-level taxonomic grouping of the best match in Dataset 1; sequence = V9 rDNA sequence; TV9_XXX = barcode abundance by sample (see Supplementary Table 1 for sample identifiers). [This Dataset is identical in version 2.]
Dataset 3. V9 rDNA abundance at the OTU (operational taxonomic unit) level. cid = identifier of the OTU; md5sum = unique identifier of the most abundant barcode in the OTU; pid, refs, lineage, taxogroup, sequence = defined as in Dataset 2; rtotab = total abundance of the most abundant barcode in the OTU; ctotab = total abundance of all barcodes in the OTU; TV9_XXX = abundance by sample of all barcodes in the OTU (see Supplementary Table 1 for sample identifiers). [This Dataset is identical in version 2.]
Dataset 4. Relative abundances of metagenome-assembled genomes (MAGs) in metagenomic samples from the 20-180 μm size fraction. [This Dataset is new in version 3.]
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
PERMANOVA results for square-root-transformed relative abundance data generated by MaxN and MLT, using a Bray-Curtis dissimilarity matrix and one dummy variable. Significant values are highlighted in bold.