Despite a growing recognition of the importance of data to the economy and to science, investment in repositories to manage and disseminate that data in easily accessible and understandable ways is scarce. Keeping repository services active and up-to-date for a long time period is difficult due to this funding situation. As a result, repositories must continually provide proof of their value, their Return on Investment (ROI) to their sponsors; yet doing so has always been difficult, problematic and not always successful. In this work, an analysis of approaches for assessing the ROI of several scientific data repositories has identified various techniques that repositories use to report on the impact and value of their data products and services. A survey of selected repositories rated the set of metrics identified and rated each by its importance as well as the ease with which the metric could be measured. The discussion is broken down into considerations for calculating costs, perceived value of repositories and suggested metrics that would allow a repository to calculate an ROI. The authors, representatives of environmental data repositories, concluded that easily obtainable data use metrics, such as data downloads, etc., have limited value while more informative analyses would require additional resources.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This resource was created for the dataset of the paper "Toward Open and Reproducible Environmental Modeling by Integrating Online Data Repositories, Computational Environments, and Model Application Programming Interfaces".
This resource includes: 1 Model Program Resource, 7 Model Instance Resources, and 2 Composite Resources.
This study was performed to determine how different soil moistures, soil sources, and agricultural practices affected the gross CH4 fluxes (i.e., rates of methanogenesis) of soils. We extracted intact soil cores from two agricultural sites in the USA in row crop plots under conventional, no-till, and organic management. We then took them to the lab, manipulated their moisture levels, incubated them at room temperature for 22 weeks, and measured gas fluxes at weeks 6 and 21. We developed and utilized a new form of CH4 isotope pool dilution (IPD) to estimate gross CH4 production and consumption fluxes. This new method can measure IPD in a bag headspace that loses volume over time due to sampling. We fit the IPD model to the data and extracted gross CH4 production (P) and consumption (K) constants. These along with calculated fluxes and covariates measured (e.g., moisture, inorganic N) are reported in the main data table.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The current and future consequences of anthropogenic impacts such as climate change and habitat loss on ecosystems will be better understood and therefore addressed if diverse ecological data from multiple environmental contexts are more effectively shared. Re-use requires that data are readily available to the scientific scrutiny of the research community. A number of repositories to store shared data have emerged in different ecological domains and developments are underway to define common data and metadata standards. Nevertheless, the goal is far from being achieved and many challenges still need to be addressed. The definition of best practices for data sharing and re-use can benefit from the experience accumulated by pilot collaborative projects. The Euromammals bottom-up initiative has pioneered collaborative science in spatial animal ecology since 2007. It involves more than 150 institutes to address scientific, management and conservation questions regarding terrestrial mammal species in Europe using data stored in a shared database. In this manuscript we present some key lessons that we have learnt from the process of making shared data and knowledge accessible to researchers and we stress the importance of data management for data quality assurance. We suggest putting in place a pro-active data review before data are made available in shared repositories via robust technical support and users’ training in data management and standards. We recommend pursuing the definition of common data collection protocols, data and metadata standards, and shared vocabularies with direct involvement of the community to boost their implementation. We stress the importance of knowledge sharing, in addition to data sharing. We show the crucial relevance of collaborative networking with pro-active involvement of data providers in all stages of the scientific process. 
Our main message is that for data-sharing collaborative efforts to obtain substantial and durable scientific returns, the goals should not only consist in the creation of e-infrastructures and software tools but primarily in the establishment of a network and community trust. This requires moderate investment, but over long-term horizons.
These are quality-assured time series datasets from weather stations and runoff volume monitoring infrastructure, Cleveland OH. This dataset is associated with the following publication: Shuster, W., and R. Darner. Hydrologic Performance of Retrofit Rain Gardens in a Residential Neighborhood (Cleveland Ohio USA) with a Focus on Monitoring Methods. U.S. Environmental Protection Agency, Washington, DC, USA, 2018.
Accelerometers in animal-attached tags have proven to be powerful tools in behavioural ecology, being used to determine behaviour and provide proxies for movement-based energy expenditure. Researchers are collecting and archiving data across systems, seasons and device types. However, in order to use data repositories to draw ecological inference, we need to establish the error introduced according to sensor type and position on the study animal and establish protocols for error assessment and minimization.
Using laboratory trials, we examine the absolute accuracy of tri-axial accelerometers and determine how inaccuracies impact measurements of dynamic body acceleration (DBA) in human participants, with DBA as the main acceleration-based proxy for energy expenditure. We then examine how tag type and placement affect the acceleration signal in birds, using (i) pigeons Columba livia flying in a wind tunnel, with tags mounted simultaneously in two positions, and (ii) back- and tai...
Acknowledgement: The NEAD format includes NetCDF metadata and is proudly inspired by both SMET and NetCDF formats. NEAD is designed as a long-term data preservation and exchange format. The NEAD specifications were presented at the "WMO Data Conference 2020 - Earth System Data Exchange in the 21st Century" (Virtual Conference).
Summary: The Non-Binary Environmental Data Archive (NEAD) format is being developed as a generic and intuitive format that combines the self-documenting features of NetCDF with human readable and writeable features of CSV. It is designed for exchange and preservation of time series data in environmental data repositories.
License: The NEAD specifications are released to the public domain under a Creative Commons 4.0 CC0 "No Rights Reserved" international license. You can reuse the information contained herein in any way you want, for any purposes and without restrictions.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Supplementary Material 1
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
From the foods we eat and the houses we construct, to our religious practices and political organization, to who we can marry and the types of games we teach our children, the diversity of cultural practices in the world is astounding. Yet, our ability to visualize and understand this diversity is limited by the ways it has been documented and shared: on a culture-by-culture basis, in locally-told stories or difficult-to-access repositories. In this paper we introduce D-PLACE, the Database of Places, Language, Culture, and Environment. This expandable and open-access database (accessible at https://d-place.org) brings together a dispersed corpus of information on the geography, language, culture, and environment of over 1400 human societies. We aim to enable researchers to investigate the extent to which patterns in cultural diversity are shaped by different forces, including shared history, demographics, migration/diffusion, cultural innovations, and environmental and ecological conditions. We detail how D-PLACE helps to overcome four common barriers to understanding these forces: (i) location of relevant cultural data, (ii) linking data from distinct sources using diverse ethnonyms, (iii) variable time and place foci for data, and (iv) spatial and historical dependencies among cultural groups that present challenges for analysis. D-PLACE facilitates the visualisation of relationships among cultural groups and between people and their environments, with results downloadable as tables, on a map, or on a linguistic tree. We also describe how D-PLACE can be used for exploratory, predictive, and evolutionary analyses of cultural diversity by a range of users, from members of the worldwide public interested in contrasting their own cultural practices with those of other societies, to researchers using large-scale computational phylogenetic analyses to study cultural evolution.
In summary, we hope that D-PLACE will enable new lines of investigation into the major drivers of cultural change and global patterns of cultural diversity.
This dataset contains supplementary information for a manuscript describing the ESS-DIVE (Environmental Systems Science Data Infrastructure for a Virtual Ecosystem) data repository's community data and metadata reporting formats. The purpose of creating the ESS-DIVE reporting formats was to provide guidelines for formatting some of the diverse data types that can be found in the ESS-DIVE repository. The 6 teams of community partners who developed the reporting formats included scientists and engineers from across the Department of Energy National Lab network. Additionally, during the development process, 247 individuals representing 128 institutions provided input on the formats. The primary files in this dataset are 10 data and metadata crosswalks for ESS-DIVE’s reporting formats (all files ending in _crosswalk.csv). The crosswalks compare elements used in each of the reporting formats to other related standards and data resources (e.g., repositories, datasets, data systems). This dataset also contains additional files recommended by ESS-DIVE’s file-level metadata reporting format. Each data file has an associated dictionary (files ending in _dd.csv) which provides a brief description of each standard or data resource consulted in the data reporting format development process. The flmd.csv file describes each file contained within the dataset.
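The file naming conventions described above (crosswalks ending in _crosswalk.csv, dictionaries ending in _dd.csv, and a single flmd.csv) lend themselves to a simple sorting step when unpacking the package. The following is a minimal Python sketch of that idea; the function name and the example file names are hypothetical and are not taken from the dataset itself.

```python
from collections import defaultdict


def classify_package_files(filenames):
    """Group file names by the role that their suffix implies."""
    roles = defaultdict(list)
    for name in filenames:
        if name == "flmd.csv":
            # File-level metadata: describes every file in the package.
            roles["file_level_metadata"].append(name)
        elif name.endswith("_crosswalk.csv"):
            roles["crosswalk"].append(name)
        elif name.endswith("_dd.csv"):
            # Dictionary that accompanies a data file.
            roles["dictionary"].append(name)
        else:
            roles["other"].append(name)
    return dict(roles)


# Hypothetical file names, used only to illustrate the suffix rules.
example = classify_package_files(
    ["flmd.csv", "sample_crosswalk.csv", "sample_crosswalk_dd.csv", "README.md"]
)
```

Checking for the _dd.csv suffix separately from _crosswalk.csv keeps a dictionary such as sample_crosswalk_dd.csv from being mistaken for a crosswalk.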
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This data package was produced by researchers working on the Shortgrass Steppe Long Term Ecological Research (SGS-LTER) Project, administered at Colorado State University. Long-term datasets and background information (proposals, reports, photographs, etc.) on the SGS-LTER project are contained in a comprehensive project collection within the Digital Collections of Colorado (http://digitool.library.colostate.edu/R/?func=collections&collection_id=3429). The data table and associated metadata document, which is generated in Ecological Metadata Language, may be available through other repositories serving the ecological research community and represent components of the larger SGS-LTER project collection. Most investigators studying grasslands have assumed that the low standing biomass of the SGS created a system with a low probability of carrying fire, and thus a minimal historical role of fire. Nonetheless, there are years with aboveground biomass equivalent to the mixed grass prairie, and a high frequency of lightning storms. Regardless of the historical role of fire in SGS, there are new questions regarding its utility in managing for the presence of the threatened mountain plover, which only nests in areas of low plant biomass. United States Forest Service, Pawnee National Grassland recently initiated a burning program in the mid 1990s to address questions about using fire to increase plover habitat; we have collected data on some of these plots to investigate the influence of fire on SGS vegetation. Several datasets were created between 1999 and 2004 by SGS-LTER researchers, including measurements of shrub and cactus mortality rates, aboveground net primary production, amounts of litter and standing dead, and aboveground nitrogen dynamics in burned and control plots in the western section of the Pawnee National Grassland. Additional information and referenced materials can be found at http://hdl.handle.net/10217/83326.
Resources in this dataset: Resource Title: Website Pointer to html file. File Name: Web Page, url: https://portal.edirepository.org/nis/mapbrowse?scope=knb-lter-sgs&identifier=127 Webpage with information and links to data files for download.
This dataset is designed to accompany the paper submitted to Data Science Journal: O'Brien et al, "Earth Science Data Repositories: Implementing the CARE Principles". This dataset shows examples of activities that data repositories are likely to undertake as they implement the CARE principles. These examples were constructed as part of a discussion about the challenges faced by data repositories when acquiring, curating, and disseminating data and other information about Indigenous Peoples, communities, and lands. For clarity, individual repository activities were very specific. However, in practice, repository activities are not carried out singly, but are more likely to be performed in groups or in sequence. This dataset shows examples of how activities are likely to be combined in response to certain triggers. See related dataset O'Brien, M., R. Duerr, R. Taitingfong, A. Martinez, L. Vera, L. Jennings, R. Downs, E. Antognoli, T. ten Brink, N. Halmai, S.R. Carroll, D. David-Chavez, M. Hudson, and P. Buttigieg. 2024. Alignment between CARE Principles and Data Repository Activities. Environmental Data Initiative. https://doi.org/10.6073/pasta/23e699ad00f74a178031904129e78e93 (Accessed 2024-03-13), and the paper for more information about development of the activities and their categorization, raw data of relationships between specific activities and a discussion of the implementation of CARE Principles by data repositories.
Data in this table are organized into groups delineated by a triggering event in the
first column. For example, the first group consists of 9 rows, while the second group has 7
rows. The first row of each group contains the event that triggers the set of actions
described in the last 4 columns of the spreadsheet. Within each group, the associated rows
in each column are given in numerical not temporal order, since activities will likely vary
widely from repository to repository.
For example, the first group of rows is about what likely needs to happen if a
repository discovers that it holds Indigenous data (O6). Clearly, it will need to develop
processes to identify communities to engage (R6) as well as processes for contacting those
communities (R7) (if it doesn't already have them). It will also probably need to review and
possibly update its data management policies to ensure that they are justifiable (R2). Based
on these actions, it is likely that the repository's outreach group needs to prepare for
working with more communities (O3) including ensuring that the repository's governance
protocols are up-to-date and publicized (O5) and that the repository practices are
transparent (O4). If initial contacts go well, it is likely that the repository will need
ongoing engagement with the community or communities (S1). This may include adding
representation to the repository's advisory board (O2); clarifying data usage with the
communities (O9), facilitating relationships between data providers and communities (O1);
working with the community to identify educational opportunities (O10); and sharing data
with them (O8). It may also become necessary to liaise with whoever is maintaining the
vocabularies in use at the repository (O7).
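The grouped layout described above, in which the first row of each group names the triggering event, can be read back into explicit groups with a short loop. The Python sketch below is a minimal illustration that assumes the first column is left empty on the continuation rows of a group; that assumption, the function name, and the sample cells are hypothetical and should be checked against the actual spreadsheet.

```python
def group_by_trigger(rows):
    """Collect table rows into groups keyed by the triggering event.

    Assumes (hypothetically) that only the first row of each group has a
    non-empty first column and that continuation rows leave it blank.
    """
    groups = []
    for row in rows:
        trigger = row[0].strip()
        if trigger:
            # A non-empty first cell starts a new group.
            groups.append({"trigger": trigger, "rows": [row[1:]]})
        elif groups:
            # A blank first cell continues the current group.
            groups[-1]["rows"].append(row[1:])
        else:
            raise ValueError("the first row must name a triggering event")
    return groups


# Placeholder rows: the activity codes are borrowed from the text, but their
# column positions here are illustrative only.
sample_rows = [
    ["Repository discovers that it holds Indigenous data", "R6"],
    ["", "R7"],
    ["", "R2"],
    ["A second, hypothetical trigger", "O3"],
]
grouped = group_by_trigger(sample_rows)
```

Because rows within a group are in numerical rather than temporal order, the sketch preserves the order in which they appear and does not attempt to infer a sequence.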
This EDI data package contains instructional materials necessary to teach Macrosystems EDDIE Module 7: Using Data to Improve Ecological Forecasts, a ~3-hour educational module for undergraduates. Ecological forecasting is an emerging approach that provides an estimate of the future state of an ecological system with uncertainty, allowing society to prepare for changes in important ecosystem services. To be useful for management, ecological forecasts need to be both accurate enough for managers to be able to rely on them for decision-making and include a representation of forecast uncertainty, so managers can properly interpret the probability of future events. To improve forecast accuracy, forecasts can be updated with observational data once they become available, a process known as data assimilation. Recent improvements in environmental sensor technology and an increase in the number of sensors deployed in ecosystems have increased the availability of data for assimilation to develop and improve forecasts for natural resource management. In this module, students will explore how assimilating data with different amounts of observation uncertainty and at different temporal frequencies affects forecasts of lake water quality, using data from the U.S. National Ecological Observatory Network (NEON). The flexible, three-part (A-B-C) structure of this module makes it adaptable to a range of student levels and course structures. There are two versions of the module: an R Shiny application which does not require students to code, and an RMarkdown version which requires students to read and alter R code to complete module activities. The R Shiny application is published to shinyapps.io and is available at the following link: https://macrosystemseddie.shinyapps.io/module7/.
GitHub repositories are available for both the R Shiny (https://github.com/MacrosystemsEDDIE/module7) and RMarkdown versions (https://github.com/MacrosystemsEDDIE/module7_R) of the module, and both code repositories have been published with DOIs to Zenodo (R Shiny version at DOI 10.5281/zenodo.10903839 and RMarkdown version at DOI 10.5281/zenodo.10909589). Readers are referred to the module landing page for additional information (https://serc.carleton.edu/eddie/teaching_materials/modules/module7.html).
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset provides guidance materials and templates to help you prepare your research datasets for deposit in the U of G Research Data Repositories. Please refer to the U of G Research Data Repositories LibGuide for detailed information about the U of G Research Data Repositories including additional resources for preparing datasets for deposit. The library offers a self-deposit with curation service. The deposit workflow is as follows:
Create your repository account. If you are a first-time depositor, complete the U of G Research Data Repositories New Depositor Intake Form. Activate your Data Repositories account by logging in with your U of G username and password. Once your account is created, contact us to set up your dataset creator access to your home department’s collection in the Data Repositories. Note: If you already have a Data Repositories account and dataset creator access, you can log in and begin a new deposit to your home department’s collection right away.
Prepare your dataset. Assemble your dataset following the Dataset Deposit Guidelines. Use the README file template to capture data documentation.
Create a draft dataset record. Log in to the Data Repositories and create a draft dataset record following the instructions in the Dataset Submission Guide. Submit your draft dataset for review.
Dataset review. Data Repositories staff will review (also referred to as curate) your dataset for alignment with the Dataset Deposit Guidelines using a standard curation workflow. The curator will collaborate with you to enhance the dataset.
Public release. Once ready, the dataset curator will make the dataset publicly available in the Data Repositories, with appropriate file access controls.
Support: If you have any questions about preparing and depositing your dataset, please make a Publishing and Author Support Request.
GAM-NICHE is a new tool developed by AZTI (Valle et al. 2023) to build Species Distribution Models (SDMs) under the ecological niche theory (Citores et al. 2020). It provides a GitHub tutorial in R language with an application to marine fish.
Species Distribution Models (SDMs) are numerical tools that combine observations of species occurrence or abundance at known locations with information on the environmental and/or spatial characteristics of those locations (Elith and Leathwick 2009). SDMs are widely used as a tool for understanding species spatial ecology and are also known as ecological niche models (ENM) or habitat suitability models.
According to ecological niche theory, species response curves are unimodal with respect to environmental gradients (Hutchinson 1957). While a variety of statistical methods have been developed for species distribution modelling, a general problem with most of these habitat modelling approaches is that the estimated response curves can display biologically implausible shapes. This is because the response curves are fitted statistically without any assumption or restriction on their shape, so they do not always respect ecological niche theory. To better understand species response to environmental changes, SDMs should consider theoretical background such as the ecological niche theory and pursue the unimodality of the response curves with respect to environmental gradients.
This book provides a tutorial on how to use Shape-Constrained Generalized Additive Models (SC-GAMs) to build SDMs under the ecological niche theory framework (Citores et al. 2020). SC-GAMs impose monotonicity and concavity constraints in the linear predictor of the GAMs and avoid overfitting. SC-GAM is an effective alternative to fitting nonsymmetric parametric response curves, while retaining the unimodality constraint, required by ecological niche theory, for direct variables and limiting factors.
The book is organised following the key steps in good modelling practice of SDMs (Elith and Leathwick 2009). First, presence data of a selected species are downloaded from GBIF/OBIS global public datasets and pseudo-absence data are created. Then, environmental data are downloaded from public repositories and extracted at each of the presence/pseudo-absence data points. Based on this dataset, an exploratory analysis is conducted to help decide on the best modelling approach. The model is fitted to the dataset and the quality of the fit and the realism of the fitted response function are evaluated. After selecting a threshold to transform the continuous probability predictions into binary responses, the model is validated using a k-fold approach. Finally, the predicted maps are generated for visualization.
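Two of the steps above, converting continuous probability predictions to binary responses with a threshold and splitting the data for k-fold validation, can be illustrated independently of any particular model. The Python sketch below is a generic illustration only: it is not the book's R code, the function names are hypothetical, and the threshold and fold count in the usage lines are arbitrary values chosen for the example rather than values selected by the tutorial.

```python
import random


def to_binary(probabilities, threshold):
    """Map predicted probabilities to 1 (presence) or 0 (absence)."""
    return [1 if p >= threshold else 0 for p in probabilities]


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold validation.

    Indices are shuffled once with a fixed seed, then dealt into k folds so
    that every sample appears in exactly one test fold.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i, test_idx in enumerate(folds):
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train_idx, test_idx


# Arbitrary illustrative values: a 0.5 threshold and 5 folds over 10 samples.
binary = to_binary([0.2, 0.5, 0.9], threshold=0.5)
splits = list(k_fold_indices(10, k=5))
```

In each split the model would be refitted on the training indices and evaluated on the held-out test indices; how the threshold is chosen is a separate modelling decision that this sketch does not address.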
The code is available in AZTI’s GitHub repository and the book is readily available. This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0).
To cite the book, please use:
Valle, M., Citores, L., Ibaibarriaga, L., Chust, C. (2023) GAM-NICHE: Shape-Constrained GAMs to build Species Distribution Models under the ecological niche theory. AZTI. https://doi.org/10.57762/fzpy-6w51
References
Citores, L, L Ibaibarriaga, DJ Lee, MJ Brewer, M Santos, and G Chust. 2020. “Modelling Species Presence–Absence in the Ecological Niche Theory Framework Using Shape-Constrained Generalized Additive Models.” Ecological Modelling 418: 108926. https://doi.org/10.1016/j.ecolmodel.2019.108926.
Attribution 3.0 (CC BY 3.0) https://creativecommons.org/licenses/by/3.0/
License information was derived automatically
The eAtlas is a web delivery platform for environmental research data that focuses on data management and data visualisation. As part of the National Environmental Science Program (NESP) the eAtlas was responsible for coordinating the publication of data generated by all research projects in the NESP Tropical Water Quality (TWQ) hub. The focus of the eAtlas was to:
Actively engage projects on data management issues.
Provide in-depth review of final datasets to ensure quality data publications suitable for future reuse.
Provide permanent hosting and publication of the hub datasets and metadata.
Develop and host visualisations of spatial datasets for users to quickly assess the suitability of the data for their research, and for environmental managers to view without specialist tools.
Provide a web platform for creating project-centric websites that highlight stories based around research project data.
The data management under the NESP TWQ hub was more successful than previous research programs that the eAtlas has been associated with over the last 12 years. A greater percentage of data products from research projects were captured and published to a high standard. As of 7 June 2021, 94 datasets were published from the NESP TWQ hub, which is significantly more than the 49 datasets from the previous National Environmental Research Program Tropical Ecosystem (NERP TE) program in 2011–2014, and 14 datasets from the Marine and Tropical Science Research Facility (MTSRF) in 2008–2010.
As part of the project final reporting, a comparison was made between the size of the NESP TWQ metadata records, as measured by word count of the title, abstract and lineage, and those of similar environmental datasets in other data repositories, including the AODN, CSIRO, NESP MB hub and JCU. The aim of this analysis was to determine which aspects of the data management workflow used on NESP TWQ projects contributed to the level of detail in the metadata records. The spreadsheet associated with the word count analysis is available for download. More detail on the methods is available in the NESP-TWQ 5.15 final report (awaiting publication on the https://nesptropical.edu.au/ website).
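The size measure used in this comparison, the word count of a record's title, abstract and lineage, is simple to reproduce. The Python sketch below is one minimal way to compute it, assuming that each metadata record is available as a dictionary with those three keys and that words are separated by whitespace; the function name, the dictionary layout, and the sample record are hypothetical and are not taken from the eAtlas spreadsheet.

```python
def metadata_word_count(record, fields=("title", "abstract", "lineage")):
    """Sum the whitespace-separated word counts of the selected fields.

    Missing fields count as zero words.
    """
    return sum(len(str(record.get(field, "")).split()) for field in fields)


# Hypothetical record, used only to illustrate the calculation.
sample_record = {
    "title": "Reef water quality survey",
    "abstract": "Monthly samples from ten sites.",
    "lineage": "Collected by boat; analysed in the lab.",
}
total_words = metadata_word_count(sample_record)
```

Splitting on whitespace is a deliberate simplification; a different tokenisation rule (for example, one that treats hyphenated terms differently) would give slightly different counts.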
Here we present documentation of the ESS-DIVE reporting format for leaf-level gas exchange data and metadata. This reporting format provides guidance to data contributors on how to store data to maximize their discoverability, facilitate their efficient reuse, and add value to individual datasets. For data users, the reporting format will better allow data repositories to optimize data search and extraction, and more readily integrate similar data into harmonized synthesis products. The reporting format specifies data table variable naming and unit conventions, as well as metadata characterizing experimental conditions and protocols. For common data types that were the focus of this initial version of the reporting format, i.e., survey measurements, dark respiration, carbon dioxide and light response curves, and parameters derived from those measurements, we took a further step of defining required additional data that would maximize the potential reuse of those data types. To aid data contributors and the development of data ingest tools by data repositories we provided a translation table comparing the outputs of common gas exchange instruments. The reporting format presented here is intended to form a foundation for future development that will incorporate additional data types and variables as gas exchange systems and measurement approaches advance in the future. The reporting format documentation is maintained and updated on the ESS-DIVE Community Space GitHub. This data package is the first published version of this reporting format, and comprises a zip file of the complete content of https://github.com/ess-dive-community/essdive-leaf-gas-exchange v1.0. The zip contains the reporting format description, guidelines, variable tables and instructions in GitHub markdown language (.md) and 2 metadata templates as spreadsheets with drop down options (.xlsx files, also function in GoogleSheets).
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset lists 289 blacklegged tick population datasets from 7 studies that record abundance. These datasets were found by inputting the keywords Ixodes scapularis and tick in data repositories including Long Term Ecological Research data portal, National Ecological Observatory Network data portal, Google Datasets, Data Dryad, and Data One. The types of tick data recorded in these studies include density (for example, number per square meter), proportion of ticks, and count of ticks found on people. The datasets cover locations in New York, New Jersey, Iowa, Massachusetts, and Connecticut, and range from 9 to 24 years in length. The datasets differ in the life stages recorded, geographic scope (county/town/plot), sampling technique (dragging/surveying), and study length. The impact of these study factors on study results is analyzed in our research.
Funding:
RMC is supported by the National Institute of General Medical Sciences of the National Institutes of Health under Award Number R25GM122672. CAB, JP, and KSW are supported by the Office of Advanced Cyberinfrastructure in the National Science Foundation under Award Number #1838807. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health or the National Science Foundation.
Estimated population trends can identify declining species to focus biological conservation, but monitoring may fail to illuminate causes of population change and strategies for reversing declines. Monitoring programs can relate trends with environmental attributes to test causal hypotheses, but typical analytical approaches do not explicitly support causal inference, diluting available data for informing conservation. The U.S. Bureau of Land Management (BLM) extended Integrated Monitoring in Bird Conservation Regions with a quasi-experimental sampling design over a 10-year period (2010–2019) to evaluate impacts of oil and gas development on sagebrush birds within the Atlantic Rim Natural Gas Development Project in southern Wyoming. We analyzed resulting data using a multi-scale community occupancy model to estimate trends in species occupancy and richness relevant to management triggers. Additionally, we employed path analysis to evaluate mechanisms underlying observed trends to inform..., Bird data were collected in conjunction with the Integrated Monitoring and Bird Conservation Regions program, and environmental data were retrieved primarily from online repositories. Detailed methods are described in the manuscript accompanying this repository., Scripts for analyzing data in this repository are archived at https://doi.org/10.5281/zenodo.7566617. Data were analyzed in Program R with modeling implemented using the R package nimble. Most data provided here are stored in an R workspace and thus require Program R to access them.
This paper gives an overview of all currently available data sets for the European Long-term Ecosystem Research (eLTER) monitoring site Gesäuse-Johnsbachtal. The site is part of the LTSER platform Eisenwurzen in the Alps of the province of Styria, Austria. It contains both protected (National Park Gesäuse) and non-protected areas (Johnsbachtal). Although the main research focus of the eLTER monitoring site Gesäuse-Johnsbachtal is on inland surface running waters, forests and other wooded land, the eLTER whole system (WAILS) approach was followed in regard to the data selection, systematically screening all available data in regard to its suitability as eLTER Standard Observations (SOs). Thus, data from all system strata was included, incorporating Geosphere, Atmosphere, Hydrosphere, Biosphere and Sociosphere. In the WAILS approach these SOs are key data for a whole system approach towards long term ecosystem research. Altogether, 54 data sets have been collected for the eLTER monitoring site Gesäuse-Johnsbachtal and included in the Dynamical Ecological Information Management System Site and Data Registry (DEIMS-SDR), which is the eLTER data platform. The presented work provides all these data sets through dedicated data repositories for FAIR use. This paper gives an overview of all compiled data sets and their main properties. Additionally, the available data are evaluated in a concluding gap analysis with regard to the needed observation data according to WAILS, followed by an outlook on how to fill these gaps.