31 datasets found
  1. Data from: Streamflow, Dissolved Organic Carbon, and Nitrate Input Datasets...

    • catalog.data.gov
    • data.usgs.gov
    Updated Nov 26, 2025
    Cite
    U.S. Geological Survey (2025). Streamflow, Dissolved Organic Carbon, and Nitrate Input Datasets and Model Results Using the Weighted Regressions on Time, Discharge, and Season (WRTDS) Model for Buck Creek Watersheds, Adirondack Park, New York, 2001 to 2021 [Dataset]. https://catalog.data.gov/dataset/streamflow-dissolved-organic-carbon-and-nitrate-input-datasets-and-model-results-using-the
    Dataset updated
    Nov 26, 2025
    Dataset provided by
    United States Geological Survey (http://www.usgs.gov/)
    Description

    This data release supports an analysis of changes in dissolved organic carbon (DOC) and nitrate concentrations in the Buck Creek watershed near Inlet, New York, from 2001 to 2021. The Buck Creek watershed is a 310-hectare forested watershed within the Adirondack region that is recovering from acidic deposition. The data release includes pre-processed model inputs and model outputs for the Weighted Regressions on Time, Discharge, and Season (WRTDS) model (Hirsch and others, 2010) used to estimate daily flow-normalized concentrations of DOC and nitrate during a 20-year period of analysis. WRTDS uses daily discharge and concentration observations, implemented through the Exploration and Graphics for River Trends R package (EGRET), to predict solute concentration using decimal time and discharge as explanatory variables (Hirsch and De Cicco, 2015; Hirsch and others, 2010). Discharge and concentration data are available from the U.S. Geological Survey National Water Information System (NWIS) database (U.S. Geological Survey, 2016). The time series data were analyzed for the entire period, water years 2001 (WY2001) to WY2021, where WY2001 is the period from October 1, 2000, to September 30, 2001. This data release contains five comma-separated values (CSV) files, one R script, and one XML metadata file. There are four input files (“Daily.csv”, “INFO.csv”, “Sample_doc.csv”, and “Sample_nitrate.csv”) that contain site information, daily mean discharge, and mean daily DOC or nitrate concentrations. The R script (“Buck Creek WRTDS R script.R”) uses the four input datasets and functions from the EGRET R package to generate estimates of flow-normalized concentrations. The output file (“WRTDS_results.csv”) contains model output at daily time steps for each sub-watershed and for each solute. Files are automatically associated with the R script when opened in RStudio using the provided R project file ("Files.Rproj"). All input, output, and R files are in the "Files.zip" folder.
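
    A minimal sketch of the WRTDS workflow outlined above, using EGRET functions (readUserDaily, readUserSample, readUserInfo, mergeReport, and modelEstimation are EGRET functions; the directory path is an assumption, and the provided “Buck Creek WRTDS R script.R” remains the authoritative version):

    library(EGRET)
    Daily  <- readUserDaily("Files", "Daily.csv")        # daily mean discharge
    Sample <- readUserSample("Files", "Sample_doc.csv")  # mean daily DOC concentrations
    INFO   <- readUserInfo("Files", "INFO.csv")          # site information
    eList  <- mergeReport(INFO, Daily, Sample)           # bundle the inputs into an eList
    eList  <- modelEstimation(eList)                     # fit WRTDS; adds flow-normalized estimates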

  2. Data to Assess Nitrogen Export from Forested Watersheds in and near the Long...

    • gimi9.com
    Updated Mar 4, 2025
    Cite
    (2025). Data to Assess Nitrogen Export from Forested Watersheds in and near the Long Island Sound Basin with Weighted Regressions on Time, Discharge, and Season (WRTDS) | gimi9.com [Dataset]. https://gimi9.com/dataset/data-gov_data-to-assess-nitrogen-export-from-forested-watersheds-in-and-near-the-long-island-sound-/
    Dataset updated
    Mar 4, 2025
    Area covered
    Long Island, Long Island Sound
    Description

    The U.S. Geological Survey, in cooperation with the U.S. Environmental Protection Agency's Long Island Sound Study (https://longislandsoundstudy.net), characterized nitrogen export from forested watersheds and whether nitrogen loading has been increasing or decreasing to help inform Long Island Sound management strategies. The Weighted Regressions on Time, Discharge, and Season (WRTDS; Hirsch and others, 2010) method was used to estimate annual concentrations and fluxes of nitrogen species using long-term records (14 to 37 years in length) of stream total nitrogen, dissolved organic nitrogen, nitrate, and ammonium concentrations and daily discharge data from 17 watersheds located in the Long Island Sound basin or in nearby areas of Massachusetts, New Hampshire, or New York. This data release contains the input water-quality and discharge data, annual outputs (including concentrations, fluxes, yields, and confidence intervals about these estimates), statistical tests for trends between the periods of water years 1999-2000 and 2016-2018, and model diagnostic statistics. These datasets are organized into one zip file (WRTDSeLists.zip) and six comma-separated values (csv) data files (StationInformation.csv, AnnualResults.csv, TrendResults.csv, ModelStatistics.csv, InputWaterQuality.csv, and InputStreamflow.csv). The csv file (StationInformation.csv) contains information about the stations and input datasets. Finally, a short R script (SampleScript.R) is included to facilitate viewing the input and output data and to re-run the model. Reference: Hirsch, R.M., Moyer, D.L., and Archfield, S.A., 2010, Weighted Regressions on Time, Discharge, and Season (WRTDS), with an application to Chesapeake Bay River inputs: Journal of the American Water Resources Association, v. 46, no. 5, p. 857–880.
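
    As a quick, hedged sketch of viewing the annual outputs without re-running the model (SampleScript.R in the data release is the authoritative version; the column and constituent labels below are assumptions about AnnualResults.csv):

    ann <- read.csv("AnnualResults.csv")
    tn  <- subset(ann, Constituent == "Total nitrogen")   # assumed constituent label
    plot(tn$Year, tn$FNFlux, type = "l",                  # assumed column names
         xlab = "Water year", ylab = "Flow-normalized flux")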

  3. Petre_Slide_CategoricalScatterplotFigShare.pptx

    • figshare.com
    pptx
    Updated Sep 19, 2016
    Cite
    Benj Petre; Aurore Coince; Sophien Kamoun (2016). Petre_Slide_CategoricalScatterplotFigShare.pptx [Dataset]. http://doi.org/10.6084/m9.figshare.3840102.v1
    Available download formats: pptx
    Dataset updated
    Sep 19, 2016
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Benj Petre; Aurore Coince; Sophien Kamoun
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Categorical scatterplots with R for biologists: a step-by-step guide

    Benjamin Petre¹, Aurore Coince², Sophien Kamoun¹

    ¹ The Sainsbury Laboratory, Norwich, UK; ² Earlham Institute, Norwich, UK

    Weissgerber and colleagues (2015) recently stated that ‘as scientists, we urgently need to change our practices for presenting continuous data in small sample size studies’. They called for more scatterplot and boxplot representations in scientific papers, which ‘allow readers to critically evaluate continuous data’ (Weissgerber et al., 2015). In the Kamoun Lab at The Sainsbury Laboratory, we recently implemented a protocol to generate categorical scatterplots (Petre et al., 2016; Dagdas et al., 2016). Here we describe the three steps of this protocol: 1) formatting of the data set in a .csv file, 2) execution of the R script to generate the graph, and 3) export of the graph as a .pdf file.

    Protocol

    • Step 1: format the data set as a .csv file. Store the data in a three-column Excel file as shown in the PowerPoint slide. The first column ‘Replicate’ indicates the biological replicates. In the example, the month and year during which the replicate was performed is indicated. The second column ‘Condition’ indicates the conditions of the experiment (in the example, a wild type and two mutants called A and B). The third column ‘Value’ contains continuous values. Save the Excel file as a .csv file (File -> Save as -> in ‘File Format’, select .csv). This .csv file is the input file to import into R.

    • Step 2: execute the R script (see Notes 1 and 2). Copy the script shown in the PowerPoint slide and paste it into the R console. Execute the script. In the dialog box, select the input .csv file from step 1. The categorical scatterplot will appear in a separate window. Dots represent the values for each sample; colors indicate replicates. Boxplots are superimposed; black dots indicate outliers.

    • Step 3: save the graph as a .pdf file. Resize the window at your convenience and save the graph as a .pdf file (File -> Save as). See the PowerPoint slide for an example.

    Notes

    • Note 1: install the ggplot2 package. The R script requires the package ‘ggplot2’ to be installed. To install it, Packages & Data -> Package Installer -> enter ‘ggplot2’ in the Package Search space and click on ‘Get List’. Select ‘ggplot2’ in the Package column and click on ‘Install Selected’. Install all dependencies as well.

    • Note 2: use a log scale for the y-axis. To use a log scale for the y-axis of the graph, use the command line below in place of command line #7 in the script.

    # 7 Display the graph in a separate window; dot colors indicate replicates
    graph + geom_boxplot(outlier.colour='black', colour='black') + geom_jitter(aes(col=Replicate)) + scale_y_log10() + theme_bw()
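
    The full script is shown only in the PowerPoint slide, so the following is a minimal sketch of a script matching the protocol above, assuming the three-column layout from Step 1 (Replicate, Condition, Value); only the boxplot/jitter line is taken from Note 2.

    library(ggplot2)
    # Step 1 output: pick the three-column .csv file via a dialog box
    data <- read.csv(file.choose())
    # Categorical scatterplot: boxplots with jittered dots colored by replicate
    graph <- ggplot(data, aes(x = Condition, y = Value))
    graph + geom_boxplot(outlier.colour = 'black', colour = 'black') +
      geom_jitter(aes(col = Replicate)) +
      theme_bw()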

    References

    Dagdas YF, Belhaj K, Maqbool A, Chaparro-Garcia A, Pandey P, Petre B, et al. (2016) An effector of the Irish potato famine pathogen antagonizes a host autophagy cargo receptor. eLife 5:e10856.

    Petre B, Saunders DGO, Sklenar J, Lorrain C, Krasileva KV, Win J, et al. (2016) Heterologous Expression Screens in Nicotiana benthamiana Identify a Candidate Effector of the Wheat Yellow Rust Pathogen that Associates with Processing Bodies. PLoS ONE 11(2):e0149035

    Weissgerber TL, Milic NM, Winham SJ, Garovic VD (2015) Beyond Bar and Line Graphs: Time for a New Data Presentation Paradigm. PLoS Biol 13(4):e1002128

    https://cran.r-project.org/

    http://ggplot2.org/

  4. Dataset of the paper: "How do Hugging Face Models Document Datasets, Bias,...

    • zenodo.org
    • data.niaid.nih.gov
    • +1more
    zip
    Updated Jan 16, 2024
    Cite
    Federica Pepe; Vittoria Nardone; Antonio Mastropaolo; Gerardo Canfora; Gabriele Bavota; Massimiliano Di Penta (2024). Dataset of the paper: "How do Hugging Face Models Document Datasets, Bias, and Licenses? An Empirical Study" [Dataset]. http://doi.org/10.5281/zenodo.10058142
    Available download formats: zip
    Dataset updated
    Jan 16, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Federica Pepe; Vittoria Nardone; Antonio Mastropaolo; Gerardo Canfora; Gabriele Bavota; Massimiliano Di Penta
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This replication package contains datasets and scripts related to the paper: "*How do Hugging Face Models Document Datasets, Bias, and Licenses? An Empirical Study*"

    ## Root directory

    - `statistics.r`: R script used to compute the correlation between usage and downloads, and the RQ1/RQ2 inter-rater agreements

    - `modelsInfo.zip`: zip file containing all the downloaded model cards (in JSON format)

    - `script`: directory containing all the scripts used to collect and process data. For further details, see README file inside the script directory.

    ## Dataset

    - `Dataset/Dataset_HF-models-list.csv`: list of HF models analyzed

    - `Dataset/Dataset_github-prj-list.txt`: list of GitHub projects using the *transformers* library

    - `Dataset/Dataset_github-Prj_model-Used.csv`: contains usage pairs: project, model

    - `Dataset/Dataset_prj-num-models-reused.csv`: number of models used by each GitHub project

    - `Dataset/Dataset_model-download_num-prj_correlation.csv`: contains, for each model used by GitHub projects, the name, the task, the number of reusing projects, and the number of downloads
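
    The correlation between usage and downloads computed by `statistics.r` can be sketched in R as follows (a minimal sketch only; the column names are assumptions, not taken from the replication package):

    d <- read.csv("Dataset/Dataset_model-download_num-prj_correlation.csv")
    # Spearman rank correlation between the number of reusing projects and downloads
    cor.test(d$num_projects, d$num_downloads, method = "spearman")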

    ## RQ1

    - `RQ1/RQ1_dataset-list.txt`: list of HF datasets

    - `RQ1/RQ1_datasetSample.csv`: sample set of models used for the manual analysis of datasets

    - `RQ1/RQ1_analyzeDatasetTags.py`: Python script to analyze model tags for the presence of datasets. It requires unzipping `modelsInfo.zip` into a directory with the same name (`modelsInfo`) at the root of the replication package folder. It writes its output to stdout; redirect it to a file to be analyzed by the `RQ1/RQ1_countDataset.py` script

    - `RQ1/RQ1_countDataset.py`: given the output of `RQ1/RQ1_analyzeDatasetTags.py` (passed as an argument), produces, for each model, a list of Booleans indicating whether (i) the model only declares HF datasets, (ii) the model only declares external datasets, (iii) the model declares both, and (iv) the model is part of the sample for the manual analysis

    - `RQ1/RQ1_datasetTags.csv`: output of `RQ1/RQ1_analyzeDatasetTags.py`

    - `RQ1/RQ1_dataset_usage_count.csv`: output of `RQ1/RQ1_countDataset.py`

    ## RQ2

    - `RQ2/tableBias.pdf`: table detailing the number of occurrences of different types of bias by model Task

    - `RQ2/RQ2_bias_classification_sheet.csv`: results of the manual labeling

    - `RQ2/RQ2_isBiased.csv`: file to compute the inter-rater agreement of whether or not a model documents Bias

    - `RQ2/RQ2_biasAgrLabels.csv`: file to compute the inter-rater agreement related to bias categories

    - `RQ2/RQ2_final_bias_categories_with_levels.csv`: for each model in the sample, this file lists (i) the bias leaf category, (ii) the first-level category, and (iii) the intermediate category

    ## RQ3

    - `RQ3/RQ3_LicenseValidation.csv`: manual validation of a sample of licenses

    - `RQ3/RQ3_{NETWORK-RESTRICTIVE|RESTRICTIVE|WEAK-RESTRICTIVE|PERMISSIVE}-license-list.txt`: lists of licenses with different permissiveness

    - `RQ3/RQ3_prjs_license.csv`: for each project linked to models, among other fields it indicates the license tag and name

    - `RQ3/RQ3_models_license.csv`: for each model, indicates among other pieces of info, whether the model has a license, and if yes what kind of license

    - `RQ3/RQ3_model-prj-license_contingency_table.csv`: usage contingency table between projects' licenses (columns) and models' licenses (rows)

    - `RQ3/RQ3_models_prjs_licenses_with_type.csv`: pairs project-model, with their respective licenses and permissiveness level

    ## scripts

    Contains the scripts used to mine Hugging Face and GitHub. Details are in the enclosed README

  5. Data and Code for "A Ray-Based Input Distance Function to Model Zero-Valued...

    • data.niaid.nih.gov
    Updated Jun 17, 2023
    Cite
    Price, Juan José; Henningsen, Arne (2023). Data and Code for "A Ray-Based Input Distance Function to Model Zero-Valued Output Quantities: Derivation and an Empirical Application" [Dataset]. https://data.niaid.nih.gov/resources?id=ZENODO_7882078
    Dataset updated
    Jun 17, 2023
    Dataset provided by
    University of Copenhagen
    Universidad Adolfo Ibáñez
    Authors
    Price, Juan José; Henningsen, Arne
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This data and code archive provides all the data and code for replicating the empirical analysis that is presented in the journal article "A Ray-Based Input Distance Function to Model Zero-Valued Output Quantities: Derivation and an Empirical Application" authored by Juan José Price and Arne Henningsen and published in the Journal of Productivity Analysis (DOI: 10.1007/s11123-023-00684-1).

    We conducted the empirical analysis with the "R" statistical software (version 4.3.0) using the add-on packages "combinat" (version 0.0.8), "miscTools" (version 0.6.28), "quadprog" (version 1.5.8), "sfaR" (version 1.0.0), "stargazer" (version 5.2.3), and "xtable" (version 1.8.4), all available on CRAN. We created the R package "micEconDistRay", which provides the functions for empirical analyses with ray-based input distance functions that we developed for the above-mentioned paper. This R package is also available on CRAN (https://cran.r-project.org/package=micEconDistRay).

    This replication package contains the following files and folders:

    README This file

    MuseumsDk.csv The original data obtained from the Danish Ministry of Culture and from Statistics Denmark. It includes the following variables:

    museum: Name of the museum.

    type: Type of museum (Kulturhistorisk museum = cultural history museum; Kunstmuseer = arts museum; Naturhistorisk museum = natural history museum; Blandet museum = mixed museum).

    munic: Municipality, in which the museum is located.

    yr: Year of the observation.

    units: Number of visit sites.

    resp: Whether or not the museum has special responsibilities (0 = no special responsibilities; 1 = at least one special responsibility).

    vis: Number of (physical) visitors.

    aarc: Number of articles published (archeology).

    ach: Number of articles published (cultural history).

    aah: Number of articles published (art history).

    anh: Number of articles published (natural history).

    exh: Number of temporary exhibitions.

    edu: Number of primary school classes on educational visits to the museum.

    ev: Number of events other than exhibitions.

    ftesc: Scientific labor (full-time equivalents).

    ftensc: Non-scientific labor (full-time equivalents).

    expProperty: Running and maintenance costs [1,000 DKK].

    expCons: Conservation expenditure [1,000 DKK].

    ipc: Consumer Price Index in Denmark (the value for year 2014 is set to 1).

    prepare_data.R This R script imports the data set MuseumsDk.csv, prepares it for the empirical analysis (e.g., removing unsuitable observations, preparing variables), and saves the resulting data set as DataPrepared.csv.
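
    A minimal sketch of the kind of preparation prepare_data.R performs (the deflation step and the filtering rule are assumptions for illustration; variable names follow the list above):

    dat <- read.csv("MuseumsDk.csv")
    # Deflate nominal expenditures to 2014 DKK using the consumer price index
    dat$expProperty <- dat$expProperty / dat$ipc
    dat$expCons     <- dat$expCons / dat$ipc
    # Remove unsuitable observations (assumed rule: drop incomplete rows)
    dat <- dat[complete.cases(dat), ]
    write.csv(dat, "DataPrepared.csv", row.names = FALSE)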

    DataPrepared.csv This data set is prepared and saved by the R script prepare_data.R. It is used for the empirical analysis.

    make_table_descriptive.R This R script imports the data set DataPrepared.csv and creates the LaTeX table /tables/table_descriptive.tex, which provides summary statistics of the variables that are used in the empirical analysis.

    IO_Ray.R This R script imports the data set DataPrepared.csv, estimates a ray-based Translog input distance function with the 'optimal' ordering of outputs, imposes monotonicity on this distance function, creates the LaTeX table /tables/idfRes.tex that presents the estimated parameters of this function, and creates several figures in the folder /figures/ that illustrate the results.

    IO_Ray_ordering_outputs.R This R script imports the data set DataPrepared.csv, estimates ray-based Translog input distance functions, imposes monotonicity for each of the 720 possible orderings of the outputs, and saves all the estimation results as (a huge) R object allOrderings.rds.

    allOrderings.rds (not included in the ZIP file, uploaded separately) This is a saved R object created by the R script IO_Ray_ordering_outputs.R that contains the estimated ray-based Translog input distance functions (with and without monotonicity imposed) for each of the 720 possible orderings.

    IO_Ray_model_averaging.R This R script loads the R object allOrderings.rds that contains the estimated ray-based Translog input distance functions for each of the 720 possible orderings, does model averaging, and creates several figures in the folder /figures/ that illustrate the results.

    /tables/ This folder contains the two LaTeX tables table_descriptive.tex and idfRes.tex (created by R scripts make_table_descriptive.R and IO_Ray.R, respectively) that provide summary statistics of the data set and the estimated parameters (without and with monotonicity imposed) for the 'optimal' ordering of outputs.

    /figures/ This folder contains 48 figures (created by the R scripts IO_Ray.R and IO_Ray_model_averaging.R) that illustrate the results obtained with the 'optimal' ordering of outputs and the model-averaged results and that compare these two sets of results.

  6. Food and Agriculture Biomass Input–Output (FABIO) database

    • data.europa.eu
    • data.niaid.nih.gov
    • +1more
    unknown
    Updated Jun 7, 2022
    Cite
    Zenodo (2022). Food and Agriculture Biomass Input–Output (FABIO) database [Dataset]. https://data.europa.eu/data/datasets/oai-zenodo-org-2577067?locale=es
    Available download formats: unknown (4578)
    Dataset updated
    Jun 7, 2022
    Dataset authored and provided by
    Zenodo (http://zenodo.org/)
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    This data repository provides the Food and Agriculture Biomass Input–Output (FABIO) database, a global set of multi-regional physical supply-use and input-output tables covering global agriculture and forestry. The work is based on mostly freely available data from FAOSTAT, IEA, EIA, and UN Comtrade/BACI. FABIO currently covers 191 countries + RoW, 118 processes and 125 commodities (raw and processed agricultural and food products) for 1986-2013. All R codes and auxiliary data are available on GitHub. For more information please refer to https://fabio.fineprint.global.

    The database consists of the following main components, in compressed .rds format:

    Z: the inter-commodity input-output matrix, displaying the relationships of intermediate use of each commodity in the production of each commodity, in physical units (tons). The matrix has 24000 rows and columns (125 commodities x 192 regions) and is available in two versions, based on the method used to allocate inputs to outputs in production processes: Z_mass (mass allocation) and Z_value (value allocation). Note that the row sums of the Z matrix (= total intermediate use by commodity) are identical in both versions.

    Y: the final demand matrix, denoting the consumption of all 24000 commodities by destination country and final use category. There are six final use categories (yielding 192 x 6 = 1152 columns): 1) food use, 2) other use (non-food), 3) losses, 4) stock addition, 5) balancing, and 6) unspecified.

    X: the total output vector of all 24000 commodities. Total output is equal to the sum of intermediate and final use by commodity.

    L: the Leontief inverse, computed as (I - A)^-1, where A is the matrix of input coefficients derived from Z and X. Again, there are two versions, depending on the underlying version of Z (L_mass and L_value).

    E: environmental extensions for each of the 24000 commodities, including four resource categories: 1) primary biomass extraction (in tons), 2) land use (in hectares), 3) blue water use (in m3), and 4) green water use (in m3).

    mr_sup_mass/mr_sup_value: for each allocation method (mass/value), the supply table gives the physical supply quantity of each commodity by producing process, with processes in the rows (118 processes x 192 regions = 22656 rows) and commodities in the columns (24000 columns).

    mr_use: the use tables capture the quantities of each commodity (rows) used as an input in each process (columns).

    A description of the included countries and commodities (i.e. the rows and columns of the Z matrix) can be found in the auxiliary file io_codes.csv. Separate lists of the country sample (including ISO3 codes and continental grouping) and commodities (including moisture content) are given in the files regions.csv and items.csv, respectively. For information on the individual processes, see the auxiliary file su_codes.csv. RDS files can be opened in R; information on how to read them can be obtained here: https://www.rdocumentation.org/packages/base/versions/3.6.2/topics/readRDS. Except for X.rds, which contains a matrix, all variables are organized as lists, where each element contains a sparse matrix. Please note that values are always given in physical units, i.e. tonnes or head, as specified in items.csv. The suffixes value and mass only indicate the form of allocation chosen for the construction of the symmetric IO tables (for more details see Bruckner et al. 2019). Product, process and country classifications can be found in the file fabio_classifications.xlsx.

    Footprint results are not contained in the database but can be calculated, e.g. by using this script: https://github.com/martinbruckner/fabio_comparison/blob/master/R/fabio_footprints.R

    How to cite: To cite FABIO work please refer to this paper: Bruckner, M., Wood, R., Moran, D., Kuschnig, N., Wieland, H., Maus, V., Börner, J. 2019. FABIO – The Construction of the Food and Agriculture Input–Output Model. Environmental Science & Technology 53(19), 11302–11312. DOI: 10.1021/acs.est.9b03554

    License: This data repository is distributed under the CC BY-NC-SA 4.0 License. You are free to share and adapt the material for non-commercial purposes using proper citation. If you remix, transform, or build upon the material, you must distribute your contributions under the same license as the original. In case you are interested in a collaboration, I am happy to receive enquiries at martin.bruckner@wu.ac.at.

    Known issues: The underlying FAO data have been manipulated to the minimum extent necessary. Data filling and supply-use balancing, however, required some adaptations. These are documented in the code and are also reflected in the balancing item of the final demand matrices. For proper use of the database, I recommend distributing the balancing item over all other uses proportionally and doing analyses with and without balancing to illustrate uncertainties.
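
    Following the readRDS pointer above, loading and inspecting FABIO components might look like this (a sketch; the file names are assumptions based on the component names, and the Matrix package is used because the list elements are sparse matrices):

    library(Matrix)                 # sparse-matrix classes used by the list elements
    Z <- readRDS("Z_mass.rds")      # list of inter-commodity matrices (mass allocation)
    Y <- readRDS("Y.rds")           # list of final demand matrices
    X <- readRDS("X.rds")           # matrix of total output
    str(Z[[1]])                     # inspect one 24000 x 24000 sparse matrix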

  7. 2007-08 V3 CEAMARC-CASO Bathymetry Plots Over Time During Events | gimi9.com...

    • gimi9.com
    Updated Apr 20, 2008
    Cite
    (2008). 2007-08 V3 CEAMARC-CASO Bathymetry Plots Over Time During Events | gimi9.com [Dataset]. https://gimi9.com/dataset/au_2007-08-v3-ceamarc-caso-bathymetry-plots-over-time-during-events1/
    Dataset updated
    Apr 20, 2008
    Description

    A routine was developed in R ('bathy_plots.R') to plot bathymetry data over time during individual CEAMARC events, so that benthic data can be analysed in relation to habitat, i.e., did we trawl over a slope or was the sea floor relatively flat? Note that the depth range in the plots is autoscaled to the data, so a small range of depths appears as a scattering of points; as long as you check the depth scale, interpretation will be fine. The R script needs a file of bathymetry data, '200708V3_one_minute.csv', a data export from the underway PostgreSQL ship database, and 'events.csv', a stripped-down version of the events export from the shipboard events database. If you wish to run the code again, you may need to change the pathnames in the R script to the relevant locations. If you have opened the csv files in Excel at any stage and the R script gets an error, you may need to format the date/time columns as yyyy-mm-dd hh:mm:ss, save and close the file as csv without opening it again, and then run the R script. However, all output files are here for every CEAMARC event. Filenames contain a reference to the CEAMARC event id. Files are in EPS format and can be viewed using Ghostview, which is available as a free download on the internet.
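
    As a hedged illustration of what 'bathy_plots.R' does (the actual script ships with the dataset; the column names and the per-event time window fields below are assumptions):

    bathy  <- read.csv("200708V3_one_minute.csv")
    events <- read.csv("events.csv")
    bathy$date_time <- as.POSIXct(bathy$date_time, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")
    for (i in seq_len(nrow(events))) {
      ev  <- events[i, ]
      sel <- bathy$date_time >= as.POSIXct(ev$start_time, tz = "UTC") &
             bathy$date_time <= as.POSIXct(ev$end_time, tz = "UTC")
      postscript(sprintf("bathy_event_%s.eps", ev$event_id))   # EPS output, one file per event
      plot(bathy$date_time[sel], bathy$depth[sel], type = "l",
           xlab = "Time", ylab = "Depth (m)", main = sprintf("CEAMARC event %s", ev$event_id))
      dev.off()
    }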

  8. Supplementary data for "Characterizing Intraspecific Resource Utilization in...

    • zenodo.org
    zip
    Updated Feb 21, 2025
    Cite
    Claus-Peter Stelzer (2025). Supplementary data for "Characterizing Intraspecific Resource Utilization in an Aquatic Consumer Using High-Throughput Phenotyping" [Dataset]. http://doi.org/10.5281/zenodo.14900039
    Available download formats: zip
    Dataset updated
    Feb 21, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Claus-Peter Stelzer
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This repository contains the raw data for the study:

    Characterizing Intraspecific Resource Utilization in an Aquatic Consumer Using High-Throughput Phenotyping

    Data are provided separately for the first experiment (numerical response experiment with 16 rotifer clones across six food concentrations) and the second experiment (growth rate measurements with 98 rotifer clones across two food concentrations).

    Contents of first_experiment.zip

    input/ This folder contains raw count data (output of the Wellcounter software):
    popgrowth_

    output/ output files produced by the R-script 'first_experiment_analysis.Rmd'

    wellcounter/ contains the Wellcounter software (programs and configuration files) that were used for running the raw analysis of this dataset on a High Performance Computing cluster

    first_experiment_analysis.Rmd R-Markdown file with data processing and statistical analysis of the first experiment
    numerical_response_2par.R A function required by 'first_experiment_analysis.Rmd'
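
    The two-parameter numerical response itself is defined in 'numerical_response_2par.R'; a common two-parameter form, given here purely as an assumption about what that function computes, is a Monod-type response fit by nonlinear least squares:

    # Assumed Monod-type numerical response: growth rate rises with food
    # concentration toward r_max, with half-saturation constant K
    numerical_response_2par <- function(food, r_max, K) r_max * food / (K + food)

    # Example fit, assuming a data frame d with columns 'food' and 'growth_rate':
    # fit <- nls(growth_rate ~ numerical_response_2par(food, r_max, K),
    #            data = d, start = list(r_max = 1, K = 0.5))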


    Contents of second_experiment.zip

    input/ This folder contains raw count and behavioral data (output of the Wellcounter software):
    popgrowth_

    output/ output files produced by the R-script 'second_experiment_analysis.Rmd'

    wellcounter/ contains the Wellcounter software (programs and configuration files) that were used for running the raw analysis (image and motion analysis) of this dataset on a High Performance Computing cluster

    second_experiment_prep_run1.Rmd R-Markdown file for preprocessing the data from run1
    second_experiment_prep_run2.Rmd R-Markdown file for preprocessing the data from run2
    second_experiment_analysis.Rmd R-Markdown file with data processing and statistical analysis of the second experiment
    extract_fixed_effects_table.R A function required by 'second_experiment_analysis.Rmd'

  9. Data from: Generational differences in the low tones of Black Lahu

    • zenodo.org
    bin, csv
    Updated Jul 19, 2024
    Cite
    Cathryn Yang; James Stanford; Chunxia Luo; Naluo Zhang (2024). Generational differences in the low tones of Black Lahu [Dataset]. http://doi.org/10.5281/zenodo.4008213
    Available download formats: bin, csv
    Dataset updated
    Jul 19, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Cathryn Yang; James Stanford; Chunxia Luo; Naluo Zhang
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    We investigate apparent-time tone variation in the Black Lahu language (Loloish/Ngwi, Tibeto-Burman) of Yunnan, China. These are the supplementary materials for the paper "Generational differences in the low tones of Black Lahu," accepted for publication in Linguistics Vanguard.

    Appendices:

    • Appendix A: Wordlist (organized by order of appearance in the story)
    • Appendix B: Wordless picture book
    • Appendix C: Cross-tabulation tables
    • Appendix D: F0 trajectory plots by speaker
    • Appendix E: LME model results for T45

    Script files contained in the analysis:

    • F0_estimation.praat is the Praat script used for F0 estimation on the wav + TextGrid pairs
    • Combine_Speaker_Files.R is used to combine the Praat script output for individual speakers into a single csv file (a minimal sketch follows this list).
    • Data_processing.R is the main processing script; it works on an Excel csv file that contains the output of the Combine_Speaker_Files.R script
    • Plotting_Fig1_Lahu_tones.R plots the F0 trajectories of all the tones of Lahu, averaged across all speakers; works on an Excel csv file that contains the output of the F0_estimation.praat script.
    • Plotting_Fig2_age_groups.R plots the F0 trajectories of T2 and T7, in 15-year age groups; works on an Excel csv file that contains the output of the Data_processing.R script.
    • Plotting_Fig3and6_carryover.R plots the F0 trajectories of T2, T7 and T4 when they occur after silence or Tones 1-7, in three age groups; works on an Excel csv file that contains the output of the Data_processing.R script.
    • Plotting_Fig4578_scatterplot.R plots the scatterplots of F0 onset versus Age; works on an Excel csv file that contains the output of the Data_processing.R script.
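
    A minimal sketch of what Combine_Speaker_Files.R does, as referenced above (the directory name and file layout are assumptions):

    # Stack the per-speaker Praat output files into a single csv
    files <- list.files("praat_output", pattern = "\\.csv$", full.names = TRUE)
    all_speakers <- do.call(rbind, lapply(files, read.csv))
    write.csv(all_speakers, "praat_raw_data.csv", row.names = FALSE)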

    Data files contained in this analysis:

    • praat_raw_data.csv is the output of the F0_estimation.praat script. Contains the raw data from all speakers.
    • Speaker_info.csv is the demographic data for each speaker
    • data_forplotting.csv is the output of the Data_processing.R script. Contains the filtered tokens, normalized for length and converted to speaker-specific semitones. Used to plot Figure 2, Figure 3 and Figure 6
    • data_forRbrul.csv is the output of the Data_processing.R script, the F0 onset and F0 offset for Tones 2, 4, and 7. Used to do linear mixed effects modeling in the Rbrul interface (Johnson 2009).

  10. Dataset title: Datasets, scripts and main output files for "Phylogeny,...

    • databank.illinois.edu
    Updated Sep 17, 2024
    Cite
    Yanghui Cao; Christopher H. Dietrich; Dmitry A. Dmitriev; Joel H. Kits; Qingquan Xue; Yalin Zhang (2024). Dataset title: Datasets, scripts and main output files for "Phylogeny, Biogeography and Morphological Evolution of the Treehopper-Like Leafhoppers (Hemiptera: Cicadellidae) Megophthalminae and Ulopinae" [Dataset]. http://doi.org/10.13012/B2IDB-1475719_V3
    Dataset updated
    Sep 17, 2024
    Authors
    Yanghui Cao; Christopher H. Dietrich; Dmitry A. Dmitriev; Joel H. Kits; Qingquan Xue; Yalin Zhang
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Dataset funded by
    National Natural Science Foundation of China
    Agriculture and Agri-Food Canada
    U.S. National Science Foundation (NSF)
    Description

    The following seven zip files are compressed folders containing the input datasets/trees, the main output files, and the scripts of the related analyses performed in this study.

    I. ancestral_microhabitat_reconstruction.zip: contains four files, including two input files (microhabitats.csv, timetree.tre) and a script (simmap_microhabitat.R) for ancestral state reconstruction of microhabitat by make.simmap implemented in the R package phytools v1.5, as well as the main output file (ancestral_microhabitats.csv).
    1. ancestral_microhabitats.csv: reconstructed ancestral microhabitats for each node.
    2. microhabitats.csv: microhabitats of the studied species.
    3. simmap_microhabitat.R: the R script of make.simmap for ancestral microhabitat reconstruction.
    4. timetree.tre: dated tree used for ancestral state reconstruction of microhabitat and morphological characters.

    II. ancestral_morphology_reconstruction.zip: contains six files, including an input file (morphology.csv) and a script (simmap_morphology.R) for ancestral state reconstruction of morphology by make.simmap implemented in the R package phytools v1.5, as well as four main output files (forewing_ancestral_state.csv, frontal_sutures_ancestral_state.csv, hind_wing_ancestral_state.csv, ocellus_ancestral_state.csv).
    1. forewing_ancestral_state.csv: reconstructed ancestral states of the development of the forewing for each node.
    2. frontal_sutures_ancestral_state.csv: reconstructed ancestral states of the development of the frontal sutures for each node.
    3. hind_wing_ancestral_state.csv: reconstructed ancestral states of the development of the hind wing for each node.
    4. morphology.csv: the states of the development of the ocellus, forewing, hind wing and frontal sutures for each studied species.
    5. ocellus_ancestral_state.csv: reconstructed ancestral states of the development of the ocellus for each node.
    6. simmap_morphology.R: the R script of make.simmap for ancestral state reconstruction of morphology.

    III. biogeographic_reconstruction.zip: contains four files, including three input files (dispersal_probablity.txt, distributions.csv, timetree_noOutgroup.tre) used for a stratified biogeographic analysis by BioGeoBEARS in RASP v4.2 and the main output file (DIVELIKE_result.txt).
    1. dispersal_probablity.txt: relative dispersal probabilities among biogeographical regions at different geological epochs.
    2. distributions.csv: current distributions of the studied species.
    3. DIVELIKE_result.txt: BioGeoBEARS result of ancestral areas based on the DIVELIKE model.
    4. timetree_noOutgroup.tre: the dated tree with the outgroup lineage (Eurymelinae) excluded.

    IV. coalescent_analysis.zip: contains a folder and two files, including a folder (individual_gene_alignment) of input files used to construct gene trees, an input file (MLtree_BS70.tre) used for the multi-species coalescent analysis by ASTRAL v4.10.5, and the main output file (coalescent_species_tree.tre).
    1. coalescent_species_tree.tre: the species tree generated by the multi-species coalescent analysis, with the quartet support, effective number of genes and local posterior probability indicated.
    2. individual_gene_alignment: a folder containing 427 FASTA files, each representing the nucleotide alignment for a gene. Hyphens are used to represent gaps. These files were used to construct gene trees using IQ-TREE v1.6.12.
    3. MLtree_BS70.tre: 165 gene trees with average SH-aLRT and ultrafast bootstrap values of ≥ 70%. This file was used to estimate the species tree by ASTRAL v4.10.5.

    V. divergence_time_estimation.zip: contains five files, including two input files (treefile_rooted_noBranchLength.tre, treefile_rooted.tre) and two control files (baseml.ctl, mcmctree.ctl) used for divergence time estimation by BASEML and MCMCTREE in PAML v4.9, as well as the main output file (timetree_with95%HPD.tre).
    1. baseml.ctl: the control file used for the estimation of substitution rates by BASEML in PAML v4.9.
    2. mcmctree.ctl: the control file used for the estimation of divergence times by MCMCTREE in PAML v4.9.
    3. timetree_with95%HPD.tre: dated tree with the 95% highest posterior density confidence intervals indicated.
    4. treefile_rooted_noBranchLength.tre: the maximum likelihood tree based on the concatenated nucleotide dataset with calibrations for the crown and internal nodes. Branch lengths and support values are not indicated.
    5. treefile_rooted.tre: the maximum likelihood tree based on the concatenated nucleotide dataset with a secondary calibration on the root age. Branch support values are not indicated.

    VI. maximum_likelihood_analysis_aa.zip: contains three files, including two input files (concatenated_aa_partition.nex, concatenated_aa.phy) used for the maximum likelihood analysis by IQ-TREE v1.6.12 and the main output file (MLtree_aa.tre).
    1. concatenated_aa_partition.nex: the partitioning scheme for the maximum likelihood analysis using concatenated_aa.phy. This file partitions the 52,024 amino acid positions into 427 character sets.
    2. concatenated_aa.phy: a concatenated amino acid dataset with 52,024 amino acid positions. Hyphens are used to represent gaps. This dataset was used for the maximum likelihood analysis.
    3. MLtree_aa.tre: the maximum likelihood tree based on the concatenated amino acid dataset, with SH-aLRT values and ultrafast bootstrap values indicated.

    VII. maximum_likelihood_analysis_nt.zip: contains three files, including two input files (concatenated_nt_partition.nex, concatenated_nt.phy) used for the maximum likelihood analysis by IQ-TREE v1.6.12 and the main output file (MLtree_nt.tre).
    1. concatenated_nt_partition.nex: the partitioning scheme for the maximum likelihood analysis using concatenated_nt.phy. This file partitions the 156,072 nucleotide positions into 427 character sets.
    2. concatenated_nt.phy: a concatenated nucleotide dataset with 156,072 nucleotide positions. Hyphens are used to represent gaps. This dataset was used for the maximum likelihood analysis as well as divergence time estimation.
    3. MLtree_nt.tre: the maximum likelihood tree based on the concatenated nucleotide dataset, with SH-aLRT values and ultrafast bootstrap values indicated.

    VIII. Taxon_sampling.csv: contains the sample IDs (1st column), which were used in the alignments, and the taxonomic information (2nd to 6th columns).
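
    A minimal sketch of the ancestral-state step in simmap_microhabitat.R, using phytools::make.simmap as named above (the model and nsim settings and the csv layout are assumptions):

    library(phytools)                        # v1.5 per the description; loads ape
    tree   <- ape::read.tree("timetree.tre")
    states <- read.csv("microhabitats.csv", row.names = 1)
    x      <- setNames(states[[1]], rownames(states))  # named vector: tip -> microhabitat
    fit    <- make.simmap(tree, x, model = "ER", nsim = 100)
    summary(fit)                             # posterior probabilities of node states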

  11. Data and code for: Impacts of changing snowfall on seasonal complementarity...

    • data.niaid.nih.gov
    Updated Apr 27, 2022
    Cite
    Marshall, Adrienne (2022). Data and code for: Impacts of changing snowfall on seasonal complementarity of hydroelectric and solar power [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_5806522
    Dataset updated
    Apr 27, 2022
    Dataset provided by
    Colorado School of Mines
    Authors
    Marshall, Adrienne
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Data and code to reproduce analyses in manuscript entitled: Influence of changing snowfall on seasonal complementarity of hydroelectric and solar power. Submitted to Environmental Research: Infrastructure and Sustainability.

    The contents include the following scripts and files, listed below. Scripts are listed in the order needed to reproduce the analysis, though intermediate data products have been saved so it is not necessary to reproduce the initial analytical steps.

    R/

    eia923_860.R: extracts solar and hydropower production data; requires local download of EIA data.

    gridMET_swep.R: downloads and summarises gridmet data; does not require prior local download.

    fdr.R: function to calculate the p-value associated with a given false discovery rate as described in the associated manuscript; one common construction is sketched after the data/ file list.

    combine_data.R: combines solar, hydropower, and SWE/P data

    analysis.Rmd: primary script in which analyses are conducted

    data/

    annual_swep.csv: output from gridMET_swep.R with annual SWE/P for each watershed in the study

    monthly_hydro.csv: output from eia923_860.R

    monthly_solar.csv: output from eia923_860.R

    combined_variables.csv: combines variables above in one CSV

    watersheds_wbd_ss: shapefiles for watersheds that drain to each dam used in the study, derived as described in the manuscript.
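
    One common construction for the fdr.R step described above, sketched under the assumption that it implements a Benjamini-Hochberg-style threshold (the actual function in the repository may differ):

    # Largest p-value declared significant at false discovery rate q
    fdr_threshold <- function(p, q = 0.05) {
      m  <- length(p)
      ps <- sort(p)
      below <- which(ps <= q * seq_len(m) / m)   # BH step-up criterion
      if (length(below) == 0) return(0)          # nothing passes at this FDR
      ps[max(below)]
    }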

  12. QA/QC-ed Groundwater Level Time Series in PLM-1 and PLM-6 Monitoring Wells,...

    • dataone.org
    • knb.ecoinformatics.org
    • +1more
    Updated Feb 8, 2024
    Cite
    Boris Faybishenko; Roelof Versteeg; Kenneth Williams; Rosemary Carroll; Wenming Dong; Tetsu Tokunaga; Dylan O'Ryan (2024). QA/QC-ed Groundwater Level Time Series in PLM-1 and PLM-6 Monitoring Wells, East River, Colorado (2016-2022) [Dataset]. http://doi.org/10.15485/1866836
    Dataset updated
    Feb 8, 2024
    Dataset provided by
    ESS-DIVE
    Authors
    Boris Faybishenko; Roelof Versteeg; Kenneth Williams; Rosemary Carroll; Wenming Dong; Tetsu Tokunaga; Dylan O'Ryan
    Time period covered
    Nov 30, 2016 - Oct 13, 2022
    Area covered
    Description

    This data set contains QA/QC-ed (Quality Assurance and Quality Control) water level data for the PLM1 and PLM6 wells. PLM1 and PLM6 are location identifiers used by the Watershed Function SFA project for two groundwater monitoring wells along an elevation gradient located in the lower montane life zone of a hillslope near the Pumphouse location at the East River Watershed, Colorado, USA. These wells are used to monitor subsurface water and carbon inventories and fluxes, and to determine the seasonally dependent flow of groundwater under the PLM hillslope. The downslope flow of groundwater, in combination with data on groundwater chemistry (see related references), can be used to estimate rates of solute export from the hillslope to the floodplain and river.

    QA/QC analysis of measured groundwater levels in monitoring wells PLM-1 and PLM-6 included identification and flagging of duplicated timestamps, gap filling of missing timestamps and water levels, and removal of abnormal/bad values and outliers in the measured water levels. The QA/QC analysis also tested the application of different QA/QC methods and the development of regular (5-minute, 1-hour, and 1-day) time series datasets, which can serve as a benchmark for testing other QA/QC techniques and will be applicable for ecohydrological modeling.

    The package includes a Readme file, one R code file used to perform the QA/QC, a series of 8 data csv files (six QA/QC-ed regular time series datasets of varying intervals (5-min, 1-hr, 1-day) and two files with QA/QC flagging of original data), and three files for the reporting format adoption of this dataset (InstallationMethods, file-level metadata (flmd), and data dictionary (dd) files). QA/QC-ed data herein were derived from the original/raw data publication available at Williams et al., 2020 (DOI: 10.15485/1818367). For more information about running the R code file (10.15485_1866836_QAQC_PLM1_PLM6.R) to reproduce the QA/QC output files, see the README (QAQC_PLM_readme.docx). This dataset replaces the previously published raw data time series and is the final groundwater data product for the PLM wells in the East River. Complete metadata information on the PLM1 and PLM6 wells is available in a related dataset on ESS-DIVE: Varadharajan C, et al. (2022), https://doi.org/10.15485/1660962. These data products are part of the Watershed Function Scientific Focus Area collection effort to further scientific understanding of biogeochemical dynamics from genome to watershed scales.

    2022/09/09 Update: Converted data files using ESS-DIVE's Hydrological Monitoring Reporting Format. With the adoption of this reporting format, three new files (v1_20220909_flmd.csv, v1_20220909_dd.csv, and InstallationMethods.csv) were added. The file-level metadata file (v1_20220909_flmd.csv) contains information specific to the files contained within the dataset. The data dictionary file (v1_20220909_dd.csv) contains definitions of column headers and other terms across the dataset. The installation methods file (InstallationMethods.csv) contains a description of methods associated with installation and deployment at the PLM1 and PLM6 wells. Additionally, eight data files were re-formatted to follow the reporting format guidance (er_plm1_waterlevel_2016-2020.csv, er_plm1_waterlevel_1-hour_2016-2020.csv, er_plm1_waterlevel_daily_2016-2020.csv, QA_PLM1_Flagging.csv, er_plm6_waterlevel_2016-2020.csv, er_plm6_waterlevel_1-hour_2016-2020.csv, er_plm6_waterlevel_daily_2016-2020.csv, QA_PLM6_Flagging.csv). The major changes to the data files include the addition of header_rows above the data containing metadata about the particular well, units, and sensor description.

    2023/01/18 Update: Dataset updated to include additional QA/QC-ed water level data up until 2022-10-12 for ER-PLM1 and 2022-10-13 for ER-PLM6. Reporting-format-specific files (v2_20230118_flmd.csv, v2_20230118_dd.csv, v2_20230118_InstallationMethods.csv) were updated to reflect the additional data. The R code file (QAQC_PLM1_PLM6.R) was added to replace the previously uploaded HTML files and enable execution of the associated code. The R code file (QAQC_PLM1_PLM6.R) and ReadMe file (QAQC_PLM_readme.docx) were revised to clarify where the original data were retrieved from and to remove local file paths.
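
    A hedged sketch of typical steps behind the QA/QC described above (duplicate-timestamp removal and regularization to a 5-minute series; the column names and the number of header rows are assumptions):

    wl <- read.csv("er_plm1_waterlevel_2016-2020.csv", skip = 8)  # skip header_rows (count assumed)
    wl$DateTime <- as.POSIXct(wl$DateTime, tz = "UTC")
    wl <- wl[!duplicated(wl$DateTime), ]                          # drop duplicated timestamps
    grid <- data.frame(DateTime = seq(min(wl$DateTime), max(wl$DateTime), by = "5 min"))
    reg  <- merge(grid, wl, by = "DateTime", all.x = TRUE)        # regular series; gaps become NA
    reg$WaterLevel <- approx(as.numeric(wl$DateTime), wl$WaterLevel,
                             xout = as.numeric(reg$DateTime))$y   # simple linear gap fill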

  13. Global Landslide Catalog Export - Dataset - NASA Open Data Portal

    • data.nasa.gov
    Updated Mar 26, 2016
    Cite
    nasa.gov (2016). Global Landslide Catalog Export - Dataset - NASA Open Data Portal [Dataset]. https://data.nasa.gov/dataset/global-landslide-catalog-export
    Dataset updated
    Mar 26, 2016
    Dataset provided by
    NASA (http://nasa.gov/)
    Description

    The Global Landslide Catalog (GLC) was developed with the goal of identifying rainfall-triggered landslide events around the world, regardless of size, impacts or location. The GLC considers all types of mass movements triggered by rainfall, which have been reported in the media, disaster databases, scientific reports, or other sources. The GLC has been compiled since 2007 at NASA Goddard Space Flight Center. This is a unique data set with the ID tag “GLC” in the landslide editor. This dataset on data.nasa.gov was a one-time export from the Global Landslide Catalog maintained separately. It is current as of March 7, 2016. The original catalog is available here: http://www.arcgis.com/home/webmap/viewer.html?url=https%3A%2F%2Fmaps.nccs.nasa.gov%2Fserver%2Frest%2Fservices%2Fglobal_landslide_catalog%2Fglc_viewer_service%2FFeatureServer&source=sd To export GLC data, you must agree to the “Terms and Conditions”. We request that anyone using the GLC cite the two sources of this database: Kirschbaum, D. B., Adler, R., Hong, Y., Hill, S., & Lerner-Lam, A. (2010). A global landslide catalog for hazard applications: method, results, and limitations. Natural Hazards, 52(3), 561–575. doi:10.1007/s11069-009-9401-4. [1] Kirschbaum, D.B., T. Stanley, Y. Zhou (In press, 2015). Spatial and Temporal Analysis of a Global Landslide Catalog. Geomorphology. doi:10.1016/j.geomorph.2015.03.016. [2]

  14. ESG rating of general stock indices

    • narcis.nl
    • data.mendeley.com
    Updated Oct 22, 2021
    Cite
    Erhart, S (via Mendeley Data) (2021). ESG rating of general stock indices [Dataset]. http://doi.org/10.17632/58mwkj5pf8.1
    Dataset updated
    Oct 22, 2021
    Dataset provided by
    Data Archiving and Networked Services (DANS)
    Authors
    Erhart, S (via Mendeley Data)
    Description
    The files have been created by Szilárd Erhart for the research: Erhart (2021), "ESG Ratings of General Stock Exchange Indices", International Review of Financial Analysis. Users of the files agree to quote the above paper.

    The Python script (PYTHONESG_ERHART.TXT) helps users get tickers by stock exchange and extract ESG scores for the underlying stocks from Yahoo Finance. The R script (ESG_UA.TXT) helps to replicate the Monte Carlo experiment detailed in the study. The EXPORT_ALL csv contains the downloaded ESG data (scores, controversies, etc.) organized by stocks and exchanges.

    Disclaimer: The author takes no responsibility for the timeliness, accuracy, completeness or quality of the information provided. The author is in no event liable for damages of any kind incurred or suffered as a result of the use or non-use of the information presented or the use of defective or incomplete information. The contents are subject to confirmation and not binding. The author expressly reserves the right to alter, amend or discontinue publication, in whole and in part, without prior notice, for a period of time or even completely.

    Read me, before using the Monte Carlo simulations script: (1) copy the goascores.csv and goalscores_alt.csv files onto your own computer drive (the two files are identical); (2) set the exact file location information in the 'Read in data' section of the Monte Carlo script and for the output files at the end of the script; (3) load the miscTools and matrixStats packages in your R application; (4) run the code.

  15. Data from: Data, R Code, and Output Supporting "An Historical Overview and...

    • agdatacommons.nal.usda.gov
    bin
    Updated Nov 22, 2025
    Cite
    John R. Fieberg; L. David Mech; Shannon Barber-Meyer (2025). Data, R Code, and Output Supporting "An Historical Overview and Update of Wolf-Moose Interactions in Northeastern Minnesota" [Dataset]. http://doi.org/10.13020/D6096S
    Available download formats: bin
    Dataset updated
    Nov 22, 2025
    Dataset provided by
    University of Minnesota
    Authors
    John R. Fieberg; L. David Mech; Shannon Barber-Meyer
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Area covered
    Minnesota
    Description

    These files contain data and R code (along with associated output from running the code) supporting all results reported in Mech, L. D., J. Fieberg, and S. Barber-Meyer (2018), "An historical overview and update of wolf-moose interactions in Northeastern Minnesota," Wildlife Society Bulletin. In this paper, we explored relationships between wolf numbers, monitored in part of the Minnesota moose range, and moose calf:population ratios and estimated log annual growth rates of moose in Northeastern Minnesota. The html files contain R code and output for analyzing MooseWolfDataUpdated.csv. The R files used to generate the html documents can be found in the zip file. See readme.txt for more details. Resources in this dataset: Link to DRUM catalog record (web page): https://conservancy.umn.edu/handle/11299/190423
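
    A hedged sketch of the kind of relationship the paper explores, regressing the estimated log annual growth rate of moose on wolf numbers (the column names in MooseWolfDataUpdated.csv are assumptions; the html files contain the authoritative code):

    d <- read.csv("MooseWolfDataUpdated.csv")
    d$log_growth <- c(diff(log(d$moose)), NA)   # r_t = log(N_{t+1} / N_t)
    summary(lm(log_growth ~ wolves, data = d))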

  16. Input data, model output, and R scripts for a machine learning streamflow...

    • data.usgs.gov
    • datasets.ai
    • +1more
    Updated Nov 19, 2021
    Cite
    Ryan McShane; Cheryl Miller (2021). Input data, model output, and R scripts for a machine learning streamflow model on the Wyoming Range, Wyoming, 2012–17 [Dataset]. http://doi.org/10.5066/P9XCP1AE
    Dataset updated
    Nov 19, 2021
    Dataset provided by
    United States Geological Survey (http://www.usgs.gov/)
    Authors
    Ryan McShane; Cheryl Miller
    License

    U.S. Government Works: https://www.usa.gov/government-works
    License information was derived automatically

    Time period covered
    Jan 1, 2012 - Dec 31, 2017
    Area covered
    Wyoming Range, Wyoming
    Description

    A machine learning streamflow (MLFLOW) model was developed in R (the model is in the Rscripts folder) for modeling monthly streamflow from 2012 to 2017 in three watersheds on the Wyoming Range in the upper Green River basin. Geospatial information for 125 site features (vector data are in the Sites.shp file), discrete streamflow observation data, and environmental predictor data were used in fitting the MLFLOW model and predicting with the fitted model. Tabular calibration and validation data are in the Model_Fitting_Site_Data.csv file, totaling 971 discrete observations and predictions of monthly streamflow. Geospatial information for 17,518 stream grid cells (raster data are in the Streams.tif file) and environmental predictor data were used for continuous streamflow predictions with the MLFLOW model. Tabular prediction data for all the study area (17,518 stream grid cells) and study period (72 months; 2012–17) are in the Model_Prediction_Stream_Data.csv file, totaling 1,261,296 p ...
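
    A hedged sketch of fitting a machine-learning model to the calibration data (the MLFLOW code in the Rscripts folder is the authoritative version; the random-forest choice and the response column name are assumptions):

    library(randomForest)
    fit_dat <- read.csv("Model_Fitting_Site_Data.csv")
    # Fit monthly streamflow against all available environmental predictors
    mlflow <- randomForest(streamflow ~ ., data = na.omit(fit_dat), ntree = 500)
    print(mlflow)   # out-of-bag error gives a quick validation check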

  17. Data supporting the Master thesis "Monitoring von Open Data Praktiken -...

    • zenodo.org
    • data.niaid.nih.gov
    • +1more
    zip
    Updated Nov 21, 2024
    Cite
    Katharina Zinke; Katharina Zinke (2024). Data supporting the Master thesis "Monitoring von Open Data Praktiken - Herausforderungen beim Auffinden von Datenpublikationen am Beispiel der Publikationen von Forschenden der TU Dresden" [Dataset]. http://doi.org/10.5281/zenodo.14196539
    Explore at:
    zip (available download formats)
    Dataset updated
    Nov 21, 2024
    Dataset provided by
    Zenodo: http://zenodo.org/
    Authors
    Katharina Zinke; Katharina Zinke
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Dresden
    Description

    Data supporting the Master thesis "Monitoring von Open Data Praktiken - Herausforderungen beim Auffinden von Datenpublikationen am Beispiel der Publikationen von Forschenden der TU Dresden" (Monitoring open data practices - challenges in finding data publications using the example of publications by researchers at TU Dresden) - Katharina Zinke, Institut für Bibliotheks- und Informationswissenschaften, Humboldt-Universität Berlin, 2023

    This ZIP file contains the data the thesis is based on, interim exports of the results, and the R script with all pre-processing, data merging, and analyses carried out. The documentation of the additional, explorative analysis is also included. The actual PDFs and text files of the scientific papers used are not included, as they are published open access.

    The folder structure is shown below with the file names and a brief description of the contents of each file. For details concerning the analysis approach, please refer to the master's thesis (publication forthcoming).

    ## Data sources

    Folder 01_SourceData/

    - PLOS-Dataset_v2_Mar23.csv (PLOS-OSI dataset)

    - ScopusSearch_ExportResults.csv (export of Scopus search results from Scopus)

    - ScopusSearch_ExportResults.ris (export of Scopus search results from Scopus)

    - Zotero_Export_ScopusSearch.csv (export of the file names and DOIs of the Scopus search results from Zotero)

    ## Automatic classification

    Folder 02_AutomaticClassification/

    - (NOT INCLUDED) PDFs folder (Folder for PDFs of all publications identified by the Scopus search, named AuthorLastName_Year_PublicationTitle_Title)

    - (NOT INCLUDED) PDFs_to_text folder (Folder for all texts extracted from the PDFs by ODDPub, named AuthorLastName_Year_PublicationTitle_Title)

    - PLOS_ScopusSearch_matched.csv (merge of the Scopus search results with the PLOS_OSI dataset for the files contained in both)

    - oddpub_results_wDOIs.csv (results file of the ODDPub classification)

    - PLOS_ODDPub.csv (merge of the results file of the ODDPub classification with the PLOS-OSI dataset for the publications contained in both)

    ## Manual coding

    Folder 03_ManualCheck/

    - CodeSheet_ManualCheck.txt (Code sheet with descriptions of the variables for manual coding)

    - ManualCheck_2023-06-08.csv (Manual coding results file)

    - PLOS_ODDPub_Manual.csv (Merge of the results file of the ODDPub and PLOS-OSI classification with the results file of the manual coding)

    ## Explorative analysis for the discoverability of open data

    Folder 04_FurtherAnalyses/

    Proof_of_of_Concept_Open_Data_Monitoring.pdf (Description of the explorative analysis of the discoverability of open data publications using the example of a researcher) - in German

    ## R-Script

    Analyses_MA_OpenDataMonitoring.R (R-Script for preparing, merging and analyzing the data and for performing the ODDPub algorithm)
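    For orientation, a minimal sketch of the ODDPub classification step the script performs might look like the following, assuming the oddpub R package's documented interface (pdf_convert, pdf_load, open_data_search) and the folder layout above; merging DOIs into the results to produce oddpub_results_wDOIs.csv happens separately in the script.

    ```r
    # install.packages("remotes"); remotes::install_github("quest-bih/oddpub")
    library(oddpub)

    # Convert the publication PDFs to plain text, then screen the extracted
    # sentences for open data / open code statements.
    pdf_convert("02_AutomaticClassification/PDFs/",
                "02_AutomaticClassification/PDFs_to_text/")
    sentences <- pdf_load("02_AutomaticClassification/PDFs_to_text/")
    results   <- open_data_search(sentences)

    # DOIs are merged in afterwards to yield oddpub_results_wDOIs.csv.
    write.csv(results, "oddpub_results.csv", row.names = FALSE)
    ```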

  18. d

    Integrated Hourly Meteorological Database of 20 Meteorological Stations...

    • search.dataone.org
    • osti.gov
    Updated Jan 17, 2025
    Cite
    Boris Faybishenko; Dylan O'Ryan (2025). Integrated Hourly Meteorological Database of 20 Meteorological Stations (1981-2022) for Watershed Function SFA Hydrological Modeling [Dataset]. http://doi.org/10.15485/2502101
    Explore at:
    Dataset updated
    Jan 17, 2025
    Dataset provided by
    ESS-DIVE
    Authors
    Boris Faybishenko; Dylan O'Ryan
    Time period covered
    Jan 1, 1981 - Dec 31, 2022
    Area covered
    Description

    This dataset contains (a) a script “R_met_integrated_for_modeling.R”, and (b) associated input CSV files: 3 CSV files per location to create a 5-variable integrated meteorological dataset file (air temperature, precipitation, wind speed, relative humidity, and solar radiation) for 19 meteorological stations and 1 location within Trail Creek from the modeling team within the East River Community Observatory as part of the Watershed Function Scientific Focus Area (SFA). Because meteorological forcings vary across the watershed, a high-frequency database was needed to ensure consistency in the data analysis and modeling. We evaluated several data sources, including gridded meteorological products and field data from meteorological stations, and determined that the modeling efforts required multiple data sources to meet all of their needs. As output, this dataset contains (c) a single CSV data file (*_1981-2022.csv) for each location (20 CSV output files total) containing hourly time series data for 1981 to 2022 and (d) five PNG files of time series and density plots for each variable per location (100 PNG files). Detailed location metadata for each point location included within this dataset is contained within the Integrated_Met_Database_Locations.csv file, obtained from Varadharajan et al., 2023 doi:10.15485/1660962. This dataset also includes (e) a file-level metadata (flmd.csv) file that lists each file contained in the dataset with associated metadata and (f) a data dictionary (dd.csv) file that contains column/row headers used throughout the files along with a definition, units, and data type. Review the (g) ReadMe_Integrated_Met_Database.pdf file for additional details on the script, methods, and structure of the dataset. The script integrates Northwest Alliance for Computational Science and Engineering’s PRISM gridded data product, National Oceanic and Atmospheric Administration’s NCEP-NCAR Reanalysis 1 gridded data product (through the RNCEP R package, Kemp et al., doi:10.32614/CRAN.package.RNCEP), and analytical-based calculations. Further, the script downscales the input data to hourly frequency, which is necessary for the modeling efforts.
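    As a hedged sketch of the reanalysis step, the code below pulls one 6-hourly NCEP-NCAR Reanalysis 1 variable with RNCEP::NCEP.gather and linearly interpolates it to hourly. The coordinates, years, variable choice, and interpolation scheme are illustrative assumptions, not necessarily what the dataset's script does.

    ```r
    library(RNCEP)

    # 6-hourly surface air temperature for a single small lat/lon window
    # (window and years are illustrative; longitudes are in degrees east).
    tair <- NCEP.gather(variable = "air.sig995", level = "surface",
                        months.minmax = c(1, 12), years.minmax = c(1981, 1982),
                        lat.southnorth = c(38.8, 39.0),
                        lon.westeast = c(253.0, 253.2))

    # Average over the window, then interpolate the 6-hourly series to hourly;
    # the dataset's script may apply a different downscaling scheme.
    v      <- apply(tair, 3, mean)        # dim 3 of the returned array is time
    hours6 <- (seq_along(v) - 1) * 6
    hourly <- approx(x = hours6, y = v, xout = 0:max(hours6))$y
    ```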

  19. d

    Minority-group incubators and majority-group reservoirs for the diffusion of...

    • search.dataone.org
    • datadryad.org
    Updated Jul 12, 2025
    Cite
    Matthew Turner (2025). Minority-group incubators and majority-group reservoirs for the diffusion of climate change adaptations [Dataset]. http://doi.org/10.5061/dryad.2bvq83bwm
    Explore at:
    Dataset updated
    Jul 12, 2025
    Dataset provided by
    Dryad Digital Repository
    Authors
    Matthew Turner
    Time period covered
    Jan 1, 2023
    Description

    These data are part of a data portal that accompanies the special issue 'Climate change adaptation needs a science of culture,' published in Philosophical Transactions of the Royal Society B in 2023. To access the data portal, please visit 10.5061/dryad.bnzs7h4h4. This repository contains code and supporting documentation for the agent-based model analyzed in our paper, "Minority-group incubators and majority-group reservoirs for climate change adaptation". We performed and analyzed a suite of agent-based models that simulated the spread of an adaptation, i.e., a beneficial behavior, in a population with a minority and a majority group, defined by group size and tendency to interact with others from one's own group versus another group (homophily). We ran 1000 trials per parameter setting, where parameters were systematically varied to test different homophily levels in each group, and the effect of whether the minority group, majority group, or both groups start with one member knowing t... The data were generated through agent-based modeling of adaptation diffusion in a simulated population and are broken out into several CSV files from simulations that were each run on one cluster node. There are two main archives of CSV files: (1) main_parts.zip contains 30 CSV output files used in the main text analysis; and (2) supp.zip contains 270 CSV output files used in the supplemental analyses. The R analysis code (scripts/plot.R) contains utilities for combining and processing this raw output. See the Analysis subsection in the README_Dryad.md file for more information on the data combination and processing steps. All required software is free and open-source. The simulations were run using the Agents.jl library in the Julia programming language. Model output analysis was performed with the ggplot2 library in the R programming language.
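    A minimal sketch of the combine-and-plot step is below; the authoritative utilities live in scripts/plot.R, and the column names `homophily`, `group`, and `prevalence` here are hypothetical placeholders for illustration.

    ```r
    library(ggplot2)

    # Stack the per-node CSV outputs (e.g., main_parts.zip unzipped to
    # main_parts/); column names below are assumptions.
    files <- list.files("main_parts", pattern = "\\.csv$", full.names = TRUE)
    runs  <- do.call(rbind, lapply(files, read.csv))

    # Mean adaptation prevalence by homophily level and group.
    agg <- aggregate(prevalence ~ homophily + group, data = runs, FUN = mean)

    ggplot(agg, aes(homophily, prevalence, colour = group)) +
      geom_line() +
      geom_point() +
      theme_minimal()
    ```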

  20. S1 Supporting information -

    • plos.figshare.com
    zip
    Updated Oct 28, 2024
    Cite
    Jens Winther Johannsen; Julian Laabs; Magdalena M. E. Bunbury; Morten Fischer Mortensen (2024). S1 Supporting information - [Dataset]. http://doi.org/10.1371/journal.pone.0301938.s001
    Explore at:
    zip (available download formats)
    Dataset updated
    Oct 28, 2024
    Dataset provided by
    PLOS: http://plos.org/
    Authors
    Jens Winther Johannsen; Julian Laabs; Magdalena M. E. Bunbury; Morten Fischer Mortensen
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    S1 File. SI_C01_SPD_KDE_models. R-script for analysing radiocarbon dates. The code computes over-regional and regional SPD and KDE models and exports them to CSV files (Rmd).

    S2 File. SI_C02_aoristic_dating. R-script for exporting aoristic time series, derived from typochronologically dated archaeological material, as CSV files (Rmd).

    S3 File. SI_C03_vegetation_openness_score_example. R-script computing a vegetation openness score from pollen records and exporting the generated time series as a CSV file (Rmd).

    S4 File. SI_C04_data_preparation. Jupyter Notebook importing and transforming the relevant data to visualize the plots exhibited in the paper (ipynb).

    S5 File. SI_C05_figures_extra. Jupyter Notebook visualizing the plots exhibited in the paper (ipynb).

    S1 Data. SI_D01_reg_data_no_dups. Spreadsheet holding radiocarbon dates, with laboratory identification, site name, geographical coordinates, site type, material, source, and regional affiliation (csv).

    S2 Data. SI_D02_reg_axe_dagger_graves. Spreadsheet holding entries of axes and daggers, with context, site, parish, artefact identification, type, subtype, absolute dating, typochronological dating, references, geographical coordinates, and regional affiliation (csv).

    S3 Data. SI_D03_pollen_example. Spreadsheet holding sample entries of the pollen records from Krageholm (Neotoma Site ID 3204) and Bjäresjöholmsjön (Neotoma Site ID 3017) for an example run of S3 File. The records can be accessed via the Neotoma Explorer (https://apps.neotomadb.org/explorer/) with the given IDs. Each entry holds the record type, regional affiliation, absolute BP and BCE dating, and the counts of the given plant taxa (csv).

    S4 Data. SI_D04_PAP_303600_TOC_LOI. Table holding sample entries of TOC content, LOI, and SST reconstruction of sediment core PAP_303600, used to correlate population development with Baltic Sea surface temperature. Available via 10.1594/PANGAEA.883292 (tab).

    S5 Data. SI_D05_vos_[…]. Spreadsheets holding the vegetation openness score time series of lake Belau, Vinge, Northern Jutland, and Zealand (csv).

    (ZIP)
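    For readers unfamiliar with SPD/KDE modelling of radiocarbon dates, a minimal sketch using the rcarbon package is below. The S-file scripts are the authoritative implementation; the column names and the calendar time window here are assumptions.

    ```r
    library(rcarbon)

    # Calibrate the dates in SI_D01 (column names are assumptions) and build
    # a summed probability distribution (SPD) over an illustrative window.
    dat <- read.csv("SI_D01_reg_data_no_dups.csv")
    cal <- calibrate(x = dat$c14age, errors = dat$c14error,
                     calCurves = "intcal20")
    spd_all <- spd(cal, timeRange = c(6000, 3000))   # cal BP, illustrative
    plot(spd_all)

    # Composite kernel density estimate (KDE) from sampled calendar dates.
    sims <- sampleDates(cal, nsim = 100)
    kde  <- ckde(sims, timeRange = c(6000, 3000), bw = 50)
    plot(kde)
    ```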
