Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
R code used for each data set to perform negative binomial regression, calculate the overdispersion statistic, generate summary statistics, and remove outliers.
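A minimal R sketch of this kind of analysis (not the released code), assuming a data frame df with a count response y and a single predictor x (names illustrative):

library(MASS)

fit <- glm.nb(y ~ x, data = df)   # negative binomial regression
summary(fit)                      # summary statistics for the fitted model

# Overdispersion statistic: Pearson chi-square divided by residual degrees of freedom
sum(residuals(fit, type = "pearson")^2) / df.residual(fit)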
This dataset tracks the updates made to the dataset "MeSH 2023 Update - Delete Report" and serves as a repository for previous versions of the data and metadata.
This data release contains lake and reservoir water surface temperature summary statistics calculated from Landsat 8 Analysis Ready Dataset (ARD) images available within the Conterminous United States (CONUS) from 2013-2023. All zip files within this data release contain nested directories using .parquet files to store the data. The file example_script_for_using_parquet.R contains example code for using the R arrow package (Richardson and others, 2024) to open and query the nested .parquet files.
Limitations of this dataset include:
- All biases inherent to the Landsat Surface Temperature product are retained in this dataset, which can produce unrealistically high or low estimates of water temperature. This is observed to happen, for example, in cases with partial cloud coverage over a waterbody.
- Some waterbodies are split between multiple Landsat Analysis Ready Data tiles or orbit footprints. In these cases, multiple waterbody-wide statistics may be reported, one for each data tile. The deepest point values are extracted and reported for the tile covering the deepest point. A total of 947 waterbodies are split between multiple tiles (see the multiple_tiles = "yes" column of site_id_tile_hv_crosswalk.csv).
- Temperature data were not extracted from satellite images with more than 90% cloud cover.
- Temperature data represent skin temperature at the water surface and may differ from temperature observations from below the water surface.
Potential methods for addressing limitations of this dataset (see the R sketch after the file listing below):
- Identifying and removing unrealistic temperature estimates:
  - Calculate the total percentage of cloud pixels over a given waterbody as percent_cloud_pixels = wb_dswe9_pixels/(wb_dswe9_pixels + wb_dswe1_pixels), and filter percent_cloud_pixels by a desired percentage of cloud coverage.
  - Remove lakes with a limited number of water pixel values available (wb_dswe1_pixels < 10).
  - Filter waterbodies where the deepest point is identified as water (dp_dswe = 1).
- Handling waterbodies split between multiple tiles:
  - These waterbodies can be identified using the site_id_tile_hv_crosswalk.csv file (column multiple_tiles = "yes"). A user could combine sections of the same waterbody by spatially weighting the values using the number of water pixels available within each section (wb_dswe1_pixels). This should be done with caution, as some sections of the waterbody may have data available on different dates.
All zip files within this data release contain nested directories using .parquet files to store the data. The file example_script_for_using_parquet.R contains example code for using the R arrow package to open and query the nested .parquet files.
- "year_byscene=XXXX.zip" – includes temperature summary statistics for individual waterbodies and the deepest points (the furthest point from land within a waterbody) within each waterbody by scene_date (when the satellite passed over). Individual waterbodies are identified by the National Hydrography Dataset (NHD) permanent_identifier included within the site_id column. Some of the .parquet files within the _byscene datasets may only include one dummy row of data (identified by tile_hv="000-000"). This happens when no tabular data are extracted from the raster images because of clouds obscuring the image, a tile that covers mostly ocean with a very small amount of land, or other possible reasons.
An example file path for this dataset follows: year_byscene=2023/tile_hv=002-001/part-0.parquet
- "year=XXXX.zip" – includes the summary statistics for individual waterbodies and the deepest points within each waterbody by year (dataset=annual), month (year=0, dataset=monthly), and year-month (dataset=yrmon). The year_byscene=XXXX data are used as input for generating these summary tables, which aggregate temperature data by year, month, and year-month. Aggregated data are not available for the following tiles: 001-004, 001-010, 002-012, 028-013, and 029-012, because these tiles primarily cover ocean with limited land, and no output data were generated. An example file path for this dataset follows: year=2023/dataset=lakes_annual/tile_hv=002-001/part-0.parquet
- "example_script_for_using_parquet.R" – This script includes code to download zip files directly from ScienceBase, identify HUC04 basins within a desired Landsat ARD grid tile, download NHDPlus High Resolution data for visualization, use the R arrow package to compile .parquet files in nested directories, and create example static and interactive maps.
- "nhd_HUC04s_ingrid.csv" – This crosswalk file identifies the HUC04 watersheds within each Landsat ARD tile grid.
- "site_id_tile_hv_crosswalk.csv" – This crosswalk file identifies the site_id (nhdhr_{permanent_identifier}) within each Landsat ARD tile grid. This file also includes a column (multiple_tiles) to identify site_ids that fall within multiple Landsat ARD tile grids.
- "lst_grid.png" – a map of the Landsat grid tiles labelled by the horizontal–vertical ID.
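A minimal sketch (not the provided example_script_for_using_parquet.R) of the screening steps suggested above, assuming year_byscene=2023.zip has been unzipped into the working directory; the thresholds are illustrative:

library(arrow)
library(dplyr)

scenes <- open_dataset("year_byscene=2023") |>
  mutate(percent_cloud_pixels = wb_dswe9_pixels / (wb_dswe9_pixels + wb_dswe1_pixels)) |>
  filter(percent_cloud_pixels < 0.25,   # keep scenes with < 25% cloud pixels over the waterbody
         wb_dswe1_pixels >= 10,         # drop waterbodies with few water pixels
         dp_dswe == 1) |>               # deepest point identified as water
  collect()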
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Identification of errors or anomalous values, collectively considered outliers, assists in exploring data, and removing outliers improves statistical analysis. In biomechanics, outlier detection methods have explored the 'shape' of entire cycles, although exploring fewer points using a 'moving window' may be advantageous. Hence, the aim was to develop a moving-window method for detecting trials with outliers in intra-participant time-series data. Outliers were detected in two stages for the strides (mean 38 cycles) from treadmill running. Cycles were removed in stage 1 for one-dimensional (spatial) outliers at each time point using the median absolute deviation, and in stage 2 for two-dimensional (spatial-temporal) outliers using a moving-window standard deviation. Significance levels of the t-statistic were used for scaling. Fewer cycles were removed with smaller scaling and smaller window size, requiring more stringent scaling at stage 1 (mean 3.5 cycles removed for 0.0001 scaling) than at stage 2 (mean 2.6 cycles removed for 0.01 scaling with a window size of 1). Settings in the supplied Matlab code should be customised to each data set, and outliers assessed to justify whether to retain or remove those cycles. The method is effective in identifying trials with outliers in intra-participant time-series data.
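A minimal R sketch of the stage-1 idea (the authors supply Matlab code; this is only an illustration, and the scaling here is a plain threshold k rather than the t-statistic-based scaling described above). cycles is assumed to be a matrix with one row per cycle and one column per normalised time point:

flag_mad_outlier_cycles <- function(cycles, k = 3) {
  flagged <- apply(cycles, 2, function(x) {
    m <- median(x)
    s <- median(abs(x - m)) * 1.4826     # MAD with consistency constant for normal data
    abs(x - m) > k * s
  })
  which(rowSums(flagged) > 0)            # cycles flagged at one or more time points
}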
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Fig 2
Bone marrow (Fig 2B, D, E, F, H, Supplementary Fig 1A, 2, 3)
1. Fig 2/BM/Reference/Fig2_BM_prepare_data.R: Prepare bone marrow data for CellFuse
2. Fig 2/BM/BM_CellFuse_Integration.R: Run CellFuse
3. Fig 2/BM/BM_Running_Benchmark_Methods.R: Run benchmarking methods (Harmony, Seurat, FastMNN)
4. Fig 2/BM/BM_scIB_Benchmarking.ipynb: Evaluate performance of CellFuse and other benchmarking methods using the scIB framework proposed by Luecken et al.
5. Fig 2/BM/BM_scIB_prepare_figures.R: Visualize results of the scIB framework
6. Fig 2/BM/Sequential_Feature_drop/Prepare_data.R: Prepare data for evaluating sequential feature drop
7. Fig 2/BM/Sequential_Feature_drop/Run_methods.R: Run CellFuse, Harmony, Seurat and FastMNN for sequential feature drop
8. Fig 2/BM/Sequential_Feature_drop/Evaluate_results.R: Evaluate results of the sequential feature drop and visualize data.
PBMC (Fig 2G, I, Supplementary Fig 1B and 4)
1. Fig 2/PBMC/Reference/Fig2_PBMC_prepare_data.R: Prepare PBMC data for CellFuse
2. Fig 2/PBMC/PBMC_CellFuse_Integration.R: Run CellFuse
3. Fig 2/PBMC/PBMC_Running_Benchmark_Methods.R: Run benchmarking methods (Harmony, Seurat, FastMNN)
4. Fig 2/PBMC/PBMC_scIB_Benchmarking.ipynb: Evaluate performance of CellFuse and other benchmarking methods using the scIB framework proposed by Luecken et al., 2021
5. Fig 2/PBMC/PBMC_scIB_prepare_figures.R: Visualize results of the scIB framework
6. Fig 2/PBMC/RunTime_benchmark/Run_Benchmark.R: Prepare data, run benchmarking methods and evaluate results.
Fig 3 and Supplementary Fig 5
1. Fig 3/Reference/Fig3_CyTOF_prepare_data.R: Prepare CyTOF and CITE-Seq data for CellFuse
2. Fig 3/CellFuse_Integration_CyTOF.R: Run CellFuse to remove batch effect and integrate CyTOF data from day 7 post-infusion
3. Fig 3/CellFuse_Integration_CITESeq.R: Run CellFuse to integrate CyTOF and CITE-Seq data
4. Fig 3/CART_Data_visualisation.R: Visualize data
Fig 4
HuBMAP CODEX data (Fig. 4A, B, C, D and Supplementary Fig 6)
1. Fig 4/CODEX_colorectal/Reference/CODEX_HuBMAP_prepare_data.R: Prepare CODEX data from the annotated and unannotated donors
2. Fig 4/CODEX_colorectal/CODEX_HuBMAP_CellFuse_Predict.R: Run CellFuse on cells from the annotated and unannotated donors
3. Fig 4/CODEX_colorectal/CODEX_HuBMAP_Data_visualisation.R: Visualize data and prepare figures.
4. Fig 4/CODEX_colorectal/CODEX_HuBMAP_Benchmark.R: Benchmark CellFuse against CELESTA, SVM and Seurat using cells from annotated donors and prepare figures.
a. Astir is a Python package, so run the following Python notebook: Fig 4/CODEX_colorectal/Benchmarking/Astir/Astrir.ipynb
5. Fig 4/CODEX_colorectal/CODEX_HuBMAP_Suppl_figure_heatmap.R: F1 score calculation per cell type per benchmarking method, and heatmap comparing cell types from annotated and unannotated donors (Supplementary Fig 6)
IMC breast cancer data (Fig. 4E, F, G and Supplementary Fig 7)
1. Fig 4/IMC_Breast_Cancer/IMC_prepare_data.R: Prepare CODEX data from the annotated and unannotated donors
2. Fig 4/IMC_Breast_Cancer/IMC_CellFuse_Predict.R: Run CellFuse to predict cell types
3. Fig 4/IMC_Breast_Cancer/IMC_dat_visualization.R: Visualize data and prepare figures.
Fig 5
1. Fig5/Reference/Fig5_CyTOF_Data_prep.R: Prepare CyTOF data from healthy PBMC and healthy colon single cells
2. Fig5/MIBI_CellFuse_Predict.R: Run CellFuse to predict cells from colon cancer patients
3. Fig5/MIBI_PostPrediction.R: Visualize data and prepare figures
4. Fig5/Predicted_Data/mask_generation.ipynb: Annotate cell types in segmented images after CellFuse prediction. This will generate Fig 5C and D.
Market basket analysis with Apriori algorithm
The retailer wants to target customers with suggestions about the itemsets they are most likely to purchase. I was given a retailer's dataset; the transaction data covers all the transactions that occurred over a period of time. The retailer will use the results to grow the business and offer customers itemset suggestions, so we can increase customer engagement, improve the customer experience, and identify customer behaviour. I will solve this problem using Association Rules, an unsupervised learning technique that checks for the dependency of one data item on another.
Association rule mining is most often used to build associations between different objects in a set and to find frequent patterns in a transaction database. It can tell you which items customers frequently buy together, allowing the retailer to identify relationships between items.
Assume there are 100 customers; 10 of them bought a computer mouse, 9 bought a mouse mat, and 8 bought both. For the rule "bought computer mouse => bought mouse mat":
- support = P(mouse & mat) = 8/100 = 0.08
- confidence = support / P(computer mouse) = 0.08/0.10 = 0.8
- lift = confidence / P(mouse mat) = 0.8/0.09 = 8.9
This is just a simple example. In practice, a rule needs the support of several hundred transactions before it can be considered statistically significant, and datasets often contain thousands or millions of transactions.
Number of Attributes: 7
First, we need to load the required libraries. Below, I briefly describe each library.
Next, we need to load Assignment-1_Data.xlsx into R to read the dataset. Now we can see our data in R.
Next, we will clean our data frame and remove missing values.
To apply association rule mining, we need to convert the data frame into transaction data so that all items bought together on one invoice will be in ...
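A minimal sketch of the Apriori step with the arules package, assuming the cleaned data frame retail has BillNo and Itemname columns (names illustrative); the support and confidence thresholds would need tuning for the real data:

library(arules)

# one transaction per invoice, duplicate items within an invoice removed
trans <- as(lapply(split(retail$Itemname, retail$BillNo), unique), "transactions")

rules <- apriori(trans, parameter = list(supp = 0.01, conf = 0.5, minlen = 2))
inspect(head(sort(rules, by = "lift"), 10))   # top 10 rules by lift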
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Featall.Rdata: dataset used to generate the results, including the MTR calculated for both birds and insects.
01_MTR_tidy.Rmd: R script to combine bird and insect data & remove precipitation and technical contamination
02_M_L_separation.Rmd: R script to calculate the proportion of migration for each day/night and estimate the total number of birds and insects per year
03_trend_construction.Rmd: R script to construct and plot the trend of animal movement
04_phenology_figs.Rmd: R script to plot flight direction (Fig. 2) and proportion of migration (Fig. S1)
https://spdx.org/licenses/CC0-1.0.html
These data and computer code (written in R, https://www.r-project.org) were created to statistically evaluate a suite of spatiotemporal covariates that could potentially explain pronghorn (Antilocapra americana) mortality risk in the Northern Sagebrush Steppe (NSS) ecosystem (50.0757° N, −108.7526° W). Known-fate data were collected from 170 adult female pronghorn monitored with GPS collars from 2003-2011, which were used to construct a time-to-event (TTE) dataset with a daily timescale and an annual recurrent origin of 11 November. Seasonal risk periods (winter, spring, summer, autumn) were defined by median migration dates of collared pronghorn. We linked this TTE dataset with spatiotemporal covariates that were extracted and collated from pronghorn seasonal activity areas (estimated using 95% minimum convex polygons) to form a final dataset. Specifically, average fence and road densities (km/km2), average snow water equivalent (SWE; kg/m2), and maximum decadal normalized difference vegetation index (NDVI) were considered as predictors. We tested for the main effects of spatiotemporal risk covariates as well as the hypotheses that pronghorn mortality risk from roads or fences could be intensified during severe winter weather (i.e., interactions: SWE*road density and SWE*fence density). We also compared an analogous frequentist implementation to estimate model-averaged risk coefficients. Ultimately, the study aimed to develop the first broad-scale, spatially explicit map of predicted annual pronghorn survivorship based on anthropogenic features and environmental gradients to identify areas for conservation and habitat restoration efforts.
Methods: We combined relocations from GPS-collared adult female pronghorn (n = 170) with raster data that described potentially important spatiotemporal risk covariates. We first collated relocation and time-to-event data to remove individual pronghorn from the analysis that had no spatial data available. We then constructed seasonal risk periods based on the median migration dates determined from a previous analysis; thus, we defined 4 seasonal periods as winter (11 November–21 March), spring (22 March–10 April), summer (11 April–30 October), and autumn (31 October–10 November). We used the package 'amt' in Program R to rarify relocation data to a common 4-hr interval using a 30-min tolerance. We used the package 'adehabitatHR' in Program R to estimate seasonal activity areas using 95% minimum convex polygons. We constructed annual- and seasonal-specific risk covariates by averaging values within individual activity areas. We specifically extracted values for linear features (road and fence densities), a proxy for snow depth (SWE), and a measure of forage productivity (NDVI). We resampled all raster data to a common resolution of 1 km2. Given that fence density models characterized regional-scale variation in fence density (i.e., 1.5 km2), this resolution seemed appropriate for our risk analysis. We fit Bayesian proportional hazards (PH) models using a time-to-event approach to model the effects of spatiotemporal covariates on pronghorn mortality risk. We aimed to develop a model to understand the relative effects of risk covariates for pronghorn in the NSS. The effect of fence or road densities may depend on SWE such that the variables interact in affecting mortality risk. Thus, our full candidate model included four main effects and two interaction terms. We used reversible-jump Markov Chain Monte Carlo (RJMCMC) to determine relative support for a nested set of Bayesian PH models. This allowed us to conduct Bayesian model selection and averaging in one step by using two custom samplers provided for the R package 'nimble'. For brevity, we provide the final time-to-event dataset and analysis code rather than include all of the code, GIS, etc. used to estimate seasonal activity areas and extract and collate spatial risk covariates for each individual. Rather, we provide the data and all code to reproduce the risk regression results presented in the manuscript.
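A minimal sketch of the relocation rarefication step with 'amt' (this is not the released analysis code; the data frame, column names, animal ID, and CRS below are illustrative):

library(amt)
library(lubridate)
library(dplyr)

# Shown for a single animal; with multiple animals, nest by ID and resample each track.
trk <- pronghorn_df |>
  filter(animal_id == "F001") |>                       # hypothetical animal ID
  make_track(x, y, timestamp, crs = 26913)             # projected CRS assumed

trk_4h <- track_resample(trk, rate = hours(4), tolerance = minutes(30))  # common 4-hr interval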
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The zip file contains the data files and R analysis script used in the manuscript titled 'Attentional bias modification in virtual reality - a VR-based dot-probe task with 2D and 3D stimuli'.
Analysis_script.R is a script file that can be opened by the statistical software R (https://www.r-project.org/) and RStudio (https://www.rstudio.com/). All analysis steps and code are found within this file.
All files under the Data_files folder are directly called by Analysis_script.R from R, therefore please ensure that the folder structure and file names remain the same.
The folder dot_probe_raw_data_files and its subfolders contain *.xml files with attentional bias (reaction time) data from the participants, generated by the VR program.
outcome_measures_and_demographic_data.xlsx contains participant demographic data and questionnaire measures, generated by the iTerapi platform. This data file has been cleaned to remove information irrelevant to the analysis (e.g. number of reminder emails sent etc.).
lsas_pre_individual_items.xlsx contains participant responses to individual items of the LSAS-SR questionnaire, generated by the iTerapi platform.
Attribution 3.0 (CC BY 3.0) https://creativecommons.org/licenses/by/3.0/
License information was derived automatically
This resource is for historic purposes only and was provided for the GovHack competition (3-5 July 2015). After the event it was discovered that the latitude and longitude columns had been inadvertently inverted. For any project using this data please use the updated version of the resource (link) located here.
We have elected not to remove this resource at this time so as to ensure that any GovHack entries using this data are not disadvantaged during the judging process. We intend to remove this version of the data after the GovHack judging has been completed.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The data contain bathymetric data from the Namibia continental slope. The data were acquired on R/V Meteor research expedition M76/1 in 2008, and R/V Maria S. Merian expedition MSM19/1c in 2011. The purpose of the data was the exploration of the Namibian continental slope and especially the investigation of large seafloor depressions. The bathymetric data were acquired with the 191-beam 12 kHz Kongsberg EM120 system. The data were processed using the public software package MBSystems. The loaded data were cleaned semi-automatically and manually, removing outliers and other erroneous data. Initial velocity fields were adjusted to remove artifacts from the data. Gridding was done in 10x10 m grid cells for the MSM19-1c dataset and 50x50 m for the M76 dataset using the Gaussian Weighted Mean algorithm.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
See below for details of the files included.
delly_vanc.vcf.gz # Raw output of Delly
b.vanc.fully.filtered.100k.plus.recode.vcf.gz # output of freebayes which was filtered using VCFtools v0.1.13 (Danecek et al. 2011) with the following flags: --remove-indels --min-alleles 2 --max-alleles 2 --minQ 20 --minDP 4 --max-missing 0.75
b.vanc.fully.filtered.100k.plus.recode.maf05.recode.ANN.vcf.gz #Fully filtered variant file (see manuscript for details) with annotation information
b.vanc.fully.filtered.100k.plus.recode.maf05.recode.impute.vcf.gz #Fully filtered variant file (see manuscript for details) after imputation with beagle
Trim_N_QC.sh #Trim raw sequencing data and run fastQC to evaluate trimmed data
BWA_PICARD_vanc1.sh #Example of script used to align sequence data to the reference genome using BWA. Also, uses Picard tools to sort, deduplicate and index bam files
P_call_test-2-vanc.sh #First part of pipeline for calling SNPS with freebayes (calls freebayes-parallel-part1_vanc.sh)
freebayes-parallel-part1_vanc.sh #see above
Filter_vanc.sh #Create list of SVs to filter from DELLY output
filter_delly.sh #filter based on generated list of SVs
delly_vanc.sh #call SVs using DELLY
bcf2vcf.sh # convert bcf from DELLY to vcf format
freebayes-parallel-part2.sh #Second part of freebayes pipeline
merge_vanc_vars.sh #Second part of freebayes pipeline (calls freebayes-parallel-part2.sh)
site_depth_vanc.sh #Gets site depth per SNP
remove_highdepth_vanc.sh #removes SNPs above depth threshold
hardy_vanc.sh #calculates HWE per SNP
remove_hwe_vanc.sh #removes SNPs based on HWE threshold
filter_vcf_size.sh #Removes SNPs on scaffolds less than 100Kb in size
filter_vcf_maf05.sh #filters SNPs based on 5% MAF filter
beagle.sh #imputes using beagle
LEA_con.R #converts vcf file into LFMM and geno format
Snpeff_ANN.sh # annotate vcf file using SNPeff
plink_for_sambaR.sh # convert vcf file into format ready for use in sambaR
LD_test.sh #example of script used to calculate LD per scaffold
vcf_stats.sh #Gets various stats from final filtered vcf
get_pi_diversity.sh #gets per population nucleotide diversity
sambaR.R #Runs SambaR
lfmm2_analysis.R #Code for running analysis on output of LFMM2 and generating graphs
Max_ent_map.R #Generates maxent map
RDA_script.R #Code for RDA analysis of structural variants
snprelate_script.R #runs SNPrelate as well as makes graphs of Fst and pi along scaffolds of interest
repeat_correctedfst.R #Analysis for correlation between repeat density and Fst
LD_script.R #analysis of linkage
A 150-kHz narrowband RD Instruments Acoustic Doppler Current Profiler (ADCP) internally recorded 34,805 current ensembles in 362 days from an Ice-Ocean Buoy (IOEB) deployed during the SHEBA project. The IOEB was initially deployed about 50 km from the main camp and drifted from 75.1 N, 141 W to 80.6 N, 160 W between October 1, 1997 and September 30, 1998. The ADCP was located at a depth of 14 m below the ice surface and was configured to record data at 15-minute intervals from 40 8-m-wide bins extending downward to 320 m below the instrument. The retrieved 24 Mb of raw data were processed to remove noise, correct for platform drift and geomagnetic declination, remove bottom hits, and output 2-hr average Earth-referenced current profiles along with ancillary data.
This dataset contains cleaned GBIF (www.gbif.org) occurrence records and associated climate and environmental data for all arthropod prey of listed species in California drylands as identified in Lortie et al. (2023): https://besjournals.onlinelibrary.wiley.com/doi/full/10.1002/2688-8319.12251. All arthropod records were downloaded from GBIF (https://doi.org/10.15468/dl.ngym3r) on 14 November 2022. Records were imported into R using the rgbif package and cleaned with the CoordinateCleaner package to remove occurrence data with likely errors. Environmental data include bioclimatic variables from WorldClim (www.worldclim.org), landcover and NDVI data from MODIS and the LPDAAC (https://lpdaac.usgs.gov/), elevation data from the USGS (https://www.sciencebase.gov/catalog/item/542aebf9e4b057766eed286a), and distance to the nearest road from the Census Bureau's TIGER/Line road shapefile (https://www.census.gov/geographies/mapping-files/time-series/geo/tiger-line-file.html). All environmental data were combined into a stacked raster and we extracted the environmental variables for each occurrence record from this raster to make the final dataset.
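A minimal sketch of this type of workflow (not the original processing code); the species name, raster file names, and column choices below are illustrative:

library(rgbif)
library(CoordinateCleaner)
library(terra)

occ <- occ_search(scientificName = "Bombus vosnesenskii", limit = 5000)$data  # hypothetical taxon

occ_clean <- clean_coordinates(occ,
                               lon = "decimalLongitude",
                               lat = "decimalLatitude",
                               species = "species",
                               value = "clean")          # keep only records passing the tests

env   <- rast(c("bioclim.tif", "ndvi.tif", "elevation.tif"))  # hypothetical stacked rasters
vals  <- extract(env, as.matrix(occ_clean[, c("decimalLongitude", "decimalLatitude")]))
final <- cbind(occ_clean, vals)                               # occurrences + environmental values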
https://guides.library.uq.edu.au/deposit-your-data/license-reuse-data-agreement
The SNP dataset for each species investigated in this study is present. These datasets are saved as R data objects in list format with metadata for samples and post-filtering DArTseq SNPs. Sample filtering included removing samples which we suspected to be mis-identified taxa, hybrids and those with >50% missing data, after which any samples from populations with fewer than 5 suitable samples remaining were also removed. SNP filtering included removing loci with reproducibility values below 0.96 or missingness of >20%, followed by subsampling to one SNP per locus to remove any linkage effects. Datasets can be read into R, where they are formatted as list objects.
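A minimal sketch for reading one of these objects into R, assuming the objects were saved as .RData files (file and object names are illustrative, not the actual names in this release):

load("species_snp_data.RData")         # loads the list object into the workspace
str(species_snp_data, max.level = 1)   # inspect the sample metadata and filtered SNP components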
Along-track temperature, salinity, backscatter, chlorophyll fluorescence, and normalized water-leaving radiance (nLw).
On the bow of the vessel was a Satlantic SeaWiFS Aircraft Simulator (MicroSAS) system, used to estimate water-leaving radiance from the ship, analogous to the nLw derived by the SeaWiFS and MODIS satellite sensors, but free from atmospheric error (hence, it can provide data below clouds).
The system consisted of a down-looking radiance sensor and a sky-viewing radiance sensor, both mounted on a steerable holder on the bow. A downwelling irradiance sensor was mounted at the top of the ship's meteorological mast, on the bow, far from any potentially shading structures. These data were used to estimate normalized water-leaving radiance as a function of wavelength. The radiance detector was set to view the water at 40deg from nadir as recommended by Mueller et al. [2003b]. The water radiance sensor was able to view over an azimuth range of ~180deg across the ship's heading with no viewing of the ship's wake. The direction of the sensor was adjusted to view the water 90-120deg from the sun's azimuth, to minimize sun glint. This was continually adjusted as the time and ship's gyro heading were used to calculate the sun's position using an astronomical solar position subroutine interfaced with a stepping motor which was attached to the radiometer mount (designed and fabricated at Bigelow Laboratory for Ocean Sciences). Protocols for operation and calibration were performed according to Mueller [Mueller et al., 2003a; Mueller et al., 2003b; Mueller et al., 2003c]. Before 1000h and after 1400h, data quality was poorer as the solar elevation was too low. Post-cruise, the 10Hz data were filtered to remove as much residual white cap and glint as possible (we accept the lowest 5% of the data). Reflectance plaque measurements were made several times at local apparent noon on sunny days to verify the radiometer calibrations.
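A minimal sketch of that post-cruise screen (keep the lowest 5% of the 10 Hz Lt samples within each averaging interval); the object and column names are assumptions, not the original processing code:

library(dplyr)

lt_screened <- lt_10hz |>
  group_by(interval_id) |>                       # hypothetical averaging-interval label
  filter(Lt <= quantile(Lt, 0.05)) |>            # accept only the lowest 5% of Lt samples
  summarise(Lt_mean = mean(Lt), .groups = "drop")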
Within an hour of local apparent noon each day, a Satlantic OCP sensor was deployed off the stern of the vessel after the ship oriented so that the sun was off the stern. The ship would secure the starboard Z-drive, and use port Z-drive and bow thruster to move the ship ahead at about 25cm s-1. The OCP was then trailed aft and brought to the surface ~100m aft of the ship, then allowed to sink to 100m as downwelling spectral irradiance and upwelling spectral radiance were recorded continuously along with temperature and salinity. This procedure ensured there were no ship shadow effects in the radiometry.
Instruments include a WETLabs wetstar fluorometer, a WETLabs ECOTriplet and a SeaBird microTSG.
Radiometry was done using a Satlantic 7 channel microSAS system with Es, Lt and Li sensors.
Chl data are based on intercalibrating discrete surface chlorophyll measurements with the temporally closest fluorescence measurement and applying the regression results to all fluorescence data.
Data have been corrected for instrument biofouling and drift based on weekly pure-water calibrations of the system. Radiometric data have been processed using standard Satlantic processing software and have been checked with periodic plaque measurements using a 2% Spectralon standard.
Lw is calculated from Lt and Lsky and is "what Lt would be if the sensor were looking straight down". Since our sensors are mounted at 40deg, based on various NASA protocols, we need to do that conversion. Lwn adds Es to the mix: Es is used to normalize Lw. nLw is related to Rrs, the remote sensing reflectance.
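For reference (these standard ocean-optics relations are not stated in the original record, so treat them as background rather than a description of the exact processing), the quantities are typically related as:

Rrs(lambda) = Lw(lambda) / Es(lambda)
nLw(lambda) = Rrs(lambda) x F0(lambda)

where F0(lambda) is the mean extraterrestrial solar irradiance.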
Techniques used are as described in:
Balch WM, Drapeau DT, Bowler BC, Booth ES, Windecker LA, Ashe A (2008) Space–time variability of carbon standing stocks and fixation rates in the Gulf of Maine, along the GNATS transect between Portland, ME, USA, and Yarmouth, Nova Scotia, Canada. J Plankton Res 30:119–139
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This is the final occurrence record dataset produced for the manuscript "Depth Matters for Marine Biodiversity". Detailed methods for the creation of the dataset, below, have been excerpted from Appendix I: Extended Methods. Detailed citations for the occurrence datasets from which these data were derived can also be found in Appendix I of the manuscript.
We assembled a list of all recognized species of fishes from the orders Scombriformes (sensu Betancur-R et al., 2017), Gadiformes, and Beloniformes by accessing FishBase (Boettiger et al., 2012; Froese & Pauly, 2017) and the Ocean Biodiversity Information System (OBIS; OBIS, 2022; Provoost & Bosch, 2019) through queries in R (R Core Team, 2021). Species were considered Atlantic if their FishBase distribution or occurrence records on OBIS included any area within the Atlantic or Mediterranean major fishing regions as defined by the Food and Agriculture Organization of the United Nations (FAO Regions 21, 27, 31, 34, 37, 41, 47, and 48; FAO, 2020). The database query script can be found on the project code repository (https://github.com/hannahlowens/3DFishRichness/blob/main/1_OccurrenceSearch.R). We then curated the list of names to resolve discrepancies in taxonomy and known distributions through comparison with the Eschmeyer Catalog of Fishes (Eschmeyer & Fricke, 2015), accessed in September of 2020, as our ultimate taxonomic authority. The resulting list of species was then mapped onto the Global Biodiversity Information Facility's backbone taxonomy (Chamberlain et al., 2021; GBIF, 2020a) to ensure taxonomic concurrence across databases (Appendix I Table 1). The final taxonomic list was used to download occurrence records from OBIS (OBIS, 2022) and GBIF (GBIF, 2020b) in R through robis and occCite (Chamberlain et al., 2020; Provoost & Bosch, 2019; Owens et al., 2021).
Once the resulting data were mapped and curated to remove records with putatively spurious coordinates, under-sampled regions and species were augmented with data from publicly available digital museum collection databases not served through OBIS or GBIF, as well as a literature search. For each species, duplicate points were removed from two- and three-dimensional species occurrence datasets separately, and inaccurate depth records were removed from 3D datasets. Inaccuracy was determined based on extreme statistical outliers (values greater than 2 or less than -2 when occurrence depths were centered and scaled), depth ranges that exceeded bathymetry at occurrence coordinates, and occurrences far outside known depth ranges compared to information from FishBase, Eschmeyer's Catalog of Fishes, and congeneric depth ranges in the dataset. Finally, for datasets with more than 20 points remaining after cleaning, occurrence data were downsampled to the resolution of the environmental data; that is, to 1 point per 1 degree grid cell in the 2D dataset, and to one point per depth slice per 1 degree grid cell in the 3D dataset. Counts of raw and cleaned records for each species can be found in Appendix I Table 1.
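A minimal sketch of the 2D downsampling step (one record per species per 1-degree grid cell), not the project's 1_OccurrenceSearch.R script; column names are illustrative:

library(dplyr)

occ_2d <- occ_clean |>
  mutate(cell_lon = floor(decimalLongitude),     # 1-degree grid cell index
         cell_lat = floor(decimalLatitude)) |>
  distinct(species, cell_lon, cell_lat, .keep_all = TRUE)   # keep one record per species per cell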
References:
Betancur-R, R., Wiley, E. O., Arratia, G., Acero, A., Bailly, N., Miya, M., Lecointre, G., & Ortí, G. (2017). Phylogenetic classification of bony fishes. BMC Evolutionary Biology, 17(1), 162. https://doi.org/10.1186/s12862-017-0958-3
Boettiger, C., Lang, D. T., & Wainwright, P. C. (2012). rfishbase: exploring, manipulating and visualizing FishBase data from R. Journal of Fish Biology, 81(6), 2030–2039. https://doi.org/10.1111/j.1095-8649.2012.03464.x
Chamberlain, S., Barve, V., McGlinn, D., Oldoni, D., Desmet, P., Geffert, L., & Ram, K. (2021). rgbif: Interface to the Global Biodiversity Information Facility API. https://CRAN.R-project.org/package=rgbif
Eschmeyer, W. N., & Fricke, R. (2015). Taxonomic checklist of fish species listed in the CITES Appendices and EC Regulation 338/97 (Elasmobranchii, Actinopteri, Coelacanthi, and Dipneusti, except the genus Hippocampus). Catalog of Fishes, Electronic Version. Accessed September, 2020. https://www.calacademy.org/scientists/projects/eschmeyers-catalog-of-fishes
FAO. (2020). FAO Major Fishing Areas. United Nations Fisheries and Aquaculture Division. https://www.fao.org/fishery/en/collection/area
Froese, R., & Pauly, D. (2017). FishBase. Accessed September, 2022. www.fishbase.org
GBIF.org. (2020a). GBIF Backbone Taxonomy. Accessed September, 2020. GBIF.org
GBIF.org. (2020b). GBIF Occurrence Download. Accessed November, 2020. https://doi.org/10.15468
OBIS. (2020). Ocean Biodiversity Information System. Intergovernmental Oceanographic Commission of UNESCO. Accessed November, 2020. www.obis.org
Owens, H. L., Merow, C., Maitner, B. S., Kass, J. M., Barve, V., & Guralnick, R. P. (2021). occCite: Tools for querying and managing large biodiversity occurrence datasets. Ecography, 44(8), 1228–1235. https://doi.org/10.1111/ecog.05618
Provoost, P., & Bosch, S. (2019). robis: R Client to access data from the OBIS API. https://cran.r-project.org/package=robis
R Core Team. (2021). R: A Language and Environment for Statistical Computing. https://www.R-project.org/
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Summary:
Marine geophysical exchange files for R/V Kilo Moana, 2002 to 2018: this collection includes 328 geophysical archive files spanning km0201, the vessel's very first expedition, through km1812, the last survey included in this data synthesis.
Data formats (you will likely require only one of these):
MGD77T (M77T): ASCII - the current standard format for marine geophysical data exchange, tab delimited, low human readability
MGD77: ASCII - legacy format for marine geophysical data exchange (no longer recommended due to truncated data precision and low human readability)
GMT DAT: ASCII - the Generic Mapping Tools format in which these archive files were built, best human readability but largest file size
MGD77+: highly flexible and disk-space-saving binary NetCDF-based format, enables adding additional columns and application of errata-based data correction methods (i.e., Chandler et al., 2012), not human readable
The process by which formats were converted is explained below.
Data Reduction and Explanation:
R/V Kilo Moana routinely acquired bathymetry data using two concurrently operated sonar systems; hence, for this analysis, a best effort was made to extract center-beam depth values from the appropriate sonar system. No resampling or decimation of center-beam depth data has been performed, with the exception that all depth measurements were required to be temporally separated by at least 1 second. The initial sonar systems were the Kongsberg EM120 for deep and EM1002 for shallow water mapping. The vessel's deep sonar system was upgraded to the Kongsberg EM122 in January of 2010 and the shallow system to the EM710 in March 2012.
The vessel deployed a Lacoste and Romberg spring-type gravity meter (S-33) from 2002 until March 2012 when it was replaced with a Bell Labs BGM-3 forced feedback-type gravity meter. Of considerable importance is that gravity tie-in logs were by and large inadequate for the rigorous removal of gravity drift and tares. Hence a best effort has been made to remove gravity meter drift via robust regression to satellite-derived gravity data. Regression slope and intercept are analogous to instrument drift and DC shift hence their removal markedly improves the agreement between shipboard and satellite gravity anomalies for most surveys. These drift corrections were applied to both observed gravity and free air anomaly fields. If the corrections are undesired by users, the correction coefficients have been supplied within the metadata headers for all gravity surveys, thereby allowing users to undo these drift corrections.
The L&R gravity meter had a 180-second hardware filter, so for this analysis the data were Gaussian filtered another 180 seconds and resampled at 10 seconds. BGM-3 data are not hardware filtered, hence a 360-second Gaussian filter was applied for this analysis. BGM-3 gravity anomalies were resampled at 15-second intervals. For both meter types, data gaps exceeding the filter length were not through-interpolated. Eotvos corrections were computed via the standard formula (e.g., Dehlinger, 1978) and were subjected to the same filtering as the respective gravity meter data.
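For reference, the standard Eotvos correction cited above can be sketched in R as follows (an illustration of the textbook formula, not the processing code used for these archives); speed in knots, heading and latitude in degrees, result in mGal:

eotvos_mgal <- function(speed_kn, heading_deg, lat_deg) {
  7.503 * speed_kn * cos(lat_deg * pi / 180) * sin(heading_deg * pi / 180) +
    0.004154 * speed_kn^2
}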
The vessel also deployed a Geometrics G-882 cesium vapor magnetometer on several expeditions. A Gaussian filter length of 135 seconds has been applied and resampling was performed at 15 second intervals with the same exception that no interpolation was performed through data gaps exceeding the filter length.
Archive file production:
At all depth, gravity and magnetic measurement times, vessel GPS navigation was resampled using linear interpolation as most geophysical measurement times did not exactly coincide with GPS position times. The geophysical fields were then merged with resampled vessel navigation and listed sequentially in the GMT DAT format to produce data records.
Archive file header fields were populated with relevant information such as port names, PI names, instrument and data processing details, and others whereas survey geographic and temporal boundary fields were automatically computed from the data records.
Archive file conversion:
Once completed, each marine geophysical data exchange file was converted to the other formats using the Generic Mapping Tools program known as mgd77convert. For example, conversions to the other formats were carried out as follows:
mgd77convert km0201.dat -Ft -Tm # gives mgd77t (m77t file extension)
mgd77convert km0201.dat -Ft -Ta # gives mgd77
mgd77convert km0201.dat -Ft -Tc # gives mgd77+ (nc file extension)
Disclaimers:
These data have not been edited in detail using a visual data editor and data outliers are known to exist. Several hardware malfunctions are known to have occurred during the 2002 to 2018 time frame and these malfunctions are apparent in some of the data sets. No guarantee is made that the data are accurate and they are not meant to be used for vessel navigation. Close scrutiny and further removal of outliers and other artifacts is recommended before making scientific determinations from these data.
The archive file production method employed for this analysis is explained in detail by Hamilton et al (2019).
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This dataset contains the data in
Pereira, M., Faivre, N., Iturrate, I., Wirthlin, M., Serafini, L., Martin, S., Desvachez, A., Blanke, O., Van De Ville, D., Millan, JdR. (2020). Disentangling the origins of confidence in speeded perceptual judgments through multimodal imaging. Proceedings of the National Academy of Science, 117 (15) pp. 8382-8390 https://doi.org/10.1073/pnas.1918335117
Preprint: https://www.biorxiv.org/content/10.1101/496877v1
ABSTRACT The human capacity to compute the likelihood that a decision is correct—known as metacognition—has proven difficult to study in isolation as it usually cooccurs with decision making. Here, we isolated postdecisional from decisional contributions to metacognition by analyzing neural correlates of confidence with multimodal imaging. Healthy volunteers reported their confidence in the accuracy of decisions they made or decisions they observed. We found better metacognitive performance for committed vs. observed decisions, indicating that committing to a decision may improve confidence. Relying on concurrent electroencephalography and hemodynamic recordings, we found a common correlate of confidence following committed and observed decisions in the inferior frontal gyrus and a dissociation in the anterior prefrontal cortex and anterior insula. We discuss these results in light of decisional and postdecisional accounts of confidence and propose a computational model of confidence in which metacognitive performance naturally improves when evidence accumulation is constrained upon committing a decision.
preregistration: https://osf.io/a5qmv/
The dataset contains raw fMRI scans, raw EEG in BrainVision format as well as anatomical scans (T1) and field mapping. We also included preprocessed EEG and fMRI data in derivatives/eegprep and derivatives/fmriprep.
EEG PREPROCESSING MR-gradient artifacts were removed using sliding-window average template subtraction. The TP10 electrode on the right mastoid was used to detect heartbeats for ballistocardiogram (BCG) artifact removal using a semi-automatic procedure in BrainVision Analyzer 2. Data were then filtered using a Butterworth, 4th-order zero-phase (two-pass) bandpass filter between 1 and 10 Hz, epoched [-0.2, 0.6 s] around the response onset (i.e. the button press in the active condition or the appearance of the virtual hand in the observation condition), re-referenced to a common average, and input to independent component analysis (ICA) to remove residual BCG and ocular artifacts. In order to ensure numerical stability when estimating the independent components, we retained 99% of the variance from the electrode space, leading to an average of 19 (SD = 6) components estimated for each participant and condition. Independent components (ICs) were then fitted with a dipolar source localization method (66). ICs whose dipole lay outside the brain, or which resembled muscular or ocular artifacts, were eliminated. A total of 8 (SD = 3) components were finally kept. All preprocessing steps were performed using EEGLAB and in-house scripts under Matlab (The MathWorks, Inc., Natick, Massachusetts, United States).
FMRI PREPROCESSING We modeled the BOLD signal using a general linear model (GLM) with two separate regressors (stick functions at stimulus onset) for the active and observation conditions as well as their spatial and temporal derivatives. We then parametrically modulated the regressors with three behavioral variables: the confidence ratings, the response times, and the numerosity difference between the two arrays of dots (i.e., perceptual evidence). Empirical cross-correlation between regressors confirmed limited collinearity for the active (resp. observation) condition (max(abs(R)) = 0.26 ± 0.02, resp. max(abs(R)) = 0.25 ± 0.02). Bad trials as defined in the behavioral analysis section were modeled by two separate regressors (one for active and one for observation) and their spatial and temporal derivatives. We added six realignment parameters as regressors of no interest. All second-level (group-level) results are reported at a significance level of p < 0.05 using cluster-extent family-wise error (FWE) correction with a voxel-height threshold of p < 0.001. We used the anatomical automatic labelling (AAL) atlas for brain parcellation (Tzourio-Mazoyer et al., 2002).