42 datasets found
  1. Merge script

    • figshare.com
    txt
    Updated May 30, 2018
    Cite
    Jose Aguasvivas; Jon Andoni Duñabeitia (2018). Merge script [Dataset]. http://doi.org/10.6084/m9.figshare.5924797.v2
    Available download formats: txt
    Dataset updated
    May 30, 2018
    Dataset provided by
    figshare
    Authors
    Jose Aguasvivas; Jon Andoni Duñabeitia
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    R script to load the databases and merge them into a single file. This script may be modified and extended as needed.
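
    A minimal sketch of what such a merge script could look like in base R (the folder name, file pattern, and the key column "subject_id" are assumptions for illustration, not the authors' actual code):

    # Load all delimited database files from a folder and merge them
    # into a single file by a shared key column.
    data_dir <- "databases"
    files <- list.files(data_dir, pattern = "\\.txt$", full.names = TRUE)
    tables <- lapply(files, read.delim, stringsAsFactors = FALSE)

    # Merge the tables pairwise on the shared key, keeping all rows.
    merged <- Reduce(function(x, y) merge(x, y, by = "subject_id", all = TRUE), tables)

    write.csv(merged, "merged_database.csv", row.names = FALSE)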

  2. NSF/NCAR GV HIAPER 1 Minute Data Merge

    • data.ucar.edu
    ascii
    Updated Aug 1, 2025
    + more versions
    Cite
    Gao Chen; Jennifer R. Olson; Michael Shook (2025). NSF/NCAR GV HIAPER 1 Minute Data Merge [Dataset]. http://doi.org/10.26023/R1RA-JHKZ-W913
    Available download formats: ascii
    Dataset updated
    Aug 1, 2025
    Authors
    Gao Chen; Jennifer R. Olson; Michael Shook
    Time period covered
    May 18, 2012 - Jun 30, 2012
    Area covered
    Description

    This data set contains NSF/NCAR GV HIAPER 1 Minute Data Merge data collected during the Deep Convective Clouds and Chemistry Experiment (DC3) from 18 May 2012 through 30 June 2012. These are updated merges from the NASA DC3 archive, made available on 13 June 2014. In most cases, variable names have been kept identical to those submitted in the raw data files. However, in some cases, names have been changed (e.g., to eliminate duplication). Units have been standardized throughout the merge. In addition, a "grand merge" has been provided, which includes data from all the individual merged flights throughout the mission. The grand merge follows the naming convention "dc3-mrg60-gV_merge_YYYYMMdd_R5_thruYYYYMMdd.ict" (with "_thruYYYYMMdd" indicating the last flight date included). This data set is in ICARTT format. Please see the header portion of the data files for details on instruments, parameters, quality assurance, quality control, contact information, and data set comments.

  3. Merger of BNV-D data (2008 to 2019) and enrichment

    • data.europa.eu
    zip
    Updated Jan 16, 2025
    Cite
    Patrick VINCOURT (2025). Merger of BNV-D data (2008 to 2019) and enrichment [Dataset]. https://data.europa.eu/data/datasets/5f1c3eca9d149439e50c740f
    Available download formats: zip (18530465)
    Dataset updated
    Jan 16, 2025
    Dataset authored and provided by
    Patrick VINCOURT
    Description

    Merging (in R) of the data published at https://www.data.gouv.fr/fr/datasets/ventes-de-pesticides-par-departement/, joined with two other sources of information associated with marketing authorisations (AMM): uses (https://www.data.gouv.fr/fr/datasets/usages-des-produits-phytosanitaires/) and the "Biocontrol" status of the product, taken from document DGAL/SDQSPV/2020-784 published on 18/12/2020 at https://agriculture.gouv.fr/quest-ce-que-le-biocontrole

    All the initial files (.csv transformed into .txt), the R code used to merge the data, and the different output files are collected in a zip. NB: 1) "YASCUB" stands for {year, AMM, Substance_active, Classification, Usage, Statut_"BioControl"}; substances not on the DGAL/SDQSPV list are coded NA. 2) The file of biocontrol products was cleaned of the duplicates generated by marketing authorisations covering several trade names.
    3) The BNVD_BioC_DY3 table and the output file BNVD_BioC_DY3.txt contain the fields {Code_Region, Region, Dept, Code_Dept, Anne, Usage, Classification, Type_BioC, Quantite_substance}.
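
    For illustration, a hedged sketch of how such a merge and enrichment could be done in R (file and column names, including "amm", are assumptions, not the author's actual code):

    # Stack the yearly BNV-D sales files, then join usage information and
    # flag products present on the biocontrol list, matching on the
    # marketing-authorisation number (AMM).
    sales_files <- list.files("bnvd", pattern = "\\.txt$", full.names = TRUE)
    sales <- do.call(rbind, lapply(sales_files, read.delim))

    usages <- read.delim("usages_produits_phytosanitaires.txt")
    biocontrol <- read.delim("liste_biocontrole_DGAL.txt")

    out <- merge(sales, usages, by = "amm", all.x = TRUE)
    out$type_bioc <- ifelse(out$amm %in% biocontrol$amm, "BioControl", NA)

    write.table(out, "BNVD_enrichi.txt", sep = "\t", row.names = FALSE)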

  4. NASA DC-8 SAGAAERO Data Merge

    • data.ucar.edu
    ascii
    Updated Aug 1, 2025
    Cite
    Gao Chen; Jennifer R. Olson; Michael Shook (2025). NASA DC-8 SAGAAERO Data Merge [Dataset]. http://doi.org/10.26023/ANQE-HZRR-P30K
    Available download formats: ascii
    Dataset updated
    Aug 1, 2025
    Authors
    Gao Chen; Jennifer R. Olson; Michael Shook
    Time period covered
    May 18, 2012 - Jun 22, 2012
    Area covered
    Description

    This data set contains NASA DC-8 SAGAAERO Data Merge data collected during the Deep Convective Clouds and Chemistry Experiment (DC3) from 18 May 2012 through 22 June 2012. These merge files were updated by NASA. The data have been merged to the SAGAAero file timeline. In most cases, variable names have been kept identical to those submitted in the raw data files. However, in some cases, names have been changed (e.g., to eliminate duplication). Units have been standardized throughout the merge. In addition, a "grand merge" has been provided, which includes data from all the individual merged flights throughout the mission. The grand merge follows the naming convention "dc3-mrgSAGAAero-dc8_merge_YYYYMMdd_R*_thruYYYYMMdd.ict" (with "_thruYYYYMMdd" indicating the last flight date included). This data set is in ICARTT format. Please see the header portion of the data files for details on instruments, parameters, quality assurance, quality control, contact information, and data set comments.

  5. Data from: KORUS-AQ Aircraft Merge Data Files

    • catalog.data.gov
    • cmr.earthdata.nasa.gov
    Updated Aug 22, 2025
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    NASA/LARC/SD/ASDC (2025). KORUS-AQ Aircraft Merge Data Files [Dataset]. https://catalog.data.gov/dataset/korus-aq-aircraft-merge-data-files
    Dataset updated
    Aug 22, 2025
    Dataset provided by
    NASA (http://nasa.gov/)
    Description

    KORUSAQ_Merge_Data are pre-generated merge data files combining various products collected during the KORUS-AQ field campaign. This collection features pre-generated merge files for the DC-8 aircraft. Data collection for this product is complete. The KORUS-AQ field study was conducted in South Korea during May-June 2016. The study was jointly sponsored by NASA and Korea's National Institute of Environmental Research (NIER). The primary objectives were to investigate the factors controlling air quality in Korea (e.g., local emissions, chemical processes, and transboundary transport) and to assess future air quality observing strategies incorporating geostationary satellite observations. To achieve these science objectives, KORUS-AQ adopted a highly coordinated sampling strategy involving surface and airborne measurements with both in-situ and remote sensing instruments. Surface observations provided details on ground-level air quality conditions, while airborne sampling provided an assessment of conditions aloft relevant to satellite observations and necessary to understand the role of emissions, chemistry, and dynamics in determining air quality outcomes. The sampling region covers the South Korean peninsula and surrounding waters, with a primary focus on the Seoul Metropolitan Area. Airborne sampling was primarily conducted from near surface to about 8 km, with extensive profiling to characterize the vertical distribution of pollutants and their precursors. The airborne observational data were collected from three aircraft platforms: the NASA DC-8, NASA B-200, and Hanseo King Air. Surface measurements were conducted from 16 ground sites and 2 ships: R/V Onnuri and R/V Jang Mok. The major data products collected from both the ground and air include in-situ measurements of trace gases (e.g., ozone, reactive nitrogen species, carbon monoxide and dioxide, methane, non-methane and oxygenated hydrocarbon species), aerosols (e.g., microphysical and optical properties and chemical composition), active remote sensing of ozone and aerosols, and passive remote sensing of NO2, CH2O, and O3 column densities. These data products support research focused on examining the impact of photochemistry and transport on ozone and aerosols, evaluating emissions inventories, and assessing the potential use of satellite observations in air quality studies.

  6. NASA DC-8 10 Second Data Merge

    • data.ucar.edu
    ascii
    Updated Aug 1, 2025
    + more versions
    Cite
    Gao Chen; Jennifer R. Olson; Michael Shook (2025). NASA DC-8 10 Second Data Merge [Dataset]. http://doi.org/10.26023/CHJ0-RYQ4-GR10
    Available download formats: ascii
    Dataset updated
    Aug 1, 2025
    Authors
    Gao Chen; Jennifer R. Olson; Michael Shook
    Time period covered
    May 18, 2012 - Jun 22, 2012
    Area covered
    Description

    This data set contains NASA DC-8 10 Second Data Merge data collected during the Deep Convective Clouds and Chemistry Experiment (DC3) from 18 May 2012 through 22 June 2012. These merges are an updated version provided by NASA. In most cases, variable names have been kept identical to those submitted in the raw data files. However, in some cases, names have been changed (e.g., to eliminate duplication). Units have been standardized throughout the merge. In addition, a "grand merge" has been provided, which includes data from all the individual merged flights throughout the mission. The grand merge follows the naming convention "dc3-mrg10-dc8_merge_YYYYMMdd_R*_thruYYYYMMdd.ict" (with "_thruYYYYMMdd" indicating the last flight date included). This data set is in ICARTT format. Please see the header portion of the data files for details on instruments, parameters, quality assurance, quality control, contact information, and data set comments. For the latest information on the updates to this dataset, please see the readme file.

  7. Multilevel modeling of time-series cross-sectional data reveals the dynamic...

    • data.niaid.nih.gov
    • dataone.org
    • +1more
    zip
    Updated Mar 6, 2020
    Cite
    Kodai Kusano (2020). Multilevel modeling of time-series cross-sectional data reveals the dynamic interaction between ecological threats and democratic development [Dataset]. http://doi.org/10.5061/dryad.547d7wm3x
    Available download formats: zip
    Dataset updated
    Mar 6, 2020
    Dataset provided by
    University of Nevada, Reno
    Authors
    Kodai Kusano
    License

    CC0 1.0 Universal: https://spdx.org/licenses/CC0-1.0.html

    Description

    What is the relationship between environment and democracy? The framework of cultural evolution suggests that societal development is an adaptation to ecological threats. Pertinent theories assume that democracy emerges as societies adapt to ecological factors such as higher economic wealth, lower pathogen threats, less demanding climates, and fewer natural disasters. However, previous research confused within-country processes with between-country processes and erroneously interpreted between-country findings as if they generalize to within-country mechanisms. In this article, we analyze a time-series cross-sectional dataset to study the dynamic relationship between environment and democracy (1949-2016), accounting for previous misconceptions in levels of analysis. By separating within-country processes from between-country processes, we find that the relationship between environment and democracy not only differs by countries but also depends on the level of analysis. Economic wealth predicts increasing levels of democracy in between-country comparisons, but within-country comparisons show that democracy declines as countries become wealthier over time. This relationship is only prevalent among historically wealthy countries but not among historically poor countries, whose wealth also increased over time. By contrast, pathogen prevalence predicts lower levels of democracy in both between-country and within-country comparisons. Our longitudinal analyses identifying temporal precedence reveal that not only reductions in pathogen prevalence drive future democracy, but also democracy reduces future pathogen prevalence and increases future wealth. These nuanced results contrast with previous analyses using narrow, cross-sectional data. As a whole, our findings illuminate the dynamic process by which environment and democracy shape each other.

    Methods: Our time-series cross-sectional data combine various online databases. Country names were first identified and matched using the R package "countrycode" (Arel-Bundock, Enevoldsen, & Yetman, 2018) before all datasets were merged. Occasionally, we modified unidentified country names to be consistent across datasets. We then transformed "wide" data into "long" data and merged them using R's Tidyverse framework (Wickham, 2014). Our analysis begins with the year 1949 because one of the key time-variant level-1 variables, pathogen prevalence, was only available from 1949 onward. See our Supplemental Material for all data, Stata syntax, R Markdown for visualization, supplemental analyses, and detailed results (available at https://osf.io/drt8j/).
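
    A hedged sketch of the kind of harmonization and merge described above (file names and columns are hypothetical; the actual scripts are in the Supplemental Material):

    library(tidyverse)
    library(countrycode)

    # Harmonize country names to ISO3 codes, reshape wide yearly columns
    # to long format, then join the sources into one country-year panel.
    democracy <- read_csv("democracy_wide.csv") %>%
      mutate(iso3 = countrycode(country, origin = "country.name", destination = "iso3c")) %>%
      pivot_longer(-c(country, iso3), names_to = "year", values_to = "democracy",
                   names_transform = list(year = as.integer))

    pathogens <- read_csv("pathogen_long.csv") %>%
      mutate(iso3 = countrycode(country, origin = "country.name", destination = "iso3c"))

    panel <- democracy %>%
      left_join(pathogens, by = c("iso3", "year")) %>%
      filter(year >= 1949)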

  8. Data from: A dataset to model Levantine landcover and land-use change...

    • zenodo.org
    • data.niaid.nih.gov
    zip
    Updated Dec 16, 2023
    Cite
    Michael Kempf; Michael Kempf (2023). A dataset to model Levantine landcover and land-use change connected to climate change, the Arab Spring and COVID-19 [Dataset]. http://doi.org/10.5281/zenodo.10396148
    Available download formats: zip
    Dataset updated
    Dec 16, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Michael Kempf; Michael Kempf
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Dec 16, 2023
    Area covered
    Levant
    Description

    Overview

    This dataset is the repository for the following paper submitted to Data in Brief:

    Kempf, M. A dataset to model Levantine landcover and land-use change connected to climate change, the Arab Spring and COVID-19. Data in Brief (submitted: December 2023).

    The Data in Brief article contains the supplement information and is the related data paper to:

    Kempf, M. Climate change, the Arab Spring, and COVID-19 - Impacts on landcover transformations in the Levant. Journal of Arid Environments (revision submitted: December 2023).

    Description/abstract

    The Levant region is highly vulnerable to climate change, experiencing prolonged heat waves that have led to societal crises and population displacement. Since 2010, the area has been marked by socio-political turmoil, including the Syrian civil war and currently the escalation of the so-called Israeli-Palestinian Conflict, which strained neighbouring countries like Jordan due to the influx of Syrian refugees and increases population vulnerability to governmental decision-making. Jordan, in particular, has seen rapid population growth and significant changes in land-use and infrastructure, leading to over-exploitation of the landscape through irrigation and construction. This dataset uses climate data, satellite imagery, and land cover information to illustrate the substantial increase in construction activity and highlights the intricate relationship between climate change predictions and current socio-political developments in the Levant.

    Folder structure

    The main folder after download contains all data; the following subfolders are stored as zipped files:

    “code” stores the 9 code chunks described below under “Code structure” to read, extract, process, analyse, and visualize the data.

    “MODIS_merged” contains the 16-days, 250 m resolution NDVI imagery merged from three tiles (h20v05, h21v05, h21v06) and cropped to the study area, n=510, covering January 2001 to December 2022 and including January and February 2023.

    “mask” contains a single shapefile, which is the merged product of administrative boundaries, including Jordan, Lebanon, Israel, Syria, and Palestine (“MERGED_LEVANT.shp”).

    “yield_productivity” contains .csv files of yield information for all countries listed above.

    “population” contains two files with the same name but different format. The .csv file is for processing and plotting in R. The .ods file is for enhanced visualization of population dynamics in the Levant (Socio_cultural_political_development_database_FAO2023.ods).

    “GLDAS” stores the raw data of the NASA Global Land Data Assimilation System datasets that can be read, extracted (variable name), and processed using code “8_GLDAS_read_extract_trend” from the respective folder. One folder contains data from 1975-2022 and a second the additional January and February 2023 data.

    “built_up” contains the landcover and built-up change data from 1975 to 2022. This folder is subdivided into two subfolders, which contain the raw data and the already processed data. “raw_data” contains the unprocessed datasets and “derived_data” stores the cropped built_up datasets at 5-year intervals, e.g., “Levant_built_up_1975.tif”.

    Code structure

    1_MODIS_NDVI_hdf_file_extraction.R


    This is the first code chunk; it covers the extraction of MODIS data from the .hdf file format. The following packages must be installed, and the raw data must be downloaded using a simple mass downloader, e.g., from Google Chrome. Packages: terra. Download MODIS data after registration from: https://lpdaac.usgs.gov/products/mod13q1v061/ or https://search.earthdata.nasa.gov/search (MODIS/Terra Vegetation Indices 16-Day L3 Global 250m SIN Grid V061, last accessed 9 October 2023). The code reads a list of files, extracts the NDVI, and saves each file to a single .tif file with the indication “NDVI”. Because the study area is quite large, we have to load three spatially different time series (one per tile) and merge them later. Note that the time series are temporally consistent.
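
    A hedged sketch of what this extraction step could look like with terra (paths and the sub-dataset index are assumptions; the first MOD13Q1 sub-dataset is assumed to be the 16-day NDVI):

    library(terra)

    hdf_files <- list.files("MODIS_raw", pattern = "\\.hdf$", full.names = TRUE)
    dir.create("MODIS_NDVI", showWarnings = FALSE)

    for (f in hdf_files) {
      ndvi <- rast(f, subds = 1)  # assumed: first sub-dataset holds the NDVI layer
      out <- file.path("MODIS_NDVI",
                       paste0(tools::file_path_sans_ext(basename(f)), "_NDVI.tif"))
      writeRaster(ndvi, out, overwrite = TRUE)
    }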


    2_MERGE_MODIS_tiles.R


    In this code, we load and merge the three different stacks to produce a large and consistent time series of NDVI imagery across the study area. We use the package gtools to load the files in natural numeric order (1, 2, 3, 4, 5, 6, etc.). Here, we have three stacks, of which we merge the first two (stack 1, stack 2) and store the result. We then merge this stack with stack 3. We produce single files named NDVI_final_*consecutivenumber*.tif. Before saving the final output of single merged files, create a folder called “merged” and set the working directory to this folder, e.g., setwd("your directory_MODIS/merged").
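
    A hedged sketch of this merging step (directory names are assumptions, not the original code):

    library(terra)
    library(gtools)

    # Load the per-tile NDVI files in natural numeric order, then merge the
    # three tiles date by date: tiles 1 and 2 first, then the result with tile 3.
    tile_dirs <- c("h20v05", "h21v05", "h21v06")
    files <- lapply(tile_dirs, function(d)
      mixedsort(list.files(d, pattern = "_NDVI\\.tif$", full.names = TRUE)))

    dir.create("merged", showWarnings = FALSE)
    for (i in seq_along(files[[1]])) {
      m12 <- merge(rast(files[[1]][i]), rast(files[[2]][i]))
      m   <- merge(m12, rast(files[[3]][i]))
      writeRaster(m, file.path("merged", paste0("NDVI_final_", i, ".tif")),
                  overwrite = TRUE)
    }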


    3_CROP_MODIS_merged_tiles.R


    Now we want to crop the derived MODIS tiles to our study area. We use a mask, provided as a .shp file in the repository, named "MERGED_LEVANT.shp". We load the merged .tif files and crop the stack with the vector. Saving to individual files, we name them “NDVI_merged_clip_*consecutivenumber*.tif”. We have now produced single cropped NDVI time series data from MODIS.
    The repository provides the already clipped and merged NDVI datasets.
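
    A hedged sketch of the cropping step (folder names are assumptions; the mask shapefile is the one provided in the repository):

    library(terra)
    library(gtools)

    mask_vec <- vect("MERGED_LEVANT.shp")
    merged_files <- mixedsort(list.files("merged", pattern = "^NDVI_final_.*\\.tif$",
                                         full.names = TRUE))

    # Crop and mask each merged mosaic to the study area outline.
    for (i in seq_along(merged_files)) {
      clipped <- mask(crop(rast(merged_files[i]), mask_vec), mask_vec)
      writeRaster(clipped, paste0("NDVI_merged_clip_", i, ".tif"), overwrite = TRUE)
    }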


    4_TREND_analysis_NDVI.R


    Now we want to perform trend analysis on the derived data. The data we load are tricky, as they contain a 16-day return period across each year for a period of 22 years. Growing season sums contain MAM (March-May), JJA (June-August), and SON (September-November). December is represented as a single file, which means that the period DJF (December-February) is represented by 5 images instead of 6. For the last DJF period (December 2022), the data from January and February 2023 can be added. The code selects the respective images from the stack, depending on which period is under consideration. From these stacks, individual annually resolved growing season sums are generated and the slope is calculated. We can then extract the p-values of the trend and characterize all values with a high confidence level (0.05). Using the ggplot2 package and the melt function from the reshape2 package, we can create a plot of the reclassified NDVI trends together with a local smoother (LOESS) of value 0.3.
    To increase comparability and understand the amplitude of the trends, z-scores were calculated and plotted, which show the deviation of the values from the mean. This has been done for the NDVI values as well as the GLDAS climate variables as a normalization technique.


    5_BUILT_UP_change_raster.R


    Let us look at the landcover changes now. We work with the terra package and get raster data from https://ghsl.jrc.ec.europa.eu/download.php?ds=bu (last accessed 3 March 2023, 100 m resolution, global coverage). One can download the temporal coverage that is aimed for and reclassify it using the code after cropping to the individual study area. Here, I summed up the different rasters to characterize the built-up change in continuous values between 1975 and 2022.
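
    A hedged sketch of such a summation (paths are assumptions; the inputs are the cropped 5-year built-up layers described in the folder structure above):

    library(terra)

    built_files <- list.files("built_up/derived_data",
                              pattern = "^Levant_built_up_.*\\.tif$", full.names = TRUE)
    built <- rast(built_files)   # one layer per epoch, same extent and resolution
    built_change <- sum(built)   # cell-wise sum: higher values = built up in more epochs
    writeRaster(built_change, "Levant_built_up_change_1975_2022.tif", overwrite = TRUE)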


    6_POPULATION_numbers_plot.R


    For this plot, one needs to load the .csv-file “Socio_cultural_political_development_database_FAO2023.csv” from the repository. The ggplot script provided produces the desired plot with all countries under consideration.


    7_YIELD_plot.R


    In this section, we use the country productivity data from the supplement in the repository folder “yield_productivity” (e.g., "Jordan_yield.csv"). Each of the single-country yield datasets is plotted in a ggplot, and the plots are combined using the patchwork package in R.


    8_GLDAS_read_extract_trend


    The last code chunk provides the basis for the trend analysis of the climate variables used in the paper. The raw data can be accessed at https://disc.gsfc.nasa.gov/datasets?keywords=GLDAS%20Noah%20Land%20Surface%20Model%20L4%20monthly&page=1 (last accessed 9 October 2023). The raw data come in .nc file format, and various variables can be extracted using the [“^a variable name”] command from the SpatRaster collection. Each time you run the code, this variable name must be adjusted to the variable of interest (see this link for abbreviations: https://disc.gsfc.nasa.gov/datasets/GLDAS_CLSM025_D_2.0/summary, last accessed 9 October 2023; or see the respective code chunk when reading a .nc file with the ncdf4 package in R, or run print(nc) from the code, or use names() on the SpatRaster collection).
    Choosing one variable, the code uses the MERGED_LEVANT.shp mask from the repository to crop and mask the data to the outline of the study area.
    From the processed data, trend analyses are conducted and z-scores are calculated following the code described above. However, annual trends require the frequency of the time series analysis to be set to value = 12. Regarding, e.g., rainfall, which is measured as annual sums and not means, the chunk r.sum=r.sum/12 has to be removed or set to r.sum=r.sum/1 to avoid calculating annual mean values (see other variables). Seasonal subsets can be calculated as described in the code. Here, 3-month subsets were chosen for the growing seasons, e.g., March-May (MAM), June-August (JJA), September-November (SON), and DJF (December-February, including Jan/Feb of the consecutive year).
    From the data, mean values of 48 consecutive years are calculated and trend analyses are performed as described above. In the same way, p-values are extracted and 95% confidence level values are marked with dots on the raster plot. This analysis can be performed with a much longer time series, other variables, and different spatial extents across the globe due to the availability of the GLDAS variables.

  9. Scripts for Analysis

    • figshare.com
    txt
    Updated Jul 18, 2018
    Cite
    Sneddon Lab UCSF (2018). Scripts for Analysis [Dataset]. http://doi.org/10.6084/m9.figshare.6783569.v2
    Available download formats: txt
    Dataset updated
    Jul 18, 2018
    Dataset provided by
    figshare
    Authors
    Sneddon Lab UCSF
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Scripts used for analysis of V1 and V2 datasets:

    • seurat_v1.R - initialize Seurat object from 10X Genomics cellranger outputs. Includes filtering, normalization, regression, variable gene identification, PCA analysis, clustering, and tSNE visualization. Used for v1 datasets.
    • merge_seurat.R - merge two or more Seurat objects into one Seurat object. Performs linear regression to remove batch effects from separate objects. Used for v1 datasets.
    • subcluster_seurat_v1.R - subcluster clusters of interest from a Seurat object. Determines variable genes, performs regression and PCA. Used for v1 datasets.
    • seurat_v2.R - initialize Seurat object from 10X Genomics cellranger outputs. Includes filtering, normalization, regression, variable gene identification, and PCA analysis. Used for v2 datasets.
    • clustering_markers_v2.R - clustering and tSNE visualization for v2 datasets.
    • subcluster_seurat_v2.R - subcluster clusters of interest from a Seurat object. Determines variable genes, performs regression and PCA analysis. Used for v2 datasets.
    • seurat_object_analysis_v1_and_v2.R - downstream analysis and plotting functions for the Seurat object created by seurat_v1.R or seurat_v2.R.
    • merge_clusters.R - merge clusters that do not meet the gene threshold. Used for both v1 and v2 datasets.
    • prepare_for_monocle_v1.R - subcluster cells of interest and perform linear regression, but not scaling, in order to input normalized, regressed values into Monocle with monocle_seurat_input_v1.R.
    • monocle_seurat_input_v1.R - Monocle script using Seurat batch-corrected values as input for v1 merged timecourse datasets.
    • monocle_lineage_trace.R - Monocle script using nUMI as input for the v2 lineage-traced dataset.
    • monocle_object_analysis.R - downstream analysis for the Monocle object: BEAM and plotting.
    • CCA_merging_v2.R - script for merging v2 endocrine datasets with canonical correlation analysis and determining the number of CCs to include in downstream analysis.
    • CCA_alignment_v2.R - script for downstream alignment, clustering, tSNE visualization, and differential gene expression analysis.
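
    As a minimal, hedged illustration of the merging step performed by merge_seurat.R (sample paths and object names are hypothetical; the actual script also performs the v1-specific filtering and regression settings):

    library(Seurat)

    # Build two Seurat objects from cellranger output and merge them,
    # then regress out the sample of origin as a simple batch covariate.
    s1 <- CreateSeuratObject(counts = Read10X("sample1/filtered_feature_bc_matrix"),
                             project = "s1")
    s2 <- CreateSeuratObject(counts = Read10X("sample2/filtered_feature_bc_matrix"),
                             project = "s2")

    combined <- merge(s1, y = s2, add.cell.ids = c("s1", "s2"))
    combined <- NormalizeData(combined)
    combined <- FindVariableFeatures(combined)
    combined <- ScaleData(combined, vars.to.regress = "orig.ident")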

  10. R Script - Sequence Merging Stats, Reorganizing and Averaging Replicate...

    • figshare.com
    txt
    Updated May 18, 2021
    Cite
    Lindsay Putman (2021). R Script - Sequence Merging Stats, Reorganizing and Averaging Replicate Samples [Dataset]. http://doi.org/10.6084/m9.figshare.14605722.v1
    Available download formats: txt
    Dataset updated
    May 18, 2021
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Lindsay Putman
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    R Script used to assess if the merging of sequence data from different sequencing centers was successful, and to reorganize count table data columns and average technical replicates.

  11. NASA DC-8 1 Minute Data Merge

    • data.ucar.edu
    ascii
    Updated Aug 1, 2025
    + more versions
    Cite
    Gao Chen; Jennifer R. Olson; Michael Shook (2025). NASA DC-8 1 Minute Data Merge [Dataset]. http://doi.org/10.26023/VM9C-1C16-H003
    Available download formats: ascii
    Dataset updated
    Aug 1, 2025
    Authors
    Gao Chen; Jennifer R. Olson; Michael Shook
    Time period covered
    May 1, 2012 - Jun 30, 2012
    Area covered
    Description

    This dataset contains NASA DC-8 1 Minute Data Merge data collected during the Deep Convective Clouds and Chemistry Experiment (DC3) from 18 May 2012 through 22 June 2012. This dataset contains updated data provided by NASA. In most cases, variable names have been kept identical to those submitted in the raw data files. However, in some cases, names have been changed (e.g., to eliminate duplication). Units have been standardized throughout the merge. In addition, a "grand merge" has been provided, which includes data from all the individual merged flights throughout the mission. The grand merge follows the naming convention "dc3-mrg60-dc8_merge_YYYYMMdd_R5_thruYYYYMMdd.ict" (with "_thruYYYYMMdd" indicating the last flight date included). This dataset is in ICARTT format. Please see the header portion of the data files for details on instruments, parameters, quality assurance, quality control, contact information, and dataset comments. For more information on updates to this dataset, please see the readme file.

  12. DeepMerge II: Building Robust Deep Learning Algorithms for Merging Galaxy...

    • data.niaid.nih.gov
    Updated Feb 26, 2021
    Cite
    Aleksandra Ciprijanovic; Gregory Snyder; Kathryn Downey; Diana Kafkes (2021). DeepMerge II: Building Robust Deep Learning Algorithms for Merging Galaxy Identification Across Domains (Data) [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4507940
    Dataset updated
    Feb 26, 2021
    Dataset provided by
    Space Telescope Science Institute (https://stsci.edu/)
    University of Chicago
    Fermi National Accelerator Laboratory
    Authors
    Aleksandra Ciprijanovic; Gregory Snyder; Kathryn Downey; Diana Kafkes
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    We present the data used in "DeepMerge II: Building Robust Deep Learning Algorithms for Merging Galaxy Identification Across Domains". In this paper, we test domain adaptation techniques, such as Maximum Mean Discrepancy (MMD) and adversarial training with Domain Adversarial Neural Networks (DANNs) for cross-domain studies of merging galaxies. Domain adaptation is performed between two simulated datasets of various levels of observational realism (simulation-to-simulation experiments), and between simulated data and observed telescope images (simulation-to-real experiments). For more details about the datasets please see the paper mentioned above.

    Simulation-to-Simulation Experiments

    Data used to study distant merging galaxies using simulated images from the Illustris-1 cosmological simulation at redshift z=2. The images are 75x75 pixels with three filters applied that mimic Hubble Space Telescope (HST) observations (ACS F814W, NC F356W, WFC3 F160W) with added point-spread function (PSF) and with or without observational noise.

    Source Domain

    • Images: SimSim_SOURCE_X_Illustris2_pristine.npy
    • Labels: SimSim_SOURCE_y_Illustris2_pristine.npy

    Target Domain

    • Images: SimSim_TARGET_X_Illustris2_noisy.npy
    • Labels: SimSim_TARGET_y_Illustris2_noisy.npy

    Simulation-to-Real Experiments

    Data used to study nearby merging galaxies using simulated Illustris-1 images at redshift z=0 and observed Sloan Digital Sky Survey (SDSS) images from the Galaxy Zoo project. All images have three filters. SDSS images have (g,r,i) filters, while simulated Illustris images also mimic the same three SDSS filters with added effects of dust, PSF and observational noise.

    Source Domain

    • Images: SimReal_SOURCE_X_Illustris0.npy
    • Labels: SimReal_SOURCE_y_Illustris0.npy

    Target Domain

    • Images: SimReal_TARGET_X_postmergers_SDSS.npy
    • Labels: SimReal_TARGET_y_postmergers_SDSS.npy

  13. Replication Data for: Wake merging and turbulence transition downstream of...

    • search.dataone.org
    • dataverse.no
    Updated Jun 3, 2025
    Cite
    Hearst, R. Jason; Berstad, Fanny Olivia Johannessen; Neunaber, Ingrid (2025). Replication Data for: Wake merging and turbulence transition downstream of side-by-side porous discs [Dataset]. http://doi.org/10.18710/XAEWC5
    Dataset updated
    Jun 3, 2025
    Dataset provided by
    DataverseNO
    Authors
    Hearst, R. Jason; Berstad, Fanny Olivia Johannessen; Neunaber, Ingrid
    Description

    These are the streamwise velocity time series measured in the wakes of two sets of porous discs in a side-by-side setting, as used in the manuscript "Wake merging and turbulence transition downstream of side-by-side porous discs", which has been accepted by the Journal of Fluid Mechanics. Data were obtained by means of hot-wire anemometry in the Large Scale Wind Tunnel at the Norwegian University of Science and Technology in near-laminar inflow (background turbulence intensity of approximately 0.3%) at an inflow velocity of 10 m/s (diameter-based Reynolds number 125000). Two types of porous discs with diameter D = 0.2 m, one with uniform blockage and one with radially changing blockage, were used. Three spacings, namely 1.5D, 2D and 3D, were investigated. Spanwise profiles were measured at 8D and 30D downstream for each case, and a streamwise profile along the centerline between the discs was additionally obtained. In addition, measurements downstream of both disc types (single disc setting) are provided as a comparison. The scope of these experiments was to study the merging mechanisms of the turbulence when the two wakes meet.

  14. Data from: Merged acoustic-backscatter imagery collected in 2005, 2007, and...

    • datasets.ai
    • s.cnmilf.com
    • +1more
    55
    Cite
    Department of the Interior, Merged acoustic-backscatter imagery collected in 2005, 2007, and 2010, Skagit Bay, Washington [Dataset]. https://datasets.ai/datasets/merged-acoustic-backscactter-imagery-collected-in-2005-2007-and-2010-skagit-bay-washington
    Available download formats: 55
    Dataset authored and provided by
    Department of the Interior
    Description

    These metadata describe the U.S. Geological Survey (USGS), Pacific Coastal and Marine Science Center (PCMSC) merged acoustic-backscatter imagery that was collected in 2005, 2007, and 2010 in Skagit Bay, Washington, and is provided as a 5-m resolution TIFF image. In 2004, 2005, 2007, and 2010 the U.S. Geological Survey (USGS), Pacific Coastal and Marine Science Center (PCMSC) collected bathymetry and acoustic backscatter data in Skagit Bay, Washington using an interferometric bathymetric sidescan sonar system mounted to the USGS R/V Parke Snavely and the USGS R/V Karluk. The research was conducted in coordination with the Swinomish Indian Tribal Community, Skagit River System Cooperative, Skagit Watershed Council, Puget Sound Nearshore Ecosystem Restoration Project, and U.S. Army Corps of Engineers to characterize estuarine habitats and processes, including the sediment budget of the Skagit River and the influence of river-delta channelization on sediment transport. Information quantifying the distribution of habitats and the extent to which sediment transport influences habitats and the morphology of the delta is useful for planning for salmon recovery, agricultural resilience, flood risk protection, and coastal change associated with sea-level rise.

  15. Supplementary materials for the paper: "Designing Multi-Modal Communication...

    • data.4tu.nl
    zip
    Updated Oct 15, 2025
    Cite
    Ruolin Gao; Haoyu Liu; Pavlo Bazilinskyy; Marieke Martens (2025). Supplementary materials for the paper: "Designing Multi-Modal Communication for Merge Negotiation with Automated Vehicles: Insights from a Design Exploration with Prototypes" [Dataset]. http://doi.org/10.4121/be60bbb2-5f7d-4ac9-b755-2fae8ffe061c.v1
    Available download formats: zip
    Dataset updated
    Oct 15, 2025
    Dataset provided by
    4TU.ResearchData
    Authors
    Ruolin Gao; Haoyu Liu; Pavlo Bazilinskyy; Marieke Martens
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Supplementary materials for the paper: Gao, R., Liu, H., Bazilinskyy, P., & Martens, M. (2025, September). Designing Multi-Modal Communication for Merge Negotiation with Automated Vehicles: Insights from a Design Exploration with Prototypes. In Adjunct Proceedings of the 17th International Conference on Automotive User Interfaces and Interactive Vehicular Applications (pp. 40-46).

    Supplementary materials include Arduino code for simple physical interfaces and video stimuli that depict merge scenarios from a driver’s perspective with the corresponding visual HMI (e.g., HUD) where applicable.

  16. Merging Resource Availability with Isotope Mixing Models: The Role of...

    • plos.figshare.com
    ai
    Updated Jun 1, 2023
    Cite
    Justin D. Yeakel; Mark Novak; Paulo R. Guimarães Jr.; Nathaniel J. Dominy; Paul L. Koch; Eric J. Ward; Jonathan W. Moore; Brice X. Semmens (2023). Merging Resource Availability with Isotope Mixing Models: The Role of Neutral Interaction Assumptions [Dataset]. http://doi.org/10.1371/journal.pone.0022015
    Available download formats: ai
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Justin D. Yeakel; Mark Novak; Paulo R. Guimarães Jr.; Nathaniel J. Dominy; Paul L. Koch; Eric J. Ward; Jonathan W. Moore; Brice X. Semmens
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Background: Bayesian mixing models have allowed for the inclusion of uncertainty and prior information in the analysis of trophic interactions using stable isotopes. Formulating prior distributions is relatively straightforward when incorporating dietary data. However, the use of data that are related, but not directly proportional, to diet (such as prey availability data) is often problematic because such information is not necessarily predictive of diet, and the information required to build a reliable prior distribution for all prey species is often unavailable. Omitting prey availability data impacts the estimation of a predator's diet and introduces the strong assumption of consumer ultrageneralism (where all prey are consumed in equal proportions), particularly when multiple prey have similar isotope values. Methodology: We develop a procedure to incorporate prey availability data into Bayesian mixing models conditional on the similarity of isotope values between two prey. If a pair of prey have similar isotope values (resulting in highly uncertain mixing model results), our model increases the weight of availability data in estimating the contribution of prey to a predator's diet. We test the utility of this method in an intertidal community against independently measured feeding rates. Conclusions: Our results indicate that our weighting procedure increases the accuracy by which consumer diets can be inferred in situations where multiple prey have similar isotope values. This suggests that the exchange of formalism for predictive power is merited, particularly when the relationship between prey availability and a predator's diet cannot be assumed for all species in a system.

  17. Code for 'Food-insecure women eat a less diverse diet in a more temporally...

    • zenodo.org
    Updated Jan 24, 2020
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Daniel Nettle; Melissa Bateson; Daniel Nettle; Melissa Bateson (2020). Code for 'Food-insecure women eat a less diverse diet in a more temporally variable way: Evidence from the US National Health and Nutrition Examination Survey, 2013-4' [Dataset]. http://doi.org/10.5281/zenodo.2649032
    Dataset updated
    Jan 24, 2020
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Daniel Nettle; Melissa Bateson; Daniel Nettle; Melissa Bateson
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Code to reproduce the analyses in the study 'Food-insecure women eat a less diverse diet in a more temporally variable way: Evidence from the US National Health and Nutrition Examination Survey, 2013-4'

    The analysis requires two R scripts available here, plus original 2013-4 NHANES data files, downloadable from the NHANES website (https://wwwn.cdc.gov/nchs/nhanes/continuousnhanes/default.aspx?BeginYear=2013).

    The first R script, 'merging script.r', takes the original NHANES files, extracts the variables required for the study, merges them into a single data frame, and saves this in .csv format (a minimal sketch of this kind of merge is given after the list of files). The NHANES files it requires are:

    # Demographics, food insecurity and BMI
    DEMO_H.XPT
    FSQ_H.XPT
    BMX_H.XPT

    # Summary files of food recalls
    DR1TOT_H.XPT
    DR2TOT_H.XPT

    # Individual foods files from food recalls
    DR1FF_H.XPT
    DR2FF_H.XPT
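
    A minimal, hedged sketch of the summary-file merge that 'merging script.r' performs (the real script selects many more variables and also handles the individual foods files; SEQN is the NHANES respondent identifier):

    library(haven)

    demo <- read_xpt("DEMO_H.XPT")
    fsq  <- read_xpt("FSQ_H.XPT")
    bmx  <- read_xpt("BMX_H.XPT")
    dr1  <- read_xpt("DR1TOT_H.XPT")
    dr2  <- read_xpt("DR2TOT_H.XPT")

    # Join the one-row-per-respondent files on SEQN and save as .csv.
    nhanes <- Reduce(function(x, y) merge(x, y, by = "SEQN", all.x = TRUE),
                     list(demo, fsq, bmx, dr1, dr2))
    write.csv(nhanes, "nhanes_merged.csv", row.names = FALSE)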

    The second R script takes the .csv file output by the merging script, and reproduces the analyses described in the paper.

    Uploaded by Daniel Nettle, April 23rd 2019.

  18. SBI Cruise NBP03-04a merged bottle dataset

    • dataone.org
    • arcticdata.io
    • +1more
    Updated Oct 22, 2016
    + more versions
    Cite
    Nick R. Bates; Dennis Hansell; Steven Roberts; Service Group, Scripps Institution of Oceanography, University of California - San Diego (2016). SBI Cruise NBP03-04a merged bottle dataset [Dataset]. http://doi.org/10.5065/D6DR2SK4
    Dataset updated
    Oct 22, 2016
    Dataset provided by
    Arctic Data Center
    Authors
    Nick R. Bates; Dennis Hansell; Steven Roberts; Service Group, Scripps Institution of Oceanography, University of California - San Diego
    Time period covered
    Jul 5, 2003 - Aug 20, 2003
    Area covered
    Description

    This data set contains merged bottle data from the SBI cruise on the United States Coast Guard Cutter (USCGC) Nathaniel B. Palmer (NBP03-04a). During this cruise, rosette casts were conducted and a bottle data file was generated by the Scripps Service group from these water samples. Additional groups were funded to measure supplementary parameters from these same water samples. This data set is the first version of the merge of the Scripps Service group bottle data file with the data gathered by these additional groups.

  19. R scripts used to analyze rodent call statistics generated by 'DeepSqueak'

    • figshare.com
    zip
    Updated May 28, 2021
    Cite
    Mathijs Blom (2021). R scripts used to analyze rodent call statistics generated by 'DeepSqueak' [Dataset]. http://doi.org/10.6084/m9.figshare.14696304.v1
    Available download formats: zip
    Dataset updated
    May 28, 2021
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Mathijs Blom
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The scripts in this folder were used to combine all call statistic files per day into one file, resulting in nine files containing all call statistics per day. The script ‘merging_dataset.R’ was used to combine all days' worth of call statistics and create subsets of two frequency ranges (18-32 and 32-96). The script ‘camera_data’ was used to combine all camera and observation data.
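
    A hedged sketch of this kind of combination step (folder, file, and column names such as "frequency_khz" are assumptions, not the actual scripts):

    # Combine the per-day DeepSqueak call-statistics exports into one table
    # and split it into the two frequency ranges mentioned above.
    files <- list.files("call_stats", pattern = "\\.csv$", full.names = TRUE)
    calls <- do.call(rbind, lapply(files, read.csv))

    low_band  <- subset(calls, frequency_khz >= 18 & frequency_khz < 32)
    high_band <- subset(calls, frequency_khz >= 32 & frequency_khz <= 96)

    write.csv(low_band,  "calls_18_32.csv", row.names = FALSE)
    write.csv(high_band, "calls_32_96.csv", row.names = FALSE)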

  20. Data from: Transfer learning reveals sequence determinants of the...

    • zenodo.org
    application/gzip, zip
    Updated May 29, 2024
    Cite
    Sahin Naqvi; Sahin Naqvi (2024). Transfer learning reveals sequence determinants of the quantitative response to transcription factor dosage [Dataset]. http://doi.org/10.5281/zenodo.11224809
    Available download formats: application/gzip, zip
    Dataset updated
    May 29, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Sahin Naqvi; Sahin Naqvi
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Processed data and code for "Transfer learning reveals sequence determinants of the quantitative response to transcription factor dosage," Naqvi et al 2024.

    Directory is organized into 4 subfolders, each tar'ed and gzipped:

    data_analysis.tar.gz - Processed data for modulation of TWIST1 levels and calculation of RE responsiveness to TWIST1 dosage

    • atac_design.txt - design matrix for ATAC-seq TWIST1 titration samples
    • all.sub.150bpclust.greater2.500bp.merge.TWIST1.titr.ATAC.counts.txt - ATAC-seq counts from all samples over all reproducible ATAC-seq peak regions, as defined in Naqvi et al 2023
    • atac_deseq_fitmodels_moded50.R - R code for calculating new version of ED50 and response to full depletion from TWIST1 titration data (note, uses drm.R function from 10.5281/zenodo.7689948, install drc() with this version to avoid errors)

    baseline_models.tar.gz - Code and data for training baseline models to predict RE responsiveness to SOX9/TWIST1 dosage

    • {sox9|twist1}.{0v100|ed50}.{train|valid|test}.txt - Training/testing/validation data (ED50 or full TF depletion effect for SOX9 or TWIST1), split into train/test/validation folds
    • HOCOMOCOv11_core_HUMAN_mono_jaspar_format.all.sub.150bpclust.greater2.500bp.merge.minus300bp.p01.maxscore.mat.cpg.gc.basemean.txt.gz - matrix of predictors for all REs. Quantitative encoding of PWM match for all HOCOMOCO motifs + CpG + GC content, plus unperturbed ATAC-seq signal
    • train_baseline.R - R code to train baseline (LASSO regression or random forest) models using predictor matrix and the provided training data.
      • Note: training the random forest to predict full TF depletion is computationally intensive because it is across all REs, if doing this run on CPU for ~6 hrs.

    chrombpnet_models.tar.gz - Remainder of code, data, and models for fine-tuning and interpreting ChromBPNet models to predict RE responsiveness to SOX9/TWIST1 dosage

    • Fine-tuning code, data, models
      • {all|sox9.direct|twist1.bound.down}.{train|valid|test}.{ed50|0v100.log2fc}.txt - Training/testing/validation data (ED50 or full TF depletion effect for SOX9 or TWIST1), split into train/test/validation folds
      • pretrained.unperturbed.chrombpnet.h5 - Pretrained model of unperturbed ATAC-seq signal in CNCCs, obtained by running ChromBPNet (https://github.com/kundajelab/chrombpnet) on DMSO-treated SOX9/TWIST1-tagged ATAC-seq data
      • finetune_chrombpnet.py - code for fine-tuning the pretrained model for any of the relevant prediction tasks (ED50/ effect of full TF depletion for SOX9/TWIST1)
      • best.model.chrombpnet.{0v100|ed50}.{sox9|twist1}.h5 - output of finetune_chrombpnet.py, best model after 10 training epochs for the indicated task
      • chrombpnet.{0v100|ed50}.{sox9|twist1}.contrib.{h5|bw} - contribution scores for the indicated predictive model, obtained by running chrombpnet contribs_bw on the corresponding model h5 file.
      • chrombpnet.{0v100|ed50}.{sox9|twist1}.contrib.modisco.{h5|bw} - TF-MoDIsCo output from the corresponding contribution score file
    • Interpretation code, data, models
      • contrib_h5_to_projshap_npy.py - code to convert contrib .h5 files into .npy files containing projected SHAP scores (required because the CWM matching code takes this format of contribution scores)
      • sox9.direct.10col.bed, twist1.bound.down.10col.uniq.bed - regions over which CWMs will be matched (likely direct targets of each TF)
      • match_cwms.py - Python code to match individual CWM instances. Takes as input: modisco .h5 file, SHAP .npy file, bed file of regions to be matched. Output is a bed file of all CWM matches (not pruned, contains many redundant matches).
      • chrombpnet.ed50.{sox9|twist1}.contrib.perc05.matchperc10.allmatch.bed - output of match_cwms.py
      • take_max_overlap.py - code to merge output of match_cwms.py into clusters, and then take the maximum (length-normalized) match score in each cluster as the representative CWM match of that cluster. Requires upstream bedtools commands to be piped in, see example usage in file.
      • chrombpnet.ed50.{sox9|twist1}.contrib.perc05.matchperc10.allmatch.maxoverlap.bed - output of take_max_overlap.py. These CWM instances are the ones used throughout the paper.

    modisco_reports.zip - TF-MoDIsCo reports from running on the fine-tuned ChromBPNet models

    • modisco_report_{sox9|twist1}_{0v100|ed50}: folders containing images of discovered CWMs and HTMLs/PDFs of summarized reports from running TF-MoDisCo on the indicated fine-tuned ChromBPNet model

    mirny_model.tar.gz - Code and data for analyzing and fitting Mirny model of TF-nucleosome competition to observed RE dosage response curves

    • twist1.strong.multi.only.ed50.cutoff.true.hill.txt - ED50 and signed hill coefficients for all TWIST1-dependent REs with only buffering Coordinators (mostly one or two) and no other TFs' binding sites. "ed50_new" is the ED50 calculation used in this paper.
    • twist1.strong.weak{1|2|3}.ed50.cutoff.true.hill.txt - ED50 and signed hill coefficients for all TWIST1-dependent REs with only buffering Coordinators (mostly one or two) and the indicated number of sensitizing (weak) Coordinators and no other TFs' binding sites. "ed50_new" is the ED50 calculation used in this paper.
    • MirnyModelAnalysis.py - Python code for analysis of Mirny model of TF-nucleosome competition. Contains implementations of analytic solutions, as well as code to fit model to observed ED50 and hill coefficients in the provided data files.