69 datasets found
  1. NSF/NCAR GV HIAPER 1 Minute Data Merge

    • data.ucar.edu
    ascii
    Updated Oct 7, 2025
    Cite
    Gao Chen; Jennifer R. Olson; Michael Shook (2025). NSF/NCAR GV HIAPER 1 Minute Data Merge [Dataset]. http://doi.org/10.26023/R1RA-JHKZ-W913
    Available download formats: ascii
    Dataset updated
    Oct 7, 2025
    Authors
    Gao Chen; Jennifer R. Olson; Michael Shook
    Time period covered
    May 18, 2012 - Jun 30, 2012
    Area covered
    Description

    This data set contains NSF/NCAR GV HIAPER 1 Minute Data Merge data collected during the Deep Convective Clouds and Chemistry Experiment (DC3) from 18 May 2012 through 30 June 2012. These are updated merges from the NASA DC3 archive that were made available 13 June 2014. In most cases, variable names have been kept identical to those submitted in the raw data files. However, in some cases, names have been changed (e.g., to eliminate duplication). Units have been standardized throughout the merge. In addition, a "grand merge" has been provided. This includes data from all the individual merged flights throughout the mission. This grand merge follows the naming convention "dc3-mrg60-gV_merge_YYYYMMdd_R5_thruYYYYMMdd.ict" (with the suffix "_thruYYYYMMdd" indicating the last flight date included). This data set is in ICARTT format. Please see the header portion of the data files for details on instruments, parameters, quality assurance, quality control, contact information, and data set comments.
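
    As a hedged illustration of the naming convention quoted above, this R sketch pulls the merge date and the "thru" date out of a grand-merge file name; the example name is constructed from the documented pattern, not taken from an actual file listing.

    ```r
    # Hypothetical file name built from the documented convention
    fn <- "dc3-mrg60-gV_merge_20120518_R5_thru20120630.ict"

    # Capture the two YYYYMMdd fields and parse them as dates
    m <- regmatches(fn, regexec("merge_(\\d{8})_R5_thru(\\d{8})", fn))[[1]]
    merge_date <- as.Date(m[2], format = "%Y%m%d")
    thru_date  <- as.Date(m[3], format = "%Y%m%d")
    ```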

  2. Data from: KORUS-AQ Aircraft Merge Data Files

    • catalog.data.gov
    • access.earthdata.nasa.gov
    Updated Aug 22, 2025
    Cite
    NASA/LARC/SD/ASDC (2025). KORUS-AQ Aircraft Merge Data Files [Dataset]. https://catalog.data.gov/dataset/korus-aq-aircraft-merge-data-files
    Dataset updated
    Aug 22, 2025
    Dataset provided by
    NASA (http://nasa.gov/)
    Description

    KORUSAQ_Merge_Data are pre-generated merge data files combining various products collected during the KORUS-AQ field campaign. This collection features pre-generated merge files for the DC-8 aircraft. Data collection for this product is complete.

    The KORUS-AQ field study was conducted in South Korea during May-June 2016. The study was jointly sponsored by NASA and Korea’s National Institute of Environmental Research (NIER). The primary objectives were to investigate the factors controlling air quality in Korea (e.g., local emissions, chemical processes, and transboundary transport) and to assess future air quality observing strategies incorporating geostationary satellite observations. To achieve these science objectives, KORUS-AQ adopted a highly coordinated sampling strategy involving surface and airborne measurements with both in-situ and remote sensing instruments.

    Surface observations provided details on ground-level air quality conditions, while airborne sampling provided an assessment of conditions aloft relevant to satellite observations and necessary to understand the role of emissions, chemistry, and dynamics in determining air quality outcomes. The sampling region covers the South Korean peninsula and surrounding waters, with a primary focus on the Seoul Metropolitan Area. Airborne sampling was primarily conducted from near the surface to about 8 km, with extensive profiling to characterize the vertical distribution of pollutants and their precursors. The airborne observational data were collected from three aircraft platforms: the NASA DC-8, NASA B-200, and Hanseo King Air. Surface measurements were conducted from 16 ground sites and 2 ships: R/V Onnuri and R/V Jang Mok.

    The major data products collected from both the ground and air include in-situ measurements of trace gases (e.g., ozone, reactive nitrogen species, carbon monoxide and dioxide, methane, non-methane and oxygenated hydrocarbon species), aerosols (e.g., microphysical and optical properties and chemical composition), active remote sensing of ozone and aerosols, and passive remote sensing of NO2, CH2O, and O3 column densities. These data products support research focused on examining the impact of photochemistry and transport on ozone and aerosols, evaluating emissions inventories, and assessing the potential use of satellite observations in air quality studies.

  3. dataset-pinkball-first-merge

    • huggingface.co
    Updated Dec 1, 2025
    Cite
    Thomas R (2025). dataset-pinkball-first-merge [Dataset]. https://huggingface.co/datasets/treitz/dataset-pinkball-first-merge
    Dataset updated
    Dec 1, 2025
    Authors
    Thomas R
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    This dataset was created using LeRobot.

      Dataset Structure

    meta/info.json (truncated):

    {
      "codebase_version": "v3.0",
      "robot_type": "so101_follower",
      "total_episodes": 40,
      "total_frames": 10385,
      "total_tasks": 1,
      "chunks_size": 1000,
      "data_files_size_in_mb": 100,
      "video_files_size_in_mb": 200,
      "fps": 30,
      "splits": { "train": "0:40" },
      "data_path": "data/chunk-{chunk_index:03d}/file-{file_index:03d}.parquet",
      "video_path": …
    }

    See the full description on the dataset page: https://huggingface.co/datasets/treitz/dataset-pinkball-first-merge.
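
    As a hedged aside, the data_path template above can be expanded in R with zero-padded indices; the chunk and file indices here are hypothetical.

    ```r
    # Expand the data_path template from meta/info.json for a given chunk/file
    chunk_index <- 0
    file_index  <- 0
    sprintf("data/chunk-%03d/file-%03d.parquet", chunk_index, file_index)
    # [1] "data/chunk-000/file-000.parquet"
    ```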

  4. Reddit's /r/Gamestop

    • kaggle.com
    zip
    Updated Nov 28, 2022
    Cite
    The Devastator (2022). Reddit's /r/Gamestop [Dataset]. https://www.kaggle.com/datasets/thedevastator/gamestop-inc-stock-prices-and-social-media-senti
    Available download formats: zip (186464492 bytes)
    Dataset updated
    Nov 28, 2022
    Authors
    The Devastator
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Reddit's /r/Gamestop

    Merge this dataset with GameStop price data to study how the chat impacted the stock.

    By SocialGrep [source]

    About this dataset

    The stonks movement spawned by this community is a very interesting one. It's rare to see an Internet meme have such an effect on the real-world economy - yet here we are.

    This dataset contains a collection of posts and comments mentioning GME in their title and body text respectively. The data is procured using SocialGrep. The posts and the comments are labelled with their score.

    It will be interesting to see, using this new dataset, how this affected stock market prices in the aftermath.


    How to use the dataset

    The files contain posts and comments from Reddit mentioning GME, together with their scores. They can be used to analyze how sentiment on GME affected its stock price in the aftermath. A hedged sketch of such a merge follows.
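
    This R sketch aggregates daily sentiment from the comments file (columns as listed in the tables below) and joins a user-supplied price file; "gme-prices.csv" and its columns (date, close) are hypothetical, not part of this dataset.

    ```r
    library(dplyr)
    library(lubridate)

    comments <- read.csv("six-months-of-gme-on-reddit-comments.csv")

    # Aggregate to one sentiment summary per calendar day
    daily <- comments %>%
      mutate(date = as_date(as_datetime(created_utc))) %>%  # epoch seconds assumed
      group_by(date) %>%
      summarise(n_comments = n(), mean_score = mean(score, na.rm = TRUE))

    # Hypothetical price file with columns date and close
    prices <- read.csv("gme-prices.csv") %>% mutate(date = as_date(date))

    merged <- left_join(daily, prices, by = "date")
    ```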

    Research Ideas

    • To study how social media affects stock prices
    • To study how Reddit affects stock prices
    • To study how the sentiment of a subreddit affects stock prices

    Acknowledgements

    If you use this dataset in your research, please credit the original authors, SocialGrep.

    License

    License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.

    Columns

    File: six-months-of-gme-on-reddit-comments.csv

    | Column name | Description |
    |:---------------|:-------------------------------------------------------|
    | type | The type of post or comment. (String) |
    | subreddit.name | The name of the subreddit. (String) |
    | subreddit.nsfw | Whether the subreddit is NSFW. (Boolean) |
    | created_utc | The time the post or comment was created. (Timestamp) |
    | permalink | The permalink of the post or comment. (String) |
    | body | The body of the post or comment. (String) |
    | sentiment | The sentiment of the post or comment. (String) |
    | score | The score of the post or comment. (Integer) |

    File: six-months-of-gme-on-reddit-posts.csv

    | Column name | Description |
    |:---------------|:-------------------------------------------------------|
    | type | The type of post or comment. (String) |
    | subreddit.name | The name of the subreddit. (String) |
    | subreddit.nsfw | Whether the subreddit is NSFW. (Boolean) |
    | created_utc | The time the post or comment was created. (Timestamp) |
    | permalink | The permalink of the post or comment. (String) |
    | score | The score of the post or comment. (Integer) |
    | domain | The domain of the post or comment. (String) |
    | url | The URL of the post or comment. (String) |
    | selftext | The selftext of the post or comment. (String) |
    | title | The title of the post or comment. (String) |


  5. ESA Soil Moisture Climate Change Initiative (Soil_Moisture_cci): COMBINED...

    • catalogue.ceda.ac.uk
    Updated Oct 16, 2024
    Cite
    Wouter Dorigo; Wolfgang Preimesberger; S. Hahn; R. Van der Schalie; R. De Jeu; R. Kidd; N. Rodriguez-Fernandez; M. Hirschi; P. Stradiotti; T. Frederikse; A. Gruber; D. Duchemin (2024). ESA Soil Moisture Climate Change Initiative (Soil_Moisture_cci): COMBINED product, Version 09.1 [Dataset]. https://catalogue.ceda.ac.uk/uuid/0e346e1e1e164ac99c60098848537a29
    Dataset updated
    Oct 16, 2024
    Dataset provided by
    Centre for Environmental Data Analysis (http://www.ceda.ac.uk/)
    Authors
    Wouter Dorigo; Wolfgang Preimesberger; S. Hahn; R. Van der Schalie; R. De Jeu; R. Kidd; N. Rodriguez-Fernandez; M. Hirschi; P. Stradiotti; T. Frederikse; A. Gruber; D. Duchemin
    License

    https://artefacts.ceda.ac.uk/licences/specific_licences/esacci_soilmoisture_terms_and_conditions_v2.pdf

    Time period covered
    Nov 1, 1978 - Dec 31, 2023
    Area covered
    Earth
    Variables measured
    time, latitude, longitude
    Description

    The Soil Moisture CCI COMBINED dataset is one of three datasets created as part of the European Space Agency's (ESA) Soil Moisture Essential Climate Variable (ECV) Climate Change Initiative (CCI) project. The COMBINED product has been created by directly merging Level 2 scatterometer ('active' remote sensing) and radiometer ('passive' remote sensing) soil moisture products derived from the AMI-WS, ASCAT, SMMR, SSM/I, TMI, AMSR-E, WindSat, FY-3B, FY-3C, FY-3D, AMSR2, SMOS, GPM and SMAP satellite instruments. PASSIVE and ACTIVE products have also been created.

    The v09.1 COMBINED product, provided as global daily images in NetCDF-4 classic file format, presents a global coverage of surface soil moisture at a spatial resolution of 0.25 degrees. It is provided in volumetric units [m3 m-3] and covers the period (yyyy-mm-dd) 1978-11-01 to 2023-12-31. For information regarding the theoretical and algorithmic base of the product, please see the Algorithm Theoretical Baseline Document. Additional reference documents and information relating to the dataset can also be found on the CCI Soil Moisture project website.
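
    As a hedged sketch, a single daily NetCDF-4 file can be read in R as below; the file name and the soil moisture variable name "sm" are assumptions taken from the product documentation (latitude, longitude, and time are the coordinates listed in this record), so check the file headers for the exact names.

    ```r
    library(ncdf4)

    nc  <- nc_open("ESACCI-SOILMOISTURE-L3S-SSMV-COMBINED-20231231000000-fv09.1.nc")
    sm  <- ncvar_get(nc, "sm")        # 0.25-degree global grid, units m3 m-3
    lat <- ncvar_get(nc, "latitude")
    lon <- ncvar_get(nc, "longitude")
    nc_close(nc)
    ```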

    The data set should be cited using the following references:

    1. Gruber, A., Scanlon, T., van der Schalie, R., Wagner, W., and Dorigo, W. (2019). Evolution of the ESA CCI Soil Moisture climate data records and their underlying merging methodology, Earth Syst. Sci. Data, 11, 717–739, https://doi.org/10.5194/essd-11-717-2019

    2. Dorigo, W.A., Wagner, W., Albergel, C., Albrecht, F., Balsamo, G., Brocca, L., Chung, D., Ertl, M., Forkel, M., Gruber, A., Haas, E., Hamer, D. P. Hirschi, M., Ikonen, J., De Jeu, R. Kidd, R. Lahoz, W., Liu, Y.Y., Miralles, D., Lecomte, P. (2017). ESA CCI Soil Moisture for improved Earth system understanding: State-of-the art and future directions. In Remote Sensing of Environment, 2017, ISSN 0034-4257, https://doi.org/10.1016/j.rse.2017.07.001

    3. Preimesberger, W., Scanlon, T., Su, C. -H., Gruber, A. and Dorigo, W., "Homogenization of Structural Breaks in the Global ESA CCI Soil Moisture Multisatellite Climate Data Record," in IEEE Transactions on Geoscience and Remote Sensing, vol. 59, no. 4, pp. 2845-2862, April 2021, doi: 10.1109/TGRS.2020.3012896.

  6. R scripts used to analyze rodent call statistics generated by 'DeepSqueak'

    • figshare.com
    zip
    Updated May 28, 2021
    Cite
    Mathijs Blom (2021). R scripts used to analyze rodent call statistics generated by 'DeepSqueak' [Dataset]. http://doi.org/10.6084/m9.figshare.14696304.v1
    Available download formats: zip
    Dataset updated
    May 28, 2021
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Mathijs Blom
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The scripts in this folder were used to combine all call statistic files per day into one file, resulting in nine files containing all call statistics per day. The script ‘merging_dataset.R’ was used to combine all days' worth of call statistics and create subsets of two frequency ranges (18-32 and 32-96 kHz). The script ‘camera_data’ was used to combine all camera and observation data.
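
    A hedged sketch of that per-day merge pattern: bind all call statistics files into one data frame and subset the two frequency ranges. The file pattern and the column name PrincipalFrequency are assumptions; DeepSqueak's exported headers may differ.

    ```r
    # Bind every per-day call statistics csv into one data frame
    files <- list.files("call_statistics", pattern = "\\.csv$", full.names = TRUE)
    calls <- do.call(rbind, lapply(files, read.csv))

    # Subset the two frequency bands (kHz); column name is an assumption
    band_18_32 <- subset(calls, PrincipalFrequency >= 18 & PrincipalFrequency < 32)
    band_32_96 <- subset(calls, PrincipalFrequency >= 32 & PrincipalFrequency <= 96)
    ```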

  7. R codes and dataset for Visualisation of Diachronic Constructional Change...

    • researchdata.edu.au
    • bridges.monash.edu
    Updated Apr 1, 2019
    Cite
    Gede Primahadi Wijaya Rajeg; Gede Primahadi Wijaya Rajeg (2019). R codes and dataset for Visualisation of Diachronic Constructional Change using Motion Chart [Dataset]. http://doi.org/10.26180/5c844c7a81768
    Explore at:
    Dataset updated
    Apr 1, 2019
    Dataset provided by
    Monash University
    Authors
    Gede Primahadi Wijaya Rajeg; Gede Primahadi Wijaya Rajeg
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    Publication


    Primahadi Wijaya R., Gede. 2014. Visualisation of diachronic constructional change using Motion Chart. In Zane Goebel, J. Herudjati Purwoko, Suharno, M. Suryadi & Yusuf Al Aried (eds.). Proceedings: International Seminar on Language Maintenance and Shift IV (LAMAS IV), 267-270. Semarang: Universitas Diponegoro. doi: https://doi.org/10.4225/03/58f5c23dd8387

    Description of R codes and data files in the repository

    This repository is imported from its GitHub repo. Versioning of this figshare repository is associated with the GitHub repo's Releases, so check the Releases page for updates (the next version is to include a unified, tidyverse-based version of the codes from the first release).

    The raw input data consists of two files (i.e. will_INF.txt and go_INF.txt). They represent the co-occurrence frequency of the top-200 infinitival collocates for will and be going to respectively across the twenty decades of the Corpus of Historical American English (from the 1810s to the 2000s).

    These two input files are used in the R code file 1-script-create-input-data-raw.r. The codes preprocess and combine the two files into a long format data frame consisting of the following columns: (i) decade, (ii) coll (for "collocate"), (iii) BE going to (for frequency of the collocates with be going to) and (iv) will (for frequency of the collocates with will); it is available in the input_data_raw.txt.

    Then, the script 2-script-create-motion-chart-input-data.R processes the input_data_raw.txt for normalising the co-occurrence frequency of the collocates per million words (the COHA size and normalising base frequency are available in coha_size.txt). The output from the second script is input_data_futurate.txt.
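
    A hedged sketch of that normalisation step: convert raw co-occurrence frequencies to per-million-word rates using the decade sizes of COHA. The column names (decade, will, corpus_size) are illustrative assumptions; the actual names are defined in input_data_raw.txt and coha_size.txt.

    ```r
    raw  <- read.delim("input_data_raw.txt")
    size <- read.delim("coha_size.txt")   # decade and its token count, say

    # Join on decade and normalise to a per-million-word rate
    merged <- merge(raw, size, by = "decade")
    merged$will_pmw <- merged$will / merged$corpus_size * 1e6
    ```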

    Next, input_data_futurate.txt contains the relevant input data for generating (i) the static motion chart as an image plot in the publication (using the script 3-script-create-motion-chart-plot.R), and (ii) the dynamic motion chart (using the script 4-script-motion-chart-dynamic.R).

    The repository adopts the project-oriented workflow in RStudio; double-click on the Future Constructions.Rproj file to open an RStudio session whose working directory is associated with the contents of this repository.

  8. c

    Code to Create a Combined British Household Panel Survey and Understanding...

    • datacatalogue.cessda.eu
    Updated Sep 26, 2025
    Cite
    Coulter, R (2025). Code to Create a Combined British Household Panel Survey and Understanding Society Data File, 1991-2020 [Dataset]. http://doi.org/10.5255/UKDA-SN-856507
    Dataset updated
    Sep 26, 2025
    Dataset provided by
    University College London
    Authors
    Coulter, R
    Area covered
    United Kingdom
    Variables measured
    Text unit
    Measurement technique
    This R script was written to extract, combine and clean household panel survey data from the BHPS and UKHLS. The code was written in RStudio 2022.7.2.0 for R 4.2.2.
    Description

    The Modelling Housing Career Trajectories in Great Britain project used a range of datasets to examine how people’s pathways through the British housing system have changed since the 1990s. The project made extensive use of secondary household panel survey data from the British Household Panel Survey (BHPS, 1991-2008) and its successor, Understanding Society (UKHLS, 2009-present).

    This collection consists of project R code written to (1) extract and then (2) combine data from across the various files of BHPS and UKHLS into one ‘master’ longitudinal dataset containing annual records for all observed individuals from 1991 to 2020. A number of key individual and household-level variables (including age, sex, country of birth, partnership status, highest educational qualification, employment status, incomes and region) are then (3) cleaned and harmonised.

    Users can download and adapt the deposited code as needed for their own social science applications.

    The Modelling Housing Career Trajectories in Great Britain project aimed to develop our understanding of how people's pathways through the housing market are changing in 21st Century Britain. To do this, one strand of empirical research used UK household panel survey data (the British Household Panel Survey and its successor, Understanding Society) to examine housing career pathways and homeownership transitions since the 1990s.

    This collection contains R code written to (1) extract BHPS and Understanding Society data, (2) combine various data files from the two collections and (3) produce a basic set of harmonised BHPS-Understanding Society variables to support longitudinal analysis covering the 1991-present period. Users can download and adapt the code for their own research projects.
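
    As a hedged sketch of the extract-and-combine pattern the deposited code implements (not the deposited code itself): read one individual-response file per wave, strip the wave prefix from variable names, and stack the waves into one long file. Paths and the wave range are illustrative.

    ```r
    library(haven)
    library(dplyr)

    waves <- letters[1:10]   # e.g. UKHLS waves a..j

    master <- bind_rows(lapply(waves, function(w) {
      read_dta(file.path("ukhls", paste0(w, "_indresp.dta"))) %>%
        rename_with(~ sub(paste0("^", w, "_"), "", .x)) %>%  # drop wave prefix
        mutate(wave = w)
    }))
    ```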

  9. NASA DC-8 1 Second Data Merge

    • data.ucar.edu
    ascii
    Updated Oct 7, 2025
    Cite
    Gao Chen; Jennifer R. Olson (2025). NASA DC-8 1 Second Data Merge [Dataset]. http://doi.org/10.5065/D6SF2TXB
    Available download formats: ascii
    Dataset updated
    Oct 7, 2025
    Authors
    Gao Chen; Jennifer R. Olson
    Time period covered
    May 18, 2012 - Jun 22, 2012
    Area covered
    Description

    This data set contains NASA DC-8 1 Second Data Merge data collected during the Deep Convective Clouds and Chemistry Experiment (DC3) from 18 May 2012 through 22 June 2012. These merges are an updated version provided by NASA. In most cases, variable names have been kept identical to those submitted in the raw data files. However, in some cases, names have been changed (e.g., to eliminate duplication). Units have been standardized throughout the merge. No "grand merge" has been provided for the 1-second data on the DC-8 aircraft due to its prohibitive size (~1.5 GB). In most cases, downloading the individual merge files for each day and simply concatenating them should suffice. This data set is in ICARTT format. Please see the header portion of the data files for details on instruments, parameters, quality assurance, quality control, contact information, and data set comments. For more information on the updates to this dataset, please see the readme file.
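
    A hedged sketch of that concatenation: an ICARTT (.ict) file states its header length in its first line, so skip to the variable-name row and stack the data blocks. This assumes comma-delimited 1001-format files; the directory name is a placeholder.

    ```r
    # Read one ICARTT file, skipping its self-described header
    read_ict <- function(f) {
      n_header <- as.integer(strsplit(readLines(f, n = 1), ",")[[1]][1])
      read.csv(f, skip = n_header - 1, check.names = FALSE)
    }

    files  <- list.files("dc8_merges", pattern = "\\.ict$", full.names = TRUE)
    merged <- do.call(rbind, lapply(files, read_ict))
    ```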

  10. Multilevel modeling of time-series cross-sectional data reveals the dynamic...

    • data.niaid.nih.gov
    • dataone.org
    zip
    Updated Mar 6, 2020
    Cite
    Kodai Kusano (2020). Multilevel modeling of time-series cross-sectional data reveals the dynamic interaction between ecological threats and democratic development [Dataset]. http://doi.org/10.5061/dryad.547d7wm3x
    Available download formats: zip
    Dataset updated
    Mar 6, 2020
    Dataset provided by
    University of Nevada, Reno
    Authors
    Kodai Kusano
    License

    https://spdx.org/licenses/CC0-1.0.html

    Description

    What is the relationship between environment and democracy? The framework of cultural evolution suggests that societal development is an adaptation to ecological threats. Pertinent theories assume that democracy emerges as societies adapt to ecological factors such as higher economic wealth, lower pathogen threats, less demanding climates, and fewer natural disasters. However, previous research confused within-country processes with between-country processes and erroneously interpreted between-country findings as if they generalize to within-country mechanisms. In this article, we analyze a time-series cross-sectional dataset to study the dynamic relationship between environment and democracy (1949-2016), accounting for previous misconceptions in levels of analysis. By separating within-country processes from between-country processes, we find that the relationship between environment and democracy not only differs by countries but also depends on the level of analysis. Economic wealth predicts increasing levels of democracy in between-country comparisons, but within-country comparisons show that democracy declines as countries become wealthier over time. This relationship is only prevalent among historically wealthy countries but not among historically poor countries, whose wealth also increased over time. By contrast, pathogen prevalence predicts lower levels of democracy in both between-country and within-country comparisons. Our longitudinal analyses identifying temporal precedence reveal that not only reductions in pathogen prevalence drive future democracy, but also democracy reduces future pathogen prevalence and increases future wealth. These nuanced results contrast with previous analyses using narrow, cross-sectional data. As a whole, our findings illuminate the dynamic process by which environment and democracy shape each other.

    Methods

    Our Time-Series Cross-Sectional data combine various online databases. Country names were first identified and matched using the R package “countrycode” (Arel-Bundock, Enevoldsen, & Yetman, 2018) before all datasets were merged. Occasionally, we modified unidentified country names to be consistent across datasets. We then transformed “wide” data into “long” data and merged them using R's Tidyverse framework (Wickham, 2014). Our analysis begins with the year 1949, because one of the key time-variant level-1 variables, pathogen prevalence, was only available from 1949 on. See our Supplemental Material for all data, Stata syntax, R markdown for visualization, supplemental analyses and detailed results (available at https://osf.io/drt8j/).
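
    A hedged sketch of that harmonise-then-reshape step; the toy data frame stands in for the actual databases, whose layouts differ.

    ```r
    library(countrycode)
    library(tidyr)

    # Toy wide-format table: one row per country, one column per year
    wide <- data.frame(country = c("United States", "South Korea"),
                       `1949` = c(1.0, 2.0), `1950` = c(3.0, 4.0),
                       check.names = FALSE)

    # Harmonise country names to a common code
    wide$iso3 <- countrycode(wide$country, origin = "country.name",
                             destination = "iso3c")

    # Reshape wide to long
    long <- pivot_longer(wide, cols = c(`1949`, `1950`),
                         names_to = "year", values_to = "value")
    ```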

  11. Texas GIS Data By County

    • kaggle.com
    zip
    Updated Sep 9, 2022
    Cite
    ItsMundo (2022). Texas GIS Data By County [Dataset]. https://www.kaggle.com/datasets/itsmundo/texas-gis-data-by-county
    Available download formats: zip (11720 bytes)
    Dataset updated
    Sep 9, 2022
    Authors
    ItsMundo
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Area covered
    Texas
    Description

    This dataset was created to be used in my Capstone Project for the Google Data Analytics Professional Certificate. Data was web scraped from the state websites to combine GIS information such as FIPS codes, latitude, longitude, and county codes (both numeric and mailing number).

    RStudio was used for this web scrape and join. For details on how it was done, see my GitHub repository.
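
    A hedged sketch of the scrape-and-join pattern described above: pull a county table from each page and join on the county name. The URLs and the "County" key are placeholders, not the actual state pages used for this dataset.

    ```r
    library(rvest)
    library(dplyr)

    fips   <- read_html("https://example.gov/texas-county-fips") |>
      html_element("table") |> html_table()
    coords <- read_html("https://example.gov/texas-county-coords") |>
      html_element("table") |> html_table()

    counties <- inner_join(fips, coords, by = "County")
    ```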

    Feel free to follow my GitHub or LinkedIn profile to see what I end up doing with this dataset.

  12. Cleaned NHANES 1988-2018

    • figshare.com
    txt
    Updated Feb 18, 2025
    Cite
    Vy Nguyen; Lauren Y. M. Middleton; Neil Zhao; Lei Huang; Eliseu Verly; Jacob Kvasnicka; Luke Sagers; Chirag Patel; Justin Colacino; Olivier Jolliet (2025). Cleaned NHANES 1988-2018 [Dataset]. http://doi.org/10.6084/m9.figshare.21743372.v9
    Available download formats: txt
    Dataset updated
    Feb 18, 2025
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Vy Nguyen; Lauren Y. M. Middleton; Neil Zhao; Lei Huang; Eliseu Verly; Jacob Kvasnicka; Luke Sagers; Chirag Patel; Justin Colacino; Olivier Jolliet
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The National Health and Nutrition Examination Survey (NHANES) provides data with considerable potential to study the health and environmental exposure of the non-institutionalized US population. However, as NHANES data are plagued with multiple inconsistencies, processing these data is required before deriving new insights through large-scale analyses. Thus, we developed a set of curated and unified datasets by merging 614 separate files and harmonizing unrestricted data across NHANES III (1988-1994) and Continuous NHANES (1999-2018), totaling 135,310 participants and 5,078 variables. The variables convey:

    - demographics (281 variables),
    - dietary consumption (324 variables),
    - physiological functions (1,040 variables),
    - occupation (61 variables),
    - questionnaires (1,444 variables, e.g., physical activity, medical conditions, diabetes, reproductive health, blood pressure and cholesterol, early childhood),
    - medications (29 variables),
    - mortality information linked from the National Death Index (15 variables),
    - survey weights (857 variables),
    - environmental exposure biomarker measurements (598 variables), and
    - chemical comments indicating which measurements are below or above the lower limit of detection (505 variables).

    csv Data Record: The curated NHANES datasets and the data dictionaries include 23 .csv files and 1 Excel file. The curated NHANES datasets involve 20 .csv files, two for each module, with one as the uncleaned version and the other as the cleaned version. The modules are labeled as follows: 1) mortality, 2) dietary, 3) demographics, 4) response, 5) medications, 6) questionnaire, 7) chemicals, 8) occupation, 9) weights, and 10) comments.

    - "dictionary_nhanes.csv" is a dictionary that lists the variable name, description, module, category, units, CAS number, comment use, chemical family, chemical family shortened, number of measurements, and cycles available for all 5,078 variables in NHANES.
    - "dictionary_harmonized_categories.csv" contains the harmonized categories for the categorical variables.
    - "dictionary_drug_codes.csv" contains the dictionary of descriptors for the drug codes.
    - "nhanes_inconsistencies_documentation.xlsx" is an Excel file that contains the cleaning documentation, which records all the inconsistencies for all affected variables to help curate each of the NHANES modules.

    R Data Record: For researchers who want to conduct their analysis in the R programming language, the cleaned NHANES modules and the data dictionaries can be downloaded as a .zip file which includes an .RData file and an .R file.

    - "w - nhanes_1988_2018.RData" contains all the aforementioned datasets as R data objects. We make available all R scripts on customized functions that were written to curate the data.
    - "m - nhanes_1988_2018.R" shows how we used the customized functions (i.e. our pipeline) to curate the original NHANES data.

    Example starter code: The set of starter code to help users conduct exposome analyses consists of four R markdown files (.Rmd). We recommend going through the tutorials in order.

    - "example_0 - merge_datasets_together.Rmd" demonstrates how to merge the curated NHANES datasets together.
    - "example_1 - account_for_nhanes_design.Rmd" demonstrates how to conduct a linear regression model, a survey-weighted regression model, a Cox proportional hazard model, and a survey-weighted Cox proportional hazard model.
    - "example_2 - calculate_summary_statistics.Rmd" demonstrates how to calculate summary statistics for one variable and multiple variables, with and without accounting for the NHANES sampling design.
    - "example_3 - run_multiple_regressions.Rmd" demonstrates how to run multiple regression models with and without adjusting for the sampling design.
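
    As a hedged illustration of the kind of merge example_0 walks through: NHANES modules share the respondent identifier SEQN, so cleaned modules can be joined on it. The file names below are placeholders; check the downloaded .csv names.

    ```r
    demographics <- read.csv("nhanes_demographics_clean.csv")
    mortality    <- read.csv("nhanes_mortality_clean.csv")

    # Left join: keep all demographic records, attach mortality where linked
    merged <- merge(demographics, mortality, by = "SEQN", all.x = TRUE)
    ```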

  13. Health and Retirement Study (HRS)

    • search.dataone.org
    • dataverse.harvard.edu
    Updated Nov 21, 2023
    Cite
    Damico, Anthony (2023). Health and Retirement Study (HRS) [Dataset]. http://doi.org/10.7910/DVN/ELEKOY
    Dataset updated
    Nov 21, 2023
    Dataset provided by
    Harvard Dataverse
    Authors
    Damico, Anthony
    Description

    analyze the health and retirement study (hrs) with r. the hrs is the one and only longitudinal survey of american seniors. with a panel starting its third decade, the current pool of respondents includes older folks who have been interviewed every two years as far back as 1992. unlike cross-sectional or shorter panel surveys, respondents keep responding until, well, death do us part. paid for by the national institute on aging and administered by the university of michigan's institute for social research, if you apply for an interviewer job with them, i hope you like werther's original. figuring out how to analyze this data set might trigger your fight-or-flight synapses if you just start clicking around on michigan's website. instead, read pages numbered 10-17 (pdf pages 12-19) of this introduction pdf and don't touch the data until you understand figure a-3 on that last page. if you start enjoying yourself, here's the whole book. after that, it's time to register for access to the (free) data. keep your username and password handy, you'll need it for the top of the download automation r script. next, look at this data flowchart to get an idea of why the data download page is such a righteous jungle. but wait, good news: umich recently farmed out its data management to the rand corporation, who promptly constructed a giant consolidated file with one record per respondent across the whole panel. oh so beautiful. the rand hrs files make much of the older data and syntax examples obsolete, so when you come across stuff like instructions on how to merge years, you can happily ignore them - rand has done it for you. the health and retirement study only includes noninstitutionalized adults when new respondents get added to the panel (as they were in 1992, 1993, 1998, 2004, and 2010) but once they're in, they're in - respondents have a weight of zero for interview waves when they were nursing home residents; but they're still responding and will continue to contribute to your statistics so long as you're generalizing about a population from a previous wave (for example: it's possible to compute "among all americans who were 50+ years old in 1998, x% lived in nursing homes by 2010"). my source for that 411? page 13 of the design doc. wicked.

    this new github repository contains five scripts:

    - 1992 - 2010 download HRS microdata.R: loop through every year and every file, download, then unzip everything in one big party
    - import longitudinal RAND contributed files.R: create a SQLite database (.db) on the local disk, then load the rand, rand-cams, and both rand-family files into the database (.db) in chunks (to prevent overloading ram)
    - longitudinal RAND - analysis examples.R: connect to the sql database created by the 'import longitudinal RAND contributed files' program, create two database-backed complex sample survey objects using a taylor-series linearization design, then perform a mountain of analysis examples with wave weights from two different points in the panel
    - import example HRS file.R: load a fixed-width file using only the sas importation script directly into ram with SAScii (http://blog.revolutionanalytics.com/2012/07/importing-public-data-with-sas-instructions-into-r.html), parse through the IF block at the bottom of the sas importation script, blank out a number of variables, then save the file as an R data file (.rda) for fast loading later
    - replicate 2002 regression.R: connect to the sql database created by the 'import longitudinal RAND contributed files' program, create a database-backed complex sample survey object using a taylor-series linearization design, and exactly match the final regression shown in this document provided by analysts at RAND as an update of the regression on pdf page B76 of this document

    click here to view these five scripts. for more detail about the health and retirement study (hrs), visit: michigan's hrs homepage, rand's hrs homepage, the hrs wikipedia page, and a running list of publications using hrs.

    notes: exemplary work making it this far. as a reward, here's the detailed codebook for the main rand hrs file. note that rand also creates 'flat files' for every survey wave, but really, most every analysis you can think of is possible using just the four files imported with the rand importation script above. if you must work with the non-rand files, there's an example of how to import a single hrs (umich-created) file, but if you wish to import more than one, you'll have to write some for loops yourself.

    confidential to sas, spss, stata, and sudaan users: a tidal wave is coming. you can get water up your nose and be dragged out to sea, or you can grab a surf board. time to transition to r. :D
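
    a hedged sketch of the 'load in chunks' idea from the import script (not the deposited code itself): append a large csv to a sqlite table one slice at a time so ram never holds the whole file. the file and table names are illustrative.

    ```r
    library(DBI)
    library(RSQLite)

    db  <- dbConnect(SQLite(), "hrs.db")
    csv <- file("rand_hrs.csv", "r")
    hdr <- strsplit(readLines(csv, n = 1), ",")[[1]]   # header row

    repeat {
      lines <- readLines(csv, n = 50000)               # next slice of rows
      if (length(lines) == 0) break
      chunk <- read.csv(text = lines, header = FALSE, col.names = hdr)
      dbWriteTable(db, "rand_hrs", chunk, append = TRUE)
    }

    close(csv)
    dbDisconnect(db)
    ```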

  14. cyclistsbp

    • kaggle.com
    zip
    Updated Feb 2, 2022
    Cite
    abayomi kayode (2022). cyclistsbp [Dataset]. https://www.kaggle.com/akaymsa/cyclistsbp
    Available download formats: zip (35324933 bytes)
    Dataset updated
    Feb 2, 2022
    Authors
    abayomi kayode
    Description

    This is the combined dataset, created using Power Query, that was used in R for the Data Analyst Capstone project.

    Grateful for Google and the tutors who guided me through the program.

  15. R code, data, and analysis documentation for Colour biases in learned...

    • figshare.com
    • datasetcatalog.nlm.nih.gov
    zip
    Updated May 30, 2023
    Cite
    Wyatt Toure; Simon M. Reader (2023). R code, data, and analysis documentation for Colour biases in learned foraging preferences in Trinidadian guppies [Dataset]. http://doi.org/10.6084/m9.figshare.14404868.v1
    Available download formats: zip
    Dataset updated
    May 30, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Wyatt Toure; Simon M. Reader
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Summary

    This is the repository containing the R code and data to produce the analyses and figures in the manuscript ‘Colour biases in learned foraging preferences in Trinidadian guppies’. R version 3.6.2 was used for this project. Here we explain how to reproduce the results, provide the location of the metadata for the data sheets, and give descriptions of the root directory and folder contents. This material is adapted from the README file of the project, README.md, which is located in the root directory.

    How to reproduce the results

    This project uses the renv package from RStudio to manage package dependencies and ensure reproducibility through time. To ensure results are reproduced based on the versions of the packages used at the time this project was created, you will need to install renv using install.packages("renv") in R.

    If you want to reproduce the results it is best to download the entire repository onto your system. This can be done by clicking the Download button on the FigShare repository (DOI: 10.6084/m9.figshare.14404868). This will download a zip file of the entire repository. Unzip the zip file to get access to the project files.

    Once the repository is downloaded onto your system, navigate to the root directory and open guppy-colour-learning-project.Rproj. It is important to open the project using the .Rproj file to ensure the working directory is set correctly. Then install the package dependencies onto your system using renv::restore(). Running renv::restore() will install the correct versions of all the packages needed to reproduce our results. Packages are installed in a stand-alone library for this project and will not affect your installed R packages anywhere else.

    If you want to reproduce specific results from the analyses you can open either analysis-experiment-1.Rmd for results from experiment 1 or analysis-experiment-2.Rmd for results from experiment 2. Both are located in the root directory. You can select the Run All option under the Code option in the navbar of RStudio to execute all the code chunks. You can also run all chunks independently, though we advise doing so sequentially, since variables necessary for the analysis are created as the script progresses.

    Metadata

    Data are available in the data/ directory.

    - colour-learning-experiment-1-data.csv are the data for experiment 1
    - colour-learning-experiment-2-full-data.csv are the data for experiment 2

    We provide the variable descriptions for the data sets in the file metadata.md located in the data/ directory. The packages required to conduct the analyses and construct the website, as well as their versions and citations, are provided in the file required-r-packages.md.

    Directory structure

    - data/ contains the raw data used to conduct the analyses
    - docs/ contains the reader-friendly html write-up of the analyses; the GitHub pages site is built from this folder
    - R/ contains custom R functions used in the analysis
    - references/ contains reference information and formatting for citations used in the project
    - renv/ contains an activation script and configuration files for the renv package manager
    - figs/ contains the individual files for the figures and residual diagnostic plots produced by the analysis scripts. This directory is created and populated by running analysis-experiment-1.Rmd, analysis-experiment-2.Rmd and combined-figures.Rmd

    Root directory contents

    The root directory contains Rmd scripts used to conduct the analyses, create figures, and render the website pages. Below we describe the contents of these files as well as the additional files contained in the root directory.

    - analysis-experiment-1.Rmd is the R code and documentation for the experiment 1 data preparation and analysis. This script generates the Analysis 1 page of the website.
    - analysis-experiment-2.Rmd is the R code and documentation for the experiment 2 data preparation and analysis. This script generates the Analysis 2 page of the website.
    - protocols.Rmd contains the protocols used to conduct the experiments and generate the data. This script generates the Protocols page of the website.
    - index.Rmd creates the Homepage of the project site.
    - combined-figures.Rmd is the R code used to create figures that combine data from experiments 1 and 2. Not used in the project site.
    - treatment-object-side-assignment.Rmd is the R code used to assign treatments and object sides during trials for experiment 2. Not used in the project site.
    - renv.lock is a JSON-formatted plain text file which contains package information for the project. renv will install the packages listed in this file upon executing renv::restore()
    - required-r-packages.md is a plain text file containing the versions and sources of the packages required for the project.
    - styles.css contains the CSS formatting for the rendered html pages
    - LICENSE.md contains the license indicating the conditions upon which the code can be reused
    - guppy-colour-learning-project.Rproj is the R project file which sets the working directory of the R instance to the root directory of this repository. If trying to run the code in this repository to reproduce results, it is important to open R by clicking on this .Rproj file.
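
    A minimal sketch of the reproduction workflow described above, run from the root of the downloaded repository after opening guppy-colour-learning-project.Rproj:

    ```r
    install.packages("renv")   # once per system
    renv::restore()            # installs the package versions pinned in renv.lock
    ```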

  16. Data from: Cultivar resistance to common scab disease of potato is dependent...

    • catalog.data.gov
    • agdatacommons.nal.usda.gov
    Updated Apr 21, 2025
    Cite
    Agricultural Research Service (2025). Data from: Cultivar resistance to common scab disease of potato is dependent on the pathogen species [Dataset]. https://catalog.data.gov/dataset/data-from-cultivar-resistance-to-common-scab-disease-of-potato-is-dependent-on-the-pathoge-53c3e
    Dataset updated
    Apr 21, 2025
    Dataset provided by
    Agricultural Research Service
    Description

    All data from the paper "Cultivar resistance to common scab disease of potato is dependent on the pathogen species." Three separate datasets are included:

    - A csv file with the disease severity of three common scab pathogens across 55 different potato cultivars in a greenhouse pot assay (Figures 2-5 in the associated paper). The included R script was used with this data to perform the ANOVA for the greenhouse pot assay (Table 2 in the associated paper). This script can be used in R for any similar dataset to calculate the significance and percent of total variation for any number of user-defined fixed effects.
    - A zipped file with all of the qPCR data for the expression of the txtAB genes (Figure 6 in the associated paper).
    - An Excel file with the HPLC data for making the thaxtomin detection standard curve and quantifying the amount of thaxtomin in the test sample.

    Resources in this dataset:

    - Resource Title: Streptomyces pot assay data. File Name: 18.4.2updatedfileAllDataPotAssay.csv. Description: Combined data from all Streptomyces - potato pot assays from the paper. This csv file can be used with the example R script "DiseaseSeverityEstimateScript."
    - Resource Title: Combined qPCR data. File Name: CombinedtxtABqPCRresults.zip. Description: Zipped file that contains all qPCR data of txtAB gene expression in all experimental conditions (Figure 6 of the paper).
    - Resource Title: R script for estimating disease severity. File Name: DiseaseSeverityEstimateScript.txt. Description: R script used in combination with the "18.4.2updatedfileAllDataPotAssay.csv" file for generating the disease severity estimates (Figures 2-4) in the paper.
    - Resource Title: Thaxtomin standard curve and quantification - All data. File Name: Thaxtomin_CalCurve_log_log-Scale_12072018 (003).xlsx. Description: Excel file with two sheets. The first sheet is all of the HPLC data used for calculating the standard curve of thaxtomin using known standards. The second sheet is the quantification data for the abundance of thaxtomin across the experimental groups (Figure 6 of the paper).
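
    A hedged sketch of the ANOVA pattern the deposited script implements: significance and percent of total variation for user-defined fixed effects. The column names (severity, cultivar, species) are illustrative assumptions; see DiseaseSeverityEstimateScript.txt for the real ones.

    ```r
    pot <- read.csv("18.4.2updatedfileAllDataPotAssay.csv")

    # Two illustrative fixed effects and their interaction
    fit <- aov(severity ~ cultivar * species, data = pot)

    # Percent of total variation from the sums of squares
    tab <- as.data.frame(summary(fit)[[1]])
    tab$PercentVar <- 100 * tab$`Sum Sq` / sum(tab$`Sum Sq`)
    tab
    ```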

  17. Respiration_chambers/raw_log_files and combined datasets of biomass and...

    • researchdata.edu.au
    • data.aad.gov.au
    Updated Dec 3, 2018
    Cite
    BLACK, JAMES GEOFFREY; Black, J.G.; BLACK, JAMES GEOFFREY; BLACK, JAMES GEOFFREY (2018). Respiration_chambers/raw_log_files and combined datasets of biomass and chamber data, and physical parameters [Dataset]. https://researchdata.edu.au/respirationchambersrawlogfiles-combined-datasets-physical-parameters/1360456
    Dataset updated
    Dec 3, 2018
    Dataset provided by
    Australian Antarctic Data Centre
    Australian Antarctic Division
    Authors
    BLACK, JAMES GEOFFREY; Black, J.G.; BLACK, JAMES GEOFFREY; BLACK, JAMES GEOFFREY
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jan 27, 2015 - Feb 23, 2015
    Area covered
    Description

    General overview
    The following datasets are described by this metadata record, and are available for download from the provided URL.

    - Raw log files, physical parameters raw log files
    - Raw excel files, respiration/PAM chamber raw excel spreadsheets
    - Processed and cleaned excel files, respiration chamber biomass data
    - Raw rapid light curve excel files (this is duplicated from Raw log files)
    - Combined dataset pH, temperature, oxygen, salinity, velocity for experiment
    - Associated R script file for pump cycles of respiration chambers

    ####

    Physical parameters raw log files

    Raw log files
    1) DATE=
    2) Time= UTC+11
    3) PROG=Automated program to control sensors and collect data
    4) BAT=Amount of battery remaining
    5) STEP=check aquation manual
    6) SPIES=check aquation manual
    7) PAR=Photoactive radiation
    8) Levels=check aquation manual
    9) Pumps= program for pumps
    10) WQM=check aquation manual

    ####

    Respiration/PAM chamber raw excel spreadsheets

    Abbreviations in headers of datasets
    Note: Two data sets are provided in different formats, raw and cleaned (adj). These are the same data, with the PAR column moved over to PAR.all for analysis. All headers are the same. The cleaned (adj) dataframe will work with the R syntax below; alternatively, add code to do the cleaning in R.

    Date: ISO 1986 - Check
    Time:UTC+11 unless otherwise stated
    DATETIME: UTC+11 unless otherwise stated
    ID (of instrument in respiration chambers)
    ID43=Pulse amplitude fluorescence measurement of control
    ID44=Pulse amplitude fluorescence measurement of acidified chamber
    ID=1 Dissolved oxygen
    ID=2 Dissolved oxygen
    ID3= PAR
    ID4= PAR
    PAR=Photo active radiation umols
    F0=minimal florescence from PAM
    Fm=Maximum fluorescence from PAM
    Yield=(Fm – F0)/Fm
    rChl=an estimate of chlorophyll (Note this is uncalibrated and is an estimate only)
    Temp=Temperature degrees C
    PAR=Photo active radiation
    PAR2= Photo active radiation2
    DO=Dissolved oxygen
    %Sat= Saturation of dissolved oxygen
    Notes=This is the program of the underwater submersible logger with the following abbreviations:
    Notes-1) PAM=
    Notes-2) PAM=Gain level set (see aquation manual for more detail)
    Notes-3) Acclimatisation= Program of slowly introducing treatment water into chamber
    Notes-4) Shutter start up 2 sensors+sample…= Shutter PAMs automatic set up procedure (see aquation manual)
    Notes-5) Yield step 2=PAM yield measurement and calculation of control
    Notes-6) Yield step 5= PAM yield measurement and calculation of acidified
    Notes-7) Abatus respiration DO and PAR step 1= Program to measure dissolved oxygen and PAR (see aquation manual). Steps 1-4 are different stages of this program including pump cycles, DO and PAR measurements.

    8) Rapid light curve data
    Pre LC: A yield measurement prior to the following measurement
    After 10.0 sec at 0.5% to 8%: Level of each of the 8 steps of the rapid light curve
    Odyssey PAR (only in some deployments): An extra measure of PAR (umols) using an Odyssey data logger
    Dataflow PAR: An extra measure of PAR (umols) using a Dataflow sensor.
    PAM PAR: This is copied from the PAR or PAR2 column
    PAR all: This is the complete PAR file and should be used
    Deployment: Identifying which deployment the data came from

    ####

    Respiration chamber biomass data

    The data is chlorophyll a biomass from cores from the respiration chambers. The headers are: Depth (mm); Treat (acidified or control); Chl a (pigment and indicator of biomass); Core (5 cores were collected from each chamber, three were analysed for chl a). These are pseudoreplicates/subsamples from the chambers and should not be treated as replicates.

    ####

    Associated R script file for pump cycles of respiration chambers

    Associated respiration chamber data to determine the times when respiration chamber pumps delivered treatment water to chambers. Determined from Aquation log files (see associated files). Use the chamber cut times to determine net production rates. Note: Users need to avoid the times when the respiration chambers are delivering water as this will give incorrect results. The headers that get used in the attached/associated R file are start regression and end regression. The remaining headers are not used unless called for in the associated R script. The last columns of these datasets (intercept, ElapsedTimeMincoef) are determined from the linear regressions described below.

    To determine the rate of change of net production, coefficients of the regression of oxygen consumption in discrete 180 minute data blocks were determined. R squared values for fitted regressions of these coefficients were consistently high (greater than 0.9). We make two assumptions with calculation of net production rates: the first is that heterotrophic community members do not change their metabolism under OA; and the second is that the heterotrophic communities are similar between treatments.
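
    A hedged sketch of those block regressions: the slope of dissolved oxygen against elapsed time within discrete 180-minute blocks. The file name and the block assignment are illustrative; ElapsedTimeMin and DO follow the headers described in this record.

    ```r
    chamber <- read.csv("respiration_chamber_log.csv")  # placeholder file name
    chamber$block <- floor(chamber$ElapsedTimeMin / 180)

    # One linear fit per 180-minute block: intercept, slope, and R squared
    coefs <- do.call(rbind, lapply(split(chamber, chamber$block), function(d) {
      fit <- lm(DO ~ ElapsedTimeMin, data = d)
      data.frame(block     = d$block[1],
                 intercept = coef(fit)[[1]],
                 slope     = coef(fit)[[2]],
                 r_squared = summary(fit)$r.squared)
    }))
    ```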

    ####

    Combined dataset pH, temperature, oxygen, salinity, velocity for experiment

    This data is rapid light curve data generated from a Shutter PAM fluorimeter. There are eight steps in each rapid light curve. Note: The software component of the Shutter PAM fluorimeter for sensor 44 appeared to be damaged and would not cycle through the PAR cycles. Therefore the rapid light curves and recovery curves should only be used for the control chambers (sensor ID43).

    The headers are
    PAR: Photoactive radiation
    relETR: F0/Fm x PAR
    Notes: Stage/step of light curve
    Treatment: Acidified or control


    The associated light treatments in each stage. Each actinic light intensity is held for 10 seconds, then a saturating pulse is taken (see PAM methods).

    After 10.0 sec at 0.5% = 1 umols PAR
    After 10.0 sec at 0.7% = 1 umols PAR
    After 10.0 sec at 1.1% = 0.96 umols PAR
    After 10.0 sec at 1.6% = 4.32 umols PAR
    After 10.0 sec at 2.4% = 4.32 umols PAR
    After 10.0 sec at 3.6% = 8.31 umols PAR
    After 10.0 sec at 5.3% =15.78 umols PAR
    After 10.0 sec at 8.0% = 25.75 umols PAR

    Note: this dataset appears to be missing data; D5 rows are potentially not usable.

    See the word document in the download file for more information.

  18. Data and R-Code used for Zooplankton Publication

    • figshare.com
    csv
    Updated May 16, 2025
    Cite
    Austin Happel (2025). Data and R-Code used for Zooplankton Publication [Dataset]. http://doi.org/10.6084/m9.figshare.28212716.v2
    Available download formats: csv
    Dataset updated
    May 16, 2025
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Austin Happel
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Combined sewer systems are common throughout much of the world, allowing the conveyance and of both storm- and waste- water within one system. Many of these systems were built with overflows (termed “CSO”) leading to local waterways so that large rain events do not overwhelm sewer treatment plants nor cause urban flooding. Chicago, IL USA has the addition of pumping stations which, during extreme wet weather events, actively pump combined storm- and waste- water into the Chicago River. In 2023 12.3 billion liters of untreated storm- and waste-water was pumped into the Chicago River, 9.1 B l of which was discharged into Bubbly Creek. Increased conductivity was noted In Bubbly Creek following the CSO event and dissolved oxygen levels remained ≤ 5.0 mg l for several weeks following the event. We show that populations of Chydoridae were lost and Moinidae populations reached high (>100 individuals l-1) abundances within Bubbly Creek following the CSO event whereas populations of zooplankton taxa at locations elsewhere in the system remained relatively unchanged. We posit that nutrients provided by the sewage fueled a phytoplankton bloom, while low oxygen levels remove predators allowing densities of zooplankton (primarily Moinidae) to exceed 100 l-1 in Bubbly Creek. We offer some of the first in-situ evidence that releases of untreated combined storm- and waste-water (i.e., CSOs) alter zooplankton communities of the receiving waterbody.Between June 25 and July 15 2023 27.7 cm rain fell, including ~10 cm within 24 hrs on July 2 2023 (as recorded at Midway Airport), overwhelming much of the Chicago’s storm and sewer system. To alleviate pressure in the sewage system and avoid flooding communities and homes, 3.4 billion liters of untreated waste and storm water was pumped into Bubbly Creek (aka, South Fork of the South Branch of the Chicago River) over the course of 67 hours between July 2nd and 5th and 10 CSO outfalls activated. At the same time, 2.2 bn l were pumped into the North Branch over 28.4 hrs and 21 CSO outfalls activated. A second event occurred July 12-15 2023 where 2.5 bn l was pumped into Bubbly Creek via the Racine Avenue pumping station, no CSO outfalls registered as being active during this time. MWRD maintains 7 continuous dissolved oxygen monitoring stations (Eureka Manta2™ or Manta+™ Probes of Austin, Texas) within the Chicago River and North Shore Channel which are situated 1 m below the water surface and probes rotated monthly (Minarik et al. 2024). North Branch main channel locations (Station names: Church Street, Addison Avenue, Division Avenue) exhibited low oxygen (< 5.0 mg l-1) for a period of ≤ 12 hours following the early July 2023 CSO event. The station within the South Branch of the Chicago River (Station name: Loomis Avenue), exhibited low oxygen for 28 hrs July 3-4 and 21 hrs July 6-7. Downstream of our locations included in this study (Station name: Cicero Avenue), low oxygen occurred for 6 consecutive days starting July 3rd 2pm and for nearly 88% of the hourly recordings between July 13- 23rd remained < 5.0 mg l-1. Over the course of 2023, ≤ 6 % of recordings of at this downstream location exhibited oxygen levels < 5.0 mg l-1, whereas ≤ 3 % of recordings during 2023 at others mentioned so far were < 5.0 mg l-1. 
Within Bubbly Creek, anoxic and hypoxic conditions persisted from July 2nd until July 21st; hypoxia occurred during 76.5% of hourly recordings between July 22nd and Aug 2nd, and during 56.4% of recordings through Aug 12th (station names: I-55 and 36th Street). Over the course of 2023, 20 and 24% of recordings at the two Bubbly Creek monitoring stations exhibited oxygen levels < 5.0 mg l-1. For comparison, in 2022 no CSO events occurred within Bubbly Creek and ≤ 5% of recordings across the year exhibited oxygen levels < 5.0 mg l-1 (Minarik et al. 2023).

Seven locations were chosen to represent the North and South Branches of the Chicago River (Fig. 1). These included three locations on the North Shore Channel and North Branch (referred to as “North”), two locations in the South Branch main channel, and two locations representing the main treatment location, Bubbly Creek. All sites were accessible via a pier or by wading, allowing sampling of water from a depth of 0.6 m.

All locations were sampled mid-day, between 09:30 and 15:00, typically in order from North to South, with some variation in the order of the southern locations that is not thought to influence results. A 5 l horizontal Niskin bottle (General Oceanics) and a 19 l bucket were used to collect 20 l of sub-surface (top 0.6 m) water, which was filtered through a 53 micron mesh sieve. Collected organisms were then washed into 20 ml scintillation vials and preserved onsite with 90% ethanol for later enumeration. A handheld YSI meter was used to collect data on temperature, conductivity, and dissolved oxygen during each sampling event. Samples were taken as close to the center of the waterway as possible. Sites were visited approximately weekly between June 8th and September 12th, 2023, for a total of 15 sample dates. Due to visible clouds of zooplankton in Bubbly Creek on July 27th, two 20 l samples were taken via bucket. We sampled using both the Niskin bottle and bucket concomitantly on August 1st, 8th, and 15th to facilitate analysis of any differences in the communities or abundances they captured.

Zooplankton were enumerated under a dissection microscope using a gridded square petri dish. Samples with large numbers of zooplankton were diluted to a known volume (e.g., 300 ml) and a minimum of 4 subsamples were taken using a 10 ml Hensen-Stempel sampler following agitation. Subsample counts were scaled to reflect the full 20 l sample volume and averaged to obtain one value per date, location, and sampling-method combination (see the scaling sketch below). During lab work, the Bubbly Creek sample from August 22nd was lost after one 10 ml subsample was counted (the beaker broke); we retained this sample in the analysis despite lower confidence in its community composition (78 individuals counted, compared to a mean of 124, median of 54, and standard deviation of 176 across 146 samples). Zooplankton were classified into the following groups: Bosminidae, Calanoida, Chironomid larvae, Cyclopoida, Chydoridae, other Daphniidae (including Scapholeberis spp., Daphnia lumholtzi, among others), Diaphanosoma (Sididae), Moinidae, Naididae, Ostracoda, and Polyphemus pediculus (Polyphemidae); the presence of nauplii and rotifers was noted but not counted.
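To make the count arithmetic concrete, here is a minimal sketch of the dilution and subsample scaling described above; the function name and all numbers are illustrative, not taken from the dataset:

```python
# Scale Hensen-Stempel subsample counts up to individuals per liter.
# Illustrative values only; the dataset's own counts differ.

def density_per_liter(subsample_counts, subsample_ml=10.0,
                      diluted_ml=300.0, sample_l=20.0):
    """Mean density (individuals per liter of river water) from
    replicate subsample counts of a diluted 20 l field sample."""
    per_subsample = sum(subsample_counts) / len(subsample_counts)
    # Individuals in the whole diluted beaker:
    total_in_sample = per_subsample * (diluted_ml / subsample_ml)
    # The diluted beaker holds everything sieved from `sample_l` of water:
    return total_in_sample / sample_l

# Four 10 ml subsamples from a 300 ml dilution of a 20 l sample:
print(density_per_liter([68, 75, 71, 70]))  # 106.5 individuals l-1
```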

  19. Data supporting the Master thesis "Monitoring von Open Data Praktiken -...

    • zenodo.org
    • data.niaid.nih.gov
    • +1more
    zip
    Updated Nov 21, 2024
    Cite
    Katharina Zinke; Katharina Zinke (2024). Data supporting the Master thesis "Monitoring von Open Data Praktiken - Herausforderungen beim Auffinden von Datenpublikationen am Beispiel der Publikationen von Forschenden der TU Dresden" [Dataset]. http://doi.org/10.5281/zenodo.14196539
    Explore at:
    zip (available download formats)
    Dataset updated
    Nov 21, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Katharina Zinke; Katharina Zinke
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Dresden
    Description

    Data supporting the Master thesis "Monitoring von Open Data Praktiken - Herausforderungen beim Auffinden von Datenpublikationen am Beispiel der Publikationen von Forschenden der TU Dresden" (Monitoring open data practices - challenges in finding data publications using the example of publications by researchers at TU Dresden) - Katharina Zinke, Institut für Bibliotheks- und Informationswissenschaften, Humboldt-Universität Berlin, 2023

    This ZIP file contains the data the thesis is based on, interim exports of the results, and the R script with all pre-processing, data merging, and analysis steps carried out. The documentation of the additional, explorative analysis is also included. The actual PDFs and text files of the scientific papers used are not included, as they are published open access.

    The folder structure is shown below with the file names and a brief description of the contents of each file. For details concerning the analysis approach, please refer to the master's thesis (publication following soon).

    ## Data sources

    Folder 01_SourceData/

    - PLOS-Dataset_v2_Mar23.csv (PLOS-OSI dataset)

    - ScopusSearch_ExportResults.csv (export of Scopus search results from Scopus)

    - ScopusSearch_ExportResults.ris (export of Scopus search results from Scopus)

    - Zotero_Export_ScopusSearch.csv (export of the file names and DOIs of the Scopus search results from Zotero)

    ## Automatic classification

    Folder 02_AutomaticClassification/

    - (NOT INCLUDED) PDFs folder (Folder for PDFs of all publications identified by the Scopus search, named AuthorLastName_Year_PublicationTitle_Title)

    - (NOT INCLUDED) PDFs_to_text folder (Folder for all texts extracted from the PDFs by ODDPub, named AuthorLastName_Year_PublicationTitle_Title)

    - PLOS_ScopusSearch_matched.csv (merge of the Scopus search results with the PLOS_OSI dataset for the files contained in both)

    - oddpub_results_wDOIs.csv (results file of the ODDPub classification)

    - PLOS_ODDPub.csv (merge of the results file of the ODDPub classification with the PLOS-OSI dataset for the publications contained in both)

    ## Manual coding

    Folder 03_ManualCheck/

    - CodeSheet_ManualCheck.txt (Code sheet with descriptions of the variables for manual coding)

    - ManualCheck_2023-06-08.csv (Manual coding results file)

    - PLOS_ODDPub_Manual.csv (Merge of the results file of the ODDPub and PLOS-OSI classification with the results file of the manual coding)

    ## Explorative analysis for the discoverability of open data

    Folder 04_FurtherAnalyses/

    - Proof_of_of_Concept_Open_Data_Monitoring.pdf (Description of the explorative analysis of the discoverability of open data publications, using the example of a researcher) - in German

    ## R-Script

    Analyses_MA_OpenDataMonitoring.R (R-Script for preparing, merging and analyzing the data and for performing the ODDPub algorithm)
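    The merge files listed above (Scopus results × PLOS-OSI, ODDPub results × PLOS-OSI) are DOI-based joins. A minimal sketch of one such join is below; the thesis performs these steps in R (Analyses_MA_OpenDataMonitoring.R), and the `doi` column name is an assumption about the CSV layouts:

```python
import pandas as pd

# Sketch of the DOI-based merging described above. Column names are
# assumptions about the CSV layouts, not taken from the files themselves.
scopus = pd.read_csv("ScopusSearch_ExportResults.csv")
plos_osi = pd.read_csv("PLOS-Dataset_v2_Mar23.csv")

# Normalize DOIs before joining: case and URL prefixes vary across sources.
for df in (scopus, plos_osi):
    df["doi"] = (df["doi"].str.lower()
                          .str.replace(r"^https?://doi\.org/", "", regex=True)
                          .str.strip())

# Inner join keeps only publications present in both datasets,
# mirroring PLOS_ScopusSearch_matched.csv.
matched = scopus.merge(plos_osi, on="doi", how="inner",
                       suffixes=("_scopus", "_plos"))
matched.to_csv("PLOS_ScopusSearch_matched.csv", index=False)
```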

  20. d

    Data from: Estimating historic N- and S-deposition with publicly available...

    • datadryad.org
    • data.niaid.nih.gov
    zip
    Updated Nov 15, 2021
    + more versions
    Cite
    David Schellenberger Costa; Johanna Otto; Ines Chmara; Markus Bernhardt-Römermann (2021). Estimating historic N- and S-deposition with publicly available data – An example from Central Germany [Dataset]. http://doi.org/10.5061/dryad.n5tb2rbwz
    Explore at:
    zip (available download formats)
    Dataset updated
    Nov 15, 2021
    Dataset provided by
    Dryad
    Authors
    David Schellenberger Costa; Johanna Otto; Ines Chmara; Markus Bernhardt-Römermann
    Time period covered
    Oct 20, 2021
    Area covered
    Germany
    Description

    Data on European emission and deposition trends were collected from several studies cited in the original publication using the digitize R package, which allows the conversion of graphical data to numeric values (see the sketch of the underlying calibration step below).
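    For context, tools like digitize recover numbers from published figures by calibrating pixel coordinates against known axis points. The following is a minimal sketch of that calibration arithmetic, not the digitize package's own API; it assumes linear axes, and all pixel values are illustrative:

```python
# Minimal sketch of plot digitization: map pixel coordinates to data
# coordinates via two calibration points per axis (linear axes assumed).
# This mirrors the idea behind the digitize R package, not its actual API.

def make_axis_map(pixel_a, value_a, pixel_b, value_b):
    """Return a function mapping a pixel coordinate to a data value."""
    scale = (value_b - value_a) / (pixel_b - pixel_a)
    return lambda px: value_a + (px - pixel_a) * scale

# Calibration: pixel positions of known tick marks (illustrative numbers).
x_map = make_axis_map(pixel_a=100, value_a=1980, pixel_b=900, value_b=2010)  # year axis
y_map = make_axis_map(pixel_a=650, value_a=0.0, pixel_b=50, value_b=30.0)    # deposition axis

# Selected data points (pixel coordinates) -> numeric values.
points_px = [(180, 210), (420, 330), (760, 500)]
points = [(x_map(px), y_map(py)) for px, py in points_px]
print(points)  # e.g., [(1983.0, 22.0), ...]
```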

    The FFK data were received and uploaded with the permission of the Forstliches Forschungs- und Kompetenzzentrum (FFK Gotha), a department of the Thuringian forestry administration.

    The UBA data are available upon request from the Umweltbundesamt, Germany's federal environment agency.

    Rainfall data were retrieved from Climatology Lab (http://www.climatologylab.org/terraclimate.html) and CRU.

    Elevation data were retrieved from ASTER GDEM, a product of METI and NASA.
