54 datasets found
  1. Merger of BNV-D data (2008 to 2019) and enrichment

    • data.europa.eu
    zip
    Cite
    Patrick VINCOURT, Merger of BNV-D data (2008 to 2019) and enrichment [Dataset]. https://data.europa.eu/data/datasets/5f1c3eca9d149439e50c740f
    Explore at:
    zip (18530465). Available download formats
    Dataset authored and provided by
    Patrick VINCOURT
    Description

    Merging (in Table R) the data published on https://www.data.gouv.fr/fr/datasets/ventes-de-pesticides-par-departement/ and joining two other sources of information associated with marketing authorisations (MAs): uses (https://www.data.gouv.fr/fr/datasets/usages-des-produits-phytosanitaires/) and the "Biocontrol" status of each product, from document DGAL/SDQSPV/2020-784, published on 18/12/2020 at https://agriculture.gouv.fr/quest-ce-que-le-biocontrole

    All the initial files (.csv transformed into .txt), the R code used to merge the data, and the different output files are collected in a zip. NB: 1) "YASCUB" stands for {year, AMM, Substance_active, Classification, Usage, Statut_"BioControl"}, substances not on the DGAL/SDQSPV list being coded NA. 2) The file of biocontrol products was cleaned of the duplicates generated by marketing authorisations leading to several trade names.
    3) The BNVD_BioC_DY3 table and the output file BNVD_BioC_DY3.txt contain the fields {Code_Region, Region, Dept, Code_Dept, Anne, Usage, Classification, Type_BioC, Quantite_substance}
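
The joins described above (sales enriched with uses and biocontrol status, with unlisted substances coded NA) can be sketched in Python with pandas; the dataset's own merge is done in R, and the miniature tables and column names below are invented for illustration:

```python
import pandas as pd

# Hypothetical miniature tables standing in for the BNV-D sales data,
# the usage data, and the DGAL/SDQSPV biocontrol list.
sales = pd.DataFrame({"AMM": [100, 200], "year": [2019, 2019],
                      "quantity_kg": [12.5, 3.0]})
uses = pd.DataFrame({"AMM": [100, 200], "usage": ["herbicide", "fungicide"]})
biocontrol = pd.DataFrame({"AMM": [200], "biocontrol": [True]})

# Left joins keep every sale; products absent from the biocontrol
# list come out as NA, mirroring the dataset's coding.
merged = (sales.merge(uses, on="AMM", how="left")
               .merge(biocontrol, on="AMM", how="left"))
print(merged["biocontrol"].isna().sum())  # 1
```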

  2. Data from: KORUS-AQ Aircraft Merge Data Files

    • catalog.data.gov
    • s.cnmilf.com
    • +1 more
    Updated Jul 3, 2025
    + more versions
    Cite
    NASA/LARC/SD/ASDC (2025). KORUS-AQ Aircraft Merge Data Files [Dataset]. https://catalog.data.gov/dataset/korus-aq-aircraft-merge-data-files
    Explore at:
    Dataset updated
    Jul 3, 2025
    Dataset provided by
    NASA (http://nasa.gov/)
    Description

    KORUSAQ_Merge_Data are pre-generated merge data files combining various products collected during the KORUS-AQ field campaign. This collection features pre-generated merge files for the DC-8 aircraft. Data collection for this product is complete.

    The KORUS-AQ field study was conducted in South Korea during May-June 2016. The study was jointly sponsored by NASA and Korea's National Institute of Environmental Research (NIER). The primary objectives were to investigate the factors controlling air quality in Korea (e.g., local emissions, chemical processes, and transboundary transport) and to assess future air quality observing strategies incorporating geostationary satellite observations. To achieve these science objectives, KORUS-AQ adopted a highly coordinated sampling strategy involving surface and airborne measurements with both in-situ and remote sensing instruments.

    Surface observations provided details on ground-level air quality conditions, while airborne sampling provided an assessment of conditions aloft relevant to satellite observations and necessary to understand the role of emissions, chemistry, and dynamics in determining air quality outcomes. The sampling region covers the South Korean peninsula and surrounding waters, with a primary focus on the Seoul Metropolitan Area. Airborne sampling was conducted primarily from near the surface to about 8 km, with extensive profiling to characterize the vertical distribution of pollutants and their precursors. The airborne observational data were collected from three aircraft platforms: the NASA DC-8, NASA B-200, and Hanseo King Air. Surface measurements were conducted from 16 ground sites and 2 ships: R/V Onnuri and R/V Jang Mok.

    The major data products collected from both the ground and air include in-situ measurements of trace gases (e.g., ozone, reactive nitrogen species, carbon monoxide and dioxide, methane, non-methane and oxygenated hydrocarbon species), aerosols (e.g., microphysical and optical properties and chemical composition), active remote sensing of ozone and aerosols, and passive remote sensing of NO2, CH2O, and O3 column densities. These data products support research focused on examining the impact of photochemistry and transport on ozone and aerosols, evaluating emissions inventories, and assessing the potential use of satellite observations in air quality studies.

  3. Multilevel modeling of time-series cross-sectional data reveals the dynamic...

    • data.niaid.nih.gov
    • dataone.org
    • +2 more
    zip
    Updated Mar 6, 2020
    Cite
    Kodai Kusano (2020). Multilevel modeling of time-series cross-sectional data reveals the dynamic interaction between ecological threats and democratic development [Dataset]. http://doi.org/10.5061/dryad.547d7wm3x
    Explore at:
    zip. Available download formats
    Dataset updated
    Mar 6, 2020
    Dataset provided by
    University of Nevada, Reno
    Authors
    Kodai Kusano
    License

    https://spdx.org/licenses/CC0-1.0.html

    Description

    What is the relationship between environment and democracy? The framework of cultural evolution suggests that societal development is an adaptation to ecological threats. Pertinent theories assume that democracy emerges as societies adapt to ecological factors such as higher economic wealth, lower pathogen threats, less demanding climates, and fewer natural disasters. However, previous research confused within-country processes with between-country processes and erroneously interpreted between-country findings as if they generalized to within-country mechanisms. In this article, we analyze a time-series cross-sectional dataset to study the dynamic relationship between environment and democracy (1949-2016), accounting for previous misconceptions in levels of analysis. By separating within-country processes from between-country processes, we find that the relationship between environment and democracy not only differs across countries but also depends on the level of analysis. Economic wealth predicts increasing levels of democracy in between-country comparisons, but within-country comparisons show that democracy declines as countries become wealthier over time. This relationship is prevalent only among historically wealthy countries, not among historically poor countries, whose wealth also increased over time. By contrast, pathogen prevalence predicts lower levels of democracy in both between-country and within-country comparisons. Our longitudinal analyses identifying temporal precedence reveal that not only do reductions in pathogen prevalence drive future democracy, but democracy also reduces future pathogen prevalence and increases future wealth. These nuanced results contrast with previous analyses using narrow, cross-sectional data. As a whole, our findings illuminate the dynamic process by which environment and democracy shape each other.

    Methods Our time-series cross-sectional data combine various online databases. Country names were first identified and matched using the R package "countrycode" (Arel-Bundock, Enevoldsen, & Yetman, 2018) before all datasets were merged. Occasionally, we modified unidentified country names to be consistent across datasets. We then transformed "wide" data into "long" data and merged them using R's Tidyverse framework (Wickham, 2014). Our analysis begins with the year 1949 because one of the key time-variant level-1 variables, pathogen prevalence, is only available from 1949 onward. See our Supplemental Material for all data, Stata syntax, R Markdown for visualization, supplemental analyses, and detailed results (available at https://osf.io/drt8j/).
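
The reshaping-and-merging workflow described in the Methods (wide to long, then a country-year join) can be sketched in Python with pandas; the authors use R's Tidyverse, and the miniature tables below are invented for illustration:

```python
import pandas as pd

# Miniature stand-ins for the merged databases; country names are assumed
# to be already harmonized (the authors use the R package "countrycode").
wealth_wide = pd.DataFrame({
    "country": ["Norway", "Ghana"],
    "1949": [1.2, 0.3],
    "1950": [1.3, 0.4],
})
pathogens = pd.DataFrame({
    "country": ["Norway", "Norway", "Ghana", "Ghana"],
    "year": [1949, 1950, 1949, 1950],
    "pathogen_prevalence": [0.1, 0.1, 0.7, 0.6],
})

# "Wide" -> "long": one row per country-year (cf. tidyr's pivot_longer).
wealth_long = wealth_wide.melt(id_vars="country", var_name="year",
                               value_name="wealth")
wealth_long["year"] = wealth_long["year"].astype(int)

# Merge the sources on the harmonized country-year key.
merged = wealth_long.merge(pathogens, on=["country", "year"], how="inner")
print(merged.shape)  # (4, 4)
```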

  4. Data from: HOW TO PERFORM A META-ANALYSIS: A PRACTICAL STEP-BY-STEP GUIDE...

    • scielo.figshare.com
    tiff
    Updated Jun 4, 2023
    Cite
    Diego Ariel de Lima; Camilo Partezani Helito; Lana Lacerda de Lima; Renata Clazzer; Romeu Krause Gonçalves; Olavo Pires de Camargo (2023). HOW TO PERFORM A META-ANALYSIS: A PRACTICAL STEP-BY-STEP GUIDE USING R SOFTWARE AND RSTUDIO [Dataset]. http://doi.org/10.6084/m9.figshare.19899537.v1
    Explore at:
    tiff. Available download formats
    Dataset updated
    Jun 4, 2023
    Dataset provided by
    SciELO journals
    Authors
    Diego Ariel de Lima; Camilo Partezani Helito; Lana Lacerda de Lima; Renata Clazzer; Romeu Krause Gonçalves; Olavo Pires de Camargo
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    ABSTRACT Meta-analysis is a statistical technique for combining results from different studies, and its use has been growing in the medical field. Thus, knowing not only how to interpret a meta-analysis but also how to perform one is fundamental today. The objective of this article is therefore to present the basic concepts and serve as a guide for conducting a meta-analysis using the R and RStudio software. To that end, the reader has access to the basic commands in R and RStudio necessary for conducting a meta-analysis. One advantage of R is that it is free software. For a better understanding of the commands, two practical examples are presented, in addition to a review of some basic concepts of this statistical technique. It is assumed that the data necessary for the meta-analysis have already been collected; that is, methodologies for systematic review are not discussed here. Finally, it is worth remembering that many other techniques used in meta-analyses are not addressed in this work; however, the two examples already enable the reader to proceed with good and robust meta-analyses. Level of Evidence V, Expert Opinion.
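
As a taste of what the guide walks through in R, the core fixed-effect (inverse-variance) pooling step can be sketched in Python; the effect sizes and standard errors below are invented, and real analyses would use dedicated packages such as R's meta or metafor:

```python
import math

# Fixed-effect (inverse-variance) pooling of three hypothetical studies.
effects = [0.30, 0.50, 0.20]  # per-study effect sizes (invented)
ses = [0.10, 0.15, 0.12]      # per-study standard errors (invented)

weights = [1 / se**2 for se in ses]  # inverse-variance weights
pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
se_pooled = math.sqrt(1 / sum(weights))
ci = (pooled - 1.96 * se_pooled, pooled + 1.96 * se_pooled)

print(round(pooled, 3), tuple(round(x, 3) for x in ci))  # 0.309 (0.175, 0.443)
```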

  5. Data supporting the Master thesis "Monitoring von Open Data Praktiken -...

    • zenodo.org
    • data.niaid.nih.gov
    zip
    Updated Nov 21, 2024
    Cite
    Katharina Zinke (2024). Data supporting the Master thesis "Monitoring von Open Data Praktiken - Herausforderungen beim Auffinden von Datenpublikationen am Beispiel der Publikationen von Forschenden der TU Dresden" [Dataset]. http://doi.org/10.5281/zenodo.14196539
    Explore at:
    zip. Available download formats
    Dataset updated
    Nov 21, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Katharina Zinke
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Data supporting the Master thesis "Monitoring von Open Data Praktiken - Herausforderungen beim Auffinden von Datenpublikationen am Beispiel der Publikationen von Forschenden der TU Dresden" (Monitoring open data practices - challenges in finding data publications using the example of publications by researchers at TU Dresden) - Katharina Zinke, Institut für Bibliotheks- und Informationswissenschaften, Humboldt-Universität Berlin, 2023

    This ZIP-File contains the data the thesis is based on, interim exports of the results and the R script with all pre-processing, data merging and analyses carried out. The documentation of the additional, explorative analysis is also available. The actual PDFs and text files of the scientific papers used are not included as they are published open access.

    The folder structure is shown below with the file names and a brief description of the contents of each file. For details concerning the analysis approach, please refer to the master's thesis (publication forthcoming).

    ## Data sources

    Folder 01_SourceData/

    - PLOS-Dataset_v2_Mar23.csv (PLOS-OSI dataset)

    - ScopusSearch_ExportResults.csv (export of Scopus search results from Scopus)

    - ScopusSearch_ExportResults.ris (export of Scopus search results from Scopus)

    - Zotero_Export_ScopusSearch.csv (export of the file names and DOIs of the Scopus search results from Zotero)

    ## Automatic classification

    Folder 02_AutomaticClassification/

    - (NOT INCLUDED) PDFs folder (Folder for PDFs of all publications identified by the Scopus search, named AuthorLastName_Year_PublicationTitle_Title)

    - (NOT INCLUDED) PDFs_to_text folder (Folder for all texts extracted from the PDFs by ODDPub, named AuthorLastName_Year_PublicationTitle_Title)

    - PLOS_ScopusSearch_matched.csv (merge of the Scopus search results with the PLOS_OSI dataset for the files contained in both)

    - oddpub_results_wDOIs.csv (results file of the ODDPub classification)

    - PLOS_ODDPub.csv (merge of the results file of the ODDPub classification with the PLOS-OSI dataset for the publications contained in both)

    ## Manual coding

    Folder 03_ManualCheck/

    - CodeSheet_ManualCheck.txt (Code sheet with descriptions of the variables for manual coding)

    - ManualCheck_2023-06-08.csv (Manual coding results file)

    - PLOS_ODDPub_Manual.csv (Merge of the results file of the ODDPub and PLOS-OSI classification with the results file of the manual coding)

    ## Explorative analysis for the discoverability of open data

    Folder 04_FurtherAnalyses/

    - Proof_of_of_Concept_Open_Data_Monitoring.pdf (Description of the explorative analysis of the discoverability of open data publications using the example of a researcher) - in German

    ## R-Script

    Analyses_MA_OpenDataMonitoring.R (R-Script for preparing, merging and analyzing the data and for performing the ODDPub algorithm)

  6. SBI Cruise HLY0402 merged bottle dataset

    • data.ucar.edu
    • search.dataone.org
    • +1 more
    ascii
    Updated Feb 7, 2024
    + more versions
    Cite
    David Kirchman; Dennis Hansell; Margaret H. P. Best; Nick R. Bates; Ronald Benner; Service Group, Scripps Institution of Oceanography, University of California - San Diego; Steven Roberts; Victoria J. Hill (2024). SBI Cruise HLY0402 merged bottle dataset [Dataset]. https://data.ucar.edu/dataset/sbi-cruise-hly0402-merged-bottle-dataset
    Explore at:
    ascii. Available download formats
    Dataset updated
    Feb 7, 2024
    Dataset provided by
    University Corporation for Atmospheric Research
    Authors
    David Kirchman; Dennis Hansell; Margaret H. P. Best; Nick R. Bates; Ronald Benner; Service Group, Scripps Institution of Oceanography, University of California - San Diego; Steven Roberts; Victoria J. Hill
    Time period covered
    May 5, 2004 - Jun 23, 2004
    Area covered
    Description

    This data set contains merged bottle data from the SBI cruise on the United States Coast Guard Cutter (USCGC) Healy (HLY0402). During this cruise, rosette casts were conducted, and the Scripps Service Group generated a bottle data file from the water samples. Additional groups were funded to measure supplementary parameters from these same water samples. This data set is the first version of the merge of the Scripps Service Group bottle data file with the data gathered by those additional groups.

  7. Replication Data for: Wake merging and turbulence transition downstream of...

    • dataverse.no
    • search.dataone.org
    Updated Jun 2, 2025
    + more versions
    Cite
    R. Jason Hearst; Fanny Olivia Johannessen Berstad; Ingrid Neunaber (2025). Replication Data for: Wake merging and turbulence transition downstream of side-by-side porous discs [Dataset]. http://doi.org/10.18710/XAEWC5
    Explore at:
    application/x-rlang-transport(1417054263), application/x-rlang-transport(1363277492), application/x-rlang-transport(1400436794), txt(7883), application/x-rlang-transport(1355448278), application/x-rlang-transport(1069205992), application/x-rlang-transport(1389202797), application/x-rlang-transport(1434576877), application/x-rlang-transport(959411386), application/x-rlang-transport(1373148467), application/x-rlang-transport(1398098974), application/x-rlang-transport(1195049341), application/x-rlang-transport(1605897578), application/x-rlang-transport(1341687981), application/x-rlang-transport(1276474862), application/x-rlang-transport(1097556108), application/x-rlang-transport(1412349302), application/x-rlang-transport(1471679338), application/x-rlang-transport(1292190917), application/x-rlang-transport(1033022936), application/x-rlang-transport(1287168311), application/x-rlang-transport(1425403151), application/x-rlang-transport(1417989437), application/x-rlang-transport(1361195525), application/x-rlang-transport(1313472566). Available download formats
    Dataset updated
    Jun 2, 2025
    Dataset provided by
    DataverseNO
    Authors
    R. Jason Hearst; Fanny Olivia Johannessen Berstad; Ingrid Neunaber
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    These are the streamwise velocity time series measured in the wakes of two sets of porous discs in a side-by-side arrangement, as used in the manuscript "Wake merging and turbulence transition downstream of side-by-side porous discs", accepted by the Journal of Fluid Mechanics. Data were obtained by means of hot-wire anemometry in the Large Scale Wind Tunnel at the Norwegian University of Science and Technology in near-laminar inflow (background turbulence intensity of approximately 0.3%) at an inflow velocity of 10 m/s (diameter-based Reynolds number 125,000). Two types of porous discs with diameter D = 0.2 m were used, one with uniform blockage and one with radially varying blockage. Three spacings, namely 1.5D, 2D and 3D, were investigated. Spanwise profiles were measured at 8D and 30D downstream for each case, and a streamwise profile along the centerline between the discs was additionally obtained. In addition, measurements downstream of both disc types in a single-disc setting are provided for comparison. The scope of these experiments was to study the merging mechanisms of the turbulence where the two wakes meet.

  8. Cleaned NHANES 1988-2018

    • figshare.com
    txt
    Updated Feb 18, 2025
    Cite
    Vy Nguyen; Lauren Y. M. Middleton; Neil Zhao; Lei Huang; Eliseu Verly; Jacob Kvasnicka; Luke Sagers; Chirag Patel; Justin Colacino; Olivier Jolliet (2025). Cleaned NHANES 1988-2018 [Dataset]. http://doi.org/10.6084/m9.figshare.21743372.v9
    Explore at:
    txt. Available download formats
    Dataset updated
    Feb 18, 2025
    Dataset provided by
    figshare
    Authors
    Vy Nguyen; Lauren Y. M. Middleton; Neil Zhao; Lei Huang; Eliseu Verly; Jacob Kvasnicka; Luke Sagers; Chirag Patel; Justin Colacino; Olivier Jolliet
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The National Health and Nutrition Examination Survey (NHANES) provides data with considerable potential for studying the health and environmental exposure of the non-institutionalized US population. However, as NHANES data are plagued with multiple inconsistencies, processing these data is required before deriving new insights through large-scale analyses. Thus, we developed a set of curated and unified datasets by merging 614 separate files and harmonizing unrestricted data across NHANES III (1988-1994) and Continuous NHANES (1999-2018), totaling 135,310 participants and 5,078 variables. The variables convey demographics (281 variables), dietary consumption (324 variables), physiological functions (1,040 variables), occupation (61 variables), questionnaires (1,444 variables, e.g., physical activity, medical conditions, diabetes, reproductive health, blood pressure and cholesterol, early childhood), medications (29 variables), mortality information linked from the National Death Index (15 variables), survey weights (857 variables), environmental exposure biomarker measurements (598 variables), and chemical comments indicating which measurements are below or above the lower limit of detection (505 variables).

    csv Data Record: The curated NHANES datasets and the data dictionaries include 23 .csv files and 1 Excel file. The curated NHANES datasets comprise 20 .csv files, two for each module: one uncleaned version and one cleaned version. The modules are labeled as follows: 1) mortality, 2) dietary, 3) demographics, 4) response, 5) medications, 6) questionnaire, 7) chemicals, 8) occupation, 9) weights, and 10) comments. "dictionary_nhanes.csv" is a dictionary that lists the variable name, description, module, category, units, CAS number, comment use, chemical family, chemical family shortened, number of measurements, and cycles available for all 5,078 variables in NHANES. "dictionary_harmonized_categories.csv" contains the harmonized categories for the categorical variables. "dictionary_drug_codes.csv" contains the dictionary of descriptors for the drug codes. "nhanes_inconsistencies_documentation.xlsx" is an Excel file containing the cleaning documentation, which records all the inconsistencies for all affected variables to help curate each of the NHANES modules.

    R Data Record: For researchers who want to conduct their analysis in the R programming language, the cleaned NHANES modules and the data dictionaries can be downloaded as a .zip file that includes an .RData file and an .R file. "w - nhanes_1988_2018.RData" contains all the aforementioned datasets as R data objects. We make available all R scripts for the customized functions that were written to curate the data. "m - nhanes_1988_2018.R" shows how we used the customized functions (i.e., our pipeline) to curate the original NHANES data.

    Example starter code: The set of starter code to help users conduct exposome analyses consists of four R Markdown files (.Rmd). We recommend going through the tutorials in order. "example_0 - merge_datasets_together.Rmd" demonstrates how to merge the curated NHANES datasets together. "example_1 - account_for_nhanes_design.Rmd" demonstrates how to conduct a linear regression model, a survey-weighted regression model, a Cox proportional hazards model, and a survey-weighted Cox proportional hazards model. "example_2 - calculate_summary_statistics.Rmd" demonstrates how to calculate summary statistics for one or more variables, with and without accounting for the NHANES sampling design. "example_3 - run_multiple_regressions.Rmd" demonstrates how to run multiple regression models with and without adjusting for the sampling design.
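
The module-merging step this collection's starter code demonstrates in R can be sketched in Python with pandas; the toy tables below are invented, assuming the curated modules share NHANES's respondent sequence number (SEQN) as the join key:

```python
import pandas as pd

# Sketch of merging two curated modules. The tables and values are
# invented; the join key assumes NHANES's respondent sequence number, SEQN.
demographics = pd.DataFrame({"SEQN": [1, 2, 3], "age": [34, 51, 28]})
chemicals = pd.DataFrame({"SEQN": [1, 3], "blood_lead": [1.1, 0.8]})

# A left join keeps every participant; unmeasured chemicals become NaN.
merged = demographics.merge(chemicals, on="SEQN", how="left")
print(merged.shape)  # (3, 3)
```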

  9. Medical-R1-Distill-Data

    • huggingface.co
    Updated Apr 3, 2025
    Cite
    Marcus Cedric R. Idia (2025). Medical-R1-Distill-Data [Dataset]. https://huggingface.co/datasets/marcuscedricridia/Medical-R1-Distill-Data
    Explore at:
    Dataset updated
    Apr 3, 2025
    Authors
    Marcus Cedric R. Idia
    Description

    Merged UI Dataset: Medical-R1-Distill-Data

    This dataset was automatically generated by merging and processing the following sources: FreedomIntelligence/Medical-R1-Distill-Data. Generation Timestamp: 2025-04-03 20:06:44. Processing Time: 2.56 seconds. Output Format: sharegpt.

      Processing Summary
    

    Total Datasets Attempted: 1. Datasets Successfully Processed: 1. Datasets Failed/Skipped: 0. Total Input Rows Scanned: 22,000. Total Formatted Entries Generated: 22,000. Entries with… See the full description on the dataset page: https://huggingface.co/datasets/marcuscedricridia/Medical-R1-Distill-Data.

  10. National Health and Nutrition Examination Survey (NHANES)

    • dataverse.harvard.edu
    Updated May 30, 2013
    Cite
    Anthony Damico (2013). National Health and Nutrition Examination Survey (NHANES) [Dataset]. http://doi.org/10.7910/DVN/IMWQPJ
    Explore at:
    Croissant. Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    May 30, 2013
    Dataset provided by
    Harvard Dataverse
    Authors
    Anthony Damico
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    analyze the national health and nutrition examination survey (nhanes) with r. nhanes is this fascinating survey where doctors and dentists accompany survey interviewers in a little mobile medical center that drives around the country. while the survey folks are interviewing people, the medical professionals administer laboratory tests and conduct a real doctor's examination. the blood work and medical exam allow researchers like you and me to answer tough questions like, "how many people have diabetes but don't know they have diabetes?" conducting the lab tests and the physical isn't cheap, so a new nhanes data set becomes available once every two years and only includes about twelve thousand respondents. since the number of respondents is so small, analysts often pool multiple years of data together. the replication scripts below give a few different examples of how multiple years of data can be pooled with r. the survey gets conducted by the centers for disease control and prevention (cdc), and generalizes to the united states non-institutional, non-active-duty military population. most of the data tables produced by the cdc include only a small number of variables, so importation with the foreign package's read.xport function is pretty straightforward. but that makes merging the appropriate data sets trickier, since it might not be clear what to pull for which variables. for every analysis, start with the table with 'demo' in the name -- this file includes basic demographics, weighting, and complex sample survey design variables. since it's quick to download the files directly from the cdc's ftp site, there's no massive ftp download automation script.

    this new github repository contains five scripts:

    2009-2010 interview only - download and analyze.R: download, import, and save the demographics and health insurance files onto your local computer; load both files, limit them to the variables needed for the analysis, and merge them together; perform a few example variable recodes; create the complex sample survey object using the interview weights; run a series of pretty generic analyses on the health insurance questions.

    2009-2010 interview plus laboratory - download and analyze.R: download, import, and save the demographics and cholesterol files onto your local computer; load both files, limit them to the variables needed for the analysis, and merge them together; perform a few example variable recodes; create the complex sample survey object using the mobile examination component (mec) weights; perform a direct-method age-adjustment and match figure 1 of this cdc cholesterol brief.

    replicate 2005-2008 pooled cdc oral examination figure.R: download, import, save, pool, recode, create a survey object, run some basic analyses; replicate figure 3 from this cdc oral health databrief - the whole barplot.

    replicate cdc publications.R: download, import, save, pool, merge, and recode the demographics file plus the cholesterol laboratory, blood pressure questionnaire, and blood pressure laboratory files; match the cdc's example sas and sudaan syntax files' output for descriptive means, descriptive proportions, and descriptive percentiles.

    replicate human exposure to chemicals report.R (user-contributed): download, import, save, pool, merge, and recode the demographics file plus the urinary bisphenol a (bpa) laboratory files; log-transform some of the columns to calculate the geometric means and quantiles; match the 2007-2008 statistics shown on pdf page 21 of the cdc's fourth edition of the report.

    click here to view these five scripts. for more detail about the national health and nutrition examination survey (nhanes), visit the cdc's nhanes homepage and the national cancer institute's page of nhanes web tutorials.

    notes: nhanes includes interview-only weights and interview + mobile examination component (mec) weights. if you only use questions from the basic interview in your analysis, use the interview-only weights (the sample size is a bit larger). i haven't really figured out a use for the interview-only weights -- nhanes draws most of its power from the combination of the interview and the mobile examination component variables. if you're only using variables from the interview, see if you can use a data set with a larger sample size like the current population survey (cps), national health interview survey (nhis), or medical expenditure panel survey (meps) instead.

    confidential to sas, spss, stata, sudaan users: why are you still riding around on a donkey after we've invented the internal combustion engine? time to transition to r. :D
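
The pool-then-merge pattern the replication scripts follow can be sketched in Python with pandas (the scripts themselves are in R; the toy tables below are invented, and real NHANES files are SAS transport files readable with pandas.read_sas):

```python
import pandas as pd

# Two cycles' demographics pooled, then merged with a laboratory table.
# All tables and values here are invented for illustration.
demo_0910 = pd.DataFrame({"SEQN": [1, 2], "cycle": "2009-2010", "age": [40, 55]})
demo_1112 = pd.DataFrame({"SEQN": [3, 4], "cycle": "2011-2012", "age": [23, 61]})
labs = pd.DataFrame({"SEQN": [1, 2, 3, 4], "total_chol": [180, 210, 195, 250]})

# Pool the cycles, then merge the lab results onto the pooled demographics.
pooled = pd.concat([demo_0910, demo_1112], ignore_index=True)
merged = pooled.merge(labs, on="SEQN", how="left")
print(len(merged))  # 4
```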

  11. HIPPO Merged 10-Second Meteorology, Atmospheric Chemistry, and Aerosol Data

    • data.ucar.edu
    ascii
    Updated Dec 26, 2024
    Cite
    Andrew Watt; Anne E. Perring; Benjamin R. Miller; Bin Xiang; Bradley Hall; Britton B. Stephens; Bruce Daube; Christopher A. Pickett-Heaps; Colm Sweeney; Dale Hurst; Daniel J. Jacob; David C. Rogers; David Nance; David W. Fahey; Elliot Atlas; Eric J. Hintsa; Eric Kort; Eric Ray; Fred Moore; Geoff S. Dutton; Greg Santoni; Huiqun Wang; J. Ryan Spackman; James W. Elkins; Jasna V. Pittman; Jenny A. Fisher; Jonathan Bent; Joshua P. Schwarz; Julie Haggerty; Karen H. Rosenlof; Kevin J. Wecht; Laurel A. Watts; Mark Zondlo; Michael J. Mahoney (deceased); Minghui Diao; Pavel Romashkin; Qiaoqiao Wang; Ralph F. Keeling; Richard Lueb; Rodrigo Jimenez-Pizarro; Roger Hendershot; Roisin Commane; Ru-Shan Gao; Samuel J. Oltmans; Stephen A. Montzka; Stephen R. Shertz; Steven C. Wofsy; Stuart Beaton; Sunyoung Park; Teresa Campos; William A. Cooper (2024). HIPPO Merged 10-Second Meteorology, Atmospheric Chemistry, and Aerosol Data [Dataset]. http://doi.org/10.3334/CDIAC/HIPPO_010
    Explore at:
    ascii. Available download formats
    Dataset updated
    Dec 26, 2024
    Dataset provided by
    University Corporation for Atmospheric Research
    Authors
    Andrew Watt; Anne E. Perring; Benjamin R. Miller; Bin Xiang; Bradley Hall; Britton B. Stephens; Bruce Daube; Christopher A. Pickett-Heaps; Colm Sweeney; Dale Hurst; Daniel J. Jacob; David C. Rogers; David Nance; David W. Fahey; Elliot Atlas; Eric J. Hintsa; Eric Kort; Eric Ray; Fred Moore; Geoff S. Dutton; Greg Santoni; Huiqun Wang; J. Ryan Spackman; James W. Elkins; Jasna V. Pittman; Jenny A. Fisher; Jonathan Bent; Joshua P. Schwarz; Julie Haggerty; Karen H. Rosenlof; Kevin J. Wecht; Laurel A. Watts; Mark Zondlo; Michael J. Mahoney (deceased); Minghui Diao; Pavel Romashkin; Qiaoqiao Wang; Ralph F. Keeling; Richard Lueb; Rodrigo Jimenez-Pizarro; Roger Hendershot; Roisin Commane; Ru-Shan Gao; Samuel J. Oltmans; Stephen A. Montzka; Stephen R. Shertz; Steven C. Wofsy; Stuart Beaton; Sunyoung Park; Teresa Campos; William A. Cooper
    Time period covered
    Jan 7, 2009 - Sep 15, 2011
    Area covered
    Description

    This data set provides the merged 10-second data product of meteorological, atmospheric chemistry, and aerosol measurements from all five missions of the HIAPER Pole-to-Pole Observations (HIPPO) study of carbon cycle and greenhouse gases. The missions took place from January 2009 to September 2011. The 10-second merged data product was derived by combining the NSF/NCAR GV aircraft navigation and atmospheric structure parameters (position, time, temperature, pressure, wind speed, etc.), reported at 1-second frequency, with meteorological, atmospheric chemistry, and aerosol measurements made by several teams of investigators on a common time and position basis. Investigators reported most continuously measured parameters at a 1-second interval; these 1-second measurements were aggregated to 10 seconds with a median filter. The fast-sample GC and whole air sample measurements, reported at intervals greater than 10 seconds (15-120 seconds including processing time), were assigned to the most representative 10-second sample interval. A supplementary file (HIPPO_10s_meta_summary.tbl) summarizes the completeness of the reported data values: its entries give the number of non-missing observations for each species in the main data file, per mission and in total. All of the data are provided in one space-delimited ASCII file. Note that EOL Version 1.0 corresponds to R. 20121129 previously served by ORNL.
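    The 1-second-to-10-second aggregation described above can be sketched with pandas. This is a minimal illustration with made-up values, not the actual HIPPO processing pipeline:

    ```python
    import numpy as np
    import pandas as pd

    # Hypothetical 1-second time series of a continuously measured parameter
    # (values and timestamps invented for illustration).
    rng = pd.date_range("2009-01-07 00:00:00", periods=60, freq="1s")
    series = pd.Series(np.arange(60, dtype=float), index=rng)

    # Aggregate 1-second measurements onto a 10-second grid with a median
    # filter, mirroring the merge strategy described above.
    merged_10s = series.resample("10s").median()

    print(merged_10s.iloc[0])  # median of values 0..9 -> 4.5
    ```

    The same resample-and-median pattern generalizes to any set of 1-second columns sharing the time index.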

  12. BRAINTEASER ALS and MS Datasets

    • zenodo.org
    • data.europa.eu
    Updated Jul 10, 2024
    + more versions
    Cite
    Guglielmo Faggioli; Alessandro Guazzo; Stefano Marchesin; Laura Menotti; Isotta Trescato; Helena Aidos; Roberto Bergamaschi; Giovanni Birolo; Paola Cavalla; Adriano Chiò; Arianna Dagliati; Mamede de Carvalho; Giorgio Maria Di Nunzio; Piero Fariselli; Jose Manuel García Dominguez; Marta Gromicho; Enrico Longato; Sara C. Madeira; Umberto Manera; Gianmaria Silvello; Eleonora Tavazzi; Erica Tavazzi; Marta Vettoretti; Barbara Di Camillo; Nicola Ferro (2024). BRAINTEASER ALS and MS Datasets [Dataset]. http://doi.org/10.5281/zenodo.8083181
    Explore at:
    Dataset updated
    Jul 10, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Guglielmo Faggioli; Alessandro Guazzo; Stefano Marchesin; Laura Menotti; Isotta Trescato; Helena Aidos; Roberto Bergamaschi; Giovanni Birolo; Paola Cavalla; Adriano Chiò; Arianna Dagliati; Mamede de Carvalho; Giorgio Maria Di Nunzio; Piero Fariselli; Jose Manuel García Dominguez; Marta Gromicho; Enrico Longato; Sara C. Madeira; Umberto Manera; Gianmaria Silvello; Eleonora Tavazzi; Erica Tavazzi; Marta Vettoretti; Barbara Di Camillo; Nicola Ferro
    Description

    BRAINTEASER (Bringing Artificial Intelligence home for a better care of amyotrophic lateral sclerosis and multiple sclerosis) is a data science project that seeks to exploit the value of big data, including those related to health, lifestyle habits, and environment, to support patients with Amyotrophic Lateral Sclerosis (ALS) and Multiple Sclerosis (MS) and their clinicians. Taking advantage of cost-efficient sensors and apps, BRAINTEASER will integrate large, clinical datasets that host both patient-generated and environmental data.

    As part of its activities, BRAINTEASER organized two open evaluation challenges on Intelligent Disease Progression Prediction (iDPP), iDPP@CLEF 2022 and iDPP@CLEF 2023, co-located with the Conference and Labs of the Evaluation Forum (CLEF).

    The goal of iDPP@CLEF is to design and develop an evaluation infrastructure for AI algorithms able to:

    • better describe disease mechanisms;
    • stratify patients according to their phenotype assessed all over the disease evolution;
    • predict disease progression in a probabilistic, time dependent fashion.

    The iDPP@CLEF challenges relied on retrospective ALS and MS patient data made available by the clinical partners of the BRAINTEASER consortium. The datasets contain data about 2,204 ALS patients (static variables, ALSFRS-R questionnaires, spirometry tests, environmental/pollution data) and 1,792 MS patients (static variables, EDSS scores, evoked potentials, relapses, MRIs).

    In more detail, the BRAINTEASER retrospective datasets derive from the merging of already existing datasets obtained by the clinical centres involved in the BRAINTEASER project.

    • The ALS dataset was obtained by merging and homogenising the Piemonte and Valle d'Aosta Registry for Amyotrophic Lateral Sclerosis (PARALS; Chiò et al., 2017) and the Lisbon ALS clinic dataset (CENTRO ACADÉMICO DE MEDICINA DE LISBOA, Centro Hospitalar Universitário de Lisboa-Norte, Hospital de Santa Maria, Lisbon, Portugal). Both datasets were initiated in 1995 and are currently maintained by researchers of the ALS Regional Expert Centre (CRESLA), University of Turin, and of the CENTRO ACADÉMICO DE MEDICINA DE LISBOA-Instituto de Medicina Molecular, Faculdade de Medicina, Universidade de Lisboa. They include demographic and clinical data, comprising both static and dynamic variables.
    • The MS dataset was obtained from the Pavia MS clinical dataset, which was started in 1990 and contains demographic and clinical information continuously updated by researchers of the Institute, and the Turin MS clinic dataset (Department of Neurosciences and Mental Health, Neurology Unit 1, Città della Salute e della Scienza di Torino).
    • Retrospective environmental data are available at various scales at the individual subject level:
      • Macroscale air pollution data were gathered from public monitoring stations covering the whole extension of the involved countries, via the European Air Quality Portal;
      • Data from a network of air quality sensors (PurpleAir PA-II outdoor air quality monitors) installed at different points in the city of Pavia (Italy) were extracted as well. In both cases, the environmental data were already publicly available. To merge environmental data with individual subject locations, postcodes were used (the postcode of the pollutant-detection station and the postcode of the subject's address). Data were merged following an anonymisation procedure based on hash keys. Environmental exposure trajectories were pre-processed and aggregated to avoid fine temporal and spatial granularities, so individual exposure information could not disclose personal addresses.
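    The postcode-keyed, anonymised join described above can be sketched as follows. This is a minimal illustration only: the table layouts, postcode values, and the choice of SHA-256 as the hash are assumptions, not the project's actual pipeline.

    ```python
    import hashlib

    import pandas as pd

    def hash_key(postcode: str) -> str:
        """Anonymising hash key derived from a postcode (illustrative only)."""
        return hashlib.sha256(postcode.encode("utf-8")).hexdigest()

    # Hypothetical tables: subject addresses and station pollutant readings,
    # each keyed by postcode (all values invented for illustration).
    subjects = pd.DataFrame({"subject_id": [1, 2], "postcode": ["27100", "27029"]})
    stations = pd.DataFrame({"postcode": ["27100", "27029"], "pm25": [18.2, 12.7]})

    # Replace raw postcodes with hash keys before merging, so the merged
    # table never carries a plain-text address component.
    subjects["key"] = subjects.pop("postcode").map(hash_key)
    stations["key"] = stations.pop("postcode").map(hash_key)
    exposure = subjects.merge(stations, on="key").drop(columns="key")

    print(exposure)
    ```

    The hash key lets the two parties link records without either side revealing raw postcodes in the shared table.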

    The datasets are shared in two formats:

    • RDF (serialized in Turtle) modeled according to the BRAINTEASER Ontology (BTO);
    • CSV, as shared during the iDPP@CLEF 2022 and 2023 challenges, split into training and test.

    Each format corresponds to a specific folder in the datasets, where a dedicated README file provides further details on the datasets. Note that the ALS dataset is split into multiple ZIP files due to the size of the environmental data.

    The BRAINTEASER Data Sharing Policy section below reports the details for requesting access to the datasets.

  13. Data underlying the publication: Modelling perceived risk and trust in...

    • data.4tu.nl
    zip
    Updated Oct 20, 2023
    Cite
    Xiaolin He; J.C.J. (Jork) Stapel; Meng Wang; R. (Riender) Happee (2023). Data underlying the publication: Modelling perceived risk and trust in driving automation reacting to merging and braking vehicles [Dataset]. http://doi.org/10.4121/95a4bb4e-3ca4-4fcc-ba34-4be76a9ab578.v1
    Explore at:
    zipAvailable download formats
    Dataset updated
    Oct 20, 2023
    Dataset provided by
    4TU.ResearchData
    Authors
    Xiaolin He; J.C.J. (Jork) Stapel; Meng Wang; R. (Riender) Happee
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Dataset funded by
    European Union’s Horizon 2020
    Description

    This dataset is derived from a driving simulator study that explored the dynamics of perceived risk and trust in the context of driving automation. The study involved 25 participants who were tasked with monitoring SAE Level 2 driving automation features (Adaptive Cruise Control and Lane Centering) while encountering various driving scenarios on a motorway. These scenarios included merging and hard-braking events with different levels of criticality.

    This dataset contains kinetic data from the driving simulator, capturing variables such as vehicle position, velocity, and acceleration, among others. Subjective ratings of perceived risk and trust, collected post-event for regression analysis, are also included.

  14. Mixmix-LLaMAX

    • huggingface.co
    Updated Apr 3, 2025
    Cite
    Marcus Cedric R. Idia (2025). Mixmix-LLaMAX [Dataset]. https://huggingface.co/datasets/marcuscedricridia/Mixmix-LLaMAX
    Explore at:
    Dataset updated
    Apr 3, 2025
    Authors
    Marcus Cedric R. Idia
    Description

    Merged UI Dataset: Mixmix-LLaMAX

    This dataset was automatically generated by merging and processing the following sources: marcuscedricridia/s1K-claude-3-7-sonnet, marcuscedricridia/Creative_Writing-ShareGPT-deepclean-sharegpt, marcuscedricridia/Medical-R1-Distill-Data-deepclean-sharegpt, marcuscedricridia/Open-Critic-GPT-deepclean-sharegpt, marcuscedricridia/kalo-opus-instruct-22k-no-refusal-deepclean-sharegpt, marcuscedricridia/unAIthical-ShareGPT-deepclean-sharegpt… See the full description on the dataset page: https://huggingface.co/datasets/marcuscedricridia/Mixmix-LLaMAX.
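    Merging several ShareGPT-style source datasets into one, with a provenance tag per record, can be sketched as below. The record structure is invented for illustration; only the source names come from the list above.

    ```python
    # Minimal sketch of merging ShareGPT-style conversation lists into one
    # dataset with a provenance field (record structure is invented).
    sources = {
        "marcuscedricridia/s1K-claude-3-7-sonnet": [{"conversations": ["hi", "hello"]}],
        "marcuscedricridia/Open-Critic-GPT-deepclean-sharegpt": [{"conversations": ["q", "a"]}],
    }

    merged = [
        {**record, "source": name}
        for name, records in sources.items()
        for record in records
    ]

    print(len(merged))  # one record per source in this toy example
    ```

    Keeping a `source` field makes it possible to trace any merged record back to its originating dataset.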

  15. Scripts for Analysis

    • figshare.com
    txt
    Updated Jul 18, 2018
    Cite
    Sneddon Lab UCSF (2018). Scripts for Analysis [Dataset]. http://doi.org/10.6084/m9.figshare.6783569.v2
    Explore at:
    txtAvailable download formats
    Dataset updated
    Jul 18, 2018
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Sneddon Lab UCSF
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    Scripts used for analysis of V1 and V2 datasets.

    • seurat_v1.R: initialize seurat object from 10X Genomics cellranger outputs. Includes filtering, normalization, regression, variable gene identification, PCA analysis, clustering, tSNE visualization. Used for v1 datasets.
    • merge_seurat.R: merge two or more seurat objects into one seurat object. Perform linear regression to remove batch effects from separate objects. Used for v1 datasets.
    • subcluster_seurat_v1.R: subcluster clusters of interest from Seurat object. Determine variable genes, perform regression and PCA. Used for v1 datasets.
    • seurat_v2.R: initialize seurat object from 10X Genomics cellranger outputs. Includes filtering, normalization, regression, variable gene identification, and PCA analysis. Used for v2 datasets.
    • clustering_markers_v2.R: clustering and tSNE visualization for v2 datasets.
    • subcluster_seurat_v2.R: subcluster clusters of interest from Seurat object. Determine variable genes, perform regression and PCA analysis. Used for v2 datasets.
    • seurat_object_analysis_v1_and_v2.R: downstream analysis and plotting functions for seurat object created by seurat_v1.R or seurat_v2.R.
    • merge_clusters.R: merge clusters that do not meet gene threshold. Used for both v1 and v2 datasets.
    • prepare_for_monocle_v1.R: subcluster cells of interest and perform linear regression, but not scaling, in order to input normalized, regressed values into monocle with monocle_seurat_input_v1.R.
    • monocle_seurat_input_v1.R: monocle script using seurat batch corrected values as input for v1 merged timecourse datasets.
    • monocle_lineage_trace.R: monocle script using nUMI as input for v2 lineage traced dataset.
    • monocle_object_analysis.R: downstream analysis for monocle object: BEAM and plotting.
    • CCA_merging_v2.R: script for merging v2 endocrine datasets with canonical correlation analysis and determining the number of CCs to include in downstream analysis.
    • CCA_alignment_v2.R: script for downstream alignment, clustering, tSNE visualization, and differential gene expression analysis.
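    The scripts above are R/Seurat code that is not reproduced here. As a rough illustration of the linear-regression batch-correction step that merge_seurat.R performs, here is a self-contained numpy sketch on synthetic data (not the authors' code): regress each gene on a batch indicator and keep the residuals.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    # Synthetic expression matrix: 100 cells x 5 genes, with an additive
    # batch offset applied to the second half of the cells.
    expr = rng.normal(size=(100, 5))
    batch = np.repeat([0.0, 1.0], 50)
    expr[batch == 1] += 2.0  # simulated batch effect

    # Regress each gene on the batch indicator (plus intercept) and keep
    # the residuals -- the "regress out batch" step of a merge script.
    X = np.column_stack([np.ones_like(batch), batch])
    beta, *_ = np.linalg.lstsq(X, expr, rcond=None)
    corrected = expr - X @ beta

    # After correction, per-batch means of every gene agree (both ~0).
    print(np.abs(corrected[batch == 0].mean(0) - corrected[batch == 1].mean(0)).max())
    ```

    Seurat's actual regression additionally handles covariates such as nUMI and mitochondrial fraction; the indicator-only version shown here is the simplest instance of the same idea.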

  16. [Superseded] Intellectual Property Government Open Data 2019

    • researchdata.edu.au
    • data.gov.au
    Updated Jun 6, 2019
    + more versions
    Cite
    IP Australia (2019). [Superseded] Intellectual Property Government Open Data 2019 [Dataset]. https://researchdata.edu.au/superseded-intellectual-property-data-2019/2994670
    Explore at:
    Dataset updated
    Jun 6, 2019
    Dataset provided by
    Data.gov (https://data.gov/)
    Authors
    IP Australia
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    What is IPGOD?

    The Intellectual Property Government Open Data (IPGOD) includes over 100 years of registry data on all intellectual property (IP) rights administered by IP Australia. It also has derived information about the applicants who filed these IP rights, to allow for research and analysis at the regional, business and individual level. This is the 2019 release of IPGOD.

    How do I use IPGOD?

    IPGOD is large, with millions of data points across up to 40 tables, making it too large to open with Microsoft Excel. Furthermore, analysis often requires information from separate tables, which would need specialised software for merging. We recommend that advanced users interact with the IPGOD data using the right tools, with enough memory and compute power. This includes a wide range of programming and statistical software such as Tableau, Power BI, Stata, SAS, R, Python, and Scala.
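    Joining two IPGOD tables in pandas might look like the sketch below. The table and column contents are invented stand-ins; only the `ipa_id` identifier name comes from the dataset description.

    ```python
    import pandas as pd

    # Invented stand-ins for two IPGOD tables sharing an applicant identifier.
    applicants = pd.DataFrame({
        "ipa_id": [101, 102, 103],
        "applicant_name": ["Acme Pty Ltd", "Beta Labs", "Gamma Co"],
    })
    patents = pd.DataFrame({
        "ipa_id": [101, 101, 103],
        "application_year": [2017, 2018, 2019],
    })

    # A left join keeps every applicant, with NaN where no applications exist.
    merged = applicants.merge(patents, on="ipa_id", how="left")

    print(len(merged))  # 4 rows: two for 101, one each for 102 and 103
    ```

    The same pattern scales to the real 40-table releases, though for millions of rows a database or chunked processing may be preferable.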

    IP Data Platform

    IP Australia is also providing free trials of a cloud-based analytics platform with the capabilities to enable working with large intellectual property datasets, such as the IPGOD, through the web browser, without any installation of software.

    References

    The following pages can help you gain an understanding of intellectual property administration and processes in Australia to help your analysis of the dataset:

    • Patents
    • Trade Marks
    • Designs
    • Plant Breeder's Rights

    Updates

    Tables and columns

    Due to the changes in our systems, some tables have been affected.

    • We have added IPGOD 225 and IPGOD 325 to the dataset!
    • The IPGOD 206 table is not available this year.
    • Many tables have been re-built, and as a result may have different columns or different possible values. Please check the data dictionary for each table before use.

    Data quality improvements

    Data quality has been improved across all tables.

    • Null values are simply empty rather than '31/12/9999'.
    • All date columns are now in ISO format 'yyyy-mm-dd'.
    • All indicator columns have been converted to Boolean data type (True/False) rather than Yes/No, Y/N, or 1/0.
    • All tables are encoded in UTF-8.
    • All tables use the backslash \ as the escape character.
    • The applicant name cleaning and matching algorithms have been updated. We believe that this year's method improves the accuracy of the matches. Please note that the "ipa_id" generated in IPGOD 2019 will not match with those in previous releases of IPGOD.
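    The conventions listed above (empty nulls, ISO dates, UTF-8, backslash escapes) map directly onto `pandas.read_csv` options. A sketch with an invented table fragment:

    ```python
    import io

    import pandas as pd

    # A fragment shaped like a cleaned IPGOD table (contents invented):
    # empty strings for nulls, ISO 'yyyy-mm-dd' dates, True/False indicators,
    # UTF-8 encoding, and backslash as the escape character.
    csv_text = (
        "ipa_id,filing_date,is_active,applicant_name\n"
        "101,2019-06-06,True,Acme\\, Ltd\n"   # escaped comma inside the name
        "102,,False,Beta Labs\n"              # empty field -> null date
    )

    ipgod = pd.read_csv(
        io.StringIO(csv_text),
        parse_dates=["filing_date"],
        escapechar="\\",
        encoding="utf-8",
    )

    print(ipgod["filing_date"].isna().sum())  # 1 null, stored as NaT rather than 31/12/9999
    ```

    With these options the empty date parses to `NaT` and the indicator column arrives as a proper boolean dtype, so no post-hoc recoding of '31/12/9999' or Y/N values is needed.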

  17. H

    National Health Interview Survey (NHIS)

    • dataverse.harvard.edu
    Updated May 30, 2013
    Cite
    Anthony Damico (2013). National Health Interview Survey (NHIS) [Dataset]. http://doi.org/10.7910/DVN/BYPZ8N
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    May 30, 2013
    Dataset provided by
    Harvard Dataverse
    Authors
    Anthony Damico
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    analyze the national health interview survey (nhis) with r the national health interview survey (nhis) is a household survey about health status and utilization. each annual data set can be used to examine the disease burden and access to care that individuals and families are currently experiencing across the country. check out the wikipedia article (ohh hayy i wrote that) for more detail about its current and potential uses. if you're cooking up a health-related analysis that doesn't need medical expenditures or monthly health insurance coverage, look at nhis before the medical expenditure panel survey (its sample is twice as big). the centers for disease control and prevention (cdc) has been keeping nhis real since 1957, and the scripts below automate the download, importation, and analysis of every file back to 1963. what happened in 1997, you ask? scientists cloned dolly the sheep, clinton started his second term, and the national health interview survey underwent its most recent major questionnaire re-design. here's how all the moving parts work: a person-level file (personsx) that merges onto other files using unique household (hhx), family (fmx), and person (fpx) identifiers. [note to data historians: prior to 2004, person number was (px) and unique within each household.] this file includes the complex sample survey variables needed to construct a taylor-series linearization design, and should be used if your analysis doesn't require variables from the sample adult or sample child files. this survey setup generalizes to the noninstitutional, non-active duty military population. a family-level file that merges onto other files using unique household (hhx) and family (fmx) identifiers. a household-level file that merges onto other files using the unique household (hhx) identifier. a sample adult file that includes questions asked of only one adult within each household (selected at random) - a subset of the main person-level file.
    hhx, fmx, and fpx identifiers will merge with each of the files above, but since not every adult gets asked these questions, this file contains its own set of weights: wtfa_sa instead of wtfa. you can merge on whatever other variables you need from the three files above, but if your analysis requires any variables from the sample adult questionnaire, you can't use records in the person-level file that aren't also in the sample adult file (a big sample size cut). this survey setup generalizes to the noninstitutional, non-active duty military adult population. a sample child file that includes questions asked of only one child within each household (if available, and also selected at random) - another subset of the main person-level file. same deal as the sample adult description, except use wtfa_sc instead of wtfa. oh yeah, and this one generalizes to the child population. five imputed income files. if you want income and/or poverty variables incorporated into any part of your analysis, you'll need these puppies. the replication example below uses these, but if that's impenetrable, post in the comments describing where you get stuck. some injury stuff and other miscellanea that varies by year. if anyone uses this, please share your experience. if you use anything more than the personsx file alone, you'll need to merge some tables together. make sure you understand the difference between setting the parameter all = TRUE versus all = FALSE -- not everyone in the personsx file has a record in the samadult and samchild files.
    this new github repository contains four scripts:

    • 1963-2011 - download all microdata.R: loop through every year and download every file hosted on the cdc's nhis ftp site; import each file into r with SAScii; save each file as an r data file (.rda); download all the documentation into the year-specific directory
    • 2011 personsx - analyze.R: load the r data file (.rda) created by the download script (above); set up a taylor-series linearization survey design outlined on page 6 of this survey document; perform a smattering of analysis examples
    • 2011 personsx plus samadult with multiple imputation - analyze.R: load the personsx and samadult r data files (.rda) created by the download script (above); merge the personsx and samadult files, highlighting how to conduct analyses that need both; create tandem survey designs for both personsx-only and merged personsx-samadult files; perform just a touch of analysis examples; load and loop through the five imputed income files, tack them onto the personsx-samadult file; conduct a poverty recode or two; analyze the multiply-imputed survey design object, just like mom used to analyze
    • replicate cdc tecdoc - 2000 multiple imputation.R: download and import the nhis 2000 personsx and imputed income files, using SAScii and this imputed income sas importation script (no longer hosted on the cdc's nhis ftp site); loop through each of the five imputed income files, merging each to the personsx file and performing the same set of...
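    The `all = TRUE` versus `all = FALSE` distinction in R's `merge()` that the description warns about corresponds to outer versus inner joins. In pandas terms, with toy stand-ins for the personsx and samadult files:

    ```python
    import pandas as pd

    # Toy stand-ins: every person appears in personsx, but only one adult
    # per household is in samadult (identifiers invented for illustration).
    personsx = pd.DataFrame({"hhx": [1, 1, 2], "fpx": [1, 2, 1], "wtfa": [900, 850, 1100]})
    samadult = pd.DataFrame({"hhx": [1, 2], "fpx": [1, 1], "wtfa_sa": [1800, 2200]})

    # all = FALSE in R merge() ~ an inner join: drops persons without a
    # sample-adult record (the "big sample size cut" the text warns about).
    inner = personsx.merge(samadult, on=["hhx", "fpx"], how="inner")

    # all = TRUE ~ a full outer join: keeps everyone, with NaN for the
    # sample-adult columns of persons who were not selected.
    outer = personsx.merge(samadult, on=["hhx", "fpx"], how="outer")

    print(len(inner), len(outer))  # 2 3
    ```

    Either way, analyses that use sample-adult variables must also switch to the `wtfa_sa` weights, exactly as the description says.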

  18. Data from: A High Statistics Measurement of the Proton Structure Functions...

    • hepdata.net
    Cite
    A High Statistics Measurement of the Proton Structure Functions F(2) (x, Q**2) and R from Deep Inelastic Muon Scattering at High Q**2 [Dataset]. http://doi.org/10.17182/hepdata.12557.v1
    Explore at:
    Description

    CERN-SPS. NA4/BCDMS collaboration. Plab 100 - 280 GeV/c. These are data from the BCDMS collaboration on F2 and R=SIG(L)/SIG(T) with a hydrogen target. The statistics are very large (1.8 million events). The ranges of X and Q**2 are 0.06<X<0.8 and 7<Q**2<260 GeV**2. The F2 data show a distinct difference from the data on F2 proton taken by the EMC. The publication lists values of F2 corresponding to R=0 and R=R(QCD) at each of the four energies: 100, 120, 200 and 280 GeV. As well as the statistical errors, also given are 5 factors representing the effects of estimated systematic errors on F2, associated with (1) beam momentum calibration, (2) magnetic field calibration, (3) spectrometer resolution, (4) detector and trigger inefficiencies, and (5) relative normalisation uncertainty of data taken from external and internal targets. This record contains our attempt to merge these data at different energies using the statistical errors as weight factors. The final one-sigma systematic errors given here have been calculated using a prescription from the authors: new merged F2 values are computed with each of the systematic errors applied individually, and the differences between the new merged F2 values and the original F2 are then combined in quadrature. The individual F2 values at each energy (PLAB = 100, 120, 200 and 280 GeV/c) are given in separate database records (RED = 3021); in those records the combined systematic error shown in the tables is the quadratic sum of the 5 individual errors, and each points back to this merged record (RED = 3019).
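    The merging prescription described above (statistical errors as weights, systematic shifts re-merged and combined in quadrature) can be sketched with invented F2 values:

    ```python
    import math

    def weighted_merge(values, stat_errors):
        """Merge measurements at different energies using inverse-variance weights."""
        weights = [1.0 / e**2 for e in stat_errors]
        total = sum(weights)
        value = sum(w * v for w, v in zip(weights, values)) / total
        return value, math.sqrt(1.0 / total)

    # Invented F2 measurements of one (x, Q**2) bin at four beam energies.
    f2 = [0.305, 0.298, 0.310, 0.300]
    stat = [0.010, 0.008, 0.012, 0.009]

    merged, merged_stat = weighted_merge(f2, stat)

    # Systematic error: re-merge with each systematic shift applied
    # individually, then combine the differences from the nominal merged
    # value in quadrature (the authors' prescription, on toy numbers).
    shifts = [0.004, 0.002, 0.003]  # invented one-sigma shifts per source
    diffs = [weighted_merge([v + s for v in f2], stat)[0] - merged for s in shifts]
    merged_syst = math.sqrt(sum(d**2 for d in diffs))

    print(round(merged, 4), round(merged_syst, 4))
    ```

    With a fully correlated shift, as here, the quadrature sum reduces to the quadratic sum of the individual shifts; the real analysis applies energy-dependent shifts, so the differences must be recomputed per bin.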

  19. MERRA-2 subset for evaluation of renewables with merra2ools R-package:...

    • datadryad.org
    zip
    Updated Mar 29, 2021
    Cite
    Oleg Lugovoy; Shuo Gao (2021). MERRA-2 subset for evaluation of renewables with merra2ools R-package: 1980-2020 hourly, 0.5° lat x 0.625° lon global grid [Dataset]. http://doi.org/10.5061/dryad.v41ns1rtt
    Explore at:
    zipAvailable download formats
    Dataset updated
    Mar 29, 2021
    Dataset provided by
    Dryad
    Authors
    Oleg Lugovoy; Shuo Gao
    Time period covered
    Mar 19, 2021
    Description

    The merra2ools dataset has been assembled through the following steps:

    • The MERRA-2 collections tavg1_2d_flx_Nx (Surface Flux Diagnostics), tavg1_2d_rad_Nx (Radiation Diagnostics), and tavg1_2d_slv_Nx (Single-level atmospheric state variables) were downloaded from the NASA Goddard Earth Sciences (GES) Data and Information Services Center (DISC) (https://disc.gsfc.nasa.gov/datasets?project=MERRA-2) using the GNU Wget network utility (https://disc.gsfc.nasa.gov/data-access). Each of the three collections consists of daily netCDF-4 files with 3-dimensional variables (lon x lat x hour).
    • The following variables were obtained from the netCDF-4 files and merged into long-term time series:
      • Northward (V) and Eastward (U) wind at 10 and 50 meters (V10M, V50M, U10M, U50M, respectively), and 10-meter air temperature (T10M) from the tavg1_2d_slv_Nx collection;
      • Incident shortwave land (SWGDN) and Surface albedo (ALBEDO) fro...
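    The "merged into long-term time series" step amounts to flattening each daily (lat x lon x hour) array onto a single time axis per grid cell and concatenating across days. A toy numpy/pandas sketch (grid size, dates, and values invented; the real pipeline reads netCDF-4 files):

    ```python
    import numpy as np
    import pandas as pd

    rng = np.random.default_rng(1)

    # Two invented "daily files" of one variable on a tiny 2x3 (lat x lon)
    # grid, each with 24 hourly slices, standing in for the netCDF-4 files.
    daily_files = [rng.normal(size=(2, 3, 24)) for _ in range(2)]

    frames = []
    for day, arr in enumerate(daily_files):
        n_lat, n_lon, n_hour = arr.shape
        lat, lon, hour = np.meshgrid(
            range(n_lat), range(n_lon), range(n_hour), indexing="ij"
        )
        frames.append(pd.DataFrame({
            "time": pd.Timestamp("2020-01-01")
                    + pd.to_timedelta(day * 24 + hour.ravel(), unit="h"),
            "lat_idx": lat.ravel(),
            "lon_idx": lon.ravel(),
            "T10M": arr.ravel(),
        }))

    # Concatenating the per-day frames yields the long-term hourly series.
    timeseries = pd.concat(frames, ignore_index=True)

    print(len(timeseries))  # 2 days x 2 x 3 x 24 = 288 rows
    ```

    At the real 0.5° x 0.625° global resolution the same reshaping applies, just with far larger arrays, which is why the published subset is distributed pre-merged.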
    
  20. Replication code and data for: Delineating Neighborhoods: An approach...

    • rdr.kuleuven.be
    bin, html, png +6
    Updated Apr 10, 2024
    Cite
    Anirudh Govind; Anirudh Govind; Ate Poorthuis; Ate Poorthuis; Ben Derudder; Ben Derudder (2024). Replication code and data for: Delineating Neighborhoods: An approach combining urban morphology with point and flow datasets [Dataset]. http://doi.org/10.48804/NBDJE3
    Explore at:
    bin, txt, png, html, xml, and markdown files of various sizes (full list of several hundred download files truncated)
zip(71461515), bin(147726), bin(141483), bin(38298), bin(144128), bin(147509), bin(138527), bin(38562), bin(49550), bin(147720), bin(55811), bin(147588), bin(141189), bin(51919), bin(52581), bin(52574), bin(138931), bin(136488), bin(52442), bin(145456), bin(55946), bin(38426), bin(221261), bin(53305), bin(54212), bin(53933), bin(137981), bin(215864), bin(38458), bin(15669), bin(147072), bin(52029), bin(52373), png(22050), bin(171981), bin(51092), bin(38338), bin(143048), bin(147589), bin(137229), bin(201515), bin(145844), bin(51239), bin(207075), bin(141516), bin(56100), bin(53479), bin(142206), bin(202428), bin(146503), text/markdown(2537), bin(52348), bin(210816), bin(138985), bin(141883), bin(215142), bin(145346), bin(212602), bin(53040), bin(210271), html(11382799), bin(54251), text/markdown(2927), bin(220628), bin(51572), bin(146160), bin(53290), bin(53341), bin(51182), bin(147124), bin(138076), bin(147646), bin(51443), bin(205), bin(141333), bin(50795), bin(145150), bin(52694), bin(144314), bin(56054), bin(140497), bin(54762), text/markdown(18667), bin(143217), bin(42805), bin(207324), bin(147725), bin(146383), bin(147372), bin(52869), bin(10753), bin(51409), bin(69349), bin(147652), bin(146964), bin(56011), bin(49926), text/markdown(1946), text/markdown(2094), txt(22443850), bin(139563), bin(38308), bin(51689), bin(55714), bin(51099), bin(1123), bin(142747), bin(49783), bin(50996), bin(147590), bin(137521), bin(53435), bin(55823), bin(55329), bin(52053), bin(55931), bin(54915), bin(51267), bin(136986), bin(53233), bin(38321), bin(214403), bin(55335), text/markdown(1639), bin(137289), bin(50861), bin(55993), bin(53177), bin(54335), bin(142023), bin(10506), bin(43412), bin(141453), bin(49361), bin(218967), bin(56045), bin(54839), bin(54588), bin(4303), bin(51536), bin(137530), bin(204934), bin(53679), bin(50397), bin(138237), bin(51752), bin(142127), bin(145031), bin(147556), bin(138813), bin(55989), bin(51168), bin(54978), bin(147638), bin(142467), 
bin(51302), bin(53767), bin(52008), bin(54703), bin(52854), bin(138637), bin(143450), bin(7668), bin(53728), bin(49377), bin(51202), bin(50060), bin(55187), bin(56027), bin(55267), bin(53147), bin(56009), bin(49558), bin(51838), text/markdown(17298), text/comma-separated-values(849), bin(208065), bin(42983), bin(54152), bin(53032), bin(55670), bin(207805), bin(207382), bin(139319), png(40776), bin(139593), bin(52905), bin(211534), bin(52752), bin(53985), bin(221257), bin(38333), bin(53618), bin(141010), bin(216948), bin(53261), bin(214535), bin(52879), bin(52046), bin(38377), type/x-r-syntax(12363), bin(52921), bin(138081), bin(53590), bin(53561), bin(52458), bin(52685), bin(52509), bin(144626), bin(50882), bin(49285), bin(52662), text/markdown(11222), bin(51639), bin(51929), bin(49551), bin(55047), bin(38362), bin(55378), bin(221225), txt(6846843), bin(146471), bin(140097), bin(49561)Available download formats
    Dataset updated
    Apr 10, 2024
    Dataset provided by
    KU Leuven RDR
    Authors
    Anirudh Govind; Ate Poorthuis; Ben Derudder
    License

    https://rdr.kuleuven.be/api/datasets/:persistentId/versions/1.1/customlicense?persistentId=doi:10.48804/NBDJE3

    Description

    This repository contains the R code and aggregated data needed to replicate the analysis in our paper "Delineating Neighborhoods: An approach combining urban morphology with point and flow datasets". The enclosed renv.lock file provides details of the R packages used. Aside from these packages, an installation of the Infomap algorithm (freely available through standalone installations and Docker images) is also necessary but is not included in this repository. All code is organized in computational notebooks, arranged sequentially. Data required to execute these notebooks is stored in the data/ folder. For further details, please refer to the enclosed 'README' file and the original publication.
