100+ datasets found
  1. Collection of example datasets used for the book - R Programming -...

    • figshare.com
    txt
    Updated Dec 4, 2023
    Cite
    Kingsley Okoye; Samira Hosseini (2023). Collection of example datasets used for the book - R Programming - Statistical Data Analysis in Research [Dataset]. http://doi.org/10.6084/m9.figshare.24728073.v1
    Explore at:
    txt (available download formats)
    Dataset updated
    Dec 4, 2023
    Dataset provided by
    figshare
    Authors
    Kingsley Okoye; Samira Hosseini
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This book is written for statisticians, data analysts, programmers, researchers, teachers, students, professionals, and general consumers on how to perform different types of statistical data analysis for research purposes using the R programming language. R is an open-source software and object-oriented programming language with a development environment (IDE) called RStudio for computing statistics and graphical displays through data manipulation, modelling, and calculation. R packages and supported libraries provide a wide range of functions for programming and analyzing data. Unlike many existing statistical software packages, R has the added benefit of allowing users to write more efficient code by using command-line scripting and vectors. It has several built-in functions and libraries that are extensible and allow users to define their own (customized) functions specifying how they expect the program to behave while handling the data, which can also be stored in the simple object system.

    For all intents and purposes, this book serves as both textbook and manual for R statistics, particularly in academic research, data analytics, and computer programming, targeted to help inform and guide the work of R users and statisticians. It provides information about different types of statistical data analysis and methods, and the best scenarios for use of each case in R. It gives a hands-on, step-by-step practical guide on how to identify and conduct the different parametric and non-parametric procedures. This includes a description of the different conditions or assumptions that are necessary for performing the various statistical methods or tests, and how to understand the results of the methods. The book also covers the different data formats and sources, and how to test for reliability and validity of the available datasets. Different research experiments, case scenarios, and examples are explained in this book. It is the first book to provide a comprehensive description and step-by-step practical hands-on guide to carrying out the different types of statistical analysis in R, particularly for research purposes, with examples: from how to import and store datasets in R as objects, how to code and call the methods or functions for manipulating the datasets or objects, factorization, and vectorization, to better reasoning, interpretation, and storage of the results for future use, and graphical visualizations and representations. Thus, it is a congruence of statistics and computer programming for research.
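
    As a flavor of the workflow the book describes, here is a minimal R sketch (the file and column names are illustrative, not taken from the collection itself):

    # Import a dataset as an R object, then define and apply a custom function.
    scores <- read.csv("example_scores.csv")   # illustrative file name
    summarize_col <- function(x) {
      c(mean = mean(x, na.rm = TRUE), sd = sd(x, na.rm = TRUE))
    }
    summarize_col(scores$value)                # "value" is a hypothetical column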

  2. Randomized Hourly Load Data for use with Taxonomy Distribution Feeders.

    • datadiscoverystudio.org
    • data.wu.ac.at
    Updated Aug 29, 2017
    Cite
    (2017). Randomized Hourly Load Data for use with Taxonomy Distribution Feeders. [Dataset]. http://datadiscoverystudio.org/geoportal/rest/metadata/item/bc873dbf6a1f44c190153d3345fbbafd/html
    Explore at:
    Dataset updated
    Aug 29, 2017
    Description

    description: This dataset was developed by NREL's distributed energy systems integration group as part of a study on high penetrations of distributed solar PV [1]. It consists of hourly load data in CSV format for use with the PNNL taxonomy of distribution feeders [2]. These feeders were developed in the open-source GridLAB-D modelling language [3]. In this dataset each of the load points in the taxonomy feeders is populated with hourly averaged load data from a utility in the feeder's geographical region, scaled and randomized to emulate real load profiles. For more information on the scaling and randomization process, see [1]. The taxonomy feeders are statistically representative of the various types of distribution feeders found in five geographical regions of the U.S. Efforts are underway (possibly complete) to translate these feeders into the OpenDSS modelling language.

    This data set consists of one large CSV file for each feeder. Within each CSV, each column represents one load bus on the feeder. The header row lists the name of the load bus, and the subsequent 8760 rows represent the loads for each hour of the year. The loads were scaled and randomized using a Python script, so each load series represents only one of many possible randomizations. In the header row, "rl" = residential load and "cl" = commercial load; commercial loads are followed by a phase letter (A, B, or C). For regions 1-3, the data is from 2009; for regions 4-5, the data is from 2000.

    For use in GridLAB-D, each column will need to be separated into its own CSV file without a header. The load value goes in the second column, and corresponding datetime values go in the first column, as shown in the sample file, sample_individual_load_file.csv. Only the first value in the time column needs to be written as an absolute time; subsequent times may be written in relative format (e.g., "+1h", as in the sample). The load should be written in P+Qj format, as seen in the sample CSV, in units of Watts (W) and Volt-amperes reactive (VAr). This dataset was derived from metered load data and hence includes only real power; reactive power can be generated by assuming an appropriate power factor. These loads were used with GridLAB-D version 2.2.

    Browse files in this dataset, accessible as individual files and as a single ZIP file. This dataset is approximately 242MB compressed or 475MB uncompressed. For questions about this dataset, contact andy.hoke@nrel.gov. If you find this dataset useful, please mention NREL and cite [1] in your work.

    References: [1] A. Hoke, R. Butler, J. Hambrick, and B. Kroposki, "Steady-State Analysis of Maximum Photovoltaic Penetration Levels on Typical Distribution Feeders," IEEE Transactions on Sustainable Energy, April 2013, available at http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=6357275. [2] K. Schneider, D. P. Chassin, R. Pratt, D. Engel, and S. Thompson, "Modern Grid Initiative Distribution Taxonomy Final Report," PNNL, Nov. 2008. Accessed April 27, 2012: http://www.gridlabd.org/models/feeders/taxonomy of prototypical feeders.pdf [3] K. Schneider, D. Chassin, Y. Pratt, and J. C. Fuller, "Distribution power flow for smart grid technologies," IEEE/PES Power Systems Conference and Exposition, Seattle, WA, 15-18 Mar. 2009, pp. 1-7.
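
    As a sketch of the per-column split described above (the feeder file name and start timestamp are illustrative; the "+0j" reflects that the dataset contains real power only):

    # Split one feeder CSV into per-load files in the format GridLAB-D expects.
    feeder <- read.csv("feeder_R1_1247_1.csv", check.names = FALSE)  # hypothetical name
    for (bus in names(feeder)) {
      # Only the first time value is absolute; the rest are relative "+1h" offsets.
      times  <- c("2009-01-01 0:00", rep("+1h", nrow(feeder) - 1))
      values <- paste0(feeder[[bus]], "+0j")   # P+Qj format, Q assumed zero
      write.table(data.frame(times, values), file = paste0(bus, ".csv"), sep = ",",
                  row.names = FALSE, col.names = FALSE, quote = FALSE)
    }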

  3. Storage and Transit Time Data and Code

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jun 12, 2024
    + more versions
    Cite
    Andrew Felton (2024). Storage and Transit Time Data and Code [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_8136816
    Explore at:
    Dataset updated
    Jun 12, 2024
    Dataset authored and provided by
    Andrew Felton
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Author: Andrew J. Felton
    Date: 5/5/2024

    This R project contains the primary code and data (following pre-processing in Python) used for data production, manipulation, visualization, and analysis, as well as figure production, for the study entitled:

    "Global estimates of the storage and transit time of water through vegetation"

    Please note that 'turnover' and 'transit' are used interchangeably in this project.

    Data information:

    The data folder contains key data sets used for analysis. In particular:

    "data/turnover_from_python/updated/annual/multi_year_average/average_annual_turnover.nc" contains a global array summarizing five year (2016-2020) averages of annual transit, storage, canopy transpiration, and number of months of data. This is the core dataset for the analysis; however, each folder has much more data, including a dataset for each year of the analysis. Data are also available is separate .csv files for each land cover type. Oterh data can be found for the minimum, monthly, and seasonal transit time found in their respective folders. These data were produced using the python code found in the "supporting_code" folder given the ease of working with .nc and EASE grid in the xarray python module. R was used primarily for data visualization purposes. The remaining files in the "data" and "data/supporting_data"" folder primarily contain ground-based estimates of storage and transit found in public databases or through a literature search, but have been extensively processed and filtered here.

    Code information

    Python scripts can be found in the "supporting_code" folder.

    Each R script in this project has a particular function:

    01_start.R: This script loads the R packages used in the analysis, sets the directory, and imports custom functions for the project. You can also load in the main transit time (turnover) datasets here using the source() function.

    02_functions.R: This script contains the custom functions for this analysis, primarily to work with importing the seasonal transit data. Load this using the source() function in the 01_start.R script.

    03_generate_data.R: This script is not necessary to run and is primarily for documentation. The main role of this code was to import and wrangle the data needed to calculate ground-based estimates of aboveground water storage.

    04_annual_turnover_storage_import.R: This script imports the annual turnover and storage data for each landcover type. You load in these data from the 01_start.R script using the source() function.

    05_minimum_turnover_storage_import.R: This script imports the minimum turnover and storage data for each landcover type. Minimum is defined as the lowest monthly estimate. You load in these data from the 01_start.R script using the source() function.

    06_figures_tables.R: This is the main workhorse for figure/table production and supporting analyses. This script generates the key figures and summary statistics used in the study, which are then saved in the manuscript_figures folder. Note that all maps were produced using Python code found in the "supporting_code" folder.
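
    A minimal sketch of the loading pattern the scripts describe (script names are from the list above; run from the project root so relative paths resolve):

    source("01_start.R")                            # packages, directory, custom functions
    source("02_functions.R")                        # custom import helpers
    source("04_annual_turnover_storage_import.R")   # annual turnover and storage data
    source("05_minimum_turnover_storage_import.R")  # minimum (lowest monthly) estimates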

  4. R And R Export Import Specialities Importer/Buyer Data in USA, R And R...

    • seair.co.in
    Updated Apr 19, 2025
    + more versions
    Cite
    Seair Exim (2025). R And R Export Import Specialities Importer/Buyer Data in USA, R And R Export Import Specialities Imports Data [Dataset]. https://www.seair.co.in
    Explore at:
    .bin, .xml, .csv, .xls (available download formats)
    Dataset updated
    Apr 19, 2025
    Dataset provided by
    Seair Exim Solutions
    Authors
    Seair Exim
    Area covered
    United States
    Description

    Find details of R And R Export Import Specialities buyer/importer data in the US (United States), with product descriptions, prices, shipment dates, quantities, imported product lists, major US port names, overseas supplier/exporter names, etc., at seair.co.in.

  5. Replication Data for: Revisiting 'The Rise and Decline' in a Population of...

    • search.dataone.org
    • dataverse.harvard.edu
    Updated Nov 22, 2023
    Cite
    TeBlunthuis, Nathan; Aaron Shaw; Benjamin Mako Hill (2023). Replication Data for: Revisiting 'The Rise and Decline' in a Population of Peer Production Projects [Dataset]. http://doi.org/10.7910/DVN/SG3LP1
    Explore at:
    Dataset updated
    Nov 22, 2023
    Dataset provided by
    Harvard Dataverse
    Authors
    TeBlunthuis, Nathan; Aaron Shaw; Benjamin Mako Hill
    Description

    This archive contains code and data for reproducing the analysis for “Replication Data for Revisiting ‘The Rise and Decline’ in a Population of Peer Production Projects”. Depending on what you hope to do with the data, you probably do not want to download all of the files; depending on your computational resources, you may not be able to run all stages of the analysis. The code for all stages of the analysis, including typesetting the manuscript and running the analysis, is in code.tar. If you only want to run the final analysis or to play with datasets used in the analysis of the paper, you want intermediate_data.7z or the uncompressed tab and csv files.

    The data files are created in a four-stage process. The first stage uses the program “wikiq” to parse mediawiki xml dumps and create tsv files that have edit data for each wiki. The second stage generates the all.edits.RDS file, which combines these tsvs into a dataset of edits from all the wikis; this file is expensive to generate and, at 1.5GB, is pretty big. The third stage builds smaller intermediate files that contain the analytical variables from these tsv files. The fourth stage uses the intermediate files to generate smaller RDS files that contain the results. Finally, knitr and latex typeset the manuscript. A stage will only run if the outputs from the previous stages do not exist, so if the intermediate files exist they will not be regenerated and only the final analysis will run. The exception is that stage 4, fitting models and generating plots, always runs. If you only want to replicate from the second stage onward, you want wikiq_tsvs.7z. If you want to replicate everything, you want wikia_mediawiki_xml_dumps.7z.001, wikia_mediawiki_xml_dumps.7z.002, and wikia_mediawiki_xml_dumps.7z.003. These instructions work backwards from building the manuscript using knitr, through loading the datasets and running the analysis, to building the intermediate datasets.

    Building the manuscript using knitr: This requires working latex, latexmk, and knitr installations. Depending on your operating system you might install these packages in different ways. On Debian Linux you can run apt install r-cran-knitr latexmk texlive-latex-extra. Alternatively, you can upload the necessary files to a project on Overleaf.com. Download code.tar; this has everything you need to typeset the manuscript. Unpack the tar archive (on a unix system, tar xf code.tar), navigate to code/paper_source, and install the R dependencies: in R, run install.packages(c("data.table","scales","ggplot2","lubridate","texreg")). On a unix system you should then be able to run make to build the manuscript generalizable_wiki.pdf; otherwise, try uploading all of the files (including the tables, figure, and knitr folders) to a new project on Overleaf.com.

    Loading intermediate datasets: The intermediate datasets are found in the intermediate_data.7z archive. They can be extracted on a unix system using the command 7z x intermediate_data.7z; the files are 95MB uncompressed. These are RDS (R data set) files and can be loaded in R using readRDS, for example newcomer.ds <- readRDS("newcomers.RDS"). If you wish to work with these datasets using a tool other than R, you might prefer to work with the .tab files.

    Running the analysis: Fitting the models may not work on machines with less than 32GB of RAM. If you have trouble, you may find the functions in lib-01-sample-datasets.R useful to create stratified samples of data for fitting models; see line 89 of 02_model_newcomer_survival.R for an example. Download code.tar and intermediate_data.7z to your working folder and extract both archives (on a unix system, tar xf code.tar && 7z x intermediate_data.7z). Install the R dependencies: install.packages(c("data.table","ggplot2","urltools","texreg","optimx","lme4","bootstrap","scales","effects","lubridate","devtools","roxygen2")). On a unix system you can simply run regen.all.sh to fit the models, build the plots, and create the RDS files.

    Generating datasets (building the intermediate files): The intermediate files are generated from all.edits.RDS. This process requires about 20GB of memory. Download all.edits.RDS, userroles_data.7z, selected.wikis.csv, and code.tar. Unpack code.tar and userroles_data.7z (on a unix system, tar xf code.tar && 7z x userroles_data.7z). Install the R dependencies as above, then run 01_build_datasets.R.

    Building all.edits.RDS: The intermediate RDS files used in the analysis are created from all.edits.RDS. To replicate building all.edits.RDS, you only need to run 01_build_datasets.R when the int... Visit https://dataone.org/datasets/sha256%3Acfa4980c107154267d8eb6dc0753ed0fde655a73a062c0c2f5af33f237da3437 for complete metadata about this dataset.
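
    A condensed R sketch of the replication-from-intermediate-data path described above (shell steps are shown as comments; file and package names are from the archive description):

    # In a shell, first run: tar xf code.tar && 7z x intermediate_data.7z
    install.packages(c("data.table", "scales", "ggplot2", "lubridate", "texreg"))

    # Load one of the intermediate RDS files, as in the example above.
    newcomer.ds <- readRDS("newcomers.RDS")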

  6. R programming code for analyzing output from the Stochastic Empirical...

    • catalog.data.gov
    • data.usgs.gov
    Updated Jul 6, 2024
    Cite
    U.S. Geological Survey (2024). R programming code for analyzing output from the Stochastic Empirical Loading Dilution Model created for U.S. Geological Survey Scientific Investigations Report 2019-5053, 116 p., https://doi.org/10.3133/sir20195053 [Dataset]. https://catalog.data.gov/dataset/r-programming-code-for-analyzing-output-from-the-stochastic-empirical-loading-dilution-mod
    Explore at:
    Dataset updated
    Jul 6, 2024
    Dataset provided by
    United States Geological Survey (http://www.usgs.gov/)
    Description

    This R script can be used to analyze SELDM results. The script is specifically tailored for the SELDM simulations used in the publication: Stonewall, A.J., and Granato, G.E., 2018, Assessing potential effects of highway and urban runoff on receiving streams in total maximum daily load watersheds in Oregon using the Stochastic Empirical Loading and Dilution Model: U.S. Geological Survey Scientific Investigations Report 2019-5053, 116 p., https://doi.org/10.3133/sir20195053

  7. How does cognitive load affect social interactions? Dataset and Analysis

    • figshare.com
    txt
    Updated Jan 18, 2016
    Cite
    Kathryn Mills (2016). How does cognitive load affect social interactions? Dataset and Analysis [Dataset]. http://doi.org/10.6084/m9.figshare.757787.v2
    Explore at:
    txt (available download formats)
    Dataset updated
    Jan 18, 2016
    Dataset provided by
    figshare
    Authors
    Kathryn Mills
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Project abstract: Many situations involve processing social and non-social information simultaneously. However, it is not known how performance is affected in such situations. Here, we examined how our ability to process social information is affected by the need to keep track of non-social information. Participants were instructed to carry out two tasks within each trial. The social task involved referential communication, requiring participants to use social cues to guide their decisions. At the same time, cognitive load was manipulated by requiring participants to remember non-social information in the form of either one or three two-digit numbers visually presented before each social task stimulus. Results indicate that the cognitive demands of simultaneously processing social and non-social information impair social information processing. Specifically, keeping in mind three numbers slowed participants' ability to use another person's perspective to guide decisions. These results suggest that social information processing requires domain-general resources that are depleted under cognitive load.

    Data: These files include our dataset, as well as the scripts used to analyze the data and create graphs of the results. You will need to download R (http://www.r-project.org/) to use these files. Data are from 29 adult participants. Participants completed an adapted version of the “Director Task” (Dumontheil, Hillebrandt, Apperly, & Blakemore, 2012) with an embedded working memory (WM) task component. Afterwards, participants completed a verbal reverse digit-span task as a measure of WM capacity and the Interpersonal Reactivity Index questionnaire to assess individual differences in trait perspective taking (Davis, 1980).

    Data Analysis: We used the lme4 package in R (Bates, Maechler, & Bolker, 2013) to perform a linear mixed effects analysis on the relationship between our factors of interest and accuracy and RT for both tasks. RT data from correct trials only were analyzed. To create approximately normally distributed residuals, we used a log or reciprocal function to transform RT data. We performed a two-step procedure: first, we created a global model including main and interactive effects of cognitive load (low vs. high), condition (Director Present vs. Director Absent), trial type (1-object vs. 3-object), and perspective (same vs. different) as fixed effects, and each model included a random intercept for each participant. We then compared all possible combinations[1] of the variables within our global model using an automated model selection procedure (MuMIn 1.9.0; Barton, 2013). Models were ranked using second-order Akaike Information Criterion (AICc; Burnham & Anderson, 2002). Second, after determining the best fitting model for each outcome of interest, we tested whether WM capacity or trait perspective taking explained any additional variance through likelihood ratio tests. All p-values were obtained by likelihood ratio tests comparing the best fitting model against a baseline model.

    [1] Interactions were always accompanied by their respective main effects and all lower order terms.

    Update (August 8, 2013): There was a minor error in the original SocialDualTaskData.R file, which has now been corrected.
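
    An illustrative R sketch of the mixed-model and model-selection procedure described above (the data frame and column names are assumptions, not the authors' actual variable names):

    library(lme4)    # mixed effects models (Bates, Maechler, & Bolker, 2013)
    library(MuMIn)   # automated model selection (Barton, 2013)

    # Global model: main and interactive effects of load, condition, trial type,
    # and perspective, with a random intercept per participant; log-transformed RT.
    global <- lmer(log(rt) ~ load * condition * trial_type * perspective
                   + (1 | participant),
                   data = social_dual_task, REML = FALSE, na.action = na.fail)

    ranked <- dredge(global, rank = "AICc")        # rank all sub-models by AICc
    best   <- get.models(ranked, subset = 1)[[1]]  # best fitting model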

  8. R Loc Import Data India – Buyers & Importers List

    • seair.co.in
    + more versions
    Cite
    Seair Exim, R Loc Import Data India – Buyers & Importers List [Dataset]. https://www.seair.co.in
    Explore at:
    .bin, .xml, .csv, .xls (available download formats)
    Dataset provided by
    Seair Exim Solutions
    Authors
    Seair Exim
    Area covered
    India
    Description

    Subscribers can look up export and import data for 23 countries by HS code or product name. This demo is helpful for market analysis.

  9. R-LOADEST files to produce results in the Heart River Basin, North Dakota,...

    • catalog.data.gov
    • data.usgs.gov
    Updated Jul 6, 2024
    + more versions
    Cite
    U.S. Geological Survey (2024). R-LOADEST files to produce results in the Heart River Basin, North Dakota, 1970-2020 [Dataset]. https://catalog.data.gov/dataset/r-loadest-files-to-produce-results-in-the-heart-river-basin-north-dakota-1970-2020
    Explore at:
    Dataset updated
    Jul 6, 2024
    Dataset provided by
    United States Geological Survey (http://www.usgs.gov/)
    Area covered
    North Dakota, Heart River
    Description

    This child page contains a zipped folder which contains all of the items necessary to run load estimation using R-LOADEST to produce results that are published in U.S. Geological Survey Investigations Report 2021-XXXX [Tatge, W.S., Nustad, R.A., and Galloway, J.M., 2021, Evaluation of Salinity and Nutrient Conditions in the Heart River Basin, North Dakota, 1970-2020: U.S. Geological Survey Scientific Investigations Report 2021-XXXX, XX p]. The folder contains an allsiteinfo.table.csv file, a "datain" folder, and a "scripts" folder. The allsiteinfo.table.csv file can be used to cross-reference the sites with the main report (Tatge and others, 2021).

    The "datain" folder contains all the input data necessary to reproduce the load estimation results. The naming convention in the "datain" folder is site_MI_rloadest or site_NUT_rloadest for either the major ion loads or the nutrient loads. The .Rdata files are used in the scripts to run the estimations, and the .csv files can be used to look at the data. The "scripts" folder contains the written R scripts to produce the results of the load estimation from the main report.

    R-LOADEST is a software package for analyzing loads in streams, and an accompanying report (Runkel and others, 2004) serves as its formal documentation. The package is a collection of functions written in R (R Development Core Team, 2019), an open-source language and a general environment for statistical computing and graphics. The following system requirements are necessary for producing results:

    • Windows 10 operating system
    • R (version 3.4 or later; 64-bit recommended)
    • RStudio (version 1.1.456 or later)
    • R-LOADEST program (available at https://github.com/USGS-R/rloadest)

    References:
    Runkel, R.L., Crawford, C.G., and Cohn, T.A., 2004, Load Estimator (LOADEST): A FORTRAN Program for Estimating Constituent Loads in Streams and Rivers: U.S. Geological Survey Techniques and Methods Book 4, Chapter A5, 69 p. [Also available at https://pubs.usgs.gov/tm/2005/tm4A5/pdf/508final.pdf.]
    R Development Core Team, 2019, R—A language and environment for statistical computing: Vienna, Austria, R Foundation for Statistical Computing, accessed December 7, 2020, at https://www.r-project.org.
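
    A minimal sketch of an R-LOADEST calibration call (the constituent, data frame, and column names here are illustrative; the real inputs are the site-specific .Rdata files in the "datain" folder):

    library(rloadest)  # from https://github.com/USGS-R/rloadest

    # Fit predefined LOADEST model 1 to a calibration data set.
    fit <- loadReg(Nitrate ~ model(1), data = calib_data,
                   flow = "FLOW", dates = "DATES", conc.units = "mg/L")
    print(fit)         # regression diagnostics and load estimates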

  10. Global import data of Motorcycle R

    • volza.com
    csv
    Updated Jun 30, 2025
    Cite
    Volza FZ LLC (2025). Global import data of Motorcycle R [Dataset]. https://www.volza.com/imports-united-states/united-states-import-data-of-motorcycle+r
    Explore at:
    csv (available download formats)
    Dataset updated
    Jun 30, 2025
    Dataset provided by
    Volza
    Authors
    Volza FZ LLC
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Count of importers, Sum of import value, 2014-01-01/2021-09-30, Count of import shipments
    Description

    1310 global import shipment records of Motorcycle R with prices, volume, and current buyer-supplier relationships, based on an actual global export trade database.

  11. Simulation Data Set

    • catalog.data.gov
    • s.cnmilf.com
    Updated Nov 12, 2020
    Cite
    U.S. EPA Office of Research and Development (ORD) (2020). Simulation Data Set [Dataset]. https://catalog.data.gov/dataset/simulation-data-set
    Explore at:
    Dataset updated
    Nov 12, 2020
    Dataset provided by
    United States Environmental Protection Agency (http://www.epa.gov/)
    Description

    These are simulated data without any identifying information or informative birth-level covariates. We also standardize the pollution exposures on each week by subtracting off the median exposure amount on a given week and dividing by the interquartile range (IQR) (as in the actual application to the true NC birth records data). The dataset that we provide includes weekly average pregnancy exposures that have already been standardized in this way, while the medians and IQRs are not given. This further protects identifiability of the spatial locations used in the analysis.

    This dataset is not publicly accessible because EPA cannot release personally identifiable information regarding living individuals, according to the Privacy Act and the Freedom of Information Act (FOIA). This dataset contains information about human research subjects. Because there is potential to identify individual participants and disclose personal information, either alone or in combination with other datasets, individual-level data are not appropriate to post for public access. Restricted access may be granted to authorized persons by contacting the party listed.

    File format: R workspace file; “Simulated_Dataset.RData”.

    Metadata (including data dictionary):
    • y: Vector of binary responses (1: adverse outcome, 0: control)
    • x: Matrix of covariates; one row for each simulated individual
    • z: Matrix of standardized pollution exposures
    • n: Number of simulated individuals
    • m: Number of exposure time periods (e.g., weeks of pregnancy)
    • p: Number of columns in the covariate design matrix
    • alpha_true: Vector of “true” critical window locations/magnitudes (i.e., the ground truth that we want to estimate)

    Code abstract: We provide R statistical software code (“CWVS_LMC.txt”) to fit the linear model of coregionalization (LMC) version of the Critical Window Variable Selection (CWVS) method developed in the manuscript. We also provide R code (“Results_Summary.txt”) to summarize/plot the estimated critical windows and posterior marginal inclusion probabilities.

    “CWVS_LMC.txt”: This code is delivered to the user in the form of a .txt file that contains R statistical software code. Once the “Simulated_Dataset.RData” workspace has been loaded into R, the text in the file can be used to identify/estimate critical windows of susceptibility and posterior marginal inclusion probabilities.

    “Results_Summary.txt”: This code is also delivered as a .txt file containing R statistical software code. Once the “CWVS_LMC.txt” code has been applied to the simulated dataset and the program has completed, this code can be used to summarize and plot the identified/estimated critical windows and posterior marginal inclusion probabilities (similar to the plots shown in the manuscript).

    Required R packages:
    • For running “CWVS_LMC.txt”:
      • msm: Sampling from the truncated normal distribution
      • mnormt: Sampling from the multivariate normal distribution
      • BayesLogit: Sampling from the Polya-Gamma distribution
    • For running “Results_Summary.txt”:
      • plotrix: Plotting the posterior means and credible intervals

    Reproducibility: The data and code can be used to identify/estimate critical windows from one of the actual simulated datasets generated under setting E4 from the presented simulation study. To use the information:
    • Load the “Simulated_Dataset.RData” workspace
    • Run the code contained in “CWVS_LMC.txt”
    • Once the “CWVS_LMC.txt” code is complete, run “Results_Summary.txt”

    Data: The data used in the application section of the manuscript consist of geocoded birth records from the North Carolina State Center for Health Statistics, 2005-2008. In the simulation study section of the manuscript, we simulate synthetic data that closely match some of the key features of the birth certificate data while maintaining confidentiality of any actual pregnant women.

    Availability: Due to the highly sensitive and identifying information contained in the birth certificate data (including latitude/longitude and address of residence at delivery), we are unable to make the data from the application section publicly available. However, we make one of the simulated datasets available for any reader interested in applying the method to realistic simulated birth records data. This also allows the user to become familiar with the required inputs of the model, how the data should be structured, and what type of output is obtained. While we cannot provide the application data here, access to the North Carolina birth records can be requested through the North Carolina State Center for Health Statistics, and requires an appropriate data use agreement.

    This dataset is associated with the following publication: Warren, J., W. Kong, T. Luben, and H. Chang. Critical Window Variable Selection: Estimating the Impact of Air Pollution on Very Preterm Birth. Biostatistics. Oxford University Press, Oxford, UK, 1-30, (2019).
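
    A short R sketch of the replication steps listed above (file and package names are from the dataset description):

    install.packages(c("msm", "mnormt", "BayesLogit", "plotrix"))

    load("Simulated_Dataset.RData")   # provides y, x, z, n, m, p, alpha_true
    source("CWVS_LMC.txt")            # identify/estimate critical windows
    source("Results_Summary.txt")     # summarize and plot the results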

  12. Global import data of R Tyre

    • volza.com
    csv
    Updated May 31, 2025
    + more versions
    Cite
    Volza FZ LLC (2025). Global import data of R Tyre [Dataset]. https://www.volza.com/imports-malaysia/malaysia-import-data-of-r+tyre
    Explore at:
    csv (available download formats)
    Dataset updated
    May 31, 2025
    Dataset provided by
    Volza
    Authors
    Volza FZ LLC
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Count of importers, Sum of import value, 2014-01-01/2021-09-30, Count of import shipments
    Description

    1166 global import shipment records of R Tyre with prices, volume, and current buyer-supplier relationships, based on an actual global export trade database.

  13. Replication Data for: Reining in the Rascals: Challenger Parties' Path to...

    • search.dataone.org
    Updated Mar 6, 2024
    Cite
    Hjorth, Frederik; Jacob Nyrup; Martin Vinæs Larsen (2024). Replication Data for: Reining in the Rascals: Challenger Parties' Path to Power [Dataset]. http://doi.org/10.7910/DVN/FLGPW8
    Explore at:
    Dataset updated
    Mar 6, 2024
    Dataset provided by
    Harvard Dataverse
    Authors
    Hjorth, Frederik; Jacob Nyrup; Martin Vinæs Larsen
    Description
    ### Information for replicating the analysis for "Reining in the Rascals: Challenger Parties' Path to Power" ###
    ### The Journal of Politics ###
    ### Frederik Hjorth, Jacob Nyrup & Martin Vinæs Larsen ###

    All code to replicate the analysis is written in R. 14 files in total are used to replicate the analysis in the article: 5 r-scripts and 9 datafiles. The scripts use the R package "pacman" to install and load relevant packages, which is handled by the function pacman::p_load(). To make sure the function runs, the replicator should have "pacman" installed. The scripts use the R package "here" to automatically set the working directory to the replication folder. If "here" fails to locate the appropriate folder, simply set the working directory to the folder containing scripts and data using setwd(). When running the analysis it is important that 00-helperfunctions.R is loaded into R; this file contains a list of extra functions used throughout the analysis. A minimal sketch of this setup appears after the lists below.

    ### List of r-scripts ###
    00-helperfunctions.R
    01-comparativeanalysis.R
    02-mainanalysis.R
    03-mechanismanalysis.R
    04-appendix.R

    ### List of datasets ###
    df_comparative.xlsx
    df_main.rds
    df_mainretroactive.rds
    dkvaa13txtdf.rds
    dkvaa17txtdf.rds
    dkvaa2013.xlsx
    dkvaa2017.xlsx
    irtposbyparty.rds
    municodelist.txt
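
    A minimal sketch of that setup (package and script names are from the lists above):

    install.packages("pacman")                  # so pacman::p_load() can run
    pacman::p_load(here)                        # "here" locates the replication folder
    # setwd("path/to/replication")              # fallback if here() fails
    source(here::here("00-helperfunctions.R"))  # extra functions used throughout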

  14. Database of Uniaxial Cyclic and Tensile Coupon Tests for Structural Metallic...

    • zenodo.org
    • data.niaid.nih.gov
    bin, csv, zip
    Updated Dec 24, 2022
    Cite
    Alexander R. Hartloper; Alexander R. Hartloper; Selimcan Ozden; Albano de Castro e Sousa; Dimitrios G. Lignos; Dimitrios G. Lignos; Selimcan Ozden; Albano de Castro e Sousa (2022). Database of Uniaxial Cyclic and Tensile Coupon Tests for Structural Metallic Materials [Dataset]. http://doi.org/10.5281/zenodo.6965147
    Explore at:
    bin, zip, csv (available download formats)
    Dataset updated
    Dec 24, 2022
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Alexander R. Hartloper; Alexander R. Hartloper; Selimcan Ozden; Albano de Castro e Sousa; Dimitrios G. Lignos; Dimitrios G. Lignos; Selimcan Ozden; Albano de Castro e Sousa
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Database of Uniaxial Cyclic and Tensile Coupon Tests for Structural Metallic Materials

    Background

    This dataset contains data from monotonic and cyclic loading experiments on structural metallic materials. The materials are primarily structural steels; one iron-based shape memory alloy is also included. Summary files provide an overview of the database, and data from the individual experiments are also included.

    The files included in the database are outlined below and the format of the files is briefly described. Additional information regarding the formatting can be found through the post-processing library (https://github.com/ahartloper/rlmtp/tree/master/protocols).

    Usage

    • The data is licensed through the Creative Commons Attribution 4.0 International.
    • If you have used our data and are publishing your work, we ask that you please reference both:
      1. this database through its DOI, and
      2. any publication that is associated with the experiments. See the Overall_Summary and Database_References files for the associated publication references.

    Included Files

    • Overall_Summary_2022-08-25_v1-0-0.csv: summarises the specimen information for all experiments in the database.
    • Summarized_Mechanical_Props_Campaign_2022-08-25_v1-0-0.csv: summarises the average initial yield stress and average initial elastic modulus per campaign.
    • Unreduced_Data-#_v1-0-0.zip: contain the original (not downsampled) data
      • Where # is one of: 1, 2, 3, 4, 5, 6. The unreduced data is broken into separate archives because of upload limitations to Zenodo. Together they provide all the experimental data.
      • We recommend you un-zip all the folders and place them in one "Unreduced_Data" directory, similar to the "Clean_Data" directory
      • The experimental data is provided through .csv files for each test that contain the processed data. The experiments are organised by experimental campaign and named by load protocol and specimen. A .pdf file accompanies each test showing the stress-strain graph.
      • There is a "db_tag_clean_data_map.csv" file that is used to map the database summary with the unreduced data.
      • The computed yield stresses and elastic moduli are stored in the "yield_stress" directory.
    • Clean_Data_v1-0-0.zip: contains all the downsampled data
      • The experimental data is provided through .csv files for each test that contain the processed data. The experiments are organised by experimental campaign and named by load protocol and specimen. A .pdf file accompanies each test showing the stress-strain graph.
      • There is a "db_tag_clean_data_map.csv" file that is used to map the database summary with the clean data.
      • The computed yield stresses and elastic moduli are stored in the "yield_stress" directory.
    • Database_References_v1-0-0.bib
      • Contains a bibtex reference for many of the experiments in the database. Corresponds to the "citekey" entry in the summary files.

    File Format: Downsampled Data

    These are the "LP_

    • The header of the first column is empty: the first column corresponds to the index of the sample point in the original (unreduced) data
    • Time[s]: time in seconds since the start of the test
    • e_true: true strain
    • Sigma_true: true stress in MPa
    • (optional) Temperature[C]: the surface temperature in degC

    These data files can be easily loaded using the pandas library in Python through:

    import pandas
    # data_file is the path to one of the per-test .csv files described above
    data = pandas.read_csv(data_file, index_col=0)

    The data is formatted so it can be used directly in RESSPyLab (https://github.com/AlbanoCastroSousa/RESSPyLab). Note that the column names "e_true" and "Sigma_true" were kept for backwards compatibility reasons with RESSPyLab.

    File Format: Unreduced Data

    These are the "LP_

    • The first column is the index of each data point
    • S/No: sample number recorded by the DAQ
    • System Date: Date and time of sample
    • Time[s]: time in seconds since the start of the test
    • C_1_Force[kN]: load cell force
    • C_1_Déform1[mm]: extensometer displacement
    • C_1_Déplacement[mm]: cross-head displacement
    • Eng_Stress[MPa]: engineering stress
    • Eng_Strain[]: engineering strain
    • e_true: true strain
    • Sigma_true: true stress in MPa
    • (optional) Temperature[C]: specimen surface temperature in degC

    The data can be loaded and used similarly to the downsampled data.

    File Format: Overall_Summary

    The overall summary file provides data on all the test specimens in the database. The columns include:

    • hidden_index: internal reference ID
    • grade: material grade
    • spec: specifications for the material
    • source: base material for the test specimen
    • id: internal name for the specimen
    • lp: load protocol
    • size: type of specimen (M8, M12, M20)
    • gage_length_mm_: unreduced section length in mm
    • avg_reduced_dia_mm_: average measured diameter for the reduced section in mm
    • avg_fractured_dia_top_mm_: average measured diameter of the top fracture surface in mm
    • avg_fractured_dia_bot_mm_: average measured diameter of the bottom fracture surface in mm
    • fy_n_mpa_: nominal yield stress
    • fu_n_mpa_: nominal ultimate stress
    • t_a_deg_c_: ambient temperature in degC
    • date: date of test
    • investigator: person(s) who conducted the test
    • location: laboratory where test was conducted
    • machine: setup used to conduct test
    • pid_force_k_p, pid_force_t_i, pid_force_t_d: PID parameters for force control
    • pid_disp_k_p, pid_disp_t_i, pid_disp_t_d: PID parameters for displacement control
    • pid_extenso_k_p, pid_extenso_t_i, pid_extenso_t_d: PID parameters for extensometer control
    • citekey: reference corresponding to the Database_References.bib file
    • yield_stress_mpa_: computed yield stress in MPa
    • elastic_modulus_mpa_: computed elastic modulus in MPa
    • fracture_strain: computed average true strain across the fracture surface
    • c,si,mn,p,s,n,cu,mo,ni,cr,v,nb,ti,al,b,zr,sn,ca,h,fe: chemical compositions in units of %mass
    • file: file name of corresponding clean (downsampled) stress-strain data

    File Format: Summarized_Mechanical_Props_Campaign

    Meant to be loaded in Python as a pandas DataFrame with multi-indexing, e.g.,

    import pandas as pd
    # For the file shipped with this version: date = '2022-08-25', version = '_v1-0-0'
    tab1 = pd.read_csv('Summarized_Mechanical_Props_Campaign_' + date + version + '.csv',
                       index_col=[0, 1, 2, 3], skipinitialspace=True, header=[0, 1],
                       keep_default_na=False, na_values='')
    • citekey: reference in "Campaign_References.bib".
    • Grade: material grade.
    • Spec.: specifications (e.g., J2+N).
    • Yield Stress [MPa]: initial yield stress in MPa
      • size, count, mean, coefvar: number of experiments in campaign, number of experiments in mean, mean value for campaign, coefficient of variation for campaign
    • Elastic Modulus [MPa]: initial elastic modulus in MPa
      • size, count, mean, coefvar: number of experiments in campaign, number of experiments in mean, mean value for campaign, coefficient of variation for campaign

    Caveats

    • The files in the following directories were tested before the protocol was established. Therefore, only the true stress-strain is available for each:
      • A500
      • A992_Gr50
      • BCP325
      • BCR295
      • HYP400
      • S460NL
      • S690QL/25mm
      • S355J2_Plates/S355J2_N_25mm and S355J2_N_50mm

  15. R Propylene Carbonate Import Data India – Buyers & Importers List

    • seair.co.in
    Updated Dec 28, 2015
    + more versions
    Cite
    Seair Exim (2015). R Propylene Carbonate Import Data India – Buyers & Importers List [Dataset]. https://www.seair.co.in
    Explore at:
    .bin, .xml, .csv, .xls (available download formats)
    Dataset updated
    Dec 28, 2015
    Dataset provided by
    Seair Exim Solutions
    Authors
    Seair Exim
    Area covered
    India
    Description

    Subscribers can look up export and import data for 23 countries by HS code or product name. This demo is helpful for market analysis.

  16. Current Population Survey (CPS)

    • search.dataone.org
    • dataverse.harvard.edu
    Updated Nov 21, 2023
    Cite
    Damico, Anthony (2023). Current Population Survey (CPS) [Dataset]. http://doi.org/10.7910/DVN/AK4FDD
    Explore at:
    Dataset updated
    Nov 21, 2023
    Dataset provided by
    Harvard Dataverse
    Authors
    Damico, Anthony
    Description

    analyze the current population survey (cps) annual social and economic supplement (asec) with r. the annual march cps-asec has been supplying the statistics for the census bureau's report on income, poverty, and health insurance coverage since 1948. wow. the us census bureau and the bureau of labor statistics (bls) tag-team on this one. until the american community survey (acs) hit the scene in the early aughts (2000s), the current population survey had the largest sample size of all the annual general demographic data sets outside of the decennial census - about two hundred thousand respondents. this provides enough sample to conduct state- and a few large metro area-level analyses. your sample size will vanish if you start investigating subgroups by state - consider pooling multiple years. county-level is a no-no. despite the american community survey's larger size, the cps-asec contains many more variables related to employment, sources of income, and insurance - and can be trended back to harry truman's presidency. aside from questions specifically asked about an annual experience (like income), many of the questions in this march data set should be treated as point-in-time statistics. cps-asec generalizes to the united states non-institutional, non-active duty military population.

    the national bureau of economic research (nber) provides sas, spss, and stata importation scripts to create a rectangular file (rectangular data means only person-level records; household- and family-level information gets attached to each person). to import these files into r, the parse.SAScii function uses nber's sas code to determine how to import the fixed-width file, then RSQLite to put everything into a schnazzy database. you can try reading through the nber march 2012 sas importation code yourself, but it's a bit of a proc freak show.

    this new github repository contains three scripts:

    2005-2012 asec - download all microdata.R: download the fixed-width file containing household, family, and person records; import by separating this file into three tables, then merge 'em together at the person-level; download the fixed-width file containing the person-level replicate weights; merge the rectangular person-level file with the replicate weights, then store it in a sql database; create a new variable - one - in the data table.

    2012 asec - analysis examples.R: connect to the sql database created by the 'download all microdata' program; create the complex sample survey object, using the replicate weights; perform a boatload of analysis examples.

    replicate census estimates - 2011.R: connect to the sql database created by the 'download all microdata' program; create the complex sample survey object, using the replicate weights; match the sas output shown in the png file 2011 asec replicate weight sas output.png (statistic and standard error generated from the replicate-weighted example sas script contained in this census-provided person replicate weights usage instructions document).

    click here to view these three scripts. for more detail about the current population survey - annual social and economic supplement (cps-asec), visit: the census bureau's current population survey page, the bureau of labor statistics' current population survey page, and the current population survey's wikipedia article.

    notes: interviews are conducted in march about experiences during the previous year. the file labeled 2012 includes information (income, work experience, health insurance) pertaining to 2011. when you use the current population survey to talk about america, subtract a year from the data file name. as of the 2010 file (the interview focusing on america during 2009), the cps-asec contains exciting new medical out-of-pocket spending variables most useful for supplemental (medical spending-adjusted) poverty research.

    confidential to sas, spss, stata, sudaan users: why are you still rubbing two sticks together after we've invented the butane lighter? time to transition to r. :D
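
    a minimal sketch of the import pattern described above (the sas dictionary url and file name are illustrative, not the exact nber paths; see the github scripts for the real workflow):

    # parse nber's sas importation code, read the fixed-width file, store it in sql
    library(SAScii)
    library(RSQLite)

    sas.url <- "https://data.nber.org/data/progs/cps/cpsmar2012.sas"  # hypothetical path
    asec <- read.SAScii("asec2012.dat", sas.url)   # person-level rectangular file
    db <- dbConnect(SQLite(), "cps.asec.db")       # a schnazzy sql database
    dbWriteTable(db, "asec12", asec)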

  17. Eximpedia Export Import Trade

    • eximpedia.app
    Updated Feb 6, 2025
    + more versions
    Cite
    Seair Exim (2025). Eximpedia Export Import Trade [Dataset]. https://www.eximpedia.app/
    Explore at:
    .bin, .xml, .csv, .xls (available download formats)
    Dataset updated
    Feb 6, 2025
    Dataset provided by
    Eximpedia Export Import Trade Data
    Eximpedia PTE LTD
    Authors
    Seair Exim
    Area covered
    Seychelles, Switzerland, Macao, Bouvet Island, Cyprus, Burundi, Saint Barthélemy, Kenya, Chile, Bahamas
    Description

    R Proc Inc Company Export Import Records. Follow the Eximpedia platform for HS code, importer-exporter records, and customs shipment details.

  18. Global import data of Motorcycle R

    • volza.com
    csv
    Updated Jan 7, 2025
    Cite
    Volza FZ LLC (2025). Global import data of Motorcycle R [Dataset]. https://www.volza.com/exports-chile/chile-import-data-of-motorcycle+r
    Explore at:
    csv (available download formats)
    Dataset updated
    Jan 7, 2025
    Dataset provided by
    Volza
    Authors
    Volza FZ LLC
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Count of importers, Sum of import value, 2014-01-01/2021-09-30, Count of import shipments
    Description

    10707 global import shipment records of Motorcycle R with prices, volume, and current buyer-supplier relationships, based on an actual global export trade database.

  19. Global import data of Cd Dvd R

    • volza.com
    csv
    Updated Mar 7, 2025
    Cite
    Volza FZ LLC (2025). Global import data of Cd Dvd R [Dataset]. https://www.volza.com/imports-global/global-import-data-of-cd+dvd+r
    Explore at:
    csv (available download formats)
    Dataset updated
    Mar 7, 2025
    Dataset provided by
    Volza
    Authors
    Volza FZ LLC
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Count of importers, Sum of import value, 2014-01-01/2021-09-30, Count of import shipments
    Description

    2608 global import shipment records of Cd Dvd R with prices, volume, and current buyer-supplier relationships, based on an actual global export trade database.

  20. Global import data of Aerosil R 812

    • volza.com
    csv
    Updated Sep 7, 2025
    Cite
    Volza FZ LLC (2025). Global import data of Aerosil R 812 [Dataset]. https://www.volza.com/imports-global/global-import-data-of-aerosil+r+812
    Explore at:
    csv (available download formats)
    Dataset updated
    Sep 7, 2025
    Dataset provided by
    Volza
    Authors
    Volza FZ LLC
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Count of importers, Sum of import value, 2014-01-01/2021-09-30, Count of import shipments
    Description

    529 global import shipment records of Aerosil R 812 with prices, volume, and current buyer-supplier relationships, based on an actual global export trade database.
