Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Prior to statistical analysis of mass spectrometry (MS) data, quality control (QC) of the identified biomolecule peak intensities is imperative for reducing process-based sources of variation and extreme biological outliers. Without this step, statistical results can be biased. Additionally, liquid chromatography–MS proteomics data present inherent challenges due to large amounts of missing data that require special consideration during statistical analysis. While a number of R packages exist to address these challenges individually, there is no single R package that addresses all of them. We present pmartR, an open-source R package, for QC (filtering and normalization), exploratory data analysis (EDA), visualization, and statistical analysis robust to missing data. Example analysis using proteomics data from a mouse study comparing smoke exposure to control demonstrates the core functionality of the package and highlights the capabilities for handling missing data. In particular, using a combined quantitative and qualitative statistical test, 19 proteins whose statistical significance would have been missed by a quantitative test alone were identified. The pmartR package provides a single software tool for QC, EDA, and statistical comparisons of MS data that is robust to missing data and includes numerous visualization capabilities.
This dataset was created by Hussein Al Chami
Exploratory Data Analysis for the Physical Properties of Lakes
This lesson was adapted from educational material written by Dr. Kateri Salk for her Fall 2019 Hydrologic Data Analysis course at Duke University. This is the first part of a two-part exercise focusing on the physical properties of lakes.
Introduction
Lakes are dynamic, nonuniform bodies of water in which the physical, biological, and chemical properties interact. Lakes also contain the majority of Earth's fresh water supply. This lesson introduces exploratory data analysis using R statistical software in the context of the physical properties of lakes.
Learning Objectives
After successfully completing this exercise, you will be able to:
https://creativecommons.org/publicdomain/zero/1.0/
Hello! Welcome to the capstone project I completed to earn my Data Analytics certificate through Google. I chose to complete this case study in RStudio Desktop because R is the primary new concept I learned in this course, and I wanted to embrace my curiosity and learn more about R through this project. At the beginning of this report I will describe the scenario of the case study I was given. After this I will walk you through my data analysis process based on the steps I learned in this course:
The data I used for this analysis comes from this FitBit data set: https://www.kaggle.com/datasets/arashnic/fitbit
"This dataset generated by respondents to a distributed survey via Amazon Mechanical Turk between 03.12.2016-05.12.2016. Thirty eligible Fitbit users consented to the submission of personal tracker data, including minute-level output for physical activity, heart rate, and sleep monitoring."
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The high resolution and mass accuracy of Fourier transform mass spectrometry (FT-MS) have made it an increasingly popular technique for discerning the composition of soil, plant and aquatic samples containing complex mixtures of proteins, carbohydrates, lipids, lignins, hydrocarbons, phytochemicals and other compounds. Thus, there is a growing demand for informatics tools to analyze FT-MS data that will aid investigators seeking to understand the availability of carbon compounds to biotic and abiotic oxidation and to compare fundamental chemical properties of complex samples across groups. We present ftmsRanalysis, an R package that provides an extensive collection of data formatting and processing, filtering, visualization, and sample and group comparison functionalities. The package provides a suite of plotting methods and enables expedient, flexible and interactive visualization of complex datasets through functions that link to Trelliscope, a powerful interactive visualization user interface. Example analysis using FT-MS data from a soil microbiology study demonstrates the core functionality of the package and highlights the capabilities for producing interactive visualizations.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
The three-component reactions of the 16-electron half-sandwich complex CpCo(S2C2B10H10) (Cp = cyclopentadienyl) (1) with ethyl diazoacetate (EDA) and alkynes R1R2 (R1 = Ph, R2 = H; R1 = CO2Me, R2 = H; R1 = R2 = CO2Me; R1 = Fc, R2 = H) at ambient temperature lead to compounds CpCo(S2C2B10H9)(CH2CO2Et) (CHCO2Et)(R1R2) (2–5), CpCo(S2C2B10H9)(CH2CO2Et)(R2–R1–CHCO2Et) (6–9), CpCo(S2C2B10H9)(CH2CO2Et)(CH(Ph)CCHCO2Et) (10), and CpCo(S2C2B10H9)(CH2CO2Et)(CH(Fc)–CH–CCO2Et) (11). In 2–5, one alkyne is stereoselectively inserted into the Co–B bond, one EDA molecule is used to form a sulfide ylide, and the second EDA molecule is inserted into one Co–S bond to form a three-membered metallacyclic ring. At ambient temperature 2–5 undergo rearrangement to 6–9 through migratory insertion of the inserted EDA. Different from 2–5, in 10 phenylacetylene is inserted into the Co–B bond at the terminal carbon and the terminal carbon is coupled with one EDA to afford a six-membered metallacyclic ring with the CO coordination to metal. In 11, a stable Co–B bond is generated, and one EDA and one ethynylferrocene are inserted into the Co–S bond. Moreover, if weakly basic silica is present, 2–4 can lose an apex BH close to the two carbon atoms of o-carborane to give rise to CpCo(S2C2B9H9)(CH2CO2Et)2(R1R2) (12–14) accompanied by the coordination of the two sulfide ylide units to the metal center. The solid-state structures of 2–4, 6–12, and 14 were characterized by X-ray structural analysis.
https://github.com/MIT-LCP/license-and-dua/tree/master/drafts
This dataset of electrodermal activity was collected from 11 healthy volunteer subjects who were awake and at rest in a seated position and 11 different healthy volunteers who were under controlled propofol sedation. For the awake and at rest subjects, the activity was recorded from each subject's non-dominant hand for one hour at 256 Hz. For the controlled propofol sedation subjects, the activity was recorded from each subject's left hand for about 3-4 hours at 500 Hz. From the raw data, EDA pulses were extracted and the pulse times and amplitudes reported. Electrodermal activity measures the changing electrical conductance of the skin as an indicator of sweat gland activity. Sweat glands are a primitive part of the fight-or-flight response. These data were collected as part of a larger study to understand and build computational models for autonomic nervous system activity (including electrodermal activity) with approval from the Massachusetts Institute of Technology Committee on the Use of Humans as Experimental Subjects (COUHES) and the Massachusetts General Hospital Human Research Committee.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Location codes that start with a “1” indicate the front of the body and codes that begin with a “2” indicate the back of the body.
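The coding rule above could be applied when reading the data roughly as follows. This is only a sketch: the function name `body_side` and the example codes are hypothetical, since the description does not give the full code list, and any code not starting with "1" or "2" is reported as unknown instead of being guessed.

```python
def body_side(location_code):
    """Map a location code to the side of the body it refers to.

    Assumption: only the leading character matters, as stated in the
    dataset description ("1" = front, "2" = back).
    """
    code = str(location_code).strip()
    if code.startswith("1"):
        return "front"
    if code.startswith("2"):
        return "back"
    # The description defines no other prefixes, so do not guess.
    return "unknown"


# Hypothetical example codes, for illustration only.
for example in ("101", "205", "x7"):
    print(example, "->", body_side(example))
```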
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Malaria is a mosquito-borne disease spread by an infected vector (infected female Anopheles mosquito) or through transfusion of Plasmodium-infected blood to susceptible individuals. The disease burden has resulted in high global mortality, particularly among children under the age of five. Many intervention responses have been implemented to control malaria transmission, including blood screening, Long-Lasting Insecticide Bed Nets (LLIN), treatment with an anti-malaria drug, spraying chemicals/pesticides on mosquito breeding sites, and indoor residual spray, among others. Accordingly, an SIR (Susceptible-Infected-Recovered) model was developed to study the impact of various malaria control and mitigation strategies. The associated basic reproduction number and stability theory are used to analyze the stability of the model equilibrium points. By constructing an appropriate Lyapunov function, the global stability of the malaria-free equilibrium is investigated. By determining the direction of bifurcation, the implicit function theorem is used to investigate the stability of the model endemic equilibrium. The model is fitted to malaria data from Benue State, Nigeria, using R and MATLAB, and the model parameters are estimated. Following that, an optimal control model is developed and analyzed using Pontryagin's Maximum Principle. The malaria-free equilibrium point is locally and globally stable if the basic reproduction number (R0) and the blood transfusion reproduction number (Rα) are both less than or equal to unity. The sensitivity analysis revealed that the transmission rate of malaria from mosquito to human (βmh), the transmission rate from human to mosquito (βhm), the blood transfusion reproduction number (Rα) and the recruitment rate of mosquitoes (bm) are all sensitive parameters capable of increasing the basic reproduction number (R0), thereby increasing the risk of spreading malaria.
The optimal control results show that five possible controls are effective in reducing the transmission of malaria. The study recommends the combination of all five controls, followed by combinations of four and three controls, for mitigating malaria transmission. The optimal control simulations also revealed that for communities or areas where resources are scarce, the combination of Long-Lasting Insecticide-Treated Bednets (u2), treatment (u3), and indoor insecticide spray (u5) is recommended. Numerical simulations are performed to validate the model's analytical results.
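To make the role of the basic reproduction number concrete, the following is a minimal sketch of a generic SIR model in Python. It is not the study's malaria model, which additionally includes mosquito and blood-transfusion dynamics and five controls. The function names, the forward Euler integration, and all parameter values (`beta`, `gamma`, population sizes) are illustrative assumptions, not estimates fitted to the Benue State data. In this simplified setting the basic reproduction number is beta / gamma.

```python
def basic_reproduction_number(beta, gamma):
    """R0 of the generic SIR model: transmission rate over recovery rate."""
    return beta / gamma


def simulate_sir(beta, gamma, s0, i0, rec0, days, dt=0.1):
    """Integrate the generic SIR equations with forward Euler steps.

    dS/dt = -beta * S * I / N
    dI/dt =  beta * S * I / N - gamma * I
    dR/dt =  gamma * I
    Returns a list of (time, S, I, R) tuples.
    """
    n = s0 + i0 + rec0
    s, i, r = float(s0), float(i0), float(rec0)
    history = [(0.0, s, i, r)]
    steps = int(round(days / dt))
    for k in range(1, steps + 1):
        new_infections = beta * s * i / n * dt
        new_recoveries = gamma * i * dt
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        history.append((k * dt, s, i, r))
    return history


# Illustrative parameter values only (assumptions, not fitted estimates).
for beta in (0.05, 0.3):
    gamma = 0.1
    run = simulate_sir(beta, gamma, s0=990, i0=10, rec0=0, days=200)
    peak_infected = max(state[2] for state in run)
    print("R0 =", round(basic_reproduction_number(beta, gamma), 2),
          "peak infected =", round(peak_infected, 1))
```

With these assumed values, the run with R0 below one never exceeds its initial number of infections, while the run with R0 above one produces an outbreak. This matches the threshold behaviour described above for the malaria-free equilibrium, but only in the much simpler generic setting.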