29 datasets found
  1. Additional file 2 of An updated analysis of safety climate and downstream...

    • researchdiscovery.drexel.edu
    • springernature.figshare.com
    Updated May 22, 2024
    Cite
    Ashley M. Geczik; Jin Lee; Joseph A. Allen; Madison E. Raposa; Lucy F. Robinson; D. Alex Quistberg; Andrea L. Davis; Jennifer A. Taylor (2024). Additional file 2 of An updated analysis of safety climate and downstream outcomes in two convenience samples of U.S. fire departments (FOCUS 1.0 and 2.0 survey waves) [Dataset]. https://researchdiscovery.drexel.edu/esploro/outputs/dataset/Additional-file-2-of-An-updated/991021898823604721
    Dataset updated
    May 22, 2024
    Dataset provided by
    figshare
    Authors
    Ashley M. Geczik; Jin Lee; Joseph A. Allen; Madison E. Raposa; Lucy F. Robinson; D. Alex Quistberg; Andrea L. Davis; Jennifer A. Taylor
    Time period covered
    Aug 15, 2024
    Description

    Additional file 2: Supplemental Figure 1. Flowcharts of the analytic samples for FOCUS 1.0 and FOCUS 2.0 survey waves. Supplemental Figure 2A. Box and whisker plots comparing FOCUS safety climate scores by size variables for FOCUSv.1.0 departments. Supplemental Figure 2B. Box and whisker plots comparing FOCUS safety climate scores by size variables for FOCUSv.2.0 departments.

  2. UC_vs_US Statistic Analysis.xlsx

    • figshare.com
    xlsx
    Updated Jul 9, 2020
    Cite
    F. (Fabiano) Dalpiaz (2020). UC_vs_US Statistic Analysis.xlsx [Dataset]. http://doi.org/10.23644/uu.12631628.v1
    Available download formats: xlsx
    Dataset updated
    Jul 9, 2020
    Dataset provided by
    Utrecht University
    Authors
    F. (Fabiano) Dalpiaz
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Sheet 1 (Raw-Data): The raw data of the study, presenting the tagging results for the measures described in the paper. For each subject, it includes the following columns:
    A. a sequential student ID
    B. an ID that defines a random group label and the notation
    C. the notation used: User Stories or Use Cases
    D. the case they were assigned to: IFA, Sim, or Hos
    E. the subject's exam grade (total points out of 100); empty cells mean that the subject did not take the first exam
    F. a categorical representation of the grade (L/M/H), where H is greater than or equal to 80, M is from 65 (included) to 80 (excluded), and L otherwise
    G. the total number of classes in the student's conceptual model
    H. the total number of relationships in the student's conceptual model
    I. the total number of classes in the expert's conceptual model
    J. the total number of relationships in the expert's conceptual model
    K-O. the total number of encountered situations of alignment, wrong representation, system-oriented, omitted, and missing (see the tagging scheme below)
    P. the researchers' judgement on how well the derivation process was explained by the student: well explained (a systematic mapping that can be easily reproduced), partially explained (a vague indication of the mapping), or not present
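The grade categories in column F follow fixed thresholds. A minimal sketch of that mapping is given here; the function name `grade_category` and the use of `None` for an empty cell are illustrative assumptions, not part of the spreadsheet.

```python
def grade_category(points):
    """Map an exam grade (0-100) to the L/M/H category of column F.

    Thresholds are taken from the sheet description: H >= 80,
    65 <= M < 80, L otherwise. An empty cell (subject did not take
    the first exam) is represented as None here, which is an assumption.
    """
    if points is None:
        return None
    if points >= 80:
        return "H"
    if points >= 65:
        return "M"
    return "L"


for grade in (92, 80, 79.5, 65, 64, None):
    print(grade, "->", grade_category(grade))
```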

    Tagging scheme:
    Aligned (AL) - A concept is represented as a class in both models, either with the same name or using synonyms or clearly linkable names;
    Wrongly represented (WR) - A class in the domain expert model is incorrectly represented in the student model, either (i) via an attribute, method, or relationship rather than a class, or (ii) using a generic term (e.g., "user" instead of "urban planner");
    System-oriented (SO) - A class in CM-Stud that denotes a technical implementation aspect, e.g., access control. Classes that represent a legacy system or the system under design (portal, simulator) are legitimate;
    Omitted (OM) - A class in CM-Expert that does not appear in any way in CM-Stud;
    Missing (MI) - A class in CM-Stud that does not appear in any way in CM-Expert.

    All the calculations and information provided in the following sheets originate from that raw data.

    Sheet 2 (Descriptive-Stats): Shows a summary of statistics from the data collection, including the number of subjects per case, per notation, per process derivation rigor category, and per exam grade category.

    Sheet 3 (Size-Ratio): The size ratio is calculated as the number of classes in the student model divided by the number of classes in the expert model. We provide box plots to allow a visual comparison of the shape of the distribution, its central value, and its variability for each group (by case, notation, process, and exam grade). The primary focus of this study is on the number of classes. However, we also provide the size ratio for the number of relationships between the student and expert models.

    Sheet 4 (Overall): Provides an overview of all subjects regarding the encountered situations, completeness, and correctness. Correctness is defined as the ratio of classes in a student model that are fully aligned with the classes in the corresponding expert model. It is calculated by dividing the number of aligned concepts (AL) by the sum of the aligned concepts (AL), omitted concepts (OM), system-oriented concepts (SO), and wrong representations (WR). Completeness, on the other hand, is defined as the ratio of classes in a student model that are correctly or incorrectly represented over the number of classes in the expert model. It is calculated by dividing the sum of aligned concepts (AL) and wrong representations (WR) by the sum of the aligned concepts (AL), wrong representations (WR), and omitted concepts (OM). The overview is complemented with general diverging stacked bar charts that illustrate correctness and completeness.
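The ratios described for Sheets 3 and 4 can be written out as a short sketch. The formulas follow the description above; the `TagCounts` container, the function names, and the example counts are hypothetical and only serve illustration.

```python
from dataclasses import dataclass


@dataclass
class TagCounts:
    """Per-student counts of the tagged situations (hypothetical container)."""
    al: int  # aligned
    wr: int  # wrongly represented
    so: int  # system-oriented
    om: int  # omitted
    mi: int  # missing (not used by either ratio below)


def correctness(t):
    """AL / (AL + OM + SO + WR), as described for Sheet 4."""
    denom = t.al + t.om + t.so + t.wr
    return t.al / denom if denom else None


def completeness(t):
    """(AL + WR) / (AL + WR + OM), as described for Sheet 4."""
    denom = t.al + t.wr + t.om
    return (t.al + t.wr) / denom if denom else None


def size_ratio(student_classes, expert_classes):
    """Classes in the student model divided by classes in the expert model."""
    return student_classes / expert_classes


# Made-up counts for one student, purely for illustration.
counts = TagCounts(al=6, wr=2, so=1, om=1, mi=3)
print(correctness(counts))   # 6 / (6 + 1 + 1 + 2)
print(completeness(counts))  # (6 + 2) / (6 + 2 + 1)
print(size_ratio(12, 10))
```

Returning `None` for an empty denominator is a design choice of this sketch; the spreadsheet may handle that case differently.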

    For sheet 4 as well as for the following four sheets, diverging stacked bar charts are provided to visualize the effect of each of the independent and mediated variables. The charts are based on the relative numbers of encountered situations for each student. In addition, a "Buffer" is calculated, which solely serves the purpose of constructing the diverging stacked bar charts in Excel. Finally, at the bottom of each sheet, the significance (t-test) and effect size (Hedges' g) for both completeness and correctness are provided. Hedges' g was calculated with an online tool: https://www.psychometrica.de/effect_size.html. The independent and moderating variables can be found as follows:

    Sheet 5 (By-Notation): Model correctness and model completeness are compared by notation - UC, US.

    Sheet 6 (By-Case): Model correctness and model completeness are compared by case - SIM, HOS, IFA.

    Sheet 7 (By-Process): Model correctness and model completeness are compared by how well the derivation process is explained - well explained, partially explained, not present.

    Sheet 8 (By-Grade): Model correctness and model completeness are compared by the exam grades, converted to the categorical values High, Medium, and Low.
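The description says Hedges' g was obtained with an online tool, so the exact computation is not shown in the source. As a rough sketch, the textbook definition (Cohen's d with a pooled standard deviation, multiplied by the usual small-sample correction) looks as follows; treating this as equivalent to the tool's output is an assumption, and the two sample groups are invented.

```python
import math
from statistics import mean, variance


def hedges_g(group_a, group_b):
    """Textbook Hedges' g for two independent groups.

    d = (mean_a - mean_b) / pooled_sd, then multiplied by the common
    approximate correction factor 1 - 3 / (4 * df - 1), df = n_a + n_b - 2.
    Whether the online tool uses exactly this variant is an assumption.
    """
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    d = (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)
    return d * (1 - 3 / (4 * df - 1))


# Invented scores for two groups, e.g. one per notation.
uc = [6, 7, 8, 9]
us = [5, 6, 7, 8]
print(round(hedges_g(uc, us), 4))
```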

  3. Figure 1. TDP-43 phosphorylation by CK1δ and C-terminal phosphomimetic...

    • search.sourcedata.io
    zip
    Updated Jan 1, 2022
    Cite
    Lara Aletta Gruijs da Silva (2022). : Figure 1-F to G [Dataset]. http://doi.org/10.15252/embj.2021108443/figure/1/panel/5
    Available download formats: zip
    Dataset updated
    Jan 1, 2022
    Authors
    Lara Aletta Gruijs da Silva
    License

    Attribution 2.0 (CC BY 2.0) https://creativecommons.org/licenses/by/2.0/
    License information was derived automatically

    Variables measured
    TDP-43
    Description

    Representative bright field microscopic images of TDP-43 condensates (in Hepes buffer), Bar, 25 µm (F) and quantification of condensate number (G). Box plots show the comparison of median and inter-quartile range (upper and lower quartiles) of all fields of view (FOV) from Min to Max (whiskers) of two replicates (≥ 22 FOV per condition). *p < 0.0332, **p < 0.0021 and ***p < 0.0002 by one-way ANOVA with Dunnett's multiple comparison test to Wt, comparing the respective concentration condition (5, 10, 20 µM). List of tagged entities: TARDBP (uniprot:Q13148), TARDBP (ncbigene:23435), TEV protease (uniprot:P04517), brightfield microscopy (bao:BAO_0000457), dose response design (obi:OBI_0001418)

  4. Pneumonia Imbalance Chest X-Ray Dataset

    • kaggle.com
    zip
    Updated Dec 17, 2023
    Cite
    Ashvath S.P (2023). Pneumonia Imbalance Chest X-Ray Dataset [Dataset]. https://www.kaggle.com/datasets/ashvath07/pneumonia-imbalance-chest-x-ray-dataset
    Available download formats: zip (985981313 bytes)
    Dataset updated
    Dec 17, 2023
    Authors
    Ashvath S.P
    Description

    https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F10740475%2F292038228cf850864acacff21b9ff2db%2FPneumonia%20imbalance%20data%20(11).png?generation=1741359821707786&alt=media

    We built a new CXR dataset for pneumonia detection by amalgamating the original CXR dataset with two other CXR datasets, which adds two new classes, Tuberculosis (TB) and Bacterial Pneumonia (BP). We deliberately maintain a substantial class imbalance across the classes. In the training set, we chose 1946 images for BP, 2531 for Covid, 4209 for LO, 7134 for Normal, 490 for TB, and 941 for VP. This deliberate decision makes the dataset more challenging than its previous version.
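To make the imbalance concrete, the training-set counts quoted above can be turned into class shares and an imbalance ratio. The counts come from the description; the inverse-frequency class weights at the end are a common remedy added here as an assumption, not something the dataset authors describe.

```python
# Training-set image counts as quoted in the description.
train_counts = {
    "BP": 1946,
    "Covid": 2531,
    "LO": 4209,
    "Normal": 7134,
    "TB": 490,
    "VP": 941,
}

total = sum(train_counts.values())
shares = {name: count / total for name, count in train_counts.items()}

# Ratio between the largest and the smallest class.
imbalance_ratio = max(train_counts.values()) / min(train_counts.values())

# Inverse-frequency weights (assumption: one possible way to counter the skew).
class_weights = {
    name: total / (len(train_counts) * count)
    for name, count in train_counts.items()
}

print("total images:", total)
print("imbalance ratio:", round(imbalance_ratio, 2))
for name in train_counts:
    print(f"{name}: share={shares[name]:.3f} weight={class_weights[name]:.3f}")
```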

    We labeled this new dataset 'Pneumonia Imbalance CXR Dataset'. The sources from which we assembled the two new classes, BP and TB, are [12] and [13], respectively. To clarify, we have not introduced any additional images for the Normal, LO, Covid, and Viral Pneumonia (VP) classes of the original CXR dataset. The images of these four classes remain unchanged from the previous version of the CXR dataset. Rather, we isolated the Bacterial Pneumonia (BP) class images from the first source [12] and introduced them as a new class directly into the new dataset. Likewise, TB images were included as a separate class from the second source [13]. Compared to the existing CXR dataset, this new dataset is more skewed and challenging, making it more representative of real-world hospital scenarios. This can also be seen in Fig. 1b: the number of images per class is more diverse in this CXR dataset, which makes the problem more challenging for conventional neural networks.

    It is shown that conventional models, including CNN and Vision Transformer, did not perform well on this Imbalance Pneumonia dataset. Moreover, Fig. 1a shows box plots of the correlation coefficient [23] for the various classes. Here, we took a random image from each class, computed its correlation coefficient with respect to all other images in that class, and then plotted these box plots. From Fig. 1a, it is evident that the boxes are thicker for the Covid, Normal, and TB classes, indicating a higher intra-class variance.
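The intra-class procedure described above (one random reference image per class, correlated against every other image of that class) might be sketched as follows. This assumes the Pearson correlation coefficient on flattened pixel values and images already resized to a common shape; the description does not state either detail, and the tiny "images" are made up.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences of pixel values."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    # Note: a constant image has zero variance and would divide by zero.
    return cov / (norm_x * norm_y)


def intra_class_correlations(images, rng=None):
    """Correlate one randomly chosen reference image with all others in the class."""
    rng = rng or random.Random(0)
    ref_index = rng.randrange(len(images))
    reference = images[ref_index]
    return [
        pearson(reference, image)
        for index, image in enumerate(images)
        if index != ref_index
    ]


# Toy class with three flattened 2x2 "images" (illustrative only).
toy_class = [[0, 1, 2, 3], [0, 2, 4, 6], [3, 2, 1, 0]]
print(intra_class_correlations(toy_class))
```

The resulting list per class is what a box plot would summarize; a wider box then corresponds to a wider spread of correlations within that class.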

    https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F10740475%2F5b52f4941a8cd8f70cba5e3bea82cc44%2Fintra-class-correlation-table.png?generation=1742629139434303&alt=media

    https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F10740475%2F73b304bd71be1deae5132b027d3296b2%2Finter-class-correlation-table.png?generation=1742632825027072&alt=media

    The increased intra-class variance in these classes makes the Pneumonia Imbalance dataset more challenging. The mean and standard deviation of the intra-class and inter-class correlations are presented in Table 1 and Table 2, respectively. From this, it can be concluded that this new Pneumonia Imbalance CXR dataset is more diverse and challenging than any other existing Covid-19 dataset.

    This "Pneumonia Imbalance Dataset" is associated with the ref [3]. If you're employing our dataset for experimentation or paper publications, kindly cite our paper [3].

    References:
    12. Kermany, Daniel S., et al. "Identifying medical diagnoses and treatable diseases by image-based deep learning." Cell, vol. 172, no. 5, pp. 1122-1131, 2018.
    13. Rahman, Tawsifur, et al. "Reliable tuberculosis detection using chest X-ray with deep learning, segmentation and visualization." IEEE Access, vol. 8, pp. 191586-191601, 2020.

  5. Data from: Hydroclimate Projections for Select U.S. Fish and Wildlife...

    • catalog.data.gov
    • data.usgs.gov
    Updated Oct 29, 2025
    Cite
    U.S. Geological Survey (2025). Hydroclimate Projections for Select U.S. Fish and Wildlife Service Properties - Mountain-Prairie Region, 1951-2099 [Dataset]. https://catalog.data.gov/dataset/hydroclimate-projections-for-select-u-s-fish-and-wildlife-service-properties-mountain-1951-0f5b2
    Dataset updated
    Oct 29, 2025
    Dataset provided by
    U.S. Geological Survey
    Area covered
    Canadian Prairies
    Description

    This data release contains time series and plots summarizing mean monthly temperature (TAVE), total monthly precipitation (PPT), and runoff (RO) from the U.S. Geological Survey Monthly Water Balance Model at 115 National Wildlife Refuges within the U.S. Fish and Wildlife Service Mountain-Prairie Region (CO, KS, MT, NE, ND, SD, UT, and WY). These three variables are derived from two sets of statistically downscaled general circulation models from 1951 through 2099. The three variables (TAVE, PPT, and RO for refuge areas) were summarized for comparison across four 19-year periods: historic (1951-1969), baseline (1981-1999), 2050 (2041-2059), and 2080 (2071-2089). For each refuge, mean monthly plots, seasonal box plots, and annual envelope plots were produced for each of the four periods.
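The four comparison periods can be expressed as a small lookup, with mean monthly values aggregated per period. The period boundaries are taken from the description; the record layout `(year, month, value)` and the function names are assumptions for illustration and do not reflect the actual file format of the data release.

```python
from collections import defaultdict

# Period labels and inclusive year ranges from the description.
PERIODS = {
    "historic": (1951, 1969),
    "baseline": (1981, 1999),
    "2050": (2041, 2059),
    "2080": (2071, 2089),
}


def period_for_year(year):
    """Return the period label for a year, or None if it falls in a gap."""
    for label, (start, end) in PERIODS.items():
        if start <= year <= end:
            return label
    return None


def mean_monthly_by_period(records):
    """Average values per (period, month); records are (year, month, value)."""
    buckets = defaultdict(list)
    for year, month, value in records:
        label = period_for_year(year)
        if label is not None:
            buckets[(label, month)].append(value)
    return {key: sum(vals) / len(vals) for key, vals in buckets.items()}


# Made-up January and July values, not real model output.
sample = [(1951, 1, 2.0), (1969, 1, 4.0), (1975, 1, 100.0), (2041, 7, 10.0)]
print(mean_monthly_by_period(sample))
```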

  6. WIDEa: a Web Interface for big Data exploration, management and analysis

    • entrepot.recherche.data.gouv.fr
    Updated Sep 12, 2021
    Cite
    Philippe Santenoise (2021). WIDEa: a Web Interface for big Data exploration, management and analysis [Dataset]. http://doi.org/10.15454/AGU4QE
    Dataset updated
    Sep 12, 2021
    Dataset provided by
    Recherche Data Gouv
    Authors
    Philippe Santenoise
    License

    https://entrepot.recherche.data.gouv.fr/api/datasets/:persistentId/versions/1.0/customlicense?persistentId=doi:10.15454/AGU4QE

    Description

    WIDEa is R-based software that provides users with a range of functionalities to explore, manage, clean and analyse "big" environmental and (in/ex situ) experimental data. These functionalities are the following:
    1. Loading/reading different data types: basic (called normal), temporal, infrared spectra of the mid/near region (called IR) with frequency (wavenumber) used as the unit (in cm-1);
    2. Interactive data visualization from a multitude of graph representations: 2D/3D scatter-plot, box-plot, hist-plot, bar-plot, correlation matrix;
    3. Manipulation of variables: concatenation of qualitative variables, transformation of quantitative variables by generic functions in R;
    4. Application of mathematical/statistical methods;
    5. Creation/management of data (named flag data) considered as atypical;
    6. Study of normal distribution model results for different strategies: calibration (checking assumptions on residuals), validation (comparison between measured and fitted values). The model form can be more or less complex: mixed effects, main/interaction effects, weighted residuals.

  7. Predict Term Deposit

    • kaggle.com
    zip
    Updated Nov 29, 2021
    Cite
    Aslan Ahmedov (2021). Predict Term Deposit [Dataset]. https://www.kaggle.com/aslanahmedov/predict-term-deposit
    Available download formats: zip (588608 bytes)
    Dataset updated
    Nov 29, 2021
    Authors
    Aslan Ahmedov
    Description

    Predict Term Deposit

    Introduction

    A bank has multiple banking products that it sells to customers, such as savings accounts, credit cards, investments, etc. It wants to know which customers will purchase its credit cards. For this it has various kinds of information regarding the demographic details of the customers, their banking behavior, etc. Once it can predict the chances that a customer will purchase a product, it wants to use the same to make pre-payment to the authors.

    In this part I will demonstrate how to build a model that predicts which clients will subscribe to a term deposit, using machine learning. In the first part we will deal with the description and visualization of the analysed data, and in the second we will move on to data classification models.

    Strategy

    - Desired target
    - Data understanding
    - Preprocessing data
    - Machine learning model
    - Prediction
    - Comparing results

    Desired Target

    Predict if a client will subscribe (yes/no) to a term deposit — this is defined as a classification problem.

    Data

    The dataset (Assignment-2_data.csv) used in this assignment contains bank customers’ data. File name: Assignment-2_Data. File format: .csv. Number of rows: 45212. Number of attributes: 17 non-empty conditional attributes and one decision attribute.

    Image: https://user-images.githubusercontent.com/91852182/143783430-eafd25b0-6d40-40b8-ac5b-1c4f67ca9e02.png
    Image: https://user-images.githubusercontent.com/91852182/143783451-3e49b817-29a6-4108-b597-ce35897dda4a.png

    Exploratory Data Analysis (EDA)

    Data pre-processing is a main step in machine learning: the useful information that can be derived from the data set directly affects model quality, so it is extremely important to do at least the necessary preprocessing before feeding the data into our model.

    In this assignment, we are going to utilize python to develop a predictive machine learning model. First, we will import some important and necessary libraries.

    Below we can see that there are various numerical and categorical columns. The most important column here is y, which is the output variable (desired target): it tells us whether the client subscribed to a term deposit (binary: ‘yes’, ‘no’).

    Image: https://user-images.githubusercontent.com/91852182/143783456-78c22016-149b-4218-a4a5-765ca348f069.png

    We must check whether our dataset has any missing values and whether it has any duplicated values.

    Image: https://user-images.githubusercontent.com/91852182/143783471-a8656640-ec57-4f38-8905-35ef6f3e7f30.png

    We can see that 'age' has 9 missing values and 'balance' has 3. Since the dataset has around 45k rows, I will remove these rows from the dataset. Pictures 1 and 2 show the data before and after.

    Image: https://user-images.githubusercontent.com/91852182/143783474-b3898011-98e3-43c8-bd06-2cfcde714694.png

    From the above analysis we can see that only 5289 people out of 45200 have subscribed, which is roughly 12%. The dataset is highly unbalanced, and we need to keep that in mind.

    Image: https://user-images.githubusercontent.com/91852182/143783534-a05020a8-611d-4da1-98cf-4fec811cb5d8.png
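The two cleaning steps discussed above, dropping rows with missing 'age' or 'balance' and checking how unbalanced the target is, can be sketched without any external library. The original notebook code is only available as screenshots, so the helper names and the sample rows below are hypothetical; only the column names and the 5289 / 45200 figures come from the description.

```python
def drop_incomplete(rows, required=("age", "balance")):
    """Keep only rows in which every required column has a value."""
    return [row for row in rows if all(row.get(col) is not None for col in required)]


def positive_share(rows, target="y", positive="yes"):
    """Fraction of rows whose target column equals the positive label."""
    if not rows:
        return 0.0
    return sum(1 for row in rows if row.get(target) == positive) / len(rows)


# Invented sample rows; None marks a missing value.
sample = [
    {"age": 41, "balance": 1200, "y": "no"},
    {"age": None, "balance": 300, "y": "yes"},
    {"age": 35, "balance": None, "y": "no"},
    {"age": 52, "balance": 80, "y": "yes"},
]

clean = drop_incomplete(sample)
print(len(sample), "rows before,", len(clean), "rows after")
print(f"share of 'yes' in the sample: {positive_share(clean):.1%}")

# Share reported in the description: 5289 subscribers out of 45200 rows.
print(f"reported share: {5289 / 45200:.1%}")
```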

    Our list of categorical variables.

    Image: https://user-images.githubusercontent.com/91852182/143783542-d40006cd-4086-4707-a683-f654a8cb2205.png

    Our list of numerical variables.

    Image: https://user-images.githubusercontent.com/91852182/143783551-6b220f99-2c4d-47d0-90ab-18ede42a4ae5.png

    "Age" Q-Q Plots and Box Plot.

    In the boxplot above, we can see some points at very young ages as well as at impossible ages. So,

    Image: https://user-images.githubusercontent.com/91852182/143783564-ad0e2a27-5df5-4e04-b5d7-6d218cabd405.png
    Image: https://user-images.githubusercontent.com/91852182/143783589-5abf0a0b-8bab-4192-98c8-d2e04f32a5c5.png

    Now we don’t have issues with this feature, so we can use it.

    Image: https://user-images.githubusercontent.com/91852182/143783599-5205eddb-a0f5-446d-9f45-cc1adbfcce67.png
    Image: https://user-images.githubusercontent.com/91852182/143783601-e520d59c-3b21-4627-a9bb-cac06f415a1e.png

    "Duration" Q-Q Plots and Box Plot

    Image: https://user-images.githubusercontent.com/91852182/143783634-03e5a584-a6fb-4bcb-8dc5-1f3cc50f9507.png
    Image: https://user-images.githubusercontent.com/91852182/143783640-f6e71323-abbe-49c1-9935-35ffb2d10569.png

    This attribute highly affects the output target (e.g., if duration=0 then y=’no’). Yet, the duration is not known before a call is performed. Also, after the end of the call y is obviously known. Thus, this input should only be included for benchmark purposes...

  8. Comparison of result on optimal design of industrial refrigeration system.

    • plos.figshare.com
    xls
    Updated Jun 5, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Yu Li; Xiao Liang; Jingsen Liu; Huan Zhou (2023). Comparison of result on optimal design of industrial refrigeration system. [Dataset]. http://doi.org/10.1371/journal.pone.0276210.t015
    Available download formats: xls
    Dataset updated
    Jun 5, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Yu Li; Xiao Liang; Jingsen Liu; Huan Zhou
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Comparison of result on optimal design of industrial refrigeration system.

  9. Indicator species analysis comparing the noma lesion site samples (“Noma” in...

    • datasetcatalog.nlm.nih.gov
    • plos.figshare.com
    Updated Dec 4, 2014
    Cite
    Whiteson, Katrine L.; Francois, Patrice; Tangomo-Bento, Manuela; Lazarevic, Vladimir; Schrenzel, Jacques; Girard, Myriam; Maughan, Heather; Pittet, Didier (2014). Indicator species analysis comparing the noma lesion site samples (“Noma” in “Indicator of” column) with the other four health status groups (control, noma healthy site, ANG healthy site, and ANG lesion site; “Other” in “Indicator of” column). [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001193425
    Dataset updated
    Dec 4, 2014
    Authors
    Whiteson, Katrine L.; Francois, Patrice; Tangomo-Bento, Manuela; Lazarevic, Vladimir; Schrenzel, Jacques; Girard, Myriam; Maughan, Heather; Pittet, Didier
    Description

    Indicator values were calculated using the abundance distribution of OTUs at the 97% identity cutoff. Boldface highlights the taxa that are indicators for noma. The abundance ranges for taxa with high indicator species values are shown in box plots in Figure 4A. Results with all groups in “Other” being treated separately are shown in Table S6.

  10. Additional file 2: of Unsupervised correction of gene-independent cell...

    • datasetcatalog.nlm.nih.gov
    • springernature.figshare.com
    Updated Aug 14, 2018
    Cite
    Butler, Adam; Iorio, Francesco; Saez-Rodriguez, Julio; Ansari, Rizwan; Wilkinson, Piers; Bhosle, Shriram; Chen, Elisabeth; Shepherd, Rebecca; Harper, Sarah; Garnett, Mathew; Behan, Fiona; Beaver, Charlotte; Yusa, Kosuke; Pooley, Rachel; Stronach, Euan; Gonçalves, Emanuel (2018). Additional file 2: of Unsupervised correction of gene-independent cell responses to CRISPR-Cas9 targeting [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000679482
    Dataset updated
    Aug 14, 2018
    Authors
    Butler, Adam; Iorio, Francesco; Saez-Rodriguez, Julio; Ansari, Rizwan; Wilkinson, Piers; Bhosle, Shriram; Chen, Elisabeth; Shepherd, Rebecca; Harper, Sarah; Garnett, Mathew; Behan, Fiona; Beaver, Charlotte; Yusa, Kosuke; Pooley, Rachel; Stronach, Euan; Gonçalves, Emanuel
    Description

    Figure S1. CRISPR-KO screening data quality assessment. (A) Average correlation between sgRNAs read-count replicates across cell lines. (B) Receiver operating characteristic (ROC) curve obtained from classifying fitness essential (FE) and non-essential genes based on the average logFC of their targeting sgRNAs. An example cell line OVCAR-8 is shown. (C) Area under the ROC (AUROC) curve obtained for cell lines from classifying FE and non-essential genes based on the average logFC of their targeting sgRNAs. (D) Recall for sets of a priori known essential genes from MSigDB and from literature when classifying FE and non-essential genes across cell lines (5% FDR). Each circle represents a cell line and is coloured by tissue type. Box and whisker plots show median, inter-quartile ranges and 95% confidence intervals. (E) Genes ranked based on the average logFC of targeting sgRNAs for OVCAR-8 and enrichment of genes belonging to predefined sets of a priori known essential genes from MSigDB, at an FDR equal to 5% when classifying FE (second last column) and non-essential genes (last column). Blue numbers at the bottom indicate the classification true positive rate (recall).

    Figure S2. Assessment of copy number bias before and after CRISPRcleanR correction across cell lines. sgRNA logFC values before and after CRISPRcleanR for eight cell lines are shown classified based on copy number (amplified or deleted) and expression status. Copy number segments were identified using Genomics of Drug Sensitivity in Cancer (GDSC) and Cell Line Encyclopedia (CCLE) datasets. Box and whisker plots show median, inter-quartile ranges and 95% confidence intervals. Asterisks indicate significant associations between sgRNA LogFC values (Welch's t-test, p < 0.005) and their different effect sizes accounting for the standard deviation (Cohen’s D value), compared to the whole sgRNA library.

    Figure S3. CN-associated effect on sgRNA logFC values in highly biased cell lines. For 3 cell lines, recall curves of non-essential genes, fitness essential genes, copy number (CN) amplified and CN amplified non-expressed genes obtained when classifying genes based on the average logFC values of their targeting sgRNAs.

    Figure S4. Assessment of CN-associated bias across all cell lines. LogFC values of sgRNAs averaged within segments of equal copy number (CN). One plot per cell line, with CN values at which significant differences (Welch's t-test, p < 0.05) with respect to the logFCs corresponding to CN = 2 are initially observed (bias starting point) and start to significantly increase continuously (bias critical point). CN-associated bias is shown for all sgRNA, when excluding FE genes and histones, and for non-expressed genes only. Box and whisker plots show median, inter-quartile ranges and 95% confidence intervals.

    Figure S5. CRISPRcleanR correction varying the minimal number of genes required and the effect of fitness essential genes. Recall reduction of (A) amplified or (B) amplified not-expressed genes versus that of fitness essential and other prior known essential genes, when comparing CRISPRcleanR correction varying the minimal number of genes to be targeted by sgRNA in a biased segment (default parameter is n = 3). Similar results were observed when performing the analysis including or excluding known essential genes.

    Figure S6. CRISPRcleanR performances across 342 cell lines from an independent dataset. Recall at 5% FDR of predefined sets of genes based on their uncorrected or corrected logFCs (coordinates on the two axes) averaged across targeting sgRNAs for 342 cell lines from the Project Achilles.

    Figure S7. CRISPRcleanR performances in relation to data quality. The impact of data quality on recall at 5% false discovery rate (FDR) assessed following CRISPRcleanR correction for predefined sets of genes. Project Achilles data (n = 342 cell lines) was binned based on the quality of the uncorrected essentiality profile. This is obtained by measuring the recall at 5% FDR for predefined essential genes (from the Molecular Signature Database) and grouping the cell lines in 10 equidistant bins (1 lowest quality and 10 highest quality) when sorting them based on this value. Recall increment for fitness essential genes was greatest for the lower quality data, indicating that CRISPRcleanR can improve true signal of gene depletion in low quality data.

    Figure S8. Minimal impact of CRISPRcleanR on loss/gain-of-fitness effects. (A) The percentage of genes where the significance of their fitness effect (gain- or loss-of-fitness) is altered after CRISPRcleanR for Project Score and Project Achilles data. The upper row shows correction effects for all screened genes and the lower row for the subset of genes with a significant effect in the uncorrected data. Each dot is a separate cell line. Blue dots indicate the percentage of genes where significance is lost or gained post correction. Green dots indicate the percentage of genes where the fitness effect is distorted and the effect is opposite in the uncorrected data. (B) The majority of the loss-of-fitness genes impacted by correction are putative false positive effects affecting genes which are either not-expressed (FPKM < 0.5), amplified, known non-essential, or exhibit a mild phenotype in the screening data. (C) Summary of overall impact of CRISPRcleanR on fitness effects following correction when considering data for all cell lines. The colors reflect the percentage of genes with a loss-of-fitness, no phenotype or gain-of-fitness effect which are retained in the corrected data.

    Figure S9. CRISPRcleanR retains cancer driver gene dependencies in Project Score and Achilles data. (A) Each circle represents a tested cancer driver gene dependency (mutation or amplification of a copy number segment) and the statistical significance using MaGeCK before (x-axis) and after (y-axis) CRISPRcleanR correction, across the two screens. Plots in the first row show depletion FDR values pre/post-correction, whereas those in the second row show depletion FDR values pre-correction and enrichment FDR values post-correction. (B) Details of the tested genetic dependencies and whether they are shared before and after CRISPRcleanR correction at two different thresholds of statistical significance (5 and 10% FDR, respectively for 1st and 2nd row of plots). The third row indicates the type of alteration involving the cancer driver genes under consideration and the total number of cell lines with an alteration. (ZIP 191 kb)

  11. RAAS markers and COVID-19

    • data.mendeley.com
    Updated Sep 5, 2022
    Cite
    Nisha Parikh (2022). RAAS markers and COVID-19 [Dataset]. http://doi.org/10.17632/6dzn4yxc3s.2
    Dataset updated
    Sep 5, 2022
    Authors
    Nisha Parikh
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Supplementary Figure 1A: Box and Whisker Plots of log Aldosterone to Renin Ratio, additionally adjusted for body mass index Supplementary Figure 1B. Box and Whisker Plots of log Renin, additionally adjusted for body mass index Supplementary Figure 1C. Box and Whisker Plots of log Aldosterone, additionally adjusted for body mass index Supplementary Figure 2. Box and Whisker Plots of log ACE activity, additionally adjusted for body mass index

  12. Raw Data for Jarema et al, 2022 Developmental Neurotoxicity and Behavioral...

    • datasets.ai
    • catalog.data.gov
    Updated Jun 11, 2022
    Cite
    U.S. Environmental Protection Agency (2022). Raw Data for Jarema et al, 2022 Developmental Neurotoxicity and Behavioral Screening in Larval Zebrafish... [Dataset]. https://datasets.ai/datasets/raw-data-for-jarema-et-al-2022-developmental-neurotoxicity-and-behavioral-screening-in-lar
    Available download formats: 53
    Dataset updated
    Jun 11, 2022
    Dataset authored and provided by
    U.S. Environmental Protection Agency
    Description

    The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/toxics10050256/s1. Figure S1, Effect of DMSO on Light/Dark Locomotor Activity; Figure S2, Time Course Behavioral Graph, Activity Box Plots and Developmental Toxicity for each Chemical; Table S1, Raw Data.

    This dataset is associated with the following publication: Jarema, K., D. Hunter, B. Hill, J. Olin, K. Britton, M. Waalkes, and S. Padilla. Developmental Neurotoxicity and Behavioral Screening in Larval Zebrafish with a Comparison to Other Published Results. Toxics. MDPI, Basel, SWITZERLAND, 10(5): 256, (2022).

  13. Comparison of result on three-bar truss design problem.

    • plos.figshare.com
    xls
    Updated Jun 13, 2023
    Cite
    Yu Li; Xiao Liang; Jingsen Liu; Huan Zhou (2023). Comparison of result on three-bar truss design problem. [Dataset]. http://doi.org/10.1371/journal.pone.0276210.t013
    Available download formats: xls
    Dataset updated
    Jun 13, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Yu Li; Xiao Liang; Jingsen Liu; Huan Zhou
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Comparison of result on three-bar truss design problem.

  14. The comparison results of different algorithms on CEC2017 functions with...

    • plos.figshare.com
    xls
    Updated Jun 6, 2023
    Cite
    Yu Li; Xiao Liang; Jingsen Liu; Huan Zhou (2023). The comparison results of different algorithms on CEC2017 functions with D=30. [Dataset]. http://doi.org/10.1371/journal.pone.0276210.t004
    Available download formats: xls
    Dataset updated
    Jun 6, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Yu Li; Xiao Liang; Jingsen Liu; Huan Zhou
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The comparison results of different algorithms on CEC2017 functions with D=30.

  15. Comparison of result on welded beam design problem.

    • plos.figshare.com
    xls
    Updated Jun 13, 2023
    Cite
    Yu Li; Xiao Liang; Jingsen Liu; Huan Zhou (2023). Comparison of result on welded beam design problem. [Dataset]. http://doi.org/10.1371/journal.pone.0276210.t011
    Available download formats: xls
    Dataset updated
    Jun 13, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Yu Li; Xiao Liang; Jingsen Liu; Huan Zhou
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Comparison of result on welded beam design problem.

  16. Dead cells/total cells in MCF7 cells after 3 mT PEMFs treatment for 60...

    • datasetcatalog.nlm.nih.gov
    • plos.figshare.com
    Updated Sep 11, 2013
    Cite
    Beyer, Christian; Franco-Obregón, Alfredo; Fröhlich, Jürg; Egli, Marcel; Crocetti, Sara; Schade, Grit (2013). Dead cells/total cells in MCF7 cells after 3 mT PEMFs treatment for 60 min/day for 3 days. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001617695
    Dataset updated
    Sep 11, 2013
    Authors
    Beyer, Christian; Franco-Obregón, Alfredo; Fröhlich, Jürg; Egli, Marcel; Crocetti, Sara; Schade, Grit
    Description

    Values refer to the box plots of figure 3A showing the amount of dead cells/total cells in treated samples compared to relative control samples. Data were generated from 4 independent experiments (3 replicates/experiments, n = 12).

  17. Comparison of result on pressure vessel design problem.

    • plos.figshare.com
    xls
    Updated Jun 13, 2023
    Cite
    Yu Li; Xiao Liang; Jingsen Liu; Huan Zhou (2023). Comparison of result on pressure vessel design problem. [Dataset]. http://doi.org/10.1371/journal.pone.0276210.t010
    Available download formats: xls
    Dataset updated
    Jun 13, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Yu Li; Xiao Liang; Jingsen Liu; Huan Zhou
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Comparison of result on pressure vessel design problem.

  18. Comparison of result on speed reducer design problem.

    • plos.figshare.com
    xls
    Updated Jun 2, 2023
    Cite
    Yu Li; Xiao Liang; Jingsen Liu; Huan Zhou (2023). Comparison of result on speed reducer design problem. [Dataset]. http://doi.org/10.1371/journal.pone.0276210.t014
    Available download formats: xls
    Dataset updated
    Jun 2, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Yu Li; Xiao Liang; Jingsen Liu; Huan Zhou
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Comparison of result on speed reducer design problem.

  19. The comparison results of different algorithms on 23 benchmark functions...

    • plos.figshare.com
    xls
    Updated Jun 6, 2023
    Cite
    Yu Li; Xiao Liang; Jingsen Liu; Huan Zhou (2023). The comparison results of different algorithms on 23 benchmark functions with D=30. [Dataset]. http://doi.org/10.1371/journal.pone.0276210.t002
    Available download formats: xls
    Dataset updated
    Jun 6, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Yu Li; Xiao Liang; Jingsen Liu; Huan Zhou
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The comparison results of different algorithms on 23 benchmark functions with D=30.

  20. The comparison results of different algorithms on CEC2019 functions.

    • plos.figshare.com
    xls
    Updated Jun 13, 2023
    Cite
    Yu Li; Xiao Liang; Jingsen Liu; Huan Zhou (2023). The comparison results of different algorithms on CEC2019 functions. [Dataset]. http://doi.org/10.1371/journal.pone.0276210.t006
    Available download formats: xls
    Dataset updated
    Jun 13, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Yu Li; Xiao Liang; Jingsen Liu; Huan Zhou
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The comparison results of different algorithms on CEC2019 functions.

