Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
In our submission to the Journal of Software and Systems, titled “The Power of Words in Agile vs. Waterfall Development: Written Communication in Hybrid Software Teams,” we present an exploratory case study conducted in a large software organization, AFAS Software. Our study investigates the influence of the development paradigm and the formality of communication channels on written communication within hybrid development teams.
This online appendix contains supplementary material to uphold transparency and facilitate the reproduction of the statistical analysis. Please refer to Section 2.1 for the research questions and hypotheses.
The root folder includes the “Project Teams Composition.pdf” file, which contains the anonymized compositions of 20 project teams: 11 agile (PrAG) projects and 9 waterfall (PrWF) projects. This file lists the project team members involved in communication within the Microsoft Teams and Insite channels.
This subfolder includes the SPSS files, which contain the normality test and Mann-Whitney U Tests.
We performed the Shapiro-Wilk normality test. In most cases, its significance value is below 0.05; thus, the data are not normally distributed and do not meet the normality assumption of the t-test. We therefore opted for the Mann-Whitney U test, the non-parametric alternative to the t-test.
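For readers without SPSS, the following is a minimal sketch of the Mann-Whitney U statistic with a two-sided p-value from the normal approximation (tie-corrected, no continuity correction). It is not the authors' SPSS procedure, and its p-values may differ slightly from SPSS output. The function names and the sample values are hypothetical and are not taken from the study data.

```python
from math import sqrt
from statistics import NormalDist


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Return (U, two-sided p) using the tie-corrected normal approximation."""
    n1, n2 = len(x), len(y)
    pooled = list(x) + list(y)
    ranks = average_ranks(pooled)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    u = min(u1, n1 * n2 - u1)

    n = n1 + n2
    counts = {}
    for value in pooled:
        counts[value] = counts.get(value, 0) + 1
    tie_term = sum(t ** 3 - t for t in counts.values())
    sigma = sqrt(n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1))))
    if sigma == 0:  # every observation is identical
        return u, 1.0
    z = (u - n1 * n2 / 2) / sigma  # z <= 0 because u is the smaller U
    return u, min(1.0, 2 * NormalDist().cdf(z))


# Hypothetical per-project values for the two groups (not study data).
prag = [12, 15, 9, 20, 18, 14]
prwf = [30, 25, 41, 28, 35]
u_stat, p_value = mann_whitney_u(prag, prwf)
print(f"U = {u_stat}, p = {p_value:.4f}")
```

The normal approximation is only reasonable for moderately sized groups; for very small samples an exact test is the safer choice.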
The files below contain the results of Mann-Whitney U Tests, which are presented in Appendix A - Tables A.7 (a), A.8 (a), A.9 (a):
The files below contain the results of Mann-Whitney U Tests, which are presented in Appendix A - Tables A.7 (b), A.8 (b), A.9 (b):
This subfolder includes the necessary files to perform calculations, and the results of these calculations.
The files below contain the results of the calculations:
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
To create the dataset, the top 10 countries by COVID-19 incidence worldwide as of October 22, 2020 (on the eve of the second wave of the pandemic) were considered, and those represented in the Global 500 ranking for 2020 were selected: USA, India, Brazil, Russia, Spain, France and Mexico. For each of these countries, no more than 10 of the largest transnational corporations included in the Global 500 ranking were selected, separately for 2020 and 2019. Arithmetic means and the change (increase) were calculated for the following indicators: profitability of enterprises, ranking position (competitiveness), asset value and number of employees. The arithmetic means of these indicators across all countries in the sample were also calculated; they characterize the state of international entrepreneurship as a whole during the COVID-19 crisis in 2020, on the eve of the second wave of the pandemic. The data are collected in a single Microsoft Excel workbook. The dataset is a unique database that combines COVID-19 statistics with entrepreneurship statistics. It is extensible: it can be supplemented with data from other countries and with newer statistics on the COVID-19 pandemic. Because the cells in the dataset contain formulas rather than fixed numbers, adding or changing values in the source table at the beginning of the dataset automatically recalculates most of the subsequent tables and updates the charts. The dataset can therefore be used not just as an array of data, but as an analytical tool for automating research on the impact of the COVID-19 pandemic and crisis on international entrepreneurship. The dataset includes tabular data as well as charts that visualize the data. It contains both actual and forecast data on COVID-19 morbidity and mortality for the period of the second wave of the pandemic in 2020.
The forecasts are presented as a normal distribution of predicted values together with the probability of their occurrence in practice. This allows a broad scenario analysis of the impact of the COVID-19 pandemic and crisis on international entrepreneurship: different predicted morbidity and mortality rates can be substituted into the risk assessment tables to obtain automatically calculated consequences (changes) for the characteristics of international entrepreneurship. The actual values observed during and after the second wave of the pandemic can also be substituted to check the reliability of the earlier forecasts and to conduct a plan-versus-actual analysis. The dataset contains not only the numerical values of the initial and predicted indicators, but also their qualitative interpretation, which reflects the presence and level of risks that the COVID-19 pandemic and crisis pose for international entrepreneurship.
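As an illustration of how a normally distributed forecast can be turned into scenario probabilities, here is a minimal sketch. The mean, standard deviation, and thresholds are hypothetical placeholders and are not values from the workbook; the function name is likewise an assumption made for this example.

```python
from statistics import NormalDist


def scenario_probabilities(mean, std_dev, thresholds):
    """Split a normal forecast into scenario bands and return their probabilities.

    Returns a list of ((low, high), probability) tuples covering the whole
    real line, with band edges at the given thresholds.
    """
    dist = NormalDist(mu=mean, sigma=std_dev)
    inf = float("inf")
    edges = [-inf, *sorted(thresholds), inf]
    bands = []
    for low, high in zip(edges, edges[1:]):
        p_low = 0.0 if low == -inf else dist.cdf(low)
        p_high = 1.0 if high == inf else dist.cdf(high)
        bands.append(((low, high), p_high - p_low))
    return bands


# Hypothetical forecast: mean 100, standard deviation 10 (arbitrary units).
for (low, high), prob in scenario_probabilities(100.0, 10.0, [90.0, 110.0]):
    print(f"{low} to {high}: {prob:.3f}")
```

Each band can then be treated as one scenario (for example optimistic, baseline, pessimistic) when substituting values into the risk assessment tables.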
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
About Dataset
Safa S. Abdul-Jabbar, Alaa k. Farhan
Context
This is the first dataset covering a range of ordinary patients in Iraq. It provides the patients' Complete Blood Count (CBC) test information, which can be used to build a hematology diagnosis or prediction system. The data were collected in 2022 from Al-Zahraa Al-Ahly Hospital. They can be cleaned and analyzed in any programming language because they are provided as an Excel file that can be accessed and manipulated easily. The user only needs to understand how the rows and columns are arranged: the data were collected from the laboratories as images (CBC images), and the extracted values were then stored in the Excel file.
Content
This dataset contains 500 rows. Each row holds one patient's information in 21 columns of CBC test features, described as follows:
ID: Patient identifier
WBC: White Blood Cell, Normal Ranges: 4.0 to 10.0, Unit: 10^9/L.
LYMp: Lymphocyte percentage; lymphocytes are a type of white blood cell, Normal Ranges: 20.0 to 40.0, Unit: %
MIDp: Indicates the percentage combined value of the other types of white blood cells not classified as lymphocytes or granulocytes, Normal Ranges: 1.0 to 15.0, Unit: %
NEUTp: Neutrophil percentage; neutrophils are a type of white blood cell (leukocyte), Normal Ranges: 50.0 to 70.0, Unit: %
LYMn: Lymphocyte number (absolute count), Normal Ranges: 0.6 to 4.1, Unit: 10^9/L.
MIDn: Indicates the combined number of other white blood cells not classified as lymphocytes or granulocytes, Normal Ranges: 0.1 to 1.8, Unit: 10^9/L.
NEUTn: Neutrophils Number, Normal Ranges: 2.0 to 7.8, Unit: 10^9/L.
RBC: Red Blood Cell, Normal Ranges: 3.50 to 5.50, Unit: 10^12/L
HGB: Hemoglobin, Normal Ranges: 11.0 to 16.0, Unit: g/dL
HCT: Hematocrit is the proportion, by volume, of the Blood that consists of red blood cells, Normal Ranges: 36.0 to 48.0, Unit: %
MCV: Mean Corpuscular Volume, Normal Ranges: 80.0 to 99.0, Unit: fL
MCH: Mean Corpuscular Hemoglobin is the average amount of haemoglobin in the average red cell, Normal Ranges: 26.0 to 32.0, Unit: pg
MCHC: Mean Corpuscular Hemoglobin Concentration, Normal Ranges: 32.0 to 36.0, Unit: g/dL
RDWSD: Red Blood Cell Distribution Width, standard deviation, Normal Ranges: 37.0 to 54.0, Unit: fL
RDWCV: Red Blood Cell Distribution Width, coefficient of variation, Normal Ranges: 11.5 to 14.5, Unit: %
PLT: Platelet Count, Normal Ranges: 100 to 400, Unit: 10^9/L
MPV: Mean Platelet Volume, Normal Ranges: 7.4 to 10.4, Unit: fL
PDW: Platelet Distribution Width, Normal Ranges: 10.0 to 17.0, Unit: %
PCT: Plateletcrit, the proportion of blood volume occupied by platelets, Normal Ranges: 0.10 to 0.28, Unit: %
PLCR: Platelet Large Cell Ratio, Normal Ranges: 13.0 to 43.0, Unit: %
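The normal ranges listed above can be used to flag abnormal values in a patient row. The sketch below assumes that the Excel column headers match the feature names in this description and that each row has already been loaded into a dictionary of numeric values; the function name `flag_out_of_range` and the example row are hypothetical.

```python
# Normal ranges (low, high) copied from the column descriptions above.
NORMAL_RANGES = {
    "WBC": (4.0, 10.0), "LYMp": (20.0, 40.0), "MIDp": (1.0, 15.0),
    "NEUTp": (50.0, 70.0), "LYMn": (0.6, 4.1), "MIDn": (0.1, 1.8),
    "NEUTn": (2.0, 7.8), "RBC": (3.50, 5.50), "HGB": (11.0, 16.0),
    "HCT": (36.0, 48.0), "MCV": (80.0, 99.0), "MCH": (26.0, 32.0),
    "MCHC": (32.0, 36.0), "RDWSD": (37.0, 54.0), "RDWCV": (11.5, 14.5),
    "PLT": (100.0, 400.0), "MPV": (7.4, 10.4), "PDW": (10.0, 17.0),
    "PCT": (0.10, 0.28), "PLCR": (13.0, 43.0),
}


def flag_out_of_range(row):
    """Return {column: "low" | "high"} for values outside the listed ranges.

    Columns that are missing from the row (or set to None) are skipped;
    values exactly on a range boundary count as normal.
    """
    flags = {}
    for column, (low, high) in NORMAL_RANGES.items():
        value = row.get(column)
        if value is None:
            continue
        if value < low:
            flags[column] = "low"
        elif value > high:
            flags[column] = "high"
    return flags


# Hypothetical patient row (not taken from the dataset).
example_row = {"ID": 1, "WBC": 12.3, "HGB": 9.8, "PLT": 250}
print(flag_out_of_range(example_row))
```

Flags of this kind are only a screening aid; they could serve as simple derived features for the diagnosis or prediction system mentioned in the context section.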
Acknowledgements We thank the entire Al-Zahraa Al-Ahly Hospital team, especially the hospital manager, for cooperating with us in collecting this data while maintaining patients' confidentiality.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
** RD DATASET ** The RD dataset was created from images posted in the melanoma community on the internet (https://reddit.com/r/melanoma). Consecutive images posted from Jan 25, 2020, to July 30, 2021, were collected using a Python library (https://github.com/aliparlakci/bulk-downloader-for-reddit). The ground truth was determined by the vote of four dermatologists and one plastic surgeon, who referred to the chief complaint and brief history. A total of 1,282 images (1,201 cases) were finally included. Because some cases were deleted by their users, the links of 860 cases were valid as of July 2021.
RD_RAW.xlsx The download links and ground truth of the RD dataset are included in this Excel file. In addition, the raw data of the AI (Model Dermatology Build2021 - https://modelderm.com) and 32 laypersons are included.
v1_public.zip "v1_public.zip" includes the 1,282 lesional images (full-size). The 24 images that were excluded from the study are also available.
v1_private.zip is not available here. Wide field images are not available here. If the archive is needed for research purposes, please email Dr. Han Seung Seog (whria78@gmail.com) or Dr. Cristian Navarrete-Dechent (ctnavarr@gmail.com).
References - The Degradation of Performance of a State-of-the-art Skin Image Classifier When Applied to Patient-driven Internet Search - Scientific Report (in-press)
** Background normal test with the ISIC images ** The ISIC dataset (https://www.isic-archive.com; Gallery -> 2018 JID Editorial images; 99 images; ISIC_0024262 and ISIC_0024261 are identical images and ISIC_0024262 was skipped) was used for the background normal test. We defined a rectangular crop of 10% area as the “specialist-size crop” and a rectangular crop of 5% area as the “layperson-size crop”.
a) S-crops.zip: specialist-size crops Format: CROPNO_AGE(0~99)_GENDER(1=male,0=female)[m]_FILENAME.png
b) L-crops.zip: layperson-size crops Format: CROPNO_AGE(0~99)_GENDER(1=male,0=female)[m]_FILENAME.png
c) result_S.zip: Background normal test result using the specialist-size crops
d) result_L.zip: Background normal test result using the layperson-size crops
Reference
- Automated Dermatological Diagnosis: Hype or Reality? - https://doi.org/10.1016/j.jid.2018.04.040
- Multiclass Artificial Intelligence in Dermatology: Progress but Still Room for Improvement - https://doi.org/10.1016/j.jid.2020.06.040
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Raw data output 1. Differentially expressed genes in AML CSCs compared with GTCs as well as in TCGA AML cancer samples compared with normal ones. This data was generated based on the results of AML microarray and TCGA data analysis.
Raw data output 2. Commonly and uniquely differentially expressed genes in AML CSC/GTC microarray and TCGA bulk RNA-seq datasets. This data was generated based on the results of AML microarray and TCGA data analysis.
Raw data output 3. Common differentially expressed genes between the training and test set samples of the microarray dataset. This data was generated based on the results of AML microarray data analysis.
Raw data output 4. Detailed information on the samples of the breast cancer microarray dataset (GSE52327) used in this study.
Raw data output 5. Differentially expressed genes in breast CSCs compared with GTCs as well as in TCGA BRCA cancer samples compared with normal ones.
Raw data output 6. Commonly and uniquely differentially expressed genes in breast cancer CSC/GTC microarray and TCGA BRCA bulk RNA-seq datasets. This data was generated based on the results of breast cancer microarray and TCGA BRCA data analysis. CSC and GTC stand for cancer stem cell and general tumor cell, respectively.
Raw data output 7. Differential and common co-expression and protein-protein interaction of genes between CSC and GTC samples. This data was generated based on the results of AML microarray and STRING database-based protein-protein interaction data analysis. CSC and GTC stand for cancer stem cell and general tumor cell, respectively.
Raw data output 8. Differentially expressed genes between AML dormant and active CSCs. This data was generated based on the results of AML scRNA-seq data analysis.
Raw data output 9. Uniquely expressed genes in dormant or active AML CSCs. This data was generated based on the results of AML scRNA-seq data analysis.
Raw data output 10. Intersections between the targeting transcription factors of AML key CSC genes and differentially expressed genes between AML CSCs vs GTCs and between dormant and active AML CSCs or the uniquely expressed genes in either class of CSCs.
Raw data output 11. Targeting desirableness score of AML key CSC genes and their targeting transcription factors. These scores were generated based on an in-house scoring function described in the Methods section.
Raw data output 12. CSC-specific targeting desirableness score of AML key CSC genes and their targeting transcription factors. These scores were generated based on an in-house scoring function described in the Methods section.
Raw data output 13. The protein-protein interactions among AML key CSC genes and between these genes and their targeting transcription factors. This data was generated based on the results of AML microarray and STRING database-based protein-protein interaction data analysis.
Raw data output 14. The previously confirmed associations of genes having the highest targeting desirableness and CSC-specific targeting desirableness scores with AML or other cancers’ (stem) cells as well as hematopoietic stem cells. These data were generated based on literature mining of the PubMed database.
Raw data output 15. Drug score of available drugs and bioactive small molecules targeting AML key CSC genes and/or their targeting transcription factors. These scores were generated based on an in-house scoring function described in the Methods section.
Raw data output 16. CSC-specific drug score of available drugs and bioactive small molecules targeting AML key CSC genes and/or their targeting transcription factors. These scores were generated based on an in-house scoring function described in the Methods section.
Raw data output 17. Candidate drugs for experimental validation. These drugs were selected based on their respective (CSC-specific) drug scores. CSC is the abbreviation of cancer stem cell.
Raw data output 18. Detailed information on the samples of the AML microarray dataset GSE30375 used in this study.
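The "commonly and uniquely differentially expressed genes" reported in outputs 2 and 6 can be thought of as set operations on two gene lists. The sketch below illustrates that idea only; it is not the authors' pipeline, and the function name and placeholder gene identifiers are hypothetical.

```python
def split_common_unique(degs_a, degs_b):
    """Partition two lists of gene identifiers into common and unique sets."""
    a, b = set(degs_a), set(degs_b)
    return {
        "common": a & b,    # differentially expressed in both datasets
        "unique_a": a - b,  # only in the first dataset
        "unique_b": b - a,  # only in the second dataset
    }


# Placeholder identifiers standing in for real gene symbols.
microarray_degs = ["GENE_A", "GENE_B", "GENE_C"]
rnaseq_degs = ["GENE_B", "GENE_C", "GENE_D"]

result = split_common_unique(microarray_degs, rnaseq_degs)
for label, genes in result.items():
    print(label, sorted(genes))
```

In practice the direction of change (up- or down-regulation) would usually be compared as well, which this sketch leaves out.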