100+ datasets found
  1. University SET data, with faculty and courses characteristics

    • openicpsr.org
    Updated Sep 12, 2021
    Cite
    Under blind review in refereed journal (2021). University SET data, with faculty and courses characteristics [Dataset]. http://doi.org/10.3886/E149801V1
    Dataset updated
    Sep 12, 2021
    Authors
    Under blind review in refereed journal
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This paper explores a unique dataset of all the SET ratings provided by students of one university in Poland at the end of the winter semester of the 2020/2021 academic year. The SET questionnaire used by this university is presented in Appendix 1. The dataset is unique for several reasons. It covers all SET surveys filled in by students in all fields and levels of study offered by the university. In the period analysed, the university was entirely in the online regime amid the Covid-19 pandemic. While the expected learning outcomes were formally unchanged, the online mode of study could have affected the grading policy and could have implications for some of the studied SET biases. This Covid-19 effect is captured by econometric models and discussed in the paper.

    The average SET scores were matched with: the characteristics of the teacher (degree, seniority, gender, and SET scores in the past six semesters); the course characteristics (time of day, day of the week, course type, course breadth, class duration, and class size); the attributes of the SET survey responses (the percentage of students providing SET feedback); and the grades of the course (mean, standard deviation, and percentage failed). Data on course grades are also available for the previous six semesters. This rich dataset allows many of the biases reported in the literature to be tested for and new hypotheses to be formulated, as presented in the introduction section.

    The unit of observation, that is, a single row in the data set, is identified by three parameters: teacher unique id (j), course unique id (k), and the question number in the SET questionnaire (n ∈ {1, 2, 3, 4, 5, 6, 7, 8, 9}). For each pair (j, k) there are therefore nine rows, one for each SET survey question, or sometimes fewer when students did not answer one of the SET questions at all. For example, the dependent variable SET_score_avg(j,k,n) for the triplet (j = John Smith, k = Calculus, n = 2) is calculated as the average of all Likert-scale answers to question no. 2 in the SET survey distributed to all students who took the Calculus course taught by John Smith. The data set has 8,015 such observations or rows. The full list of variables or columns in the data set included in the analysis is presented in the attached file section. Their description refers to the triplet (teacher id = j, course id = k, question number = n). When the last value of the triplet (n) is dropped, the variable takes the same value for all n ∈ {1, 2, 3, 4, 5, 6, 7, 8, 9}.

    Two attachments:
    - Word file with the variable descriptions
    - Rdata file with the data set (for the R language)

    Appendix 1. The SET questionnaire used for this paper.

    Evaluation survey of the teaching staff of [university name]

    Please complete the following evaluation form, which aims to assess the lecturer’s performance. Only one answer should be indicated for each question. The answers are coded in the following way: 5 - I strongly agree; 4 - I agree; 3 - Neutral; 2 - I don’t agree; 1 - I strongly don’t agree.

    Questions (each answered on the 1 to 5 scale):
    1. I learnt a lot during the course.
    2. I think that the knowledge acquired during the course is very useful.
    3. The professor used activities to make the class more engaging.
    4. If it was possible, I would enroll for the course conducted by this lecturer again.
    5. The classes started on time.
    6. The lecturer always used time efficiently.
    7. The lecturer delivered the class content in an understandable and efficient way.
    8. The lecturer was available when we had doubts.
    9. The lecturer treated all students equally regardless of their race, background and ethnicity.
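
As a rough illustration of the row structure described above, the sketch below averages Likert answers per (teacher id, course id, question number) triplet. The raw-response layout, the ids, and the function name `set_score_avg` are assumptions made for this example only; the actual dataset is distributed as an Rdata file whose column names are not shown here.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical raw answers: (teacher id j, course id k, question n, Likert answer 1-5).
# These values are invented for illustration and are not taken from the dataset.
responses = [
    ("T01", "C10", 2, 5),
    ("T01", "C10", 2, 4),
    ("T01", "C10", 2, 3),
    ("T01", "C10", 1, 4),
    ("T02", "C10", 2, 2),
]


def set_score_avg(rows):
    """Return {(j, k, n): mean Likert answer}, one entry per dataset row."""
    grouped = defaultdict(list)
    for j, k, n, answer in rows:
        grouped[(j, k, n)].append(answer)
    return {key: mean(values) for key, values in grouped.items()}


scores = set_score_avg(responses)
for (j, k, n), avg in sorted(scores.items()):
    print(f"teacher={j} course={k} question={n} SET_score_avg={avg}")
```

Each key of `scores` corresponds to one observation in the sense used above, so a (teacher, course) pair contributes up to nine entries, one per question that received at least one answer.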

  2. Data analytics tools in use by organizations in the United States 2015-2017

    • statista.com
    Updated Dec 1, 2015
    Cite
    Statista (2015). Data analytics tools in use by organizations in the United States 2015-2017 [Dataset]. https://www.statista.com/statistics/500119/united-states-survey-use-data-analytics-tools/
    Dataset updated
    Dec 1, 2015
    Dataset authored and provided by
    Statista: http://statista.com/
    Time period covered
    2015
    Area covered
    United States
    Description

    The statistic shows the analytics tools currently in use by business organizations in the United States, as well as the analytics tools respondents believe they will be using in two years, according to a 2015 survey conducted by the Harvard Business Review Analytics Service. As of 2015, 73 percent of respondents believed they were going to use predictive analytics for data analysis in two years' time.

  3. The Importance of Medical Students' Attitudes Regarding Cognitive Competence...

    • plos.figshare.com
    docx
    Updated May 30, 2023
    Cite
    Natasa M. Milic; Srdjan Masic; Jelena Milin-Lazovic; Goran Trajkovic; Zoran Bukumiric; Marko Savic; Nikola V. Milic; Andja Cirkovic; Milan Gajic; Mirjana Kostic; Aleksandra Ilic; Dejana Stanisavljevic (2023). The Importance of Medical Students' Attitudes Regarding Cognitive Competence for Teaching Applied Statistics: Multi-Site Study and Meta-Analysis [Dataset]. http://doi.org/10.1371/journal.pone.0164439
    Available download formats: docx
    Dataset updated
    May 30, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Natasa M. Milic; Srdjan Masic; Jelena Milin-Lazovic; Goran Trajkovic; Zoran Bukumiric; Marko Savic; Nikola V. Milic; Andja Cirkovic; Milan Gajic; Mirjana Kostic; Aleksandra Ilic; Dejana Stanisavljevic
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Background: The scientific community increasingly is recognizing the need to bolster standards of data analysis given the widespread concern that basic mistakes in data analysis are contributing to the irreproducibility of many published research findings. The aim of this study was to investigate students’ attitudes towards statistics within a multi-site medical educational context, monitor their changes and impact on student achievement. In addition, we performed a systematic review to better support our future pedagogical decisions in teaching applied statistics to medical students.

    Methods: A validated Serbian Survey of Attitudes Towards Statistics (SATS-36) questionnaire was administered to medical students attending obligatory introductory courses in biostatistics from three medical universities in the Western Balkans. A systematic review of peer-reviewed publications was performed through searches of Scopus, Web of Science, Science Direct, Medline, and APA databases through 1994. A meta-analysis was performed for the correlation coefficients between SATS component scores and statistics achievement. Pooled estimates were calculated using random effects models.

    Results: SATS-36 was completed by 461 medical students. Most of the students held positive attitudes towards statistics. Ability in mathematics and grade point average were associated in a multivariate regression model with the Cognitive Competence score, after adjusting for age, gender and computer ability. The results of 90 paired data showed that Affect, Cognitive Competence, and Effort scores demonstrated significant positive changes. The Cognitive Competence score showed the largest increase (M = 0.48, SD = 0.95). The positive correlation found between the Cognitive Competence score and students’ achievement (r = 0.41; p

  4. Data from: Empirical probability and machine learning analysis of m, n = 2,...

    • search.dataone.org
    • dataverse.harvard.edu
    Updated Mar 6, 2024
    Cite
    Empirical probability and machine learning analysis of m, n = 2, 1 tearing mode onset parameter dependence in DIII-D H-mode scenarios [Dataset]. https://search.dataone.org/view/sha256%3A2dc795e781c3b117bc544386a2572b60e99fb46b14fc69ab545e9a6b1dc0f451
    Dataset updated
    Mar 6, 2024
    Dataset provided by
    Harvard Dataverse
    Authors
    L. Bardoczi, N. J. Richner, J. Zhu, C. Rea, N. C. Logan
    Description

    m, n = 2, 1 tearing mode onset empirical probability and machine learning analyses of a multiscenario DIII-D database of over 14 000 H-mode discharges show that the normalized plasma beta, the rotation profile, and the magnetic equilibrium shape have the strongest impact on the 2,1 tearing mode stability, in qualitative agreement with neoclassical tearing modes (m and n are the poloidal and toroidal mode numbers, respectively). In addition, 2,1 tearing modes are most likely to destabilize when n > 1 tearing modes are already present in the core plasma. The covariance matrix of tearing sensitive plasma parameters takes a nearly block-diagonal form, with the blocks incorporating thermodynamic, current and safety factor profile, separatrix shape, and plasma flow parameters, respectively. This suggests a number of paths to improved stability at fixed pressure and edge safety factor primarily by preserving a minimum of 1 kHz differential rotation, increasing the minimum safety factor above unity, using upper single null magnetic configuration, and reducing the core impurity radiation. In addition, lower triangularity, lower elongation, and lower pedestal pressure may also help to improve stability. The electron and ion temperature, collisionality, resistivity, internal inductance, and the parallel current gradient appear to only weakly correlate with the 2,1 tearing mode onsets in this database.

  5. Lives Saved by Vehicle Safety Technologies and Associated Federal Motor...

    • data.virginia.gov
    • data.transportation.gov
    • +3 more
    txt
    Updated May 1, 2024
    Cite
    U.S Department of Transportation (2024). Lives Saved by Vehicle Safety Technologies and Associated Federal Motor Vehicle Safety Standards [Dataset]. https://data.virginia.gov/dataset/lives-saved-by-vehicle-safety-technologies-and-associated-federal-motor-vehicle-safety-standard
    Available download formats: txt
    Dataset updated
    May 1, 2024
    Authors
    U.S Department of Transportation
    Description

    SAS programs and auxiliary files used in report no. DOT HS 812 069, Lives Saved by Vehicle Safety Technologies & Associated FMVSS, 1960 to 2012 Passenger Cars & LTV's

  6. Statistical Analysis Summary Tables

    • dataverse.harvard.edu
    Updated Jun 7, 2023
    Cite
    Jacob Bumgarner (2023). Statistical Analysis Summary Tables [Dataset]. http://doi.org/10.7910/DVN/C6QPR9
    Dataset updated
    Jun 7, 2023
    Dataset provided by
    Harvard Dataverse
    Authors
    Jacob Bumgarner
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Summary tables that contain the details for the statistical analyses throughout the manuscript.

  7. Complex Analysis & Statistical Publications - Skills Bootcamps for Londoners...

    • gimi9.com
    Updated Dec 20, 2024
    Cite
    (2024). Complex Analysis & Statistical Publications - Skills Bootcamps for Londoners [Dataset]. https://gimi9.com/dataset/london_gla-skills-bootcamps
    Dataset updated
    Dec 20, 2024
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Skills Bootcamps for Londoners aim to help Londoners aged 19+ to enter employment, upskill or change career and are open to adults who are full-time or part-time employed, self-employed or unemployed, as well as adults returning to work after a break. Bootcamp training courses provide access to in-demand sector specific skills training and provide a guaranteed job interview on completion. In addition to technical training, learners will also receive guidance on entering professional working environments to fully prepare them for new roles. More information on the programme can be found here. The Skills Bootcamp for Londoners data is a summary of provider-reported Skills Bootcamps starts, completions and outcomes from courses funded by the Greater London Authority. Wave 3 data includes Bootcamps started between April 2022 and March 2023. Completions and outcomes can occur and be reported in the 2022-23 financial year and in a defined period after that year. Wave 3 was the first wave of Skills Bootcamps that were delegated to the Greater London Authority.

  8. Table_1_Students in a Course-Based Undergraduate Research Experience Course...

    • frontiersin.figshare.com
    pdf
    Updated May 30, 2023
    Cite
    Stokes S. Baker; Mohamed S. Alhassan; Kristian Z. Asenov; Joyce J. Choi; Griffin E. Craig; Zayn A. Dastidar; Saleh J. Karim; Erin E. Sheardy; Salameh Z. Sloulin; Nitish Aggarwal; Zahraa M. Al-Habib; Valentina Camaj; Dennis D. Cleminte; Mira H. Hamady; Mike Jaafar; Marcel L. Jones; Zayan M. Khan; Evileen S. Khoshaba; Rita Khoshaba; Sarah S. Ko; Abdulmalik T. Mashrah; Pujan A. Patel; Rabeeh Rajab; Sahil Tandon (2023). Table_1_Students in a Course-Based Undergraduate Research Experience Course Discovered Dramatic Changes in the Bacterial Community Composition Between Summer and Winter Lake Samples.pdf [Dataset]. http://doi.org/10.3389/fmicb.2021.579325.s002
    Available download formats: pdf
    Dataset updated
    May 30, 2023
    Dataset provided by
    Frontiers
    Authors
    Stokes S. Baker; Mohamed S. Alhassan; Kristian Z. Asenov; Joyce J. Choi; Griffin E. Craig; Zayn A. Dastidar; Saleh J. Karim; Erin E. Sheardy; Salameh Z. Sloulin; Nitish Aggarwal; Zahraa M. Al-Habib; Valentina Camaj; Dennis D. Cleminte; Mira H. Hamady; Mike Jaafar; Marcel L. Jones; Zayan M. Khan; Evileen S. Khoshaba; Rita Khoshaba; Sarah S. Ko; Abdulmalik T. Mashrah; Pujan A. Patel; Rabeeh Rajab; Sahil Tandon
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Course-based undergraduate research experience (CURE) courses incorporate high-impact pedagogies that have been shown to increase undergraduate retention among underrepresented minorities and women. As part of the Building Infrastructure Leading to Diversity program at the University of Detroit Mercy, a CURE metagenomics course was established in the winter of 2019. Students investigated the bacterial community composition in a eutrophic cove in Lake Saint Clair (Harrison Township, MI, United States) from water samples taken in the summer and winter. The students created 16S rRNA libraries that were sequenced using next-generation sequencing technology. They used a public web-based supercomputing resource to process their raw sequencing data and web-based tools to perform advanced statistical analysis. The students discovered that the most common operational taxonomic unit, representing 31% of the prokaryotic sequences in both summer and winter samples, corresponded to an organism that belongs to a previously unidentified phylum. This result showed the students the power of metagenomics because the approach was able to detect unclassified organisms. Principal Coordinates Analysis of Bray–Curtis dissimilarity index data showed that the winter community was distinct from the summer community [Analysis of Similarities (ANOSIM) r = 0.59829, n = 18, and p < 0.001]. Dendrograms based on hierarchically clustered Pearson correlation coefficients of phyla were divided into a winter clade and a summer clade. The conclusion is that the winter bacterial population was fundamentally different from the summer population, even though the samples were taken from the same locations in a protected cove. Because of the small class sizes, qualitative as well as statistical methods were used to evaluate the course’s impact on student attitudes. 
    Results from the Laboratory Course Assessment Survey showed that most of the respondents felt they were contributing to scientific knowledge and the course fostered student collaboration. The majority of respondents agreed or strongly agreed that the course incorporated iteration aspects of scientific investigations, such as repeating procedures to fix problems. In summary, the metagenomics CURE course was able to add to scientific knowledge and allowed students to participate in authentic research.
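
The description above relies on the Bray–Curtis dissimilarity index to compare the summer and winter communities. As a minimal sketch, the standard definition BC(u, v) = Σ|u_i − v_i| / Σ(u_i + v_i) for two abundance vectors can be computed as follows. The counts are invented for illustration, and this is not the web-based pipeline the students used.

```python
def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two non-negative abundance vectors.

    Returns 0.0 for identical samples and 1.0 for samples sharing no taxa.
    """
    if len(u) != len(v):
        raise ValueError("abundance vectors must have the same length")
    total = sum(u) + sum(v)
    if total == 0:
        return 0.0  # two empty samples are treated as identical
    return sum(abs(a - b) for a, b in zip(u, v)) / total


# Hypothetical counts for three taxa in two samples (not from the dataset).
summer = [10, 0, 5]
winter = [2, 8, 5]

print(bray_curtis(summer, winter))
```

Pairwise values of this index across all samples form the dissimilarity matrix on which an ordination such as Principal Coordinates Analysis operates.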

  9. Louisville Metro KY - Officer Involved Shooting Database and Statistical...

    • catalog.data.gov
    • data.lojic.org
    • +2 more
    Updated Apr 13, 2023
    Cite
    Louisville/Jefferson County Information Consortium (2023). Louisville Metro KY - Officer Involved Shooting Database and Statistical Analysis 10-13-2021 [Dataset]. https://catalog.data.gov/dataset/louisville-metro-ky-officer-involved-shooting-database-and-statistical-analysis-10-13-2021
    Dataset updated
    Apr 13, 2023
    Dataset provided by
    Louisville/Jefferson County Information Consortium
    Area covered
    Kentucky, Louisville
    Description

    Officer Involved Shooting (OIS) Database and Statistical Analysis. Data is updated after there is an officer involved shooting.

    • PIU# Incident # - the number associated with either the incident or used as reference to store the items in our evidence rooms
    • Date of Occurrence Month - month the incident occurred (Note the year is labeled on the tab of the spreadsheet)
    • Date of Occurrence Day - day of the month the incident occurred (Note the year is labeled on the tab of the spreadsheet)
    • Time of Occurrence - time the incident occurred
    • Address of incident - the location the incident occurred
    • Division - the LMPD division in which the incident actually occurred
    • Beat - the LMPD beat in which the incident actually occurred
    • Investigation Type - the type of investigation (shooting or death)
    • Case Status - status of the case (open or closed)
    • Suspect Name - the name of the suspect involved in the incident
    • Suspect Race - the race of the suspect involved in the incident (W-White, B-Black)
    • Suspect Sex - the gender of the suspect involved in the incident
    • Suspect Age - the age of the suspect involved in the incident
    • Suspect Ethnicity - the ethnicity of the suspect involved in the incident (H-Hispanic, N-Not Hispanic)
    • Suspect Weapon - the type of weapon the suspect used in the incident
    • Officer Name - the name of the officer involved in the incident
    • Officer Race - the race of the officer involved in the incident (W-White, B-Black, A-Asian)
    • Officer Sex - the gender of the officer involved in the incident
    • Officer Age - the age of the officer involved in the incident
    • Officer Ethnicity - the ethnicity of the officer involved in the incident (H-Hispanic, N-Not Hispanic)
    • Officer Years of Service - the number of years the officer has been serving at the time of the incident
    • Lethal Y/N - whether or not the incident involved a death (Y-Yes, N-No, continued-pending)
    • Narrative - a description of what was determined from the investigation

    Contact: Carol Boyle, carol.boyle@louisvilleky.gov

  10. Data for: Cortical phase locking to accelerated speech in blind and sighted...

    • data.mendeley.com
    Updated Jul 18, 2018
    Cite
    Ingo Hertrich (2018). Data for: Cortical phase locking to accelerated speech in blind and sighted listeners prior to and after training [Dataset]. http://doi.org/10.17632/n4jv7dz6kn.1
    Dataset updated
    Jul 18, 2018
    Authors
    Ingo Hertrich
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Suppl1_Additional-analysis.docx: This is an alternative statistical analysis of the data, using "Session" as a fixed factor rather than "Performance" as a covariate.
    Suppl2_Statistics-scripts.r: Statistics scripts in R that were applied to the data.
    Suppl3_Statistics-data.txt: Dataset that was analyzed with R.

  11. Introduction to Data Analytics

    • paper.erudition.co.in
    html
    Updated Jun 1, 2021
    Cite
    Einetic (2021). Introduction to Data Analytics [Dataset]. https://paper.erudition.co.in/makaut/bachelor-in-business-administration-2020-2021/5/data-analytics-skills-for-managers
    Available download formats: html
    Dataset updated
    Jun 1, 2021
    Dataset authored and provided by
    Einetic
    License

    https://paper.erudition.co.in/terms

    Description

    Question paper solutions for the chapter Introduction to Data Analytics of Data Analytics Skills for Managers, 5th Semester, Bachelor in Business Administration 2020 - 2021

  12. CRIME STATISTICS DATA ANALYTICS

    • search.dataone.org
    • borealisdata.ca
    • +1 more
    Updated Dec 28, 2023
    Cite
    Kwong, Cheryl; Anweiler, Drew; Sarafraz, Mary (2023). CRIME STATISTICS DATA ANALYTICS [Dataset]. http://doi.org/10.5683/SP2/IE6NRY
    Dataset updated
    Dec 28, 2023
    Dataset provided by
    Borealis
    Authors
    Kwong, Cheryl; Anweiler, Drew; Sarafraz, Mary
    Description

    Crime isn't a topic most people want to use mental energy to think about. We want to avoid harm, protect our loved ones, and hold on to what we claim is ours. So how do we remain vigilant without digging too deep into the filth that is crime? Data, of course. The focus of our study is to explore possible trends between crime and communities in the city of Calgary. Our purpose is to visualize Calgary criminal behaviour in order to help increase awareness for both citizens and law enforcement. Through the use of our visuals, individuals can make more informed decisions to improve the overall safety of their lives. Some of the main concerns of the study include: how crime rates increase with population, which areas in Calgary have the most crime, and whether crime follows time-sensitive patterns.

  13. Multiple regression predictors of jump height.

    • figshare.com
    • plos.figshare.com
    xls
    Updated Jun 16, 2023
    Cite
    Amane Zushi; Takuya Yoshida; Kodayu Zushi; Yasushi Kariyama; Mitsugi Ogata (2023). Multiple regression predictors of jump height. [Dataset]. http://doi.org/10.1371/journal.pone.0268339.t003
    Available download formats: xls
    Dataset updated
    Jun 16, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Amane Zushi; Takuya Yoshida; Kodayu Zushi; Yasushi Kariyama; Mitsugi Ogata
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Multiple regression predictors of jump height.

  14. ‘Port of Los Angeles - Historical TEU Statistics’ analyzed by Analyst-2

    • analyst-2.ai
    Updated Jun 6, 2014
    Cite
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2014). ‘Port of Los Angeles - Historical TEU Statistics’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/data-gov-port-of-los-angeles-historical-teu-statistics-fc6e/de4b4a3e/?iid=000-243&v=presentation
    Dataset updated
    Jun 6, 2014
    Dataset authored and provided by
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Los Angeles, Port of Los Angeles
    Description

    Analysis of ‘Port of Los Angeles - Historical TEU Statistics’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://catalog.data.gov/dataset/2b70c27d-54b3-4447-8b74-835c1e594285 on 26 January 2022.

    --- Dataset description provided by original source is as follows ---

    Port of Los Angeles - Historical TEU Statistics: A "TEU" is a "twenty-foot equivalent unit," which is a standard measurement of shipping cargo based on a twenty-foot long shipping container.

    --- Original source retains full ownership of the source dataset ---

  15. Data from: PECA: A Novel Statistical Tool for Deconvoluting Time-Dependent...

    • acs.figshare.com
    xlsx
    Updated May 30, 2023
    Cite
    Guoshou Teo; Christine Vogel; Debashis Ghosh; Sinae Kim; Hyungwon Choi (2023). PECA: A Novel Statistical Tool for Deconvoluting Time-Dependent Gene Expression Regulation [Dataset]. http://doi.org/10.1021/pr400855q.s011
    Available download formats: xlsx
    Dataset updated
    May 30, 2023
    Dataset provided by
    ACS Publications
    Authors
    Guoshou Teo; Christine Vogel; Debashis Ghosh; Sinae Kim; Hyungwon Choi
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    Protein expression varies as a result of intricate regulation of synthesis and degradation of messenger RNAs (mRNA) and proteins. Studies of dynamic regulation typically rely on time-course data sets of mRNA and protein expression, yet there are no statistical methods that integrate these multiomics data and deconvolute individual regulatory processes of gene expression control underlying the observed concentration changes. To address this challenge, we developed Protein Expression Control Analysis (PECA), a method to quantitatively dissect protein expression variation into the contributions of mRNA synthesis/degradation and protein synthesis/degradation, termed RNA-level and protein-level regulation respectively. PECA computes the rate ratios of synthesis versus degradation as the statistical summary of expression control during a given time interval at each molecular level and computes the probability that the rate ratio changed between adjacent time intervals, indicating regulation change at the time point. Along with the associated false-discovery rates, PECA gives the complete description of dynamic expression control, that is, which proteins were up- or down-regulated at each molecular level and each time point. Using PECA, we analyzed two yeast data sets monitoring the cellular response to hyperosmotic and oxidative stress. The rate ratio profiles reported by PECA highlighted a large magnitude of RNA-level up-regulation of stress response genes in the early response and concordant protein-level regulation with time delay. However, the contributions of RNA- and protein-level regulation and their temporal patterns were different between the two data sets. We also observed several cases where protein-level regulation counterbalanced transcriptomic changes in the early stress response to maintain the stability of protein concentrations, suggesting that proteostasis is a proteome-wide phenomenon mediated by post-transcriptional regulation.

  16. Technologies used in big data analysis 2015

    • statista.com
    Updated Jul 29, 2015
    Cite
    Statista (2015). Technologies used in big data analysis 2015 [Dataset]. https://www.statista.com/statistics/491267/big-data-technologies-used/
    Dataset updated
    Jul 29, 2015
    Dataset authored and provided by
    Statista: http://statista.com/
    Time period covered
    Dec 2014 - Feb 2015
    Area covered
    Worldwide, Europe, North America
    Description

    This graph presents the results of a survey, conducted by BARC in 2014/15, into the current and planned use of technology for the analysis of big data. At the beginning of 2015, 13 percent of respondents indicated that their company was already using a big data analytical appliance for big data.

  17. Household Health Survey 2012-2013, Economic Research Forum (ERF)...

    • datacatalog.ihsn.org
    • catalog.ihsn.org
    Updated Jun 26, 2017
    Cite
    Economic Research Forum (2017). Household Health Survey 2012-2013, Economic Research Forum (ERF) Harmonization Data - Iraq [Dataset]. https://datacatalog.ihsn.org/catalog/6937
    Dataset updated
    Jun 26, 2017
    Dataset provided by
    Kurdistan Regional Statistics Office (KRSO)
    Economic Research Forum
    Central Statistical Organization (CSO)
    Time period covered
    2012 - 2013
    Area covered
    Iraq
    Description

    Abstract

    The harmonized data set on health, created and published by the ERF, is a subset of Iraq Household Socio Economic Survey (IHSES) 2012. It was derived from the household, individual and health modules, collected in the context of the above mentioned survey. The sample was then used to create a harmonized health survey, comparable with the Iraq Household Socio Economic Survey (IHSES) 2007 micro data set.

    ----> Overview of the Iraq Household Socio Economic Survey (IHSES) 2012:

    Iraq is considered a leader in household expenditure and income surveys: the first was conducted in 1946, followed by surveys in 1954 and 1961. After the establishment of the Central Statistical Organization, household expenditure and income surveys were carried out every 3-5 years (1971/1972, 1976, 1979, 1984/1985, 1988, 1993, 2002/2007). Implementing the cooperation between CSO and WB, the Central Statistical Organization (CSO) and the Kurdistan Region Statistics Office (KRSO) launched fieldwork on IHSES on 1/1/2012. The survey was carried out over a full year covering all governorates, including those in the Kurdistan Region.

    The survey has six main objectives. These objectives are:

    1. Provide data for poverty analysis and measurement, and monitor, evaluate and update the implementation of the Poverty Reduction National Strategy issued in 2009.
    2. Provide a comprehensive data system to assess household social and economic conditions and prepare the indicators related to human development.
    3. Provide data that meet the needs and requirements of national accounts.
    4. Provide detailed indicators on consumption expenditure that support decision-making related to production, consumption, export and import.
    5. Provide detailed indicators on the sources of household and individual income.
    6. Provide the data necessary for formulating a new consumer price index number.

    The raw survey data provided by the Statistical Office were then harmonized by the Economic Research Forum, to create a comparable version with the 2006/2007 Household Socio Economic Survey in Iraq. Harmonization at this stage only included unifying variables' names, labels and some definitions. See: Iraq 2007 & 2012- Variables Mapping & Availability Matrix.pdf provided in the external resources for further information on the mapping of the original variables on the harmonized ones, in addition to more indications on the variables' availability in both survey years and relevant comments.

    Geographic coverage

    National coverage: Covering a sample of urban, rural and metropolitan areas in all the governorates including those in Kurdistan Region.

    Analysis unit

    1- Household/family. 2- Individual/person.

    Universe

    The survey was carried out over a full year covering all governorates including those in Kurdistan Region.

    Kind of data

    Sample survey data [ssd]

    Sampling procedure

    ----> Design:

    The sample size was 25488 households for the whole of Iraq: 216 households in each of the 118 districts, organized into 2832 clusters of 9 households each, distributed across districts and governorates for rural and urban areas.

    ----> Sample frame:

    The listing and numbering results of the 2009-2010 Population and Housing Survey were adopted in all governorates, including the Kurdistan Region, as the frame for selecting households. The sample was selected in two stages. Stage 1: Primary sampling units (blocks) within each stratum (district), for urban and rural areas, were systematically selected with probability proportional to size, giving 2832 units (clusters). Stage 2: 9 households were selected from each primary sampling unit to form a cluster. The total sample was therefore 25488 households distributed across the governorates, 216 households in each district.

    ----> Sampling Stages:

    In each district, the sample was selected in two stages. Stage 1: Based on the 2010 listing and numbering frame, 24 sample points were selected within each stratum through systematic sampling with probability proportional to size, in addition to the implicit breakdown by urban and rural area and the geographic breakdown (sub-district, quarter, street, county, village and block). Stage 2: Using households as secondary sampling units, 9 households were selected from each sample point using systematic equal probability sampling. The sampling frames for each stage can be developed from the 2010 building listing and numbering without updating household lists. In some small districts, random selection of primary sampling units may yield fewer than 24 distinct units, in which case a sampling unit is selected more than once; two or more clusters may then be drawn from the same enumeration unit when necessary.
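
    The two-stage design described above can be sketched roughly as follows. This is a minimal illustration, not the survey team's actual procedure: the district with 60 blocks, the block sizes, the random seed and the function names are all hypothetical. Only the counts (118 districts, 24 sample points per district, 9 households per sample point) come from the text.

```python
import random

DISTRICTS = 118          # strata, from the text
PSUS_PER_DISTRICT = 24   # sample points (blocks) per district, from the text
HH_PER_PSU = 9           # households per cluster, from the text


def pps_systematic(sizes, n_select, rng):
    """Stage 1: systematic selection with probability proportional to size.

    Returns indices into `sizes`; a unit larger than the sampling interval
    can be picked more than once.
    """
    step = sum(sizes) / n_select
    start = rng.random() * step
    picks, idx, cum = [], 0, 0.0
    for i in range(n_select):
        target = start + i * step
        # Advance until the cumulative size interval of unit idx covers target.
        while idx < len(sizes) - 1 and cum + sizes[idx] <= target:
            cum += sizes[idx]
            idx += 1
        picks.append(idx)
    return picks


def equal_prob_systematic(n_units, n_select, rng):
    """Stage 2: systematic equal-probability selection of unit positions."""
    step = n_units / n_select
    start = rng.random() * step
    return [min(int(start + i * step), n_units - 1) for i in range(n_select)]


rng = random.Random(2012)
# Hypothetical district: 60 blocks with 30 to 200 listed households each.
block_sizes = [rng.randint(30, 200) for _ in range(60)]

psu_picks = pps_systematic(block_sizes, PSUS_PER_DISTRICT, rng)
households = [
    (block, position)
    for block in psu_picks
    for position in equal_prob_systematic(block_sizes[block], HH_PER_PSU, rng)
]

print(len(psu_picks), "sample points,", len(households), "households in the district")
print(DISTRICTS * PSUS_PER_DISTRICT, "clusters,",
      DISTRICTS * PSUS_PER_DISTRICT * HH_PER_PSU, "households nationally")
```

    The counts are consistent with the figures quoted in the text: 24 × 9 = 216 households per district, 118 × 24 = 2832 clusters, and 2832 × 9 = 25488 households. Because a large block can span more than one sampling interval, the same block may appear twice in `psu_picks`, which mirrors the remark about selecting a unit more than once.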

    Mode of data collection

    Face-to-face [f2f]

    Research instrument

    ----> Preparation:

    The questionnaire of the 2006 survey was adopted as the basis for the 2012 questionnaire, to which many revisions were made. Two rounds of pre-testing were carried out. Revisions were made based on feedback from the fieldwork team, World Bank consultants and others, and further revisions were made before a final version was implemented in a pilot survey in September 2011. After the pilot survey, additional revisions were made based on the challenges and feedback that emerged during its implementation, producing the final version used in the actual survey.

    ----> Questionnaire Parts:

    The questionnaire consists of four parts, each with several sections:

    Part 1: Socio-Economic Data:
    • Section 1: Household Roster
    • Section 2: Emigration
    • Section 3: Food Rations
    • Section 4: Housing
    • Section 5: Education
    • Section 6: Health
    • Section 7: Physical measurements
    • Section 8: Job seeking and previous job

    Part 2: Monthly, Quarterly and Annual Expenditures:
    • Section 9: Expenditures on Non-Food Commodities and Services (past 30 days).
    • Section 10: Expenditures on Non-Food Commodities and Services (past 90 days).
    • Section 11: Expenditures on Non-Food Commodities and Services (past 12 months).
    • Section 12: Expenditures on Non-food Frequent Food Stuff and Commodities (7 days).
    • Section 12, Table 1: Meals Had Within the Residential Unit.
    • Section 12, Table 2: Number of Persons Participate in the Meals within Household Expenditure Other Than its Members.

    Part 3: Income and Other Data:
    • Section 13: Job
    • Section 14: Paid jobs
    • Section 15: Agriculture, forestry and fishing
    • Section 16: Household non-agricultural projects
    • Section 17: Income from ownership and transfers
    • Section 18: Durable goods
    • Section 19: Loans, advances and subsidies
    • Section 20: Shocks and strategy of dealing in the households
    • Section 21: Time use
    • Section 22: Justice
    • Section 23: Satisfaction in life
    • Section 24: Food consumption during past 7 days

    Part 4: Diary of Daily Expenditures: The diary of expenditure is an essential component of this survey. It is left with the household to record all daily purchases, such as expenditures on food and frequent non-food items (gasoline, newspapers, etc.), over 7 days. Two pages were allocated for recording the expenditures of each day, so the roster consists of 14 pages.

    Cleaning operations

    ----> Raw Data:

    Data Editing and Processing: To ensure accuracy and consistency, the data were edited at the following stages:

    1. Interviewer: Checks all answers on the household questionnaire, confirming that they are clear and correct.
    2. Local Supervisor: Checks to make sure that questions have been correctly completed.
    3. Statistical analysis: After exporting data files from Excel to SPSS, the Statistical Analysis Unit uses program commands to identify irregular or non-logical values, in addition to auditing some variables.
    4. World Bank consultants in coordination with the CSO data management team: The World Bank technical consultants use additional programs in SPSS and STAT to examine and correct remaining inconsistencies within the data files. The software detects errors by analyzing questionnaire items according to the expected parameter for each variable.
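
    The kind of automated check mentioned in the last stage (comparing each item against the expected parameter for its variable) might look like the sketch below. It is only an illustration: the variable names, the ranges and the records are hypothetical and are not taken from the IHSES data files or from the SPSS programs actually used.

```python
# Hypothetical expected ranges per variable (not from the survey documentation).
EXPECTED_RANGES = {
    "age_years": (0, 120),
    "household_size": (1, 50),
}


def find_out_of_range(records, ranges):
    """Return (row index, variable, value) for missing or out-of-range values."""
    issues = []
    for row_index, record in enumerate(records):
        for name, (low, high) in ranges.items():
            value = record.get(name)
            if value is None or not (low <= value <= high):
                issues.append((row_index, name, value))
    return issues


# Toy records standing in for rows of an exported data file.
records = [
    {"age_years": 34, "household_size": 6},
    {"age_years": 250, "household_size": 4},     # implausible age
    {"age_years": 12, "household_size": None},   # missing value
]

issues = find_out_of_range(records, EXPECTED_RANGES)
for row_index, name, value in issues:
    print(f"row {row_index}: {name} = {value!r} is outside the expected range")
```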

    ----> Harmonized Data:

    • The SPSS package is used to harmonize the Iraq Household Socio Economic Survey (IHSES) 2007 with Iraq Household Socio Economic Survey (IHSES) 2012.
    • The harmonization process starts with raw data files received from the Statistical Office.
    • A program is generated for each dataset to create harmonized variables.
    • Data are saved at the household and individual levels in SPSS and then converted to STATA for dissemination.

    Response rate

    The Iraq Household Socio Economic Survey (IHSES) reached a total of 25488 households. The number of households that refused to respond was 305, and the response rate was 98.6%. The highest interview rates were in Ninevah and Muthanna (100%), while the lowest rate was in Sulaimaniya (92%).

  18. Ad-hoc statistical analysis: 2020/21 Quarter 1

    • s3.amazonaws.com
    • gov.uk
    Updated Apr 14, 2020
    Cite
    Department for Digital, Culture, Media & Sport (2020). Ad-hoc statistical analysis: 2020/21 Quarter 1 [Dataset]. https://s3.amazonaws.com/thegovernmentsays-files/content/161/1616094.html
    Explore at:
    Dataset updated
    Apr 14, 2020
    Dataset provided by
    GOV.UK (http://gov.uk/)
    Authors
    Department for Digital, Culture, Media & Sport
    Description

    This page lists ad-hoc statistics released during the period April - June 2020. These are additional analyses not included in any of the Department for Digital, Culture, Media and Sport’s standard publications.

    If you would like any further information please contact evidence@culture.gov.uk.

    April 2020 - DCMS Economic Estimates: Experimental quarterly GVA for time series analysis

    These are experimental estimates of the quarterly GVA in chained volume measures by DCMS sectors and subsectors between 2010 and 2018, which have been produced to help the department estimate the effect of shocks to the economy. Due to substantial revisions to the base data and methodology used to construct the tourism satellite account, estimates for the tourism sector are only available for 2017. For this reason “All DCMS Sectors” excludes tourism. Further, as chained volume measures are not available for Civil Society at present, this sector is also not included.

    The methods used to produce these estimates are experimental. The data here are not comparable to those published previously and users should refer to the annual reports for estimates of GVA by businesses in DCMS sectors.

    GVA generated by businesses in DCMS sectors (excluding Tourism and Civil Society) increased by 31.0% between the fourth quarters of 2010 and 2018. The UK economy grew by 16.7% over the same period.

    All individual DCMS sectors (excluding Tourism and Civil Society) grew faster than the UK average between quarter 4 of 2010 and 2018, apart from the Telecoms sector, which decreased by 10.1%.

    April 2020 - Proportion of total DCMS sector turnover generated by businesses in different employment and turnover bands, 2017

    This data shows the proportion of the total turnover in DCMS sectors in 2017 that was generated by businesses according to individual businesses turnover, and by the number of employees.

    In 2017 a larger share of total turnover was generated by DCMS sector businesses with an annual turnover of less than one million pounds (11.4%) than the UK average (8.6%). In general, individual DCMS sectors tended to have a higher proportion of total turnover generated by businesses with individual turnover of less than one million pounds, with the exception of the Gambling (0.2%), Digital (8.2%) and Telecoms (2.0%, wholly within Digital) sectors.

    DCMS sectors tended to have a higher proportion of total turnover generated by large (250 employees or more) businesses (57.8%) than the UK average (51.4%). The exceptions were the Creative Industries (41.7%) and the Cultural sector (42.4%). Of all DCMS sectors, the Gambling sector had the highest proportion of total turnover generated by large businesses (97.5%).

    April 2

  19. List of statistical analysis procedures in metabox.

    • plos.figshare.com
    • figshare.com
    xls
    Updated Jun 2, 2023
    Cite
    Kwanjeera Wanichthanarak; Sili Fan; Dmitry Grapov; Dinesh Kumar Barupal; Oliver Fiehn (2023). List of statistical analysis procedures in metabox. [Dataset]. http://doi.org/10.1371/journal.pone.0171046.t001
    Explore at:
    Available download formats: xls
    Dataset updated
    Jun 2, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Kwanjeera Wanichthanarak; Sili Fan; Dmitry Grapov; Dinesh Kumar Barupal; Oliver Fiehn
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    List of statistical analysis procedures in metabox.

  20. Predictive Modeling of High-Entropy Alloys and Amorphous Metallic Alloys...

    • acs.figshare.com
    xlsx
    Updated Oct 1, 2024
    Cite
    Son Gyo Jung; Guwon Jung; Jacqueline M. Cole (2024). Predictive Modeling of High-Entropy Alloys and Amorphous Metallic Alloys Using Machine Learning [Dataset]. http://doi.org/10.1021/acs.jcim.4c00873.s002
    Explore at:
    Available download formats: xlsx
    Dataset updated
    Oct 1, 2024
    Dataset provided by
    ACS Publications
    Authors
    Son Gyo Jung; Guwon Jung; Jacqueline M. Cole
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0), https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    High entropy alloys and amorphous metallic alloys represent two distinct classes of advanced alloy materials, each with unique structural characteristics. Their emergence has garnered considerable interest across the materials science and engineering communities, driven by their promising properties, including exceptional strength. However, their extensive compositional diversity poses substantial challenges for systematic exploration, as traditional experimental approaches and high-throughput calculations struggle to efficiently navigate this vast space. While recent developments in data-driven materials discovery could potentially help, such efforts are hindered by the scarcity of comprehensive data and the lack of robust predictive tools that can effectively link alloy composition with specific properties. To address these challenges, we have deployed a machine-learning-based workflow for feature selection and statistical analysis to afford predictive models that accelerate the data-driven discovery and optimization of these advanced materials. Our methodology is validated through two case studies: (i) a regression analysis of the bulk modulus, and (ii) a classification analysis based on glass-forming ability. The Bayesian-optimized regression model trained for the prediction of bulk modulus achieved an R² of 0.969, a mean absolute error (MAE) of 3.958 GPa, and a root mean square error (RMSE) of 5.411 GPa, while our classification model for predicting glass-forming ability achieved an F1-score of 0.91, an area under the receiver-operating-characteristic curve of 0.98, and an accuracy of 0.91. Furthermore, by leveraging a wide array of chemical data from diverse literature sources, we have successfully predicted a broad range of properties. This success underscores the efficacy of our modeling approach and emphasizes the importance of a comprehensive feature analysis and judicious feature selection strategy over a mere reliance on complex modeling techniques.
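
    For readers unfamiliar with the metrics quoted in this description, the sketch below shows how MAE, RMSE, R², F1-score and accuracy are conventionally computed. It is a generic illustration using the standard textbook definitions: the numbers are toy values chosen for easy arithmetic, not data or results from the paper, and the function names are hypothetical.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (MAE, RMSE, R^2) using the standard definitions."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot
    return mae, rmse, r2


def f1_and_accuracy(labels, preds):
    """Return (F1, accuracy) for binary labels where 1 is the positive class."""
    tp = sum(1 for l, p in zip(labels, preds) if l == 1 and p == 1)
    fp = sum(1 for l, p in zip(labels, preds) if l == 0 and p == 1)
    fn = sum(1 for l, p in zip(labels, preds) if l == 1 and p == 0)
    correct = sum(1 for l, p in zip(labels, preds) if l == p)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return f1, correct / len(labels)


# Toy regression targets (think of them as moduli in GPa) and predictions.
y_true = [100.0, 150.0, 200.0, 250.0]
y_pred = [110.0, 140.0, 205.0, 245.0]
mae, rmse, r2 = regression_metrics(y_true, y_pred)

# Toy binary labels (1 = glass former) and predictions.
labels = [1, 1, 1, 0, 0, 0, 1, 0]
preds = [1, 1, 0, 0, 0, 1, 1, 0]
f1, accuracy = f1_and_accuracy(labels, preds)

print(f"MAE={mae:.3f} RMSE={rmse:.3f} R2={r2:.3f} F1={f1:.2f} accuracy={accuracy:.2f}")
```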

Cite
Under blind review in refereed journal (2021). University SET data, with faculty and courses characteristics [Dataset]. http://doi.org/10.3886/E149801V1

University SET data, with faculty and courses characteristics

Explore at:
Dataset updated
Sep 12, 2021
Authors
Under blind review in refereed journal
License

Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Description

This paper explores a unique dataset of all the SET ratings provided by students of one university in Poland at the end of the winter semester of the 2020/2021 academic year. The SET questionnaire used by this university is presented in Appendix 1. The dataset is unique for several reasons. It covers all SET surveys filled by students in all fields and levels of study offered by the university. In the period analysed, the university was entirely in the online regime amid the Covid-19 pandemic. While the expected learning outcomes formally have not been changed, the online mode of study could have affected the grading policy and could have implications for some of the studied SET biases. This Covid-19 effect is captured by econometric models and discussed in the paper. The average SET scores were matched with the characteristics of the teacher for degree, seniority, gender, and SET scores in the past six semesters; the course characteristics for time of day, day of the week, course type, course breadth, class duration, and class size; the attributes of the SET survey responses as the percentage of students providing SET feedback; and the grades of the course for the mean, standard deviation, and percentage failed. Data on course grades are also available for the previous six semesters. This rich dataset allows many of the biases reported in the literature to be tested for and new hypotheses to be formulated, as presented in the introduction section. The unit of observation or the single row in the data set is identified by three parameters: teacher unique id (j), course unique id (k) and the question number in the SET questionnaire (n ϵ {1, 2, 3, 4, 5, 6, 7, 8, 9} ). It means that for each pair (j,k), we have nine rows, one for each SET survey question, or sometimes less when students did not answer one of the SET questions at all. 
For example, the dependent variable SET_score_avg(j,k,n) for the triplet (j=John Smith, k=Calculus, n=2) is calculated as the average of all Likert-scale answers to question no. 2 in the SET survey distributed to all students who took the Calculus course taught by John Smith. The data set has 8,015 such observations or rows. The full list of variables or columns in the data set included in the analysis is presented in the attached file section. Their description refers to the triplet (teacher id = j, course id = k, question number = n). When the last value of the triplet (n) is dropped, it means that the variable takes the same values for all n ϵ {1, 2, 3, 4, 5, 6, 7, 8, 9}.

Two attachments:
- Word file with variables description
- Rdata file with the data set (for R language)

Appendix 1. The SET questionnaire used for this paper.

Evaluation survey of the teaching staff of [university name]

Please complete the following evaluation form, which aims to assess the lecturer’s performance. Only one answer should be indicated for each question. The answers are coded in the following way: 5- I strongly agree; 4- I agree; 3- Neutral; 2- I don’t agree; 1- I strongly don’t agree.

Questions (each answered on the scale 1 2 3 4 5):
- I learnt a lot during the course.
- I think that the knowledge acquired during the course is very useful.
- The professor used activities to make the class more engaging.
- If it was possible, I would enroll for the course conducted by this lecturer again.
- The classes started on time.
- The lecturer always used time efficiently.
- The lecturer delivered the class content in an understandable and efficient way.
- The lecturer was available when we had doubts.
- The lecturer treated all students equally regardless of their race, background and ethnicity.
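
The row structure described above, with one average per (teacher, course, question) triplet, can be illustrated with a small sketch. The identifiers and answers below are invented for illustration and do not come from the dataset, and the dataset itself is distributed as an Rdata file rather than in this form.

```python
from collections import defaultdict

# Hypothetical individual answers: (teacher id j, course id k, question n, Likert answer 1-5).
responses = [
    ("T01", "C_CALC", 2, 5),
    ("T01", "C_CALC", 2, 4),
    ("T01", "C_CALC", 2, 3),
    ("T01", "C_CALC", 1, 4),
    ("T02", "C_CALC", 2, 2),
    ("T02", "C_CALC", 2, 4),
]

# Collect the answers belonging to each (j, k, n) triplet.
answers_by_triplet = defaultdict(list)
for teacher_id, course_id, question, answer in responses:
    answers_by_triplet[(teacher_id, course_id, question)].append(answer)

# One row per triplet: the mean of its Likert-scale answers.
set_score_avg = {
    triplet: sum(answers) / len(answers)
    for triplet, answers in answers_by_triplet.items()
}

for (teacher_id, course_id, question), score in sorted(set_score_avg.items()):
    print(teacher_id, course_id, question, score)
```

A triplet with no answers simply never appears in `set_score_avg`, which corresponds to the case where a (teacher, course) pair has fewer than nine rows.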
