100+ datasets found
  1. Data from: A simple method for statistical analysis of intensity differences...

    • catalog.data.gov
    • healthdata.gov
    • +1more
    Updated Sep 7, 2025
    Cite
    National Institutes of Health (2025). A simple method for statistical analysis of intensity differences in microarray-derived gene expression data [Dataset]. https://catalog.data.gov/dataset/a-simple-method-for-statistical-analysis-of-intensity-differences-in-microarray-derived-ge
    Dataset updated
    Sep 7, 2025
    Dataset provided by
    National Institutes of Health
    Description

    Background: Microarray experiments offer a potent solution to the problem of making and comparing large numbers of gene expression measurements either in different cell types or in the same cell type under different conditions. Inferences about the biological relevance of observed changes in expression depend on the statistical significance of the changes. In lieu of many replicates with which to determine accurate intensity means and variances, reliable estimates of statistical significance remain problematic. Without such estimates, overly conservative choices for significance must be enforced.

    Results: A simple statistical method for estimating variances from microarray control data which does not require multiple replicates is presented. Comparison of datasets from two commercial entities using this difference-averaging method demonstrates that the standard deviation of the signal scales at a level intermediate between the signal intensity and its square root. Application of the method to a dataset related to the β-catenin pathway yields a larger number of biologically reasonable genes whose expression is altered than the ratio method.

    Conclusions: The difference-averaging method enables determination of variances as a function of signal intensities by averaging over the entire dataset. The method also provides a platform-independent view of important statistical properties of microarray data.
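    The idea of estimating variance as a function of intensity from differences can be illustrated with a short sketch. This is one possible reading of "difference averaging", not the authors' published procedure: it assumes paired control measurements of the same probes, uses the identity Var(a - b) = 2σ² for independent equal-variance measurements, and averages squared differences within intensity bins. The function names, the bin count, and the simulated data (with a true exponent of 0.75, between 0.5 and 1) are all hypothetical.

```python
import math
import random


def sd_by_intensity(pairs, n_bins=10):
    """Return a list of (mean intensity, SD estimate), one entry per intensity bin.

    pairs: (a, b) repeated control measurements of the same probe (assumed input).
    """
    ordered = sorted(pairs, key=lambda p: (p[0] + p[1]) / 2)
    size = len(ordered) // n_bins  # leftover pairs at the top end are dropped
    if size == 0:
        raise ValueError("need at least n_bins pairs")
    result = []
    for k in range(n_bins):
        chunk = ordered[k * size:(k + 1) * size]
        mean_intensity = sum((a + b) / 2 for a, b in chunk) / len(chunk)
        mean_sq_diff = sum((a - b) ** 2 for a, b in chunk) / len(chunk)
        # Var(a - b) = 2 * sigma^2 when a and b are independent with equal variance.
        result.append((mean_intensity, math.sqrt(mean_sq_diff / 2)))
    return result


def loglog_slope(points):
    """Least-squares slope of log(SD) against log(intensity)."""
    xs = [math.log(x) for x, _ in points]
    ys = [math.log(y) for _, y in points]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


# Simulated control data (hypothetical): SD grows as intensity ** 0.75.
rng = random.Random(1)
pairs = []
for _ in range(5000):
    mu = 10 ** rng.uniform(2, 4)
    sigma = 0.5 * mu ** 0.75
    pairs.append((rng.gauss(mu, sigma), rng.gauss(mu, sigma)))

curve = sd_by_intensity(pairs)
exponent = loglog_slope(curve)
print(f"estimated scaling exponent: {exponent:.2f}")
```

    An exponent near 1 would mean the SD is proportional to the signal, and an exponent near 0.5 would mean it scales with the square root; the description above reports an intermediate behaviour.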

  2. Dataset for Linear Regression with 2 IV and 1 DV

    • kaggle.com
    zip
    Updated Mar 25, 2025
    Cite
    Stable Space (2025). Dataset for Linear Regression with 2 IV and 1 DV [Dataset]. https://www.kaggle.com/datasets/sharmajicoder/dataset-for-linear-regression-with-2-iv-and-1-dv
    Available download formats: zip (9351 bytes)
    Dataset updated
    Mar 25, 2025
    Authors
    Stable Space
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    A dataset for linear regression with two independent variables and one dependent variable, intended for testing, visualization, and statistical analysis. The dataset is synthetic and contains 100 instances.
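    As a rough illustration of the kind of analysis such a dataset supports, the sketch below fits an ordinary least squares model with two independent variables. It does not load the Kaggle file; it generates its own 100 synthetic rows, and the coefficients, noise level, and the function name `fit_two_predictor_ols` are assumptions made for the example.

```python
import numpy as np


def fit_two_predictor_ols(x1, x2, y):
    """Return (intercept, b1, b2) for the least-squares fit y ~ b0 + b1*x1 + b2*x2."""
    # Design matrix with a leading column of ones for the intercept.
    X = np.column_stack([np.ones(len(y)), x1, x2])
    coef, *_ = np.linalg.lstsq(X, np.asarray(y, dtype=float), rcond=None)
    return tuple(coef)


# Hypothetical synthetic data: 100 instances, two predictors, small Gaussian noise.
rng = np.random.default_rng(0)
n = 100
x1 = rng.uniform(0, 10, n)
x2 = rng.uniform(0, 10, n)
y = 1.5 + 2.0 * x1 - 0.5 * x2 + rng.normal(0, 0.1, n)

b0, b1, b2 = fit_two_predictor_ols(x1, x2, y)
print(f"intercept={b0:.2f}, b1={b1:.2f}, b2={b2:.2f}")
```

    With real data, the two predictor columns and the response column would replace the generated arrays.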

  3. UnrealGaussianStat: Synthetic dataset for statistical analysis on Novel View...

    • dataverse.no
    • dataverse.azure.uit.no
    • +1more
    txt, zip
    Updated Apr 10, 2025
    Cite
    Anurag Dalal; Anurag Dalal (2025). UnrealGaussianStat: Synthetic dataset for statistical analysis on Novel View Synthesis [Dataset]. http://doi.org/10.18710/WSU7I6
    Available download formats: txt (7447), zip (960339536)
    Dataset updated
    Apr 10, 2025
    Dataset provided by
    DataverseNO
    Authors
    Anurag Dalal; Anurag Dalal
    License

    https://dataverse.no/api/datasets/:persistentId/versions/2.0/customlicense?persistentId=doi:10.18710/WSU7I6

    Description

    The dataset comprises three dynamic scenes with both simple and complex lighting conditions. The number of cameras ranges from 4 to 512 (4, 6, 8, 10, 12, 14, 16, 32, 64, 128, 256, and 512). The point clouds are randomly generated.

  4. Statistical Analysis - pwID

    • datasetcatalog.nlm.nih.gov
    • figshare.com
    Updated Feb 11, 2022
    Cite
    Sousa, Carla; Damásio, Manuel José; Neves, José Carlos (2022). Statistical Analysis - pwID [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000281176
    Dataset updated
    Feb 11, 2022
    Authors
    Sousa, Carla; Damásio, Manuel José; Neves, José Carlos
    Description

    Dataset for the statistical analysis of the article "Empowerment through Participatory Game Creation: A Case Study with Adults with Intellectual Disability".

  5. Household Health Survey 2012-2013, Economic Research Forum (ERF)...

    • datacatalog.ihsn.org
    • catalog.ihsn.org
    Updated Jun 26, 2017
    Cite
    Kurdistan Regional Statistics Office (KRSO) (2017). Household Health Survey 2012-2013, Economic Research Forum (ERF) Harmonization Data - Iraq [Dataset]. https://datacatalog.ihsn.org/catalog/6937
    Dataset updated
    Jun 26, 2017
    Dataset provided by
    Central Statistical Organization (CSO)
    Kurdistan Regional Statistics Office (KRSO)
    Economic Research Forum
    Time period covered
    2012 - 2013
    Area covered
    Iraq
    Description

    Abstract

    The harmonized data set on health, created and published by the ERF, is a subset of Iraq Household Socio Economic Survey (IHSES) 2012. It was derived from the household, individual and health modules, collected in the context of the above mentioned survey. The sample was then used to create a harmonized health survey, comparable with the Iraq Household Socio Economic Survey (IHSES) 2007 micro data set.

    ----> Overview of the Iraq Household Socio Economic Survey (IHSES) 2012:

    Iraq is considered a leader in household expenditure and income surveys: the first was conducted in 1946, followed by surveys in 1954 and 1961. After the establishment of the Central Statistical Organization, household expenditure and income surveys were carried out every 3-5 years (1971/1972, 1976, 1979, 1984/1985, 1988, 1993, 2002/2007). As part of the cooperation between the CSO and the World Bank (WB), the Central Statistical Organization (CSO) and the Kurdistan Region Statistics Office (KRSO) launched fieldwork on IHSES on 1/1/2012. The survey was carried out over a full year, covering all governorates including those in the Kurdistan Region.

    The survey has six main objectives. These objectives are:

    1. Provide data for poverty analysis and measurement, and to monitor, evaluate and update the implementation of the Poverty Reduction National Strategy issued in 2009.
    2. Provide a comprehensive data system to assess household social and economic conditions and prepare indicators related to human development.
    3. Provide data that meet the needs and requirements of national accounts.
    4. Provide detailed indicators on consumption expenditure to support decisions related to production, consumption, export and import.
    5. Provide detailed indicators on the sources of household and individual income.
    6. Provide the data necessary for formulating a new consumer price index.

    The raw survey data provided by the Statistical Office were then harmonized by the Economic Research Forum, to create a comparable version with the 2006/2007 Household Socio Economic Survey in Iraq. Harmonization at this stage only included unifying variables' names, labels and some definitions. See: Iraq 2007 & 2012- Variables Mapping & Availability Matrix.pdf provided in the external resources for further information on the mapping of the original variables on the harmonized ones, in addition to more indications on the variables' availability in both survey years and relevant comments.

    Geographic coverage

    National coverage: Covering a sample of urban, rural and metropolitan areas in all the governorates including those in Kurdistan Region.

    Analysis unit

    1- Household/family. 2- Individual/person.

    Universe

    The survey was carried out over a full year covering all governorates including those in Kurdistan Region.

    Kind of data

    Sample survey data [ssd]

    Sampling procedure

    ----> Design:

    The sample size was 25,488 households for the whole of Iraq: 216 households in each of the 118 districts, organized as 2,832 clusters of 9 households each, distributed across districts and governorates for rural and urban areas.

    ----> Sample frame:

    The listing and numbering results of the 2009-2010 Population and Housing Survey were adopted in all governorates, including the Kurdistan Region, as the frame from which to select households. The sample was selected in two stages. Stage 1: Primary sampling units (blocks) within each stratum (district), for urban and rural areas, were systematically selected with probability proportional to size to reach 2,832 units (clusters). Stage 2: 9 households were selected from each primary sampling unit to form a cluster. The total sample was thus 25,488 households distributed across the governorates, 216 households in each district.

    ----> Sampling Stages:

    In each district, the sample was selected in two stages. Stage 1: Based on the 2010 listing and numbering frame, 24 sample points were selected within each stratum through systematic sampling with probability proportional to size, in addition to an implicit breakdown by urban and rural area and a geographic breakdown (sub-district, quarter, street, county, village and block). Stage 2: Using households as secondary sampling units, 9 households were selected from each sample point using systematic equal probability sampling. Sampling frames for each stage can be developed from the 2010 building listing and numbering without updating household lists. In some small districts, random selection of primary sampling units may yield fewer than 24 distinct units; in that case a sampling unit is selected more than once, so two or more clusters may be drawn from the same enumeration unit when necessary.
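    The two-stage design described above can be sketched for a single district as follows. This is an illustrative sketch rather than the survey's actual selection program: the block sizes are randomly generated stand-ins for the listing frame, and the function names are hypothetical. Only the counts (118 districts, 24 sample points per district, 9 households per sample point) come from the description.

```python
import random

DISTRICTS = 118
PSUS_PER_DISTRICT = 24
HOUSEHOLDS_PER_PSU = 9


def systematic_pps(sizes, n_select, rng):
    """Stage 1: systematic sampling with probability proportional to size.

    Returns indices of the selected units; a unit larger than the sampling
    interval can appear more than once.
    """
    step = sum(sizes) / n_select
    start = rng.random() * step  # random start in [0, step)
    picks, idx, cum = [], 0, 0.0
    for k in range(n_select):
        target = start + k * step
        # Advance to the unit whose cumulative size range contains the target.
        while idx < len(sizes) - 1 and cum + sizes[idx] <= target:
            cum += sizes[idx]
            idx += 1
        picks.append(idx)
    return picks


def systematic_equal(n_units, n_select, rng):
    """Stage 2: systematic equal-probability sampling of household positions."""
    step = n_units / n_select
    start = rng.random() * step
    return [min(int(start + k * step), n_units - 1) for k in range(n_select)]


# Hypothetical frame for one district: 200 blocks with 50 to 400 households each.
rng = random.Random(42)
block_sizes = [rng.randint(50, 400) for _ in range(200)]
clusters = [
    (block, systematic_equal(block_sizes[block], HOUSEHOLDS_PER_PSU, rng))
    for block in systematic_pps(block_sizes, PSUS_PER_DISTRICT, rng)
]

# Totals implied by the design figures quoted in the description.
clusters_total = DISTRICTS * PSUS_PER_DISTRICT
households_total = clusters_total * HOUSEHOLDS_PER_PSU
print(clusters_total, households_total)
```

    The totals computed at the end reproduce the 2,832 clusters and 25,488 households quoted in the sampling design, and 24 sample points of 9 households give the 216 households per district.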

    Mode of data collection

    Face-to-face [f2f]

    Research instrument

    ----> Preparation:

    The questionnaire of the 2006 survey was adopted as the basis for the 2012 questionnaire, to which many revisions were made. Two rounds of pre-testing were carried out. Revisions were made based on feedback from the fieldwork team, World Bank consultants and others, and further revisions were made before the final version was implemented in a pilot survey in September 2011. After the pilot survey, additional revisions were made based on the challenges and feedback that emerged during its implementation, producing the final version used in the actual survey.

    ----> Questionnaire Parts:

    The questionnaire consists of four parts, each with several sections:

    Part 1: Socio-Economic Data:
    - Section 1: Household Roster
    - Section 2: Emigration
    - Section 3: Food Rations
    - Section 4: housing
    - Section 5: education
    - Section 6: health
    - Section 7: Physical measurements
    - Section 8: job seeking and previous job

    Part 2: Monthly, Quarterly and Annual Expenditures:
    - Section 9: Expenditures on Non-Food Commodities and Services (past 30 days).
    - Section 10: Expenditures on Non-Food Commodities and Services (past 90 days).
    - Section 11: Expenditures on Non-Food Commodities and Services (past 12 months).
    - Section 12: Expenditures on Non-food Frequent Food Stuff and Commodities (7 days).
    - Section 12, Table 1: Meals Had Within the Residential Unit.
    - Section 12, Table 2: Number of Persons Participate in the Meals within Household Expenditure Other Than its Members.

    Part 3: Income and Other Data:
    - Section 13: Job
    - Section 14: paid jobs
    - Section 15: Agriculture, forestry and fishing
    - Section 16: Household non-agricultural projects
    - Section 17: Income from ownership and transfers
    - Section 18: Durable goods
    - Section 19: Loans, advances and subsidies
    - Section 20: Shocks and strategy of dealing in the households
    - Section 21: Time use
    - Section 22: Justice
    - Section 23: Satisfaction in life
    - Section 24: Food consumption during past 7 days

    Part 4: Diary of Daily Expenditures: The expenditure diary is an essential component of this survey. It is left with the household to record all daily purchases, such as expenditures on food and frequent non-food items (gasoline, newspapers, etc.), over 7 days. Two pages were allocated for recording the expenditures of each day, so the roster consists of 14 pages.

    Cleaning operations

    ----> Raw Data:

    Data Editing and Processing: To ensure accuracy and consistency, the data were edited at the following stages:
    1. Interviewer: Checks all answers on the household questionnaire, confirming that they are clear and correct.
    2. Local Supervisor: Checks that questions have been correctly completed.
    3. Statistical analysis: After exporting data files from Excel to SPSS, the Statistical Analysis Unit uses program commands to identify irregular or non-logical values, in addition to auditing some variables.
    4. World Bank consultants in coordination with the CSO data management team: The World Bank technical consultants use additional programs in SPSS and STAT to examine and correct remaining inconsistencies within the data files. The software detects errors by analyzing questionnaire items according to the expected parameter for each variable.

    ----> Harmonized Data:

    • The SPSS package is used to harmonize the Iraq Household Socio Economic Survey (IHSES) 2007 with Iraq Household Socio Economic Survey (IHSES) 2012.
    • The harmonization process starts with raw data files received from the Statistical Office.
    • A program is generated for each dataset to create harmonized variables.
    • Data is saved on the household and individual level, in SPSS and then converted to STATA, to be disseminated.

    Response rate

    The Iraq Household Socio Economic Survey (IHSES) reached a total of 25,488 households. The number of households that refused to respond was 305, and the response rate was 98.6%. The highest interview rates were in Ninevah and Muthanna (100%), while the lowest rate was in Sulaimaniya (92%).

  6. Multivariate Analysis Software Report

    • datainsightsmarket.com
    doc, pdf, ppt
    Updated Oct 8, 2025
    Cite
    Data Insights Market (2025). Multivariate Analysis Software Report [Dataset]. https://www.datainsightsmarket.com/reports/multivariate-analysis-software-1402571
    Available download formats: ppt, doc, pdf
    Dataset updated
    Oct 8, 2025
    Dataset authored and provided by
    Data Insights Market
    License

    https://www.datainsightsmarket.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    The global Multivariate Analysis Software market is poised for significant expansion, projected to reach an estimated market size of USD 4,250 million in 2025, with a robust Compound Annual Growth Rate (CAGR) of 12.5% anticipated through 2033. This growth is primarily fueled by the increasing adoption of advanced statistical techniques across a wide spectrum of industries, including the burgeoning pharmaceutical sector, sophisticated chemical research, and complex manufacturing processes. The demand for data-driven decision-making, coupled with the ever-growing volume of complex datasets, is compelling organizations to invest in powerful analytical tools. Key drivers include the rising need for predictive modeling in drug discovery and development, quality control in manufacturing, and risk assessment in financial applications. Emerging economies, particularly in the Asia Pacific region, are also contributing to this upward trajectory as they invest heavily in technological advancements and R&D, further amplifying the need for sophisticated analytical solutions. The market is segmented by application into Medical, Pharmacy, Chemical, Manufacturing, and Marketing. The Pharmacy and Medical applications are expected to witness the highest growth owing to the critical need for accurate data analysis in drug efficacy studies, clinical trials, and personalized medicine. In terms of types, the market encompasses a variety of analytical methods, including Multiple Linear Regression Analysis, Multiple Logistic Regression Analysis, Multivariate Analysis of Variance (MANOVA), Factor Analysis, and Cluster Analysis. While advanced techniques like MANOVA and Factor Analysis are gaining traction for their ability to uncover intricate relationships within data, the foundational Multiple Linear and Logistic Regression analyses remain widely adopted. 
    Restraints, such as the high cost of specialized software and the need for skilled personnel to effectively utilize these tools, are being addressed by the emergence of more user-friendly interfaces and cloud-based solutions. Leading companies like Hitachi High-Tech America, OriginLab Corporation, and Minitab are at the forefront, offering comprehensive suites that cater to diverse analytical needs.

    This report provides an in-depth analysis of the global Multivariate Analysis Software market, encompassing a study period from 2019 to 2033, with a base and estimated year of 2025 and a forecast period from 2025 to 2033, building upon historical data from 2019-2024. The market is projected to witness significant expansion, driven by increasing data complexity and the growing need for advanced analytical capabilities across various industries. The estimated market size for Multivariate Analysis Software is expected to reach $2.5 billion by 2025, with projections indicating a substantial growth to $5.8 billion by 2033, demonstrating a robust compound annual growth rate (CAGR) of approximately 11.5% during the forecast period.
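    The description quotes two different sets of market figures, so it can help to see how a compound annual growth rate ties a start value, an end value, and a number of years together. The sketch below is a generic CAGR calculation, not the publisher's model; the function names are hypothetical, and the inputs are simply the numbers quoted above, with 2025 to 2033 taken as eight compounding years (an assumption).

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate implied by a start value, end value and span."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1


def project(start_value, rate, years):
    """Value reached after compounding start_value at the given rate for the given years."""
    return start_value * (1 + rate) ** years


# Rate implied by the "$2.5 billion by 2025" and "$5.8 billion by 2033" figures.
implied_rate = cagr(2.5, 5.8, 8)

# 2033 value implied by "USD 4,250 million in 2025" growing at 12.5% per year.
projected_2033_musd = project(4250, 0.125, 8)

print(f"implied CAGR: {implied_rate:.1%}")
print(f"projected 2033 size: USD {projected_2033_musd:,.0f} million")
```

    Comparing these outputs with the rates and sizes stated in the description is a quick way to check which quoted figures are mutually consistent.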

  7. Louisville Metro KY - Officer Involved Shooting Database and Statistical...

    • catalog.data.gov
    • arc-gis-hub-home-arcgishub.hub.arcgis.com
    • +1more
    Updated Apr 13, 2023
    Cite
    Louisville/Jefferson County Information Consortium (2023). Louisville Metro KY - Officer Involved Shooting Database and Statistical Analysis 10-13-2021 [Dataset]. https://catalog.data.gov/dataset/louisville-metro-ky-officer-involved-shooting-database-and-statistical-analysis-10-13-2021
    Dataset updated
    Apr 13, 2023
    Dataset provided by
    Louisville/Jefferson County Information Consortium
    Area covered
    Kentucky, Louisville
    Description

    Officer Involved Shooting (OIS) Database and Statistical Analysis. Data is updated after there is an officer involved shooting.

    • PIU#
    • Incident # - the number associated with either the incident or used as reference to store the items in our evidence rooms
    • Date of Occurrence Month - month the incident occurred (Note the year is labeled on the tab of the spreadsheet)
    • Date of Occurrence Day - day of the month the incident occurred (Note the year is labeled on the tab of the spreadsheet)
    • Time of Occurrence - time the incident occurred
    • Address of incident - the location the incident occurred
    • Division - the LMPD division in which the incident actually occurred
    • Beat - the LMPD beat in which the incident actually occurred
    • Investigation Type - the type of investigation (shooting or death)
    • Case Status - status of the case (open or closed)
    • Suspect Name - the name of the suspect involved in the incident
    • Suspect Race - the race of the suspect involved in the incident (W-White, B-Black)
    • Suspect Sex - the gender of the suspect involved in the incident
    • Suspect Age - the age of the suspect involved in the incident
    • Suspect Ethnicity - the ethnicity of the suspect involved in the incident (H-Hispanic, N-Not Hispanic)
    • Suspect Weapon - the type of weapon the suspect used in the incident
    • Officer Name - the name of the officer involved in the incident
    • Officer Race - the race of the officer involved in the incident (W-White, B-Black, A-Asian)
    • Officer Sex - the gender of the officer involved in the incident
    • Officer Age - the age of the officer involved in the incident
    • Officer Ethnicity - the ethnicity of the officer involved in the incident (H-Hispanic, N-Not Hispanic)
    • Officer Years of Service - the number of years the officer has been serving at the time of the incident
    • Lethal Y/N - whether or not the incident involved a death (Y-Yes, N-No, continued-pending)
    • Narrative - a description of what was determined from the investigation

    Contact: Carol Boyle, carol.boyle@louisvilleky.gov

  8. regression analysis

    • figshare.com
    docx
    Updated Nov 16, 2022
    Cite
    Victoria Saydakova (2022). regression analysis [Dataset]. http://doi.org/10.6084/m9.figshare.17069888.v1
    Available download formats: docx
    Dataset updated
    Nov 16, 2022
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Victoria Saydakova
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Regression analysis of the business environment well-being index is presented.

  9. Zelig Models for Testing Advanced Statistical Analysis

    • search.dataone.org
    • dataverse.harvard.edu
    Updated Nov 21, 2023
    Cite
    Harvard Dataverse (2023). Zelig Models for Testing Advanced Statistical Analysis [Dataset]. http://doi.org/10.7910/DVN/6OIEQE
    Dataset updated
    Nov 21, 2023
    Dataset provided by
    Harvard Dataverse
    Description

    The aim of this study is to provide datasets for teaching and testing the methods embedded in the Advanced Statistical Analysis. For each datafile, there is an accompanying document describing (i) which models could be run and tested with this particular data and (ii) the steps for doing so.

  10. Descriptive and statistical analysis of the mental health measures regarding...

    • datasetcatalog.nlm.nih.gov
    • plos.figshare.com
    Updated Apr 26, 2023
    Cite
    Trindade, Luciano Imar Palheta; Rummel-Kluge, Christine; de Lucas Freitas, Joanneliese; Kohls, Elisabeth; da Silva Prado, Aneliana; Bianchi, Alessandra Sant’Anna; Baldofski, Sabrina (2023). Descriptive and statistical analysis of the mental health measures regarding the course levels students are currently enrolled in (n = 2,437). [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001001925
    Dataset updated
    Apr 26, 2023
    Authors
    Trindade, Luciano Imar Palheta; Rummel-Kluge, Christine; de Lucas Freitas, Joanneliese; Kohls, Elisabeth; da Silva Prado, Aneliana; Bianchi, Alessandra Sant’Anna; Baldofski, Sabrina
    Description

    Descriptive and statistical analysis of the mental health measures regarding the course levels students are currently enrolled in (n = 2,437).

  11. students_perfomance_data

    • kaggle.com
    zip
    Updated Oct 29, 2025
    Cite
    Suhana Lodhi (2025). students_perfomance_data [Dataset]. https://www.kaggle.com/datasets/suhanalodhi/students-perfomance-data
    Available download formats: zip (558 bytes)
    Dataset updated
    Oct 29, 2025
    Authors
    Suhana Lodhi
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0) (https://creativecommons.org/licenses/by-nc/4.0/)
    License information was derived automatically

    Description

    The “Students Performance Data” dataset provides academic and demographic information of students. It includes their marks in Maths, Science, and English along with attendance and city details. This dataset is ideal for beginners learning data entry, analysis, and visualization using tools like Excel or Kaggle Notebooks.

  12. Experimental statistics: fostering care datasets

    • data.wu.ac.at
    html
    Updated May 9, 2014
    Cite
    Ofsted (2014). Experimental statistics: fostering care datasets [Dataset]. https://data.wu.ac.at/schema/data_gov_uk/YjJkNzFhNjctOGQ3ZS00OGUwLTgyYmQtY2QyZGJkY2FlZGE4
    Available download formats: html
    Dataset updated
    May 9, 2014
    Dataset provided by
    Ofsted (https://gov.uk/ofsted)
    License

    Open Government Licence 3.0 (http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/)
    License information was derived automatically

    Description

    This is the experimental fostering care publication, comprising datasets.

    Source agency: Office for Standards in Education, Children's Services and Skills

    Designation: Experimental Official Statistics

    Language: English

    Alternative title: Experimental statistics: fostering care datasets

  13. Data from: SPEED Stat: a free, intuitive, and minimalist spreadsheet program...

    • figshare.com
    • scielo.figshare.com
    xls
    Updated Mar 26, 2021
    Cite
    André Mundstock Xavier de Carvalho; Felipe Queiroz Mendes; Fabrícia Queiroz Mendes; Laene de Fátima Tavares (2021). SPEED Stat: a free, intuitive, and minimalist spreadsheet program for statistical analyses of experiments [Dataset]. http://doi.org/10.6084/m9.figshare.14328730.v1
    Available download formats: xls
    Dataset updated
    Mar 26, 2021
    Dataset provided by
    SciELO journals
    Authors
    André Mundstock Xavier de Carvalho; Felipe Queiroz Mendes; Fabrícia Queiroz Mendes; Laene de Fátima Tavares
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Abstract: SPEED Stat is a new spreadsheet program for univariate statistical analyses, focused on the dominant profile of agricultural experimentation. The program can perform analysis of variance; tests for normality, homoscedasticity, additivity, and outliers; complex contrasts; multiple comparison tests; Scott-Knott's grouping analysis; regression analysis; and others. It is available at speedstatsoftware.wordpress.com.

  14. Replication Data for: The Statistical Analysis of Misreporting on Sensitive...

    • dataverse.harvard.edu
    • search.dataone.org
    Updated Dec 20, 2016
    Cite
    Gregory Eady (2016). Replication Data for: The Statistical Analysis of Misreporting on Sensitive Survey Questions [Dataset]. http://doi.org/10.7910/DVN/PZKBUX
    Dataset updated
    Dec 20, 2016
    Dataset provided by
    Harvard Dataverse
    Authors
    Gregory Eady
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    Replication data for the article Eady, Gregory (2016) "The Statistical Analysis of Misreporting on Sensitive Survey Questions"

  15. Ad-hoc statistical analysis: 2020/21 Quarter 1

    • s3.amazonaws.com
    • gov.uk
    Updated Apr 14, 2020
    Cite
    Department for Digital, Culture, Media & Sport (2020). Ad-hoc statistical analysis: 2020/21 Quarter 1 [Dataset]. https://s3.amazonaws.com/thegovernmentsays-files/content/161/1616094.html
    Dataset updated
    Apr 14, 2020
    Dataset provided by
    GOV.UK (http://gov.uk/)
    Authors
    Department for Digital, Culture, Media & Sport
    Description

    This page lists ad-hoc statistics released during the period April - June 2020. These are additional analyses not included in any of the Department for Digital, Culture, Media and Sport’s standard publications.

    If you would like any further information please contact evidence@culture.gov.uk.

    April 2020 - DCMS Economic Estimates: Experimental quarterly GVA for time series analysis

    These are experimental estimates of the quarterly GVA in chained volume measures by DCMS sectors and subsectors between 2010 and 2018, which have been produced to help the department estimate the effect of shocks to the economy. Due to substantial revisions to the base data and methodology used to construct the tourism satellite account, estimates for the tourism sector are only available for 2017. For this reason “All DCMS Sectors” excludes tourism. Further, as chained volume measures are not available for Civil Society at present, this sector is also not included.

    The methods used to produce these estimates are experimental. The data here are not comparable to those published previously and users should refer to the annual reports for estimates of GVA by businesses in DCMS sectors.

    GVA generated by businesses in DCMS sectors (excluding Tourism and Civil Society) increased by 31.0% between the fourth quarters of 2010 and 2018. The UK economy grew by 16.7% over the same period.

    All individual DCMS sectors (excluding Tourism and Civil Society) grew faster than the UK average between quarter 4 of 2010 and 2018, apart from the Telecoms sector, which decreased by 10.1%.

    April 2020 - Proportion of total DCMS sector turnover generated by businesses in different employment and turnover bands, 2017

    This data shows the proportion of the total turnover in DCMS sectors in 2017 that was generated by businesses, broken down by individual businesses' turnover and by number of employees.

    In 2017 a larger share of total turnover was generated by DCMS sector businesses with an annual turnover of less than one million pounds (11.4%) than the UK average (8.6%). In general, individual DCMS sectors tended to have a higher proportion of total turnover generated by businesses with individual turnover of less than one million pounds, with the exception of the Gambling (0.2%), Digital (8.2%) and Telecoms (2.0%, wholly within Digital) sectors.

    DCMS sectors tended to have a higher proportion of total turnover generated by large (250 employees or more) businesses (57.8%) than the UK average (51.4%). The exceptions were the Creative Industries (41.7%) and the Cultural sector (42.4%). Of all DCMS sectors, the Gambling sector had the highest proportion of total turnover generated by large businesses (97.5%).

    April 2

  16. Stocking density and femininity index, statistical regions, Slovenia,...

    • data.europa.eu
    html, unknown
    Cite
    VLADA REPUBLIKE SLOVENIJE STATISTIČNI URAD REPUBLIKE SLOVENIJE, Stocking density and femininity index, statistical regions, Slovenia, half-yearly [Dataset]. https://data.europa.eu/data/datasets/surs05c2010s?locale=en
    Available download formats: html, unknown
    Dataset authored and provided by
    VLADA REPUBLIKE SLOVENIJE STATISTIČNI URAD REPUBLIKE SLOVENIJE
    Area covered
    Slovenia
    Description

    This collection automatically captures metadata whose source is the GOVERNMENT OF THE REPUBLIC OF SLOVENIA, STATISTICAL OFFICE OF THE REPUBLIC OF SLOVENIA, and which corresponds to the source collection entitled “Statutory population and femininity index, statistical regions, Slovenia, half-yearly”.

    The actual data are available in PC-Axis format (.px). Additional links give access to the source portal page for viewing and selecting data, and to the PX-Win program, which can be downloaded free of charge. Both allow you to select data for display, change the output format, store the data in different formats, view and print tables of unlimited size, and produce some basic statistical analyses and graphics.

  17. r

    Data from: Statistical analysis of the design procedure used in reinforced...

    • resodate.org
    • scielo.figshare.com
    Updated Jan 1, 2020
    F. M. FLORESTA; C. S. VIEIRA; L. A. MENDES; D. L. N. F. AMORIM (2020). Statistical analysis of the design procedure used in reinforced concrete pipes [Dataset]. http://doi.org/10.6084/M9.FIGSHARE.12056292
    Dataset updated
    Jan 1, 2020
    Dataset provided by
    SciELO journals
    Authors
    F. M. FLORESTA; C. S. VIEIRA; L. A. MENDES; D. L. N. F. AMORIM
    Description

    Abstract: Structural design procedures are based on simplified hypotheses that attempt to approximate the actual behaviour. Depending on the adopted hypothesis, the design procedure may not satisfactorily describe the actual structural behaviour. Such a condition occurs in the design of reinforced concrete pipes, where there are uncertainties related especially to the internal forces and the installation type of the pipe. Moreover, the main design hypothesis is that the cross section remains plane and perpendicular to the deformed axis. From strength of materials principles, this hypothesis is known to be unsatisfactory for pipes with an aspect ratio lower than ten, and commercial reinforced concrete pipes usually have an aspect ratio well below ten. The main objective of this paper is therefore to analyse the accuracy of the design procedure for reinforced concrete pipes. Statistical processes were used to compare design values with experimental results. The comparisons showed that the design procedure results in oversized pipes.

  18. f

    Data from: Reduced Order Machine Learning Models for Accurate Prediction of...

    • acs.figshare.com
    xlsx
    Updated Jul 3, 2023
    Vazida Mehtab; Shadab Alam; Sangeetha Povari; Lingaiah Nakka; Yarasi Soujanya; Sumana Chenna (2023). Reduced Order Machine Learning Models for Accurate Prediction of CO2 Capture in Physical Solvents [Dataset]. http://doi.org/10.1021/acs.est.3c00372.s002
    Available download formats: xlsx
    Dataset updated
    Jul 3, 2023
    Dataset provided by
    ACS Publications
    Authors
    Vazida Mehtab; Shadab Alam; Sangeetha Povari; Lingaiah Nakka; Yarasi Soujanya; Sumana Chenna
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    CO2 sorption in physical solvents is one of the promising approaches for carbon capture from highly concentrated CO2 streams at high pressures. Identifying an efficient solvent and evaluating its solubility data at different operating conditions are essential for effective capture, which generally involves expensive and time-consuming experimental procedures. This work presents a machine learning based ultrafast alternative for accurate prediction of CO2 solubility in physical solvents using their physical, thermodynamic, and structural properties data. First, a database is established, with which several linear, nonlinear, and ensemble models were trained through a systematic cross-validation and grid search method; kernel ridge regression (KRR) was found to be the optimum model. Second, the descriptors are ranked based on their complete decomposition contributions derived using principal component analysis. Further, optimum key descriptors (KDs) are evaluated through an iterative sequential addition method with the objective of maximizing the prediction accuracy of the reduced order KRR (r-KRR) model. Finally, the study resulted in the r-KRR model with nine KDs exhibiting the highest prediction accuracy, with a minimum root-mean-square error (0.0023), mean absolute error (0.0016), and maximum R² (0.999). The validity of the database created and the ML models developed is also ensured through detailed statistical analysis.
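The model-selection step described above (kernel ridge regression tuned by cross-validation and grid search) can be sketched in plain Python. This is only an illustration under stated assumptions: the Gaussian (RBF) kernel, the hyperparameter grid, the fold layout, and the synthetic one-feature data are all hypothetical choices, not the authors' database or settings.

```python
import math

def rbf_kernel(a, b, gamma):
    """Gaussian (RBF) kernel between two feature vectors (assumed kernel choice)."""
    return math.exp(-gamma * sum((x - y) ** 2 for x, y in zip(a, b)))

def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x

def krr_fit(X, y, alpha, gamma):
    """Fit KRR by solving (K + alpha * I) w = y for the dual weights w."""
    n = len(X)
    K = [[rbf_kernel(X[i], X[j], gamma) + (alpha if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    return solve(K, y)

def krr_predict(X_train, weights, gamma, x_new):
    """Predict at x_new as a kernel-weighted sum over the training points."""
    return sum(w * rbf_kernel(xt, x_new, gamma) for w, xt in zip(weights, X_train))

def cv_rmse(X, y, alpha, gamma, folds=4):
    """k-fold cross-validated root-mean-square error (interleaved folds)."""
    squared_errors = []
    for k in range(folds):
        train = [i for i in range(len(X)) if i % folds != k]
        test = [i for i in range(len(X)) if i % folds == k]
        X_tr = [X[i] for i in train]
        w = krr_fit(X_tr, [y[i] for i in train], alpha, gamma)
        squared_errors += [(krr_predict(X_tr, w, gamma, X[i]) - y[i]) ** 2 for i in test]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Synthetic one-feature data (illustrative only, not the paper's database).
X = [[i / 10] for i in range(20)]
y = [math.sin(x[0]) for x in X]

# Grid search: score every (alpha, gamma) pair and keep the lowest CV error.
grid = [(alpha, gamma) for alpha in (1e-3, 1e-1, 1.0) for gamma in (0.1, 1.0, 10.0)]
scores = {params: cv_rmse(X, y, *params) for params in grid}
best_alpha, best_gamma = min(scores, key=scores.get)
```

In practice a library implementation would normally replace the hand-written solver; the point here is only the structure of the search, where each candidate pair is scored on held-out folds rather than on the training fit.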

  19. f

    Raw data and independent statistical analyses.

    • datasetcatalog.nlm.nih.gov
    • plos.figshare.com
    Updated Sep 17, 2024
    Salois, Thomas; Karavas, Emily; Gallant, Sarah K.; Ogadi, Peace; Vibho, Amrutaa; Mohammed, Rahisa; Prairie, Michael W.; Rogat, Courtney; White, Michael; Frisbie, Seth H.; Anderson, Charles (2024). Raw data and independent statistical analyses. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001374588
    Dataset updated
    Sep 17, 2024
    Authors
    Salois, Thomas; Karavas, Emily; Gallant, Sarah K.; Ogadi, Peace; Vibho, Amrutaa; Mohammed, Rahisa; Prairie, Michael W.; Rogat, Courtney; White, Michael; Frisbie, Seth H.; Anderson, Charles
    Description

    Uranium (U) is a radiologically and chemically toxic element that occurs naturally in water, soil, and rock at generally low levels. However, anthropogenic uranium can also leach into groundwater sources due to mining, ore refining, and improper nuclear waste management. Over the last few decades, various methods for measuring uranium have emerged; however, most of these techniques require skilled scientists to run samples on expensive instrumentation for detection or require the pretreatment of samples in complex procedures. In this work, a Schiff base ligand (P1) is used to develop a simple spectrophotometric method for measuring the concentration of uranium (VI) with an accurate and affordable light-emitting diode (LED) spectrophotometer. A test for a higher-order polynomial relationship was used to objectively determine the calibration data’s linearity. This test was done with a Python program on a Raspberry Pi computer that captured the spectrophotometer’s calibration and sample measurement data.
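One common way to test for a higher-order polynomial relationship in calibration data is to compare a straight-line fit with a quadratic fit through an F statistic. The sketch below assumes that form of the test; the description does not say which variant the authors used, and the function names and calibration numbers here are hypothetical.

```python
def polyfit_rss(x, y, degree):
    """Least-squares polynomial fit; return the residual sum of squares."""
    m = degree + 1
    # Normal equations (A^T A) c = A^T y for the Vandermonde matrix A.
    ata = [[sum(xi ** (r + c) for xi in x) for c in range(m)] for r in range(m)]
    aty = [sum(yi * xi ** r for xi, yi in zip(x, y)) for r in range(m)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(ata[r][col]))
        ata[col], ata[pivot] = ata[pivot], ata[col]
        aty[col], aty[pivot] = aty[pivot], aty[col]
        for r in range(m):
            if r != col:
                f = ata[r][col] / ata[col][col]
                ata[r] = [a - f * b for a, b in zip(ata[r], ata[col])]
                aty[r] -= f * aty[col]
    coeffs = [aty[i] / ata[i][i] for i in range(m)]
    return sum((yi - sum(c * xi ** k for k, c in enumerate(coeffs))) ** 2
               for xi, yi in zip(x, y))

def quadratic_term_f_statistic(x, y):
    """F statistic for adding a quadratic term to a straight-line calibration."""
    rss_linear = polyfit_rss(x, y, 1)
    rss_quadratic = polyfit_rss(x, y, 2)
    dof = len(x) - 3  # residual degrees of freedom of the quadratic fit
    return (rss_linear - rss_quadratic) / (rss_quadratic / dof)

# Hypothetical calibration points: concentration vs. instrument response.
concentration = [0, 1, 2, 3, 4, 5]
straight = [0.01, 0.19, 0.41, 0.59, 0.81, 0.99]
curved = [s - 0.01 * c ** 2 for s, c in zip(straight, concentration)]

f_straight = quadratic_term_f_statistic(concentration, straight)
f_curved = quadratic_term_f_statistic(concentration, curved)
```

The statistic is compared with a critical value of the F distribution with 1 and n - 3 degrees of freedom, taken from a table or a statistics library. A value above the critical value suggests the calibration is not adequately described by a straight line.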

  20. Data from: LatticeQCD/AnalysisToolbox: v1.1.0

    • meta4ds.fokus.fraunhofer.de
    unknown, zip
    Zenodo, LatticeQCD/AnalysisToolbox: v1.1.0 [Dataset]. https://meta4ds.fokus.fraunhofer.de/datasets/oai-zenodo-org-8368581?locale=en
    Available download formats: unknown, zip (32066426)
    Dataset authored and provided by
    Zenodo (http://zenodo.org/)
    Description

    The AnalysisToolbox is a set of Python tools for statistically analyzing correlated data. This includes aspects of lattice QCD applications related to QCD phenomenology. We briefly highlight here some features of the AnalysisToolbox. General statistics: jackknife, bootstrap, Gaussian bootstrap, error propagation, estimation of the integrated autocorrelation time, and curve fitting with and without Bayesian priors. We stress that these methods are useful generally, independent of physics contexts. QCD physics: hadron resonance gas model, HotQCD equation of state, QCD beta function, physical constants, and critical exponents for various universality classes. These methods are useful for QCD phenomenology, independent of lattice contexts. Lattice QCD: continuum-limit extrapolation, Polyakov loop observables, SU(3) gauge fields, reading in gauge fields, and the static quark-antiquark potential. These methods specifically target lattice QCD.
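As an illustration of the first statistical method listed above, a delete-one jackknife error estimate can be sketched with the standard library alone. This is a generic textbook version, not the AnalysisToolbox API: the function name `jackknife` and the sample values are hypothetical, and the sketch assumes the samples are independent (correlated data would first need to be grouped into bins).

```python
import math
import statistics

def jackknife(samples, estimator):
    """Return (estimate, standard error) of an estimator via delete-one jackknife."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    # Leave-one-out estimates: recompute the estimator with sample i removed.
    loo = [estimator(samples[:i] + samples[i + 1:]) for i in range(n)]
    loo_mean = sum(loo) / n
    # Jackknife variance: (n - 1) / n times the summed squared deviations.
    variance = (n - 1) / n * sum((t - loo_mean) ** 2 for t in loo)
    return estimator(samples), math.sqrt(variance)

# Hypothetical measurements; any function of a list of samples can be the estimator.
data = [2.1, 1.9, 2.4, 2.0, 2.2, 1.8]
mean_estimate, mean_error = jackknife(data, statistics.fmean)
```

For the sample mean, the jackknife error reduces to the familiar standard error of the mean, which makes a convenient sanity check. The method is more useful for nonlinear estimators, where no simple closed-form error is available.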

National Institutes of Health (2025). A simple method for statistical analysis of intensity differences in microarray-derived gene expression data [Dataset]. https://catalog.data.gov/dataset/a-simple-method-for-statistical-analysis-of-intensity-differences-in-microarray-derived-ge

Data from: A simple method for statistical analysis of intensity differences in microarray-derived gene expression data

Related Article
Explore at:
Dataset updated
Sep 7, 2025
Dataset provided by
National Institutes of Health
Description

Background: Microarray experiments offer a potent solution to the problem of making and comparing large numbers of gene expression measurements either in different cell types or in the same cell type under different conditions. Inferences about the biological relevance of observed changes in expression depend on the statistical significance of the changes. In lieu of many replicates with which to determine accurate intensity means and variances, reliable estimates of statistical significance remain problematic. Without such estimates, overly conservative choices for significance must be enforced. Results: A simple statistical method for estimating variances from microarray control data which does not require multiple replicates is presented. Comparison of datasets from two commercial entities using this difference-averaging method demonstrates that the standard deviation of the signal scales at a level intermediate between the signal intensity and its square root. Application of the method to a dataset related to the β-catenin pathway yields a larger number of biologically reasonable genes whose expression is altered than the ratio method. Conclusions: The difference-averaging method enables determination of variances as a function of signal intensities by averaging over the entire dataset. The method also provides a platform-independent view of important statistical properties of microarray data.
