100+ datasets found

Household Health Survey 2012-2013, Economic Research Forum (ERF)...
datacatalog.ihsn.org
catalog.ihsn.org
Updated Jun 26, 2017
Cite
Central Statistical Organization (CSO) (2017). Household Health Survey 2012-2013, Economic Research Forum (ERF) Harmonization Data - Iraq [Dataset]. https://datacatalog.ihsn.org/catalog/6937
Dataset updated
Jun 26, 2017
Dataset provided by
Kurdistan Regional Statistics Office (KRSO)
Economic Research Forum
Central Statistical Organization (CSO)
Time period covered
2012 - 2013
Description
Abstract

The harmonized data set on health, created and published by the ERF, is a subset of the Iraq Household Socio Economic Survey (IHSES) 2012. It was derived from the household, individual and health modules collected in the context of the above-mentioned survey. The sample was then used to create a harmonized health survey, comparable with the Iraq Household Socio Economic Survey (IHSES) 2007 micro data set.

----> Overview of the Iraq Household Socio Economic Survey (IHSES) 2012:

Iraq is considered a leader in household expenditure and income surveys: the first was conducted in 1946, followed by surveys in 1954 and 1961. After the establishment of the Central Statistical Organization (CSO), household expenditure and income surveys were carried out every 3-5 years (1971/1972, 1976, 1979, 1984/1985, 1988, 1993, 2002/2007). As part of the cooperation between the CSO and the World Bank (WB), the CSO and the Kurdistan Region Statistics Office (KRSO) launched fieldwork on IHSES on 1/1/2012. The survey was carried out over a full year, covering all governorates including those in the Kurdistan Region.

The survey has six main objectives. These objectives are:

Provide data for poverty analysis and measurement, and to monitor, evaluate and update the implementation of the Poverty Reduction National Strategy issued in 2009.

Provide a comprehensive data system to assess household social and economic conditions and to prepare indicators related to human development.

Provide data that meet the needs and requirements of national accounts.

Provide detailed indicators on consumption expenditure that support decision-making related to production, consumption, export and import.

Provide detailed indicators on the sources of household and individual income.

Provide the data necessary for formulating a new consumer price index number.

The raw survey data provided by the Statistical Office were then harmonized by the Economic Research Forum to create a version comparable with the 2006/2007 Household Socio Economic Survey in Iraq. Harmonization at this stage only included unifying variables' names, labels and some definitions. See "Iraq 2007 & 2012- Variables Mapping & Availability Matrix.pdf", provided in the external resources, for further information on the mapping of the original variables onto the harmonized ones, as well as notes on the variables' availability in both survey years and relevant comments.

Geographic coverage

National coverage: Covering a sample of urban, rural and metropolitan areas in all the governorates including those in Kurdistan Region.

Analysis unit

1- Household/family. 2- Individual/person.

Universe

The survey was carried out over a full year covering all governorates including those in Kurdistan Region.

Kind of data

Sample survey data [ssd]

Sampling procedure

----> Design:

The sample size was 25488 households for the whole of Iraq: 216 households in each of the 118 districts, grouped into 2832 clusters of 9 households each, distributed across districts and governorates for rural and urban areas.
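The figures quoted in the design paragraph can be cross-checked with a few lines of arithmetic. This is only a consistency check on the published numbers (118 districts, 216 households per district, 9 households per cluster); the variable names are illustrative and not part of the survey documentation.

```python
# Figures quoted in the survey description.
districts = 118
households_per_district = 216
households_per_cluster = 9

# Derived quantities.
clusters_per_district = households_per_district // households_per_cluster
total_clusters = districts * clusters_per_district
total_households = total_clusters * households_per_cluster

print(clusters_per_district)  # 24
print(total_clusters)         # 2832
print(total_households)       # 25488
```

The derived 24 clusters per district matches the 24 sample points per stratum described under the sampling stages.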

----> Sample frame:

The listing and numbering results of the 2009-2010 Population and Housing Survey were adopted in all governorates, including the Kurdistan Region, as the frame for selecting households. The sample was selected in two stages. Stage 1: primary sampling units (blocks) within each stratum (district), for urban and rural areas, were systematically selected with probability proportional to size, to reach 2832 units (clusters). Stage 2: 9 households were selected from each primary sampling unit to create a cluster. The total sample across all clusters was thus 25488 households distributed across the governorates, with 216 households in each district.

----> Sampling Stages:

In each district, the sample was selected in two stages. Stage 1: based on the 2010 listing and numbering frame, 24 sample points were selected within each stratum through systematic sampling with probability proportional to size, in addition to an implicit breakdown by urban and rural area and a geographic breakdown (sub-district, quarter, street, county, village and block). Stage 2: using households as secondary sampling units, 9 households were selected from each sample point using systematic equal-probability sampling. The sampling frames for each stage can be developed from the 2010 building listing and numbering without updating household lists. In some small districts, random selection of primary sampling units may yield fewer than 24 distinct units, so a sampling unit may be selected more than once; where necessary, two or more clusters may be drawn from the same enumeration unit.
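The two-stage selection described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the CSO's actual procedure: the block sizes are made up, and the function names `pps_systematic` and `systematic_equal` are hypothetical. It shows systematic probability-proportional-to-size selection of 24 blocks, followed by systematic equal-probability selection of 9 households per selected block.

```python
import random
from bisect import bisect_right
from itertools import accumulate


def pps_systematic(sizes, n, rng):
    """Stage 1: pick n units systematically with probability proportional to size.

    Returns unit indices; a unit larger than the sampling interval
    can appear more than once.
    """
    step = sum(sizes) / n                 # sampling interval on the size scale
    start = rng.random() * step           # random start within the first interval
    cum = list(accumulate(sizes))         # cumulative size boundaries
    last = len(sizes) - 1
    # min(...) guards against floating-point overshoot at the upper boundary.
    return [min(bisect_right(cum, start + k * step), last) for k in range(n)]


def systematic_equal(n_units, n, rng):
    """Stage 2: pick n of n_units systematically with equal probability."""
    step = n_units / n
    start = rng.random() * step
    return [min(int(start + k * step), n_units - 1) for k in range(n)]


rng = random.Random(2012)
# Hypothetical household counts for 60 blocks in one district.
block_sizes = [rng.randint(40, 400) for _ in range(60)]

blocks = pps_systematic(block_sizes, 24, rng)
sample = [(b, systematic_equal(block_sizes[b], 9, rng)) for b in blocks]

print(sum(len(hh) for _, hh in sample))  # 216 households in the district
```

When one block dominates a small district, for example sizes `[1000, 10, 10]` with `n=3`, the same block index is returned more than once, which corresponds to drawing two or more clusters from the same enumeration unit.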

Mode of data collection

Face-to-face [f2f]

Research instrument

----> Preparation:

The questionnaire of the 2006 survey was used as the basis for the 2012 questionnaire, to which many revisions were made. Two rounds of pre-tests were carried out. Revisions were made based on feedback from the fieldwork team, World Bank consultants and others, and further revisions were made before a final version was implemented in a pilot survey in September 2011. After the pilot survey, additional revisions were made based on the challenges and feedback that emerged during its implementation, producing the final version used in the actual survey.

----> Questionnaire Parts:

The questionnaire consists of four parts, each with several sections:

Part 1: Socio-Economic Data:
- Section 1: Household Roster
- Section 2: Emigration
- Section 3: Food Rations
- Section 4: Housing
- Section 5: Education
- Section 6: Health
- Section 7: Physical measurements
- Section 8: Job seeking and previous job

Part 2: Monthly, Quarterly and Annual Expenditures:
- Section 9: Expenditures on Non-Food Commodities and Services (past 30 days).
- Section 10: Expenditures on Non-Food Commodities and Services (past 90 days).
- Section 11: Expenditures on Non-Food Commodities and Services (past 12 months).
- Section 12: Expenditures on Non-food Frequent Food Stuff and Commodities (7 days).
- Section 12, Table 1: Meals Had Within the Residential Unit.
- Section 12, Table 2: Number of Persons Participating in the Meals within Household Expenditure Other Than its Members.

Part 3: Income and Other Data:
- Section 13: Job
- Section 14: Paid jobs
- Section 15: Agriculture, forestry and fishing
- Section 16: Household non-agricultural projects
- Section 17: Income from ownership and transfers
- Section 18: Durable goods
- Section 19: Loans, advances and subsidies
- Section 20: Shocks and strategy of dealing in the households
- Section 21: Time use
- Section 22: Justice
- Section 23: Satisfaction in life
- Section 24: Food consumption during past 7 days

Part 4: Diary of Daily Expenditures: The expenditure diary is an essential component of this survey. It is left with the household to record all daily purchases, such as expenditures on food and frequent non-food items (gasoline, newspapers, etc.), over 7 days. Two pages were allocated for recording the expenditures of each day, thus the roster consists of 14 pages.

Cleaning operations

----> Raw Data:

Data Editing and Processing: To ensure accuracy and consistency, the data were edited at the following stages:
1. Interviewer: checks all answers on the household questionnaire, confirming that they are clear and correct.
2. Local Supervisor: checks that the questions have been correctly completed.
3. Statistical analysis: after exporting data files from Excel to SPSS, the Statistical Analysis Unit uses program commands to identify irregular or illogical values, in addition to auditing some variables.
4. World Bank consultants in coordination with the CSO data management team: the World Bank technical consultants use additional programs in SPSS and STATA to examine and correct remaining inconsistencies within the data files. The software detects errors by analyzing questionnaire items according to the expected parameter for each variable.

----> Harmonized Data:

The SPSS package is used to harmonize the Iraq Household Socio Economic Survey (IHSES) 2007 with Iraq Household Socio Economic Survey (IHSES) 2012.

The harmonization process starts with raw data files received from the Statistical Office.

A program is generated for each dataset to create harmonized variables.

Data are saved at the household and individual levels in SPSS and then converted to STATA for dissemination.

Response rate

The Iraq Household Socio Economic Survey (IHSES) reached a total of 25488 households. The number of households that refused to respond was 305, and the response rate was 98.6%. The highest interview rates were in Ninevah and Muthanna (100%), while the lowest rate was in Sulaimaniya (92%).
Data_Sheet_1_Raw Data Visualization for Common Factorial Designs Using SPSS:...
frontiersin.figshare.com
zip
Updated Jun 2, 2023
Cite
Florian Loffing (2023). Data_Sheet_1_Raw Data Visualization for Common Factorial Designs Using SPSS: A Syntax Collection and Tutorial.ZIP [Dataset]. http://doi.org/10.3389/fpsyg.2022.808469.s001
Available download formats: zip
Unique identifier
https://doi.org/10.3389/fpsyg.2022.808469.s001
Dataset updated
Jun 2, 2023
Dataset provided by
Frontiers
Authors
Florian Loffing
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Transparency in data visualization is an essential ingredient for scientific communication. The traditional approach of visualizing continuous quantitative data solely in the form of summary statistics (i.e., measures of central tendency and dispersion) has repeatedly been criticized for not revealing the underlying raw data distribution. Remarkably, however, systematic and easy-to-use solutions for raw data visualization using the most commonly reported statistical software package for data analysis, IBM SPSS Statistics, are missing. Here, a comprehensive collection of more than 100 SPSS syntax files and an SPSS dataset template is presented and made freely available that allow the creation of transparent graphs for one-sample designs, for one- and two-factorial between-subject designs, for selected one- and two-factorial within-subject designs as well as for selected two-factorial mixed designs and, with some creativity, even beyond (e.g., three-factorial mixed-designs). Depending on graph type (e.g., pure dot plot, box plot, and line plot), raw data can be displayed along with standard measures of central tendency (arithmetic mean and median) and dispersion (95% CI and SD). The free-to-use syntax can also be modified to match with individual needs. A variety of example applications of syntax are illustrated in a tutorial-like fashion along with fictitious datasets accompanying this contribution. The syntax collection is hoped to provide researchers, students, teachers, and others working with SPSS a valuable tool to move towards more transparency in data visualization.
UnrealGaussianStat: Synthetic dataset for statistical analysis on Novel View...
dataverse.no
search.dataone.org
txt, zip
Updated Apr 10, 2025
Cite
Anurag Dalal; Anurag Dalal (2025). UnrealGaussianStat: Synthetic dataset for statistical analysis on Novel View Synthesis [Dataset]. http://doi.org/10.18710/WSU7I6
Available download formats: txt (7447), zip (960339536)
Unique identifier
https://doi.org/10.18710/WSU7I6
Dataset updated
Apr 10, 2025
Dataset provided by
DataverseNO
Authors
Anurag Dalal; Anurag Dalal
License
https://dataverse.no/api/datasets/:persistentId/versions/2.0/customlicense?persistentId=doi:10.18710/WSU7I6
Description
The dataset comprises three dynamic scenes characterized by both simple and complex lighting conditions. The quantity of cameras ranges from 4 to 512, including 4, 6, 8, 10, 12, 14, 16, 32, 64, 128, 256, and 512. The point clouds are randomly generated.
Data from: META-SAS: A Suite of SAS Programs to Analyze Multienvironment
data.moa.gov.et
html
Updated Jan 20, 2025
Cite
CIMMYT Ethiopia (2025). META-SAS: A Suite of SAS Programs to Analyze Multienvironment [Dataset]. https://data.moa.gov.et/dataset/hdl-11529-10217
Available download formats: html
Dataset updated
Jan 20, 2025
Dataset provided by
CIMMYT Ethiopia
Description
Multienvironment trials (METs) enable the evaluation of the same genotypes under a variety of environments and management conditions. We present META (Multi Environment Trial Analysis), a suite of 31 SAS programs that analyze METs with complete or incomplete block designs, with or without adjustment by a covariate. The entire program is run through a graphical user interface. The program can produce boxplots or histograms for all traits, as well as univariate statistics. It also calculates best linear unbiased estimators (BLUEs) and best linear unbiased predictors for the main response variable and BLUEs for all other traits. For all traits, it calculates variance components by restricted maximum likelihood, least significant difference, coefficient of variation, and broad-sense heritability using PROC MIXED. The program can analyze each location separately, combine the analysis by management conditions, or combine all locations. The flexibility and simplicity of use of this program make it a valuable tool for analyzing METs in breeding and agronomy. The META program can be used by any researcher who knows only a few fundamental principles of SAS.
regression analysis
figshare.com
docx
Updated Nov 16, 2022
Cite
Victoria Saydakova (2022). regression analysis [Dataset]. http://doi.org/10.6084/m9.figshare.17069888.v1
Available download formats: docx
Unique identifier
https://doi.org/10.6084/m9.figshare.17069888.v1
Dataset updated
Nov 16, 2022
Dataset provided by
Figshare (http://figshare.com/)
Authors
Victoria Saydakova
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Regression analysis of the business environment well-being index is presented.
Variable selection, basic meaning and descriptive statistics.
plos.figshare.com
xls
Updated Jun 28, 2024
Cite
Zengjin Liu; Zhuo Yu; Jing Zhao; Xibing Han; Caixia Li; Ning Geng; Meilian Yu (2024). Variable selection, basic meaning and descriptive statistics. [Dataset]. http://doi.org/10.1371/journal.pone.0306041.t001
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0306041.t001
Dataset updated
Jun 28, 2024
Dataset provided by
PLOS ONE
Authors
Zengjin Liu; Zhuo Yu; Jing Zhao; Xibing Han; Caixia Li; Ning Geng; Meilian Yu
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Variable selection, basic meaning and descriptive statistics.
Louisville Metro KY - Officer Involved Shooting Database and Statistical...
catalog.data.gov
data.lojic.org
Updated Apr 13, 2023
Cite
Louisville/Jefferson County Information Consortium (2023). Louisville Metro KY - Officer Involved Shooting Database and Statistical Analysis 10-13-2021 [Dataset]. https://catalog.data.gov/dataset/louisville-metro-ky-officer-involved-shooting-database-and-statistical-analysis-10-13-2021
Dataset updated
Apr 13, 2023
Dataset provided by
Louisville/Jefferson County Information Consortium
Area covered
Kentucky, Louisville
Description
Officer Involved Shooting (OIS) Database and Statistical Analysis. Data is updated after there is an officer involved shooting.
PIU#
Incident # - the number associated with either the incident or used as reference to store the items in our evidence rooms
Date of Occurrence Month - month the incident occurred (Note the year is labeled on the tab of the spreadsheet)
Date of Occurrence Day - day of the month the incident occurred (Note the year is labeled on the tab of the spreadsheet)
Time of Occurrence - time the incident occurred
Address of incident - the location the incident occurred
Division - the LMPD division in which the incident actually occurred
Beat - the LMPD beat in which the incident actually occurred
Investigation Type - the type of investigation (shooting or death)
Case Status - status of the case (open or closed)
Suspect Name - the name of the suspect involved in the incident
Suspect Race - the race of the suspect involved in the incident (W-White, B-Black)
Suspect Sex - the gender of the suspect involved in the incident
Suspect Age - the age of the suspect involved in the incident
Suspect Ethnicity - the ethnicity of the suspect involved in the incident (H-Hispanic, N-Not Hispanic)
Suspect Weapon - the type of weapon the suspect used in the incident
Officer Name - the name of the officer involved in the incident
Officer Race - the race of the officer involved in the incident (W-White, B-Black, A-Asian)
Officer Sex - the gender of the officer involved in the incident
Officer Age - the age of the officer involved in the incident
Officer Ethnicity - the ethnicity of the officer involved in the incident (H-Hispanic, N-Not Hispanic)
Officer Years of Service - the number of years the officer has been serving at the time of the incident
Lethal Y/N - whether or not the incident involved a death (Y-Yes, N-No, continued-pending)
Narrative - a description of what was determined from the investigation
Contact: Carol Boyle, carol.boyle@louisvilleky.gov
Experimental statistics: fostering care datasets
data.wu.ac.at
data.europa.eu
html
Updated May 9, 2014
Cite
Ofsted (2014). Experimental statistics: fostering care datasets [Dataset]. https://data.wu.ac.at/schema/data_gov_uk/YjJkNzFhNjctOGQ3ZS00OGUwLTgyYmQtY2QyZGJkY2FlZGE4
Available download formats: html
Dataset updated
May 9, 2014
Dataset provided by
Ofsted (https://gov.uk/ofsted)
License
Open Government Licence 3.0, http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
Description
This is the experimental fostering care publication, comprising datasets.

Source agency: Office for Standards in Education, Children's Services and Skills

Designation: Experimental Official Statistics

Language: English

Alternative title: Experimental statistics: fostering care datasets
Introduction to Data Analytics
paper.erudition.co.in
html
Updated Jul 13, 2025
Cite
Einetic (2025). Introduction to Data Analytics [Dataset]. https://paper.erudition.co.in/makaut/bachelor-in-business-administration-2020-2021/5/data-analytics-skills-for-managers
Available download formats: html
Dataset updated
Jul 13, 2025
Dataset authored and provided by
Einetic
License
https://paper.erudition.co.in/terms
Description
Question Paper Solutions for the chapter Introduction to Data Analytics of Data Analytics Skills for Managers, 5th Semester, Bachelor in Business Administration 2020 - 2021
CRIME STATISTICS DATA ANALYTICS
search.dataone.org
borealisdata.ca
Updated Dec 28, 2023
Cite
Kwong, Cheryl; Anweiler, Drew; Sarafraz, Mary (2023). CRIME STATISTICS DATA ANALYTICS [Dataset]. http://doi.org/10.5683/SP2/IE6NRY
Unique identifier
https://doi.org/10.5683/SP2/IE6NRY
Dataset updated
Dec 28, 2023
Dataset provided by
Borealis
Authors
Kwong, Cheryl; Anweiler, Drew; Sarafraz, Mary
Description
Crime isn't a topic most people want to use mental energy to think about. We want to avoid harm, protect our loved ones, and hold on to what we claim is ours. So how do we remain vigilant without digging too deep into the filth that is crime? Data, of course. The focus of our study is to explore possible trends between crime and communities in the city of Calgary. Our purpose is to visualize Calgary criminal behaviour in order to help increase awareness for both citizens and law enforcement. Through the use of our visuals, individuals can make more informed decisions to improve the overall safety of their lives. Some of the main concerns of the study include: how crime rates increase with population, which areas in Calgary have the most crime, and whether crime adheres to time-sensitive patterns.
Statistics analysis table for the adjustment of income and deduction amounts...
data.gov.tw
csv
Updated Jun 4, 2025
Cite
National Taxation Bureau of Taipei,Ministry of Finance (2025). Statistics analysis table for the adjustment of income and deduction amounts at the Taipei National Tax Bureau of the Ministry of Finance [Dataset]. https://data.gov.tw/en/datasets/132629
Available download formats: csv
Dataset updated
Jun 4, 2025
Dataset authored and provided by
National Taxation Bureau of Taipei,Ministry of Finance
License
https://data.gov.tw/license
Description
The statistical analysis table for the adjustments of income and deductions provided by the Taipei National Taxation Bureau of the Ministry of Finance.
Replication data for: Collection and statistical analysis of a fixed-text...
usn.figshare.com
txt
Updated Feb 11, 2025
Cite
Halvor Nybø Risto; Olaf Hallan Graven (2025). Replication data for: Collection and statistical analysis of a fixed-text keystroke dynamics authentication data set [Dataset]. http://doi.org/10.23642/usn.23790858.v1
Available download formats: txt
Unique identifier
https://doi.org/10.23642/usn.23790858.v1
Dataset updated
Feb 11, 2025
Dataset provided by
University of South-Eastern Norway
Authors
Halvor Nybø Risto; Olaf Hallan Graven
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Data set for keystroke dynamics authentication benchmarking and research, containing 6 passwords typed by a wide set of people, with a large set of "attackers" and a smaller set of "legitimate users". This data set was collected for the paper "Collection and statistical analysis of a fixed-text keystroke dynamics authentication data set" for the CSNet23 conference.
Article Abstract: Keystroke dynamics authentication is a promising method of improving account security with minimal detriment to user convenience. While there is an abundance of research, there is a lack of available data sets. In this study, data sets for keystroke dynamics authentication were collected for a set of 6 passwords from a group of participants, and a correlation algorithm was developed to analyze and use these data sets for authentication. The experiments aim to produce data for keystroke dynamics authentication benchmarking, and to show the effect of typing speed and consistency, password length and entropy on prediction accuracy. Through simple correlation methods, the authors achieve an Equal Error Rate ranging between 2.57% and 29.7%. These results give insight into what may cause the accuracy to vary depending on the person and the password.
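For readers unfamiliar with the Equal Error Rate metric mentioned in the abstract, the sketch below shows one common way to estimate it from similarity scores. It is not the authors' correlation algorithm: the function name `equal_error_rate` and the example scores are hypothetical, and the sketch assumes that higher scores mean a closer match to the legitimate user.

```python
def equal_error_rate(genuine, impostor):
    """Estimate the EER from similarity scores (higher = more likely genuine).

    Scans every observed score as a candidate threshold and returns
    (eer, threshold) at the point where the false acceptance rate (FAR)
    and false rejection rate (FRR) are closest.
    """
    best = None
    for t in sorted(set(genuine) | set(impostor)):
        frr = sum(s < t for s in genuine) / len(genuine)      # legitimate users rejected
        far = sum(s >= t for s in impostor) / len(impostor)   # attackers accepted
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


# Made-up scores for illustration only.
genuine_scores = [0.9, 0.8, 0.7, 0.6]
impostor_scores = [0.1, 0.2, 0.3, 0.65]

eer, threshold = equal_error_rate(genuine_scores, impostor_scores)
print(eer, threshold)  # 0.25 0.65
```

A lower EER means the legitimate and attacker score distributions overlap less, which is why the abstract reports it per password and per person.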
2010 County and City-Level Water-Use Data and Associated Explanatory...
data.usgs.gov
datadiscoverystudio.org
Updated Oct 29, 2017
Cite
Scott Worland (2017). 2010 County and City-Level Water-Use Data and Associated Explanatory Variables [Dataset]. http://doi.org/10.5066/F72Z14FR
Unique identifier
https://doi.org/10.5066/F72Z14FR
Dataset updated
Oct 29, 2017
Dataset provided by
United States Geological Survey (http://www.usgs.gov/)
Authors
Scott Worland
License
U.S. Government Works, https://www.usa.gov/government-works
License information was derived automatically
Time period covered
2014
Description
This data release contains the input-data files and R scripts associated with the analysis presented in [citation of manuscript]. The spatial extent of the data is the contiguous U.S. The input-data files include one comma separated value (csv) file of county-level data, and one csv file of city-level data. The county-level csv (“county_data.csv”) contains data for 3,109 counties. This data includes two measures of water use, descriptive information about each county, three grouping variables (climate region, urban class, and economic dependency), and contains 18 explanatory variables: proportion of population growth from 2000-2010, fraction of withdrawals from surface water, average daily water yield, mean annual maximum temperature from 1970-2010, 2005-2010 maximum temperature departure from the 40-year maximum, mean annual precipitation from 1970-2010, 2005-2010 mean precipitation departure from the 40-year mean, Gini income disparity index, percent of county population with a ...
Strongly Basic Anion Exchange Resin Market By Type (Type I, Type II), By...
statsndata.org
excel, pdf
Updated May 2025
Cite
Stats N Data (2025). Strongly Basic Anion Exchange Resin Market By Type (Type I, Type II), By Application (Purification of Pharmaceutical, Water Treatment, Purification of Food) Industry Analysis: Future Opportunities and Trends 2025-2032 [Dataset]. https://www.statsndata.org/report/strongly-basic-anion-exchange-resin-market-330354
Available download formats: excel, pdf
Dataset updated
May 2025
Dataset authored and provided by
Stats N Data
License
https://www.statsndata.org/how-to-order
Area covered
Global
Description
The Strongly Basic Anion Exchange Resin market is expected to see major growth from 2024 to 2031, with projections indicating it will reach USD [XX] million by the end of the forecast period. Starting from a base of USD [XX] million in 2024, the Strongly Basic Anion Exchange Resin Industry is set to expand at a compoun
SAOD- Statistical Analysis of Omics Data
explore.openaire.eu
Updated Jun 23, 2023
Cite
Denis Puthier (2023). SAOD- Statistical Analysis of Omics Data [Dataset]. http://doi.org/10.5281/zenodo.10649694
Unique identifier
https://doi.org/10.5281/zenodo.10649694
Dataset updated
Jun 23, 2023
Authors
Denis Puthier
Description
Some datasets for the SAOD (Statistical Analysis of Omics Data) course (Aix-Marseille Université, D. Puthier). The Homo_sapiens.GRCh38.110.chr.tsv was produced using the following command: gtftk retrieve -r 110 gtftk convert_ensembl -i Homo_sapiens.GRCh38.110.chr.gtf.gz | gtftk nb_exons | gtftk feature_size -t mature_rna | gtftk feature_size -t transcript -k tx_genomic_size | gtftk exon_sizes | gtftk intron_sizes | gtftk select_by_key -t | gtftk tabulate -k '*' -u -x > Homo_sapiens.GRCh38.110.chr.tsv
Global Basic Starch Market Future Outlook 2025-2032
statsndata.org
excel, pdf
Updated Jun 2025
Cite
Stats N Data (2025). Global Basic Starch Market Future Outlook 2025-2032 [Dataset]. https://www.statsndata.org/report/basic-starch-market-18567
Available download formats: pdf, excel
Dataset updated
Jun 2025
Dataset authored and provided by
Stats N Data
License
https://www.statsndata.org/how-to-order
Area covered
Global
Description
The Basic Starch market is an essential segment of the global food and industrial sectors, encompassing a variety of starches derived primarily from corn, wheat, tapioca, and potatoes. These starches are integral to countless applications, serving not only as thickening and binding agents in food products but also a
Statistical analysis of food poisoning cases by the number of patients in...
data.gov.tw
csv, json, xml
Updated Jun 2, 2025
Cite
Food and Drug Administration (2025). Statistical analysis of food poisoning cases by the number of patients in the place of consumption [Dataset]. https://data.gov.tw/en/datasets/9839
Available download formats: csv, json, xml
Dataset updated
Jun 2, 2025
Dataset authored and provided by
Food and Drug Administration (http://www.fda.gov/)
License
https://data.gov.tw/license
Description
This dataset provides statistics on the number of food poisoning cases in eating places after the year 1981. It is available for use by the general public, industry, academic institutions, and others.
Data from: SPEED Stat: a free, intuitive, and minimalist spreadsheet program...
figshare.com
scielo.figshare.com
xls
Updated Mar 26, 2021
Cite
André Mundstock Xavier de Carvalho; Felipe Queiroz Mendes; Fabrícia Queiroz Mendes; Laene de Fátima Tavares (2021). SPEED Stat: a free, intuitive, and minimalist spreadsheet program for statistical analyses of experiments [Dataset]. http://doi.org/10.6084/m9.figshare.14328730.v1
Available download formats: xls
Unique identifier
https://doi.org/10.6084/m9.figshare.14328730.v1
Dataset updated
Mar 26, 2021
Dataset provided by
SciELO journals
Authors
André Mundstock Xavier de Carvalho; Felipe Queiroz Mendes; Fabrícia Queiroz Mendes; Laene de Fátima Tavares
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Abstract: SPEED Stat is a new spreadsheet program for univariate statistical analyses, focused on the dominant profile of agricultural experimentation. The program can perform analysis of variance; tests for normality, homoscedasticity, additivity, outliers; complex contrasts; multiple comparison tests; Scott-Knott's grouping analysis; regression analysis; and others. It is available at speedstatsoftware.wordpress.com.
Data from: Environmental impact assessment for large carnivores: a...
data.niaid.nih.gov
search.dataone.org
zip
Updated Apr 19, 2024
Cite
Gonçalo Ferrão da Costa; Miguel Mascarenhas; Carlos Fonseca; Chris Sutherland (2024). Environmental impact assessment for large carnivores: a methodological review of the wolf (Canis lupus) monitoring in Portugal [Dataset]. http://doi.org/10.5061/dryad.t1g1jwt87
Available download formats: zip
Unique identifier
https://doi.org/10.5061/dryad.t1g1jwt87
Dataset updated
Apr 19, 2024
Dataset provided by
University of St Andrews
BE Bioinsight & Ecoa
University of Aveiro
Authors
Gonçalo Ferrão da Costa; Miguel Mascarenhas; Carlos Fonseca; Chris Sutherland
License
https://spdx.org/licenses/CC0-1.0.html
Area covered
Portugal
Description
The continuous growth of the global human population results in increased use and change of landscapes, with infrastructures like transportation or energy facilities, being a particular risk to large carnivores. Environmental Impact Assessments were established to identify the probable environmental consequences of any new proposed project, find ways to reduce impacts, and provide evidence to inform decision making and mitigation. Portugal has a wolf population of around 300 individuals, designated as an endangered species with full legal protection. They occupy the northern mountainous areas of the country which has also been the focus of new human infrastructures over the last 20 years. Consequently, dozens of wolf monitoring programs have been established to evaluate wolf population status, to identify impacts, and to inform appropriate mitigation or compensation measures. We reviewed Portuguese wolf monitoring programs to answer four key questions: do wolf programs examine adequate biological parameters to meet monitoring objectives? is the study design suitable for measuring impacts? are data collection methods and effort sufficient for the stated inference objectives? and do statistical analyses of the data lead to robust conclusions? Overall, we found a mismatch between the stated aims of wolf monitoring and the results reported, and often neither aligns with the existing national wolf monitoring guidelines. Despite the vast effort expended and the diversity of methods used, data analysis makes almost exclusive use of relative indices or summary statistics, with little consideration of the potential biases that arise through the (imperfect) observational process. This makes comparisons of impacts across space and time difficult and is therefore unlikely to contribute to a general understanding of wolf responses to infrastructure-related disturbance. 
We recommend the development of standardized monitoring protocols and advocate for the use of statistical methods that account for imperfect detection to guarantee accuracy, reproducibility, and efficacy of the programs.
Methods
We reviewed all major wolf monitoring programs developed for environmental impact assessments in Portugal since 2002 (Table S1, Supplementary material). Given that the focus here is on the adequacy of targeted wolf monitoring for delivering conclusions about the effects of infrastructure development, we reviewed only monitoring programs that were specifically designed for wolves and not those concerned with general mammalian assessment. The starting point was a compilation from the 2019-2021 National Wolf Census (Pimenta et al., 2023), where every wolf monitoring program that occurred between 2014 and 2019 in Portugal was identified. The list was completed with projects that started before 2014 or after 2019 based on personal knowledge and inquiries to principal scientific teams, governmental agencies, and EIA consultants. Depending on duration, wolf monitoring programs can produce several, usually annual, reports that are not peer-reviewed and do not appear on standard search engines (e.g., Web of Science or Google Scholar) but are publicly available from the Portuguese Environmental Agency (APA – www.apambiente.pt). We conducted an online search on APA's search engine (https://siaia.apambiente.pt/) and identified a total of 30 projects. For each of these projects, we were interested in the first and the last report to identify any methodological changes. If the last report was not present, we reviewed the most recent one. If no report was present, we requested it from the team responsible.
Our investigation centred on characterizing and quantifying four components of wolf monitoring programs that are interlinked and that should be ideally determined by the initial objectives: (1) biological parameters, i.e., what wolf parameters were studied to assess impacts; (2) study design, i.e., what sampling schemes were followed to collect and analyse data; (3) data collection, i.e., which sampling methodology and how much effort was used to collect data; and (4) data analysis, i.e., how data were analysed to estimate relevant parameters and assess impact. Biological parameters were identified and classified under two categories: occurrence and demography, which broadly correspond to the necessary inputs to assess impacts like exclusion effect and changes in reproductive patterns. Occurrence-related parameters refer to variables used to measure the presence or absence of wolves, whereas demographic parameters refer to variables that intend to measure population-level effects such as abundance, density, survival, or reproduction. We also recorded whether any effort was made to quantify prey population distribution or abundance as recommended in the guidelines. For study design, we reviewed the sampling design of the project, with specific focus on the spatial and temporal aspect of the study such as total area surveyed, the definition of a sampling site within this region (i.e., resolution), the duration of the study and the number of sampling seasons. The goal here was to determine whether the sampling scheme used was appropriate for assessing infrastructure impacts on wolf distribution or demography, depending on what the focus was. For data collection, we identified the main data collection methodologies used and the corresponding sampling effort. By far the most frequent method used is sign surveys, and specifically scat surveys, and for these studies we recorded whether genetic identification of species or individuals based on faecal DNA was attempted. 
We compare how sampling effort varies by the various inference objectives and, as above, assess which, if any, project or data collection approach is most likely to produce evidence of impact. We divided the Analysis component into two groups: single-year and multi-year analyses. For single-year analysis we identified how monitoring projects used data to make inferences about the state biological parameters of interest and discuss the associated strengths and weaknesses. For multi-year analyses, we recorded how differences or trends were quantified and associated with infrastructure impacts, commenting on the statistical robustness of the analyses used across the projects.
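The concern about imperfect detection raised in the description can be illustrated with a small calculation. This is a minimal sketch, not the authors' analysis: it assumes a constant per-visit detection probability `p`, independent visits, and a true occupancy `psi`, all of which are hypothetical symbols introduced here for illustration.

```python
def detection_probability(p: float, k: int) -> float:
    """Probability of detecting wolves at least once in k independent visits,
    given a per-visit detection probability p (assumed constant)."""
    return 1.0 - (1.0 - p) ** k


def expected_naive_occupancy(psi: float, p: float, k: int) -> float:
    """Expected share of sites with at least one detection when the true
    occupancy is psi. This is what a raw presence index would report."""
    return psi * detection_probability(p, k)


# Hypothetical values chosen only to illustrate the bias.
true_occupancy = 0.6
naive = expected_naive_occupancy(true_occupancy, p=0.3, k=3)
print(f"true occupancy: {true_occupancy:.2f}, expected naive estimate: {naive:.4f}")
```

Under these assumptions the raw index falls below the true occupancy whenever detection over the survey is less than certain, which is one reason indices from programs with different effort are hard to compare.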
"9,565 Top-Rated Movies Dataset"
kaggle.com
Updated Aug 19, 2024
Harshit@85 (2024). "9,565 Top-Rated Movies Dataset" [Dataset]. https://www.kaggle.com/datasets/harshit85/9565-top-rated-movies-dataset
Dataset updated
Aug 19, 2024
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Harshit@85
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
About the Dataset

Title: 9,565 Top-Rated Movies Dataset

Description:
This dataset offers a collection of 9,565 of the highest-rated movies according to audience ratings on The Movie Database (TMDb). The dataset includes detailed information about each movie, such as its title, overview, release date, popularity score, average vote, and vote count. It is designed to be a resource for anyone interested in exploring trends in popular cinema, analyzing factors that contribute to a movie’s success, or building recommendation engines.

Key Features:
- Title: The official title of each movie.
- Overview: A brief synopsis or description of the movie's plot.
- Release Date: The release date of the movie, formatted as YYYY-MM-DD.
- Popularity: A score indicating the current popularity of the movie on TMDb, which can be used to gauge current interest.
- Vote Average: The average rating of the movie, based on user votes.
- Vote Count: The total number of votes the movie has received.

Data Source: The data was sourced from the TMDb API, a well-regarded platform for movie information, using the /movie/top_rated endpoint. The dataset represents a snapshot of the highest-rated movies as of the time of data collection.

Data Collection Process:
- API Access: Data was retrieved programmatically using TMDb’s API.
- Pagination Handling: Multiple API requests were made to cover all pages of top-rated movies, ensuring the dataset’s comprehensiveness.
- Data Aggregation: Collected data was aggregated into a single, unified dataset using the pandas library.
- Cleaning: Basic data cleaning was performed to remove duplicates and handle missing or malformed data entries.
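The collection steps above can be sketched as follows. This is an illustration under stated assumptions, not the dataset author's script: the `fetch_page` callable is hypothetical and stands in for an HTTP request to the /movie/top_rated endpoint, the response is assumed to carry `results` and `total_pages` keys, and plain Python lists are used here in place of pandas so the sketch runs without third-party packages.

```python
from typing import Callable, Dict, Iterable, Iterator, List

# Assumed column names, mirroring the key features listed in the description.
FIELDS = ("id", "title", "overview", "release_date",
          "popularity", "vote_average", "vote_count")


def iter_top_rated(fetch_page: Callable[[int], Dict]) -> Iterator[Dict]:
    """Walk every page of a paginated listing.

    fetch_page is a hypothetical callable: given a 1-based page number it
    returns a dict with 'results' (list of movies) and 'total_pages'.
    """
    page = 1
    while True:
        payload = fetch_page(page)
        yield from payload.get("results", [])
        if page >= payload.get("total_pages", 1):
            break
        page += 1


def clean(movies: Iterable[Dict]) -> List[Dict]:
    """Drop duplicate ids and rows without a title; keep only FIELDS."""
    seen = set()
    rows = []
    for movie in movies:
        movie_id = movie.get("id")
        if movie_id is None or movie_id in seen or not movie.get("title"):
            continue
        seen.add(movie_id)
        rows.append({field: movie.get(field) for field in FIELDS})
    return rows


def make_fake_fetcher(pages: List[List[Dict]]) -> Callable[[int], Dict]:
    """Offline stand-in for the API, used only to exercise the sketch."""
    def fetch(page: int) -> Dict:
        return {"page": page, "results": pages[page - 1],
                "total_pages": len(pages)}
    return fetch


# Invented placeholder rows, not taken from the dataset.
fake_pages = [
    [{"id": 1, "title": "Movie A"}, {"id": 2, "title": "Movie B"}],
    [{"id": 2, "title": "Movie B"}, {"id": 3, "title": None}],
]
dataset = clean(iter_top_rated(make_fake_fetcher(fake_pages)))
print([row["id"] for row in dataset])
```

Passing the fetcher in as a parameter keeps the pagination logic testable offline; a real fetcher would need an API key and network access.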

Potential Uses:
- Trend Analysis: Analyze trends in movie ratings over time or compare ratings across different genres.
- Recommendation Systems: Build and train models to recommend movies based on user preferences.
- Sentiment Analysis: Perform text analysis on movie overviews to understand common themes and sentiments.
- Statistical Analysis: Explore the relationship between popularity, vote count, and average ratings.

Data Format: The dataset is provided in a structured tabular format (e.g., CSV), making it easy to load into data analysis tools like Python, R, or Excel.
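As a rough illustration of loading the tabular file and exploring the relationship between vote count and average rating, the sketch below reads CSV text with the standard library and computes a Pearson correlation. The column names (`title`, `release_date`, `popularity`, `vote_average`, `vote_count`) are assumptions based on the feature list, and the sample rows are invented placeholders rather than values from the dataset.

```python
import csv
import io
import math
from typing import Dict, List, Sequence, TextIO

# Invented placeholder rows; the real file's header may differ.
SAMPLE_CSV = """title,release_date,popularity,vote_average,vote_count
Movie A,2001-05-10,12.5,6.0,100
Movie B,2005-11-02,30.1,7.0,200
Movie C,2010-03-21,18.7,8.0,300
Movie D,2019-08-15,55.0,9.0,400
"""


def load_movies(handle: TextIO) -> List[Dict]:
    """Read CSV rows and convert the numeric columns to numbers."""
    rows = []
    for row in csv.DictReader(handle):
        rows.append({
            "title": row["title"],
            "release_year": int(row["release_date"][:4]),
            "popularity": float(row["popularity"]),
            "vote_average": float(row["vote_average"]),
            "vote_count": int(row["vote_count"]),
        })
    return rows


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


movies = load_movies(io.StringIO(SAMPLE_CSV))
r = pearson([m["vote_count"] for m in movies],
            [m["vote_average"] for m in movies])
print(f"{len(movies)} rows, correlation(vote_count, vote_average) = {r:.3f}")
```

With the real file, `io.StringIO(SAMPLE_CSV)` would be replaced by an opened file handle, and rows with missing values would need handling before conversion.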

Usage License: The dataset is shared under [appropriate license], ensuring that it can be used for educational, research, or commercial purposes, with proper attribution to the data source (TMDb).

This description provides a clear and detailed overview, helping potential users understand the dataset's content, origin, and potential applications.

Central Statistical Organization (CSO) (2017). Household Health Survey 2012-2013, Economic Research Forum (ERF) Harmonization Data - Iraq [Dataset]. https://datacatalog.ihsn.org/catalog/6937

Household Health Survey 2012-2013, Economic Research Forum (ERF) Harmonization Data - Iraq

Dataset updated

Jun 26, 2017

Dataset provided by

Kurdistan Regional Statistics Office (KRSO)
Economic Research Forum
Central Statistical Organization (CSO)

Time period covered

2012 - 2013

Description

Abstract

The harmonized data set on health, created and published by the ERF, is a subset of the Iraq Household Socio Economic Survey (IHSES) 2012. It was derived from the household, individual and health modules collected in the context of the above-mentioned survey. The sample was then used to create a harmonized health survey, comparable with the Iraq Household Socio Economic Survey (IHSES) 2007 micro data set.

----> Overview of the Iraq Household Socio Economic Survey (IHSES) 2012:

Iraq is considered a leader in household expenditure and income surveys: the first was conducted in 1946, followed by surveys in 1954 and 1961. After the establishment of the Central Statistical Organization, household expenditure and income surveys were carried out every 3-5 years (1971/1972, 1976, 1979, 1984/1985, 1988, 1993, 2002/2007). As part of the cooperation between the CSO and the World Bank (WB), the Central Statistical Organization (CSO) and the Kurdistan Region Statistics Office (KRSO) launched fieldwork on IHSES on 1/1/2012. The survey was carried out over a full year, covering all governorates including those in Kurdistan Region.

The survey has six main objectives. These objectives are:

Provide data for poverty analysis and measurement, and monitor, evaluate and update the implementation of the Poverty Reduction National Strategy issued in 2009.
Provide a comprehensive data system to assess household social and economic conditions and prepare indicators related to human development.
Provide data that meet the needs and requirements of national accounts.
Provide detailed indicators on consumption expenditure that support decision-making related to production, consumption, export and import.
Provide detailed indicators on the sources of household and individual income.
Provide the data necessary for formulating a new consumer price index number.

The raw survey data provided by the Statistical Office were then harmonized by the Economic Research Forum to create a version comparable with the 2006/2007 Household Socio Economic Survey in Iraq. Harmonization at this stage only included unifying variables' names, labels and some definitions. See "Iraq 2007 & 2012- Variables Mapping & Availability Matrix.pdf", provided in the external resources, for further information on the mapping of the original variables onto the harmonized ones, as well as indications of the variables' availability in both survey years and relevant comments.

Geographic coverage

National coverage: Covering a sample of urban, rural and metropolitan areas in all the governorates including those in Kurdistan Region.

Analysis unit

1- Household/family. 2- Individual/person.

Universe

The survey was carried out over a full year covering all governorates including those in Kurdistan Region.

Kind of data

Sample survey data [ssd]

Sampling procedure

----> Design:

The sample size was 25488 households for the whole of Iraq: 216 households in each of 118 districts, organized into 2832 clusters of 9 households each, distributed across districts and governorates for rural and urban areas.
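The figures in the design statement are mutually consistent, which a short check makes explicit. The variable names below are introduced only for illustration; the numbers are those stated in the text.

```python
# Figures quoted in the sampling design.
districts = 118
households_per_district = 216
clusters_total = 2832
households_per_cluster = 9
households_total = 25488

# Derived quantities.
clusters_per_district = clusters_total // districts

# Each identity below restates one sentence of the design in arithmetic.
assert clusters_total % districts == 0
assert districts * households_per_district == households_total
assert clusters_total * households_per_cluster == households_total
assert clusters_per_district * households_per_cluster == households_per_district

print(f"clusters per district: {clusters_per_district}")
```

The derived value of 24 clusters per district matches the 24 sample points per stratum described under the sampling stages.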

----> Sample frame:

The listing and numbering results of the 2009-2010 Population and Housing Survey were adopted in all governorates, including Kurdistan Region, as the frame for selecting households. The sample was selected in two stages. Stage 1: Primary sampling units (blocks) within each stratum (district), for urban and rural areas, were systematically selected with probability proportional to size to reach 2832 units (clusters). Stage 2: 9 households were selected from each primary sampling unit to create a cluster. The total sample was thus 25488 households distributed across the governorates, with 216 households in each district.

----> Sampling Stages:

In each district, the sample was selected in two stages. Stage 1: Based on the 2010 listing and numbering frame, 24 sample points were selected within each stratum through systematic sampling with probability proportional to size, in addition to the implicit breakdown by urban and rural and the geographic breakdown (sub-district, quarter, street, county, village and block). Stage 2: Using households as secondary sampling units, 9 households were selected from each sample point using systematic equal probability sampling. Sampling frames for each stage can be developed from the 2010 building listing and numbering without updating household lists. In some small districts, the random selection of primary sampling units may yield fewer than 24 distinct units, in which case a sampling unit is selected more than once; where necessary, two or more clusters may be drawn from the same enumeration unit.
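The two stages described above can be sketched in a few lines. This is a generic textbook version of systematic probability-proportional-to-size selection followed by systematic equal-probability selection, not the survey's actual procedure: the function names, block sizes and household lists are hypothetical, and the implicit stratification by urban/rural and geography is omitted.

```python
import random
from itertools import accumulate
from typing import List, Sequence


def systematic_pps(sizes: Sequence[int], n: int, rng: random.Random) -> List[int]:
    """Stage 1: pick n unit indices with probability proportional to size.

    A unit larger than the sampling interval can be hit more than once,
    which mirrors the remark about small districts in the text.
    """
    total = sum(sizes)
    step = total / n
    start = rng.random() * step          # random start in [0, step)
    cumulative = list(accumulate(sizes))
    picks = []
    idx = 0
    for k in range(n):
        point = start + k * step
        # Advance to the unit whose cumulative size range contains the point.
        while cumulative[idx] <= point and idx < len(cumulative) - 1:
            idx += 1
        picks.append(idx)
    return picks


def systematic_equal(households: Sequence, m: int, rng: random.Random) -> List:
    """Stage 2: pick m households at a fixed interval from a random start."""
    step = len(households) / m
    start = rng.random() * step
    last = len(households) - 1
    return [households[min(int(start + k * step), last)] for k in range(m)]


rng = random.Random(0)                    # fixed seed so the sketch is repeatable
block_sizes = [50, 120, 30, 400, 80, 60]  # hypothetical household counts per block
sample_points = systematic_pps(block_sizes, n=24, rng=rng)
cluster = systematic_equal(list(range(block_sizes[sample_points[0]])), m=9, rng=rng)
print(sample_points)
print(cluster)
```

In this toy district there are only six blocks for 24 sample points, so large blocks are selected repeatedly, which is the situation the text describes for small districts.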

Mode of data collection

Face-to-face [f2f]

Research instrument

----> Preparation:

The questionnaire of the 2006 survey was used as the basis for designing the questionnaire of the 2012 survey, with many revisions. Two rounds of pre-tests were carried out. Revisions were made based on feedback from the fieldwork team, World Bank consultants and others, and further revisions were made before the final version was implemented in a pilot survey in September 2011. After the pilot survey, additional revisions were made based on the challenges and feedback that emerged during its implementation, leading to the final version used in the actual survey.

----> Questionnaire Parts:

The questionnaire consists of four parts, each with several sections:

Part 1: Socio-Economic Data:
- Section 1: Household Roster
- Section 2: Emigration
- Section 3: Food Rations
- Section 4: Housing
- Section 5: Education
- Section 6: Health
- Section 7: Physical measurements
- Section 8: Job seeking and previous job

Part 2: Monthly, Quarterly and Annual Expenditures:
- Section 9: Expenditures on Non-Food Commodities and Services (past 30 days).
- Section 10: Expenditures on Non-Food Commodities and Services (past 90 days).
- Section 11: Expenditures on Non-Food Commodities and Services (past 12 months).
- Section 12: Expenditures on Non-food Frequent Food Stuff and Commodities (7 days).
- Section 12, Table 1: Meals Had Within the Residential Unit.
- Section 12, Table 2: Number of Persons Participate in the Meals within Household Expenditure Other Than its Members.

Part 3: Income and Other Data:
- Section 13: Job
- Section 14: Paid jobs
- Section 15: Agriculture, forestry and fishing
- Section 16: Household non-agricultural projects
- Section 17: Income from ownership and transfers
- Section 18: Durable goods
- Section 19: Loans, advances and subsidies
- Section 20: Shocks and strategy of dealing in the households
- Section 21: Time use
- Section 22: Justice
- Section 23: Satisfaction in life
- Section 24: Food consumption during past 7 days

Part 4: Diary of Daily Expenditures: The expenditure diary is an essential component of this survey. It is left with the household to record all daily purchases, such as expenditures on food and frequent non-food items (gasoline, newspapers, etc.), over 7 days. Two pages were allocated for recording the expenditures of each day, so the diary consists of 14 pages.

Cleaning operations

----> Raw Data:

Data Editing and Processing: To ensure accuracy and consistency, the data were edited at the following stages:
1. Interviewer: Checks all answers on the household questionnaire, confirming that they are clear and correct.
2. Local Supervisor: Checks that questions have been correctly completed.
3. Statistical analysis: After exporting data files from Excel to SPSS, the Statistical Analysis Unit uses program commands to identify irregular or non-logical values, in addition to auditing some variables.
4. World Bank consultants in coordination with the CSO data management team: The World Bank technical consultants use additional programs in SPSS and STATA to examine and correct remaining inconsistencies within the data files.
The software detects errors by analyzing questionnaire items according to the expected parameter for each variable.

----> Harmonized Data:

The SPSS package is used to harmonize the Iraq Household Socio Economic Survey (IHSES) 2007 with Iraq Household Socio Economic Survey (IHSES) 2012.
The harmonization process starts with raw data files received from the Statistical Office.
A program is generated for each dataset to create harmonized variables.
Data are saved at the household and individual levels in SPSS and then converted to STATA for dissemination.

Response rate

The Iraq Household Socio Economic Survey (IHSES) reached a total of 25488 households. The number of households that refused to respond was 305, and the response rate was 98.6%. The highest interview rates were in Ninevah and Muthanna (100%), while the lowest rate was in Sulaimaniya (92%).
