These are synthetically generated unit-level and area-level population and sample data that can be used for testing model-based small area estimation methods. To prevent disclosure issues, the datasets have been generated by repeated (Monte Carlo) sampling of real EU-SILC (European Union Statistics on Income and Living Conditions) data for Austria. The data include geographical identifiers and can be used for fitting unit-level (Battese, Harter and Fuller type) models and area-level (Fay-Herriot type) models. The datasets are part of the R package emdi. Examples of the use of the data can be found in the emdi manual, available via https://cran.r-project.org/web/packages/emdi/emdi.pdf, and in Kreutzmann et al. (2019).
Kreutzmann, A. K., Pannier, S., Rojas-Perilla, N., Schmid, T., Templ, M., & Tzavidis, N. (2019). The R package emdi for the estimation and mapping of regional disaggregated indicators. Journal of Statistical Software, 91(7). https://doi.org/10.18637/jss.v091.i07
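As a rough illustration of what an area-level (Fay-Herriot type) model does with data of this kind, the sketch below shrinks direct area estimates toward regression-based predictions. It is written in Python rather than R and does not use emdi. The function name, the assumption that the between-area variance `sigma2_u` is already known, and all numbers are hypothetical; in practice the model parameters are estimated from the data.

```python
def fay_herriot_shrinkage(direct, sampling_var, synthetic, sigma2_u):
    """Combine direct area estimates with model-based (synthetic) predictions.

    direct       : direct survey estimates, one per area
    sampling_var : sampling variances of the direct estimates (treated as known)
    synthetic    : regression predictions for each area
    sigma2_u     : between-area variance, assumed known in this sketch
    """
    estimates = []
    for y, psi, xb in zip(direct, sampling_var, synthetic):
        gamma = sigma2_u / (sigma2_u + psi)  # weight given to the direct estimate
        estimates.append(gamma * y + (1.0 - gamma) * xb)
    return estimates


# Hypothetical values for three areas (not taken from the emdi datasets).
direct = [12000.0, 15000.0, 9000.0]
sampling_var = [4.0e5, 1.0e5, 9.0e5]
synthetic = [11000.0, 14000.0, 10000.0]
print(fay_herriot_shrinkage(direct, sampling_var, synthetic, sigma2_u=3.0e5))
```

Areas with a large sampling variance are pulled more strongly toward the synthetic prediction, which is the main reason such models stabilise estimates for sparsely sampled areas.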
Reliable statistics are crucial for policy-relevant research. Small Area Estimation (SAE) methods generate robust, reliable and consistent statistics at geographical scales for which survey data are either non-existent or too sparse to provide direct estimates of acceptable accuracy. The last decade has seen a rapid increase in the use of SAE. Statistical agencies and governmental organisations are actively developing their own suites of estimates. In the UK, the Office for National Statistics (ONS) has responded to user demands by producing estimates of average household income for wards and using SAE to answer queries from local authorities, policy advisers and government departments. The Welsh Assembly Government (WAG) is actively seeking to develop capacity for SAE. Public Health England produces SAEs of adolescent smoking and chronic kidney disease. Initial demands for small area statistics are now shifting to requirements for more complex statistics that extend beyond averages and proportions to encompass estimates of statistical distributions, multidimensional indicators (e.g. inequality and deprivation indicators) and methods for replacing the Census and adjusting Census results for undercount. These developing requirements pose significant methodological and applied real-world challenges. These challenges are deepened by different methodological approaches to SAE remaining largely unconnected, locked in disciplinary silos. The technical presentation of SAE also impedes more widespread uptake by social scientists and understanding by users. The proposed programme of work aims to (a) develop novel SAE methodologies to better serve the needs of users and producers of SAE, (b) bridge different methodological approaches to SAE, (c) apply SAE to answer substantive questions in the social sciences and (d) 'mainstream' SAE within the quantitative social sciences through the creation of methodologically comprehensive and accessible resources.
The project comprises three work packages of methodological innovative research designed to deepen the understanding of SAE and achieve the aforementioned aims. The project will capitalise on a cross-disciplinary research team drawn together through an NCRM methodological network and reflecting a large part of the SAE expertise in the UK. Through long-standing collaborations with national and international agencies in the UK, Mexico and Brazil, which are placed at the centre of the project, we enjoy access to individual level secondary survey and Census data. Collaboration with key SAE users will ensure that the project remains relevant to user needs and that methodologies are used for expanding the set of small area statistics currently available. The involvement of international experts ensures the quality and relevance of the research. Substantive outputs will include SAEs of attributes of interest to users, including income, inequality, deprivation, health, ethnicity and a realistic pseudo-Census dataset for use by other researchers. The project will advance knowledge across disciplines in the social sciences including social statistics, applied economics, human geography and sociology. It will additionally impact on the production of official and Census statistics. The project is committed to adding value to NCRM's training and capacity building activities by developing new resources.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset tabulates the population of Rutherford College by gender across 18 age groups. It lists the male and female population in each age group along with the gender ratio for Rutherford College. The dataset can be used to understand the population distribution of Rutherford College by gender and age. For example, it can be used to identify the largest age group for both men and women in Rutherford College. It can also be used to see how the male-to-female ratio changes across age groups, from birth to the oldest age group.
Key observations
Largest age group (population): Male # 60-64 years (98) | Female # 30-34 years (70). Source: U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.
When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.
Age groups:
Scope of gender:
Please note that the American Community Survey asks a question about the respondent's current sex, but not about gender, sexual orientation, or sex at birth. The question is intended to capture data for biological sex, not gender. Respondents are expected to answer either Male or Female. Our research and this dataset mirror the data reported as Male and Female for gender distribution analysis.
Variables / Data Columns
Good to know
Margin of Error
Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.
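To make the margin-of-error caveat concrete, the short Python sketch below converts an ACS-style margin of error into a standard error and an interval. ACS margins of error are published at the 90 percent confidence level, which corresponds to a multiplier of 1.645. The function name and the margin of error used in the example are hypothetical; this dataset description does not list the actual margins.

```python
ACS_Z_90 = 1.645  # ACS margins of error are published at the 90% confidence level


def acs_interval(estimate, moe):
    """Return (standard_error, lower, upper) for an estimate and its 90% MOE."""
    standard_error = moe / ACS_Z_90
    lower = max(0.0, estimate - moe)  # a population count cannot be negative
    upper = estimate + moe
    return standard_error, lower, upper


# 98 is the male 60-64 estimate quoted above; the MOE of 40 is a made-up value.
se, lower, upper = acs_interval(98, 40)
print(f"SE = {se:.1f}, 90% interval = ({lower:.0f}, {upper:.0f})")
```

For small places, intervals of this width are common, which is why comparisons between adjacent age groups should be made cautiously.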
Custom data
If you need custom data for a research project, report or presentation, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.
The Neilsberg Research team curates, analyzes and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research's aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.
This dataset is a part of the main dataset for Rutherford College Population by Gender.
Financial overview and grant giving statistics of Metro Ideas Project
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset tabulates the population of Rutherford College by gender, including both male and female populations. This dataset can be utilized to understand the population distribution of Rutherford College across both sexes and to determine which sex constitutes the majority.
Key observations
There is a slight majority of female population, with 50.26% of total population being female. Source: U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.
When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.
Scope of gender:
Please note that the American Community Survey asks a question about the respondent's current sex, but not about gender, sexual orientation, or sex at birth. The question is intended to capture data for biological sex, not gender. Respondents are expected to answer either Male or Female. Our research and this dataset mirror the data reported as Male and Female for gender distribution analysis. No further analysis is done on the data reported by the Census Bureau.
Variables / Data Columns
Good to know
Margin of Error
Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.
Custom data
If you need custom data for a research project, report or presentation, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.
The Neilsberg Research team curates, analyzes and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research's aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.
This dataset is a part of the main dataset for Rutherford College Population by Race & Ethnicity.
Spatial analysis and statistical summaries of the Protected Areas Database of the United States (PAD-US) provide land managers and decision makers with a general assessment of management intent for biodiversity protection, natural resource management, and recreation access across the nation. The PAD-US 3.0 Combined Fee, Designation, Easement feature class (with Military Lands and Tribal Areas from the Proclamation and Other Planning Boundaries feature class) was modified to remove overlaps, avoiding overestimation in protected area statistics and to support user needs. A Python scripted process ("PADUS3_0_CreateVectorAnalysisFileScript.zip") associated with this data release prioritized overlapping designations (e.g. Wilderness within a National Forest) based upon their relative biodiversity conservation status (e.g. GAP Status Code 1 over 2), public access values (in the order of Closed, Restricted, Open, Unknown), and geodatabase load order (records are deliberately organized in the PAD-US full inventory with fee owned lands loaded before overlapping management designations, and easements). The Vector Analysis File ("PADUS3_0VectorAnalysisFile_ClipCensus.zip") associated item of PAD-US 3.0 Spatial Analysis and Statistics ( https://doi.org/10.5066/P9KLBB5D ) was clipped to the Census state boundary file to define the extent and serve as a common denominator for statistical summaries. 
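The prioritization of overlapping designations described above can be sketched as a simple sort key. This is only an illustration in plain Python, not the released "PADUS3_0_CreateVectorAnalysisFileScript.zip" process. The field names (`gap_status`, `access`, `load_order`) are hypothetical, and applying the three criteria strictly in sequence is an assumption.

```python
# Public access priority in the order stated above.
ACCESS_RANK = {"Closed": 0, "Restricted": 1, "Open": 2, "Unknown": 3}


def priority_key(record):
    """Lower GAP Status Code first, then access order, then geodatabase load order."""
    return (record["gap_status"], ACCESS_RANK[record["access"]], record["load_order"])


def resolve_overlap(records):
    """Return the single record that is kept where several designations overlap."""
    return min(records, key=priority_key)


# Hypothetical overlap: a Wilderness designation inside a National Forest.
overlapping = [
    {"name": "National Forest", "gap_status": 3, "access": "Open", "load_order": 1},
    {"name": "Wilderness", "gap_status": 1, "access": "Open", "load_order": 2},
]
print(resolve_overlap(overlapping)["name"])  # Wilderness
```

Keeping one record per overlapping area is what prevents the same acreage from being counted twice in summary statistics.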
Boundaries of interest to stakeholders (State, Department of the Interior Region, Congressional District, County, EcoRegions I-IV, Urban Areas, Landscape Conservation Cooperative) were incorporated into separate geodatabase feature classes to support various data summaries ("PADUS3_0VectorAnalysisFileOtherExtents_Clip_Census.zip"). Comma-separated Value (CSV) tables ("PADUS3_0SummaryStatistics_TabularData_CSV.zip") summarizing "PADUS3_0VectorAnalysisFileOtherExtents_Clip_Census.zip" are provided as an alternative format and enable users to explore and download summary statistics of interest (Comma-separated Table [CSV], Microsoft Excel Workbook [.XLSX], Portable Document Format [.PDF] Report) from the PAD-US Lands and Inland Water Statistics Dashboard ( https://www.usgs.gov/programs/gap-analysis-project/science/pad-us-statistics ). In addition, a "flattened" version of the PAD-US 3.0 combined file without other extent boundaries ("PADUS3_0VectorAnalysisFile_ClipCensus.zip") allows for other applications that require a representation of overall protection status without overlapping designation boundaries. The "PADUS3_0VectorAnalysis_State_Clip_CENSUS2020" feature class ("PADUS3_0VectorAnalysisFileOtherExtents_Clip_Census.gdb") is the source of the PAD-US 3.0 raster files (associated item of PAD-US 3.0 Spatial Analysis and Statistics, https://doi.org/10.5066/P9KLBB5D ). Note that the PAD-US inventory is now considered functionally complete, with the vast majority of land protection types represented in some manner, while work continues to maintain updates and improve data quality (see inventory completeness estimates at: http://www.protectedlands.net/data-stewards/ ). In addition, changes in protected area status between versions of the PAD-US may be attributed to improvements in the completeness and accuracy of the spatial data more than to actual management actions or new acquisitions. USGS provides no legal warranty for the use of this data.
While PAD-US is the official aggregation of protected areas ( https://www.fgdc.gov/ngda-reports/NGDA_Datasets.html ), agencies are the best source of their lands data.
The Project for Statistics on Living Standards and Development was a countrywide World Bank sponsored Living Standards Measurement Survey. It covered approximately 9000 households, drawn from a representative sample of South African households. The fieldwork was undertaken during the nine months leading up to the country's first democratic elections at the end of April 1994. The purpose of the survey was to collect data on the conditions under which South Africans live in order to provide policymakers with the data necessary for development planning. These data would aid the implementation of goals such as those outlined in the Government of National Unity's Reconstruction and Development Programme.
The survey had national coverage
Households and individuals
The survey covered all household members. Individuals in hospitals, old age homes, hotels and hostels of educational institutions were not included in the sample. Migrant labour hostels were included. In addition to those that turned up in the selected ESDs, a sample of three hostels was chosen from a national list provided by the Human Sciences Research Council and within each of these hostels a representative sample was drawn for the households in ESDs.
Sample survey data
Face-to-face [f2f]
The main instrument used in the survey was a comprehensive household questionnaire. This questionnaire covered a wide range of topics but was not intended to provide exhaustive coverage of any single subject. In other words, it was an integrated questionnaire aimed at capturing different aspects of living standards. The topics covered included demographics, household services, household expenditure, educational status and expenditure, remittances and marital maintenance, land access and use, employment and income, health status and expenditure and anthropometry (children under the age of six were weighed and their heights measured). This questionnaire was available to households in two languages, namely English and Afrikaans. In addition, interviewers had in their possession a translation in the dominant African language/s of the region.
In addition to the detailed household questionnaire, a community questionnaire was administered in each cluster of the sample. The purpose of this questionnaire was to elicit information on the facilities available to the community in each cluster. Questions related primarily to the provision of education, health and recreational facilities. Furthermore there was a detailed section for the prices of a range of commodities from two retail sources in or near the cluster: a formal source such as a supermarket and a less formal one such as the "corner cafe" or a "spaza". The purpose of this latter section was to obtain a measure of regional price variation both by region and by retail source. These prices were obtained by the interviewer. For the questions relating to the provision of facilities, respondents were "prominent" members of the community such as school principals, priests and chiefs.
A literacy assessment module (LAM) was administered to two respondents in each household (one household member aged 13-18 and one aged between 18 and 50) to assess literacy levels.
The data collected in clusters 217 and 218 are highly unreliable and have therefore been removed from the dataset currently available on the portal. Researchers who have downloaded the data in the past should download version 2.0 of the dataset to ensure they have the corrected data. Version 2.0 of the dataset excludes two clusters from both the 1993 and 1998 samples. During follow-up field research for the KwaZulu-Natal Income Dynamics Study (KIDS) in May 2001 it was discovered that all 39 household interviews in clusters 217 and 218 had been fabricated in both 1993 and 1998. These households have been dropped in the updated release of the data. In addition, cluster 206 is now coded as urban as this was incorrectly coded as rural in the first release of the data. Note: Weights calculated by the World Bank and provided with the original data are NOT updated to reflect these changes.
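For researchers who still hold a first-release copy, the corrections described above amount to a filter and a recode. The Python sketch below illustrates this on a list of household records. The field names (`cluster`, `area_type`) and the function name are hypothetical and will differ from the variable names in the actual files. As noted above, the original World Bank weights are not adjusted by these changes, and the sketch does not attempt to adjust them either.

```python
FABRICATED_CLUSTERS = {217, 218}  # all interviews in these clusters were fabricated
RECODED_URBAN = {206}             # coded as rural in the first release


def apply_version2_corrections(households):
    """Drop fabricated clusters and recode cluster 206 as urban."""
    corrected = []
    for household in households:
        if household["cluster"] in FABRICATED_CLUSTERS:
            continue
        row = dict(household)  # copy so the input list is left unchanged
        if row["cluster"] in RECODED_URBAN:
            row["area_type"] = "urban"
        corrected.append(row)
    return corrected


# Hypothetical records for illustration only.
sample = [
    {"hhid": 1, "cluster": 205, "area_type": "rural"},
    {"hhid": 2, "cluster": 206, "area_type": "rural"},
    {"hhid": 3, "cluster": 217, "area_type": "rural"},
]
print(apply_version2_corrections(sample))
```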
Abstract copyright UK Data Service and data collection copyright owner.
The main objective of the HEIS survey is to obtain detailed data on household expenditure and income, linked to various demographic and socio-economic variables, to enable computation of poverty indices and determine the characteristics of the poor and prepare poverty maps. Therefore, to achieve these goals, the sample had to be representative on the sub-district level. The raw survey data provided by the Statistical Office was cleaned and harmonized by the Economic Research Forum, in the context of a major research project to develop and expand knowledge on equity and inequality in the Arab region. The main focus of the project is to measure the magnitude and direction of change in inequality and to understand the complex contributing social, political and economic forces influencing its levels. However, the measurement and analysis of the magnitude and direction of change in this inequality cannot be consistently carried out without harmonized and comparable micro-level data on income and expenditures. Therefore, one important component of this research project is securing and harmonizing household surveys from as many countries in the region as possible, adhering to international statistics on household living standards distribution. Once the dataset has been compiled, the Economic Research Forum makes it available, subject to confidentiality agreements, to all researchers and institutions concerned with data collection and issues of inequality.
Data collected through the survey helped in achieving the following objectives:
1. Provide data weights that reflect the relative importance of consumer expenditure items used in the preparation of the consumer price index
2. Study the consumer expenditure pattern prevailing in the society and the impact of demographic and socio-economic variables on those patterns
3. Calculate the average annual income of the household and the individual, and assess the relationship between income and different economic and social factors, such as profession and educational level of the head of the household and other indicators
4. Study the distribution of individuals and households by income and expenditure categories and analyze the factors associated with it
5. Provide the necessary data for the national accounts related to overall consumption and income of the household sector
6. Provide the necessary income data to serve in calculating poverty indices and identifying the characteristics of the poor as well as drawing poverty maps
7. Provide the data necessary for the formulation, follow-up and evaluation of economic and social development programs, including those aimed at eradicating poverty
National
Sample survey data [ssd]
The sample for the 2010 Household Expenditure and Income Survey was designed to serve the basic objectives of the survey by providing a relatively large sample in each sub-district, so that a poverty map of Jordan could be drawn. The General Census of Population and Housing in 2004 provided a detailed framework for housing and households at the different administrative levels in the country. Jordan is administratively divided into 12 governorates. Each governorate is composed of a number of districts, and each district (Liwa) includes one or more sub-districts (Qada). In each sub-district there are a number of communities (cities and villages). Each community was divided into a number of blocks, each containing between 60 and 100 houses. Nomads and persons living in collective dwellings such as hotels, hospitals and prisons were excluded from the survey framework.
A two-stage stratified cluster sampling technique was used. In the first stage, clusters were selected with probability proportional to size, where the number of households in each cluster was taken as the size measure. In the second stage, a sample of 8 households was selected from each cluster using a systematic sampling technique, together with another 4 households selected as a backup for the basic sample. The 4 backup households were to be used during the first visit to the block if a visit to an originally selected household was not possible for any reason. For the purposes of this survey, each sub-district was considered a separate stratum so that results could be produced at the sub-district level. In this respect, the survey framework adopted the division of sample strata provided by the General Census of Population and Housing. To estimate the sample size, the coefficient of variation and the design effect of the expenditure variable in the Household Expenditure and Income Survey for 2008 were calculated for each sub-district. These results were used to estimate the sample size at the sub-district level so that the coefficient of variation for the expenditure variable in each sub-district is less than 10%, subject to a minimum number of clusters in each sub-district (6 clusters). This ensures adequate representation of clusters in the different administrative areas and enables an indicative poverty map to be drawn.
It should be noted that, in addition to the standard non-response rate assumed, higher rates were expected in areas of major cities where poor households are concentrated. These were therefore taken into consideration during the sampling design phase, and a higher number of households was selected from those areas to ensure good coverage of all areas where poverty is widespread.
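The two-stage design described above can be sketched for a single stratum (sub-district) as follows. This is a minimal Python illustration under stated assumptions, not the Statistical Office's procedure: first-stage selection is implemented as systematic probability-proportional-to-size sampling, the 4 backup households are drawn systematically from the households not already in the main sample, and the cluster sizes are randomly generated. All function names are hypothetical.

```python
import random


def pps_systematic(sizes, n, rng):
    """Select n cluster indices with probability proportional to size.

    A cluster larger than the sampling interval can be selected more than once.
    """
    step = sum(sizes) / n
    start = rng.random() * step
    targets = [start + k * step for k in range(n)]
    chosen, cumulative, i = [], 0, 0
    for index, size in enumerate(sizes):
        cumulative += size
        while i < n and targets[i] < cumulative:
            chosen.append(index)
            i += 1
    # Guard against floating-point rounding on the final target.
    chosen.extend([len(sizes) - 1] * (n - len(chosen)))
    return chosen


def systematic(items, n, rng):
    """Select n items at a fixed interval from a random start (needs n <= len(items))."""
    step = len(items) / n
    start = rng.random() * step
    return [items[min(int(start + k * step), len(items) - 1)] for k in range(n)]


def sample_households(households, rng, n_main=8, n_backup=4):
    """Draw the main sample, then a backup sample from the remaining households."""
    main = systematic(households, n_main, rng)
    remaining = [h for h in households if h not in main]
    backup = systematic(remaining, n_backup, rng)
    return main, backup


rng = random.Random(2010)
cluster_sizes = [rng.randint(60, 100) for _ in range(30)]  # hypothetical stratum
for cluster in pps_systematic(cluster_sizes, 6, rng):
    main, backup = sample_households(list(range(cluster_sizes[cluster])), rng)
    print(cluster, main, backup)
```

Treating each sub-district as its own stratum means this selection would be repeated independently within every sub-district.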
Face-to-face [f2f]
Raw Data:
- Organizing forms/questionnaires: A compatible archive system was used to classify the forms according to different rounds throughout the year. A registry was prepared to indicate different stages of the process of data checking, coding and entry till forms were back to the archive system.
- Data office checking: This phase was achieved concurrently with the data collection phase in the field where questionnaires completed in the field were immediately sent to data office checking phase.
- Data coding: A team was trained to work on the data coding phase, which in this survey is only limited to education specialization, profession and economic activity. In this respect, international classifications were used, while for the rest of the questions, coding was predefined during the design phase.
- Data entry/validation: A team consisting of system analysts, programmers and data entry personnel were working on the data at this stage. System analysts and programmers started by identifying the survey framework and questionnaire fields to help build computerized data entry forms. A set of validation rules were added to the entry form to ensure accuracy of data entered. A team was then trained to complete the data entry process. Forms prepared for data entry were provided by the archive department to ensure forms are correctly extracted and put back in the archive system. A data validation process was run on the data to ensure the data entered is free of errors.
- Results tabulation and dissemination: After the completion of all data processing operations, ORACLE was used to tabulate the survey final results. Those results were further checked using similar outputs from SPSS to ensure that tabulations produced were correct. A check was also run on each table to guarantee consistency of figures presented, together with required editing for tables' titles and report formatting.
Harmonized Data:
- The Statistical Package for Social Science (SPSS) was used to clean and harmonize the datasets.
- The harmonization process started with cleaning all raw data files received from the Statistical Office.
- Cleaned data files were then merged to produce one data file on the individual level containing all variables subject to harmonization.
- A country-specific program was generated for each dataset to generate/compute/recode/rename/format/label harmonized variables.
- A post-harmonization cleaning process was run on the data.
- Harmonized data was saved on the household as well as the individual level, in SPSS and converted to STATA format.
Kickstarter, the popular crowdfunding platform, has seen a significant number of projects fall short of their funding goals. As of January 2025, 376,698 projects had failed to reach their targets, with the majority (246,351) achieving only 1-20 percent of their funding objectives. This failure rate underscores the challenges creators face in securing financial backing for their ideas, despite Kickstarter's global reach and billions in pledged funds.
Crowdfunding's growing impact
Since its launch in 2009, Kickstarter has become a major player in the crowdfunding industry. The number of projects hosted on the platform has exceeded 651,000, with pledges surpassing 8.5 billion U.S. dollars. Notably, the most successful project to date, "Surprise! Four Secret Novels by Brandon Sanderson", raised 41 million U.S. dollars in 2022. These figures highlight the platform's potential for creators to secure substantial funding for their projects.
Success rates vary by category
While many projects struggle to meet their funding goals, success rates differ significantly across categories. As of January 2025, comics had the highest success rate at 67.65 percent, followed by dance at 61.11 percent and theater at 59.72 percent. These statistics suggest that certain creative fields may resonate more strongly with Kickstarter's backer community, potentially offering better odds of project success in these areas.
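The size of that majority can be checked with a quick calculation in Python, using only the two counts quoted above. The variable names are illustrative.

```python
failed_total = 376_698        # projects that failed to reach their target
failed_1_to_20_pct = 246_351  # of those, projects that reached only 1-20 percent

share = failed_1_to_20_pct / failed_total
print(f"{share:.1%} of failed projects reached only 1-20 percent of their goal")
```

In other words, roughly two in three unsuccessful projects reached only 1-20 percent of their goal.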
We create a synthetic administrative dataset that mimics the properties of a real administrative dataset according to specifications by the ONS. It is intended for use in the development of the R package for calculating quality indicators for administrative data (see: https://github.com/sook-tusk/qualadmin). Taking over 1 million records from a synthetic 1991 UK census dataset, we deleted records, moved records to a different geography and duplicated records to a different geography according to pre-specified proportions for each broad ethnic group (White, Non-white) and gender (males, females). The final size of the synthetic administrative data was 1,033,664 individuals.
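The delete, move and duplicate steps can be sketched in Python as below. This is an illustration only, not the procedure used to build the released file. The proportions in `RATES`, the field names, the geography codes and the decision to treat the three operations as mutually exclusive for each record are all assumptions, since the actual specifications are not given here.

```python
import random

# Hypothetical proportions for each (broad ethnic group, gender) combination.
RATES = {
    ("White", "male"): {"delete": 0.05, "move": 0.02, "duplicate": 0.01},
    ("White", "female"): {"delete": 0.03, "move": 0.02, "duplicate": 0.01},
    ("Non-white", "male"): {"delete": 0.10, "move": 0.04, "duplicate": 0.02},
    ("Non-white", "female"): {"delete": 0.08, "move": 0.04, "duplicate": 0.02},
}


def other_geography(record, geographies, rng):
    """Pick a geography different from the record's current one."""
    return rng.choice([g for g in geographies if g != record["geography"]])


def perturb(records, rates, geographies, rng):
    """Apply deletion, relocation or duplication to census-like records."""
    output = []
    for record in records:
        rate = rates[(record["ethnic_group"], record["gender"])]
        draw = rng.random()
        if draw < rate["delete"]:
            continue  # under-coverage: the person is missing from the admin data
        if draw < rate["delete"] + rate["move"]:
            # The person is recorded in the wrong geography.
            output.append(dict(record, geography=other_geography(record, geographies, rng)))
            continue
        output.append(dict(record))
        if draw < rate["delete"] + rate["move"] + rate["duplicate"]:
            # Over-coverage: an extra copy appears in another geography.
            output.append(dict(record, geography=other_geography(record, geographies, rng)))
    return output


rng = random.Random(1991)
geographies = ["A", "B", "C"]
census = [
    {"id": i, "ethnic_group": rng.choice(["White", "Non-white"]),
     "gender": rng.choice(["male", "female"]), "geography": rng.choice(geographies)}
    for i in range(1000)
]
admin = perturb(census, RATES, geographies, rng)
print(len(census), len(admin))
```

Because the rates differ by group, the resulting file under-covers and over-covers some sub-groups more than others, which is the kind of pattern that coverage quality indicators are meant to detect.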
National Statistical Institutes (NSIs) are directing resources into advancing the use of administrative data in official statistics systems. This is a top priority for the UK Office for National Statistics (ONS), which is transforming its statistical systems to make more use of administrative data for future censuses and population statistics. Administrative data are defined as secondary data sources since they are produced by other agencies as a result of an event or a transaction relating to administrative procedures of organisations, public administrations and government agencies. Nevertheless, they have the potential to become important data sources for the production of official statistics by significantly reducing the cost and burden of response and improving the efficiency of such systems. Embedding administrative data in statistical systems is not without costs and it is vital to understand where potential errors may arise. The Total Administrative Data Error Framework sets out all possible sources of error when using administrative data as statistical data, depending on whether it is a single data source or integrated with other data sources such as survey data. For a single administrative data source, one of the main sources of error is coverage and representation of the target population of interest. This is particularly relevant when administrative data are delivered over time, such as tax data for maintaining the Business Register. For sub-project 1 of this research project, we develop quality indicators that allow the statistical agency to assess whether the administrative data are representative of the target population and which sub-groups may be missing or over-covered. This is essential for producing unbiased estimates from administrative data.
Another priority at statistical agencies is to produce a statistical register for population characteristic estimates, such as employment statistics, from multiple sources of administrative and survey data. Using administrative data to build a spine, survey data can be integrated using record linkage and statistical matching approaches on a set of common matching variables. This will be the topic for sub-project 2, which will be split into several topics of research. The first topic is whether adding statistical predictions and correlation structures improves the linkage and data integration. The second topic is to research a mass imputation framework for imputing missing target variables in the statistical register where the missing data may be due to multiple underlying mechanisms. Therefore, the third topic will aim to improve the mass imputation framework to mitigate against possible measurement errors, for example by adding benchmarks and other constraints into the approaches. On completion of a statistical register, estimates for key target variables at local areas can easily be aggregated. However, it is essential to also measure the precision of these estimates through mean square errors and this will be the fourth topic of the sub-project. Finally, this new way of producing official statistics is compared to the more common method of incorporating administrative data through survey weights and model-based estimation approaches. In other words, we evaluate whether it is better 'to weight' or 'to impute' for population characteristic estimates - a key question under investigation by survey statisticians in the last decade.
This statistic shows information on some of the fastest project launches on crowdfunding website Kickstarter as of November 2016, based on the amount of time several million dollar projects took to surpass the 1 million dollar funding mark. On Black Friday 2016, board game Kingdom Death: Monster 1.5, a follow-up to the 2012 game Kingdom Death, generated 1 million U.S. dollars in pledges within 19 minutes. On February 23, 2015, smartwatch Pebble Time surpassed the 1 million U.S. dollar mark within 49 minutes. The Veronica Mars movie project had been the fastest Kickstarter movie project to accumulate 1 million U.S. dollars, taking only 4 hours and 24 minutes to do so. The fastest gaming project to reach 1 million U.S. dollars in funding was Shenmue 3 as of June 2015.
Successful Kickstarter campaigns - additional information
As of 2015, Kickstarter is one of the largest crowdfunding platforms in the world, having reportedly received more than 1.6 billion U.S. dollars in pledges from 8.9 million individuals since its 2009 launch. According to industry experts, global crowdfunding campaigns have raised a total of 16.2 billion U.S. dollars in 2014, a 167 percent growth from the previous year’s 6.1 billion. In 2015, the industry is expected to reach 34.4 billion U.S. dollars in pledges from all over the world. However, crowdfunding, financing of a project through small donations from a great number of individuals, is not in itself a new idea and has been used in history before. A notable example is the completion of the Statue of Liberty, which was backed by more than 120,000 contributors, most of whom gave less than a dollar, following a campaign initiated by famed journalist Joseph Pulitzer.
While other websites, such as GoFundMe, allow people to raise money for anything from graduations to medical bills, trips or charity, Kickstarter focuses on mainly creative projects, from craft ideas to music albums or technological innovations. As of June 2015, the most popular category of projects featured on the platform is games, with some 360 million U.S. dollars pledged, followed by technology, design, film & video, and music. As of April 2015, some 38 percent of the campaigns posted on Kickstarter have reached or even exceeded their funding goal. The most successful Kickstarter campaign of all time is the one supporting Pebble Time, a smartwatch developed by Pebble Technology Corporation, which had an initial fundraising target of 100 thousand U.S. dollars, but received pledges worth over 10 million U.S. dollars from almost 70 thousand backers. It is also the campaign fastest to reach pledges worth 1 million U.S. dollars, in a record 30 minutes. The popular smartwatch went into production and was released in 2013, selling its one millionth unit in December 2014.
The first part of this report discusses the overall statistical planning, coordination and design for several tar sand wastewater treatment projects contracted by the Laramie Energy Technology Center (LETC) of the Department of Energy. A general discussion of the benefits of consistent statistical design and analysis for data-oriented projects is included, with recommendations for implementation. A detailed outline of the principles of general linear models design is followed by an introduction to recent developments in general linear models by ranks (GLMR) analysis and a comparison to standard analysis using Gaussian or normal theory (GLMN). A listing of routines contained in the VPI Nonparametric Statistics Package (NPSP), installed on the Cyber computer system at the University of Wyoming is included. Part 2 describes in detail the design and analysis for treatments by Gas Flotation, Foam Separation, Coagulation, and Ozonation, with comparisons among the first three methods. Rank methods are used for most analyses, and several detailed examples are included. For optimization studies, the powerful tools of response surface analysis (RSA) are employed, and several sections contain discussion on the benefits of RSA. All four treatment methods proved to be effective for removal of TOC and suspended solids from the wastewater. Because the processes and equipment designs were new, optimum removals were not achieved by these initial studies and reasons for that are discussed. Pollutant levels were nevertheless reduced to levels appropriate for recycling within the process, and for such reuses as steam generation, according to the DOE/LETC project officer. 12 refs., 8 figs., 21 tabs.
https://snd.se/en/search-and-order-data/using-data
Since the beginning of the 1960s, Statistics Sweden, in collaboration with various research institutions, has carried out follow-up surveys in the school system. These surveys have taken place within the framework of the IS project (Individual Statistics Project) at the University of Gothenburg and the UGU project (Evaluation through follow-up of students) at the University of Teacher Education in Stockholm, which since 1990 have been merged into a research project called 'Evaluation through Follow-up'. The follow-up surveys are part of the central evaluation of the school and are based on large nationally representative samples from different cohorts of students.
Evaluation through follow-up (UGU) is one of the country's largest research databases in the field of education. UGU is part of the central evaluation of the school and is based on large nationally representative samples from different cohorts of students. The longitudinal database contains information on nationally representative samples of school pupils from ten cohorts, born between 1948 and 2004. The sampling process was based on the student's birthday for the first two and on the school class for the other cohorts.
For each cohort, data of mainly two types are collected. School administrative data is collected annually by Statistics Sweden during the time that pupils are in the general school system (primary and secondary school), for most cohorts starting in compulsory school year 3. This information is provided by the school offices and, among other things, includes characteristics of school, class, special support, study choices and grades. Information obtained has varied somewhat, e.g. due to changes in curricula. A more detailed description of this data collection can be found in reports published by Statistics Sweden and linked to datasets for each cohort.
Survey data from the pupils is collected for the first time in compulsory school year 6 (for most cohorts). The year 6 questionnaire includes questions related to self-perception and interest in learning, attitudes to school, hobbies, school motivation and future plans. For some cohorts, questionnaire data are also collected in year 3 and year 9 in compulsory school and in upper secondary school.
Furthermore, results from various intelligence tests and standardized knowledge tests are included in the data collection in year 6. The intelligence tests have been identical for all cohorts (except the cohort born in 1987, from which questionnaire data were first collected in year 9). The intelligence test consists of a verbal, a spatial and an inductive test, each containing 40 tasks and specially designed for the UGU project. The verbal test is a vocabulary test of the opposites type. The spatial test is a so-called ‘sheet metal folding test’ and the inductive test is made up of series of numbers. The reliability of the test, intercorrelations and connection with school grades are reported by Svensson (1971).
For the first three cohorts (1948, 1953 and 1967), the standardized knowledge tests in year 6 consist of the standard tests in Swedish, mathematics and English that, up to and including the beginning of the 1980s, were offered to all pupils in compulsory school year 6. For the 1972 cohort, specially prepared tests in reading and mathematics were used. The reading test consists of 27 tasks and aimed to identify students with reading difficulties. The mathematics test, which was also offered to the fifth cohort (1977), includes 19 assignments. A revised version of the test, introduced because the earlier test was judged to be somewhat too simple, was used for the cohort born in 1982. Results on the mathematics test are not available for the 1987 cohort. The mathematics test was not offered to the students in the 1992 cohort, as the test did not seem to fully correspond with current curriculum intentions in mathematics. For further information, see the description of the dataset for each cohort.
For several of the samples, questionnaires were also collected from the students' parents and teachers in year 6. The teacher questionnaire contains questions about the teacher, class size and composition, the teacher's assessments of the class's knowledge level, etc., as well as school resources, working methods, parental involvement and the existence of evaluations. The questionnaire for the guardians includes questions about the child's upbringing conditions, ambitions and wishes regarding the child's education, views on the school's objectives and the parents' own educational and professional situation.
The students are followed up even after they have left primary school. Among other things, data are collected during the time they are in high school. School administrative data, such as choice of upper secondary school line / program and grades after completing studies, are then collected. For some of the cohorts, in addition to school administrative data, questionnaire data were also collected from the students.
The Southern Africa Consortium for Monitoring Educational Quality (SACMEQ) is a consortium of Ministries of Education and Culture located in the Southern Africa subregion. This consortium works in close partnership with the International Institute for Educational Planning (IIEP). SACMEQ’s main aim is to undertake co-operative educational policy research in order to generate information that can be used by decision-makers to plan the quality of education. SACMEQ’s programme of educational policy research has four features which have optimized its contributions to the field of educational planning: (1) it provides research-based policy advice concerning high-priority educational quality issues that have been identified by key decision-makers in Southern Africa, (2) it functions as a co-operative venture based on a strong network of Ministries of Education and Culture, (3) it combines research and training components that are linked with institutional capacity building, and (4) its future directions are defined by participating ministries. In each participating country, a National Research Co-ordinator is responsible for implementing SACMEQ’s projects.
The SACMEQ I Project commenced in 1995 and was completed in 1999. The SACMEQ I main data collection was implemented in seven SACMEQ Ministries of Education (Kenya, Mauritius, Malawi, Namibia, Zambia, Zanzibar, and Zimbabwe). The study provided "agendas for government action" concerning: educational inputs to schools, benchmark standards for educational provision, equity in the allocation of educational resources, and the reading literacy performance of Grade 6 learners. The data collection for this project included information gathered from around 20,000 learners; 3,000 teachers; and 1,000 school principals.
This co-operative sub-regional educational research project collected data in order to guide decision-making in these countries with respect to questions around high priority policy issues. These included:
• What are the baseline data for selected inputs to primary schools?
• How do the conditions of primary schooling compare with the Ministry of Education and Culture’s own bench-mark standards?
• Have educational inputs to schools been allocated in an equitable fashion?
• What is the basic literacy level among pupils in upper primary school?
• Which educational inputs to primary schools have most impact on pupil reading achievement at the upper primary level?
In 1995 there were five fully active members of SACMEQ: Mauritius, Namibia, Zambia, Tanzania (Zanzibar), and Zimbabwe. These Ministries of Education and Culture participated in all phases of SACMEQ’s establishment and its initial educational policy research project. There are also four partially active members of SACMEQ: Kenya, Tanzania (Mainland), Malawi, and Swaziland. These Ministries of Education and Culture have made contributions to the preparation of the Project Plan for SACMEQ’s initial educational policy research project. Three other countries (Botswana, Lesotho, and South Africa) had observer status due to their involvement in SACMEQ related training workshops or their participation in some elements of the preparation of the first proposal for launching SACMEQ.
National Coverage
The target population for SACMEQ's Initial Project was defined as "all pupils at the Grade 6 level in 1995 who were attending registered government or non-government schools". Grade 6 was chosen because it was the grade level where the basics of reading literacy were expected to have been acquired.
Sample survey data [ssd]
A stratified two-stage sample design was used to select around 150 schools in each country. Pupils were then selected within these schools by drawing simple random samples. A more detailed explanation of the sampling process is available under the 'Sampling' section of the report provided as external resources.
All sample designs applied in SACMEQ's initial project were selected so as to meet the standards set down by the International Association for the Evaluation of Education Achievement (Ross, 1991). These standards require sample estimates of important pupil population characteristics to be (a) adjusted by weighting procedures designed to remove the potential for bias that may arise from different probabilities of selection, and (b) have sampling errors for the main criterion variables that are of the same magnitude or smaller than a simple random sample of 400 pupils (thereby providing 95 percent confidence limits for sample estimates of population percentages of plus or minus 5 percentage points, and 95 percent confidence limits for sample estimates of population means of plus or minus one tenth of a pupil standard deviation unit).
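The two precision figures quoted for a simple random sample of 400 pupils can be reproduced with the usual normal-approximation formulas. The sketch below is illustrative only: it assumes the worst-case percentage of 50 percent and a 1.96 multiplier, neither of which is stated in the survey documentation, and the function names are hypothetical.

```python
import math

Z_95 = 1.96  # assumed normal quantile for a two-sided 95 percent interval


def margin_for_percentage(n, p=0.5):
    """Half-width, in percentage points, of a 95% interval for a percentage
    estimated from a simple random sample of size n (p=0.5 is the worst case)."""
    return 100 * Z_95 * math.sqrt(p * (1 - p) / n)


def margin_for_mean_in_sd_units(n):
    """Half-width of a 95% interval for a mean, expressed in units of the
    pupil standard deviation, for a simple random sample of size n."""
    return Z_95 / math.sqrt(n)


n = 400
pct_margin = margin_for_percentage(n)
mean_margin = margin_for_mean_in_sd_units(n)
print(f"percentage: +/- {pct_margin:.1f} points")
print(f"mean: +/- {mean_margin:.3f} pupil SD units")
```

Under these assumptions the half-widths come out just under 5 percentage points and just under one tenth of a standard deviation, which matches the rounded limits given in the standards.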
The desired target population in Zambia was 'all pupils at the Grade 6 level in the eleventh month of the school year, 1995, who were attending registered government and grant-aided schools in the country'. The number of schools and pupils in the desired, excluded, and defined population have been presented in Table 2.2 of the Sample Report provided as external resources. From the defined target population a probability sample of schools (with probability proportional to the Grade 6 enrolment in each school) was drawn. This resulted in a planned national sample of 165 schools and 3,300 pupils. This sample design was designed to yield an 'equivalent sample size' (Ross and Wilson, 1994) of 400 pupils - based on an estimated intra-class correlation (rho) for pupil reading test scores of around 0.30. In fact, after the rho was calculated for the reading scores, it was found to be 0.31 - which was about the same as had been expected. At the first stage of sampling, schools were selected with a probability proportional to the number of pupils who were members of the defined target population. To achieve this selection a 'random start - constant interval' procedure was applied (Ross, 1987). In several strata there were some schools with numbers of pupils in the defined target population that exceeded the size of the 'constant interval', and therefore each of these schools was randomly broken into smaller 'pseudo schools' before the commencement of the sampling. At the second stage of sampling, a simple random sample of 20 pupils was selected within each selected school. Sampling weights were used to adjust for the disproportionate allocation of the sample across districts and also to account for the small loss of student data due to absenteeism on the day of the data collection.
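The design described above can be sketched in a few lines. The first two functions use the common Kish approximation for the design effect of equal-sized clusters, which is an assumption here; the calculation actually used by Ross and Wilson (1994) may differ. The third function is a minimal version of a 'random start - constant interval' selection with probability proportional to size. The school enrolments are simulated and all function names are hypothetical.

```python
import random


def design_effect(cluster_size, rho):
    """Kish approximation (assumed): deff = 1 + (b - 1) * rho for clusters of size b."""
    return 1 + (cluster_size - 1) * rho


def effective_sample_size(n_pupils, cluster_size, rho):
    """Size of the simple random sample with the same precision as the cluster sample."""
    return n_pupils / design_effect(cluster_size, rho)


def pps_systematic(sizes, n_select, rng):
    """Select n_select units with probability proportional to size by laying a
    random start and a constant interval over the cumulated sizes.
    Units larger than the interval must be split beforehand (the 'pseudo schools')."""
    interval = sum(sizes) / n_select
    if max(sizes) > interval:
        raise ValueError("split units larger than the interval before sampling")
    start = rng.random() * interval
    points = [start + k * interval for k in range(n_select)]
    selected, cumulative, next_point = [], 0.0, 0
    for index, size in enumerate(sizes):
        cumulative += size
        # a unit is selected when a selection point falls inside its size range
        while next_point < n_select and points[next_point] < cumulative:
            selected.append(index)
            next_point += 1
    return selected


# Planned Zambian design: 165 schools, 20 pupils per school, rho around 0.30.
deff = design_effect(20, 0.30)
n_eff = effective_sample_size(3300, 20, 0.30)
print(f"design effect {deff:.2f}, effective sample size about {n_eff:.0f}")

# Simulated frame of Grade 6 enrolments (hypothetical values, not survey data).
rng = random.Random(1995)
enrolments = [rng.randint(20, 120) for _ in range(600)]
selected = pps_systematic(enrolments, 165, rng)
print(len(selected), "schools selected")
```

With this approximation, 3,300 pupils in clusters of 20 correspond to an effective sample size above the target of 400, which is consistent with the stated design goal.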
Face-to-face [f2f]
The data collection for SACMEQ's Initial Project took place in October 1995 and involved the administration of questionnaires to pupils, teachers, and school heads. The pupil questionnaire contained questions about the pupils' home backgrounds and their school life; the teacher questionnaire asked about classrooms, teaching practices, working conditions, and teacher housing; and the school head questionnaire collected information about teachers, enrolments, buildings, facilities, and management. A reading literacy test was also given to the pupils. The test was based on items that were selected after a trial-testing programme had been completed.
The SACMEQ Data Collection Instruments include the following documents:
- SACMEQ Questionnaires - which are administered to pupils, teachers, and school heads.
- SACMEQ Tests - which are administered to pupils and teachers (covering reading, mathematics, and HIV-AIDS knowledge).
- Other SACMEQ Data Collection Instruments - such as take-home pupil questionnaires, school context proformas, and within-school project management documents.
All of the team leaders for the data collectors returned the instruments to the Ministry Headquarters (for the attention of the NRC) during the second week after the test administration. Once the instruments were returned to the Headquarters, three data entry staff within the Statistical Section of the Ministry entered the data using the Data Entry Manager (DEM), a software programme developed at the IIEP (Schleicher, 1995). This software was adapted specifically for the entry of SACMEQ data. The data entry took six weeks and the data were sent on diskette to IIEP in March 1996. It must be mentioned that at the time of data entry, the earlier version of the DEM structure files was used, and this caused major problems in cleaning the data at a later stage and reconstituting the structure of the files as they were meant to be.
The planned sample was designed to contain 165 schools allocated across provinces, as shown in the first column of figures in Table 2.3 of the Survey Report provided as external resources. The achieved sample of schools was 157. The response rates for the sample have been recorded in Table 2.3. The percentage response for schools was 95.2 percent and that of pupils was 77.5 percent. The non-responding pupils were those who were absent on the day of testing. By province, this absenteeism varied from 2 to 12 percent.
In the survey report provided as external resources, standard errors were provided for all important variables. The calculation of these errors acknowledged that the sample was not a simple random sample - but rather a complex two-stage cluster sample that included weighting adjustments to compensate for variations in selection probabilities. The errors were
The Lao PDR Public Expenditure Tracking Survey (PETS) Project, a partnership between the Government of Lao PDR and the World Bank, was carried out from January 31 to March 11, 2006. The project analyzed the flow of funds in the country's primary education and primary health sectors.
In Lao PDR, the fiscal structure is highly decentralized. Provinces and districts are responsible for delivering basic social services to households.
This survey explored the impact of public expenditure management and human resources on education and health service delivery by:
- documenting financial and human resources at subnational levels;
- tracking salary payments from districts to facilities;
- correlating performance with actual service delivery;
- proposing policy reforms on expenditure management practices.
Documented here is the research analyzing the flow of resources in the Lao PDR primary health sector. One hundred and seven health centers in 17 provinces were covered by the PETS. The study sample was based on health facilities in villages surveyed in the 2002-2003 Lao Expenditure and Consumption Survey (LECS).
17 provinces
Sample survey data [ssd]
The PETS sample was based on schools and health centers in villages surveyed by the 2002-2003 Lao Expenditure and Consumption Survey (LECS). One operational advantage of this sampling was that the National Statistics Centre was familiar with the location and village representatives. An analytical advantage of this sampling design was that this PETS was derived from a nationally representative household survey that covered 17 out of 18 provinces and 56 out of the 141 districts. In contrast, most PETS take place in selected areas of a country to conserve spending on the cost of field work. As of 2003, there were 717 health centers in Lao PDR. Based on this number and the experience of other PETS surveys, the target number of health centers was 114. The achieved sample of 107 health centers fell seven short of this target.
According to their financial status in the 2003-2004 Official Gazette, five provinces (Borikhamxay, Champasak, Luangnamtha, Savanakhet, and Vientiane Capital) were considered surplus provinces; the remaining 12 were considered deficit provinces. Within provinces, there are 143 districts. Of these, 47 were priority districts in need of public investments to attain their poverty reduction targets. These 47 districts were identified based on a set of household, village, and district level indicators for basic minimum needs. Of the 56 districts in the survey, 10 were priority districts and 46 were non-priority districts.
Of the 107 health centers covered by the survey, 33 were in urban areas and 74 in rural areas; 71 were in non-poor areas and 36 in poor areas; and 91 were in Lao-Tai areas and 16 in non Lao-Tai areas.
Face-to-face [f2f]
Three survey instruments were used to collect the data:
The district level questionnaire recorded resources that were allocated to primary schools and health centers and tracked the funds for salary payments from the district to facility levels.
The facility level questionnaire collected detailed information about inputs, including public funds, outputs and outcomes for primary schools and health centers.
The community level questionnaire was a scaled-down version of the village questionnaire from the 2002-2003 Lao Expenditure and Consumption Survey. The questionnaire recorded community characteristics.
The basic goal of this survey is to provide the necessary database for formulating national policies at various levels. It represents the contribution of the household sector to the Gross National Product (GNP). Household Surveys help as well in determining the incidence of poverty, and providing weighted data which reflects the relative importance of the consumption items to be employed in determining the benchmark for rates and prices of items and services. Generally, the Household Expenditure and Consumption Survey is a fundamental cornerstone in the process of studying the nutritional status in the Palestinian territory.
The raw survey data provided by the Statistical Office was cleaned and harmonized by the Economic Research Forum, in the context of a major research project to develop and expand knowledge on equity and inequality in the Arab region. The main focus of the project is to measure the magnitude and direction of change in inequality and to understand the complex contributing social, political and economic forces influencing its levels. However, the measurement and analysis of the magnitude and direction of change in this inequality cannot be consistently carried out without harmonized and comparable micro-level data on income and expenditures. Therefore, one important component of this research project is securing and harmonizing household surveys from as many countries in the region as possible, adhering to international statistics on household living standards distribution. Once the dataset has been compiled, the Economic Research Forum makes it available, subject to confidentiality agreements, to all researchers and institutions concerned with data collection and issues of inequality. Data is a public good, in the interest of the region, and it is consistent with the Economic Research Forum's mandate to make micro data available, aiding regional research on this important topic.
The survey data covers urban, rural and camp areas in West Bank and Gaza Strip.
1- Household/families. 2- Individuals.
The survey covered all Palestinian households whose usual residence is in the Palestinian Territory.
Sample survey data [ssd]
The sampling frame consists of all enumeration areas that were enumerated in 1997. Each enumeration area consists of buildings and housing units and contains about 150 households on average. The enumeration areas are used as primary sampling units (PSUs) in the first stage of sample selection. The enumeration areas of the master sample were updated in 2003.
The sample is a stratified cluster systematic random sample with two stages. First stage: selection of a systematic random sample of 120 enumeration areas. Second stage: selection of a systematic random sample of 12-18 households from each enumeration area selected in the first stage. The calculated sample size is 1,616 households; the completed households were 1,281 (847 in the West Bank and 434 in the Gaza Strip).
We divided the population by: 1- Region (North West Bank, Middle West Bank, South West Bank, Gaza Strip) 2- Type of Locality (urban, rural, refugee camps)
The target cluster size or "sample-take" is the average number of households to be selected per PSU. In this survey, the sample take is around 12 households.
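The two-stage selection described above can be illustrated with an equal-probability systematic sample applied at both stages. This is only a sketch: the frame is simulated, the stratification by region and locality type is omitted, the number of households taken per enumeration area is drawn at random between 12 and 18 purely for illustration, and all names are hypothetical.

```python
import random


def systematic_sample(units, n, rng):
    """Equal-probability systematic sample of n units from an ordered list:
    random start within the first interval, then a constant (fractional) step."""
    if n > len(units):
        raise ValueError("sample size exceeds the number of units")
    interval = len(units) / n
    start = rng.random() * interval
    last = len(units) - 1
    return [units[min(int(start + k * interval), last)] for k in range(n)]


rng = random.Random(2006)

# Simulated frame: enumeration areas with roughly 150 households each.
frame = {
    f"EA{i:04d}": [f"EA{i:04d}-HH{j:03d}" for j in range(rng.randint(120, 180))]
    for i in range(2000)
}

# First stage: systematic sample of 120 enumeration areas (the PSUs).
sampled_eas = systematic_sample(sorted(frame), 120, rng)

# Second stage: systematic sample of 12-18 households within each selected area.
sampled_households = []
for ea in sampled_eas:
    take = rng.randint(12, 18)  # illustrative choice of the sample-take
    sampled_households.extend(systematic_sample(frame[ea], take, rng))

print(len(sampled_eas), "enumeration areas,", len(sampled_households), "households")
```

In a stratified design the same routine would be run separately within each stratum, with the frame sorted on the stratification variables before selection.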
Face-to-face [f2f]
The PECS questionnaire consists of two main sections:
First section: Certain articles / provisions of the form are filled in at the beginning of the month, and the remainder are filled in at the end of the month. The questionnaire includes the following provisions:
Cover sheet: Contains detailed particulars of the family, date of visit, particulars of the field/office work team, and number/sex of the family members.
Statement of the family members: Contains social, economic and demographic particulars of the selected family.
Statement of the long-lasting commodities and income generation activities: Includes a number of basic and indispensable items (i.e., Livestock, or agricultural lands).
Housing Characteristics: Includes information and data pertaining to the housing conditions, including type of house, number of rooms, ownership, rent, water, electricity supply, connection to the sewer system, source of cooking and heating fuel, and remoteness/proximity of the house to education and health facilities.
Monthly and Annual Income: Data pertaining to the income of the family is collected from different sources at the end of the registration / recording period.
Assistance and poverty: Includes questions about household conditions and assistance received during the past month.
Second section: The second section of the questionnaire includes a list of 55 consumption and expenditure groups, itemized and serially numbered according to their importance to the family. Each of these groups contains important commodities. In total, the groups contain 667 commodity and service items. Groups 1-21 include food, drink, and cigarettes. Group 22 includes homemade commodities. Groups 23-45 include all items except for food, drink and cigarettes. Groups 50-55 include all of the long-lasting commodities. Data on each of these groups were collected over different intervals of time so as to reflect expenditure over a period of one full year, except for the cars group, for which data were collected for the three previous years. These data were obtained from the recording book, which covered a period of one month for each household.
Data editing took place through a number of stages, including:
1. Office editing and coding
2. Data entry
3. Structure checking and completeness
4. Structural checking of SPSS data files
The survey sample consists of about 1,616 households interviewed over a twelve-month period (January 2006 to January 2007). Of these, 1,281 households completed the interview: 847 in the West Bank and 434 in the Gaza Strip. The response rate was 79.3% in the Palestinian Territory.
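The reported response rate follows directly from the counts given above. The short check below uses only those published figures; the variable names are illustrative.

```python
# Counts taken from the survey description.
planned_households = 1616
completed_by_region = {"West Bank": 847, "Gaza Strip": 434}

completed = sum(completed_by_region.values())
response_rate = 100 * completed / planned_households

print(f"{completed} of {planned_households} households completed the interview")
print(f"response rate: {response_rate:.1f}%")
```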
Generally, survey samples are exposed to two types of errors. The statistical errors, being the first type, result from studying a part of a certain society and not including all its sections. Since the Household Expenditure and Consumption Surveys are conducted using a sample method, statistical errors are unavoidable. Therefore, a probability sample using a suitable design has been employed whereby each unit of the society has a high chance of selection. Upon calculating the rate of bias in this survey, it appeared that the data is of high quality. The second type of errors is the non-statistical errors that relate to the design of the survey, mechanisms of data collection, and management and analysis of data. Members of the work commission were trained on all possible mechanisms to tackle such potential problems, as well as on how to address cases in which there were no responses (representing 9.6%).
There is a long history to the agricultural census in the Netherlands. From 1934 onwards a census has been carried out (almost) every year. In recent years it is no longer purely a statistical project, but serves several purposes: on the one hand production of statistics by Statistics Netherlands and creating a frame for sampling, on the other hand providing data on individual holdings for administrative purposes by the Ministry of Economic Affairs, Agriculture and Innovation (the Ministry). Since the Ministry and Statistics Netherlands have a common interest in the census, it is held as a joint effort. In 1990, it was the last time special meeting days were organised to assess the data from the farmers. On these meeting days, farmers and enumerators jointly filled in the questionnaire manually. In the period 1991 – 1995, these sessions still took place, but the manual procedure was gradually replaced by filling in the information in a computer file. In 1996, the farmer could make a choice between coming to a special meeting place or filling in the survey form himself and returning it by postal mail. From 1997 on, a complete census was organised by postal mail every year. The year 2003 was a pilot year in which respondents had the opportunity to supply the census information through an internet application. In recent years the information is predominantly supplied via the internet. Since the statistical year 2002 the questionnaire of the agricultural census is combined with the application for animal, crop and arable land subsidies (in 2006 also for the single payment scheme). In 2007 data collection for the enforcement of the manure law is also combined in this questionnaire. This is done for efficiency reasons, both for farmers, and for administration and processing of data.
National coverage
Households
The statistical unit was the agricultural holding, defined as a single unit, both technically and economically, which has a single management and which undertakes agricultural activities listed in Annex I to the European Parliament and Council Regulation (EC) No. 1166/2008 within the economic territory of the EU, either as its primary or secondary activity.
Census/enumeration data [cen]
Frame: Statistics Netherlands has a business register of all industrial and non-industrial commercial establishments, but the agricultural holdings are not yet fully covered in this register. The agricultural census therefore relies on the administrative farm register (AFR) of the Ministry held by NSIR, an executive service of the Ministry. By law farmers have to register with NSIR. The AFR contains names, addresses and a few other characteristics of holders or holdings and a unique registration number. With the census information of several years Statistics Netherlands has built up a statistical farm register (SFR). Relevant characteristics from the AFR (among others identification number, addresses, legal status) are also stored in the SFR. Changes in addresses are entered into the AFR throughout the year, while the SFR is updated only once a year. The SFR provides an excellent basis for stratification and efficient sampling of subsequent agricultural statistics. An annual census may seem expensive (even when only half of the cost is looked upon as expenses for statistics). But the excellent quality of the sample frame allows for relatively small samples in related agricultural statistics and thus reduction of costs.
Computer Assisted Web Interview (CAWI)
One questionnaire was used, integrating both the 2010 AC and the SAPM, and presented to respondents as a single statistical inquiry. The questionnaire covered all 16 core items recommended in the WCA 2010.
Questionnaire:
1 Work and education
2 Number of animals and housing
3 Horticulture under glass
4 Mushrooms, bulb growing, chicory growing
5 Crops on open land and land use
6 Agricultural land area
7 Subsidies
8 Farm data
9 Livestock manure
10 Excavation notification (WION)
11 Signature
a. Data collection and data entry: About 85% of the questionnaires were filled in and returned using the web application, which already contained a lot of checks and validations. Paper forms were digitized by a data-entry firm and processed by NSIR in the same way as the online questionnaires. There were several quality controls to ensure correct digitization.
b. Data processing, estimation and analysis Data processing, estimation and analysis were performed in two successive stages:
Pre-processing at NSIR After data collection and data entry the input data go through an extensive error control phase. In this phase checks are made on missing values, valid values, unlikely values, range checks, checks of correlation in the data, checks of totals and so on. When necessary additional information is collected from the farmers by phone. Data that is checked and accepted by NSIR is forwarded to Statistics Netherlands.
Processing at Statistics Netherlands Processing at Statistics Netherlands involves additional error control, enrichment with additional information, such as total SO and typology, imputation for non-response and analysis. Analyses are made at several levels of aggregation and comprise comparison with previous results and agricultural data from other sources.
Checking the information in the questionnaires took place using a special control programme. Data were checked for hard and soft errors. Hard errors are non-valid values. Soft errors are unlikely values. If necessary, the checking personnel contacted the respondent to correct for errors. Approximately 85 percent of the questionnaires were completed online. The online questionnaire application contained extensive interactive controls and edits.
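The distinction between hard and soft errors can be illustrated with a minimal Python sketch. It is not the control programme used by NSIR or Statistics Netherlands: the field names, the ranges in RULES and the choice to treat a missing value as a hard error are all assumptions made for illustration.

```python
# Hypothetical edit rules: "valid" bounds define hard errors,
# "plausible" bounds define soft errors. All names and limits are invented.
RULES = {
    "dairy_cows": {"valid": (0, 100_000), "plausible": (0, 2_000)},
    "agricultural_area_ha": {"valid": (0.0, 50_000.0), "plausible": (0.0, 1_500.0)},
}


def check_record(record, rules=RULES):
    """Return (hard_errors, soft_errors) for one questionnaire record."""
    hard, soft = [], []
    for field, rule in rules.items():
        value = record.get(field)
        if value is None:
            # Assumption: a missing value is reported as a hard error.
            hard.append((field, "missing"))
            continue
        valid_lo, valid_hi = rule["valid"]
        if not (valid_lo <= value <= valid_hi):
            # Hard error: the value is not valid at all.
            hard.append((field, "invalid value"))
            continue
        plausible_lo, plausible_hi = rule["plausible"]
        if not (plausible_lo <= value <= plausible_hi):
            # Soft error: valid but unlikely, so it needs a follow-up check.
            soft.append((field, "unlikely value"))
    return hard, soft


hard_errors, soft_errors = check_record(
    {"dairy_cows": -5, "agricultural_area_ha": 3000.0}
)
```

In this sketch a record with hard errors would be sent back for correction, while soft errors would only be flagged for review, for example by contacting the respondent.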
Dissemination: Dissemination is done via the Statline database, which is available on the Internet (www.cbs.nl). In this database, Internet users may select their own indicators and information topics. Short publications on specific subjects are presented in the form of newspaper or Internet articles. Safe access to census microdata is also provided.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Transcriptome statistics from samples obtained on LMG1411.
Diatom isolates were obtained from surface waters of the Western Antarctic Peninsula.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The coupling coordination between higher education and regional industries is an effective way to enhance the performance of industry-education integration and an important driving force for promoting sustainable economic and social development. This article aims to evaluate the performance level of industry-education integration in higher education from the perspective of coupling coordination between industry and education, and to analyze its sustainable development trends. To achieve this, a framework for the industry-education complex system is constructed based on the theory of coupling coordination, and a performance evaluation model for industry-education integration is established. The "structure-conduct-performance" (S-C-P) analysis paradigm of industrial organization theory is used to design a performance evaluation indicator system, and the entropy weight method is applied to assign weights to the indicator system. Empirical research is conducted on Chongqing based on statistical data from 2000 to 2022. The results show that: (1) During the sample period, the construction effectiveness of industry-education integration in higher education in Chongqing is relatively significant, and the coordination level between regional industries and higher education continues to rise. (2) The orderly development level of the two subsystems of regional industries and higher education alternately rises, jointly promoting the continuous improvement of the performance level of industry-education integration, which is the main driving factor for performance improvement. (3) The stagnation of the coupling strength between regional industries and higher education is a key obstacle to further improving the performance of industry-education integration. 
Industry-education integration is a systematic project that should be promoted collaboratively from the perspectives of the regional industrial subsystem, higher education subsystem, industry-education complex system, and external environment to continuously enhance the performance level.
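The entropy weight method and the coupling coordination degree mentioned above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the article's model: the description gives no formulas, so the code uses forms commonly seen in the coupling-coordination literature (min-max normalisation, entropy-based weights, C = 2*sqrt(U1*U2)/(U1+U2), T = alpha*U1 + beta*U2, D = sqrt(C*T)). All indicators are assumed to be benefit-type, and the equal weights alpha = beta = 0.5 are an assumption.

```python
import math


def entropy_weights(rows):
    """Entropy weights for indicator columns.

    rows: list of observations (e.g. years), each a list of indicator values.
    Assumes every indicator is benefit-type (larger is better).
    """
    n, m = len(rows), len(rows[0])
    if n < 2:
        raise ValueError("need at least two observations")
    divergence = []
    for j in range(m):
        col = [row[j] for row in rows]
        lo, hi = min(col), max(col)
        if hi == lo:
            # A constant indicator carries no information.
            divergence.append(0.0)
            continue
        norm = [(v - lo) / (hi - lo) for v in col]   # min-max normalisation
        total = sum(norm)
        shares = [v / total for v in norm]
        # Shannon entropy scaled to [0, 1]; 0 * ln(0) is treated as 0.
        entropy = -sum(p * math.log(p) for p in shares if p > 0) / math.log(n)
        divergence.append(1.0 - entropy)
    scale = sum(divergence)
    if scale == 0:
        raise ValueError("all indicators are constant")
    return [d / scale for d in divergence]


def coupling_coordination(u1, u2, alpha=0.5, beta=0.5):
    """Coupling coordination degree D of two subsystem scores in [0, 1]."""
    if u1 + u2 == 0:
        return 0.0
    coupling = 2 * math.sqrt(u1 * u2) / (u1 + u2)   # coupling degree C
    development = alpha * u1 + beta * u2            # comprehensive index T
    return math.sqrt(coupling * development)


weights = entropy_weights([[1, 3], [2, 2], [3, 1]])
degree = coupling_coordination(0.5, 0.5)
```

Here each subsystem score (regional industries, higher education) would be a weighted sum of its normalised indicators, and the two scores are then combined into D.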
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This project contains an annual city-level panel dataset of city revenue, expenses, and debt for the years 1924 to 1938, derived from state archival records and the U.S. Census Bureau. All variables have been aggregated to the smallest comparable category: if one state reports only “total taxes” while another splits taxes into different types, the dataset reports total taxes for the first and the sum of the tax types for the second. Please get in touch with the principal investigator if you would like the disaggregated data (especially for Massachusetts and New York). Sample sizes vary from 519 to 819 cities per year.
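The aggregation rule can be sketched as follows in Python. The field names ("total_taxes", the "tax_" prefix) and the example values are hypothetical and do not reflect the dataset's actual variable names or contents.

```python
def harmonise_total_taxes(record):
    """Return a comparable total-tax figure for one city-year record.

    Uses the reported total when present; otherwise sums the reported
    tax components. Field names are hypothetical.
    """
    total = record.get("total_taxes")
    if total is not None:
        return total
    components = [
        value
        for key, value in record.items()
        if key.startswith("tax_") and value is not None
    ]
    if not components:
        return None  # nothing reported, so leave the gap rather than guess
    return sum(components)


# One state reports only a total, another reports tax types separately.
reports_total = {"city": "A", "year": 1930, "total_taxes": 100.0}
reports_types = {"city": "B", "year": 1930, "tax_property": 60.0, "tax_poll": 15.0}

total_a = harmonise_total_taxes(reports_total)
total_b = harmonise_total_taxes(reports_types)
```

Both records end up with a single total-tax value, which is the coarsest category available in every state.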
These are synthetically generated unit-level and area-level population and sample data that can be used for testing model-based small area methods. To prevent disclosure issues, the datasets have been generated by repeated (Monte-Carlo) sampling of real EU-SILC (Survey of Income and Living Conditions) data in Austria. The data include geographical identifiers and can be used for fitting unit-level (Battese-Harter-Fuller type) models and area-level (Fay-Herriot type) models. The datasets are part of the R package emdi. Examples of the use of the data can be found in the emdi manual, available via https://cran.r-project.org/web/packages/emdi/emdi.pdf, and in Kreutzmann et al. (2019).
Kreutzmann, A. K., Pannier, S., Rojas-Perilla, N., Schmid, T., Templ, M., & Tzavidis, N. (2019). The R package emdi for the estimation and mapping of regional disaggregated indicators. Journal of Statistical Software, 91(7). https://doi.org/10.18637/jss.v091.i07
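For orientation, the two model types named above are usually written as follows. These are the standard textbook forms, not taken from the dataset description. The notation is an assumption: i indexes areas, j indexes units within an area, x denotes covariates, and the sampling variances \psi_i are treated as known.

```latex
\begin{align*}
% Unit-level (Battese-Harter-Fuller type) model
y_{ij} &= x_{ij}^{\top}\beta + u_i + e_{ij},
  & u_i &\sim N(0, \sigma_u^2), & e_{ij} &\sim N(0, \sigma_e^2) \\
% Area-level (Fay-Herriot type) model for the direct estimate of area i
\hat{\theta}_i^{\mathrm{dir}} &= x_i^{\top}\beta + u_i + e_i,
  & u_i &\sim N(0, \sigma_u^2), & e_i &\sim N(0, \psi_i)
\end{align*}
```

The unit-level model needs unit-level sample data, while the area-level model needs only direct estimates and their variances per area.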
Reliable statistics are crucial for policy-relevant research. Small Area Estimation (SAE) methods generate robust, reliable and consistent statistics at geographical scales for which survey data are either non-existent or too sparse to provide direct estimates of acceptable accuracy. The last decade has seen a rapid increase in the use of SAE. Statistical agencies and governmental organisations are actively developing their own suites of estimates. In the UK, the Office for National Statistics (ONS) has responded to user demands by producing estimates of average household income for wards and using SAE to answer queries from local authorities, policy advisers and government departments. The Welsh Assembly Government (WAG) is actively seeking to develop capacity for SAE. Public Health England produces SAEs of adolescent smoking and chronic kidney disease. Initial demands for small area statistics are now shifting to requirements for more complex statistics that extend beyond averages and proportions to encompass estimates of statistical distributions, multidimensional indicators (e.g. inequality and deprivation indicators) and methods for replacing the Census and adjusting Census results for undercount. These developing requirements pose significant methodological and applied real-world challenges. The challenges are deepened because different methodological approaches to SAE remain largely unconnected, locked in disciplinary silos. The technical presentation of SAE also impedes more widespread uptake by social scientists and understanding by users. The proposed programme of work aims to (a) develop novel SAE methodologies to better serve the needs of users and producers of SAE, (b) bridge different methodological approaches to SAE, (c) apply SAE to answer substantive questions in the social sciences and (d) 'mainstream' SAE within the quantitative social sciences through the creation of methodologically comprehensive and accessible resources.
The project comprises three work packages of methodologically innovative research designed to deepen the understanding of SAE and achieve the aforementioned aims. The project will capitalise on a cross-disciplinary research team drawn together through an NCRM methodological network and reflecting a large part of the SAE expertise in the UK. Through long-standing collaborations with national and international agencies in the UK, Mexico and Brazil, which are placed at the centre of the project, we enjoy access to individual-level secondary survey and Census data. Collaboration with key SAE users will ensure that the project remains relevant to user needs and that the methodologies are used to expand the set of small area statistics currently available. The involvement of international experts ensures the quality and relevance of the research. Substantive outputs will include SAEs of attributes of interest to users, including income, inequality, deprivation, health and ethnicity, and a realistic pseudo-Census dataset for use by other researchers. The project will advance knowledge across disciplines in the social sciences, including social statistics, applied economics, human geography and sociology. It will also have an impact on the production of official and Census statistics. The project is committed to adding value to NCRM's training and capacity building activities by developing new resources.