Round 1 of the Afrobarometer survey was conducted from July 1999 through June 2001 in 12 African countries to solicit public opinion on democracy, governance, markets, and national identity. The full 12-country dataset was pieced together from different projects: Round 1 of the Afrobarometer survey, the old Southern African Democracy Barometer, and similar surveys done in West and East Africa.
The 7-country dataset is a subset of the Round 1 survey dataset. It combines the data for the 7 Southern African countries surveyed alongside other African countries in Round 1, 1999-2000 (Botswana, Lesotho, Malawi, Namibia, South Africa, Zambia and Zimbabwe). It is a useful dataset because, in contrast to the full 12-country Round 1 dataset, all countries in this dataset were surveyed with the identical questionnaire.
Botswana, Lesotho, Malawi, Namibia, South Africa, Zambia, Zimbabwe
Basic units of analysis that the study investigates include: individuals and groups.
Sample survey data [ssd]
A new sample has to be drawn for each round of Afrobarometer surveys. Whereas the standard sample size for Round 3 surveys will be 1200 cases, a larger sample size will be required in societies that are extremely heterogeneous (such as South Africa and Nigeria), where the sample size will be increased to 2400. Other adaptations may be necessary within some countries to account for the varying quality of the census data or the availability of census maps.
The sample is designed as a representative cross-section of all citizens of voting age in a given country. The goal is to give every adult citizen an equal and known chance of selection for interview. We strive to reach this objective by (a) strictly applying random selection methods at every stage of sampling and by (b) applying sampling with probability proportionate to population size wherever possible. A randomly selected sample of 1200 cases allows inferences to national adult populations with a margin of sampling error of no more than plus or minus 2.5 percent with a confidence level of 95 percent. If the sample size is increased to 2400, the margin of sampling error shrinks to plus or minus 2 percent.
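The relationship between sample size and margin of sampling error can be checked with the textbook formula for a proportion under simple random sampling. This is a minimal sketch, not the Afrobarometer's own calculation: the function name, the worst-case proportion p = 0.5, and the optional design-effect multiplier are assumptions introduced here for illustration. Under these assumptions the formula gives roughly 2.8 points at 1200 cases and 2.0 points at 2400, so the 2.5 figure quoted above should be read as approximate.

```python
import math


def margin_of_error(n, p=0.5, z=1.96, design_effect=1.0):
    """Half-width of a confidence interval for a proportion.

    n             -- number of completed interviews
    p             -- assumed proportion (0.5 is the worst case)
    z             -- critical value (1.96 for 95 percent confidence)
    design_effect -- hypothetical inflation factor for clustering (1.0 = none)
    """
    return z * math.sqrt(design_effect * p * (1.0 - p) / n)


for n in (1200, 2400):
    # Print the margin in percentage points, rounded to one decimal place.
    print(n, round(100 * margin_of_error(n), 1))
```

Because the design is clustered, the true sampling error will usually be somewhat larger than this simple approximation; passing a design effect above 1.0 shows the size of that adjustment.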
Sample Universe
The sample universe for Afrobarometer surveys includes all citizens of voting age within the country. In other words, we exclude anyone who is not a citizen and anyone who has not attained this age (usually 18 years) on the day of the survey. Also excluded are areas determined to be either inaccessible or not relevant to the study, such as those experiencing armed conflict or natural disasters, as well as national parks and game reserves. As a matter of practice, we have also excluded people living in institutionalized settings, such as students in dormitories and persons in prisons or nursing homes.
What to do about areas experiencing political unrest? On the one hand we want to include them because they are politically important. On the other hand, we want to avoid stretching out the fieldwork over many months while we wait for the situation to settle down. It was agreed at the 2002 Cape Town Planning Workshop that it is difficult to come up with a general rule that will fit all imaginable circumstances. We will therefore make judgments on a case-by-case basis on whether or not to proceed with fieldwork or to exclude or substitute areas of conflict. National Partners are requested to consult Core Partners on any major delays, exclusions or substitutions of this sort.
Sample Design
The sample design is a clustered, stratified, multi-stage, area probability sample.
To repeat the main sampling principle, the objective of the design is to give every sample element (i.e. adult citizen) an equal and known chance of being chosen for inclusion in the sample. We strive to reach this objective by (a) strictly applying random selection methods at every stage of sampling and by (b) applying sampling with probability proportionate to population size wherever possible.
In a series of stages, geographically defined sampling units of decreasing size are selected. To ensure that the sample is representative, the probability of selection at various stages is adjusted as follows:
The sample is stratified by key social characteristics in the population such as sub-national area (e.g. region/province) and residential locality (urban or rural). The area stratification reduces the likelihood that distinctive ethnic or language groups are left out of the sample. The urban/rural stratification is a means to make sure that these localities are represented in their correct proportions. Wherever possible, and always in the first stage of sampling, random sampling is conducted with probability proportionate to population size (PPPS). The purpose is to guarantee that larger (i.e., more populated) geographical units have a proportionally greater probability of being chosen into the sample. The sampling design has four stages:
A first stage to stratify and randomly select primary sampling units;
A second stage to randomly select sampling start-points;
A third stage to randomly choose households;
A final stage involving the random selection of individual respondents.
We shall deal with each of these stages in turn.
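The PPPS principle mentioned above can be sketched with systematic selection along a cumulative population list. This is one common way to implement it, assumed here for illustration; the text does not prescribe a specific algorithm, and the function name and the unit populations are hypothetical.

```python
import random
from itertools import accumulate


def pps_systematic(sizes, n, rng):
    """Pick n unit indices with probability proportionate to size.

    sizes -- population of each geographic unit, in list order
    n     -- number of units to select
    rng   -- a random.Random instance (passed in so runs are repeatable)
    """
    total = sum(sizes)
    interval = total / n                  # sampling interval
    start = rng.uniform(0, interval)      # random start within first interval
    cumulative = list(accumulate(sizes))  # running population totals
    selected = []
    idx = 0
    for k in range(n):
        point = start + k * interval
        # Advance to the unit whose cumulative range contains this point.
        while idx < len(sizes) - 1 and cumulative[idx] <= point:
            idx += 1
        selected.append(idx)
    return selected


# Hypothetical populations for three units; select two of them.
print(pps_systematic([100, 300, 600], 2, random.Random(1)))
```

A unit larger than the sampling interval can be hit more than once; in practice such units are often split or treated as certainty selections before sampling.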
STAGE ONE: Selection of Primary Sampling Units (PSUs)
The primary sampling units (PSUs) are the smallest, well-defined geographic units for which reliable population data are available. In most countries, these will be Census Enumeration Areas (or EAs). Most national census data and maps are broken down to the EA level. In the text that follows we will use the acronyms PSU and EA interchangeably because, when census data are employed, they refer to the same unit.
We strongly recommend that NIs use official national census data as the sampling frame for Afrobarometer surveys. Where recent or reliable census data are not available, NIs are asked to inform the relevant Core Partner before they substitute any other demographic data. Where the census is out of date, NIs should consult a demographer to obtain the best possible estimates of population growth rates. These should be applied to the outdated census data in order to make projections of population figures for the year of the survey. It is important to bear in mind that population growth rates vary by area (region) and (especially) between rural and urban localities. Therefore, any projected census data should include adjustments to take such variations into account.
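As an illustration of projecting outdated census figures forward, the sketch below applies a separate annual growth rate to each stratum and then recomputes each stratum's share of the national population. Compound annual growth is an assumption made here; a demographer may recommend a different model. The strata, counts, and rates are hypothetical.

```python
def project_population(census_count, annual_growth_rate, years_elapsed):
    """Project a census count forward assuming compound annual growth."""
    return census_count * (1.0 + annual_growth_rate) ** years_elapsed


# Hypothetical census counts and growth rates per (region, locality) stratum.
census = {("North", "urban"): 200_000, ("North", "rural"): 800_000}
growth = {("North", "urban"): 0.04, ("North", "rural"): 0.02}
years_since_census = 10

projected = {
    stratum: project_population(census[stratum], growth[stratum], years_since_census)
    for stratum in census
}
total = sum(projected.values())
# Shares of the projected national population, used to stratify the sample.
shares = {stratum: count / total for stratum, count in projected.items()}

for stratum, share in shares.items():
    print(stratum, round(share, 3))
```

Because the urban stratum grows faster in this example, its share of the projected population is larger than its share in the original census, which is the kind of variation the adjustment is meant to capture.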
Indeed, we urge NIs to establish collegial working relationships with professionals in the national census bureau, not only to obtain the most recent census data, projections, and maps, but to gain access to sampling expertise. NIs may even commission a census statistician to draw the sample to Afrobarometer specifications, provided that provision for this service has been made in the survey budget.
Regardless of who draws the sample, the NIs should thoroughly acquaint themselves with the strengths and weaknesses of the available census data and the availability and quality of EA maps. The country and methodology reports should cite the exact census data used, its known shortcomings, if any, and any projections made from the data. At minimum, the NI must know the size of the population and the urban/rural population divide in each region in order to specify how to distribute population and PSUs in the first stage of sampling. National investigators should obtain this data in writing before they attempt to stratify the sample.
Once this data is obtained, the sample population (either 1200 or 2400) should be stratified, first by area (region/province) and then by residential locality (urban or rural). In each case, the proportion of the sample in each locality in each region should be the same as its proportion in the national population as indicated by the updated census figures.
Having stratified the sample, it is then possible to determine how many PSUs should be selected for the country as a whole, for each region, and for each urban or rural locality.
The total number of PSUs to be selected for the whole country is determined by calculating the maximum degree of clustering of interviews one can accept in any PSU. Because PSUs (which are usually geographically small EAs) tend to be socially homogeneous, we do not want to select too many people in any one place. Thus, the Afrobarometer has established a standard of no more than 8 interviews per PSU. For a sample size of 1200, the sample must therefore contain 150 PSUs/EAs (1200 divided by 8). For a sample size of 2400, there must be 300 PSUs/EAs.
These PSUs should then be allocated proportionally to the urban and rural localities within each regional stratum of the sample. Let's take a couple of examples from a country with a sample size of 1200. If the urban locality of Region X in this country constitutes 10 percent of the current national population, then the sample for this stratum should be 15 PSUs (calculated as 10 percent of 150 PSUs). If the rural population of Region Y constitutes 4 percent of the current national population, then the sample for this stratum should be 6 PSUs.
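The arithmetic in the two examples above can be reproduced with a short sketch. The population figures are hypothetical and chosen so that Region X urban holds 10 percent and Region Y rural 4 percent of the national total. The largest-remainder rounding used when shares do not divide evenly is an assumption made here; the text does not say how fractional PSUs should be handled.

```python
def total_psus(sample_size, interviews_per_psu=8):
    """Number of PSUs needed at a fixed number of interviews per PSU."""
    if sample_size % interviews_per_psu:
        raise ValueError("sample size must be a multiple of interviews per PSU")
    return sample_size // interviews_per_psu


def allocate_psus(stratum_populations, n_psus):
    """Allocate PSUs to strata in proportion to population.

    Uses integer arithmetic, then hands any leftover PSUs to the strata
    with the largest remainders (an assumed rounding rule).
    """
    total = sum(stratum_populations.values())
    allocation, remainders = {}, {}
    for name, population in stratum_populations.items():
        allocation[name], remainders[name] = divmod(population * n_psus, total)
    leftover = n_psus - sum(allocation.values())
    ranked = sorted(remainders, key=remainders.get, reverse=True)
    for name in ranked[:leftover]:
        allocation[name] += 1
    return allocation


# Hypothetical national population of 10 million split into three strata.
populations = {
    "Region X urban": 1_000_000,      # 10 percent of the national population
    "Region Y rural": 400_000,        # 4 percent
    "All other strata": 8_600_000,
}
n_psus = total_psus(1200)
print(n_psus, allocate_psus(populations, n_psus))
```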
The next step is to select particular PSUs/EAs using random methods. Using the above example of the rural localities in Region Y, let us say that you need to pick 6 sample EAs out of a census list that contains a total of 240 rural EAs in Region Y. But which 6? If the EAs created by the national census bureau are of equal or roughly equal population size, then selection is relatively straightforward. Just number all EAs consecutively, then make six selections using a table of random numbers. This procedure, known as simple random sampling (SRS), will
In 2023, Washington, D.C. had the highest population density in the United States, with 11,130.69 people per square mile. As a whole, there were about 94.83 residents per square mile in the U.S., and Alaska was the state with the lowest population density, with 1.29 residents per square mile.
The problem of population density
Simply put, population density is the population of a country divided by the area of the country. While this can be an interesting measure of how many people live in a country and how large the country is, it does not account for the degree of urbanization, or the share of people who live in urban centers. For example, Russia is the largest country in the world and has a comparatively low population, so its population density is very low. However, much of the country is uninhabited, so cities in Russia are much more densely populated than the rest of the country.
Urbanization in the United States
While the United States is not very densely populated compared to other countries, its population density has increased significantly over the past few decades. The degree of urbanization has also increased, and well over half of the population lives in urban centers.
The Country-Level Population and Downscaled Projections Based on Special Report on Emissions Scenarios (SRES) A1, B1, and A2 Scenarios, 1990-2100, were adopted in 2000 from population projections realized at the International Institute for Applied Systems Analysis (IIASA) in 1996. The Intergovernmental Panel on Climate Change (IPCC) SRES A1 and B1 scenarios both used the same IIASA "rapid" fertility transition projection, which assumes low fertility and low mortality rates. The SRES A2 scenario used a corresponding IIASA "slow" fertility transition projection (high fertility and high mortality rates). Both IIASA low and high projections are performed for 13 world regions including North Africa, Sub-Saharan Africa, China and Centrally Planned Asia, Pacific Asia, Pacific OECD, Central Asia, Middle East, South Asia, Eastern Europe, European part of the former Soviet Union, Western Europe, Latin America, and North America. This data set is produced and distributed by the Columbia University Center for International Earth Science Information Network (CIESIN).
Between Oct. 14, 2014, and May 21, 2015, Pew Research Center, with generous funding from The Pew Charitable Trusts and the Neubauer Family Foundation, completed 5,601 face-to-face interviews with non-institutionalized adults ages 18 and older living in Israel.
The survey sampling plan was based on six districts defined in the 2008 Israeli census. In addition, Jewish residents of the West Bank (Judea and Samaria) were included.
The sample includes interviews with 3,789 respondents defined as Jews, 871 Muslims, 468 Christians and 439 Druze. An additional 34 respondents belong to other religions or are religiously unaffiliated. Five groups were oversampled as part of the survey design: Jews living in the West Bank, Haredim, Christian Arabs, Arabs living in East Jerusalem and Druze.
Interviews were conducted under the direction of Public Opinion and Marketing Research of Israel (PORI). Surveys were administered through face-to-face, paper and pencil interviews conducted at the respondent's place of residence. Sampling was conducted through a multi-stage stratified area probability sampling design based on national population data available through the Israel Central Bureau of Statistics' 2008 census.
The questionnaire was designed by Pew Research Center staff in consultation with subject matter experts and advisers to the project. The questionnaire was translated into Hebrew, Russian and Arabic, independently verified by professional linguists conversant in regional dialects and pretested prior to fieldwork.
The questionnaire was divided into four sections. All respondents who took the survey in Russian or Hebrew were branched into the Jewish questionnaire (Questionnaire A). Arabic-speaking respondents were branched into the Muslim (Questionnaire B), Christian (Questionnaire C) or Druze questionnaire (D) based on their response to the religious identification question. For the full question wording and exact order of questions, please see the questionnaire.
Note that not all respondents who took the questionnaire in Hebrew or Russian are classified as Jews in this study. For further details on how respondents were classified as Jews, Muslims, Christians and Druze in the study, please see the sidebar titled "How Religious Groups Are Defined" in the report (http://www.pewforum.org/2016/03/08/israels-religiously-divided-society/).
Following fieldwork, survey performance was assessed by comparing the results for key demographic variables with population statistics available through the census. Data were weighted to account for different probabilities of selection among respondents. Where appropriate, data also were weighted through an iterative procedure to more closely align the samples with official population figures for gender, age and education. The reported margins of sampling error and the statistical tests of significance used in the analysis take into account the design effects due to weighting and sample design.
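The iterative weighting step described above is commonly implemented as raking (iterative proportional fitting). The text does not name the exact procedure, so the sketch below is an assumption about its general shape, not the survey's actual code. The respondents, base weights, and target shares are hypothetical, and only two of the three variables (gender and age) are used to keep the example short.

```python
def rake(rows, base_weights, targets, iterations=50):
    """Adjust weights so weighted category shares match target shares.

    rows         -- list of dicts, one per respondent
    base_weights -- starting weights (e.g. inverse selection probabilities)
    targets      -- {variable: {category: population share}}
    """
    weights = list(base_weights)
    for _ in range(iterations):
        for variable, shares in targets.items():
            total = sum(weights)
            # Current weighted total of each category of this variable.
            current = {}
            for row, w in zip(rows, weights):
                current[row[variable]] = current.get(row[variable], 0.0) + w
            # Scale every respondent so this variable's margins match.
            weights = [
                w * shares[row[variable]] * total / current[row[variable]]
                for row, w in zip(rows, weights)
            ]
    return weights


def weighted_share(rows, weights, variable, category):
    """Weighted share of respondents falling in one category."""
    hit = sum(w for row, w in zip(rows, weights) if row[variable] == category)
    return hit / sum(weights)


# Hypothetical respondents and official population shares.
respondents = [
    {"gender": "F", "age": "18-39"}, {"gender": "F", "age": "18-39"},
    {"gender": "F", "age": "40+"}, {"gender": "M", "age": "18-39"},
    {"gender": "M", "age": "40+"}, {"gender": "M", "age": "40+"},
]
targets = {"gender": {"F": 0.5, "M": 0.5}, "age": {"18-39": 0.4, "40+": 0.6}}
final_weights = rake(respondents, [1.0] * len(respondents), targets)
print([round(w, 3) for w in final_weights])
```

Each pass fixes one variable's margins and slightly disturbs the others, which is why the adjustment is repeated until the weighted shares settle.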
In addition to sampling error and other practical difficulties, one should bear in mind that question wording also can have an impact on the findings of opinion polls.
This dataset consists of growth and yield data for each season when soybean [Glycine max (L.) Merr.] was grown for seed at the USDA-ARS Conservation and Production Laboratory (CPRL), Soil and Water Management Research Unit (SWMRU) research weather station, Bushland, Texas (Lat. 35.186714°, Long. -102.094189°, elevation 1170 m above MSL). In the 1995, 2003, 2004, and 2010 seasons, soybean was grown on two large, precision weighing lysimeters, each in the center of a 4.44 ha square field. In 2019, soybean was grown on four large, precision weighing lysimeters and their surrounding 4.4 ha fields. The square fields are themselves arranged in a larger square with four fields in four adjacent quadrants of the larger square. Fields and lysimeters within each field are thus designated northeast (NE), southeast (SE), northwest (NW), and southwest (SW). Soybean was grown on different combinations of fields in different years. Irrigation was by linear move sprinkler system in 1995, 2003, 2004, and 2010, although in 2010 only one irrigation was applied to establish the crop, after which it was grown as a dryland crop. Irrigation protocols described as full were managed to replenish soil water used by the crop on a weekly or more frequent basis as determined by soil profile water content readings made with a neutron probe to 2.4-m depth in the field. Irrigation protocols described as deficit typically involved irrigations to establish the crop early in the season, followed by reduced or absent irrigations later in the season (typically in the later winter and spring). The growth and yield data include plant population density, height, plant row width, leaf area index, growth stage, total above-ground biomass, leaf and stem biomass, head mass (when present), kernel or seed number, and final yield. Data are from replicate samples in the field and non-destructive (except for final harvest) measurements on the weighing lysimeters.
In most cases yield data are available from both manual sampling on replicate plots in each field and from machine harvest. Machine harvest yields are commonly smaller than hand harvest yields due to combine losses. These datasets originate from research aimed at determining crop water use (ET), crop coefficients for use in ET-based irrigation scheduling based on a reference ET, crop growth, yield, harvest index, and crop water productivity as affected by irrigation method, timing, amount (full or some degree of deficit), agronomic practices, cultivar, and weather. Prior publications have focused on soybean ET, crop coefficients, and crop water productivity. Crop coefficients have been used by ET networks. The data have utility for testing simulation models of crop ET, growth, and yield and have been used for testing and calibrating models of ET that use satellite and/or weather data. See the README for descriptions of each data file.
Resources in this dataset:
Resource Title: 1995 Bushland, TX, west soybean growth and yield data. File Name: 1995 West Soybean_Growth_and_Yield-V2.xlsx
Resource Title: 2003 Bushland, TX, east soybean growth and yield data. File Name: 2003 East Soybean_Growth_and_Yield-V2.xlsx
Resource Title: 2004 Bushland, TX, east soybean growth and yield data. File Name: 2004 East Soybean_Growth-and_Yield-V2.xlsx
Resource Title: 2019 Bushland, TX, east soybean growth and yield data. File Name: 2019 East Soybean_Growth_and_Yield-V2.xlsx
Resource Title: 2019 Bushland, TX, west soybean growth and yield data. File Name: 2019 West Soybean_Growth_and_Yield-V2.xlsx
Resource Title: 2010 Bushland, TX, west soybean growth and yield data. File Name: 2010 West_Soybean_Growth_and_Yield-V2.xlsx
Resource Title: README. File Name: README_Soybean_Growth_and_Yield.txt
https://dataful.in/terms-and-conditions
The dataset gives population estimates of tigers. In the dataset, states are grouped by landscape complex. Shivalik-Gangetic Plain Landscape Complex: Uttarakhand, Uttar Pradesh, Bihar. Central India Landscape Complex: Andhra Pradesh (including Telangana), Chhattisgarh, Madhya Pradesh, Maharashtra, Odisha, Rajasthan, Jharkhand. Western Ghats Landscape Complex: Karnataka, Kerala, Tamil Nadu, Goa. North East Hills and Brahmaputra Flood Plains: Assam, Arunachal Pradesh, Mizoram, Northern West Bengal. Sundarbans is listed as a separate landscape. NB: Ranipur (Uttar Pradesh) is added to the Shivalik landscape for convenience. State population estimates do not add up to the landscape estimate because of common tigers, tigers outside protected areas, and model range limits.
The resource comprises population surfaces generated from publicly available GB Census data for 1971, 1981, 1991, 2001 and 2011 to enable direct comparisons between Censuses. Population surfaces are estimates of counts of people for regular grids (with population estimates over, for example, 1km by 1km grid cells) and these can be directly compared between Censuses. Variables include age, country of birth, ethnicity, housing tenure, employment, self-reported health, overcrowding and a composite measure of deprivation over 1km by 1km cells for all Censuses where variables are available.
The research will explore how the population of the UK is, or has been, geographically distributed. The project will bring a new and important perspective to debates about divisions, inequalities and the ways in which people in the UK live together or apart. It will address questions such as: are health inequalities between places greater now than in the past? What makes localities different - are they geographically distinguished more by housing tenure or health than they are by employment status or ethnicity? What areas have the greatest diversity of people and how has this changed between 1971 and 2011? To answer these questions, we will generate population surfaces from publicly available Census data for 1971, 1981, 1991, 2001 and 2011 to enable direct comparisons between Censuses. Counts of people in a variety of population sub-groups (e.g., by qualifications, age, etc.) have been released from each Census for sets of small geographical areas (such as enumeration districts or output areas). This allows the mapping and analysis of geographical patterning in population groups across the UK for each Census. However, these small areas differ in size and shape between Censuses, so the 1971 small area boundaries, for example, are very different to those for 2011. This project will produce population surfaces for each Census year as a means of overcoming this problem.
Population surfaces are estimates of counts of people for regular grids (with population estimates over, for example, 100m by 100m grid cells); these can be directly compared between Censuses. So, once these population surfaces are available we will be able to consider how localities have changed and in what ways. This new population surface resource will be made freely available so that users can explore these changes for themselves and also consider in more depth the results we produce. We will use this resource to provide the first systematic review of how the population of the UK has changed over the last 40 years. It will show how population groups in the UK are geographically distributed and it will assess, in detail, how far different localities (for example, within central Scotland) or regions (for example, south east England or north west England) are becoming more similar or more different to one another in terms of their population characteristics. The project will also consider how the relationships between population groups have changed across time. For example, with a consistent geography, it will be possible to assess which small area localities have very high rates of unemployment together with large proportions of social rented households, and how the characteristics of these localities changed between 1971 and 2011. We will also be able to identify which population characteristics most strongly distinguish particular areas. As an example, the population in some localities in north west England may be very similar in terms of levels of poor health, unemployment and housing tenure, but differ in terms of the number of single person households or the average number of dependent children. The project will explore these differences in detail and, for the first time, construct a detailed profile of the geographical distribution of individual population groups and the multiple characteristics of areas in combination. 
The population surface resource will be invaluable to any users interested in the population geography of the UK, while the results of our analysis of population distributions will enrich our understanding of the ways in which the population of the UK has changed over the last 40 years. The resource was generated using small area (enumeration district and output area) Census data for Britain. The counts for these areas were reallocated to 1km by 1km grid cells using information on postcode intensity to determine how many people in each population group (e.g., unemployed people) should be transferred from the original zones to the 1km by 1km cells.
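The reallocation step described above can be illustrated with a simple proportional split. This is only a sketch of the general idea, assuming that "postcode intensity" means the number of a zone's postcodes falling in each grid cell; the project's actual method may weight or smooth the counts differently. The zone names, counts, and cell coordinates are hypothetical.

```python
from collections import defaultdict


def reallocate_to_grid(zone_counts, zone_postcode_cells):
    """Spread each zone's count over grid cells by postcode share.

    zone_counts         -- {zone: count of people in a population group}
    zone_postcode_cells -- {zone: [grid cell of each postcode in the zone]}
    """
    grid = defaultdict(float)
    for zone, count in zone_counts.items():
        cells = zone_postcode_cells[zone]
        # Each postcode carries an equal slice of the zone's count.
        per_postcode = count / len(cells)
        for cell in cells:
            grid[cell] += per_postcode
    return dict(grid)


# Hypothetical zones: counts of unemployed people and postcode locations,
# with each cell identified by its (easting km, northing km) index.
counts = {"zone_A": 100, "zone_B": 40}
postcodes = {
    "zone_A": [(0, 0), (0, 0), (0, 1), (1, 1)],
    "zone_B": [(0, 1), (0, 1)],
}
surface = reallocate_to_grid(counts, postcodes)
print(surface)
```

The total count is preserved by construction, so the grid cells sum to the same number of people as the original zones.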
This dataset consists of growth and yield data for each year when maize (Zea mays, L., also known as corn in the United States) was grown for grain at the USDA-ARS Conservation and Production Laboratory (CPRL), Soil and Water Management Research Unit (SWMRU) research weather station, Bushland, Texas (Lat. 35.186714°, Long. -102.094189°, elevation 1170 m above MSL). Maize was grown for grain on four large, precision weighing lysimeters, each in the center of a 4.44 ha square field. The four square fields are themselves arranged in a larger square with the fields in four adjacent quadrants of the larger square. Fields and lysimeters within each field are thus designated northeast (NE), southeast (SE), northwest (NW), and southwest (SW). Irrigation was by linear move sprinkler system in 1989, 1990, and 1994. In 2013, 2016, and 2018, two lysimeters and their respective fields (NE and SE) were irrigated using subsurface drip irrigation (SDI), and two lysimeters and their respective fields (NW and SW) were irrigated by a linear move sprinkler system. Irrigations were managed to replenish soil water used by the crop on a weekly or more frequent basis as determined by soil profile water content readings made with a neutron probe to 2.4-m depth in the field. The growth and yield data include plant population density, height, plant row width, leaf area index, growth stage, total above-ground biomass, leaf and stem biomass, ear mass (when present), kernel number, and final yield. Data are from replicate samples in the field and non-destructive (except for final harvest) measurements on the weighing lysimeters. In most cases yield data are available from both manual sampling on replicate plots in each field and from machine harvest. 
These datasets originate from research aimed at determining crop water use (ET), crop coefficients for use in ET-based irrigation scheduling based on a reference ET, crop growth, yield, harvest index, and crop water productivity as affected by irrigation method, timing, amount (full or some degree of deficit), agronomic practices, cultivar, and weather. Prior publications have focused on maize ET, crop coefficients, and crop water productivity. Crop coefficients have been used by ET networks. The data have utility for testing simulation models of crop ET, growth, and yield and have been used by the Agricultural Model Intercomparison and Improvement Project (AgMIP), by OPENET, and by many others for testing and calibrating models of ET that use satellite and/or weather data.
Resources in this dataset:
Resource Title: 1989 Bushland, TX, east maize growth and yield data. File Name: 1989_East_Maize_Growth_and_Yield(ADC).xlsx. Resource Description: Growth and yield data for one of the seasons when maize was grown for grain at the Bushland site, with the site, irrigation, and measurements as described above. There are separate spreadsheets for the east (NE and SE) lysimeters and fields, and for the west (NW and SW) lysimeters and fields. The spreadsheets contain tabs for data and corresponding tabs for data dictionaries. Typically there are separate data tabs and corresponding dictionaries for plant growth during the season, crop growth stage, plant population, manual harvest from replicate plots in each field and from lysimeter surfaces, and machine (combine) harvest. An Introduction tab explains the tab names and contents, lists the authors, explains conventions, and lists some relevant references.
Resource Title: 1990 Bushland, TX, east maize growth and yield data. File Name: 1990_East_Maize_Growth_and_Yield(ADC).xlsx. Resource Description: As above for 1990 East.
Resource Title: 1994 Bushland, TX, east maize growth and yield data. File Name: 1994_East_Maize_Growth_and_Yield(ADC).xlsx. Resource Description: As above for 1994 East.
Resource Title: 1994 Bushland, TX, west maize growth and yield data. File Name: 1994_West_Maize_Growth_and_Yield(ADC).xlsx. Resource Description: As above for 1994 West.
Resource Title: 2013 Bushland, TX, west maize growth and yield data. File Name: 2013_West_Maize_Growth_and_Yield(ADC).xlsx. Resource Description: As above for 2013 West.
Resource Title: 2016 Bushland, TX, east maize growth and yield data. File Name: 2016_East_Maize_Growth_and_Yield(ADC).xlsx. Resource Description: As above for 2016 East.
Resource Title: 2016 Bushland, TX, west maize growth and yield data. File Name: 2016_West_Maize_Growth_and_Yield(ADC).xlsx. Resource Description: As above for 2016 West.
Resource Title: 2018 Bushland, TX, west maize growth and yield data. File Name: 2018_West_Maize_Growth_and_Yield(ADC).xlsx. Resource Description: As above for 2018 West.
Resource Title: 2013 Bushland, TX, east maize growth and yield data. File Name: 2013_East_Maize_Growth_and_Yield(ADC).xlsx. Resource Description: As above for 2013 East.
Resource Title: 2018 Bushland, TX, east maize growth and yield data. File Name: 2018_East_Maize_Growth_and_Yield(ADC).xlsx. Resource Description: As above for 2018 East.
Open Government Licence 3.0: http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
National and subnational mid-year population estimates for the UK and its constituent countries by administrative area, age and sex (including components of population change, median age and population density).
Berlin in the crisis situation after construction of the wall on 13 August 1961.
Topics: Judgement on the prospects for the future of Berlin; personal prospects for the future; visits in the eastern part of the city before construction of the wall; trip to the wall; assumptions about how long the wall will remain in existence; assessment of the future political task of Berlin; judgement on the East-West balance of power and the current situation of Berlin; knowledge about the loudspeaker broadcasts; judgement about measures of the senate; possibilities of the Americans to prevent the wall; assumed reasons for toleration of construction of the wall; judgement on the conduct of the Americans after construction of the wall; assessment of the danger of war due to Berlin; attitude to the suburban railway boycott and deployment of police in demonstrations; judgement on the allied Berlin guarantees, the foreign policy as well as general measures of the Federal Government; judgement on the readiness for action for Berlin of West Germans as well as selected politicians; personal participation in political rallies; significance of the resignation of Willy Brandt; possible concessions of the West in negotiations with the East; attitude to the Oder-Neisse Line and assessment of the chances of reunification; basic attitude to democracy and politics; attitude to one or multi-party system; conduct regarding agents from the East; assessment of the departures from Berlin; personal readiness to leave; local residency; relatives in the East; newspapers read; political interest; self-image of the Berliners and image of the West Germans; political interest; party preference.
Demography: sex; age (classified); employment; occupation; school education; religiousness; religious denomination; household income.
(https://www.kaggle.com/c/house-prices-advanced-regression-techniques)
About this Dataset
Start here if...
You have some experience with R or Python and machine learning basics. This is a perfect competition for data science students who have completed an online course in machine learning and are looking to expand their skill set before trying a featured competition.
Competition Description
Ask a home buyer to describe their dream house, and they probably won't begin with the height of the basement ceiling or the proximity to an east-west railroad. But this playground competition's dataset proves that much more influences price negotiations than the number of bedrooms or a white-picket fence.
With 79 explanatory variables describing (almost) every aspect of residential homes in Ames, Iowa, this competition challenges you to predict the final price of each home.
Practice Skills
Creative feature engineering
Advanced regression techniques like random forest and gradient boosting
Acknowledgments
The Ames Housing dataset was compiled by Dean De Cock for use in data science education. It's an incredible alternative for data scientists looking for a modernized and expanded version of the often cited Boston Housing dataset.
Opinion on questions concerning security policy. East-West comparison. Topics: Satisfaction with the standard of living; attitude to France, Great Britain, Italy, USA, USSR, Red China and West Germany; preferred East-West-orientation of one's own country and correspondence of national interests with the interests of selected countries; judgement on the American, Soviet and Red Chinese peace efforts; judgement on the foreign policy of the USA and the USSR; trust in the foreign policy capabilities of the USA; the most powerful country in the world, currently and in the future; comparison of the USA with the USSR concerning economic and military strength, nuclear weapons and the areas of culture, science, space research, education as well as the economic prospects for the average citizen; significance of a landing on the moon; Soviet citizen or American as first on the moon; assumed significance of space research for military development; attitude to a united Europe and Great Britain's joining the Common Market; preferred relation of a united Europe to the United States; fair share of the pleasant things of life; lack of effort or fate as reasons for poverty; general contentment with life; perceived growth rate of the country's population and preference for population growth; attitude to the growth of the population of the world; preferred measures against over-population; attitude to a birth control program in the developing countries and in one's own country; present politician idols in Europe and in the rest of the world; attitude to disarmament; trust in the alliance partners; degree of familiarity with the NATO and assessment of its present strength; attitude to a European nuclear force; desired and estimated loyalty of the Americans to the NATO alliance partners; evaluation of the development of the UN; equal voice for all members of the UN; desired distribution of the UN financial burdens; attitude to an acceptance of Red China in the United Nations; knowledge about battles in Vietnam; attitude to the Vietnam war; attitude to the behavior of America, Red China and the Soviet Union in this conflict; attitude to the withdrawal of American troops from Vietnam and preferred attitude of one's own country in this conflict and in case of a conflict with Red China; opinion on the treatment of colored people in Great Britain, America and the Soviet Union; judgement on the American Federal Government and on the American population regarding the equality of Negros; degree of familiarity with the Chinese nuclear tests; effects of this test on the military strength of Red China; attitude to American private investments in the Federal Republic; the most influential groups and organizations in the country; party preference; religiousness. Interviewer rating: social class of respondent. Additionally encoded were: number of contact attempts; date of interview.
Abstract copyright UK Data Service and data collection copyright owner. The Ashford study aimed to create a machine-readable database of information relating to the social and economic activities of the inhabitants of Ashford in the mid to late nineteenth century. Material has been transcribed from the census enumerators' books, civil registers of births, marriages and deaths, poll books, trade directories and Parliamentary sessional papers listings of landowners. Each source is held as a separate data file containing the following information:
1) Census data: The data consist of all persons residing in Ashford and surrounding rural areas on the night of the censuses of 1841, 1851 and 1861. However, the data file for 1841 is only a partial transcript as some of the original enumerators' books are missing. The data comprise one record for each individual, transcribed as recorded in the source with some additional information.
2) Civil register data: The data comprise all marriages which took place in Ashford churches between 1837 and 1870, in total about 1600, of which around 600 were in non-conformist churches. Also recorded are births and deaths registered in the Ashford division of the West Ashford Union, including the parishes of Bethersden, Great Chart, Hothfield, Kingsnorth, Shadoxhurst and Ashford. For each marriage two records were prepared, one for the husband and one for the wife, thereby retaining all details of the marriage. Births and deaths are recorded as in the original source with some minor coding.
3) Directory data: The data comprise details of various members of the community including local officers, gentry, professionals, shopkeepers and traders.
4) Electoral data: The data comprise details of enfranchised members of constituencies and how they cast their votes in the Parliamentary elections of 1852, 1857, 1863 and 1868.
5) Landowners data: The data comprise details of landowners, land and estimated rental for all those people with addresses in the East or West Ashford Union who owned land in Kent.
Main Topics: The data files may be analysed separately or linked and merged to provide a means of evaluating and quantifying aspects of the lives of the people of Ashford over a period of time. The data may be of interest for a wide range of topics, for example, fertility patterns, marriage patterns, household structure and composition, migration (during this period Ashford experienced a large influx of migrants associated with the newly built railway works), economic activities and social composition.
No sampling (total universe)
Attitudes toward parity and the fall of the Berlin Wall. Topics: 1. Parity: Advocacy of women taking responsible positions in politics, government institutions, and business; areas where women should take more responsible positions (in political parties, governments, large business enterprises, public administrations, small and medium-sized business enterprises, courts, universities, and the Federal Armed Forces, none of the above, elsewhere); change in policy if more women were represented in politics; Advocacy of government measures to increase the quota of women in the Bundestag; opinion on various statements about the role of sex and origin in selection for positions (it does not matter whether a man or a woman takes responsibility, sex should play a role in selection for positions, origin should play a role in selection for positions, women should be preferred in selection for positions, those who prefer women in selection for positions must also prefer other groups). 2. Fall of the Wall: Longing back for the GDR; agreement with various statements about the GDR and the relationship between West and East Germans (many things were better in the GDR, West Germans treat East Germans from above, West Germans do not understand what East Germans accomplished in GDR times, West Germans do not understand what East Germans went through in GDR times); assessment of German unity rather positive or rather negative; Agreement with various statements on the peaceful revolution in the GDR and reunification (the peaceful revolution was a stroke of luck in German history, the fall of the Wall was a very moving moment for me, all in all reunification was successful, people in East and West Germany have come closer together after reunification, the alignment of living conditions in East and West Germany has progressed in recent years, East Germans are also materially better off after reunification). 
Demography: sex; age; school education; occupation; household size; number of persons older than 14 in household; net household income; party preference. Additionally coded were: interview number; weight; city size (BIK); federal state; region (old federal states, new federal states).
Estimated number of persons by quarter of a year and by year, Canada, provinces and territories.
https://www.pioneerdatahub.co.uk/data/data-request-process/
OMOP dataset: Hospital COVID patients: severity, acuity, therapies, outcomes Dataset number 2.0
Coronavirus disease 2019 (COVID-19) was identified in January 2020. Currently, there have been more than 6 million cases & more than 1.5 million deaths worldwide. Some individuals experience severe manifestations of infection, including viral pneumonia, adult respiratory distress syndrome (ARDS) & death. There is a pressing need for tools to stratify patients, to identify those at greatest risk. Acuity scores are composite scores which help identify patients who are more unwell to support & prioritise clinical care. There are no validated acuity scores for COVID-19 & it is unclear whether standard tools are accurate enough to provide this support. This secondary care COVID OMOP dataset contains granular demographic, morbidity, serial acuity and outcome data to inform risk prediction tools in COVID-19.
PIONEER geography The West Midlands (WM) has a population of 5.9 million & includes a diverse ethnic & socio-economic mix. There is a higher than average percentage of minority ethnic groups. WM has a large number of elderly residents but is the youngest population in the UK. Each day >100,000 people are treated in hospital, see their GP or are cared for by the NHS. The West Midlands was one of the hardest hit regions for COVID admissions in both wave 1 & 2.
EHR. University Hospitals Birmingham NHS Foundation Trust (UHB) is one of the largest NHS Trusts in England, providing direct acute services & specialist care across four hospital sites, with 2.2 million patient episodes per year, 2750 beds & 100 ITU beds. UHB runs a fully electronic healthcare record (EHR) (PICS; Birmingham Systems), a shared primary & secondary care record (Your Care Connected) & a patient portal “My Health”. UHB has cared for >5000 COVID admissions to date. This is a subset of data in OMOP format.
Scope: All COVID swab confirmed hospitalised patients to UHB from January – August 2020. The dataset includes highly granular patient demographics & co-morbidities taken from ICD-10 & SNOMED-CT codes. Serial, structured data pertaining to care process (timings, staff grades, specialty review, wards), presenting complaint, acuity, all physiology readings (pulse, blood pressure, respiratory rate, oxygen saturations), all blood results, microbiology, all prescribed & administered treatments (fluids, antibiotics, inotropes, vasopressors, organ support), all outcomes.
Available supplementary data: Health data preceding & following admission event. Matched “non-COVID” controls; ambulance, 111, 999 data, synthetic data. Further OMOP data available as an additional service.
Available supplementary support: Analytics, Model build, validation & refinement; A.I.; Data partner support for ETL (extract, transform & load) process, Clinical expertise, Patient & end-user access, Purchaser access, Regulatory requirements, Data-driven trials, “fast screen” services.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
FUME data on projected distributions of migrants at local level between 2030 and 2050. The dataset contains a folder of data for each destination city as a gridded dataset at 100m resolution in GeoTIFF format. The examined destination cities are: Amsterdam, Copenhagen, Krakow and Rome. The dataset is provided as 100m grid cells based on the Eurostat GISCO grid of the 2021 NUTS version, using ETRS89 Lambert Azimuthal Equal-Area (EPSG: 3035) as coordinate system. The file names consist of the projected year, the corresponding scenario, and the reference migrant group. The projections have been performed for the years 2030, 2040 and 2050. The investigated scenarios are the following:
• benchmark (bs),
• baseline (bs),
• Rising East (re),
• EU Recovery (eur),
• Intensifying Global Competition (igc), and
• War (war).
The migration background is derived from data about the Region of Origin (RoO) for migrants in Copenhagen and Amsterdam, and from Region of Citizenship (CoC) for migrants in Krakow and Rome.
The case study of Copenhagen covers the two central NUTS3 areas (DK011, DK012) and the groups presented are the following:
• total population (totalpop),
• native population (DNK),
• Eastern EU European migrants (EU_East),
• Western EU European migrants (EU_West),
• Non-EU European migrants (EurNonEU),
• migrants from Turkey (Turkey),
• the MENAP countries (MENAP; excluding Turkey),
• other non-Western (OthNonWest), and
• other Western countries (OthWestern).
The case study of Amsterdam covers one NUTS3 area (NL329) and the presented groups are the following:
• total population (totalpop),
• native population (NLD),
• Eastern EU European migrants (EU East),
• Western EU European migrants (EU West),
• migrants from Turkey and Morocco (Turkey + Morocco),
• migrants from the Middle East and Africa (Middle East + Africa),
• migrants from the former colonies (Former Colonies), and
• migrants from the rest of the world (Other Europe etc).
The case study of Krakow covers the Municipality of Krakow, and the presented groups are the following:
• total population (totalpop),
• native population (POL),
• EU/EFTA European migrants (EU),
• non-EU European migrants (Europe_nonEU), and
• migrants from the rest of the world (Other).
The case study of Rome covers the Municipality of Rome, and the presented groups are the following:
• total population (totalpop),
• native population (ITA),
• migrants from Romania (ROU),
• Philippines (PHL),
• Bangladesh (BGD),
• the EU (EU; excluding Romania),
• Africa (Africa),
• Asia (Asia; excluding Philippines and Bangladesh) and
• America (America).
Provides regional identifiers for county based regions of various types. These can be combined with other datasets for visualization, mapping, analyses, and aggregation. These regions include:
• Metropolitan Statistical Areas (Current): MSAs as defined by US OMB in 2023
• Metropolitan Statistical Areas (2010s): MSAs as defined by US OMB in 2013
• Metropolitan Statistical Areas (2000s): MSAs as defined by US OMB in 2003
• Region: Three broad regions in North Carolina (Eastern, Western, Central)
• Council of Governments
• Prosperity Zones: NC Department of Commerce Prosperity Zones
• NCDOT Divisions: NC Dept. of Transportation Divisions
• NCDOT Districts (within Divisions)
• Metro Regions: Identifies Triangle, Triad, Charlotte, All Other Metros, & Non-Metropolitan
• Urban/Rural defined by:
  • NC Rural Center (Urban, Regional/Suburban, Rural) - 2020 Census designations
  • 2010 Census (Urban = Counties with 50% or more population living in urban areas in 2010)
  • 2010 Census Urbanized (Urban = Counties with 50% or more of the population living in urbanized areas in 2010 (50,000+ sized urban area))
  • Municipal Population - State Demographer (Urban = counties with 50% or more of the population living in a municipality as of July 1, 2019)
  • Isserman Urban-Rural Density Typology
https://www.pioneerdatahub.co.uk/data/data-request-process/
Strokes can be ischaemic or haemorrhagic in nature, leading to debilitating symptoms which are dependent on the location of the stroke in the brain and the severity of the insult. Stroke care is centred around Hyper-acute Stroke Units (HASU), Acute Stroke and Brain Injury Units (ASU/ABIU) and specialist stroke services. Early presentation enables the use of more invasive treatments to clear blood clots, but commonly strokes present late, preventing their use.
This synthetic dataset represents approximately 29,000 stroke patients. Data includes demography, socioeconomic status, co-morbidities, “time stamped” serial acuity, physiology and treatments, investigations (structured and unstructured data), hospital care processes, and outcomes.
The dataset was created using the Synthetic Data Vault (SDV) package, specifically employing the GAN synthesizer. Real data was first read and pre-processed, ensuring datetime columns were correctly parsed and identifiers were handled as strings. Metadata was defined to capture the schema, specifying field types and primary keys. This metadata guided the synthesizer in understanding the structure of the data. The GAN synthesizer was then fitted to the real data, learning the distributions and dependencies within. After fitting, the synthesizer generated synthetic data that mirrors the statistical properties and relationships of the original dataset.
Geography: The West Midlands (WM) has a population of 6 million & includes a diverse ethnic & socio-economic mix. UHB is one of the largest NHS Trusts in England, providing direct acute stroke services & specialist care across four hospital sites.
Data set availability: Data access is available via the PIONEER Hub for projects which will benefit the public or patients. Data access can be provided to NHS, academic, commercial, policy and third sector organisations. Applications from SMEs are welcome. There is a single data access process, with public oversight provided by our public review committee, the Data Trust Committee. Contact pioneer@uhb.nhs.uk or visit www.pioneerdatahub.co.uk for more details.
Available supplementary data: Matched controls; ambulance and community data. Unstructured data (images). We can provide the dataset in OMOP and other common data models and can build synthetic data to meet bespoke requirements.
Available supplementary support: Analytics, model build, validation & refinement; A.I. support. Data partner support for ETL (extract, transform & load) processes. Bespoke and “off the shelf” Trusted Research Environment (TRE) build and run. Consultancy with clinical, patient & end-user and purchaser access/ support. Support for regulatory requirements. Cohort discovery. Data-driven trials and “fast screen” services to assess population size.
The Keriyan people live in an isolated village in the Taklimakan Desert in Xinjiang, Western China. The origin and migration of the Keriyans remains unclear. We studied paternal and maternal genetic variance through typing Y-STR loci and sequencing the complete control region of the mtDNA and compared them with other adjacent populations. Data show that the Keriyan have relatively low genetic diversity on both the paternal and maternal lineages and possess both European and Asian specific haplogroups, indicating Keriyan is an admixture population of West and East. There is a gender-bias in the extent of contribution from Europe vs. Asia to the Keriyan gene pool. Keriyans have more genetic affinity to Uyghurs than to Tibetans. The Keriyan are not the descendants of the Guge Tibetans.
Round 1 of the Afrobarometer survey was conducted from July 1999 through June 2001 in 12 African countries, to solicit public opinion on democracy, governance, markets, and national identity. The full 12 country dataset released was pieced together out of different projects: Round 1 of the Afrobarometer survey, the old Southern African Democracy Barometer, and similar surveys done in West and East Africa.
The 7 country dataset is a subset of the Round 1 survey dataset, and consists of a combined dataset for the 7 Southern African countries surveyed with other African countries in Round 1, 1999-2000 (Botswana, Lesotho, Malawi, Namibia, South Africa, Zambia and Zimbabwe). It is a useful dataset because, in contrast to the full 12 country Round 1 dataset, all countries in this dataset were surveyed with the identical questionnaire.
Botswana Lesotho Malawi Namibia South Africa Zambia Zimbabwe
Basic units of analysis that the study investigates include: individuals and groups
Sample survey data [ssd]
A new sample has to be drawn for each round of Afrobarometer surveys. Whereas the standard sample size for Round 3 surveys will be 1200 cases, a larger sample size will be required in societies that are extremely heterogeneous (such as South Africa and Nigeria), where the sample size will be increased to 2400. Other adaptations may be necessary within some countries to account for the varying quality of the census data or the availability of census maps.
The sample is designed as a representative cross-section of all citizens of voting age in a given country. The goal is to give every adult citizen an equal and known chance of selection for interview. We strive to reach this objective by (a) strictly applying random selection methods at every stage of sampling and by (b) applying sampling with probability proportionate to population size wherever possible. A randomly selected sample of 1200 cases allows inferences to national adult populations with a margin of sampling error of no more than plus or minus 2.5 percent with a confidence level of 95 percent. If the sample size is increased to 2400, the confidence interval shrinks to plus or minus 2 percent.
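As a rough check on these figures, the worst-case margin of sampling error can be computed from the textbook formula z * sqrt(p(1 - p) / n). The sketch below is not taken from the Afrobarometer documentation: it assumes simple random sampling, p = 0.5 and z = 1.96 for a 95 percent confidence level, and the `design_effect` parameter is a hypothetical adjustment for clustering. Under those assumptions the margin is about 2.8 percentage points for 1200 cases and about 2.0 for 2400, so the quoted values are best read as approximate.

```python
import math


def margin_of_error(n, p=0.5, z=1.96, design_effect=1.0):
    """Half-width of the confidence interval for a proportion, in percentage points.

    Assumes simple random sampling unless design_effect (hypothetical
    parameter, 1.0 = no clustering penalty) is set above 1.
    """
    return 100 * z * math.sqrt(design_effect * p * (1 - p) / n)


for n in (1200, 2400):
    # Prints 2.8 for n=1200 and 2.0 for n=2400
    print(n, round(margin_of_error(n), 1))
```

Doubling the sample size narrows the margin by a factor of sqrt(2), not 2, which is why the gain from 1200 to 2400 cases is under one percentage point.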
Sample Universe
The sample universe for Afrobarometer surveys includes all citizens of voting age within the country. In other words, we exclude anyone who is not a citizen and anyone who has not attained this age (usually 18 years) on the day of the survey. Also excluded are areas determined to be either inaccessible or not relevant to the study, such as those experiencing armed conflict or natural disasters, as well as national parks and game reserves. As a matter of practice, we have also excluded people living in institutionalized settings, such as students in dormitories and persons in prisons or nursing homes.
What to do about areas experiencing political unrest? On the one hand we want to include them because they are politically important. On the other hand, we want to avoid stretching out the fieldwork over many months while we wait for the situation to settle down. It was agreed at the 2002 Cape Town Planning Workshop that it is difficult to come up with a general rule that will fit all imaginable circumstances. We will therefore make judgments on a case-by-case basis on whether or not to proceed with fieldwork or to exclude or substitute areas of conflict. National Partners are requested to consult Core Partners on any major delays, exclusions or substitutions of this sort.
Sample Design
The sample design is a clustered, stratified, multi-stage, area probability sample.
To repeat the main sampling principle, the objective of the design is to give every sample element (i.e. adult citizen) an equal and known chance of being chosen for inclusion in the sample. We strive to reach this objective by (a) strictly applying random selection methods at every stage of sampling and by (b) applying sampling with probability proportionate to population size wherever possible.
In a series of stages, geographically defined sampling units of decreasing size are selected. To ensure that the sample is representative, the probability of selection at various stages is adjusted as follows:
The sample is stratified by key social characteristics in the population such as sub-national area (e.g. region/province) and residential locality (urban or rural). The area stratification reduces the likelihood that distinctive ethnic or language groups are left out of the sample. And the urban/rural stratification is a means to make sure that these localities are represented in their correct proportions. Wherever possible, and always in the first stage of sampling, random sampling is conducted with probability proportionate to population size (PPPS). The purpose is to guarantee that larger (i.e., more populated) geographical units have a proportionally greater probability of being chosen into the sample. The sampling design has four stages
A first stage to stratify and randomly select primary sampling units;
A second stage to randomly select sampling start-points;
A third stage to randomly choose households;
A final stage involving the random selection of individual respondents.
We shall deal with each of these stages in turn.
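To make the PPPS principle concrete before turning to the individual stages, here is a minimal sketch of one common way to select units with probability proportionate to population size: systematic selection along a cumulative population list. It is offered as an illustration only and is not necessarily the procedure prescribed for Afrobarometer samples. The function name, the unit labels, and the population figures are hypothetical.

```python
import random
from bisect import bisect_right
from itertools import accumulate


def pps_systematic(populations, n, rng=random):
    """Select n units with probability proportionate to population size.

    populations: dict mapping unit name -> population count (hypothetical data).
    A unit larger than the sampling interval can be selected more than once.
    """
    names = list(populations)
    cumulative = list(accumulate(populations[name] for name in names))
    total = cumulative[-1]
    interval = total / n
    # Random start in [0, interval), then equally spaced selection points.
    start = rng.random() * interval
    points = [start + k * interval for k in range(n)]
    picks = []
    for point in points:
        # Index of the unit whose cumulative range contains this point.
        index = bisect_right(cumulative, point)
        # Guard against floating-point rounding pushing a point onto the total.
        picks.append(names[min(index, len(names) - 1)])
    return picks


# Hypothetical units: larger units are more likely to be chosen.
units = {"A": 100, "B": 0, "C": 300, "D": 600}
print(pps_systematic(units, 4, random.Random(1)))
```

In this sketch a unit with zero population is never chosen, and a unit whose population exceeds the sampling interval (here, total population divided by the number of selections) is certain to be chosen at least once.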
STAGE ONE: Selection of Primary Sampling Units (PSUs)
The primary sampling units (PSUs) are the smallest, well-defined geographic units for which reliable population data are available. In most countries, these will be Census Enumeration Areas (or EAs). Most national census data and maps are broken down to the EA level. In the text that follows we will use the acronyms PSU and EA interchangeably because, when census data are employed, they refer to the same unit.
We strongly recommend that NIs use official national census data as the sampling frame for Afrobarometer surveys. Where recent or reliable census data are not available, NIs are asked to inform the relevant Core Partner before they substitute any other demographic data. Where the census is out of date, NIs should consult a demographer to obtain the best possible estimates of population growth rates. These should be applied to the outdated census data in order to make projections of population figures for the year of the survey. It is important to bear in mind that population growth rates vary by area (region) and (especially) between rural and urban localities. Therefore, any projected census data should include adjustments to take such variations into account.
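As an illustration of the projection step, the sketch below applies a separate annual growth rate to each stratum of an outdated census. It assumes compound annual growth, which is only one possible projection model. The function name, the stratum labels, and all figures are hypothetical and would be replaced by the demographer's own estimates.

```python
def project_population(census_count, annual_growth_rate, years_elapsed):
    """Project a census count forward, assuming compound annual growth."""
    return census_count * (1 + annual_growth_rate) ** years_elapsed


# Hypothetical census counts and annual growth rates, by region and locality.
census = {
    ("Region X", "urban"): (500_000, 0.04),
    ("Region X", "rural"): (800_000, 0.02),
}
years_since_census = 10

projected = {
    stratum: round(project_population(count, rate, years_since_census))
    for stratum, (count, rate) in census.items()
}
for stratum, population in projected.items():
    print(stratum, population)
```

Because the urban and rural strata grow at different rates, their shares of the projected population differ from their shares in the original census, which is why the adjustment matters for stratification.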
Indeed, we urge NIs to establish collegial working relationships with professionals in the national census bureau, not only to obtain the most recent census data, projections, and maps, but to gain access to sampling expertise. NIs may even commission a census statistician to draw the sample to Afrobarometer specifications, provided that provision for this service has been made in the survey budget.
Regardless of who draws the sample, the NIs should thoroughly acquaint themselves with the strengths and weaknesses of the available census data and the availability and quality of EA maps. The country and methodology reports should cite the exact census data used, its known shortcomings, if any, and any projections made from the data. At a minimum, the NI must know the size of the population and the urban/rural population divide in each region in order to specify how to distribute population and PSUs in the first stage of sampling. National investigators should obtain this written data before they attempt to stratify the sample.
Once this data is obtained, the sample population (either 1200 or 2400) should be stratified, first by area (region/province) and then by residential locality (urban or rural). In each case, the proportion of the sample in each locality in each region should be the same as its proportion in the national population as indicated by the updated census figures.
Having stratified the sample, it is then possible to determine how many PSUs should be selected for the country as a whole, for each region, and for each urban or rural locality.
The total number of PSUs to be selected for the whole country is determined by calculating the maximum degree of clustering of interviews one can accept in any PSU. Because PSUs (which are usually geographically small EAs) tend to be socially homogeneous, we do not want to select too many people in any one place. Thus, the Afrobarometer has established a standard of no more than 8 interviews per PSU. For a sample size of 1200, the sample must therefore contain 150 PSUs/EAs (1200 divided by 8). For a sample size of 2400, there must be 300 PSUs/EAs.
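The arithmetic above can be written as a short sketch. The function and constant names are hypothetical. Rounding up for sample sizes that are not a multiple of 8 is an assumption added here so that the cap on interviews per PSU is never exceeded; the text itself only covers the 1200 and 2400 cases.

```python
import math

# Afrobarometer standard stated in the text: no more than 8 interviews per PSU.
MAX_INTERVIEWS_PER_PSU = 8


def psus_required(sample_size, max_per_psu=MAX_INTERVIEWS_PER_PSU):
    """Smallest number of PSUs that keeps interviews per PSU within the cap."""
    return math.ceil(sample_size / max_per_psu)


print(psus_required(1200))  # 150
print(psus_required(2400))  # 300
```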
These PSUs should then be allocated proportionally to the urban and rural localities within each regional stratum of the sample. Let's take a couple of examples from a country with a sample size of 1200. If the urban locality of Region X in this country constitutes 10 percent of the current national population, then the sample for this stratum should be 15 PSUs (calculated as 10 percent of 150 PSUs). If the rural population of Region Y constitutes 4 percent of the current national population, then the sample for this stratum should be 6 PSUs (4 percent of 150 PSUs).
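The proportional allocation can be sketched as follows. The population figures are hypothetical and chosen so that Region X urban holds 10 percent and Region Y rural holds 4 percent of the national total, as in the examples above. The text does not say how to handle fractional PSUs; the largest-remainder rounding used here is an assumption, added so that the allocations always sum to the required total.

```python
def allocate_psus(stratum_populations, total_psus):
    """Allocate PSUs to strata in proportion to population.

    Uses integer arithmetic, then hands any leftover PSUs to the strata
    with the largest remainders (an assumed rounding rule).
    """
    total_population = sum(stratum_populations.values())
    allocation = {}
    remainders = {}
    for stratum, population in stratum_populations.items():
        allocation[stratum], remainders[stratum] = divmod(
            population * total_psus, total_population
        )
    leftover = total_psus - sum(allocation.values())
    by_remainder = sorted(remainders, key=remainders.get, reverse=True)
    for stratum in by_remainder[:leftover]:
        allocation[stratum] += 1
    return allocation


# Hypothetical national population of 10,000,000 split into three strata.
populations = {
    "Region X urban": 1_000_000,   # 10 percent of the national population
    "Region Y rural": 400_000,     # 4 percent of the national population
    "All other strata": 8_600_000,
}
print(allocate_psus(populations, 150))
```

With these figures the sketch reproduces the 15 PSUs for Region X urban and the 6 PSUs for Region Y rural given in the text.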
The next step is to select particular PSUs/EAs using random methods. Using the above example of the rural localities in Region Y, let us say that you need to pick 6 sample EAs out of a census list that contains a total of 240 rural EAs in Region Y. But which 6? If the EAs created by the national census bureau are of equal or roughly equal population size, then selection is relatively straightforward. Just number all EAs consecutively, then make six selections using a table of random numbers. This procedure, known as simple random sampling (SRS), will