analyze the current population survey (cps) annual social and economic supplement (asec) with r

the annual march cps-asec has been supplying the statistics for the census bureau's report on income, poverty, and health insurance coverage since 1948. wow. the us census bureau and the bureau of labor statistics (bls) tag-team on this one. until the american community survey (acs) hit the scene in the early aughts (2000s), the current population survey had the largest sample size of all the annual general demographic data sets outside of the decennial census - about two hundred thousand respondents. this provides enough sample to conduct state- and a few large metro area-level analyses. your sample size will vanish if you start investigating subgroups by state - consider pooling multiple years. county-level is a no-no. despite the american community survey's larger size, the cps-asec contains many more variables related to employment, sources of income, and insurance - and can be trended back to harry truman's presidency. aside from questions specifically asked about an annual experience (like income), many of the questions in this march data set should be treated as point-in-time statistics. cps-asec generalizes to the united states non-institutional, non-active duty military population. the national bureau of economic research (nber) provides sas, spss, and stata importation scripts to create a rectangular file (rectangular data means only person-level records; household- and family-level information gets attached to each person). to import these files into r, the parse.SAScii function uses nber's sas code to determine how to import the fixed-width file, then RSQLite to put everything into a schnazzy database. you can try reading through the nber march 2012 sas importation code yourself, but it's a bit of a proc freak show.
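to make the "rectangular file" idea concrete, here's a minimal base-r sketch. the record layout, column names, and values are all made up for illustration (the real cps-asec layout comes from nber's sas script, and the repository scripts rely on SAScii and RSQLite rather than this hand-rolled approach). it only shows the two moves described above: split fixed-width lines at known column positions, then attach household fields to each person.

```r
# toy fixed-width records; this column layout is hypothetical,
# not the real cps-asec layout
hh_lines     <- c("0001CA", "0002NY")
person_lines <- c("00010134", "00010236", "00020161")

# read.fwf() cuts each line at the given widths, which is the same
# information a sas input statement's column positions carry
households <- read.fwf(
    textConnection(hh_lines),
    widths = c(4, 2),
    col.names = c("hhid", "state"),
    colClasses = "character"
)

persons <- read.fwf(
    textConnection(person_lines),
    widths = c(4, 2, 2),
    col.names = c("hhid", "lineno", "age"),
    colClasses = "character"
)
persons$age <- as.numeric(persons$age)

# rectangular file: one row per person, household fields attached
rectangular <- merge(persons, households, by = "hhid")
print(rectangular)
```

every person row ends up carrying its household's state, which is all "rectangular" means here.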
this new github repository contains three scripts:

2005-2012 asec - download all microdata.R: download the fixed-width file containing household, family, and person records; import by separating this file into three tables, then merge 'em together at the person-level; download the fixed-width file containing the person-level replicate weights; merge the rectangular person-level file with the replicate weights, then store it in a sql database; create a new variable - one - in the data table.

2012 asec - analysis examples.R: connect to the sql database created by the 'download all microdata' program; create the complex sample survey object, using the replicate weights; perform a boatload of analysis examples.

replicate census estimates - 2011.R: connect to the sql database created by the 'download all microdata' program; create the complex sample survey object, using the replicate weights; match the sas output shown in the png file below.

2011 asec replicate weight sas output.png: statistic and standard error generated from the replicate-weighted example sas script contained in this census-provided person replicate weights usage instructions document.

click here to view these three scripts

for more detail about the current population survey - annual social and economic supplement (cps-asec), visit: the census bureau's current population survey page; the bureau of labor statistics' current population survey page; the current population survey's wikipedia article.

notes: interviews are conducted in march about experiences during the previous year. the file labeled 2012 includes information (income, work experience, health insurance) pertaining to 2011. when you use the current population survey to talk about america, subtract a year from the data file name. as of the 2010 file (the interview focusing on america during 2009), the cps-asec contains exciting new medical out-of-pocket spending variables most useful for supplemental (medical spending-adjusted) poverty research.
confidential to sas, spss, stata, sudaan users: why are you still rubbing two sticks together after we've invented the butane lighter? time to transition to r. :D
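for a feel of what "using the replicate weights" buys you, here's a hedged base-r sketch on fake data. the repository scripts build a survey design object instead (the survey package's svrepdesign() is the usual tool for that), so this is not their code. the count of 160 replicate weights and the 4 / R multiplier reflect my reading of the census bureau's replicate weight usage instructions; treat both as assumptions and check them against that document.

```r
set.seed(1)

n <- 50    # toy sample of persons
R <- 160   # number of replicate weights (assumed, see note above)

# fake income and weights, invented for illustration only
income  <- runif(n, 10000, 90000)
main_wt <- runif(n, 500, 1500)

# fake replicate weights: the main weight jittered a little per replicate
rep_wts <- sapply(seq_len(R), function(r) main_wt * runif(n, 0.8, 1.2))

wtd_mean <- function(x, w) sum(x * w) / sum(w)

# point estimate from the main weight
theta_0 <- wtd_mean(income, main_wt)

# the same estimate recomputed once per replicate weight
theta_r <- apply(rep_wts, 2, function(w) wtd_mean(income, w))

# assumed variance formula: (4 / R) * sum of squared deviations
se <- sqrt((4 / R) * sum((theta_r - theta_0)^2))

cat("weighted mean:", theta_0, " standard error:", se, "\n")
```

the point estimate only ever uses the main weight; the replicates exist purely to measure how much that estimate wobbles.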
analyze the health and retirement study (hrs) with r

the hrs is the one and only longitudinal survey of american seniors. with a panel starting its third decade, the current pool of respondents includes older folks who have been interviewed every two years as far back as 1992. unlike cross-sectional or shorter panel surveys, respondents keep responding until, well, death do us part. it's paid for by the national institute on aging and administered by the university of michigan's institute for social research; if you apply for an interviewer job with them, i hope you like werther's original. figuring out how to analyze this data set might trigger your fight-or-flight synapses if you just start clicking around on michigan's website. instead, read pages numbered 10-17 (pdf pages 12-19) of this introduction pdf and don't touch the data until you understand figure a-3 on that last page. if you start enjoying yourself, here's the whole book. after that, it's time to register for access to the (free) data. keep your username and password handy, you'll need them for the top of the download automation r script. next, look at this data flowchart to get an idea of why the data download page is such a righteous jungle. but wait, good news: umich recently farmed out its data management to the rand corporation, who promptly constructed a giant consolidated file with one record per respondent across the whole panel. oh so beautiful. the rand hrs files make much of the older data and syntax examples obsolete, so when you come across stuff like instructions on how to merge years, you can happily ignore them - rand has done it for you.
the health and retirement study only includes noninstitutionalized adults when new respondents get added to the panel (as they were in 1992, 1993, 1998, 2004, and 2010) but once they're in, they're in - respondents have a weight of zero for interview waves when they were nursing home residents; but they're still responding and will continue to contribute to your statistics so long as you're generalizing about a population from a previous wave (for example: it's possible to compute "among all americans who were 50+ years old in 1998, x% lived in nursing homes by 2010"). my source for that 411? page 13 of the design doc. wicked.

this new github repository contains five scripts:

1992 - 2010 download HRS microdata.R: loop through every year and every file, download, then unzip everything in one big party.

import longitudinal RAND contributed files.R: create a SQLite database (.db) on the local disk; load the rand, rand-cams, and both rand-family files into the database (.db) in chunks (to prevent overloading ram).

longitudinal RAND - analysis examples.R: connect to the sql database created by the 'import longitudinal RAND contributed files' program; create two database-backed complex sample survey objects, using a taylor-series linearization design; perform a mountain of analysis examples with wave weights from two different points in the panel.

import example HRS file.R: load a fixed-width file using only the sas importation script directly into ram with SAScii (http://blog.revolutionanalytics.com/2012/07/importing-public-data-with-sas-instructions-into-r.html); parse through the IF block at the bottom of the sas importation script, blank out a number of variables; save the file as an R data file (.rda) for fast loading later.

replicate 2002 regression.R: connect to the sql database created by the 'import longitudinal RAND contributed files' program; create a database-backed complex sample survey object, using a taylor-series linearization design; exactly match the final regression shown in this document provided by analysts at RAND as an update of the regression on pdf page B76 of this document.

click here to view these five scripts

for more detail about the health and retirement study (hrs), visit: michigan's hrs homepage; rand's hrs homepage; the hrs wikipedia page; a running list of publications using hrs.

notes: exemplary work making it this far. as a reward, here's the detailed codebook for the main rand hrs file. note that rand also creates 'flat files' for every survey wave, but really, most every analysis you can think of is possible using just the four files imported with the rand importation script above. if you must work with the non-rand files, there's an example of how to import a single hrs (umich-created) file, but if you wish to import more than one, you'll have to write some for loops yourself.

confidential to sas, spss, stata, and sudaan users: a tidal wave is coming. you can get water up your nose and be dragged out to sea, or you can grab a surf board. time to transition to r. :D
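the nursing home example from the design doc boils down to one rule: weight by the wave that defines your population, not the wave where the outcome was measured. a minimal base-r sketch on an invented five-person table follows. the column names are hypothetical, not rand's variable names, and a real analysis would also need the design's strata and cluster variables to get standard errors, which this skips.

```r
# hypothetical one-row-per-respondent table, invented for illustration
hrs <- data.frame(
    wt_1998           = c(1200, 0, 950, 1100, 800),  # 1998 wave weight
    age_1998          = c(55, 62, 71, 80, 49),
    nursing_home_2010 = c(0, 1, 0, 1, 0)             # 1 = nursing home resident in 2010
)

# the population is "americans who were 50+ in 1998", so restrict on
# 1998 characteristics and keep only positive 1998 weights
cohort <- subset(hrs, age_1998 >= 50 & wt_1998 > 0)

# weight with the 1998 weight: a 2010 weight would be zero for exactly
# the nursing home residents this statistic is trying to count
share <- with(cohort, sum(wt_1998 * nursing_home_2010) / sum(wt_1998))

cat("share in nursing homes by 2010:", round(share, 3), "\n")
```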
The main objective of the new agricultural statistics program is to provide timely, accurate, credible and comprehensive agricultural statistics to describe the structure of agriculture in Rwanda in terms of land use, crop production and livestock; which can be used for food and agriculture policy formulation and planning, and for the compilation of national accounts statistics.
In this regard, the National Institute of Statistics of Rwanda (NISR) conducted the Seasonal Agriculture Survey (SAS) from November 2015 to October 2016 to gather up-to-date information for monitoring progress on agriculture programs and policies in Rwanda, including the Second Economic Development and Poverty Reduction Strategy (EDPRS II) and Vision 2020. This 2016 RSAS covered three agricultural seasons (A, B and C) and provides data on background characteristics of the agricultural operators, farm characteristics (area, yield and production), agricultural practices, agricultural equipment, and use of crop production by agricultural operators and by large-scale farmers.
National coverage
Agricultural holdings
The 2016 RSAS targeted agricultural operators and large scale farmers operating in Rwanda.
Sample survey data [ssd]
The Seasonal Agriculture Survey (SAS) sample is composed of two categories of respondents: agricultural operators and large-scale farmers (LSF).
For the 2016 SAS, NISR used a dual-frame sampling design that combines selected area-frame sample segments with a list of large-scale farmers.
NISR also used very high resolution imagery (25 centimeters) from RNRA to divide the total land of the country into twelve strata. A total of 540 segments were spread throughout the country, covering 25,346 and 23,286 agricultural operators in Season A and Season B respectively. From these agricultural operators, sub-samples were selected during the second phases of Seasons A and B.
It is important to note that in each of agricultural Seasons A and B, data collection was undertaken in two phases. Phase I was mainly used to collect data on demographic and social characteristics of interviewees, area under crops, crops planted, rainfall, livestock, etc. Phase II was mainly devoted to the collection of data on yield and production of crops.
Phase I serves to collect data on area under different types of crops in the screening process, whereas Phase II is mainly devoted to the collection of data on demographic and social characteristics of interviewees, together with yields of the different crops produced. A total of 558 large-scale farmers (LSF) were enumerated in both 2015 Season A and B. The LSF were engaged in either crop farming activities only, livestock farming activities only, or both crop and livestock farming activities.
Agricultural operators are the small-scale farmers within the sample segments. Every selected segment was first screened using the appropriate materials such as the segment maps, GIS devices and the screening form. Using these devices, the enumerators accounted for every plot inside the sample segments. All tracts were classified as either agricultural (cultivated land, pasture, and fallow land) or non-agricultural land (water, forests, roads, rocky and bare soils, and buildings).
During Phase I, a complete enumeration of all farmers having agricultural land and operating within the 540 selected segments was undertaken and a total of 25,495 and 24,911 agricultural operators were enumerated respectively in Seasons A and B. Season C considered only 152 segments, involving 3,445 agricultural operators.
In Phase II, 50% of the large-scale farmers undertaking crop farming activities only and 50% of the large-scale farmers undertaking both crop and livestock farming were selected for interview. Samples of 199 and 194 large-scale farmers were interviewed in Seasons A and B, respectively, using a farm questionnaire.
From the agricultural operators enumerated in the sample segments during Phase I, a sample of the agricultural operators was designed for Phase II as follows: 5,502 for Season A, 5,337 for Season B and 644 for Season C. The method of probability proportional to size (PPS) sampling at the national level was used. Furthermore, the total number of enumerated large-scale farmers was 774 in 2016 Season A and 622 in Season B.
Season C considered 152 segments containing 8,987 agricultural operators, from which 963 agricultural operators were selected for survey interviews.
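The probability proportional to size (PPS) selection mentioned above can be illustrated with a short R sketch. The source does not state which PPS procedure NISR applied or which size measure it used, so the systematic variant below, the size values, and all variable names are assumptions made purely for illustration.

```r
set.seed(42)

# Hypothetical frame of 20 operators with a size measure
# (for example, cultivated area); all values are invented
frame <- data.frame(
    id   = 1:20,
    size = c(5, 1, 3, 8, 2, 2, 6, 1, 4, 9,
             3, 2, 7, 1, 5, 2, 4, 3, 6, 6)
)

n <- 4                                   # sample size to draw
interval <- sum(frame$size) / n          # sampling interval on the size scale
start <- runif(1, 0, interval)           # random start within the first interval
hits <- start + interval * (0:(n - 1))   # equally spaced selection points

# Each selection point falls inside exactly one unit's cumulative-size range
cum_size <- cumsum(frame$size)
picked <- findInterval(hits, cum_size, left.open = TRUE) + 1

sample_ids <- frame$id[picked]

# Inclusion probability of each selected unit under this scheme
incl_prob <- n * frame$size[picked] / sum(frame$size)

print(data.frame(id = sample_ids, size = frame$size[picked], incl_prob))
```

Larger units occupy a longer stretch of the cumulative size scale, so they are more likely to contain a selection point. The inverse of the inclusion probability would serve as the base sampling weight.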
Face-to-face paper [f2f]
Two types of questionnaires were used for this survey: a screening questionnaire and farm questionnaires.
A Screening Questionnaire was used to collect information that enabled identification of an Agricultural Operator or Large Scale Farmer and his or her land use.
Farm questionnaires were of two types:
a) The Phase I Farm Questionnaire was used to collect data on characteristics of Agricultural Operators, crop identification and area, and inputs (seeds, fertilizers, labor, …) for Agricultural Operators and large-scale farmers.
b) The Phase II Farm Questionnaire was used to collect data on crop production and use of production.
It is important to mention that all these Farm Questionnaires were subjected to two or three rounds of data quality checking. The first round was conducted by the enumerator and the second round was conducted by the team leader to check whether questionnaires had been well completed by enumerators. For Season C, after screening, an interview was conducted for each selected tract/Agricultural Operator using one consolidated Farm Questionnaire. All the survey questionnaires used were published in both English and Kinyarwanda.
Data editing took place at different stages. First, the completed questionnaires were returned to NISR for office editing and coding before data entry started. Data entry of the completed and checked questionnaires was undertaken at the NISR office by 20 staff trained in using the CSPro software. To ensure appropriate matching of data in the completed questionnaires and plot area measurements from the GIS unit, a LOOKUP file was integrated in the CSPro data entry program to confirm the identification of each agricultural operator or LSF before starting data entry. Thereafter, data were entered in computers, edited and summarized in tables using SPSS and Excel.
The response rate for the Seasonal Agriculture Survey is 98%.
All Farm questionnaires were subjected to two/three rounds of data quality checking. The first round was conducted by the enumerator and the second round was conducted by the team leader to check if questionnaires had been well completed by enumerators. And in most cases, questionnaires completed by one enumerator were peer-reviewed by another enumerator before being checked by the Team leader.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
analyze the area resource file (arf) with r

the arf is fun to say out loud. it's also a single county-level data table with about 6,000 variables, produced by the united states health resources and services administration (hrsa). the file contains health information and statistics for over 3,000 us counties. like many government agencies, hrsa provides only a sas importation script and an ascii file.

this new github repository contains two scripts:

2011-2012 arf - download.R: download the zipped area resource file directly onto your local computer; load the entire table into a temporary sql database; save the condensed file as an R data file (.rda), comma-separated value file (.csv), and/or stata-readable file (.dta).

2011-2012 arf - analysis examples.R: limit the arf to the variables necessary for your analysis; sum up a few county-level statistics; merge the arf onto other data sets, using both fips and ssa county codes; create a sweet county-level map.

click here to view these two scripts

for more detail about the area resource file (arf), visit: the arf home page; the hrsa data warehouse.

notes: the arf may not be a survey data set itself, but it's particularly useful to merge onto other survey data.

confidential to sas, spss, stata, and sudaan users: time to put down the abacus. time to transition to r. :D
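merging the arf onto another data set mostly comes down to getting the county codes into the same shape on both sides. here's a small base-r sketch of a fips merge. both tables, their column names, and the hospital counts are invented for illustration; the real arf column names come from hrsa's sas script.

```r
# hypothetical slice of a county-level table keyed by five-character fips code
arf <- data.frame(
    fips      = c("01001", "06037", "36061"),
    hospitals = c(1, 90, 20),            # made-up values
    stringsAsFactors = FALSE
)

# hypothetical survey records that store state and county codes as numbers
survey <- data.frame(
    respondent  = 1:3,
    state_code  = c(6, 1, 6),
    county_code = c(37, 1, 37)
)

# build the fips key: two-digit state + three-digit county, zero-padded.
# keep it as character so the leading zeroes survive
survey$fips <- sprintf("%02d%03d", survey$state_code, survey$county_code)

# left join: keep every survey record, attach county-level columns
merged <- merge(survey, arf, by = "fips", all.x = TRUE)

print(merged)
```

all.x = TRUE keeps survey records whose county has no match, so an unexpected NA in the merged columns is a quick sign that a code didn't line up.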