8 datasets found
  1. Current Population Survey (CPS)

    • search.dataone.org
    • dataverse.harvard.edu
    Updated Nov 21, 2023
    Cite
    Damico, Anthony (2023). Current Population Survey (CPS) [Dataset]. http://doi.org/10.7910/DVN/AK4FDD
    Dataset updated
    Nov 21, 2023
    Dataset provided by
    Harvard Dataverse
    Authors
    Damico, Anthony
    Description

    analyze the current population survey (cps) annual social and economic supplement (asec) with r

    the annual march cps-asec has been supplying the statistics for the census bureau's report on income, poverty, and health insurance coverage since 1948. wow. the us census bureau and the bureau of labor statistics (bls) tag-team on this one. until the american community survey (acs) hit the scene in the early aughts (2000s), the current population survey had the largest sample size of all the annual general demographic data sets outside of the decennial census - about two hundred thousand respondents. this provides enough sample to conduct state- and a few large metro area-level analyses. your sample size will vanish if you start investigating subgroups by state - consider pooling multiple years. county-level is a no-no.

    despite the american community survey's larger size, the cps-asec contains many more variables related to employment, sources of income, and insurance - and can be trended back to harry truman's presidency. aside from questions specifically asked about an annual experience (like income), many of the questions in this march data set should be treated as point-in-time statistics. cps-asec generalizes to the united states non-institutional, non-active duty military population.

    the national bureau of economic research (nber) provides sas, spss, and stata importation scripts to create a rectangular file (rectangular data means only person-level records; household- and family-level information gets attached to each person). to import these files into r, the parse.SAScii function uses nber's sas code to determine how to import the fixed-width file, then RSQLite to put everything into a schnazzy database. you can try reading through the nber march 2012 sas importation code yourself, but it's a bit of a proc freak show.
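to make "rectangular" concrete, here is a minimal sketch in python (not the r the scripts themselves use). the record layout, the key names (`household_id`, `family_seq`, `person_seq`), and the toy values are all hypothetical stand-ins; the real asec files carry many more columns under their own variable names.

```python
# Hypothetical miniature tables; real ASEC files use different variable names.
households = {
    "h1": {"state": "NY"},
    "h2": {"state": "TX"},
}
families = {
    ("h1", 1): {"family_income": 52000},
    ("h2", 1): {"family_income": 31000},
}
persons = [
    {"household_id": "h1", "family_seq": 1, "person_seq": 1, "age": 41},
    {"household_id": "h1", "family_seq": 1, "person_seq": 2, "age": 9},
    {"household_id": "h2", "family_seq": 1, "person_seq": 1, "age": 67},
]


def rectangularize(persons, families, households):
    """Return one flat record per person with family and household fields attached."""
    rows = []
    for person in persons:
        row = dict(person)  # copy so the input records stay untouched
        row.update(families[(person["household_id"], person["family_seq"])])
        row.update(households[person["household_id"]])
        rows.append(row)
    return rows


rectangular = rectangularize(persons, families, households)
for row in rectangular:
    print(row)
```

the result has exactly one row per person, which is why household-level values repeat across everyone living in the same household.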
    this new github repository contains three scripts:

    2005-2012 asec - download all microdata.R
    * download the fixed-width file containing household, family, and person records
    * import by separating this file into three tables, then merge 'em together at the person-level
    * download the fixed-width file containing the person-level replicate weights
    * merge the rectangular person-level file with the replicate weights, then store it in a sql database
    * create a new variable - one - in the data table

    2012 asec - analysis examples.R
    * connect to the sql database created by the 'download all microdata' program
    * create the complex sample survey object, using the replicate weights
    * perform a boatload of analysis examples

    replicate census estimates - 2011.R
    * connect to the sql database created by the 'download all microdata' program
    * create the complex sample survey object, using the replicate weights
    * match the sas output shown in the png file below

    2011 asec replicate weight sas output.png: statistic and standard error generated from the replicate-weighted example sas script contained in this census-provided person replicate weights usage instructions document.

    click here to view these three scripts

    for more detail about the current population survey - annual social and economic supplement (cps-asec), visit:
    * the census bureau's current population survey page
    * the bureau of labor statistics' current population survey page
    * the current population survey's wikipedia article

    notes:

    interviews are conducted in march about experiences during the previous year. the file labeled 2012 includes information (income, work experience, health insurance) pertaining to 2011. when you use the current population survey to talk about america, subtract a year from the data file name.

    as of the 2010 file (the interview focusing on america during 2009), the cps-asec contains exciting new medical out-of-pocket spending variables most useful for supplemental (medical spending-adjusted) poverty research.

    confidential to sas, spss, stata, sudaan users: why are you still rubbing two sticks together after we've invented the butane lighter? time to transition to r. :D
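the scripts above lean on replicate weights for standard errors. as a rough illustration of the idea (in python rather than r, and not the code from the repository), the sketch below re-computes a weighted mean once per replicate weight and scales the squared deviations. the function names and toy numbers are made up; the scale factor is passed in as a parameter, and the 4 / (number of replicates) used in the toy call is the form used for successive difference replication, which should be checked against the census replicate weights usage instructions before being trusted.

```python
import math


def weighted_mean(values, weights):
    """Weighted mean of values."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def replicate_standard_error(values, full_weights, replicate_weights, scale):
    """Return (estimate, standard error) using replicate weights.

    The variance is scale * sum((theta_r - theta_0) ** 2), where theta_0 uses
    the full-sample weights and theta_r uses the r-th replicate weights.
    """
    theta_0 = weighted_mean(values, full_weights)
    squared_diffs = [
        (weighted_mean(values, rep) - theta_0) ** 2 for rep in replicate_weights
    ]
    return theta_0, math.sqrt(scale * sum(squared_diffs))


# Toy data: three respondents, two replicate weight columns (hypothetical).
incomes = [10.0, 20.0, 30.0]
full = [1.0, 1.0, 1.0]
replicates = [[2.0, 1.0, 1.0], [1.0, 1.0, 2.0]]

estimate, se = replicate_standard_error(
    incomes, full, replicates, scale=4 / len(replicates)
)
print(estimate, se)
```

in r, the survey package wraps this bookkeeping in a replicate-weight design object, so nobody has to loop over the weight columns by hand.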

  2. National Health and Nutrition Examination Survey (NHANES)

    • dataverse.harvard.edu
    Updated May 30, 2013
    Cite
    Anthony Damico (2013). National Health and Nutrition Examination Survey (NHANES) [Dataset]. http://doi.org/10.7910/DVN/IMWQPJ
    Dataset updated
    May 30, 2013
    Dataset provided by
    Harvard Dataverse
    Authors
    Anthony Damico
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    analyze the national health and nutrition examination survey (nhanes) with r

    nhanes is this fascinating survey where doctors and dentists accompany survey interviewers in a little mobile medical center that drives around the country. while the survey folks are interviewing people, the medical professionals administer laboratory tests and conduct a real doctor's examination. the blood work and medical exam allow researchers like you and me to answer tough questions like, "how many people have diabetes but don't know they have diabetes?"

    conducting the lab tests and the physical isn't cheap, so a new nhanes data set becomes available once every two years and only includes about twelve thousand respondents. since the number of respondents is so small, analysts often pool multiple years of data together. the replication scripts below give a few different examples of how multiple years of data can be pooled with r.

    the survey gets conducted by the centers for disease control and prevention (cdc), and generalizes to the united states non-institutional, non-active duty military population.

    most of the data tables produced by the cdc include only a small number of variables, so importation with the foreign package's read.xport function is pretty straightforward. but that makes merging the appropriate data sets trickier, since it might not be clear what to pull for which variables. for every analysis, start with the table with 'demo' in the name -- this file includes basic demographics, weighting, and complex sample survey design variables. since it's quick to download the files directly from the cdc's ftp site, there's no massive ftp download automation script.
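pooling comes with one piece of bookkeeping that is easy to miss: the two-year weights have to be rescaled so the stacked file still sums to one population instead of several. the python sketch below (python here, not the r of the scripts) divides each two-year weight by the number of cycles being pooled, which is the usual cdc guidance as far as i understand it; check the current analytic guidelines before relying on it. `SEQN` and `WTMEC2YR` are the nhanes respondent id and mec weight names; the records and weight values are invented.

```python
def pool_cycles(cycles, weight_key="WTMEC2YR"):
    """Stack two-year cycles and divide each weight by the number of cycles pooled."""
    n_cycles = len(cycles)
    pooled = []
    for cycle_name, records in cycles.items():
        for record in records:
            row = dict(record)  # copy so the input records stay untouched
            row["cycle"] = cycle_name
            row["pooled_weight"] = record[weight_key] / n_cycles
            pooled.append(row)
    return pooled


# Toy records with made-up weights.
cycles = {
    "2005-2006": [
        {"SEQN": 1, "WTMEC2YR": 1000.0},
        {"SEQN": 2, "WTMEC2YR": 3000.0},
    ],
    "2007-2008": [
        {"SEQN": 3, "WTMEC2YR": 2000.0},
    ],
}

pooled = pool_cycles(cycles)
print(sum(row["pooled_weight"] for row in pooled))
```

the pooled weights then represent an average population over the combined period rather than double-counting it.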
    this new github repository contains five scripts:

    2009-2010 interview only - download and analyze.R
    * download, import, save the demographics and health insurance files onto your local computer
    * load both files, limit them to the variables needed for the analysis, merge them together
    * perform a few example variable recodes
    * create the complex sample survey object, using the interview weights
    * run a series of pretty generic analyses on the health insurance questions

    2009-2010 interview plus laboratory - download and analyze.R
    * download, import, save the demographics and cholesterol files onto your local computer
    * load both files, limit them to the variables needed for the analysis, merge them together
    * perform a few example variable recodes
    * create the complex sample survey object, using the mobile examination component (mec) weights
    * perform a direct-method age-adjustment and match figure 1 of this cdc cholesterol brief

    replicate 2005-2008 pooled cdc oral examination figure.R
    * download, import, save, pool, recode, create a survey object, run some basic analyses
    * replicate figure 3 from this cdc oral health databrief - the whole barplot

    replicate cdc publications.R
    * download, import, save, pool, merge, and recode the demographics file plus cholesterol laboratory, blood pressure questionnaire, and blood pressure laboratory files
    * match the cdc's example sas and sudaan syntax file's output for descriptive means
    * match the cdc's example sas and sudaan syntax file's output for descriptive proportions
    * match the cdc's example sas and sudaan syntax file's output for descriptive percentiles

    replicate human exposure to chemicals report.R (user-contributed)
    * download, import, save, pool, merge, and recode the demographics file plus urinary bisphenol a (bpa) laboratory files
    * log-transform some of the columns to calculate the geometric means and quantiles
    * match the 2007-2008 statistics shown on pdf page 21 of the cdc's fourth edition of the report

    click here to view these five scripts

    for more detail about the national health and nutrition examination survey (nhanes), visit:
    * the cdc's nhanes homepage
    * the national cancer institute's page of nhanes web tutorials

    notes:

    nhanes includes interview-only weights and interview + mobile examination component (mec) weights. if you only use questions from the basic interview in your analysis, use the interview-only weights (the sample size is a bit larger). i haven't really figured out a use for the interview-only weights -- nhanes draws most of its power from the combination of the interview and the mobile examination component variables. if you're only using variables from the interview, see if you can use a data set with a larger sample size like the current population survey (cps), national health interview survey (nhis), or medical expenditure panel survey (meps) instead.

    confidential to sas, spss, stata, sudaan users: why are you still riding around on a donkey after we've invented the internal combustion engine? time to transition to r. :D
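the human exposure script listed above log-transforms columns to get geometric means. the arithmetic behind that step is short enough to sketch in python (again, not the repository's r code): average the logs using the survey weights, then exponentiate. the function name and toy values are hypothetical, and this covers only the point estimate; standard errors still need the complex sample design.

```python
import math


def weighted_geometric_mean(values, weights):
    """exp of the weighted mean of log(values); every value must be positive."""
    log_mean = sum(w * math.log(v) for v, w in zip(values, weights)) / sum(weights)
    return math.exp(log_mean)


# Toy concentrations and weights (made up).
concentrations = [1.0, 100.0]
weights = [1.0, 1.0]
print(weighted_geometric_mean(concentrations, weights))
```

because the logarithm compresses large values, the geometric mean sits below the arithmetic mean whenever the values differ, which is why it is a common summary for skewed lab measurements.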

  3. Census block internal point coordinates and weights formatted specifically...

    • catalog.data.gov
    Updated Sep 8, 2023
    Cite
    OP,ORPM (2023). Census block internal point coordinates and weights formatted specifically for use in R code of the Environmental Justice Analysis Multisite (EJAM) tool, USA, 2020, EPA, EPA AO OP ORPM [Dataset]. https://catalog.data.gov/dataset/census-block-internal-point-coordinates-and-weights-formatted-specifically-for-use-in-r-co
    Dataset updated
    Sep 8, 2023
    Dataset provided by
    OP,ORPM
    Area covered
    United States
    Description

    This is Census 2020 block data specifically formatted for use by the Environmental Protection Agency (EPA) in-development Environmental Justice Analysis Multisite (EJAM) tool, which uses R code to find which block centroids are within X miles of each specified point (e.g., regulated facility), and to find those distances. The datasets have latitude and longitude of each block's internal point, as provided by Census Bureau, and the FIPS code of the block and its parent block group. The datasets also include a weight for each block, representing this block's Census 2020 population count as a fraction of the count for the parent block group overall, for use in estimating how much of a given block group is within X miles of a specified point or inside a polygon of interest. The datasets also have an effective radius of each block, which is what the radius would be in miles if the block covered the same area in square miles but were circular. The datasets also have coordinates in units that facilitate building a quadtree index of locations. They are in R data.table format, saved as .rda or .arrow files to be read by R code.
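The three derived quantities described above (block weight, effective radius, and distance to a point) can be sketched in a few lines. This is an illustrative Python sketch, not EJAM's R code: the function names, block identifiers, coordinates, and populations are hypothetical, the Earth radius is an approximation, and the brute-force distance loop stands in for the quadtree index the tool is described as using.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # approximate mean Earth radius


def block_weights(block_pops):
    """Each block's population as a fraction of its parent block group's total."""
    total = sum(block_pops.values())
    return {bid: (pop / total if total else 0.0) for bid, pop in block_pops.items()}


def effective_radius_miles(area_sq_miles):
    """Radius of a circle with the same area as the block."""
    return math.sqrt(area_sq_miles / math.pi)


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(min(1.0, a)))


def fraction_within(points, weights, site_lat, site_lon, miles):
    """Sum the weights of blocks whose internal point lies within `miles` of the site."""
    return sum(
        weights[bid]
        for bid, (lat, lon) in points.items()
        if haversine_miles(lat, lon, site_lat, site_lon) <= miles
    )


# Hypothetical block group with two blocks.
pops = {"block_a": 30, "block_b": 70}
points = {"block_a": (40.0, -75.0), "block_b": (41.0, -75.0)}
weights = block_weights(pops)
share = fraction_within(points, weights, 40.0, -75.0, miles=5)
print(weights, share)
```

Here only block_a lies within 5 miles of the site, so about 30 percent of the block group's population would be counted as nearby.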

  4. NSW Post-School Destinations and Experiences Survey

    • researchdata.edu.au
    • data.nsw.gov.au
    Updated Dec 10, 2024
    Cite
    data.nsw.gov.au (2024). NSW Post-School Destinations and Experiences Survey [Dataset]. https://researchdata.edu.au/nsw-post-school-experiences-survey/3453078
    Dataset updated
    Dec 10, 2024
    Dataset provided by
    data.nsw.gov.au
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Area covered
    New South Wales
    Description

    The NSW Post-School Destinations and Experiences Survey (PSDES) collects information about the main destinations of recent school leavers in the 6 to 12 months after leaving school.

    Data Notes

    * The survey collected data on school leavers in the 6-12 months after leaving school in 2023. School leavers comprise students who completed Year 12 and students who left school while they were in Year 10, 11 or 12 (early school leavers).
    * There are some caveats and limitations in the generalisability of survey findings to the total population of recent school leavers in NSW. For example, students who completed Year 12 via an alternative pathway other than the HSC, such as the International Baccalaureate, are not counted as Year 12 completers and are not covered in the survey.
    * Prior to 2021 a stratified sampling approach was used for the mainstream Year 12 completer survey (excluding Aboriginal and/or Torres Strait Islander and non-Connected Community school leavers). The sampling strategy for this group changed to a census for the first time in 2021 and resulted in a marked increase in the overall proportion of responses collected from the target population.
    * Time series data of destinations by student type from 2014 to 2018 should be used with caution as some counts of school leavers are estimated from lower cell counts than in later years. Estimates in the data are based on base weights which are adjusted to match population distributions for school leaver characteristics to minimise non-response bias.
    * Each table shows population estimates (as column totals) for each grouping variable and leaver type combination as well as weighted percentages for each of the 10 main destination categories included in the survey. Population estimates and destination percentage breakdowns are also included for all leavers (across leaver type). Findings are reported at a system level (across leavers from government and non-government schools).
    * For a full description of notes and caveats, see the 2023 Post-School Destinations and Experiences Survey Technical Report.
    * See the 2023 Post-School Destinations and Experiences Survey Annual Report and fact sheets.

    Data Source

    NSW Post-School Destinations and Experiences Survey

    Available tables in this dataset:

    * Table 1 provides a breakdown of main destination by leaver type and survey year (2014 to 2023).
    * Table 2 provides a breakdown of main destination by leaver type and gender (as self-identified) for 2023 only.
    * Table 3 provides a breakdown of main destination by leaver type and Aboriginal status (as self-identified) for 2023 only.
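The tables pair a population estimate (the column total) with weighted percentages per destination. As a rough Python sketch of that relationship, with hypothetical field names, destination labels "A" and "B", and made-up weights rather than the survey's actual categories or weighting procedure:

```python
from collections import defaultdict


def weighted_destination_percentages(responses):
    """Return (population estimate, {destination: weighted percentage})."""
    totals = defaultdict(float)
    for response in responses:
        totals[response["destination"]] += response["weight"]
    population_estimate = sum(totals.values())
    percentages = {
        destination: 100.0 * weight / population_estimate
        for destination, weight in totals.items()
    }
    return population_estimate, percentages


# Three hypothetical respondents; each weight is the number of leavers represented.
responses = [
    {"destination": "A", "weight": 10.0},
    {"destination": "A", "weight": 30.0},
    {"destination": "B", "weight": 60.0},
]

population, percentages = weighted_destination_percentages(responses)
print(population, percentages)
```

Two of the three respondents chose destination A, yet it accounts for 40 percent of the estimated population, which illustrates why the weighted figures can differ from raw response shares.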

  5. Supplementary materials for: "Comparing Internet experiences and...

    • dataone.org
    • dataverse.harvard.edu
    Updated Nov 22, 2023
    Cite
    Hargittai, Eszter; Shaw, Aaron (2023). Supplementary materials for: "Comparing Internet experiences and prosociality in Amazon Mechanical Turk and population-based survey samples" [Dataset]. http://doi.org/10.7910/DVN/UFL6MI
    Dataset updated
    Nov 22, 2023
    Dataset provided by
    Harvard Dataverse
    Authors
    Hargittai, Eszter; Shaw, Aaron
    Description

    Overview

    Supplementary materials for the paper "Comparing Internet experiences and prosociality in Amazon Mechanical Turk and population-based survey samples" by Eszter Hargittai and Aaron Shaw published in Socius in 2020 (https://doi.org/10.1177/2378023119889834).

    License

    The materials provided here are issued under the same (Creative Commons Attribution Non-Commercial 4.0) license as the paper. Details and a copy of the license are available at: http://creativecommons.org/licenses/by-nc/4.0/.

    Manifest

    The files included are:

    * Hargittai-Shaw-AMT-NORC-2019.rds and Hargittai-Shaw-AMT-NORC-2019.tsv: Two (identical) versions of the dataset used for the analysis. The tsv file is provided to facilitate import into software other than R.
    * R analysis code files:
      * 01-import.R - Imports dataset. Creates a mapping of dependent variables and variable names used elsewhere in the figure and analysis.
      * 02-gen_figure.R - Generates Figure 1 in PDF and PNG formats and saves them in the "figures" directory.
      * 03-gendescriptivestats.R - Generates results reported in Table 1.
      * 04-gen_models.R - Fits models reported in Tables 2-4.
      * 05-alternative_specifications.R - Fits models using log-transformed version of the income variable.
    * Makefile: Executes all of the R files in sequence, produces corresponding .log files in the "log" directory that contain the full R session from each file as well as separate error log files (also in the "log" directory) that capture any error messages and warnings generated by R along the way.
    * HargittaiShaw2019Socius-Instrument.pdf: The questions distributed to both the NORC and AMT survey participants used in the analysis reported in this paper.

    How to reproduce the analysis presented in the paper

    Depending on your computing environment, reproducing the analysis presented in the paper may be as easy as invoking "make all" or "make" in the directory containing this file on a system that has the appropriate software installed. Once compilation is complete, you can review the log files in a text editor. See below for more on software and dependencies. If calling the makefile fails, the individual R scripts can also be run interactively or in batch mode.

    Software and dependencies

    The R and compilation materials provided here were created and tested on a 64-bit laptop pc running Ubuntu 18.04.3 LTS, R version 3.6.1, ggplot2 version 3.2.1, reshape2 version 1.4.3, forcats version 0.4.0, pscl version 1.5.2, and stargazer version 5.2.2 (these last five are R packages called in specific .R files). As with all software, your mileage may vary and the authors provide no warranties.

    Codebook

    The dataset consists of 36 variables (columns) and 2,716 participants (rows). The variable names and brief descriptions follow below. Additional details of measurement are provided in the paper and survey instrument. All dichotomous indicators are coded 0/1 where 1 is the affirmative response implied by the variable name:

    * id: Index to identify individual units (participants).
    * svy_raked_wgt: Raked survey weights provided by NORC.
    * amtsample: Data source coded 0 (NORC) or 1 (AMT).
    * age: Participant age in years.
    * female: Participant selected "female" gender.
    * incomecont: Income in USD (continuous) coded from center-points of categories reported in the instruments.
    * incomediv: Income in $1,000s USD (=incomecont/1000).
    * incomesqrt: Square-root of incomecont.
    * lincome: Natural logarithm of incomecont.
    * rural: Participant resides in a rural area.
    * employed: Participant is fully or partially employed.
    * eduhsorless: Highest education level is high school or less.
    * edusc: Highest education level is completed some college.
    * edubaormore: Highest education level is BA or more.
    * white: Race = white.
    * black: Race = black.
    * nativeam: Race = native american.
    * hispanic: Ethnicity = hispanic.
    * asian: Race = asian.
    * raceother: Race = other.
    * skillsmean: Internet use skills index (described in paper).
    * accesssum: Internet use autonomy (described in paper).
    * webweekhrs: Internet use frequency (described in paper).
    * do_sum: Participatory online activities (described in paper).
    * snssumcompare: Social network site activities (described in paper).
    * altru_scale: Generous behaviors (described in paper).
    * trust_scale: Trust scale score (described in paper).
    * pts_give: Points donated in unilateral dictator game (described in paper).
    * std_accesssum: Standardized (z-score) version of accesssum.
    * std_webweekhrs: Standardized (z-score) version of webweekhrs.
    * std_skillsmean: Standardized (z-score) version of skillsmean.
    * std_do_sum: Standardized (z-score) version of do_sum.
    * std_snssumcompare: Standardized (z-score) version of snssumcompare.
    * std_trust_scale: Standardized (z-score) version of trust_scale.
    * std_altru_scale: Standardized (z-score) version of altru_scale.
    * std_pts_give: Standardized (z-score) version of pts_give.
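Several codebook variables are simple transformations of others. The Python sketch below (not the authors' R code) shows how the income variants and a z-score could be derived from the definitions given in the codebook. The helper names and toy values are hypothetical, and the use of the sample standard deviation is an assumption, since the codebook does not say which form the authors used.

```python
import math
import statistics


def add_income_transforms(row):
    """Derive incomediv, incomesqrt, and lincome from incomecont (must be > 0)."""
    out = dict(row)
    out["incomediv"] = row["incomecont"] / 1000
    out["incomesqrt"] = math.sqrt(row["incomecont"])
    out["lincome"] = math.log(row["incomecont"])
    return out


def standardize(values):
    """z-scores: (value - mean) / standard deviation (sample SD assumed here)."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


# Toy inputs (made up).
participant = add_income_transforms({"id": 1, "incomecont": 40000})
std_webweekhrs = standardize([1, 2, 3])
print(participant, std_webweekhrs)
```

Standardizing puts indices measured on different scales, such as hours per week and skill scores, on a common unit of standard deviations.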

  6. Demographic information table of the experimental subject.

    • plos.figshare.com
    xls
    Updated May 15, 2025
    Cite
    Mingwei Xu; Shangxue Yang; Ke Wang; Chengliu Yu; Guanlin Liu; Chao Dai; Ruiqi Wang (2025). Demographic information table of the experimental subject. [Dataset]. http://doi.org/10.1371/journal.pone.0323911.t001
    Available download formats: xls
    Dataset updated
    May 15, 2025
    Dataset provided by
    PLOS ONE
    Authors
    Mingwei Xu; Shangxue Yang; Ke Wang; Chengliu Yu; Guanlin Liu; Chao Dai; Ruiqi Wang
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Demographic information table of the experimental subject.

  7. Table 1 - National estimates from the Youth ’19 Rangatahi smart survey: A...

    • plos.figshare.com
    xls
    Updated May 30, 2023
    Cite
    C. Rivera-Rodriguez; T. C. Clark; T. Fleming; D. Archer; S. Crengle; R. Peiris-John; S. Lewycka (2023). Table 1 - National estimates from the Youth ’19 Rangatahi smart survey: A survey calibration approach [Dataset]. http://doi.org/10.1371/journal.pone.0251177.t001
    Available download formats: xls
    Dataset updated
    May 30, 2023
    Dataset provided by
    PLOS ONE
    Authors
    C. Rivera-Rodriguez; T. C. Clark; T. Fleming; D. Archer; S. Crengle; R. Peiris-John; S. Lewycka
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Table 1 - National estimates from the Youth ’19 Rangatahi smart survey: A survey calibration approach

  8. Household Travel Survey

    • researchdata.edu.au
    • data.nsw.gov.au
    • +1 more
    Updated Jul 9, 2022
    Cite
    data.nsw.gov.au (2022). Household Travel Survey [Dataset]. https://researchdata.edu.au/household-travel-survey/1986260
    Dataset updated
    Jul 9, 2022
    Dataset provided by
    Government of New South Wales (http://nsw.gov.au/)
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Household Travel Survey (HTS) is the most comprehensive source of personal travel data for the Sydney Greater Metropolitan Area (GMA). This data explores average weekday travel patterns for residents in Sydney GMA.

    The Household Travel Survey (HTS) collects information on personal travel behaviour. The study area for the survey is the Sydney Greater Metropolitan Area (GMA) which includes Sydney Greater Capital City Statistical Area (GCCSA), parts of Illawarra and Hunter regions. All residents of occupied private dwellings within the Sydney GMA are considered within scope of the survey and are randomly selected to participate.

    The HTS has been running continuously since 1997/98 [1] and collects data for all days through the year – including during school and public holidays.

    Typically, approximately 2,000-3,000 households participate in the survey annually. Data is collected on all trips made over a 24-hour period by all members of the participating households.

    Annual estimates from the HTS are usually produced on a rolling basis using multiple years of pooled data for each reporting year [2]. All estimates are weighted to the Australian Bureau of Statistics’ Estimated Resident Population, corresponding to the year of collection [3]. Unless otherwise stated, all reported estimates are for an average weekday.

    Due to disruptions in data collection resulting from the lockdowns during the COVID-19 pandemic, post-COVID releases of HTS data are based on a lower sample size than previous HTS releases. To ensure integrity of the results and mitigate risk of sampling errors some post-COVID results have been reported differently to previous years. Please see below for more information on changes to HTS post-COVID (2020/21 onwards).

    [1] Data collection for the HTS was suspended during lock-down periods announced by the NSW Government due to COVID-19.
    [2] Exceptions apply to the estimates for 2020/21 which are based on a single year of sample as it was decided not to pool the sample with data collected pre-COVID-19.
    [3] HTS population estimates are also slightly lower than those reported in the ABS census as the survey excludes overseas visitors and those in non-private dwellings.

    Changes to HTS post-COVID (2020/21 onwards)

    HTS was suspended from late March 2020 to early October 2020 due to the impact and restrictions of COVID-19, and again from July 2021 to October 2021 following the Delta wave of COVID-19. Consequently, both the 2020/21 and 2021/22 releases are based on a reduced data collection period and smaller samples.

    Due to the impact of changed travel behaviours resulting from COVID-19 breaking previous trends, HTS releases since 2020/21 have been separated from pre-COVID-19 samples when pooled. As a result, HTS 2020/21 was based on a single wave of data collection which limited the breadth of geography available for release. Subsequent releases are based on pooled post-COVID samples to expand the geographies included with reliable estimates.

    Disruption to the data collection during, and post-COVID has led to some adjustments being made to the HTS estimates released post-COVID:

    * SA3 level data has not been released for 2020/21 and 2021/22 due to low sample collection.
    * LGA level data for 2021/22 has been released for selected LGAs when robust Relative Standard Error (RSE) for total trips are achieved.
    * Mode categories for all geographies are aggregated differently to the pre-COVID categories.
    * Purpose categories for some geographies are aggregated differently across 2020/21 and 2021/22.
    * A new data release – for six cities as defined by the Greater Sydney Commission - is included since 2021/22.

    Please refer to the Data Document for 2022/23 (PDF, 262.54 KB) for further details.

    RELEASE NOTE

    The latest release of HTS data is 15 May 2025. This release includes Region, LGA, SA3 and Six Cities data for 2023/24. Please see 2023/24 Data Document for details.

    A revised dataset for LGAs and Six Cities for HTS 2022/23 data has also been included in this release on 15 May 2025. If you have downloaded HTS 2022/23 data by LGA and/or Six Cities from this link prior to 15/05/2025, we advise you replace it with the revised tables. If you have been supplied bespoke data tables for 2022/23 LGAs and/or Six Cities, please request updated tables.

    Revisions to HTS data may be made on previously published data as new sample data is appended to improve reliability of results. Please check this page for release dates to ensure you are using the most current version or create a subscription (https://opendata.transport.nsw.gov.au/subscriptions) to be notified of revisions and future releases.
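The description ties LGA-level release to a robust Relative Standard Error (RSE) for total trips. RSE is conventionally the standard error expressed as a percentage of the estimate, and the Python sketch below shows that calculation. The function names, the toy figures, and the 25 percent cut-off are hypothetical; the description does not state the threshold HTS actually applies.

```python
def relative_standard_error(estimate, standard_error):
    """RSE as a percentage: 100 * standard error / |estimate|."""
    if estimate == 0:
        raise ValueError("RSE is undefined for a zero estimate")
    return 100.0 * standard_error / abs(estimate)


def is_releasable(estimate, standard_error, max_rse_percent):
    """True when the RSE is at or below the chosen cut-off (cut-off is an assumption)."""
    return relative_standard_error(estimate, standard_error) <= max_rse_percent


# Made-up total-trips estimates for two areas.
rse_large_area = relative_standard_error(200000, 10000)
rse_small_area = relative_standard_error(5000, 2000)
print(rse_large_area, is_releasable(200000, 10000, max_rse_percent=25))
print(rse_small_area, is_releasable(5000, 2000, max_rse_percent=25))
```

Smaller samples inflate the standard error relative to the estimate, which is consistent with fewer geographies being released when the sample shrinks.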
