Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Chudik, Kapetanios, & Pesaran (Econometrica 2018, 86, 1479-1512) propose a one covariate at a time, multiple testing (OCMT) approach to variable selection in high-dimensional linear regression models as an alternative to penalised regression. We offer a narrow replication of their key OCMT results using Stata instead of the original MATLAB routines. Using the new user-written Stata commands baing and ocmt, we find results that closely match those the authors report in their Monte Carlo simulations. In addition, we replicate exactly their findings in the empirical illustration, which relate to the top five variables with the highest inclusion frequencies based on the OCMT selection method.
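The "one covariate at a time" idea can be illustrated with a minimal Python sketch of a first selection stage only. It is not the authors' MATLAB code or the Stata ocmt command. It assumes the critical value takes the form Φ⁻¹(1 − p / (2·c·n^δ)), where n is the number of candidate covariates; the function names, the simulated data, and the defaults p = 0.05, δ = 1 and c = 1 are all illustrative assumptions. The full procedure includes further stages that condition on the variables already selected, which this sketch omits.

```python
import math
import random
from statistics import NormalDist


def slope_t_stat(y, x):
    """t-statistic of the slope from an OLS regression of y on an intercept and x."""
    t_obs = len(y)
    mean_y = sum(y) / t_obs
    mean_x = sum(x) / t_obs
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    beta = sxy / sxx
    rss = sum(((yi - mean_y) - beta * (xi - mean_x)) ** 2 for xi, yi in zip(x, y))
    sigma2 = rss / (t_obs - 2)  # two estimated parameters: intercept and slope
    return beta / math.sqrt(sigma2 / sxx)


def critical_value(n_covariates, p=0.05, delta=1.0, c=1.0):
    """Assumed multiple-testing critical value; it grows with the number of covariates."""
    return NormalDist().inv_cdf(1.0 - p / (2.0 * c * n_covariates ** delta))


def ocmt_first_stage(y, covariates, p=0.05, delta=1.0):
    """Indices of covariates whose individual |t| exceeds the critical value."""
    cv = critical_value(len(covariates), p, delta)
    return [i for i, x in enumerate(covariates) if abs(slope_t_stat(y, x)) > cv]


# Simulated data (illustrative only): 20 candidate covariates, two of which are signals.
rng = random.Random(0)
T, N = 200, 20
covariates = [[rng.gauss(0, 1) for _ in range(T)] for _ in range(N)]
y = [covariates[0][t] + covariates[1][t] + 0.5 * rng.gauss(0, 1) for t in range(T)]

selected = ocmt_first_stage(y, covariates)
print(selected)
```

Each covariate is tested in its own simple regression, so the cost grows linearly with the number of candidates, and the critical value rises as candidates are added to control for multiple testing.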
analyze the current population survey (cps) annual social and economic supplement (asec) with r

the annual march cps-asec has been supplying the statistics for the census bureau's report on income, poverty, and health insurance coverage since 1948. wow. the us census bureau and the bureau of labor statistics (bls) tag-team on this one. until the american community survey (acs) hit the scene in the early aughts (2000s), the current population survey had the largest sample size of all the annual general demographic data sets outside of the decennial census - about two hundred thousand respondents. this provides enough sample to conduct state- and a few large metro area-level analyses. your sample size will vanish if you start investigating subgroups by state - consider pooling multiple years. county-level is a no-no.

despite the american community survey's larger size, the cps-asec contains many more variables related to employment, sources of income, and insurance - and can be trended back to harry truman's presidency. aside from questions specifically asked about an annual experience (like income), many of the questions in this march data set should be treated as point-in-time statistics. cps-asec generalizes to the united states non-institutional, non-active duty military population.

the national bureau of economic research (nber) provides sas, spss, and stata importation scripts to create a rectangular file (rectangular data means only person-level records; household- and family-level information gets attached to each person). to import these files into r, the parse.SAScii function uses nber's sas code to determine how to import the fixed-width file, then RSQLite to put everything into a schnazzy database. you can try reading through the nber march 2012 sas importation code yourself, but it's a bit of a proc freak show.
this new github repository contains three scripts:

2005-2012 asec - download all microdata.R
- download the fixed-width file containing household, family, and person records
- import by separating this file into three tables, then merge 'em together at the person-level
- download the fixed-width file containing the person-level replicate weights
- merge the rectangular person-level file with the replicate weights, then store it in a sql database
- create a new variable - one - in the data table

2012 asec - analysis examples.R
- connect to the sql database created by the 'download all microdata' program
- create the complex sample survey object, using the replicate weights
- perform a boatload of analysis examples

replicate census estimates - 2011.R
- connect to the sql database created by the 'download all microdata' program
- create the complex sample survey object, using the replicate weights
- match the sas output shown in the png file below

2011 asec replicate weight sas output.png
- statistic and standard error generated from the replicate-weighted example sas script contained in this census-provided person replicate weights usage instructions document.

for more detail about the current population survey - annual social and economic supplement (cps-asec), visit:
- the census bureau's current population survey page
- the bureau of labor statistics' current population survey page
- the current population survey's wikipedia article

notes:

interviews are conducted in march about experiences during the previous year. the file labeled 2012 includes information (income, work experience, health insurance) pertaining to 2011. when you use the current population survey to talk about america, subtract a year from the data file name.

as of the 2010 file (the interview focusing on america during 2009), the cps-asec contains exciting new medical out-of-pocket spending variables most useful for supplemental (medical spending-adjusted) poverty research.
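the scripts above do the replicate-weight work in r. as a rough illustration of what a replicate-weighted standard error is, here is a minimal python sketch. it assumes the variance formula in the census bureau's replicate weight guidance is (4 / number of replicates) times the sum of squared differences between each replicate estimate and the full-sample estimate, with 160 replicates on the real file. the toy records, the four replicate weights, and the function names are made up for illustration.

```python
import math


def weighted_total(values, weights):
    """Weighted total of a variable, e.g. a 0/1 indicator summed to a population count."""
    return sum(v * w for v, w in zip(values, weights))


def replicate_se(values, full_weights, replicate_weights, scale=4.0):
    """Full-sample estimate and its replicate-based standard error.

    Assumed formula: var = (scale / R) * sum((theta_r - theta_0) ** 2),
    where R is the number of replicate weight columns.
    """
    theta_0 = weighted_total(values, full_weights)
    replicate_estimates = [weighted_total(values, w) for w in replicate_weights]
    variance = scale / len(replicate_estimates) * sum(
        (theta_r - theta_0) ** 2 for theta_r in replicate_estimates
    )
    return theta_0, math.sqrt(variance)


# toy example: three person records and four replicate weights (the real file has many more)
in_poverty = [1, 0, 1]
full_weights = [100, 200, 300]
replicate_weights = [
    [110, 200, 300],
    [90, 200, 300],
    [100, 200, 310],
    [100, 200, 290],
]

estimate, standard_error = replicate_se(in_poverty, full_weights, replicate_weights)
print(estimate, standard_error)
```

the point estimate always comes from the full-sample weight; the replicate weights only show how much that estimate moves around when the sample is perturbed.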
confidential to sas, spss, stata, sudaan users: why are you still rubbing two sticks together after we've invented the butane lighter? time to transition to r. :D
These data were used to generate the results in the article “Household Food Waste Trending Upwards in the United States: Insights from a National Tracking Survey,” by Ran Li, Yiheng Shu, Kathryn E. Bender & Brian E. Roe, which has been accepted for publication in the Journal of the Agricultural and Applied Economics Association (doi – pending). The Stata code used to generate results is available from the authors upon request. U.S. residents who participate in consumer panels managed by a commercial vendor were invited by email or text message to participate in a two-part online survey during four waves of data collection: February and March of 2021 (Feb 21 wave, 425 initiated, 361 completed), July and August of 2021 (Jul 21 wave, 606 initiated, 419 completed), December of 2021 and January of 2022 (Dec 21 wave, 760 initiated, 610 completed), and February, March and April of 2022 (Feb 22 wave, 607 initiated, 587 completed). We are not able to determine whether any respondents participated in multiple waves, i.e., whether any of the observations come from repeat participants. All participants provided informed consent and received compensation. Participants had to be 18 years or older and perform at least half of the household food preparation. No data were collected during major holidays, i.e., the weeks of the Fourth of July (Independence Day), Christmas, or New Year's. Recruitment quotas were implemented to ensure sufficient representation by geographical region, race, and age group. Post-hoc sample weights were constructed to reflect population characteristics on age, income and household size. The protocol was approved by the local Internal Review Board.
The approach begins with participants completing an initial survey that ends with an announcement that a follow-up survey will arrive in about one week, and that for the next 7 days, participants should pay close attention to the amounts of different foods their household throws away, feeds to animals or composts because the food is past date, spoiled or no longer wanted for other reasons. They are told to exclude items they would normally not eat, such as bones, pits, and shells. Approximately 7 days later they receive the follow-up survey, which elicits the amount of waste in up to 24 categories of food and includes other questions (see supplemental materials for core survey questions). Waste amounts in each category are reported by selecting one of several ranges of possible amounts. The gram weight for categories with volumetric ranges (e.g., listed in cups) was derived by assigning an appropriate mass to the midpoint of the selected range, consistent with the food category. For categories with highly variable weight per volume (e.g., a cup of raw asparagus weighs about 7 times more than a cup of raw chopped arugula), we use the profile of items most consumed in the United States to determine the appropriate gram weight. For display purposes, the 24 categories are consolidated into 8 more general categories. Total weekly household food waste is calculated by summing the reported gram amounts across all categories. We divide this total by the number of household members to generate the per person weekly food waste amount.
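The conversion from reported ranges to per person weekly waste can be sketched in Python as follows. This is a minimal illustration, not the authors' Stata code. The category names, the range labels and the grams-per-cup values are hypothetical placeholders, since the survey's actual bins and conversion factors are not given here.

```python
# Hypothetical volumetric ranges (in cups) that a respondent can select.
RANGES_CUPS = {
    "none": (0.0, 0.0),
    "up to 1 cup": (0.0, 1.0),
    "1 to 2 cups": (1.0, 2.0),
}

# Hypothetical gram weight of one cup for each food category.
GRAMS_PER_CUP = {
    "vegetables": 100.0,
    "milk": 240.0,
}


def category_grams(category, selected_range):
    """Grams wasted in one category: midpoint of the selected range times grams per cup."""
    low, high = RANGES_CUPS[selected_range]
    midpoint_cups = (low + high) / 2
    return midpoint_cups * GRAMS_PER_CUP[category]


def per_person_weekly_waste(responses, household_size):
    """Sum grams across all reported categories, then divide by household members."""
    total_grams = sum(category_grams(cat, rng) for cat, rng in responses.items())
    return total_grams / household_size


# One household's follow-up survey answers (illustrative).
responses = {"vegetables": "1 to 2 cups", "milk": "up to 1 cup"}
waste_per_person = per_person_weekly_waste(responses, household_size=3)
print(waste_per_person)
```

Using the midpoint treats waste as evenly spread within each range, which is an approximation that any range-based instrument has to make.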
These data were used to generate the results in the article “Household Food Waste Trending Upwards in the United States: Insights from a National Tracking Survey,” by Ran Li, Yiheng Shu, Kathryn E. Bender & Brian E. Roe, which has been accepted for publication in the Journal of the Agricultural and Applied Economics Association (doi – https://doi.org/10.1002/jaa2.59). The Stata code used to generate results is available from the authors upon request. U.S. residents who participate in consumer panels managed by a commercial vendor were invited by email or text message to participate in a two-part online survey during five waves of data collection: February and March of 2021 (Feb 21 wave, 425 initiated, 361 completed), July and August of 2021 (Jul 21 wave, 606 initiated, 419 completed), December of 2021 and January of 2022 (Dec 21 wave, 760 initiated, 610 completed), February, March and April of 2022 (Feb 22 wave, 607 initiated, 587 completed), and July, August and September of 2022 (Jul 22 wave, 1817 initiated, 1067 completed). We are not able to determine whether any respondents participated in multiple waves, i.e., whether any of the observations come from repeat participants. All participants provided informed consent and received compensation. Participants had to be 18 years or older and perform at least half of the household food preparation. No data were collected during major holidays, i.e., the weeks of the Fourth of July (Independence Day), Christmas, or New Year's. Recruitment quotas were implemented to ensure sufficient representation by geographical region, race, and age group. Post-hoc sample weights were constructed to reflect population characteristics on age, income and household size. The protocol was approved by the local Internal Review Board.
The approach begins with participants completing an initial survey that ends with an announcement that a follow-up survey will arrive in about one week, and that for the next 7 days, participants should pay close attention to the amounts of different foods their household throws away, feeds to animals or composts because the food is past date, spoiled or no longer wanted for other reasons. They are told to exclude items they would normally not eat, such as bones, pits, and shells. Approximately 7 days later they receive the follow-up survey, which elicits the amount of waste in up to 24 categories of food and includes other questions (see supplemental materials for core survey questions in Li et al. 2023). Waste amounts in each category are reported by selecting one of several ranges of possible amounts. The gram weight for categories with volumetric ranges (e.g., listed in cups) was derived by assigning an appropriate mass to the midpoint of the selected range, consistent with the food category. For categories with highly variable weight per volume (e.g., a cup of raw asparagus weighs about 7 times more than a cup of raw chopped arugula), we use the profile of items most consumed in the United States to determine the appropriate gram weight. For display purposes, the 24 categories are consolidated into 8 more general categories. Total weekly household food waste is calculated by summing the reported gram amounts across all categories. We divide this total by the number of household members to generate the per person weekly food waste amount.
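The description mentions post-hoc sample weights that align the sample with population characteristics but does not spell out how they were built. As a hedged illustration of the general idea only, the Python sketch below post-stratifies on a single variable: each respondent's weight is the population share of their group divided by that group's share of the sample. The age groups, population shares and waste values are invented for the example, and the authors' actual procedure (which involves age, income and household size) may differ.

```python
from collections import Counter


def poststratification_weights(sample_groups, population_shares):
    """Weight each respondent by population share / sample share of their group."""
    n = len(sample_groups)
    counts = Counter(sample_groups)
    return [population_shares[g] / (counts[g] / n) for g in sample_groups]


def weighted_mean(values, weights):
    """Weighted average of per person weekly waste (or any other outcome)."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Illustrative sample: younger respondents are over-represented (3 of 4).
age_groups = ["18-34", "18-34", "18-34", "35+"]
waste_grams = [100.0, 100.0, 100.0, 200.0]

# Hypothetical population shares for the two age groups.
population_shares = {"18-34": 0.5, "35+": 0.5}

weights = poststratification_weights(age_groups, population_shares)
unweighted = sum(waste_grams) / len(waste_grams)
weighted = weighted_mean(waste_grams, weights)
print(unweighted, weighted)
```

In this toy sample the weighted mean moves toward the under-represented group's value, which is the correction such weights are meant to provide.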
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Associations between disability, comorbidities and sociodemographic characteristics and functional limitations among elderly individuals in Brazil.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Socio-demographic characteristics of MDR-TB cases and their controls.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Knowledge and attitude of food handlers and bivariable analysis with infection prevention practices among food handlers at food and drink establishments in Dessie City and Kombolcha Town, Northeastern Ethiopia, July–August 2020.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Frequency and percentage distribution of attitude towards COVID-19 infection prevention practices among food handlers at food and drink establishments in Dessie City and Kombolcha Town, Northeastern Ethiopia, July–August 2020.