Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This formatted dataset (AnalysisDatabaseGBD) originates from raw data files from the Institute for Health Metrics and Evaluation (IHME) Global Burden of Disease Study (GBD2017), affiliated with the University of Washington. We are volunteer collaborators with IHME and are not employed by IHME or the University of Washington.
The population-weighted GBD2017 data cover male and female cohorts ages 15-69 years and include noncommunicable diseases (NCDs), body mass index (BMI), cardiovascular disease (CVD), and other health outcomes, together with associated dietary, metabolic, and other risk factors. The purpose of creating this population-weighted, formatted database is to explore the univariate and multiple regression correlations of health outcomes with risk factors. Our research hypothesis is that we can successfully model NCDs, BMI, CVD, and other health outcomes with their attributable risks.
These Global Burden of Disease data relate to the preprint: The EAT-Lancet Commission Planetary Health Diet compared with Institute of Health Metrics and Evaluation Global Burden of Disease Ecological Data Analysis.
The data include the following:
1. An analysis database of population-weighted GBD2017 data from 195 countries that includes over 40 health risk factors; noncommunicable disease deaths/100k/year for male and female cohorts ages 15-69 years (the primary outcome variable, which aggregates over 100 types of noncommunicable diseases); and over 20 individual noncommunicable diseases (e.g., ischemic heart disease, colon cancer).
2. A text file to import the analysis database into SAS
3. The SAS code to format the analysis database to be used for analytics
4. SAS code for deriving Tables 1, 2, 3 and Supplementary Tables 5 and 6
5. SAS code for deriving the multiple regression formula in Table 4.
6. SAS code for deriving the multiple regression formula in Table 5
7. SAS code for deriving the multiple regression formula in Supplementary Table 7
8. SAS code for deriving the multiple regression formula in Supplementary Table 8
9. The Excel files that accompanied the above SAS code to produce the tables
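The SAS programs listed above reproduce the published tables. Purely as a hedged illustration of the kind of analysis they describe (univariate correlations and population-weighted multiple regression of an NCD outcome on risk factors), here is a minimal R sketch; the file name and variable names are hypothetical stand-ins, not the dataset's actual column names.

# Hedged R sketch of the kind of regression described above; the dataset itself
# ships SAS code, and the file/column names below are hypothetical.
gbd <- read.csv("AnalysisDatabaseGBD.csv")   # hypothetical flat-file export of the analysis database

# univariate correlation of the primary outcome with one risk factor
cor(gbd$ncd_deaths_per_100k, gbd$mean_bmi, use = "complete.obs")

# population-weighted multiple regression of NCD deaths on several risk factors
fit <- lm(ncd_deaths_per_100k ~ mean_bmi + smoking_prev + sodium_intake,
          data = gbd, weights = cohort_population)
summary(fit)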
For questions, please email davidkcundiff@gmail.com. Thanks.
This publication provides all the information required to understand the PISA 2003 educational performance database and perform analyses in accordance with the complex methodologies used to collect and process the data. It enables researchers to both reproduce the initial results and to undertake further analyses. The publication includes introductory chapters explaining the statistical theories and concepts required to analyse the PISA data, including full chapters on how to apply replicate weights and undertake analyses using plausible values; worked examples providing full syntax in SAS®; and a comprehensive description of the OECD PISA 2003 international database. The PISA 2003 database includes micro-level data on student educational performance for 41 countries collected in 2003, together with students’ responses to the PISA 2003 questionnaires and the test questions. A similar manual is available for SPSS users.
The OECD Programme for International Student Assessment (PISA) surveys collected data on students’ performance in reading, mathematics and science, as well as contextual information on students’ background, home characteristics and school factors which could influence performance. This publication includes detailed information on how to analyse the PISA data, enabling researchers to both reproduce the initial results and undertake further analyses. In addition to covering the necessary techniques, the manual includes a detailed account of the PISA 2006 database and worked examples providing full syntax in SAS®.
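Both manuals devote full chapters to replicate weights and plausible values. As a hedged illustration of the plausible-value logic they teach (not the manuals' own SAS syntax), the sketch below combines a statistic computed separately on each of the five plausible values using the standard combining rules; the numbers are made-up placeholders.

# Hedged illustration of combining results across PISA's five plausible values;
# the per-PV estimates and sampling variances below are made-up placeholders.
pv_means <- c(492.1, 493.4, 491.8, 492.9, 493.0)   # statistic computed on each plausible value
pv_vars  <- c(2.10, 2.05, 2.12, 2.08, 2.11)        # sampling variance of each (from replicate weights)

M <- length(pv_means)
point_estimate   <- mean(pv_means)
within_variance  <- mean(pv_vars)                  # average sampling variance
between_variance <- var(pv_means)                  # measurement (imputation) variance
total_variance   <- within_variance + (1 + 1/M) * between_variance
standard_error   <- sqrt(total_variance)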
analyze the current population survey (cps) annual social and economic supplement (asec) with r

the annual march cps-asec has been supplying the statistics for the census bureau's report on income, poverty, and health insurance coverage since 1948. wow. the us census bureau and the bureau of labor statistics (bls) tag-team on this one. until the american community survey (acs) hit the scene in the early aughts (2000s), the current population survey had the largest sample size of all the annual general demographic data sets outside of the decennial census - about two hundred thousand respondents. this provides enough sample to conduct state- and a few large metro area-level analyses. your sample size will vanish if you start investigating subgroups by state - consider pooling multiple years. county-level is a no-no.

despite the american community survey's larger size, the cps-asec contains many more variables related to employment, sources of income, and insurance - and can be trended back to harry truman's presidency. aside from questions specifically asked about an annual experience (like income), many of the questions in this march data set should be treated as point-in-time statistics. cps-asec generalizes to the united states non-institutional, non-active duty military population.

the national bureau of economic research (nber) provides sas, spss, and stata importation scripts to create a rectangular file (rectangular data means only person-level records; household- and family-level information gets attached to each person). to import these files into r, the parse.SAScii function uses nber's sas code to determine how to import the fixed-width file, then RSQLite to put everything into a schnazzy database. you can try reading through the nber march 2012 sas importation code yourself, but it's a bit of a proc freak show.

this new github repository contains three scripts:

2005-2012 asec - download all microdata.R
download the fixed-width file containing household, family, and person records
import by separating this file into three tables, then merge 'em together at the person-level
download the fixed-width file containing the person-level replicate weights
merge the rectangular person-level file with the replicate weights, then store it in a sql database
create a new variable - one - in the data table

2012 asec - analysis examples.R
connect to the sql database created by the 'download all microdata' program
create the complex sample survey object, using the replicate weights
perform a boatload of analysis examples

replicate census estimates - 2011.R
connect to the sql database created by the 'download all microdata' program
create the complex sample survey object, using the replicate weights
match the sas output shown in the png file below

2011 asec replicate weight sas output.png
statistic and standard error generated from the replicate-weighted example sas script contained in this census-provided person replicate weights usage instructions document.

click here to view these three scripts

for more detail about the current population survey - annual social and economic supplement (cps-asec), visit:
the census bureau's current population survey page
the bureau of labor statistics' current population survey page
the current population survey's wikipedia article

notes: interviews are conducted in march about experiences during the previous year. the file labeled 2012 includes information (income, work experience, health insurance) pertaining to 2011. when you use the current population survey to talk about america, subtract a year from the data file name. as of the 2010 file (the interview focusing on america during 2009), the cps-asec contains exciting new medical out-of-pocket spending variables most useful for supplemental (medical spending-adjusted) poverty research. confidential to sas, spss, stata, sudaan users: why are you still rubbing two sticks together after we've invented the butane lighter? time to transition to r. :D
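the download script described above leans on the nber sas importation code to lay out the fixed-width file. as a hedged sketch of that step with the SAScii package, here's the general shape; the file names are hypothetical placeholders for whatever you downloaded.

# hedged sketch of importing a fixed-width cps-asec file using the nber sas layout;
# file names below are hypothetical placeholders.
library(SAScii)

sas_layout <- "cpsmar2012.sas"       # nber-provided sas importation script
fwf_file   <- "asec2012.dat"         # fixed-width microdata file

layout <- parse.SAScii(sas_layout)   # recover column positions/widths from the sas INPUT block
head(layout)

asec <- read.SAScii(fwf_file, sas_layout)   # apply that layout and return a data.frame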
The simulated synthetic aperture sonar (SAS) data presented here were generated using PoSSM [Johnson and Brown 2018]. The data are suitable for bistatic, coherent signal processing and can be beamformed into acoustic seafloor imagery. Included in this data package are simulated sonar data in Generic Data Format (GDF) files, a description of the GDF file contents, example SAS imagery, and supporting information about the simulated scenes. In total, there are eleven 60 m x 90 m scenes, labeled scene00 through scene10, with scene00 provided with the scatterers in isolation, i.e. no seafloor texture. This is provided for beamformer testing purposes and should result in an image similar to the one labeled "PoSSM-scene00-scene00-starboard-0.tif" in the Related Data Sets tab. The ten other scenes have varying degrees of model variation as described in "Description_of_Simulated_SAS_Data_Package.pdf". A description of the data and the model is found in the associated document "Description_of_Simulated_SAS_Data_Package.pdf", and a description of the format in which the raw binary data are stored is found in the related document "PSU_GDF_Format_20240612.pdf". The format description also includes MATLAB code that will effectively parse the data to aid in signal processing and image reconstruction. It is left to the researcher to develop a beamforming algorithm suitable for coherent signal and image processing. Each 60 m x 90 m scene is represented by 4 raw (not beamformed) GDF files, labeled sceneXX-STARBOARD-000000 through 000003. It is possible to beamform smaller scenes from any one of these 4 files; the four files are combined sequentially to form a 60 m x 90 m image. Also included are comma-separated value spreadsheets describing the locations of scatterers and objects of interest within each scene. In addition to the binary GDF data, a beamformed GeoTIFF image and single-look complex (SLC, "science" file) data for each scene are provided. The SLC (science) data are stored in the Hierarchical Data Format 5 (https://www.hdfgroup.org/), with the extension ".hdf5" to indicate the HDF5 format. The data are stored as 32-bit real and 32-bit complex values. A viewer is available that provides basic graphing, image display, and directory navigation functions (https://www.hdfgroup.org/downloads/hdfview/). The HDF file contains all the information necessary to reconstruct a synthetic aperture sonar image. All major and contemporary programming languages have library support for encoding/decoding the HDF5 format. Supporting documentation that outlines the positions of the seafloor scatterers is included in "Scatterer_Locations_Scene00.csv", while the locations of the objects of interest for scene01-scene10 are included in "Object_Locations_All_Scenes.csv". Portable Network Graphics (PNG) images that plot the locations of all the objects of interest in each scene in along-track and cross-track notation are provided.
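Since the SLC "science" files are standard HDF5, they can be inspected from any language with HDF5 bindings. The short R sketch below, using the Bioconductor rhdf5 package, is a hedged illustration only; the internal dataset paths are not documented here, so list them first with h5ls().

# Hedged sketch of inspecting one of the SLC .hdf5 science files in R (rhdf5);
# the file name is illustrative, and internal dataset paths must be taken from h5ls().
library(rhdf5)

f <- "scene01_science.hdf5"    # illustrative file name from the data package
h5ls(f)                        # list the groups and datasets actually present

# once a dataset path is known from h5ls(), read it, e.g.:
# slc <- h5read(f, "/path/from/h5ls")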
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Matching is frequently used in observational studies, especially in medical research. However, only a small number of articles with matching programs for the SAS software (SAS Institute Inc., Cary, NC, USA) are available, and even fewer are usable by inexperienced users of SAS software. This article presents a matching program for the SAS software and links to an online repository for examples and test data. The program enables matching on several variables and includes an in-depth explanation of the expressions used and how to customize the program. The selection of controls is randomized and automated, minimizing the risk of selection bias. The program also provides means for the researcher to test for incomplete matching.
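The published program is the SAS code referenced above. Purely as a hedged illustration of the general idea (randomized, automated selection of controls that agree with each case on the matching variables), here is a small R sketch; it is not the published program, and established R packages such as MatchIt cover real use cases.

# Hedged R illustration of exact matching with randomized control selection;
# NOT the published SAS program.
set.seed(1)

match_controls <- function(cases, controls, vars, ratio = 1) {
  picked <- vector("list", nrow(cases))
  used <- integer(0)                       # controls already taken by earlier cases
  for (i in seq_len(nrow(cases))) {
    # candidate controls agreeing with this case on every matching variable
    ok   <- Reduce(`&`, lapply(vars, function(v) controls[[v]] == cases[[v]][i]))
    pool <- setdiff(which(ok), used)
    if (length(pool) >= ratio) {
      sel <- pool[sample.int(length(pool), ratio)]   # randomized selection limits selection bias
      picked[[i]] <- sel
      used <- c(used, sel)
    } else {
      picked[[i]] <- integer(0)            # incomplete match: flagged for the researcher to review
    }
  }
  picked
}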
This SAS code extracts data from EU-SILC User Database (UDB) longitudinal files and edits it such that a file is produced that can be further used for differential mortality analyses. Information from the original D, R, H and P files is merged per person and possibly pooled over several longitudinal data releases. Vital status information is extracted from target variables DB110 and RB110, and time at risk between the first interview and either death or censoring is estimated based on quarterly date information. Apart from path specifications, the SAS code consists of several SAS macros. Two of them require parameter specification from the user. The other ones are just executed. The code was written in Base SAS, Version 9.4. By default, the output file contains several variables which are necessary for differential mortality analyses, such as sex, age, country, year of first interview, and vital status information. In addition, the user may specify the analytical variables by which mortality risk should be compared later, for example educational level or occupational class. These analytical variables may be measured either at the first interview (the baseline) or at the last interview of a respondent. The output file is available in SAS format and by default also in csv format.
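As a hedged sketch of the time-at-risk idea described above (quarterly date information between first interview and death or censoring), the R fragment below shows the arithmetic with hypothetical year/quarter inputs; the authoritative implementation is the SAS macro code itself.

# Hedged sketch of time at risk from quarterly dates; argument names are hypothetical.
quarters_since_2000 <- function(year, quarter) (year - 2000) * 4 + quarter

time_at_risk_years <- function(y_first, q_first, y_end, q_end) {
  (quarters_since_2000(y_end, q_end) - quarters_since_2000(y_first, q_first)) / 4
}

time_at_risk_years(2012, 1, 2015, 3)   # 3.5 years between 2012Q1 and 2015Q3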
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Example of a dataset for analyzing the adverse drug reaction (adr) associated with the concomitant use of two drugs (d1 and d2), for the listds data.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Example of a dataset of three patients for the drugds data.
analyze the health and retirement study (hrs) with r

the hrs is the one and only longitudinal survey of american seniors. with a panel starting its third decade, the current pool of respondents includes older folks who have been interviewed every two years as far back as 1992. unlike cross-sectional or shorter panel surveys, respondents keep responding until, well, death do us part. paid for by the national institute on aging and administered by the university of michigan's institute for social research, if you apply for an interviewer job with them, i hope you like werther's original.

figuring out how to analyze this data set might trigger your fight-or-flight synapses if you just start clicking around on michigan's website. instead, read pages numbered 10-17 (pdf pages 12-19) of this introduction pdf and don't touch the data until you understand figure a-3 on that last page. if you start enjoying yourself, here's the whole book. after that, it's time to register for access to the (free) data. keep your username and password handy, you'll need it for the top of the download automation r script. next, look at this data flowchart to get an idea of why the data download page is such a righteous jungle.

but wait, good news: umich recently farmed out its data management to the rand corporation, who promptly constructed a giant consolidated file with one record per respondent across the whole panel. oh so beautiful. the rand hrs files make much of the older data and syntax examples obsolete, so when you come across stuff like instructions on how to merge years, you can happily ignore them - rand has done it for you.

the health and retirement study only includes noninstitutionalized adults when new respondents get added to the panel (as they were in 1992, 1993, 1998, 2004, and 2010) but once they're in, they're in - respondents have a weight of zero for interview waves when they were nursing home residents; but they're still responding and will continue to contribute to your statistics so long as you're generalizing about a population from a previous wave (for example: it's possible to compute "among all americans who were 50+ years old in 1998, x% lived in nursing homes by 2010"). my source for that 411? page 13 of the design doc. wicked.

this new github repository contains five scripts:

1992 - 2010 download HRS microdata.R
loop through every year and every file, download, then unzip everything in one big party

import longitudinal RAND contributed files.R
create a SQLite database (.db) on the local disk
load the rand, rand-cams, and both rand-family files into the database (.db) in chunks (to prevent overloading ram)

longitudinal RAND - analysis examples.R
connect to the sql database created by the 'import longitudinal RAND contributed files' program
create two database-backed complex sample survey objects, using a taylor-series linearization design
perform a mountain of analysis examples with wave weights from two different points in the panel

import example HRS file.R
load a fixed-width file using only the sas importation script directly into ram with SAScii (http://blog.revolutionanalytics.com/2012/07/importing-public-data-with-sas-instructions-into-r.html)
parse through the IF block at the bottom of the sas importation script, blank out a number of variables
save the file as an R data file (.rda) for fast loading later

replicate 2002 regression.R
connect to the sql database created by the 'import longitudinal RAND contributed files' program
create a database-backed complex sample survey object, using a taylor-series linearization design
exactly match the final regression shown in this document provided by analysts at RAND as an update of the regression on pdf page B76 of this document

click here to view these five scripts

for more detail about the health and retirement study (hrs), visit:
michigan's hrs homepage
rand's hrs homepage
the hrs wikipedia page
a running list of publications using hrs

notes: exemplary work making it this far. as a reward, here's the detailed codebook for the main rand hrs file. note that rand also creates 'flat files' for every survey wave, but really, most every analysis you can think of is possible using just the four files imported with the rand importation script above. if you must work with the non-rand files, there's an example of how to import a single hrs (umich-created) file, but if you wish to import more than one, you'll have to write some for loops yourself. confidential to sas, spss, stata, and sudaan users: a tidal wave is coming. you can get water up your nose and be dragged out to sea, or you can grab a surf board. time to transition to r. :D
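as a hedged sketch of the database-backed, taylor-series linearized design those analysis scripts construct, here's the general shape of the survey-package call; the column and table names below are placeholders, not the exact rand hrs variable names.

# hedged sketch of a database-backed taylor-series linearization design;
# column and table names are placeholders to be replaced with the rand hrs fields.
library(survey)
library(RSQLite)

hrs_design <- svydesign(
  id      = ~psu_variable,        # sampling error unit (placeholder)
  strata  = ~stratum_variable,    # sampling error stratum (placeholder)
  weights = ~wave_weight,         # respondent weight for the wave analyzed (placeholder)
  nest    = TRUE,
  dbtype  = "SQLite",
  dbname  = "RAND.db",            # the sqlite database built by the import script
  data    = "rand_table"          # table name inside that database (placeholder)
)

svymean(~some_variable, hrs_design, na.rm = TRUE)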
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
analyze the survey of income and program participation (sipp) with r

if the census bureau's budget was gutted and only one complex sample survey survived, pray it's the survey of income and program participation (sipp). it's giant. it's rich with variables. it's monthly. it follows households over three, four, now five year panels. the congressional budget office uses it for their health insurance simulation. analysts read that sipp has person-month files, get scurred, and retreat to inferior options. the american community survey may be the mount everest of survey data, but sipp is most certainly the amazon. questions swing wild and free through the jungle canopy i mean core data dictionary. legend has it that there are still species of topical module variables that scientists like you have yet to analyze. ponce de león would've loved it here. ponce. what a name. what a guy.

the sipp 2008 panel data started from a sample of 105,663 individuals in 42,030 households. once the sample gets drawn, the census bureau surveys one-fourth of the respondents every four months, over four or five years (panel durations vary). you absolutely must read and understand pdf pages 3, 4, and 5 of this document before starting any analysis (start at the header 'waves and rotation groups'). if you don't comprehend what's going on, try their survey design tutorial.

since sipp collects information from respondents regarding every month over the duration of the panel, you'll need to be hyper-aware of whether you want your results to be point-in-time, annualized, or specific to some other period. the analysis scripts below provide examples of each. at every four-month interview point, every respondent answers every core question for the previous four months. after that, wave-specific addenda (called topical modules) get asked, but generally only regarding a single prior month. to repeat: core wave files contain four records per person, topical modules contain one. if you stacked every core wave, you would have one record per person per month for the duration of the panel. mmmassive. ~100,000 respondents x 12 months x ~4 years. have an analysis plan before you start writing code so you extract exactly what you need, nothing more. better yet, modify something of mine. cool?

this new github repository contains eight, you read me, eight scripts:

1996 panel - download and create database.R
2001 panel - download and create database.R
2004 panel - download and create database.R
2008 panel - download and create database.R
since some variables are character strings in one file and integers in another, initiate an r function to harmonize variable class inconsistencies in the sas importation scripts
properly handle the parentheses seen in a few of the sas importation scripts, because the SAScii package currently does not
create an rsqlite database, initiate a variant of the read.SAScii function that imports ascii data directly into a sql database (.db)
download each microdata file - weights, topical modules, everything - then read 'em into sql

2008 panel - full year analysis examples.R
define which waves and specific variables to pull into ram, based on the year chosen
loop through each of twelve months, constructing a single-year temporary table inside the database
read that twelve-month file into working memory, then save it for faster loading later if you like
read the main and replicate weights columns into working memory too, merge everything
construct a few annualized and demographic columns using all twelve months' worth of information
construct a replicate-weighted complex sample design with a fay's adjustment factor of one-half, again save it for faster loading later, only if you're so inclined
reproduce census-published statistics, not precisely (due to topcoding described here on pdf page 19)

2008 panel - point-in-time analysis examples.R
define which wave(s) and specific variables to pull into ram, based on the calendar month chosen
read that interview point (srefmon)- or calendar month (rhcalmn)-based file into working memory
read the topical module and replicate weights files into working memory too, merge it like you mean it
construct a few new, exciting variables using both core and topical module questions
construct a replicate-weighted complex sample design with a fay's adjustment factor of one-half
reproduce census-published statistics, not exactly cuz the authors of this brief used the generalized variance formula (gvf) to calculate the margin of error - see pdf page 4 for more detail - the friendly statisticians at census recommend using the replicate weights whenever possible. oh hayy, now it is.

2008 panel - median value of household assets.R
define which wave(s) and specific variables to pull into ram, based on the topical module chosen
read the topical module and replicate weights files into working memory too, merge once again
construct a replicate-weighted complex sample design with a...
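as a hedged sketch of the fay-adjusted replicate-weight design those scripts construct, here's the general shape of the survey-package call; the weight column names are placeholders for the sipp main and replicate weight fields.

# hedged sketch of a replicate-weighted design with a fay's adjustment of one-half;
# data frame and weight column names are placeholders.
library(survey)

sipp_design <- svrepdesign(
  data             = sipp_df,            # the twelve-month (or point-in-time) table read into ram
  weights          = ~main_weight,       # main person weight (placeholder)
  repweights       = "repwgt[0-9]+",     # replicate weight columns (placeholder regular expression)
  type             = "Fay",
  rho              = 0.5,                # fay's adjustment factor of one-half
  combined.weights = TRUE
)

svymean(~some_income_variable, sipp_design, na.rm = TRUE)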
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
analyze the consumer expenditure survey (ce) with r

the consumer expenditure survey (ce) is the primo data source to understand how americans spend money. participating households keep a running diary about every little purchase over the year. those diaries are then summed up into precise expenditure categories. how else are you gonna know that the average american household spent $34 (±2) on bacon, $826 (±17) on cellular phones, and $13 (±2) on digital e-readers in 2011? an integral component of the market basket calculation in the consumer price index, this survey recently became available as public-use microdata and they're slowly releasing historical files back to 1996. hooray!

for a taste of what's possible with ce data, look at the quick tables listed on their main page - these tables contain approximately a bazillion different expenditure categories broken down by demographic groups. guess what? i just learned that americans living in households with $5,000 to $9,999 of annual income spent an average of $283 (±90) on pets, toys, hobbies, and playground equipment (pdf page 3). you can often get close to your statistic of interest from these web tables. but say you wanted to look at domestic pet expenditure among only households with children between 12 and 17 years old. another one of the thirteen web tables - the consumer unit composition table - shows a few different breakouts of households with kids, but none matching that exact population of interest. the bureau of labor statistics (bls) (the survey's designers) and the census bureau (the survey's administrators) have provided plenty of the major statistics and breakouts for you, but they're not psychic. if you want to comb through this data for specific expenditure categories broken out by a you-defined segment of the united states' population, then let a little r into your life. fun starts now.

fair warning: only analyze the consumer expenditure survey if you are nerd to the core. the microdata ship with two different survey types (interview and diary), each containing five or six quarterly table formats that need to be stacked, merged, and manipulated prior to a methodologically-correct analysis. the scripts in this repository contain examples to prepare 'em all, just be advised that magnificent data like this will never be no-assembly-required. the folks at bls have posted an excellent summary of what's available - read it before anything else. after that, read the getting started guide. don't skim. a few of the descriptions below refer to sas programs provided by the bureau of labor statistics. you'll find these in the C:\My Directory\CES\2011\docs directory after you run the download program.

this new github repository contains three scripts:

2010-2011 - download all microdata.R
loop through every year and download every file hosted on the bls's ce ftp site
import each of the comma-separated value files into r with read.csv
depending on user-settings, save each table as an r data file (.rda) or stata-readable file (.dta)

2011 fmly intrvw - analysis examples.R
load the r data files (.rda) necessary to create the 'fmly' table shown in the ce macros program documentation.doc file
construct that 'fmly' table, using five quarters of interviews (q1 2011 thru q1 2012)
initiate a replicate-weighted survey design object
perform some lovely li'l analysis examples
replicate the %mean_variance() macro found in "ce macros.sas" and provide some examples of calculating descriptive statistics using unimputed variables
replicate the %compare_groups() macro found in "ce macros.sas" and provide some examples of performing t-tests using unimputed variables
create an rsqlite database (to minimize ram usage) containing the five imputed variable files, after identifying which variables were imputed based on pdf page 3 of the user's guide to income imputation
initiate a replicate-weighted, database-backed, multiply-imputed survey design object
perform a few additional analyses that highlight the modified syntax required for multiply-imputed survey designs
replicate the %mean_variance() macro found in "ce macros.sas" and provide some examples of calculating descriptive statistics using imputed variables
replicate the %compare_groups() macro found in "ce macros.sas" and provide some examples of performing t-tests using imputed variables
replicate the %proc_reg() and %proc_logistic() macros found in "ce macros.sas" and provide some examples of regressions and logistic regressions using both unimputed and imputed variables

replicate integrated mean and se.R
match each step in the bls-provided sas program "integrated mean and se.sas" but with r instead of sas
create an rsqlite database when the expenditure table gets too large for older computers to handle in ram
export a table "2011 integrated mean and se.csv" that exactly matches the contents of the sas-produced "2011 integrated mean and se.lst" text file

click here to view these three scripts for...
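as a hedged sketch of the replicate-weighted design step those scripts perform on the stacked 'fmly' table, here's the general shape of the survey-package call; verify the weight column names (finlwt21 and the 44 wtrep columns) against the downloaded files before relying on them.

# hedged sketch of the ce replicate-weighted design; weight column names should be
# verified against the fmli files, and the expenditure variable is just an example.
library(survey)

ce_design <- svrepdesign(
  data             = fmly,              # the five-quarter 'fmly' table built by the script
  weights          = ~finlwt21,         # full-sample weight
  repweights       = "wtrep[0-9]+",     # the 44 replicate weight columns
  type             = "BRR",
  combined.weights = TRUE
)

svymean(~totexppq, ce_design, na.rm = TRUE)   # e.g. total expenditures, previous quarter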
Dr. Kevin Bronson provides a second experiment year of Field 13 nitrogen and water management in cotton agricultural research data for computational analysis, including notation of field events and operations, an intermediate analysis mega-table of correlated and calculated parameters, laboratory analysis results generated during the experimentation, high-resolution plot-level intermediate data analysis tables of SAS process output, and the complete raw sensor-recorded logger outputs. The reflectance data are good; there are some errors in the CS data. See the included README file for operational details and further description of the measured data signals.

Summary: Active optical proximal cotton canopy sensing spatial data, along with additional related metrics, are presented. Agronomic nitrogen and irrigation management related field operations are listed. A unique research experimentation intermediate analysis table is made available, along with raw data. The raw data recordings and annotated table outputs with calculated vegetation indices (VIs) are made available. Plot polygon coordinate designations allow a re-intersection spatial analysis. Data were collected in the 2015 cotton season at Maricopa Agricultural Center, Arizona, USA. A high-throughput proximal plant phenotyping approach, via electronic sampling and data processing, is demonstrated using a modified high-clearance Hamby spray rig. Acquired data conform to the location's standard plant phenotyping methodologies. SAS and GIS compute processing output tables, including Excel-formatted examples, are presented, where data tabulation and analysis are available. Additional data illustration is offered as a report file with annotated time-series charts.

The weekly proximal sensing data collected include the primary canopy reflectance at six wavelengths. Lint and seed yields, first open boll biomass, and nitrogen uptake were also determined. Soil profile nitrate to 1.8 m depth was determined in 30-cm increments, before planting and after harvest. Nitrous oxide emissions were determined with 1-L vented chambers (samples taken at 0, 12, and 24 minutes). Nitrous oxide was determined by gas chromatography (electron capture detector).
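The annotated tables include calculated vegetation indices (VIs) derived from the six-wavelength canopy reflectance. As a hedged illustration only (not the dataset's own SAS calculations, and with band assignments assumed), the classic NDVI computation looks like this in R:

# Hedged illustration of one common vegetation index (NDVI) from red and
# near-infrared reflectance; band assignments and values are assumptions.
ndvi <- function(nir, red) (nir - red) / (nir + red)

ndvi(nir = 0.42, red = 0.07)   # ~0.71, typical of a dense healthy canopy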
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Example of a dataset of three patients for the aeds data.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
analyze the national health and nutrition examination survey (nhanes) with r

nhanes is this fascinating survey where doctors and dentists accompany survey interviewers in a little mobile medical center that drives around the country. while the survey folks are interviewing people, the medical professionals administer laboratory tests and conduct a real doctor's examination. the blood work and medical exam allow researchers like you and me to answer tough questions like, "how many people have diabetes but don't know they have diabetes?"

conducting the lab tests and the physical isn't cheap, so a new nhanes data set becomes available once every two years and only includes about twelve thousand respondents. since the number of respondents is so small, analysts often pool multiple years of data together. the replication scripts below give a few different examples of how multiple years of data can be pooled with r. the survey gets conducted by the centers for disease control and prevention (cdc), and generalizes to the united states non-institutional, non-active duty military population.

most of the data tables produced by the cdc include only a small number of variables, so importation with the foreign package's read.xport function is pretty straightforward. but that makes merging the appropriate data sets trickier, since it might not be clear what to pull for which variables. for every analysis, start with the table with 'demo' in the name -- this file includes basic demographics, weighting, and complex sample survey design variables. since it's quick to download the files directly from the cdc's ftp site, there's no massive ftp download automation script.

this new github repository contains five scripts:

2009-2010 interview only - download and analyze.R
download, import, save the demographics and health insurance files onto your local computer
load both files, limit them to the variables needed for the analysis, merge them together
perform a few example variable recodes
create the complex sample survey object, using the interview weights
run a series of pretty generic analyses on the health insurance questions

2009-2010 interview plus laboratory - download and analyze.R
download, import, save the demographics and cholesterol files onto your local computer
load both files, limit them to the variables needed for the analysis, merge them together
perform a few example variable recodes
create the complex sample survey object, using the mobile examination component (mec) weights
perform a direct-method age-adjustment and match figure 1 of this cdc cholesterol brief

replicate 2005-2008 pooled cdc oral examination figure.R
download, import, save, pool, recode, create a survey object, run some basic analyses
replicate figure 3 from this cdc oral health databrief - the whole barplot

replicate cdc publications.R
download, import, save, pool, merge, and recode the demographics file plus cholesterol laboratory, blood pressure questionnaire, and blood pressure laboratory files
match the cdc's example sas and sudaan syntax file's output for descriptive means
match the cdc's example sas and sudaan syntax file's output for descriptive proportions
match the cdc's example sas and sudaan syntax file's output for descriptive percentiles

replicate human exposure to chemicals report.R (user-contributed)
download, import, save, pool, merge, and recode the demographics file plus urinary bisphenol a (bpa) laboratory files
log-transform some of the columns to calculate the geometric means and quantiles
match the 2007-2008 statistics shown on pdf page 21 of the cdc's fourth edition of the report

click here to view these five scripts

for more detail about the national health and nutrition examination survey (nhanes), visit:
the cdc's nhanes homepage
the national cancer institute's page of nhanes web tutorials

notes: nhanes includes interview-only weights and interview + mobile examination component (mec) weights. if you only use questions from the basic interview in your analysis, use the interview-only weights (the sample size is a bit larger). i haven't really figured out a use for the interview-only weights -- nhanes draws most of its power from the combination of the interview and the mobile examination component variables. if you're only using variables from the interview, see if you can use a data set with a larger sample size like the current population survey (cps), national health interview survey (nhis), or medical expenditure panel survey (meps) instead. confidential to sas, spss, stata, sudaan users: why are you still riding around on a donkey after we've invented the internal combustion engine? time to transition to r. :D
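as a hedged sketch of the import-and-design step those scripts walk through, here's the general shape with foreign::read.xport and the survey package; the design variable names (SDMVSTRA, SDMVPSU, WTINT2YR) follow the public nhanes codebooks but should be verified for the cycle you analyze.

# hedged sketch of reading an nhanes demographics file and declaring the design;
# verify variable names against the cycle's codebook before relying on them.
library(foreign)
library(survey)

demo <- read.xport("DEMO_F.XPT")   # 2009-2010 demographics file downloaded from the cdc site

nhanes_design <- svydesign(
  id      = ~SDMVPSU,              # masked variance pseudo-psu
  strata  = ~SDMVSTRA,             # masked variance pseudo-stratum
  weights = ~WTINT2YR,             # interview weight (use the mec weight for exam/lab variables)
  nest    = TRUE,
  data    = demo
)

svymean(~RIDAGEYR, nhanes_design)  # e.g. mean age at screening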
Dr. Kevin Bronson provides a second year of nitrogen and water management in wheat agricultural research data for computational analysis. Ten irrigation treatments from a linear-move sprinkler were combined with nitrogen treatments. This dataset includes notation of field events and operations, an intermediate analysis mega-table of correlated and calculated parameters including laboratory analysis results generated during the experimentation, high-resolution plot-level intermediate data tables of SAS process output, and the complete raw sensor records and logger outputs. These proximal terrestrial high-throughput plant phenotyping data exemplify our early tri-metric field method, in which geo-referenced 5 Hz crop canopy height, temperature, and spectral signature are recorded coincidently to indicate plant health status. In this development period, our Proximal Sensing Cart Mark 1 (PSCM1) platform suspends a single cluster of sensors on a dual sliding vertical placement armature. See the included README file for operational details and further description of the measured data signals.

Summary: Active optical proximal wheat canopy sensing spatial data, along with additional related metrics such as thermal, are presented. Agronomic nitrogen and irrigation management related field operations are listed. A unique research experimentation intermediate analysis table is made available, along with raw data. The raw data recordings and annotated table outputs with calculated vegetation indices (VIs) are made available. Plot polygon coordinate designations allow a re-intersection spatial analysis. Data were collected in the 2014 season at Maricopa Agricultural Center, Arizona, USA. A high-throughput proximal plant phenotyping approach, via electronic sampling and data processing, is demonstrated. Data were acquired using USDA Maricopa's first mobile platforms, such as the Proximal Sensing Cart Mark 1, on which the first cluster sensor bracket design and rickshaw-inspired operator's handle were successfully employed. SAS and GIS compute processing output tables, including Excel-formatted examples, are presented, where intermediate data tabulation and analysis are available.

The weekly proximal sensing data collected include canopy reflectance at six wavelengths, ultrasonic distance sensing of canopy height, and infrared thermometry. Ten levels of gradient irrigation application from a linear-move sprinkler system were applied. Soil physical texture and fertility chemistry results are available. Durum wheat data include in-season biomass and plant N content, final total biomass, grain yield, grain nitrogen, and yellow berry assessment.
Dr. Kevin Bronson provides a small-area nitrogen and water management in guayule agricultural research dataset for computational analysis, including notation of field events and operations, an intermediate analysis table of correlated and calculated parameters with laboratory analysis results generated during the experimentation, high-resolution plot-level intermediate data analysis tables of SAS process output, and the complete raw sensor-recorded logger outputs. This data was collected during the beginning of our USDA Maricopa terrestrial proximal high-throughput plant phenotyping tri-metric method generation, in which 5 Hz crop canopy height, temperature, and spectral signature are recorded coincidently to indicate plant health status. In this early development period, our Proximal Sensing Cart Mark 1 (PSCM1) platform supplants people carrying the CropCircle (CC) sensors, with improved viewing and mechanical performance as a result.

Summary: Active optical proximal cotton canopy sensing spatial data, along with additional related metrics such as thermal, are presented. Agronomic nitrogen and irrigation management related field operations are listed. A unique research experimentation intermediate analysis table is made available, along with raw data. The raw data recordings and annotated table outputs with calculated vegetation indices (VIs) are made available. Plot polygon coordinate designations allow a re-intersection spatial analysis. Data were collected in the 2013 season at Maricopa Agricultural Center, Arizona, USA. A high-throughput proximal plant phenotyping approach, via electronic sampling and data processing, is demonstrated. Data were acquired using USDA Maricopa's first mobile platforms, such as the Proximal Sensing Cart Mark 1, on which the first dual sliding arm configuration was deployed and platform clearance was successfully raised as design improvements. SAS and GIS compute processing output tables, including Excel-formatted examples, are presented, where intermediate data tabulation and analysis are available.

The weekly proximal sensing data collected include canopy reflectance at six wavelengths, ultrasonic distance sensing of canopy height, and infrared thermometry. Lint and seed yields, first open boll biomass, and nitrogen uptake were also determined. Soil profile nitrate to 1.8 m depth was determined in 30-cm increments, before planting and after harvest. Nitrous oxide emissions were determined for 20 or more weeks during the season with 1-L vented chambers (samples taken at 0, 12, and 24 minutes). Nitrous oxide was determined by gas chromatography (electron capture detector).
Important Note: This item is in mature support as of September 2023 and will be retired in December 2025. A new version of this item is available for your use. Esri recommends updating your maps and apps to use the new version.
The USGS Protected Areas Database of the United States (PAD-US) is the official inventory of public parks and other protected open space. The spatial data in PAD-US represents public lands held in trust by thousands of national, state and regional/local governments, as well as non-profit conservation organizations. Manager Type provides a coarse-level land manager description from the PAD-US "Agency Type" Domain, "Manager Type" Field (for example, Federal, State, Local Government, Private).

PAD-US is published by the U.S. Geological Survey (USGS) Science Analytics and Synthesis (SAS), Gap Analysis Project (GAP). GAP produces data and tools that help meet critical national challenges such as biodiversity conservation, recreation, public health, climate change adaptation, and infrastructure investment. See the GAP webpage for more information about GAP and other GAP data including species and land cover.

Dataset Summary
Phenomenon Mapped: This layer displays protected areas symbolized by manager type.
Coordinate System: Web Mercator Auxiliary Sphere
Extent: 50 United States plus Puerto Rico, the US Virgin Islands, the Northern Mariana Islands and other Pacific Ocean Islands
Visible Scale: 1:1,000,000 and larger
Source: U.S. Geological Survey (USGS) Science Analytics and Synthesis (SAS), Gap Analysis Project (GAP) PAD-US version 3.0
Publication Date: July 2022

Attributes included in this layer are:
Category
Owner Type
Owner Name
Local Owner
Manager Type
Manager Name
Local Manager
Designation Type
Local Designation
Unit Name
Local Name
Source
Public Access
GAP Status (Status 1, 2, 3 or 4)
GAP Status Description
International Union for Conservation of Nature (IUCN) Description (I: Strict Nature Reserve, II: National Park, III: Natural Monument or Feature, IV: Habitat/Species Management Area, V: Protected Landscape/Seascape, VI: Protected area with sustainable use of natural resources, Other conservation area, Unassigned)
Date of Establishment

The source data for this layer are available here.

What can you do with this Feature Layer?
Feature layers work throughout the ArcGIS system. Generally your workflow with feature layers will begin in ArcGIS Online or ArcGIS Pro. Below are just a few of the things you can do with a feature service in Online and Pro.

ArcGIS Online
Add this layer to a map in the map viewer. The layer is limited to scales of approximately 1:1,000,000 or larger, but a vector tile layer created from the same data can be used at smaller scales to produce a webmap that displays across the full range of scales. The layer or a map containing it can be used in an application.
Change the layer's transparency and set its visibility range.
Open the layer's attribute table and make selections and apply filters. Selections made in the map or table are reflected in the other. Center on selection allows you to zoom to features selected in the map or table, and show selected records allows you to view the selected records in the table.
Change the layer's style and filter the data. For example, you could set a filter for Gap Status Code = 3 to create a map of only the GAP Status 3 areas (a programmatic version of this filter is sketched at the end of this description).
Add labels and set their properties.
Customize the pop-up.

ArcGIS Pro
Add this layer to a 2d or 3d map. The same scale limit as Online applies in Pro.
Use as an input to geoprocessing. For example, copy features allows you to select then export portions of the data to a new feature class. Note that many features in the PAD-US database overlap. For example, wilderness area designations overlap US Forest Service and other federal lands. Any analysis should take this into consideration. An imagery layer created from the same data set can be used for geoprocessing analysis with larger extents and eliminates some of the complications arising from overlapping polygons.
Change the symbology and the attribute field used to symbolize the data.
Open the table and make interactive selections with the map.
Modify the pop-ups.
Apply Definition Queries to create sub-sets of the layer.

This layer is part of the Living Atlas of the World that provides an easy way to explore the landscape layers and many other beautiful and authoritative maps on hundreds of topics.
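The same kind of filter can also be applied programmatically against the layer's REST endpoint. The R sketch below is a hedged illustration only: the service URL and the GAP status field name are placeholders to be replaced with the values shown on this item's service page.

# Hedged sketch of pulling only GAP Status 3 features from the hosted layer's
# REST endpoint with sf; the URL and field name are placeholders.
library(sf)

layer_url <- "https://<server>/arcgis/rest/services/<PADUS_ManagerType>/FeatureServer/0"
query     <- "/query?where=GAP_Sts%3D'3'&outFields=*&f=geojson"

gap3 <- st_read(paste0(layer_url, query))   # returns the matching polygons as an sf object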
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset includes 4 files with segmentation results for 4 different ODT reconstructions of SH-SY5Y neuroblastoma cells. The segmentation results contain the variables described below.
All files are *.mat files.
The files REC_SH-SY5Y_1.mat, REC_SH-SY5Y_2.mat and REC_SH-SY5Y_3.mat consist of 7 variables:
RECON – tomographic reconstruction of SH-SY5Y neuroblastoma cell;
n_imm – refractive index of object immersion medium;
dx – object space sample size in XY [µm];
rayXY – xy-coordinates of illumination vectors;
maskManual – table with manually determined 3D binary masks of organelles;
maskCellpose – 3D binary mask of biological cell obtained through Cellpose;
maskODTSAS – table with 3D binary masks of biological cell and their organelles obtained through ODT-SAS.
File REC_SH-SY5Y_4.mat includes masks for the ODT-SAS and Cellpose segmentation of three closely packed cells and consists of 5 variables: RECON, n_imm, dx, maskCellpose and maskODTSAS.
Access a particular 3D binary mask from the 'maskManual' and 'maskODTSAS' tables using the following names: 'Cell', 'Nucleoli', 'LS'.
For example:
cellMask = maskODTSAS.Cell{1};
[1] Stringer, C., Wang, T., Michaelos, M., & Pachitariu, M. (2021). Cellpose: a generalist algorithm for cellular segmentation. Nature methods, 18(1), 100-106.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
G protein-coupled receptors (GPCRs) are one of the largest protein families in mammals. They mediate signal transduction across cell membranes and are important targets for the pharmaceutical industry. The G Protein-Coupled Receptors—Sequence Analysis and Statistics (GPCR-SAS) web application provides a set of tools to perform comparative analysis of sequence positions between receptors, based on a curated structural-informed multiple sequence alignment. The analysis tools include: (i) percentage of occurrence of an amino acid or motif and entropy at a position or range of positions, (ii) covariance of two positions, (iii) correlation between two amino acids in two positions (or two sequence motifs in two ranges of positions), and (iv) snake-plot representation for a specific receptor or for the consensus sequence of a group of selected receptors. The analysis of conservation of residues and motifs across transmembrane (TM) segments may guide the design of more selective ligands or help to rationalize activation mechanisms, among others. As an example, here we analyze the amino acids of the “transmission switch”, that initiates receptor activation following ligand binding. The tool is freely accessible at http://lmc.uab.cat/gpcrsas/.
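One of the reported statistics is the entropy at an alignment position. As a hedged illustration of that quantity (not the GPCR-SAS implementation), the R fragment below computes the Shannon entropy of the amino-acid distribution in a single alignment column.

# Hedged illustration of per-position Shannon entropy; not the GPCR-SAS code.
column_entropy <- function(column) {
  column <- column[column != "-"]       # ignore gaps
  p <- table(column) / length(column)   # amino-acid frequencies at this position
  -sum(p * log2(p))                     # entropy in bits; 0 means fully conserved
}

column_entropy(c("D", "D", "D", "E", "N"))   # a mixed position, entropy ≈ 1.37 bits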