In this paper, we investigate the use of Bayesian networks to construct large-scale diagnostic systems. In particular, we consider the development of large-scale Bayesian networks by composition. This compositional approach reflects how (often redundant) subsystems are architected to form systems such as electrical power systems. We develop high-level specifications, Bayesian networks, clique trees, and arithmetic circuits representing 24 different electrical power systems. The largest among these 24 Bayesian networks contains over 1,000 random variables. Another BN represents the real-world electrical power system ADAPT, which is representative of electrical power systems deployed in aerospace vehicles. In addition to demonstrating the scalability of the compositional approach, we briefly report on experimental results from the diagnostic competition DXC, where the ProADAPT team, using techniques discussed here, obtained the highest scores in both Tier 1 (among 9 international competitors) and Tier 2 (among 6 international competitors) of the industrial track. While we consider diagnosis of power systems specifically, we believe this work is relevant to other system health management problems, in particular in dependable systems such as aircraft and spacecraft. Reference: O. J. Mengshoel, S. Poll, and T. Kurtoglu. "Developing Large-Scale Bayesian Networks by Composition: Fault Diagnosis of Electrical Power Systems in Aircraft and Spacecraft." Proc. of the IJCAI-09 Workshop on Self-* and Autonomous Systems (SAS): Reasoning and Integration Challenges, 2009. BibTeX Reference: @inproceedings{mengshoel09developing, title = {Developing Large-Scale {Bayesian} Networks by Composition: Fault Diagnosis of Electrical Power Systems in Aircraft and Spacecraft}, author = {Mengshoel, O. J. and Poll, S. and Kurtoglu, T.}, booktitle = {Proc. of the IJCAI-09 Workshop on Self-$\star$ and Autonomous Systems (SAS): Reasoning and Integration Challenges}, year={2009} }
Summary data for the studies used in the meta-analysis of local adaptation (Table 1 from the publication)
This table contains the data used in this published meta-analysis. The data were originally extracted from the publications listed in the table. The file corresponds to Table 1 in the original publication.
tb1.xls
SAS script used to perform meta-analyses
This file contains the essential elements of the SAS script used to perform meta-analyses published in Hoeksema & Forde 2008. Multi-factor models were fit to the data using weighted maximum likelihood estimation of parameters in a mixed model framework, using SAS PROC MIXED, in which the species traits and experimental design factors were considered fixed effects, and a random between-studies variance component was estimated. Significance (at alpha = 0.05) of individual factors in these models was determined using randomization procedures with 10,000 iterations (performed with a combination of macros in SAS), in which effect sizes a...
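The description above is cut off before it says exactly what the randomization procedure shuffles. The sketch below is therefore only an illustration of the general idea, not the authors' SAS macros: it assumes a single two-level factor, uses the difference in weighted mean effect size as the test statistic, and shuffles the factor labels across studies. All names and numbers are hypothetical.

```python
import random


def weighted_mean(pairs):
    """Weighted mean of (effect_size, weight) pairs."""
    total_weight = sum(w for _, w in pairs)
    return sum(e * w for e, w in pairs) / total_weight


def permutation_p_value(effects, weights, groups, n_iter=10000, seed=0):
    """Two-sided randomization p-value for a two-level factor (labels 0/1).

    Assumption: the factor labels are shuffled across studies; the source
    text is truncated and does not confirm this detail.
    """
    rng = random.Random(seed)

    def statistic(labels):
        g0 = [(e, w) for e, w, g in zip(effects, weights, labels) if g == 0]
        g1 = [(e, w) for e, w, g in zip(effects, weights, labels) if g == 1]
        return abs(weighted_mean(g1) - weighted_mean(g0))

    observed = statistic(groups)
    labels = list(groups)
    hits = 0
    for _ in range(n_iter):
        rng.shuffle(labels)  # group sizes are preserved by shuffling
        if statistic(labels) >= observed:
            hits += 1
    # Add-one correction so the p-value is never exactly zero.
    return (hits + 1) / (n_iter + 1)


# Made-up effect sizes for ten studies, five per factor level.
effects = [0.10, 0.20, 0.15, 0.12, 0.18, 0.90, 1.00, 0.95, 0.92, 0.98]
weights = [1.0] * 10
groups = [0] * 5 + [1] * 5
p_value = permutation_p_value(effects, weights, groups, n_iter=2000)
```

A factor would be called significant at alpha = 0.05 when `p_value` falls below 0.05. The mixed-model fit itself (PROC MIXED) is not reproduced here.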
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
SAS code for spatial optimization of the supply chain network for nitrogen-based fertilizer in North America, by type, by mode of transportation, per county, for all major crops, using Proc OptModel. The code specifies a set of random values to run the mixed-integer stochastic spatial optimization model repeatedly and collects the results of each simulation, which are then compiled and exported to be projected in GIS (geographic information systems). Certain supply nodes (fertilizer plants) are specified to work at 70 percent of their capacities or more. Capacities for supply nodes (fertilizer plants), demand nodes (county centroids), and transshipment nodes (transfer points, where the mode may change), as well as the actual distance travelled, are specified over arcs.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Parameter estimates for the generalized H2 model (SAS output).
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Example of the code used to account for statistical significances for phenotype and other variables.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Mortality rates were calculated as defined in the text. Summary statistics for Black cervical cancer mortality rates in thirteen U.S. states from 1975 to 2010.
Multienvironment trials (METs) enable the evaluation of the same genotypes under a variety of environments and management conditions. We present META (Multi Environment Trial Analysis), a suite of 31 SAS programs that analyze METs with complete or incomplete block designs, with or without adjustment by a covariate. The entire program is run through a graphical user interface. The program can produce boxplots or histograms for all traits, as well as univariate statistics. It also calculates best linear unbiased estimators (BLUEs) and best linear unbiased predictors for the main response variable and BLUEs for all other traits. For all traits, it calculates variance components by restricted maximum likelihood, least significant difference, coefficient of variation, and broad-sense heritability using PROC MIXED. The program can analyze each location separately, combine the analysis by management conditions, or combine all locations. The flexibility and simplicity of use of this program make it a valuable tool for analyzing METs in breeding and agronomy. The META program can be used by any researcher who knows only a few fundamental principles of SAS.
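As a rough illustration of how broad-sense heritability can be derived from variance components, the sketch below uses the common entry-mean formula for trials combined across environments. The formula, the function name, and the numbers are assumptions for illustration; the abstract does not state which expression META uses.

```python
def broad_sense_heritability(var_genotype, var_gxe, var_error, n_env, n_rep):
    """Entry-mean broad-sense heritability across environments.

    H2 = Vg / (Vg + Vge / n_env + Ve / (n_env * n_rep))

    Assumption: this is the usual textbook form, not necessarily the exact
    expression implemented in the META SAS programs.
    """
    if n_env <= 0 or n_rep <= 0:
        raise ValueError("n_env and n_rep must be positive")
    phenotypic = var_genotype + var_gxe / n_env + var_error / (n_env * n_rep)
    return var_genotype / phenotypic


# Hypothetical variance components, e.g. as estimated by REML.
h2 = broad_sense_heritability(
    var_genotype=4.0, var_gxe=2.0, var_error=6.0, n_env=2, n_rep=3
)
```

With these made-up inputs the phenotypic variance of an entry mean is 4 + 1 + 1 = 6, so the heritability is 4/6.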
https://creativecommons.org/share-your-work/public-domain/pdm
This data collection contains Supplemental Nutrition Assistance Program (SNAP) SAS proc contents (metadata only) files for Arizona (AZ), Hawaii (HI), Illinois (IL), Kentucky (KY), New Jersey (NJ), New York (NY), Oregon (OR), Tennessee (TN), and Virginia (VA).
The data set is a crosswalk file for working with 2020 Census block group boundaries and Philadelphia Police Department districts and police service areas (PSAs). Census block group population centroids were situated in police geographies using SAS Proc GINSIDE. The data facilitate demographic approximations of the residential population within Philadelphia police districts and police service areas (PSAs).
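Situating a centroid in a police geography is a point-in-polygon test. The sketch below shows the idea with a plain ray-casting check. It is a stand-in for illustration, not a description of how Proc GINSIDE works internally, and the district shapes, identifiers, and coordinates are made up.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon.

    polygon is a list of (x, y) vertices; the last vertex connects back
    to the first. Points exactly on an edge are not handled specially.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through y?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def build_crosswalk(centroids, districts):
    """Map each block group id to the first district containing its centroid."""
    crosswalk = {}
    for geoid, (x, y) in centroids.items():
        crosswalk[geoid] = next(
            (name for name, poly in districts.items()
             if point_in_polygon(x, y, poly)),
            None,  # centroid falls outside every district
        )
    return crosswalk


# Hypothetical districts (two adjacent squares) and block group centroids.
districts = {
    "A": [(0, 0), (2, 0), (2, 2), (0, 2)],
    "B": [(2, 0), (4, 0), (4, 2), (2, 2)],
}
centroids = {"bg1": (1, 1), "bg2": (3, 1), "bg3": (5, 5)}
crosswalk = build_crosswalk(centroids, districts)
```

Real district boundaries would come from a shapefile in a projected or geographic coordinate system; the squares here only keep the example self-contained.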
CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
The data are at the block group level and include coordinates for population centroids. Population centroids were situated in police geographies using SAS Proc GINSIDE. The data facilitate demographic approximations of the residential population within Philadelphia police districts and police service areas (PSAs). UPDATE: PLEASE NOTE, IN MAY OF 2024 THE 9th AND 6th DISTRICTS WERE MERGED. THIS CROSSWALK WAS CREATED BEFORE THAT CHANGE.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Mortality rates were calculated as defined in the text. Summary statistics for White cervical cancer mortality rates in 13 U.S. states from 1975 to 2010.
analyze the current population survey (cps) annual social and economic supplement (asec) with r the annual march cps-asec has been supplying the statistics for the census bureau's report on income, poverty, and health insurance coverage since 1948. wow. the us census bureau and the bureau of labor statistics (bls) tag-team on this one. until the american community survey (acs) hit the scene in the early aughts (2000s), the current population survey had the largest sample size of all the annual general demographic data sets outside of the decennial census - about two hundred thousand respondents. this provides enough sample to conduct state- and a few large metro area-level analyses. your sample size will vanish if you start investigating subgroups by state - consider pooling multiple years. county-level is a no-no. despite the american community survey's larger size, the cps-asec contains many more variables related to employment, sources of income, and insurance - and can be trended back to harry truman's presidency. aside from questions specifically asked about an annual experience (like income), many of the questions in this march data set should be treated as point-in-time statistics. cps-asec generalizes to the united states non-institutional, non-active duty military population. the national bureau of economic research (nber) provides sas, spss, and stata importation scripts to create a rectangular file (rectangular data means only person-level records; household- and family-level information gets attached to each person). to import these files into r, the parse.SAScii function uses nber's sas code to determine how to import the fixed-width file, then RSQLite to put everything into a schnazzy database. you can try reading through the nber march 2012 sas importation code yourself, but it's a bit of a proc freak show.
this new github repository contains three scripts:

2005-2012 asec - download all microdata.R
  download the fixed-width file containing household, family, and person records
  import by separating this file into three tables, then merge 'em together at the person-level
  download the fixed-width file containing the person-level replicate weights
  merge the rectangular person-level file with the replicate weights, then store it in a sql database
  create a new variable - one - in the data table

2012 asec - analysis examples.R
  connect to the sql database created by the 'download all microdata' program
  create the complex sample survey object, using the replicate weights
  perform a boatload of analysis examples

replicate census estimates - 2011.R
  connect to the sql database created by the 'download all microdata' program
  create the complex sample survey object, using the replicate weights
  match the sas output shown in the png file below

2011 asec replicate weight sas output.png
  statistic and standard error generated from the replicate-weighted example sas script contained in this census-provided person replicate weights usage instructions document.

for more detail about the current population survey - annual social and economic supplement (cps-asec), visit:
  the census bureau's current population survey page
  the bureau of labor statistics' current population survey page
  the current population survey's wikipedia article

notes: interviews are conducted in march about experiences during the previous year. the file labeled 2012 includes information (income, work experience, health insurance) pertaining to 2011. when you use the current population survey to talk about america, subtract a year from the data file name. as of the 2010 file (the interview focusing on america during 2009), the cps-asec contains exciting new medical out-of-pocket spending variables most useful for supplemental (medical spending-adjusted) poverty research.
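to make the replicate-weight step a little more concrete, here is a minimal python sketch of how a standard error can be computed once the statistic has been re-estimated under each replicate weight. the scripts themselves are written in r; python is used here only for illustration. the 4/R multiplier is an assumption based on the usual successive-difference replication convention for this survey, so check the census-provided usage instructions document before relying on it. all numbers are made up.

```python
import math


def replicate_standard_error(full_estimate, replicate_estimates, multiplier=4.0):
    """Standard error from replicate-weight estimates.

    variance = (multiplier / R) * sum((theta_r - theta_0) ** 2)

    Assumption: multiplier = 4 follows the successive-difference
    replication convention; it is not taken from the text above.
    """
    n_reps = len(replicate_estimates)
    if n_reps == 0:
        raise ValueError("need at least one replicate estimate")
    squared_diffs = sum((r - full_estimate) ** 2 for r in replicate_estimates)
    return math.sqrt(multiplier / n_reps * squared_diffs)


# Hypothetical estimates: one from the main weight, four from replicates.
full = 100.0
replicates = [101.0, 99.0, 102.0, 98.0]
se = replicate_standard_error(full, replicates)
```

in practice each entry of `replicates` would be the same statistic recomputed with one column of the person-level replicate weights.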
confidential to sas, spss, stata, sudaan users: why are you still rubbing two sticks together after we've invented the butane lighter? time to transition to r. :D
The files submitted here contain data collected for the thesis titled "Relative preference for pecking blocks and its association with keel status and eggshell quality in laying hens housed in enriched cages." The purpose of this research was to determine the pecking block preferences of White and Brown feathered laying hen strains, and whether there is a time-of-day effect on pecking block use. We then investigated the association between pecking block preference, pecking block use, keel status, and eggshell quality. We also investigated whether laying hens are consistent in their pecking block preference over time. Data on weekly pecking block disappearance, number of hens using pecking blocks across the day, eggshell quality, and keel status in focal birds were also assessed. Data were analyzed using SAS Proc GLIMMIX, and consistency data were analyzed using SAS Proc Freq.
The current study examined how racial/ethnic self-identification combines with gender to shape self-reports of everyday discrimination among youth in the U.S. as they transition to adulthood. Data came from seven waves of the Panel Study of Income Dynamics Transition into Adulthood Supplement (TAS). The sample included individuals with two or more observations who identified as White, Black, or Hispanic (n=2,532). Data includes average everyday discrimination scale scores over 9 time periods (i.e., ages 18 to 27) as well as pattern variables for race/ethnicity and sex groups and family SES proxied by highest level of education in household at baseline. Developmental trajectories of everyday discrimination across ages 18 to 27 were estimated using multilevel longitudinal models with the SAS Proc Mixed procedure.
CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
The dataset includes SAS code and associated data files (.csv and .xlsx) containing data from Nigerian catfish farmers. The .xlsx file includes the main variables and the formulas for the other derived variables. The SAS code uses PROC GLM to produce Type III sums of squares, effect size measures (Partial Eta Squared, Semi-Partial Eta Squared, Partial Omega Squared, and Semi-Partial Omega Squared), and the linear regression estimates.
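The four effect size measures can be computed from the sums of squares in an ANOVA table. The sketch below uses common textbook definitions, which are assumed, not confirmed, to match what the dataset's SAS code reports. The function name and input values are hypothetical.

```python
def effect_sizes(ss_effect, df_effect, ss_error, df_error, ss_total, n_obs):
    """Eta-squared and omega-squared measures for one model effect.

    Assumption: textbook formulas, with MSE = ss_error / df_error.
    """
    mse = ss_error / df_error
    adjusted = ss_effect - df_effect * mse  # bias-corrected effect SS
    return {
        "partial_eta_sq": ss_effect / (ss_effect + ss_error),
        "semipartial_eta_sq": ss_effect / ss_total,
        "partial_omega_sq": adjusted / (ss_effect + (n_obs - df_effect) * mse),
        "semipartial_omega_sq": adjusted / (ss_total + mse),
    }


# Made-up one-way ANOVA: 3 groups, 30 observations.
sizes = effect_sizes(
    ss_effect=30.0, df_effect=2, ss_error=60.0, df_error=27,
    ss_total=90.0, n_obs=30,
)
```

In a one-way design like this example the partial and semi-partial versions coincide; they differ once the model has several effects, because the semi-partial measures divide by the total sum of squares.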
We collected black spruce needles in August 1999 from several sites located near Fairbanks, AK, and Delta, AK. Three sites (designated here as S1, S2, and S3) were located in Fairbanks, Alaska (64°40′ N, 148°15′ W; S1: 64°52.164′ N, 147°51.462′ W; S2: 64°52.058′ N, 147°51.378′ W; S3: 64°51.603′ N, 147°52.789′ W), with 2 additional sites (S4, S5) in Delta, Alaska (64°10′ N, 145°30′ W). In each stand, 30 trees were randomly selected for sampling. Within each tree, five shoots were collected from the southern aspect of the mid-canopy height from each of the following age classes: 0-, 1-, 4-, 9-, and 19-years old. In two of the Fairbanks stands (S2, S3), 19-year-old needles were not present. A total of 690 samples over the five stands include 150 samples per stand in three stands (S1, S4, S5) and 120 samples per stand in the other two stands (S2, S3). Needles were returned to the lab and nitrogen content was determined. We used a three-factor nested analysis of variance (ANOVA) to evaluate differences in needle N concentration among the ages of needles on a tree, among trees nested within a stand, and among stands. Ages and stands were treated as fixed factors and tree was treated as a random factor. The ANOVA and means testing for these differences used Proc GLM in the SAS statistical package with needle N content as the dependent variable and needle age, tree, and stand as independent variables. The GLM procedure handled the problem of missing treatment combinations within the ANOVA. Type III sum of squares was used to test the effects of factors without interactions.
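The sample counts above can be checked with a few lines of arithmetic. The sketch assumes one sample per tree per available age class, which is the reading consistent with the reported 150 and 120 samples per stand; the variable names are illustrative.

```python
TREES_PER_STAND = 30
ALL_AGE_CLASSES = (0, 1, 4, 9, 19)        # needle age in years
NO_19_YEAR = (0, 1, 4, 9)                 # stands lacking 19-year-old needles

# Age classes available in each stand, as described in the text.
stands = {
    "S1": ALL_AGE_CLASSES,
    "S2": NO_19_YEAR,
    "S3": NO_19_YEAR,
    "S4": ALL_AGE_CLASSES,
    "S5": ALL_AGE_CLASSES,
}

# Assumption: one sample per tree per age class.
samples_per_stand = {
    stand: TREES_PER_STAND * len(ages) for stand, ages in stands.items()
}
total_samples = sum(samples_per_stand.values())
```

The missing 19-year class in S2 and S3 is what produces the unbalanced design that the GLM procedure had to accommodate.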
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Text S2. SAS code. (PROC QTL) (SAS 64 kb)
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Summary of SAS Proc Traj results for three groups based on mouse alcohol consumption data.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
*Estimation by maximum likelihood method, SAS PROC PHREG, option ties = breslow ([32]).
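For readers unfamiliar with the Breslow treatment of tied event times, the sketch below evaluates the Breslow log partial likelihood for a single covariate. It only illustrates the quantity that is maximized; it is not PROC PHREG, and the data are made up.

```python
import math


def breslow_log_partial_likelihood(beta, times, events, x):
    """Cox log partial likelihood with Breslow's approximation for ties.

    For each distinct event time t with d events, the contribution is
    beta * (sum of x over the d events) - d * log(sum of exp(beta * x)
    over everyone still at risk at t).
    """
    loglik = 0.0
    event_times = sorted({t for t, e in zip(times, events) if e})
    for t in event_times:
        event_x = [xi for ti, ei, xi in zip(times, events, x) if ti == t and ei]
        risk_sum = sum(math.exp(beta * xi) for ti, xi in zip(times, x) if ti >= t)
        loglik += beta * sum(event_x) - len(event_x) * math.log(risk_sum)
    return loglik


# Hypothetical data: two tied events at time 1, one censored at 2, one event at 3.
times = [1, 1, 2, 3]
events = [1, 1, 0, 1]
x = [0.0, 1.0, 1.0, 0.0]
ll_at_zero = breslow_log_partial_likelihood(0.0, times, events, x)
```

At beta = 0 every subject has the same relative risk, so the value reduces to minus the sum of d times the log of the risk-set size, which makes a convenient sanity check.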
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Chronology of sampling for group 1 of conditioned black Angus and black Simmental calves transported for 12 or 36 h and rested for 0, 4, 8, or 12 h.