Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Data file in SAS format.
The Fiscal Intermediary maintains the Provider Specific File (PSF). The file contains provider-specific information that affects computations for the Prospective Payment System. Provider Specific Files in SAS format are located in the Download section below for the following provider types: Inpatient, Skilled Nursing Facility, Home Health Agency, Hospice, Inpatient Rehab, Long Term Care, and Inpatient Psychiatric Facility.
The simulated synthetic aperture sonar (SAS) data presented here were generated using PoSSM [Johnson and Brown 2018]. The data are suitable for bistatic, coherent signal processing and can be used to form acoustic seafloor imagery. Included in this data package are simulated sonar data in Generic Data Format (GDF) files, a description of the GDF file contents, example SAS imagery, and supporting information about the simulated scenes. In total, there are eleven 60 m x 90 m scenes, labeled scene00 through scene10. Scene00 is provided with the scatterers in isolation, i.e., no seafloor texture; it is intended for beamformer testing and should result in an image similar to the one labeled "PoSSM-scene00-scene00-starboard-0.tif" in the Related Data Sets tab. The ten other scenes have varying degrees of model variation. A description of the data and the model is found in the associated document "Description_of_Simulated_SAS_Data_Package.pdf", and a description of the format in which the raw binary data is stored is found in the related document "PSU_GDF_Format_20240612.pdf". The format description also includes MATLAB code that will parse the data to aid in signal processing and image reconstruction. It is left to the researcher to develop a beamforming algorithm suitable for coherent signal and image processing. Each 60 m x 90 m scene is represented by 4 raw (not beamformed) GDF files, labeled sceneXX-STARBOARD-000000 through 000003; the four files are combined sequentially to form a 60 m x 90 m image, and smaller scenes can be beamformed from any one of the 4 files. Also included are comma-separated value spreadsheets describing the locations of scatterers and objects of interest within each scene. In addition to the binary GDF data, a beamformed GeoTIFF image and single-look complex (SLC, science file) data of each scene are provided.
The SLC (science) data are stored in Hierarchical Data Format 5 (https://www.hdfgroup.org/), with the extension ".hdf5" indicating the HDF5 format. The data are stored as 32-bit real and 32-bit complex values. A viewer is available that provides basic graphing, image display, and directory navigation functions (https://www.hdfgroup.org/downloads/hdfview/). The HDF file contains all the information necessary to reconstruct a synthetic aperture sonar image, and all major contemporary programming languages have library support for encoding/decoding the HDF5 format. Supporting documentation that outlines the positions of the seafloor scatterers is included in "Scatterer_Locations_Scene00.csv", while the locations of the objects of interest for scene01-scene10 are included in "Object_Locations_All_Scenes.csv". Portable Network Graphics (PNG) images that plot the locations of all the objects of interest in each scene in Along-Track and Cross-Track notation are also provided.
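As a minimal sketch of reading the science files in Python with the h5py library: the internal dataset names are documented in the accompanying PDFs rather than here, so this example discovers them at runtime instead of assuming any particular layout.

```python
import h5py

def list_datasets(path):
    """Walk an HDF5 file and return (name, shape, dtype) for every dataset."""
    entries = []
    def visitor(name, obj):
        if isinstance(obj, h5py.Dataset):
            entries.append((name, obj.shape, obj.dtype))
    with h5py.File(path, "r") as f:
        f.visititems(visitor)
    return entries

def read_dataset(path, name):
    """Load one dataset (e.g. the complex-valued SLC pixels) into a numpy array."""
    with h5py.File(path, "r") as f:
        return f[name][()]
```

Running `list_datasets` on a science file first, then `read_dataset` on a name from that listing, avoids guessing at the file's internal structure.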
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
This formatted dataset (AnalysisDatabaseGBD) originates from raw data files from the Institute for Health Metrics and Evaluation (IHME) Global Burden of Disease Study (GBD2017), affiliated with the University of Washington. We are volunteer collaborators with IHME and are not employed by IHME or the University of Washington.
The population weighted GBD2017 data are on male and female cohorts ages 15-69 years including noncommunicable diseases (NCDs), body mass index (BMI), cardiovascular disease (CVD), and other health outcomes and associated dietary, metabolic, and other risk factors. The purpose of creating this population-weighted, formatted database is to explore the univariate and multiple regression correlations of health outcomes with risk factors. Our research hypothesis is that we can successfully model NCDs, BMI, CVD, and other health outcomes with their attributable risks.
These Global Burden of Disease data relate to the preprint "The EAT-Lancet Commission Planetary Health Diet compared with Institute of Health Metrics and Evaluation Global Burden of Disease Ecological Data Analysis".
The data include the following:
1. Analysis database of population-weighted GBD2017 data that includes over 40 health risk factors and noncommunicable disease deaths/100k/year of male and female cohorts ages 15-69 years from 195 countries (the primary outcome variable, which includes over 100 types of noncommunicable diseases), plus over 20 individual noncommunicable diseases (e.g., ischemic heart disease, colon cancer, etc.).
2. A text file to import the analysis database into SAS.
3. The SAS code to format the analysis database to be used for analytics.
4. SAS code for deriving Tables 1, 2, and 3 and Supplementary Tables 5 and 6.
5. SAS code for deriving the multiple regression formula in Table 4.
6. SAS code for deriving the multiple regression formula in Table 5.
7. SAS code for deriving the multiple regression formula in Supplementary Table 7.
8. SAS code for deriving the multiple regression formula in Supplementary Table 8.
9. The Excel files that accompanied the above SAS code to produce the tables.
For questions, please email davidkcundiff@gmail.com. Thanks.
The raw data for each of the analyses are presented: baseline severity difference (probands only; Figure A in S1 Dataset), repeated measures analysis of change in lesion severity (Figure B in S1 Dataset), logistic regression of survivorship (Figure C in S1 Dataset), and time to cure (Figure D in S1 Dataset). Each data set is given as SAS code for the data itself, together with the equivalent analysis to that performed in JMP (and reported in the text). Data are presented in SAS format as this is a simple text format. The data and code were generated as direct exports from JMP, with additional SAS code added as needed (for instance, JMP does not export code for post-hoc tests). Note, however, that SAS rounds to less precision than JMP and can give slightly different results, especially for REML methods. (DOCX)
This SAS code extracts data from EU-SILC User Database (UDB) longitudinal files and edits it so that a file is produced that can be used for differential mortality analyses. Information from the original D, R, H, and P files is merged per person and possibly pooled over several longitudinal data releases. Vital status information is extracted from target variables DB110 and RB110, and time at risk between the first interview and either death or censoring is estimated based on quarterly date information. Apart from path specifications, the SAS code consists of several SAS macros. Two of them require parameter specification by the user; the others are simply executed. The code was written in Base SAS, Version 9.4. By default, the output file contains several variables necessary for differential mortality analyses, such as sex, age, country, year of first interview, and vital status information. In addition, the user may specify the analytical variables by which mortality risk should be compared later, for example educational level or occupational class. These analytical variables may be measured either at the first interview (the baseline) or at the last interview of a respondent. The output file is available in SAS format and, by default, also in csv format.
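As a hypothetical illustration of the time-at-risk step (the macros' actual date convention is not specified here), quarterly date information can be turned into an exposure time by placing each event at the midpoint of its quarter:

```python
def quarter_midpoint_years(year, quarter):
    """Represent (year, quarter) as a fractional year at the quarter's midpoint.

    This midpoint placement is an assumed convention for illustration,
    not necessarily the one the SAS macros use.
    """
    return year + (quarter - 0.5) / 4.0

def time_at_risk(first_year, first_q, end_year, end_q):
    """Years between the first interview and death or censoring."""
    return (quarter_midpoint_years(end_year, end_q)
            - quarter_midpoint_years(first_year, first_q))
```

For example, a first interview in Q1 2010 and censoring in Q3 2010 yields 0.5 years at risk under this convention.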
These data are part of NACJD's Fast Track Release and are distributed as they were received from the data depositor. The files have been zipped by NACJD for release, but not checked or processed except for the removal of direct identifiers. Users should refer to the accompanying readme file for a brief description of the files available with this collection and consult the investigator(s) if further information is needed. This study sought to examine any major changes in schools in the past two years as an evaluation of the Safe and Civil Schools Initiative. Students, faculty, and administrators were asked questions on topics including school safety, climate, and the discipline process. This collection includes 6 SAS data files: "psja_schools.sas7bdat" with 66 variables and 15 cases, "psja_schools_v01.sas7bdat" with 104 variables and 15 cases, "psja_staff.sas7bdat" with 39 variables and 2,921 cases, "psja_staff_v01.sas7bdat" with 202 variables and 2,398 cases, "psja_students.sas7bdat" with 97 variables and 4,382 cases, and "psja_students_v01.sas7bdat" with 332 variables and 4,267 cases. Additionally, the collection includes 1 SAS formats catalog, "formats.sas7bcat", and 10 SAS syntax files.
These data are part of NACJD's Fast Track Release and are distributed as they were received from the data depositor. The files have been zipped by NACJD for release, but not checked or processed except for the removal of direct identifiers. Users should refer to the accompanying readme file for a brief description of the files available with this collection and consult the investigator(s) if further information is needed. Adolescent females residing in Baltimore, Maryland, who were involved in a relationship with a history of violence were recruited to participate in this research study. Respondents were interviewed and then followed through daily diary entries for several months. The aim of the research was to understand the context of teen dating violence (TDV). Prior research on relationship context has not focused on minority populations; therefore, the focus of this project was urban, predominantly African American females. The available data in this collection include three SAS (.sas7bdat) files and a single SAS formats file that contains variable and value label information for all three data files. The three data files are: final_baseline.sas7bdat (157 cases / 252 variables), final_partnergrid.sas7bdat (156 cases / 76 variables), and hart_final.sas7bdat (7,004 cases / 23 variables).
Sabotaging milkweed by monarch caterpillars (Danaus plexippus) is a famous textbook example of disarming plant defence. By severing leaf veins, monarchs are thought to prevent the flow of toxic latex to their feeding site. Here, we show that sabotaging by monarch caterpillars is not only an avoidance strategy. While young caterpillars appear to avoid latex, late-instar caterpillars actively ingest exuding latex, presumably to increase sequestration of cardenolides used for defence against predators. Comparisons with caterpillars of the related but non-sequestering common crow butterfly (Euploea core) revealed three lines of evidence supporting our hypothesis. First, monarch caterpillars sabotage inconsistently, and therefore the behaviour is not obligatory to feed on milkweed, whereas sabotaging precedes each feeding event in Euploea caterpillars. Second, monarch caterpillars shift their behaviour from latex avoidance in younger to eager drinking in later stages, whereas Euploea caterpil...
Readme for the statistical documentation for the publication: Monarchs sabotage milkweed to acquire toxins, not to disarm plant defense. Authors: Anja Betz, Robert Bischoff, Georg Petschenka
For the statistical documentation, we provide the following: this readme, which gives a brief outline of the different files and data provided in the statistical documentation, and subfolders for each experiment containing
Disclaimer: Excel automatically formats numbers. We do not take any responsibility for automatic formatting of the numbers by Excel, which might lead to different results if the Excel files are used for analysis. The sas7bdat files, or the data at the start of the individual SAS analysis files, should be resistant to automatic formatting, so we suggest using them for analysis.
The datasets co...
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
File List: ACS.zip -- .zip file containing the SAS macro, example code, and example Aletris bracteata data sets:
acs.sas
chekika_ACS_estimation.sas
chekika_1.csv
chekika_2.csv
philippi.3.1.zip
Description "acs.sas" is a SAS macro for computing Horvitz-Thompson and Hansen-Horwitz estimates of population size for adaptive cluster sampling with random initial sampling. This version uses ugly base SAS code and does not require SQL or SAS products other than Base SAS, and should work with versions 8.2 onward (tested with versions 9.0 and 9.1). "chekika_ACS_estimation.sas" is example SAS code calling the acs macro to analyze the Chekika Aletris bracteata example data sets. "chekika_1.csv" is an example data set in ASCII comma-delimited format from adaptive cluster sampling of A. bracteata at Chekika, Everglades National Park, with 1-m2 quadrats. "chekika_2.csv" is an example data set in ASCII comma-delimited format from adaptive cluster sampling of A. bracteata at Chekika, Everglades National Park, with 4-m2 quadrats. "philippi.3.1.zip" metadata file generated by morpho, including both xml and css.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
This SAS macro generates childhood mortality estimates (neonatal, post-neonatal, infant (1q0), child (4q1), and under-five (5q0) mortality) and standard errors based on birth histories reported by women during a household survey. We have made the SAS macro flexible enough to accommodate a range of calculation specifications, including multi-stage sampling frames, simple random samples, or censuses. Childhood mortality rates are the component death probabilities of dying before a specific age. This SAS macro is based on a macro built by Keith Purvis at MeasureDHS. His method is described in Estimating Sampling Errors of Means, Total Fertility, and Childhood Mortality Rates Using SAS (www.measuredhs.com/pubs/pdf/OD17/OD17.pdf, section 4). More information about childhood mortality estimation can also be found in the Guide to DHS Statistics (www.measuredhs.com/pubs/pdf/DHSG1/Guide_DHS_Statistics.pdf, page 93). We allow the user to specify whether childhood mortality calculations should be based on 5 or 10 years of birth histories, when the birth history window ends, and how to handle age of death when it is reported in whole months (rather than days). The user can also calculate mortality rates within sub-populations and take account of a complex survey design (unequal probability and cluster samples). Finally, this SAS program is designed to read data in a number of different formats.
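Because the rates are component death probabilities, they combine multiplicatively rather than additively. A short sketch (not part of the macro, just the standard demographic identity) for combining infant and child mortality into under-five mortality:

```python
def under_five_mortality(q1_0, q4_1):
    """Combine infant (1q0) and child (4q1) death probabilities into
    under-five mortality (5q0): a child dies before age five unless it
    survives infancy and then also survives ages one through four."""
    return 1.0 - (1.0 - q1_0) * (1.0 - q4_1)

# Example: 1q0 = 0.050 and 4q1 = 0.020 give 5q0 = 0.069,
# conventionally reported as 69 deaths per 1,000 live births.
```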
The Community-Based Survey of Supports for Healthy Eating and Active Living (CBS HEAL) is a CDC survey of a nationally representative sample of U.S. municipalities designed to better understand existing community-level policies and practices that support healthy eating and active living. The survey collects information about policies such as nutrition standards, incentives for healthy food retail, bike/pedestrian-friendly design, and Complete Streets. About 2,000 municipalities respond to the survey. Participating municipalities receive a report that allows them to compare their policies and practices with other municipalities of similar geography, population size, and urban status. The CBS HEAL survey was first administered in 2014 and was administered again in 2021. Data are provided in multiple formats for download, including as a SAS file. A methods report and a SAS program for formatting the data are also provided.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
File formats:
.xls: Excel file with variable names in the first row and variable labels in the second row
.xpt/.xpf: SAS XPORT data file (.xpt) and value labels (formats.xpf).
.dta: Stata 13 data file
Note that the following variables were renamed in the output file: sumcadhssb -> SUMCADHS, sumcwursk -> SUMCWURS, adhdnotest -> ADHDNOTE, subs_subnotob -> SUBS_SUB, and that the internally recorded dataset name was shortened to "Liebrenz".
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
Approved for Public Release; distribution is unlimited.
Dataset title: Synthetic Aperture Sonar Seabed Environment Dataset (SASSED)
Date: June 2018
Description: This dataset contains 129 complex-valued, single (high frequency) channel, 1001x1001 pixel, synthetic aperture sonar snippets of various seafloor texture types. Each snippet contains one or more seabed environments, e.g., hardpack sand, mud, sea grass, rock, and sand ripple.
For each snippet there is a corresponding hand-segmented and -labeled "mask" image. The labels should not be interpreted as the ground truth for specific seafloor types. The labels were not verified by visual inspection of the actual seafloor environments or by any other method. Instead, interpret the labels as groupings of similar seafloor textures. Example code for preprocessing the data is included.
The data are stored in HDF5 format. The SAS data are stored under the hdf5 dataset 'snippets', and the hand-segmented labels are stored under 'labels'. For information on how to read HDF5 data, please visit one of the following websites: (general) https://support.hdfgroup.org/HDF5/ (Python) https://www.h5py.org
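A minimal Python sketch for reading the two documented datasets with h5py; the log-magnitude step is a common preprocessing choice for complex SAS imagery, assumed here, not the dataset's own example code.

```python
import h5py
import numpy as np

def load_sassed(path):
    """Read the complex SAS snippets and their hand-segmented masks."""
    with h5py.File(path, "r") as f:
        snippets = f["snippets"][()]  # complex-valued SAS imagery
        labels = f["labels"][()]      # texture-group masks
    return snippets, labels

def log_magnitude(snippet, eps=1e-12):
    """Log-magnitude image of a complex snippet, in dB; eps avoids log(0)."""
    return 20.0 * np.log10(np.abs(snippet) + eps)
```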
Acknowledgements: Thanks go to J. Tory Cobb for curating this dataset. Please credit NSWC Panama City Division in any publication using this data.
Past Usage: Cobb, J. T., & Zare, A. (2014). Boundary detection and superpixel formation in synthetic aperture sonar imagery. Proceedings of the Institute of Acoustics, 36(Pt 1).
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
SAS data set "neuro_morbi_indep.sas7bdat" contains data for publication "Neurologic morbidity and functional independence in adult survivors of childhood cancer". Variables and formats are in the file "Variables and formats for neuro_morbi_indep.docx".
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
The SAS code (Supplementary File 1) and the R program code (Supplementary File 2) are provided. For the analysis to proceed, this code requires an input data file (Supplementary Files 3-5) prepared in CSV format. The data can be stored in any format, such as xlsx, txt, xls, and others. Economic values are entered manually in the SAS code, but in the R code they are stored in an Excel file (Supplementary File 6).
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
The data source for this project work is a large collection of raw data publicly available on CDC website. “CDC is the nation’s leading science-based, data-driven, service organisation that protects the public’s health. In 1984, the CDC established the Behavioral Risk Factor Surveillance System (BRFSS). The BRFSS is the nation’s premier system of health-related telephone surveys that collect state data about U.S. residents regarding their health-related risk behaviours, chronic health conditions, and use of preventive services.” (CDC - BRFSS Annual Survey Data, 2020).
I have referred to a set of data collected between the years 2005 and 2021; it contains more than 7 million records in total (7,143,987 to be exact). For each year there are around 300 to 400 features available in the dataset, but not all of them are needed for this project, as some are irrelevant to my work. I have shortlisted a total of 22 features that are relevant for designing and developing my ML models, and I have explained them in detail in the table below.
The codebook (for the year 2021) explains these columns in more detail: https://www.cdc.gov/brfss/annual_data/2021/pdf/codebook21_llcp-v2-508.pdf
All datasets are obtained from the CDC website, where they are available as ZIP archives, each containing a SAS transport file with the .xpt extension. I downloaded all the ZIP files, extracted them, and then converted each one into .csv format so I could easily fetch the records in my project code. I used the command below in Anaconda Prompt to convert an .xpt file into a .csv file:
C:\users\mayur\Downloads> python -m xport LLCP2020.xpt > LLCP2020.csv
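An equivalent conversion can also be sketched with pandas, whose `read_sas` reader supports the SAS XPORT format (this is an alternative I am noting, not the command used above):

```python
from pathlib import Path
import pandas as pd

def xpt_to_csv(xpt_path, csv_path=None):
    """Convert a SAS transport (.xpt) file to CSV via pandas."""
    xpt_path = Path(xpt_path)
    csv_path = Path(csv_path) if csv_path else xpt_path.with_suffix(".csv")
    # pandas reads the XPORT format directly:
    df = pd.read_sas(xpt_path, format="xport")
    df.to_csv(csv_path, index=False)
    return csv_path

# e.g. xpt_to_csv("LLCP2020.xpt") writes LLCP2020.csv
```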
Custom license: https://dataverse.harvard.edu/api/datasets/:persistentId/versions/10.0/customlicense?persistentId=doi:10.7910/DVN/PNOFKI
InfoGroup’s Historical Business Backfile consists of geo-coded records of millions of US businesses and other organizations that contain basic information on each entity, such as: contact information, industry description, annual revenues, number of employees, year established, and other data. Each annual file consists of a “snapshot” of InfoGroup’s data as of the last day of each year, creating a time series of data 1997-2019. Access is restricted to current Harvard University community members. Use of Infogroup US Historical Business Data is subject to the terms and conditions of a license agreement (effective March 16, 2016) between Harvard and Infogroup Inc. and subject to applicable laws. Most data files are available in either .csv or .sas format. All data files are compressed into an archive in .gz, or GZIP, format. Extraction software such as 7-Zip is required to unzip these archives.
The ViC dataset is a collection for implementing a Dynamic Spectrum Access (DSA) system testbed in the CBRS band in the USA. The data come from a DSA system with two tiers of users: an incumbent user generating a chirp signal with a radar system, and a primary user transmitting an LTE-TDD signal with a CBSD base station system; these correspond to signal waveforms in the bands 3.55-3.56 GHz (Ch1) and 3.56-3.57 GHz (Ch2), respectively. There are a total of 12 classes: of the 16 possible on/off combinations of the two users in the two channels, the four combinations marked (X) below are excluded under the assumption that those cases are used by CBSD base stations. The labels of each data record have the following meanings:
0000 (0): All off
0001 (1): Ch2 - Radar on
0010 (2): Ch2 - LTE on
0011 (3): Ch2 - LTE, Radar on
0100 (4): Ch1 - Radar on
0101 (5): Ch1 - Radar on / Ch2 - Radar on
0110 (6): Ch1 - Radar on / Ch2 - LTE on
0111 (7): Ch1 - Radar on / Ch2 - LTE, Radar on
1000 (8): Ch1 - LTE on
1001 (9): Ch1 - LTE on / Ch2 - Radar on (X)
1010 (10): Ch1 - LTE on / Ch2 - LTE on (X)
1011 (11): Ch1 - LTE on / Ch2 - LTE, Radar on
1100 (12): Ch1 - LTE, Radar on
1101 (13): Ch1 - LTE, Radar on / Ch2 - Radar on (X)
1110 (14): Ch1 - LTE, Radar on / Ch2 - LTE on (X)
1111 (15): Ch1 - LTE, Radar on / Ch2 - LTE, Radar on
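The 4-bit codes above can be decoded mechanically; a small sketch (the function name is mine, not part of the dataset):

```python
# Decode a class label (0-15) into per-channel user activity.
# Bit layout, reading the 4-bit codes above left to right:
# bit 3 = Ch1 LTE, bit 2 = Ch1 Radar, bit 1 = Ch2 LTE, bit 0 = Ch2 Radar.
def decode_label(label):
    return {
        "ch1_lte":   bool(label & 0b1000),
        "ch1_radar": bool(label & 0b0100),
        "ch2_lte":   bool(label & 0b0010),
        "ch2_radar": bool(label & 0b0001),
    }
```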
This dataset has a total of 7 components: one raw dataset expressed in two file extensions, 4 processed datasets processed in different ways, and a label file. Except for one of the datasets, all are Python numpy files; the other is a csv file.
(Raw) The raw data are IQ data generated from testbeds created by imitating the Spectrum Access System (SAS) of CBRS in the United States. In the testbeds, the primary user was implemented using the LabVIEW communication tool and a USRP antenna (radar), and the secondary user was implemented by manufacturing the CBSD base station. The raw data exist in both csv and numpy formats.
(Processed) All of these data except one are normalized to values between 0 and 255 and consist of spectrograms, scalograms, and IQ data. The remaining one is a spectrogram dataset that is not normalized. They are measured over 250 us. In the case of the spectrograms and scalograms, the figure formed at 3.56 GHz to 3.57 GHz corresponds to channel 1, and at 3.55 GHz to 3.56 GHz corresponds to channel 2. Among them, signals transmitted from the CBSD base station appear in the form of LTE-TDD signals, and signals transmitted from the radar system appear in the form of chirp signals.
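The 0-255 normalization can be reproduced with a simple min-max scaling; a generic sketch, since the exact scaling the dataset authors applied is not specified here:

```python
import numpy as np

def scale_to_uint8(x):
    """Min-max scale an array into the 0-255 range used by the
    processed datasets (an assumed, generic scaling)."""
    x = np.asarray(x, dtype=np.float64)
    lo, hi = x.min(), x.max()
    if hi == lo:
        # constant input: no dynamic range to stretch
        return np.zeros(x.shape, dtype=np.uint8)
    return np.round(255.0 * (x - lo) / (hi - lo)).astype(np.uint8)
```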
(Label) All five of the above datasets share one label. The label has numpy format.
The National Sample Survey of Registered Nurses (NSSRN) Download makes data from the survey readily available to users in a one-stop download. The survey has been conducted approximately every four years since 1977. For each survey year, HRSA has prepared two Public Use File databases in flat ASCII file format without delimiters. The 2008 data are also offered in SAS and SPSS formats. Information likely to point to an individual in a sparsely populated county has been withheld. General Public Use Files are state-based and provide information on nurses without identifying the county and metropolitan area in which they live or work. County Public Use Files provide most, but not all, of the same information on the nurse from the General Public Use File, and also identify the county and metropolitan areas in which the nurses live or work. NSSRN data are to be used for research purposes only and may not be used in any manner to identify individual respondents.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
Data file in SAS format.