Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
data file in SAS format
The Fiscal Intermediary maintains the Provider Specific File (PSF). The file contains provider-specific information that affects computations for the Prospective Payment System. The Provider Specific Files in SAS format are located in the Download section below for the following provider types: Inpatient, Skilled Nursing Facility, Home Health Agency, Hospice, Inpatient Rehab, Long Term Care, and Inpatient Psychiatric Facility.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The raw data for each of the analyses are presented: baseline severity difference (probands only) (Figure A in S1 Dataset), repeated measures analysis of change in lesion severity (Figure B in S1 Dataset), logistic regression of survivorship (Figure C in S1 Dataset), and time to cure (Figure D in S1 Dataset). Each data set is given as SAS code containing the data itself and the analysis equivalent to that performed in JMP (and reported in the text). Data are presented in SAS format as this is a simple text format. The data and code were generated as direct exports from JMP, with additional SAS code added as needed (for instance, JMP does not export code for post-hoc tests). Note, however, that SAS rounds to less precision than JMP and can give slightly different results, especially for REML methods. (DOCX)
The simulated synthetic aperture sonar (SAS) data presented here was generated using PoSSM [Johnson and Brown 2018]. The data is suitable for bistatic, coherent signal processing and will form acoustic seafloor imagery. Included in this data package is simulated sonar data in Generic Data Format (GDF) files, a description of the GDF file contents, example SAS imagery, and supporting information about the simulated scenes. In total, there are eleven 60 m x 90 m scenes, labeled scene00 through scene10, with scene00 provided with the scatterers in isolation, i.e. no seafloor texture. This is provided for beamformer testing purposes and should result in an image similar to the one labeled "PoSSM-scene00-scene00-starboard-0.tif" in the Related Data Sets tab. The ten other scenes have varying degrees of model variation as described in "Description_of_Simulated_SAS_Data_Package.pdf". A description of the data and the model is found in the associated document called "Description_of_Simulated_SAS_Data_Package.pdf" and a description of the format in which the raw binary data is stored is found in the related document "PSU_GDF_Format_20240612.pdf". The format description also includes MATLAB code that will effectively parse the data to aid in signal processing and image reconstruction. It is left to the researcher to develop a beamforming algorithm suitable for coherent signal and image processing. Each 60 m x 90 m scene is represented by 4 raw (not beamformed) GDF files, labeled sceneXX-STARBOARD-000000 through 000003. It is possible to beamform smaller scenes from any one of these 4 files, i.e. the four files are combined sequentially to form a 60 m x 90 m image. Also included are comma separated value spreadsheets describing the locations of scatterers and objects of interest within each scene. In addition to the binary GDF data, a beamformed GeoTIFF image and single-look complex (SLC, science file) data are provided for each scene.
The SLC data (science) is stored in the Hierarchical Data Format 5 (https://www.hdfgroup.org/), and the file names are appended with ".hdf5" to indicate the HDF5 format. The data are stored as 32-bit real and 32-bit complex values. A viewer is available that provides basic graphing, image display, and directory navigation functions (https://www.hdfgroup.org/downloads/hdfview/). The HDF file contains all the information necessary to reconstruct a synthetic aperture sonar image. All major and contemporary programming languages have library support for encoding/decoding the HDF5 format. Supporting documentation that outlines positions of the seafloor scatterers is included in "Scatterer_Locations_Scene00.csv", while the locations of the objects of interest for scene01-scene10 are included in "Object_Locations_All_Scenes.csv". Portable Network Graphic (PNG) images that plot the locations of all the objects of interest in each scene in Along-Track and Cross-Track notation are provided.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This formatted dataset (AnalysisDatabaseGBD) originates from raw data files from the Institute of Health Metrics and Evaluation (IHME) Global Burden of Disease Study (GBD2017) affiliated with the University of Washington. We are volunteer collaborators with IHME and not employed by IHME or the University of Washington.
The population weighted GBD2017 data are on male and female cohorts ages 15-69 years including noncommunicable diseases (NCDs), body mass index (BMI), cardiovascular disease (CVD), and other health outcomes and associated dietary, metabolic, and other risk factors. The purpose of creating this population-weighted, formatted database is to explore the univariate and multiple regression correlations of health outcomes with risk factors. Our research hypothesis is that we can successfully model NCDs, BMI, CVD, and other health outcomes with their attributable risks.
These Global Burden of disease data relate to the preprint: The EAT-Lancet Commission Planetary Health Diet compared with Institute of Health Metrics and Evaluation Global Burden of Disease Ecological Data Analysis.
The data include the following:
1. Analysis database of population weighted GBD2017 data that includes over 40 health risk factors, noncommunicable disease deaths/100k/year of male and female cohorts ages 15-69 years from 195 countries (the primary outcome variable that includes over 100 types of noncommunicable diseases) and over 20 individual noncommunicable diseases (e.g., ischemic heart disease, colon cancer, etc).
2. A text file to import the analysis database into SAS
3. The SAS code to format the analysis database to be used for analytics
4. SAS code for deriving Tables 1, 2, 3 and Supplementary Tables 5 and 6
5. SAS code for deriving the multiple regression formula in Table 4.
6. SAS code for deriving the multiple regression formula in Table 5
7. SAS code for deriving the multiple regression formula in Supplementary Table 7
8. SAS code for deriving the multiple regression formula in Supplementary Table 8
9. The Excel files that accompanied the above SAS code to produce the tables
For questions, please email davidkcundiff@gmail.com. Thanks.
This SAS code extracts data from EU-SILC User Database (UDB) longitudinal files and edits it such that a file is produced that can be further used for differential mortality analyses. Information from the original D, R, H and P files is merged per person and possibly pooled over several longitudinal data releases. Vital status information is extracted from target variables DB110 and RB110, and time at risk between the first interview and either death or censoring is estimated based on quarterly date information. Apart from path specifications, the SAS code consists of several SAS macros. Two of them require parameter specification from the user. The other ones are just executed. The code was written in Base SAS, Version 9.4. By default, the output file contains several variables which are necessary for differential mortality analyses, such as sex, age, country, year of first interview, and vital status information. In addition, the user may specify the analytical variables by which mortality risk should be compared later, for example educational level or occupational class. These analytical variables may be measured either at the first interview (the baseline) or at the last interview of a respondent. The output file is available in SAS format and by default also in csv format.
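The quarterly time-at-risk estimate described above can be illustrated with a minimal Python sketch. This is not the SAS macro code itself: it assumes that each event (first interview, death, or censoring) is placed at the midpoint of its calendar quarter, and the function names `quarter_midpoint` and `time_at_risk` are hypothetical. The actual macros may use a different convention.

```python
def quarter_midpoint(year, quarter):
    """Return the midpoint of a calendar quarter as a decimal year.

    Assumption: an event known only to the quarter is placed at the
    middle of that quarter (for example, Q1 -> year + 0.125).
    """
    if quarter not in (1, 2, 3, 4):
        raise ValueError("quarter must be 1, 2, 3, or 4")
    return year + (quarter - 0.5) / 4.0


def time_at_risk(first_year, first_quarter, end_year, end_quarter):
    """Estimate years at risk between first interview and death or censoring."""
    start = quarter_midpoint(first_year, first_quarter)
    end = quarter_midpoint(end_year, end_quarter)
    if end < start:
        raise ValueError("end of follow-up precedes the first interview")
    return end - start


# First interview in Q1 2010, death or censoring in Q3 2012
print(time_at_risk(2010, 1, 2012, 3))
```

With the midpoint assumption, a respondent whose follow-up starts and ends in the same quarter contributes zero time at risk; other conventions (such as adding half a quarter) would change that.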
These data are part of NACJD's Fast Track Release and are distributed as they were received from the data depositor. The files have been zipped by NACJD for release, but not checked or processed except for the removal of direct identifiers. Users should refer to the accompanying readme file for a brief description of the files available with this collection and consult the investigator(s) if further information is needed. This study sought to examine any major changes in schools in the past two years as an evaluation of the Safe and Civil Schools Initiative. Students, faculty, and administrators were asked questions on topics including school safety, climate, and the discipline process. This collection includes 6 SAS data files: "psja_schools.sas7bdat" with 66 variables and 15 cases, "psja_schools_v01.sas7bdat" with 104 variables and 15 cases, "psja_staff.sas7bdat" with 39 variables and 2,921 cases, "psja_staff_v01.sas7bdat" with 202 variables and 2,398 cases, "psja_students.sas7bdat" with 97 variables and 4,382 cases, and "psja_students_v01.sas7bdat" with 332 variables and 4,267 cases. Additionally, the collection includes 1 SAS formats catalog, "formats.sas7bcat", and 10 SAS syntax files.
This database is a collection of maps created from the 28 SAS-2 observation files. The original observation files can be accessed within BROWSE by changing to the SAS2RAW database. For each of the SAS-2 observation files, the analysis package FADMAP was run and the resulting maps, plus GIF images created from these maps, were collected into this database. Each map is a 60 x 60 pixel FITS format image with 1 degree pixels. The user may reconstruct any of these maps within the captive account by running FADMAP from the command line after extracting a file from within the SAS2RAW database. The parameters used for selecting data for these product map files are embedded keywords in the FITS maps themselves. These parameters are set in FADMAP, and for the maps in this database are set as 'wide open' as possible. That is, except for selecting on each of 3 energy ranges, all other FADMAP parameters were set using broad criteria. To find more information about how to run FADMAP on the raw events file, the user can access help files within the SAS2RAW database or can use the 'fhelp' facility from the command line to gain information about FADMAP. This is a service provided by NASA HEASARC.
The SAS2RAW database is a log of the 28 SAS-2 observation intervals and contains target names, sky coordinates, start times, and other information for all 13056 photons detected by SAS-2. The original data came from 2 sources. The photon information was obtained from the Event Encyclopedia, and the exposures were derived from the original "Orbit Attitude Live Time" (OALT) tapes stored at NASA/GSFC. These data sets were combined into FITS format images at HEASARC. The images were formed by making the center pixel of a 512 x 512 pixel image correspond to the RA and DEC given in the event file. Each photon's RA and DEC was converted to a relative pixel in the image using Aitoff projections. All the raw data from the original SAS-2 binary data files are now stored in 28 FITS files. These images can be accessed and plotted using XIMAGE, and other columns of the FITS file extensions can be plotted with the FTOOL FPLOT. This is a service provided by NASA HEASARC.
This dataset contains birth registry information linked with greenery metrics in North Carolina. This dataset is not publicly accessible because: EPA cannot release personally identifiable information regarding living individuals, according to the Privacy Act and the Freedom of Information Act (FOIA). This dataset contains information about human research subjects. Because there is potential to identify individual participants and disclose personal information, either alone or in combination with other datasets, individual level data are not appropriate to post for public access. Restricted access may be granted to authorized persons by contacting the party listed. It can be accessed through the following means: Birth records can be requested through the NC State Health Department. Greenery metrics can be downloaded through EPA's EnviroAtlas. Format: Datasets are in CSV, R, and SAS formats. This dataset is associated with the following publication: Tsai, W., T. Luben, and K. Rappazzo. Associations between neighborhood greenery and birth outcomes in a North Carolina cohort. Journal of Exposure Science and Environmental Epidemiology. Nature Publishing Group, London, UK, 35(5): 821-830, (2025).
U.S. Government Works: https://www.usa.gov/government-works
License information was derived automatically
These data are part of NACJD's Fast Track Release and are distributed as they were received from the data depositor. The files have been zipped by NACJD for release, but not checked or processed except for the removal of direct identifiers. Users should refer to the accompanying readme file for a brief description of the files available with this collection and consult the investigator(s) if further information is needed. Adolescent females residing in Baltimore, Maryland who were involved in a relationship with a history of violence were sought to participate in this research study. Respondents were interviewed and then followed through daily diary entries for several months. The aim of the research was to understand the context of teen dating violence (TDV). Prior research on relationship context has not focused on minority populations; therefore, the focus of this project was urban, predominantly African American females. The available data in this collection include three SAS (.sas7bdat) files and a single SAS formats file that contains variable and value label information for all three data files. The three data files are: final_baseline.sas7bdat (157 cases / 252 variables); final_partnergrid.sas7bdat (156 cases / 76 variables); hart_final_sas7bdata (7004 cases / 23 variables).
https://dataverse.harvard.edu/api/datasets/:persistentId/versions/10.0/customlicense?persistentId=doi:10.7910/DVN/PNOFKI
InfoGroup’s Historical Business Backfile consists of geo-coded records of millions of US businesses and other organizations that contain basic information on each entity, such as: contact information, industry description, annual revenues, number of employees, year established, and other data. Each annual file consists of a “snapshot” of InfoGroup’s data as of the last day of each year, creating a time series of data 1997-2019. Access is restricted to current Harvard University community members. Use of Infogroup US Historical Business Data is subject to the terms and conditions of a license agreement (effective March 16, 2016) between Harvard and Infogroup Inc. and subject to applicable laws. Most data files are available in either .csv or .sas format. All data files are compressed into an archive in .gz, or GZIP, format. Extraction software such as 7-Zip is required to unzip these archives.
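For readers who prefer scripting to a desktop tool, a GZIP archive can also be unpacked with Python's standard library. The sketch below is illustrative only: the helper name `gunzip` and the file name in the usage line are hypothetical, and it assumes each .gz archive wraps a single file, which is how the GZIP format normally works.

```python
import gzip
import shutil
from pathlib import Path


def gunzip(archive_path, out_path=None):
    """Decompress a single .gz archive and return the path of the output file.

    If out_path is not given, the trailing ".gz" is dropped from the
    archive name (for example, "data.csv.gz" -> "data.csv").
    """
    archive_path = Path(archive_path)
    if out_path is None:
        out_path = archive_path.with_suffix("")
    out_path = Path(out_path)
    # Stream in chunks so large archives are never loaded fully into memory
    with gzip.open(archive_path, "rb") as src, open(out_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return out_path


# Usage (hypothetical file name):
# gunzip("business_1997.csv.gz")
```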
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
File formats:
.xls: Excel file with variable names in the first row and variable labels in the second row
.xpt/.xpf: SAS XPORT data file (.xpt) and value labels (formats.xpf). Note that the following variables were renamed in the output file: sumcadhssb -> SUMCADHS, sumcwursk -> SUMCWURS, adhdnotest -> ADHDNOTE, subs_subnotob -> SUBS_SUB, and that the internally recorded dataset name was shortened to "Liebrenz".
.dta: Stata 13 data file
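The renames listed above are consistent with the 8-character limit on variable names in the SAS XPORT (version 5) format: each new name is the first eight characters of the original, in upper case. The following Python sketch reproduces that pattern. The rule is inferred from the four listed renames rather than taken from the exporting tool, and the function name `xport_names` is hypothetical.

```python
def xport_names(names, max_len=8):
    """Map variable names to truncated, upper-cased XPORT-style names.

    Assumption: truncation plus upper-casing, as suggested by the
    renames listed in the dataset description. Raises an error when
    two names collide after truncation instead of guessing a fix.
    """
    mapping = {}
    used = set()
    for name in names:
        short = name[:max_len].upper()
        if short in used:
            raise ValueError(f"truncated name {short!r} is not unique")
        used.add(short)
        mapping[name] = short
    return mapping


renamed = xport_names(["sumcadhssb", "sumcwursk", "adhdnotest", "subs_subnotob"])
for old, new in renamed.items():
    print(f"{old} -> {new}")
```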
This dataset links information on sudden death in Wake County, NC with air pollution concentrations from a central site monitor. This dataset is not publicly accessible because: EPA cannot release personally identifiable information regarding living individuals, according to the Privacy Act and the Freedom of Information Act (FOIA). This dataset contains information about human research subjects. Because there is potential to identify individual participants and disclose personal information, either alone or in combination with other datasets, individual level data are not appropriate to post for public access. Restricted access may be granted to authorized persons by contacting the party listed. It can be accessed through the following means: SUDDEN data can be requested through the UNC PI, Dr. Ross Simpson. Air pollution data can be accessed through the AQS data mart. Greenspace metrics can be acquired through the National Land Cover Land Use database. Format: Datasets are in CSV and SAS formats. This dataset is associated with the following publication: Rappazzo, K., N. Egerstrom, J. Wu, A. Capone, G. Joodi, S. Keen, W. Cascio, and R. Simpson, Jr. Fine particulate matter-sudden death association modified by ventricular hypertrophy and inflammation: a case-crossover study. Frontiers in Public Health. Frontiers, Lausanne, SWITZERLAND, 12: 1367416, (2024).
https://www.icpsr.umich.edu/web/ICPSR/studies/35612/terms
These data are part of NACJD's Fast Track Release and are distributed as they were received from the data depositor. The files have been zipped by NACJD for release, but not checked or processed except for the removal of direct identifiers. Users should refer to the accompanying readme file for a brief description of the files available with this collection and consult the investigator(s) if further information is needed.
The School Health Center Healthy Adolescent Relationship Program (SHARP) was a school health center (SHC) provider-delivered multi-level intervention to reduce adolescent relationship abuse (ARA) among adolescents ages 14-19 seeking care in SHCs. This study tested the effectiveness of a brief relationship abuse education and counseling intervention in SHCs. The SHARP intervention consisted of three levels of integrated intervention:
1. A brief clinical intervention on healthy and unhealthy relationships for SHC (cisgender and transgender) male and female patients delivered by SHC providers during all clinic visits (evaluated via client pre- and post-surveys and chart review)
2. Development of an ARA-informed SHC staff and clinic environment (evaluated via provider pre- and post-training surveys and interviews)
3. SHC-based youth-led outreach activities within the school to promote healthy relationships and improve student safety (evaluated by focus groups with youth leaders and measures of school climate)
The collection consists of:
3 SAS data files: sharp_abuse_data_archive.sas7bdat (n=1,011; 272 variables); sharp_blt2exit_long_data_archive.sas7bdat (n=1,949; 259 variables); sharp_chart_data_archive_icpsr.sas7bdat (n=936; 24 variables)
2 Stata data files: SHARP_Provider Immediate Post_0829 and 0905 training_final-ICPSR.dta (n=38; 21 variables); SHARP_Provider Pre and Followup_final.dta-ICPSR.dta (n=66; 102 variables)
5 SAS syntax files: NIJ SHARP - Analyses.sas; NIJ SHARP - DataMgmt_Final.sas; NIJ SHARP - Formats.sas; SHARP - Chart Extraction Data-MASKED.sas; SHARP - Chart Extraction Formats.sas
3 Stata syntax files: code-for-SHARP-dating-violence-analyses-deidentified-MASKED.do; SHARP_Provider Data to Archive-MASKED.do; SHARP-analyses-deidentified-MASKED.do
3 PI provided codebooks: SHARP Codebook_Client Chart Data.xlsx (1 worksheet); SHARP Codebook_Client Survey Data.xlsx (3 worksheets); SHARP Codebook_Provider Survey Data.xlsx (1 worksheet)
For confidentiality reasons, qualitative data from focus groups are not currently available. Focus groups were conducted with each student outreach team following the conclusion of data collection. Discussions focused on awareness about ARA, the school-wide campaign, using the SHC as a resource, and what else can be done to prevent ARA in schools.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
File List
ACS.zip -- .zip file containing SAS macro and example code, and example Aletris bracteata data sets.
acs.sas
chekika_ACS_estimation.sas
chekika_1.csv
chekika_2.csv
philippi.3.1.zip
Description
"acs.sas" is a SAS macro for computing Horvitz-Thompson and Hansen-Hurwitz estimates of population size for adaptive cluster sampling with random initial sampling. This version uses ugly base SAS code and does not require SQL or SAS products other than Base SAS, and should work with versions 8.2 onward (tested with versions 9.0 and 9.1).
"chekika_ACS_estimation.sas" is example SAS code calling the acs macro to analyze the Chekika Aletris bracteata example data sets.
"chekika_1.csv" is an example data set in ASCII comma-delimited format from adaptive cluster sampling of A. bracteata at Chekika, Everglades National Park, with 1-m² quadrats.
"chekika_2.csv" is an example data set in ASCII comma-delimited format from adaptive cluster sampling of A. bracteata at Chekika, Everglades National Park, with 4-m² quadrats.
"philippi.3.1.zip" is a metadata file generated by Morpho, including both XML and CSS.
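The two estimators named above can be sketched in a few lines of Python. This is not a translation of "acs.sas": it is a minimal illustration of the standard modified Hansen-Hurwitz and Horvitz-Thompson estimators of a population total for adaptive cluster sampling, assuming a simple random initial sample drawn without replacement. The input layout (one tuple per initial unit) and the function names are assumptions made for the example.

```python
from math import comb


def hansen_hurwitz_total(N, initial_units):
    """Modified Hansen-Hurwitz estimate of the population total.

    N: number of units (quadrats) in the population.
    initial_units: one (network_id, network_size, network_total) tuple
    per unit in the initial sample. A unit that does not meet the
    adaptive condition is a network of size 1.
    """
    n = len(initial_units)
    # Average, over initial units, of the mean count in each unit's network
    network_means = [total / size for _, size, total in initial_units]
    return N * sum(network_means) / n


def horvitz_thompson_total(N, initial_units):
    """Horvitz-Thompson estimate using network inclusion probabilities."""
    n = len(initial_units)
    # Each distinct network counts once, even if hit by several initial units
    networks = {nid: (size, total) for nid, size, total in initial_units}
    estimate = 0.0
    for size, total in networks.values():
        # Probability that the initial sample intersects this network
        alpha = 1 - comb(N - size, n) / comb(N, n)
        estimate += total / alpha
    return estimate


# Hypothetical data: 10 quadrats, 3 initial units, two of them in network "b"
sample = [("a", 1, 0), ("b", 2, 6), ("b", 2, 6)]
print(hansen_hurwitz_total(10, sample))
print(horvitz_thompson_total(10, sample))
```

Edge units (units added during adaptive expansion that do not meet the condition) are deliberately left out of the input, since neither estimator uses them.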
https://search.gesis.org/research_data/datasearch-httpwww-da-ra-deoaip--oaioai-da-ra-de441277
Abstract (en): This study is part of a time-series collection of national surveys fielded continuously since 1952. The election studies are designed to present data on Americans' social backgrounds, enduring political predispositions, social and political values, perceptions and evaluations of groups and candidates, opinions on questions of public policy, and participation in political life. A Black supplement of 263 respondents, who were asked the same questions that were administered to the national cross-section sample, is included with the national cross-section of 1,571 respondents. In addition to the usual content, the study contains data on opinions about the Supreme Court, political knowledge, and further information concerning racial issues. Voter validation data have been included as an integral part of the election study, providing objective information from registration and voting records or from respondents' past voting behavior. ICPSR data undergo a confidentiality review and are altered when necessary to limit the risk of disclosure. ICPSR also routinely creates ready-to-go data files along with setups in the major statistical software formats as well as standard codebooks to accompany the data. In addition to these procedures, ICPSR performed the following processing steps for this data collection: Performed consistency checks; Standardized missing values; Performed recodes and/or calculated derived variables; Checked for undocumented or out-of-range codes. United States citizens of voting age living in private households in the continental United States. A representative cross-section sample, consisting of 1,571 respondents, plus a Black supplement sample of 263 respondents. 2015-11-10 The study metadata was updated. 1999-12-14 The data for this study are now available in SAS transport and SPSS export formats, in addition to the ASCII data file.
Variables in the dataset have been renumbered to the following format: 2-digit (or 2-character) year prefix + 4 digits + [optional] 1-character suffix. Dataset ID and version variables have also been added. In addition, SAS and SPSS data definition statements have been created for this collection, and the data collection instruments are now available as a PDF file. Face-to-face interview, telephone interview. The SAS transport file was created using the SAS CPORT procedure.
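The renumbered variable-name format described above lends itself to a simple pattern check. The Python sketch below is one possible reading of that description: it assumes the 2-character prefix is alphanumeric and the optional suffix is a single letter, neither of which the description states explicitly. The names `VAR_NAME` and `parse_variable_name` are hypothetical.

```python
import re

# Assumed pattern: 2 alphanumeric characters (year prefix), 4 digits,
# then an optional single-letter suffix
VAR_NAME = re.compile(
    r"(?P<prefix>[0-9A-Za-z]{2})(?P<number>[0-9]{4})(?P<suffix>[A-Za-z])?"
)


def parse_variable_name(name):
    """Split a variable name into (prefix, number, suffix), or return None."""
    match = VAR_NAME.fullmatch(name)
    if match is None:
        return None
    return match.group("prefix"), match.group("number"), match.group("suffix") or ""


# Hypothetical variable names used only to exercise the pattern
for candidate in ["921234", "921234a", "9212"]:
    print(candidate, parse_variable_name(candidate))
```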
https://creativecommons.org/publicdomain/zero/1.0/
The data source for this project work is a large collection of raw data publicly available on the CDC website. “CDC is the nation’s leading science-based, data-driven, service organisation that protects the public’s health. In 1984, the CDC established the Behavioral Risk Factor Surveillance System (BRFSS). The BRFSS is the nation’s premier system of health-related telephone surveys that collect state data about U.S. residents regarding their health-related risk behaviours, chronic health conditions, and use of preventive services.” (CDC - BRFSS Annual Survey Data, 2020).
I used data collected between 2005 and 2021, containing more than 7 million records in total (7,143,987 to be exact). For each year there are around 300 to 400 features available in the dataset, but not all of them are needed for this project, as some of them are irrelevant to my work. I have shortlisted a total of 22 features that are relevant for designing and developing my ML models, and I have explained them in detail in the table below.
The codebook for 2021 explains these columns in more detail: https://www.cdc.gov/brfss/annual_data/2021/pdf/codebook21_llcp-v2-508.pdf
All datasets were obtained from the CDC website, where they are available as Zip archives, each containing a SAS transport file with an .xpt extension. I downloaded all the Zip files, extracted them, and converted each one to .csv format so I could easily read the records in my project code. I used the following command in Anaconda Prompt to convert an .xpt file to a .csv file:
C:\users\mayur\Downloads> python -m xport LLCP2020.xpt > LLCP2020.csv
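An alternative to the command above is to do the conversion from a script. The sketch below is not what was used for this dataset: it assumes the pandas library is installed and relies on `pandas.read_sas` with `format="xport"`, reading in chunks so that large yearly files do not need to fit in memory. The helper names `csv_path_for` and `xpt_to_csv` are hypothetical.

```python
from pathlib import Path


def csv_path_for(xpt_path):
    """Return the output path: same name with a .csv extension."""
    return Path(xpt_path).with_suffix(".csv")


def xpt_to_csv(xpt_path, chunksize=50_000):
    """Convert a SAS transport (.xpt) file to CSV, chunk by chunk."""
    import pandas as pd  # assumed to be installed; imported only when needed

    out_path = csv_path_for(xpt_path)
    reader = pd.read_sas(xpt_path, format="xport", chunksize=chunksize)
    for i, chunk in enumerate(reader):
        # Write the header once, then append later chunks without it
        chunk.to_csv(
            out_path,
            mode="w" if i == 0 else "a",
            header=(i == 0),
            index=False,
        )
    return out_path


# Usage, with the file name from the command above:
# xpt_to_csv("LLCP2020.xpt")
```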
https://search.gesis.org/research_data/datasearch-httpwww-da-ra-deoaip--oaioai-da-ra-de457280
Abstract (en): This collection examines the characteristics of users and sellers of crack cocaine and the impact of users and sellers on the criminal justice system and on drug treatment and community programs. Information was also collected concerning users of drugs other than crack cocaine and the attributes of those users. Topics covered include initiation into substance use and sales, expenses for drug use, involvement with crime, sources of income, and primary substance of abuse. Demographic information includes subject's race, educational level, living area, social setting, employment status, occupation, marital status, number of children, place of birth, and date of birth. Information was also collected about the subject's parents: education level, occupation, and place of birth. ICPSR data undergo a confidentiality review and are altered when necessary to limit the risk of disclosure. ICPSR also routinely creates ready-to-go data files along with setups in the major statistical software formats as well as standard codebooks to accompany the data. In addition to these procedures, ICPSR performed the following processing steps for this data collection: Standardized missing values; Performed recodes and/or calculated derived variables; Checked for undocumented or out-of-range codes. Residents of two New York City neighborhoods, some of whom had been arrested for drug offenses, some of whom used drugs but had eluded arrest, and some of whom were participating in drug treatment programs. Respondents were selected through police records and snowball sampling methods. 2005-11-04 On 2005-03-14 new files were added to one or more datasets. These files included additional setup files as well as one or more of the following: SAS program, SAS transport, SPSS portable, and Stata system files. The metadata record was revised 2005-11-04 to reflect these additions. 2002-04-25 The data file was converted from card image to logical record length data format.
SAS and SPSS data definition statements were created, and the codebook was converted to PDF format. Funding institution(s): United States Department of Justice. Office of Justice Programs. National Institute of Justice (87-IJ-CX-0064). The codebook is provided by ICPSR as a Portable Document Format (PDF) file. The PDF file format was developed by Adobe Systems Incorporated and can be accessed using PDF reader software, such as the Adobe Acrobat Reader. Information on how to obtain a copy of the Acrobat Reader is provided on the ICPSR Web site.