98 datasets found
  1. DHS data extractors for Stata

    • search.dataone.org
    • dataverse.harvard.edu
    Updated Nov 21, 2023
    Cite
    Emily Oster (2023). DHS data extractors for Stata [Dataset]. http://doi.org/10.7910/DVN/RRX3QD
    Dataset updated
    Nov 21, 2023
    Dataset provided by
    Harvard Dataverse
    Authors
    Emily Oster
    Description

    This package contains two files designed to help read individual-level DHS data into Stata. The first file addresses the problem that versions of Stata before Version 7/SE will read in only up to 2,047 variables, and most of the individual files have more variables than that. The file will read in the .do, .dct and .dat files and output new .do and .dct files with only a subset of the variables, specified by the user. The second file deals with earlier DHS surveys in which .do and .dct files do not exist and only .sps and .sas files are provided. The file will read in the .sas and .sps files and output a .dct and .do file. If necessary, the first file can then be run again to select a subset of variables.
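
    The variable-subsetting idea behind the first file (keep only the variables you need, so the 2,047-variable ceiling never bites) carries over to any fixed-width reader. Here is a minimal Python sketch of the same idea; the byte ranges and variable names are hypothetical placeholders for entries taken from a survey's .dct dictionary:

    ```python
    # Sketch: read only a chosen subset of variables from a fixed-width
    # DHS .dat file. The offsets and names below are hypothetical; the
    # real ones come from the survey's .dct dictionary file.
    import pandas as pd

    subset = {                      # variable -> (start, end) byte offsets
        "caseid":   (0, 15),
        "v012_age": (15, 17),
        "v106_edu": (17, 18),
    }

    df = pd.read_fwf(
        "survey.dat",
        colspecs=list(subset.values()),
        names=list(subset.keys()),
    )
    print(df.head())
    ```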

  2. SAS-2 Map Product Catalog

    • s.cnmilf.com
    • catalog.data.gov
    Updated Sep 19, 2025
    Cite
    High Energy Astrophysics Science Archive Research Center (2025). SAS-2 Map Product Catalog [Dataset]. https://s.cnmilf.com/user74170196/https/catalog.data.gov/dataset/sas-2-map-product-catalog
    Dataset updated
    Sep 19, 2025
    Dataset provided by
    High Energy Astrophysics Science Archive Research Center
    Description

    This database is a collection of maps created from the 28 SAS-2 observation files. The original observation files can be accessed within BROWSE by changing to the SAS2RAW database. For each of the SAS-2 observation files, the analysis package FADMAP was run and the resulting maps, plus GIF images created from these maps, were collected into this database. Each map is a 60 x 60 pixel FITS format image with 1 degree pixels. The user may reconstruct any of these maps within the captive account by running FADMAP from the command line after extracting a file from within the SAS2RAW database. The parameters used for selecting data for these product map files are embedded as keywords in the FITS maps themselves. These parameters are set in FADMAP, and for the maps in this database they were set as 'wide open' as possible; that is, except for selecting on each of 3 energy ranges, all other FADMAP parameters were set using broad criteria. To find more information about how to run FADMAP on the raw events file, the user can access help files within the SAS2RAW database or can use the 'fhelp' facility from the command line to gain information about FADMAP. This is a service provided by NASA HEASARC.
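
    Because each map is a plain 60 x 60 FITS image, any standard FITS reader can open it outside of BROWSE and FADMAP. A minimal Python sketch using astropy; "sas2_map.fits" is a hypothetical filename for an extracted product map:

    ```python
    # Sketch: open a SAS-2 product map and read the FADMAP selection
    # parameters, which the description says are embedded as header
    # keywords. "sas2_map.fits" is a hypothetical filename.
    from astropy.io import fits

    with fits.open("sas2_map.fits") as hdul:
        image = hdul[0].data       # 60 x 60 array, 1-degree pixels
        header = hdul[0].header    # selection parameters live here
    print(image.shape)
    print(repr(header))
    ```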

  3. Current Population Survey (CPS)

    • search.dataone.org
    • dataverse.harvard.edu
    Updated Nov 21, 2023
    Cite
    Damico, Anthony (2023). Current Population Survey (CPS) [Dataset]. http://doi.org/10.7910/DVN/AK4FDD
    Dataset updated
    Nov 21, 2023
    Dataset provided by
    Harvard Dataverse
    Authors
    Damico, Anthony
    Description

    analyze the current population survey (cps) annual social and economic supplement (asec) with r. the annual march cps-asec has been supplying the statistics for the census bureau's report on income, poverty, and health insurance coverage since 1948. wow. the us census bureau and the bureau of labor statistics (bls) tag-team on this one. until the american community survey (acs) hit the scene in the early aughts (2000s), the current population survey had the largest sample size of all the annual general demographic data sets outside of the decennial census - about two hundred thousand respondents. this provides enough sample to conduct state- and a few large metro area-level analyses. your sample size will vanish if you start investigating subgroups by state - consider pooling multiple years. county-level is a no-no. despite the american community survey's larger size, the cps-asec contains many more variables related to employment, sources of income, and insurance - and can be trended back to harry truman's presidency. aside from questions specifically asked about an annual experience (like income), many of the questions in this march data set should be treated as point-in-time statistics. cps-asec generalizes to the united states non-institutional, non-active duty military population. the national bureau of economic research (nber) provides sas, spss, and stata importation scripts to create a rectangular file (rectangular data means only person-level records; household- and family-level information gets attached to each person). to import these files into r, the parse.SAScii function uses nber's sas code to determine how to import the fixed-width file, then RSQLite to put everything into a schnazzy database. you can try reading through the nber march 2012 sas importation code yourself, but it's a bit of a proc freak show.

    this new github repository contains three scripts:

    2005-2012 asec - download all microdata.R
    • download the fixed-width file containing household, family, and person records
    • import by separating this file into three tables, then merge 'em together at the person-level
    • download the fixed-width file containing the person-level replicate weights
    • merge the rectangular person-level file with the replicate weights, then store it in a sql database
    • create a new variable - one - in the data table

    2012 asec - analysis examples.R
    • connect to the sql database created by the 'download all microdata' program
    • create the complex sample survey object, using the replicate weights
    • perform a boatload of analysis examples

    replicate census estimates - 2011.R
    • connect to the sql database created by the 'download all microdata' program
    • create the complex sample survey object, using the replicate weights
    • match the sas output shown in the png file below

    2011 asec replicate weight sas output.png
    • statistic and standard error generated from the replicate-weighted example sas script contained in this census-provided person replicate weights usage instructions document

    click here to view these three scripts

    for more detail about the current population survey - annual social and economic supplement (cps-asec), visit:
    • the census bureau's current population survey page
    • the bureau of labor statistics' current population survey page
    • the current population survey's wikipedia article

    notes: interviews are conducted in march about experiences during the previous year. the file labeled 2012 includes information (income, work experience, health insurance) pertaining to 2011. when you use the current population survey to talk about america, subtract a year from the data file name. as of the 2010 file (the interview focusing on america during 2009), the cps-asec contains exciting new medical out-of-pocket spending variables most useful for supplemental (medical spending-adjusted) poverty research. confidential to sas, spss, stata, sudaan users: why are you still rubbing two sticks together after we've invented the butane lighter? time to transition to r. :D
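
    the workflow above - fixed-width microdata parsed with a dictionary, then stored in a local sql database - translates to other toolchains too. a minimal python sketch of the same idea (not one of the repository's r scripts), with hypothetical column positions standing in for the nber dictionary:

    ```python
    # sketch: parse a fixed-width cps-asec person file and store it in a
    # local sqlite database. column positions and names are hypothetical
    # placeholders; the real layout comes from the nber sas importation
    # script.
    import sqlite3
    import pandas as pd

    colspecs = [(0, 15), (15, 17), (17, 25)]        # hypothetical byte ranges
    names = ["person_id", "state_fips", "income"]   # hypothetical variables

    persons = pd.read_fwf("asec2012_person.dat", colspecs=colspecs, names=names)
    persons["one"] = 1  # constant column, handy for weighted counts

    with sqlite3.connect("cps_asec.db") as con:
        persons.to_sql("asec12", con, if_exists="replace", index=False)
    ```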

  4. Health and Retirement Study (HRS)

    • search.dataone.org
    • dataverse.harvard.edu
    Updated Nov 21, 2023
    Cite
    Damico, Anthony (2023). Health and Retirement Study (HRS) [Dataset]. http://doi.org/10.7910/DVN/ELEKOY
    Dataset updated
    Nov 21, 2023
    Dataset provided by
    Harvard Dataverse
    Authors
    Damico, Anthony
    Description

    analyze the health and retirement study (hrs) with r. the hrs is the one and only longitudinal survey of american seniors. with a panel starting its third decade, the current pool of respondents includes older folks who have been interviewed every two years as far back as 1992. unlike cross-sectional or shorter panel surveys, respondents keep responding until, well, death do us part. paid for by the national institute on aging and administered by the university of michigan's institute for social research, if you apply for an interviewer job with them, i hope you like werther's original. figuring out how to analyze this data set might trigger your fight-or-flight synapses if you just start clicking around on michigan's website. instead, read pages numbered 10-17 (pdf pages 12-19) of this introduction pdf and don't touch the data until you understand figure a-3 on that last page. if you start enjoying yourself, here's the whole book. after that, it's time to register for access to the (free) data. keep your username and password handy, you'll need it for the top of the download automation r script. next, look at this data flowchart to get an idea of why the data download page is such a righteous jungle. but wait, good news: umich recently farmed out its data management to the rand corporation, who promptly constructed a giant consolidated file with one record per respondent across the whole panel. oh so beautiful. the rand hrs files make much of the older data and syntax examples obsolete, so when you come across stuff like instructions on how to merge years, you can happily ignore them - rand has done it for you. the health and retirement study only includes noninstitutionalized adults when new respondents get added to the panel (as they were in 1992, 1993, 1998, 2004, and 2010) but once they're in, they're in - respondents have a weight of zero for interview waves when they were nursing home residents; but they're still responding and will continue to contribute to your statistics so long as you're generalizing about a population from a previous wave (for example: it's possible to compute "among all americans who were 50+ years old in 1998, x% lived in nursing homes by 2010"). my source for that 411? page 13 of the design doc. wicked.

    this new github repository contains five scripts:

    1992 - 2010 download HRS microdata.R
    • loop through every year and every file, download, then unzip everything in one big party

    import longitudinal RAND contributed files.R
    • create a SQLite database (.db) on the local disk
    • load the rand, rand-cams, and both rand-family files into the database (.db) in chunks (to prevent overloading ram)

    longitudinal RAND - analysis examples.R
    • connect to the sql database created by the 'import longitudinal RAND contributed files' program
    • create two database-backed complex sample survey objects, using a taylor-series linearization design
    • perform a mountain of analysis examples with wave weights from two different points in the panel

    import example HRS file.R
    • load a fixed-width file using only the sas importation script directly into ram with SAScii (http://blog.revolutionanalytics.com/2012/07/importing-public-data-with-sas-instructions-into-r.html)
    • parse through the IF block at the bottom of the sas importation script, blank out a number of variables
    • save the file as an R data file (.rda) for fast loading later

    replicate 2002 regression.R
    • connect to the sql database created by the 'import longitudinal RAND contributed files' program
    • create a database-backed complex sample survey object, using a taylor-series linearization design
    • exactly match the final regression shown in this document provided by analysts at RAND as an update of the regression on pdf page B76 of this document

    click here to view these five scripts

    for more detail about the health and retirement study (hrs), visit:
    • michigan's hrs homepage
    • rand's hrs homepage
    • the hrs wikipedia page
    • a running list of publications using hrs

    notes: exemplary work making it this far. as a reward, here's the detailed codebook for the main rand hrs file. note that rand also creates 'flat files' for every survey wave, but really, most every analysis you can think of is possible using just the four files imported with the rand importation script above. if you must work with the non-rand files, there's an example of how to import a single hrs (umich-created) file, but if you wish to import more than one, you'll have to write some for loops yourself. confidential to sas, spss, stata, and sudaan users: a tidal wave is coming. you can get water up your nose and be dragged out to sea, or you can grab a surf board. time to transition to r. :D
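
    the chunked import described above - big rand files written to a .db file piece by piece so ram never overflows - looks roughly like this in python; the filename, table name, and chunk size are hypothetical stand-ins:

    ```python
    # sketch: load a large delimited rand-style extract into sqlite in
    # chunks, mirroring the 'import longitudinal RAND contributed files'
    # idea. "rand_hrs.csv" is a hypothetical placeholder.
    import sqlite3
    import pandas as pd

    with sqlite3.connect("hrs.db") as con:
        for chunk in pd.read_csv("rand_hrs.csv", chunksize=50_000):
            # append each slice so only one chunk is ever held in memory
            chunk.to_sql("rand_hrs", con, if_exists="append", index=False)
    ```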

  5. Model-derived synthetic aperture sonar (SAS) data in Generic Data Format...

    • marine-geo.org
    Updated Sep 24, 2024
    Cite
    (2024). Model-derived synthetic aperture sonar (SAS) data in Generic Data Format (GDF) [Dataset]. https://www.marine-geo.org/tools/files/31898
    Dataset updated
    Sep 24, 2024
    Description

    The simulated synthetic aperture sonar (SAS) data presented here was generated using PoSSM [Johnson and Brown 2018]. The data is suitable for bistatic, coherent signal processing and will form acoustic seafloor imagery. Included in this data package is simulated sonar data in Generic Data Format (GDF) files, a description of the GDF file contents, example SAS imagery, and supporting information about the simulated scenes. In total, there are eleven 60 m x 90 m scenes, labeled scene00 through scene10, with scene00 provided with the scatterers in isolation, i.e. no seafloor texture. This is provided for beamformer testing purposes and should result in an image similar to the one labeled "PoSSM-scene00-scene00-starboard-0.tif" in the Related Data Sets tab. The ten other scenes have varying degrees of model variation as described in "Description_of_Simulated_SAS_Data_Package.pdf". A description of the data and the model is found in the associated document called "Description_of_Simulated_SAS_Data_Package.pdf", and a description of the format in which the raw binary data is stored is found in the related document "PSU_GDF_Format_20240612.pdf". The format description also includes MATLAB code that will effectively parse the data to aid in signal processing and image reconstruction. It is left to the researcher to develop a beamforming algorithm suitable for coherent signal and image processing. Each 60 m x 90 m scene is represented by 4 raw (not beamformed) GDF files, labeled sceneXX-STARBOARD-000000 through 000003. It is possible to beamform smaller scenes from any one of these 4 files, i.e. the four files are combined sequentially to form a 60 m x 90 m image. Also included are comma-separated value spreadsheets describing the locations of scatterers and objects of interest within each scene. In addition to the binary GDF data, a beamformed GeoTIFF image and single-look complex (SLC, science file) data of each scene are provided. The SLC (science) data is stored in Hierarchical Data Format 5 (https://www.hdfgroup.org/), with filenames ending in ".hdf5" to indicate the HDF5 format. The data are stored as 32-bit real and 32-bit complex values. A viewer is available that provides basic graphing, image display, and directory navigation functions (https://www.hdfgroup.org/downloads/hdfview/). The HDF file contains all the information necessary to reconstruct a synthetic aperture sonar image. All major and contemporary programming languages have library support for encoding/decoding the HDF5 format. Supporting documentation that outlines positions of the seafloor scatterers is included in "Scatterer_Locations_Scene00.csv", while the locations of the objects of interest for scene01-scene10 are included in "Object_Locations_All_Scenes.csv". Portable Network Graphics (PNG) images that plot the locations of all the objects of interest in each scene, in Along-Track and Cross-Track notation, are provided.
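
    Since the SLC "science" files are standard HDF5, any language with HDF5 bindings can open them. A minimal Python sketch follows; the filename is a hypothetical placeholder, and because the internal dataset layout is documented in the data package rather than here, the code discovers dataset paths instead of assuming them:

    ```python
    # Sketch: inspect an SLC science file and load one dataset. The
    # filename is hypothetical; dataset paths are discovered at runtime.
    import h5py
    import numpy as np

    datasets = []
    with h5py.File("scene01.hdf5", "r") as f:
        # collect the path of every dataset in the file
        f.visititems(lambda name, obj: datasets.append(name)
                     if isinstance(obj, h5py.Dataset) else None)
        for name in datasets:
            print(name, f[name].shape, f[name].dtype)
        slc = f[datasets[0]][()]   # materialize the first dataset

    # convert complex samples to a dB-magnitude image for display
    magnitude_db = 20 * np.log10(np.abs(slc) + 1e-12)
    ```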

  6. SAS-3 Y-Axis Pointed Obs Log

    • s.cnmilf.com
    • catalog.data.gov
    Updated Sep 19, 2025
    Cite
    High Energy Astrophysics Science Archive Research Center (2025). SAS-3 Y-Axis Pointed Obs Log [Dataset]. https://s.cnmilf.com/user74170196/https/catalog.data.gov/dataset/sas-3-y-axis-pointed-obs-log
    Dataset updated
    Sep 19, 2025
    Dataset provided by
    High Energy Astrophysics Science Archive Research Center
    Description

    This database is the Third Small Astronomy Satellite (SAS-3) Y-Axis Pointed Observation Log. It identifies possible pointed observations of celestial X-ray sources which were performed with the y-axis detectors of the SAS-3 X-Ray Observatory. This log was compiled (by R. Kelley, P. Goetz and L. Petro) from notes made at the time of the observations, and it is expected to be neither complete nor fully accurate. Possible errors in the log are (i) the misclassification of an observation as a pointed observation when it was either a spinning or dither observation, and (ii) inaccuracy of the dates and times of the start and end of an observation. In addition, as described in the HEASARC_Updates section, the HEASARC added some additional information when creating this database. Further information about the SAS-3 detectors and their fields of view can be found at: http://heasarc.gsfc.nasa.gov/docs/sas3/sas3_about.html Disclaimer: The HEASARC is aware of certain inconsistencies between the Start_date, End_date, and Duration fields for a number of rows in this database table. They appear to be errors present in the original table. Except for one entry where the HEASARC corrected an error with near-certainty about which parameter was incorrect (as noted in the 'HEASARC_Updates' section of this documentation), these inconsistencies have been left as they were in the original table. This database table was released by the HEASARC in June 2000, based on the SAS-3 Y-Axis Pointed Observation Log (available from the NSSDC as dataset ID 75-037A-02B), together with some additional information provided by the HEASARC itself. This is a service provided by NASA HEASARC.
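
    Given the disclaimer about inconsistent Start_date, End_date, and Duration fields, a quick sanity check is worth running before analysis. A hedged pandas sketch; the filename, column names, and the assumption that durations are in seconds are all placeholders based on the field names quoted above:

    ```python
    # Sketch: flag rows where the recorded duration disagrees with the
    # start-to-end span. Filename, column names, and units (seconds)
    # are assumptions.
    import pandas as pd

    log = pd.read_csv("sas3_ylog.csv", parse_dates=["start_date", "end_date"])
    span = (log["end_date"] - log["start_date"]).dt.total_seconds()
    inconsistent = log[(span - log["duration"]).abs() > 1.0]
    print(f"{len(inconsistent)} rows with mismatched duration")
    ```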

  7. Object locations (PNG image format) used for synthetic aperture sonar (SAS)...

    • marine-geo.org
    Updated Sep 24, 2024
    Cite
    (2024). Object locations (PNG image format) used for synthetic aperture sonar (SAS) data [Dataset]. https://www.marine-geo.org/tools/datasets/31901
    Dataset updated
    Sep 24, 2024
    Description

    The simulated synthetic aperture sonar (SAS) data presented here was generated using PoSSM [Johnson and Brown 2018]. The data is suitable for bistatic, coherent signal processing and will form acoustic seafloor imagery. Included in this data package is simulated sonar data in Generic Data Format (GDF) files, a description of the GDF file contents, example SAS imagery, and supporting information about the simulated scenes. In total, there are eleven 60 m x 90 m scenes, labeled scene00 through scene10, with scene00 provided with the scatterers in isolation, i.e. no seafloor texture. This is provided for beamformer testing purposes and should result in an image similar to the one labeled "PoSSM-scene00-scene00-starboard-0.tif" in the Related Data Sets tab. The ten other scenes have varying degrees of model variation as described in "Description_of_Simulated_SAS_Data_Package.pdf". A description of the data and the model is found in the associated document called "Description_of_Simulated_SAS_Data_Package.pdf", and a description of the format in which the raw binary data is stored is found in the related document "PSU_GDF_Format_20240612.pdf". The format description also includes MATLAB code that will effectively parse the data to aid in signal processing and image reconstruction. It is left to the researcher to develop a beamforming algorithm suitable for coherent signal and image processing. Each 60 m x 90 m scene is represented by 4 raw (not beamformed) GDF files, labeled sceneXX-STARBOARD-000000 through 000003. It is possible to beamform smaller scenes from any one of these 4 files, i.e. the four files are combined sequentially to form a 60 m x 90 m image. Also included are comma-separated value spreadsheets describing the locations of scatterers and objects of interest within each scene. In addition to the binary GDF data, a beamformed GeoTIFF image and single-look complex (SLC, science file) data of each scene are provided. The SLC (science) data is stored in Hierarchical Data Format 5 (https://www.hdfgroup.org/), with filenames ending in ".hdf5" to indicate the HDF5 format. The data are stored as 32-bit real and 32-bit complex values. A viewer is available that provides basic graphing, image display, and directory navigation functions (https://www.hdfgroup.org/downloads/hdfview/). The HDF file contains all the information necessary to reconstruct a synthetic aperture sonar image. All major and contemporary programming languages have library support for encoding/decoding the HDF5 format. Supporting documentation that outlines positions of the seafloor scatterers is included in "Scatterer_Locations_Scene00.csv", while the locations of the objects of interest for scene01-scene10 are included in "Object_Locations_All_Scenes.csv". Portable Network Graphics (PNG) images that plot the locations of all the objects of interest in each scene, in Along-Track and Cross-Track notation, are provided.

  8. Supplement 1. Sample data, metadata, and SAS code.

    • wiley.figshare.com
    html
    Updated May 31, 2023
    Cite
    Everett Weber (2023). Supplement 1. Sample data, metadata, and SAS code. [Dataset]. http://doi.org/10.6084/m9.figshare.3521543.v1
    Available download formats: html
    Dataset updated
    May 31, 2023
    Dataset provided by
    Wiley
    Authors
    Everett Weber
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    File List
    • ECO101_sample_data.xls
    • ECO101_sample_data.txt
    • SAS_Code.rtf

    Please note that ESA cannot guarantee the availability of Excel files in perpetuity, as Excel is proprietary software. Thus, the data file here is also supplied as a tab-delimited ASCII file, and the other Excel workbook sheets are provided below in the description section. Description: TABLE (please see the attached file).

  9. Replication Data for: WHICH PANEL DATA ESTIMATOR SHOULD I USE?: A...

    • search.dataone.org
    • dataverse.harvard.edu
    Updated Nov 22, 2023
    Cite
    Moundigbaye, Mantobaye; William S. Rea; W. Robert Reed (2023). Replication Data for: WHICH PANEL DATA ESTIMATOR SHOULD I USE?: A CORRIGENDUM AND EXTENSION [Dataset]. http://doi.org/10.7910/DVN/YKSATT
    Dataset updated
    Nov 22, 2023
    Dataset provided by
    Harvard Dataverse
    Authors
    Moundigbaye, Mantobaye; William S. Rea; W. Robert Reed
    Description

    This dataset contains all the materials needed to reproduce the results in "Which Panel Data Estimator Should I Use?: A Corrigendum and Extension". Please read the README document first. The results were obtained using SAS/IML software, and the files consist of SAS data sets and SAS programs.

  10. Patient-reported outcomes via electronic health record portal vs. telephone:...

    • data.niaid.nih.gov
    • search.dataone.org
    • +2more
    zip
    Updated Oct 23, 2022
    Cite
    Heidi Munger Clary; Beverly Snively (2022). Patient-reported outcomes via electronic health record portal vs. telephone: process and retention data in a pilot trial of anxiety or depression symptoms in epilepsy [Dataset]. http://doi.org/10.5061/dryad.qz612jmk3
    Available download formats: zip
    Dataset updated
    Oct 23, 2022
    Dataset provided by
    Atrium Health (http://www.atriumhealth.org/)
    Authors
    Heidi Munger Clary; Beverly Snively
    License

    https://spdx.org/licenses/CC0-1.0.html

    Description

    Objective: To close gaps between research and clinical practice, tools are needed for efficient pragmatic trial recruitment and patient-reported outcome (PROM) collection. The objective was to assess feasibility and process measures for patient-reported outcome collection in a randomized trial comparing electronic health record (EHR) patient portal questionnaires to telephone interview among adults with epilepsy and anxiety or depression symptoms. Results: Participants were 60% women, 77% White/non-Hispanic, with mean age 42.5 years. Among 15 individuals randomized to EHR portal, 10 (67%, CI 41.7-84.8%) met the 6-month retention endpoint, versus 100% (CI 79.6-100%) in the telephone group (p=0.04). EHR outcome collection at 6 months required 11.8 minutes less research staff time per participant than telephone (5.9, CI 3.3-7.7 vs. 17.7, CI 14.1-20.2). Subsequent telephone contact after unsuccessful EHR attempts enabled near-complete data collection and still saved staff time. Discussion: Data from this randomized pilot study of pragmatic outcome collection methods for patients with anxiety or depression symptoms in epilepsy include baseline participant characteristics, recruitment flow resulting from a novel EHR-based, care-embedded recruitment process, and data on retention along with various process measures at 6 months. Methods: The dataset was collected via a combination of the following: 1. manual extraction of EHR-based data followed by entry into REDCap and then analysis and further processing in SAS 9.4; 2. data pull of Epic EHR-based data from the Clarity database using standard programming techniques, followed by processing in SAS 9.4 and merging with data from REDCap; 3. collection of data directly from participants via telephone with entry into REDCap and further processing in SAS 9.4; 4. collection of process measures from study team tracking records followed by entry into REDCap and further processing in SAS 9.4. One file in the dataset contains aggregate data generated following merging of the Clarity data pull-origin dataset with a REDCap dataset and further manual processing. Recruitment for the randomized trial began at an epilepsy clinic visit, with EHR-embedded validated anxiety and depression instruments, followed by automated EHR-based research screening consent and eligibility assessment. Fully eligible individuals later completed telephone consent, enrollment, and randomization. Thirty total participants were randomized 1:1 to EHR portal versus telephone outcome assessment, and patient-reported and process outcomes were collected at 3 and 6 months, with the primary outcome being 6-month retention in the EHR arm (feasibility target: ≥11 participants retained). Variables in this dataset include recruitment flow diagram data, baseline participant sociodemographic and clinical characteristics, retention (successful PROM collection at 6 months), and process measures. The process measures included research staff time to collect outcomes, research staff time to collect outcomes and enter data, time from initial outcome collection reminder to outcome collection, and number of reminders sent to participants for outcome collection. PROMs were collected via the randomized method only at 3 months. At 6 months, if the criterion for retention was not met by the randomized method (failure to return outcomes by 1 week after 5 post-due-date reminders for outcome collection), up to 3 additional attempts were made to collect outcomes by the alternative method, and process measures were also collected during this hybrid outcome collection approach.

  11. SAS: Semantic Artist Similarity Dataset

    • live.european-language-grid.eu
    • zenodo.org
    txt
    Updated Oct 28, 2023
    Cite
    (2023). SAS: Semantic Artist Similarity Dataset [Dataset]. https://live.european-language-grid.eu/catalogue/corpus/7418
    Available download formats: txt
    Dataset updated
    Oct 28, 2023
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The Semantic Artist Similarity dataset consists of two datasets of artist entities with their corresponding biography texts, and the list of top-10 most similar artists within the datasets used as ground truth. The dataset is composed of a corpus of 268 artists and a slightly larger one of 2,336 artists, both gathered from Last.fm in March 2015. The former is mapped to the MIREX Audio and Music Similarity evaluation dataset, so that its similarity judgments can be used as ground truth. For the latter corpus we use the similarity between artists as provided by the Last.fm API. For every artist there is a list with the top-10 most related artists. In the MIREX dataset there are 188 artists with at least 10 similar artists; the other 80 artists have fewer than 10 similar artists. In the Last.fm API dataset all artists have a list of 10 similar artists.

    There are 4 files in the dataset. mirex_gold_top10.txt and lastfmapi_gold_top10.txt have the top-10 lists of artists for every artist of both datasets. Artists are identified by MusicBrainz ID. The format of the file is one line per artist, with the artist mbid separated by a tab from the list of top-10 related artists identified by their mbids separated by spaces:

    artist_mbid \t artist_mbid_top10_list_separated_by_spaces

    mb2uri_mirex and mb2uri_lastfmapi.txt have the list of artists. In each line there are three fields separated by tabs: the MusicBrainz ID, the last.fm name of the artist, and the DBpedia uri:

    artist_mbid \t lastfm_name \t dbpedia_uri

    There are also 2 folders in the dataset with the biography texts of each dataset. Each .txt file in the biography folders is named with the MusicBrainz ID of the biographied artist. Biographies were gathered from the Last.fm wiki page of every artist.

    Using this dataset: We would highly appreciate if scientific publications of works partly based on the Semantic Artist Similarity dataset quote the following publication: Oramas, S., Sordo, M., Espinosa-Anke, L., & Serra, X. (In Press). A Semantic-based Approach for Artist Similarity. 16th International Society for Music Information Retrieval Conference. We are interested in knowing if you find our datasets useful! If you use our dataset please email us at mtg-info@upf.edu and tell us about your research. https://www.upf.edu/web/mtg/semantic-similarity
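
    The line format above is simple enough to parse directly; a minimal Python sketch that builds a similarity lookup from one of the gold files:

    ```python
    # Sketch: parse a gold file (one line per artist: mbid, a tab, then
    # the ten related mbids separated by spaces) into a dict.
    def load_top10(path):
        top10 = {}
        with open(path, encoding="utf-8") as f:
            for line in f:
                artist, _, related = line.rstrip("\n").partition("\t")
                if artist:
                    top10[artist] = related.split()
        return top10

    gold = load_top10("mirex_gold_top10.txt")
    ```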

  12. FHFA Data: Uniform Appraisal Dataset Aggregate Statistics

    • datalumos.org
    • openicpsr.org
    Updated Feb 18, 2025
    Cite
    Federal Housing Finance Agency (2025). FHFA Data: Uniform Appraisal Dataset Aggregate Statistics [Dataset]. http://doi.org/10.3886/E219961V1
    Dataset updated
    Feb 18, 2025
    Dataset authored and provided by
    Federal Housing Finance Agency (https://www.fhfa.gov/)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    2013 - 2024
    Area covered
    United States of America
    Description

    The Uniform Appraisal Dataset (UAD) Aggregate Statistics Data File and Dashboards are the nation’s first publicly available datasets of aggregate statistics on appraisal records, giving the public new access to a broad set of data points and trends found in appraisal reports. The UAD Aggregate Statistics for Enterprise Single-Family, Enterprise Condominium, and Federal Housing Administration (FHA) Single-Family appraisals may be grouped by neighborhood characteristics, property characteristics, and different geographic levels.

    Documentation
    • Overview (10/28/2024)
    • Data Dictionary (10/28/2024)
    • Data File Version History and Suppression Rates (12/18/2024)
    • Dashboard Guide (2/3/2025)

    UAD Aggregate Statistics Dashboards
    The UAD Aggregate Statistics Dashboards are the visual front end of the UAD Aggregate Statistics Data File. The Dashboards are designed to provide easy access to customized maps and charts for all levels of users. Access the UAD Aggregate Statistics Dashboards here.

    UAD Aggregate Statistics Datasets
    Notes:
    • Some of the data files are relatively large in size and will not open correctly in certain software packages, such as Microsoft Excel. All the files can be opened and used in data analytics software such as SAS, Python, or R.
    • All CSV files are zipped.
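
    Because the files are zipped CSVs, pandas can read them without manual extraction, as in this short sketch (the filename is a hypothetical placeholder for one of the UAD downloads):

    ```python
    # Sketch: read one zipped UAD Aggregate Statistics CSV file straight
    # from the archive. The filename is a hypothetical placeholder.
    import pandas as pd

    uad = pd.read_csv("uad_aggregate_statistics.zip", compression="zip")
    print(uad.shape)
    print(uad.head())
    ```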

  13. Time Diary Study (CAPS-DIARY module)

    • dataverse-staging.rdmc.unc.edu
    • datasearch.gesis.org
    Updated May 18, 2009
    Cite
    UNC Dataverse (2009). Time Diary Study (CAPS-DIARY module) [Dataset]. https://dataverse-staging.rdmc.unc.edu/dataset.xhtml?persistentId=hdl:1902.29/CAPS-DIARY
    Available download formats: txt, tsv, application/x-sas-transport, application/x-spss-por, text/x-sas-syntax (multiple files of each type)
    Dataset updated
    May 18, 2009
    Dataset provided by
    UNC Dataverse
    License

    https://dataverse-staging.rdmc.unc.edu/api/datasets/:persistentId/versions/1.0/customlicense?persistentId=hdl:1902.29/CAPS-DIARY

    Description

    The purpose of this project is to determine how college students distribute their activities in time (with a particular focus on academic and athletic activities) and to examine the factors that influence such distributions. Each R reported once about each of the seven days of the week and an additional time about either Saturday or Sunday. Rs were told the week before they were to report which day was assigned and were given a report form to complete during that day. They entered the information from that form when they returned the next week.

    The activity codes included were: 0: Sleeping. 1: Attending classes. 2: Studying or preparing classroom assignments. 3: Working at a job (including CAPS). 4: Cooking, home chores, laundry, grocery shopping. 5: Errands, non-grocery shopping, gardening, animal care. 6: Eating. 7: Bathing, getting dressed, etc. 8: Sports, exercising, other physical activities. 9: Playing competitive games (cards, darts, videogames, frisbee, chess, Trivial Pursuit, etc.). 10: Participating in UNC-sponsored organizations (student government, band, sorority, etc.). 11: Listening to the radio. 12: Watching TV. 13: Reading for pleasure (not studying or reading for class). 14: Going to a movie. 15: Attending a cultural event (such as a play, concert, or museum). 16: Attending a sports event as a spectator. 17: Partying. 18: Religious activities. 19: Conversation. 20: Travel. 21: Resting. 22: Doing other things.

    DIARY1-8: These datasets contain a matrix of activities by times for a particular day. Included are time period, activity code (see above), # of friends present, and # of others present. (Rs were allowed to report doing two activities at once. In these cases they were also asked to report the % of time during the time period affected which was allocated to the first of the two activities listed.) THE DIARY DATASETS ARE STORED IN RAW FORM. SUMMARY FILES, CALLED TIMEREP, CONTAIN MOST SUMMARY INFORMATION WHICH MIGHT BE USED IN ANALYSES. THE DIARY DATASETS CAN BE LISTED TO ALLOW UNIQUE CODING OF THE ORIGINAL DATA.

    TIMEREP: The TIMEREP dataset is a summary file which gives the amount of time spent on each activity during each of the eight reporting periods and also includes more detailed information about many of the activities from follow-up questions which were asked if the respondent reported having engaged in certain activities. Data from additional questions asked of every respondent after each diary entry are also included: contact with family members, number of alcoholic drinks consumed during the 24-hour period reported on, number of friends and others present while drinking, number of cigarettes smoked on the day reported about, and number of classes skipped on the day reported about. Follow-up questions include detail about kind of physical activity or sports participation, kind of university organization, kind of radio program listened to and place of listening, kind of TV program watched and place of watching, kind of reading material read and topic, alcohol consumed while partying and place of partying, conversation topics, kind of travel, and activities included in the 'other' category. Special processing is required to put the dataset into SAS format. See spec for details.
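
    For anyone working with the raw DIARY files outside SAS, a lookup table for the activity codes listed above keeps summaries readable. A hedged Python sketch; only the code-to-label mapping comes from the description, and the labels are abbreviated here:

    ```python
    # Sketch: label diary activity codes using the documented code list
    # (labels abbreviated from the description above).
    ACTIVITY_CODES = {
        0: "Sleeping", 1: "Attending classes", 2: "Studying",
        3: "Working at a job", 4: "Cooking/home chores", 5: "Errands",
        6: "Eating", 7: "Bathing/dressing", 8: "Sports/exercise",
        9: "Competitive games", 10: "UNC organizations", 11: "Radio",
        12: "Watching TV", 13: "Reading for pleasure", 14: "Movie",
        15: "Cultural event", 16: "Sports spectator", 17: "Partying",
        18: "Religious activities", 19: "Conversation", 20: "Travel",
        21: "Resting", 22: "Other",
    }

    def label(code: int) -> str:
        return ACTIVITY_CODES.get(code, "Unknown")

    print(label(9))  # -> "Competitive games"
    ```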

  14. MCSP Monarch and Plant Monitoring - SAS Output Summarizing 2018 Immature...

    • catalog.data.gov
    • gimi9.com
    Updated Oct 5, 2025
    Cite
    U.S. Fish and Wildlife Service (2025). MCSP Monarch and Plant Monitoring - SAS Output Summarizing 2018 Immature Monarch Butterfly and Plant Abundance from SOP 3 Data [Dataset]. https://catalog.data.gov/dataset/mcsp-monarch-and-plant-monitoring-sas-output-summarizing-2018-immature-monarch-butterfly-a
    Dataset updated
    Oct 5, 2025
    Dataset provided by
    U.S. Fish and Wildlife Service
    Description

    Output from programming code written to summarize immature monarch butterfly, milkweed and nectar plant abundance from monitoring data acquired using a grid of 1 square-meter quadrats at custom 2017 GRTS draw sites within select monitoring areas (see SOP 3 in ServCat reference 103368 for methods) of FWS Legacy Regions 2 and 3. Areas monitored included Balcones Canyonlands (TX), Hagerman (TX), Washita (OK), Neal Smith (IA) NWRs and several locations near the town of Lamoni, Iowa and northern Missouri. Input data file is named 'FWS_2018_MonMonSOP3DS1_forSAS.csv' and is stored in ServCat reference 137698. See SM 5 (ServCat reference 103388) for dictionary of data fields in the input data file.

  15. Supplement 1. MATLAB and SAS code necessary to replicate the simulation...

    • wiley.figshare.com
    • datasetcatalog.nlm.nih.gov
    html
    Updated Jun 4, 2023
    Cite
    Jeffrey A. Evans; Adam S. Davis; S. Raghu; Ashok Ragavendran; Douglas A. Landis; Douglas W. Schemske (2023). Supplement 1. MATLAB and SAS code necessary to replicate the simulation models and other demographic analyses presented in the paper. [Dataset]. http://doi.org/10.6084/m9.figshare.3517478.v1
    Available download formats: html
    Dataset updated
    Jun 4, 2023
    Dataset provided by
    Wiley
    Authors
    Jeffrey A. Evans; Adam S. Davis; S. Raghu; Ashok Ragavendran; Douglas A. Landis; Douglas W. Schemske
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    File List
    • Code_and_Data_Supplement.zip (md5: dea8636b921f39c9d3fd269e44b6228c)

    Description
    The supplementary material provided includes all code and data files necessary to replicate the simulation models and other demographic analyses presented in the paper. MATLAB code is provided for the simulations, and SAS code is provided to show how model parameters (vital rates) were estimated.

      The principal programs are Figure_3_4_5_Elasticity_Contours.m and Figure_6_Contours_Stochastic_Lambda.m which perform the elasticity analyses and run the stochastic simulation, respectively.
    
    
      The files are presented in a zipped folder called Code_and_Data_Supplement. When uncompressed, users may run the MATLAB programs by opening them from within this directory. Subdirectories contain the data files and supporting MATLAB functions necessary to complete execution. The programs are written to find the necessary supporting functions in the Code_and_Data_Supplement directory. If users copy these MATLAB files to a different directory, they must add the Code_and_Data_Supplement directory and its subdirectories to their search path to make the supporting files available.
    
    
      More details are provided in the README.txt file included in the supplement.
    
    
      The file and directory structure of the entire zipped supplement is shown below.
    
      Folder PATH listing
    Code_and_Data_Supplement
    |  Figure_3_4_5_Elasticity_Contours.m
    |  Figure_6_Contours_Stochastic_Lambda.m
    |  Figure_A1_RefitG2.m
    |  Figure_A2_PlotFecundityRegression.m
    |  README.txt
    |  
    +---FinalDataFiles
    +---Make Tables
    |    README.txt
    |    Table_lamANNUAL.csv
    |    Table_mgtProbPredicted.csv
    |    
    +---ParameterEstimation
    |  |  Categorical Model output.xls
    |  |  
    |  +---Fecundity
    |  |    Appendix_A3_Fecundity_Breakpoint.sas
    |  |    fec_Cat_Indiv.sas
    |  |    Mean_Fec_Previous_Study.m
    |  |    
    |  +---G1
    |  |    G1_Cat.sas
    |  |    
    |  +---G2
    |  |    G2_Cat.sas
    |  |    
    |  +---Model Ranking
    |  |    Categorical Model Ranking.xls
    |  |    
    |  +---Seedlings
    |  |    sdl_Cat.sas
    |  |    
    |  +---SS
    |  |    SS_Cat.sas
    |  |    
    |  +---SumSrv
    |  |    sum_Cat.sas
    |  |    
    |  \---WinSrv
    |      modavg.m
    |      winCatModAvgfitted.m
    |      winCatModAvgLinP.m
    |      winCatModAvgMu.m
    |      win_Cat.sas
    |      
    +---ProcessedDatafiles
    |    fecdat_gm_param_est_paper.mat
    |    hierarchical_parameters.mat
    |    refitG2_param_estimation.mat
    |    
    \---Required_Functions
      |  hline.m
      |  hmstoc.m
      |  Jeffs_Figure_Settings.m
      |  Jeffs_startup.m
      |  newbootci.m
      |  sem.m
      |  senstuff.m
      |  vline.m
      |  
      +---export_fig
      |    change_value.m
      |    eps2pdf.m
      |    export_fig.m
      |    fix_lines.m
      |    ghostscript.m
      |    license.txt
      |    pdf2eps.m
      |    pdftops.m
      |    print2array.m
      |    print2eps.m
      |    
      +---lowess
      |    license.txt
      |    lowess.m
      |    
      +---Multiprod_2009
      |  |  Appendix A - Algorithm.pdf
      |  |  Appendix B - Testing speed and memory usage.pdf
      |  |  Appendix C - Syntaxes.pdf
      |  |  license.txt
      |  |  loc2loc.m
      |  |  MULTIPROD Toolbox Manual.pdf
      |  |  multiprod.m
      |  |  multitransp.m
      |  |  
      |  \---Testing
      |    |  arraylab13.m
      |    |  arraylab131.m
      |    |  arraylab132.m
      |    |  arraylab133.m
      |    |  genop.m
      |    |  multiprod13.m
      |    |  readme.txt
      |    |  sysrequirements_for_testing.m
      |    |  testing_memory_usage.m
      |    |  testMULTIPROD.m
      |    |  timing_arraylab_engines.m
      |    |  timing_matlab_commands.m
      |    |  timing_MX.m
      |    |  
      |    \---Data
      |        Memory used by MATLAB statements.xls
      |        Timing results.xlsx
      |        timing_MX.txt
      |        
      +---province
      |    PROVINCE.DBF
      |    province.prj
      |    PROVINCE.SHP
      |    PROVINCE.SHX
      |    README.txt
      |    
      +---SubAxis
      |    parseArgs.m
      |    subaxis.m
      |    
      +---suplabel
      |    license.txt
      |    suplabel.m
      |    suplabel_test.m
      |    
      \---tight_subplot
          license.txt
          tight_subplot.m
    
  16. Key files for Spoofing and Anti-Spoofing (SAS) corpus v1.0

    • dtechtive.com
    • find.data.gov.scot
    txt, zip
    Updated Jun 22, 2017
    Cite
    University of Edinburgh. The Centre for Speech Technology Research (CSTR) (2017). Key files for Spoofing and Anti-Spoofing (SAS) corpus v1.0 [Dataset]. http://doi.org/10.7488/ds/2072
    Available download formats: txt(0.0166 MB), zip(110.2 MB), txt(0.0019 MB)
    Dataset updated
    Jun 22, 2017
    Dataset provided by
    University of Edinburgh. The Centre for Speech Technology Research (CSTR)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    These files are complementary to the fileset: Wu et al. (2015). Spoofing and Anti-Spoofing (SAS) corpus v1.0, [dataset]. University of Edinburgh. The Centre for Speech Technology Research (CSTR). https://doi.org/10.7488/ds/252. These two filesets should be considered two complementary parts of a single dataset.

  17. Data from: Backwater Sedimentation in Navigation Pools 4 and 8 of the Upper...

    • catalog.data.gov
    • data.usgs.gov
    • +1more
    Updated Oct 8, 2025
    Cite
    U.S. Geological Survey (2025). Backwater Sedimentation in Navigation Pools 4 and 8 of the Upper Mississippi River [Dataset]. https://catalog.data.gov/dataset/backwater-sedimentation-in-navigation-pools-4-and-8-of-the-upper-mississippi-river
    Dataset updated
    Oct 8, 2025
    Dataset provided by
    U.S. Geological Survey
    Area covered
    Upper Mississippi River, Mississippi River
    Description

    Transects in backwaters of Navigation Pools 4 and 8 of the Upper Mississippi River (UMR) were established in 1997 to measure sedimentation rates. Annual surveys were conducted from 1997-2002, and some transects were surveyed again in 2017-18. Changes and patterns observed for the 1997-2002 data were reported on in 2003, and a report summarizing changes and patterns from 1997-2017 is being prepared. Several variables are recorded each survey year and placed into an Excel spreadsheet. The spreadsheets are read with a SAS program to generate a SAS dataset used in SAS programs to determine rates, depth loss, and associations between depth and change through regression.
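
    The final step (regressing depth change on depth) is easy to mirror outside SAS. A hedged Python sketch; the filename and column names are hypothetical stand-ins for the survey spreadsheet fields:

    ```python
    # Sketch: fit sedimentation change vs. depth, echoing the SAS
    # regression step described above. Filename and column names are
    # hypothetical.
    import numpy as np
    import pandas as pd

    df = pd.read_csv("transects.csv").dropna(subset=["depth_m", "change_m"])
    slope, intercept = np.polyfit(df["depth_m"], df["change_m"], deg=1)
    print(f"change = {slope:.3f} * depth + {intercept:.3f}")
    ```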

  18. GAPs Data Repository on Return: Guideline, Data Samples and Codebook

    • zenodo.org
    • data.niaid.nih.gov
    • +1more
    Updated Feb 13, 2025
    Cite
    Zeynep Sahin Mencutek; Zeynep Sahin Mencutek; Fatma Yılmaz-Elmas; Fatma Yılmaz-Elmas (2025). GAPs Data Repository on Return: Guideline, Data Samples and Codebook [Dataset]. http://doi.org/10.5281/zenodo.14862490
    Dataset updated
    Feb 13, 2025
    Dataset provided by
    RedCAP
    Authors
    Zeynep Sahin Mencutek; Zeynep Sahin Mencutek; Fatma Yılmaz-Elmas; Fatma Yılmaz-Elmas
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The GAPs Data Repository provides a comprehensive overview of available qualitative and quantitative data on national return regimes, now accessible through an advanced web interface at https://data.returnmigration.eu/.

    This updated guideline outlines the complete process, starting from the initial data collection for the return migration data repository to the development of a comprehensive web-based platform. Through iterative development, participatory approaches, and rigorous quality checks, we have ensured a systematic representation of return migration data at both national and comparative levels.

    The Repository organizes data into five main categories, covering diverse aspects and offering a holistic view of return regimes: country profiles, legislation, infrastructure, international cooperation, and descriptive statistics. These categories, further divided into subcategories, are based on insights from a literature review, existing datasets, and empirical data collection from 14 countries. The selection of categories prioritizes relevance for understanding return and readmission policies and practices, data accessibility, reliability, clarity, and comparability. Raw data is meticulously collected by the national experts.

    The transition to a web-based interface builds upon the Repository’s original structure, which was initially developed using REDCap (Research Electronic Data Capture), a secure web application for building and managing online surveys and databases. REDCap ensures systematic data entry and stores the data on Uppsala University’s servers, while significantly improving accessibility and usability as well as data security. It also enables users to export any or all data from the Project when granted full data export privileges. Data can be exported in various ways and formats, including Microsoft Excel, SAS, Stata, R, or SPSS for analysis. At this stage, the Data Repository design team also converted tailored records of available data into public reports accessible to anyone with a unique URL, without the need to log in to REDCap or obtain permission to access the GAPs Project Data Repository. Public reports can be used to share information with stakeholders or external partners without granting them access to the Project or requiring them to set up a personal account. Currently, all public report links inserted in this report are also available on the Repository’s webpage, allowing users to export original data.

    This report also includes a detailed codebook to help users understand the structure, variables, and methodologies used in data collection and organization. This addition ensures transparency and provides a comprehensive framework for researchers and practitioners to effectively interpret the data.

    The GAPs Data Repository is committed to providing accessible, well-organized, and reliable data by moving to a centralized web platform and incorporating advanced visuals. This Repository aims to contribute inputs for research, policy analysis, and evidence-based decision-making in the return and readmission field.

    Explore the GAPs Data Repository at https://data.returnmigration.eu/.

  19. Spoofing and Anti-Spoofing (SAS) corpus v1.0

    • dtechtive.com
    • find.data.gov.scot
    • +1more
    gz, pdf, txt
    Updated May 27, 2015
    Cite
    University of Edinburgh. The Centre for Speech Technology Research (CSTR) (2015). Spoofing and Anti-Spoofing (SAS) corpus v1.0 [Dataset]. http://doi.org/10.7488/ds/252
    Available download formats: txt(0.001 MB), gz(7773.184 MB), txt(0.0166 MB), gz(3306.496 MB), gz(10065.92 MB), gz(7763.968 MB), gz(10280.96 MB), gz(7478.272 MB), gz(6644.736 MB), gz(7974.912 MB), gz(6674.432 MB), gz(9846.784 MB), pdf(0.1048 MB), gz(9935.872 MB), gz(10393.6 MB), gz(7985.152 MB), gz(10240 MB)
    Dataset updated
    May 27, 2015
    Dataset provided by
    University of Edinburgh. The Centre for Speech Technology Research (CSTR)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset is associated with the paper 'SAS: A speaker verification spoofing database containing diverse attacks', which presents the first version of a speaker verification spoofing and anti-spoofing database, named the SAS corpus. The corpus includes nine spoofing techniques, two of which are speech synthesis and seven of which are voice conversion. We design two protocols, one for standard speaker verification evaluation, and the other for producing spoofing materials. Hence, they allow the speech synthesis community to produce spoofing materials incrementally without knowledge of speaker verification spoofing and anti-spoofing. To provide a set of preliminary results, we conducted speaker verification experiments using two state-of-the-art systems. Without any anti-spoofing techniques, the two systems are extremely vulnerable to the spoofing attacks implemented in our SAS corpus. N.B. the files in the following fileset should also be taken as part of the same dataset as those provided here: Wu et al. (2017). Key files for Spoofing and Anti-Spoofing (SAS) corpus v1.0, [dataset]. University of Edinburgh. The Centre for Speech Technology Research (CSTR). http://hdl.handle.net/10283/2741

  20. Vision

    • data.wu.ac.at
    csv, json, xls
    Updated Jul 27, 2017
    Cite
    U.S. Department of Transportation (2017). Vision [Dataset]. https://data.wu.ac.at/schema/public_opendatasoft_com/dmlzaW9u
    Available download formats: csv, xls, json
    Dataset updated
    Jul 27, 2017
    Dataset provided by
    U.S. Department of Transportation
    Description

    The Fatality Analysis Reporting System (FARS) dataset is as of July 1, 2017, and is part of the U.S. Department of Transportation (USDOT)/Bureau of Transportation Statistics' (BTS's) National Transportation Atlas Database (NTAD). One of the primary objectives of the National Highway Traffic Safety Administration (NHTSA) is to reduce the staggering human toll and property damage that motor vehicle traffic crashes impose on our society. FARS is a census of fatal motor vehicle crashes with a set of data files documenting all qualifying fatalities that occurred within the 50 States, the District of Columbia, and Puerto Rico since 1975. To qualify as a FARS case, the crash had to involve a motor vehicle traveling on a trafficway customarily open to the public, and must have resulted in the death of a motorist or a non-motorist within 30 days of the crash. This data file contains information about crash characteristics and environmental conditions at the time of the crash. There is one record per crash. Please note: 207 records in this database were geocoded to a latitude and longitude of 0,0 due to lack of location information or errors in the reported locations. FARS data are made available to the public in Statistical Analysis System (SAS) data files as well as Database Files (DBF). Over the years, changes have been made to the type of data collected and the way the data are presented in the SAS data files. Some data elements have been dropped and new ones added, coding of individual data elements has changed, and new SAS data files have been created. Coding changes and the years for which individual data items are available are shown in the “Data Element Definitions and Codes” section of this document. The FARS Coding and Editing Manual contains a detailed description of each SAS data element, including coding instructions and attribute definitions. The Coding Manual is published for each year of data collection. Years 2001 to current are available at: http://www-nrd.nhtsa.dot.gov/Cats/listpublications.aspx?Id=J&ShowBy=DocType Note: In this manual the word vehicle means in-transport motor vehicle unless otherwise noted.
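
    Since 207 records are geocoded to 0,0, anyone mapping the SAS files will want to drop those placeholders first. A hedged Python sketch; the filename and the LATITUDE/LONGITUD column names are assumptions about a particular year's file layout:

    ```python
    # Sketch: load a crash-level FARS SAS file and drop records geocoded
    # to (0, 0). Filename and coordinate column names are assumptions.
    import pandas as pd

    crashes = pd.read_sas("accident.sas7bdat")
    geocoded = crashes[(crashes["LATITUDE"] != 0) | (crashes["LONGITUD"] != 0)]
    print(f"dropped {len(crashes) - len(geocoded)} records at (0, 0)")
    ```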
