65 datasets found
  1. SAS-2 Map Product Catalog - Dataset - NASA Open Data Portal

    • data.nasa.gov
    • data.staging.idas-ds1.appdat.jsc.nasa.gov
    Updated Apr 1, 2025
    Cite
    nasa.gov (2025). SAS-2 Map Product Catalog - Dataset - NASA Open Data Portal [Dataset]. https://data.nasa.gov/dataset/sas-2-map-product-catalog
    Dataset updated
    Apr 1, 2025
    Dataset provided by
NASA (http://nasa.gov/)
    Description

This database is a collection of maps created from the 28 SAS-2 observation files. The original observation files can be accessed within BROWSE by changing to the SAS2RAW database. For each of the SAS-2 observation files, the analysis package FADMAP was run, and the resulting maps, plus GIF images created from these maps, were collected into this database. Each map is a 60 x 60 pixel FITS-format image with 1-degree pixels. The user may reconstruct any of these maps within the captive account by running FADMAP from the command line after extracting a file from the SAS2RAW database. The parameters used for selecting data for these product map files are embedded as keywords in the FITS maps themselves. These parameters are set in FADMAP and, for the maps in this database, are set as 'wide open' as possible: except for selecting on each of 3 energy ranges, all other FADMAP parameters were set using broad criteria. To find more information about how to run FADMAP on the raw events file, the user can access help files within the SAS2RAW database or can use the 'fhelp' facility from the command line. This is a service provided by NASA HEASARC.
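    The description above is enough to sanity-check a downloaded map programmatically. A minimal Python sketch, assuming a local copy of one map and the astropy library (the filename below is hypothetical):

        # Inspect one SAS-2 product map: each should be a 60 x 60 pixel FITS
        # image, with the FADMAP selection parameters embedded as header keywords.
        from astropy.io import fits

        with fits.open("sas2_map_example.fits") as hdul:   # hypothetical filename
            image = hdul[0].data                           # the 60 x 60 map array
            assert image.shape == (60, 60)
            for card in hdul[0].header.cards:              # FADMAP selection parameters
                print(card.keyword, "=", card.value)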

  2. Urban Sound & Sight (Urbansas) - Labeled set

    • zenodo.org
    • explore.openaire.eu
    txt, zip
    Updated Jun 20, 2022
    + more versions
    Cite
Magdalena Fuentes; Bea Steers; Pablo Zinemanas; Martín Rocamora; Luca Bondi; Julia Wilkins; Qianyi Shi; Yao Hou; Samarjit Das; Xavier Serra; Juan Pablo Bello (2022). Urban Sound & Sight (Urbansas) - Labeled set [Dataset]. http://doi.org/10.5281/zenodo.6658386
Available download formats: txt, zip
    Dataset updated
    Jun 20, 2022
    Dataset provided by
Zenodo (http://zenodo.org/)
    Authors
Magdalena Fuentes; Bea Steers; Pablo Zinemanas; Martín Rocamora; Luca Bondi; Julia Wilkins; Qianyi Shi; Yao Hou; Samarjit Das; Xavier Serra; Juan Pablo Bello
    License

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Urban Sound & Sight (Urbansas):

    Version 1.0, May 2022

    Created by
    Magdalena Fuentes (1, 2), Bea Steers (1, 2), Pablo Zinemanas (3), Martín Rocamora (4), Luca Bondi (5), Julia Wilkins (1, 2), Qianyi Shi (2), Yao Hou (2), Samarjit Das (5), Xavier Serra (3), Juan Pablo Bello (1, 2)
    1. Music and Audio Research Lab, New York University
    2. Center for Urban Science and Progress, New York University
    3. Universitat Pompeu Fabra, Barcelona, Spain
    4. Universidad de la República, Montevideo, Uruguay
    5. Bosch Research, Pittsburgh, PA, USA

    Publication

    If using this data in academic work, please cite the following paper, which presented this dataset:
    M. Fuentes, B. Steers, P. Zinemanas, M. Rocamora, L. Bondi, J. Wilkins, Q. Shi, Y. Hou, S. Das, X. Serra, J. Bello. “Urban Sound & Sight: Dataset and Benchmark for Audio-Visual Urban Scene Understanding”. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2022.

    Description

    Urbansas is a dataset for the development and evaluation of machine listening systems for audiovisual spatial urban understanding. One of the main challenges to this field of study is a lack of realistic, labeled data to train and evaluate models on their ability to localize using a combination of audio and video.
    We set four main goals for creating this dataset:
    1. To compile a set of real-field audio-visual recordings;
    2. The recordings should be stereo to allow exploring sound localization in the wild;
    3. The compilation should be varied in terms of scenes and recording conditions to be meaningful for training and evaluation of machine learning models;
    4. The labeled collection should be accompanied by a bigger unlabeled collection with similar characteristics to allow exploring self-supervised learning in urban contexts.
    Audiovisual data
    We have compiled and manually annotated Urbansas from two publicly available datasets, plus the addition of unreleased material. The public datasets are the TAU Urban Audio-Visual Scenes 2021 Development dataset (street-traffic subset) and the Montevideo Audio-Visual Dataset (MAVD):


    Wang, Shanshan, et al. "A curated dataset of urban scenes for audio-visual scene analysis." ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2021.

Zinemanas, Pablo, Pablo Cancela, and Martín Rocamora. "MAVD: A dataset for sound event detection in urban environments." Detection and Classification of Acoustic Scenes and Events (DCASE 2019), New York, NY, USA, 25-26 Oct. 2019, pages 263-267.


The TAU dataset consists of 10-second segments of audio and video from different scenes across European cities, traffic being one of the scenes. Only the scenes labeled as traffic were included in Urbansas. MAVD is an audio-visual traffic dataset curated in different locations of Montevideo, Uruguay, with annotations of vehicle and vehicle-component sounds (e.g. engine, brakes) for sound event detection. Besides the published datasets, we include a total of 9.5 hours of unpublished material recorded in Montevideo with the same recording devices as MAVD but including new locations and scenes.

    Recordings for TAU were acquired using a GoPro Hero 5 (30fps, 1280x720) and a Soundman OKM II Klassik/studio A3 electret binaural in-ear microphone with a Zoom F8 audio recorder (48kHz, 24 bits, stereo). Recordings for MAVD were collected using a GoPro Hero 3 (24fps, 1920x1080) and a SONY PCM-D50 recorder (48kHz, 24 bits, stereo).

Compiled into Urbansas, this amounts to 15 hours of stereo audio and video, stored in separate 10-second MPEG4 (1280x720, 24fps) and WAV (48kHz, 24-bit, 2-channel) files. Both released video datasets are already anonymized to obscure people and license plates; the unpublished MAVD data was anonymized similarly using this anonymizer. We also distribute the 2fps video used for producing the annotations.

    The audio and video files both share the same filename stem, meaning that they can be associated after removing the parent directory and extension.
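    A minimal pairing sketch in Python, assuming local audio/ and video/ trees laid out as below (standard library only):

        # Pair each WAV with its MPEG4 counterpart via the shared filename stem
        # (the file's path minus parent directory and extension, as stated above).
        from pathlib import Path

        audio = {p.stem: p for p in Path("audio").rglob("*.wav")}
        video = {p.stem: p for p in Path("video").rglob("*.mp4")}
        pairs = {stem: (audio[stem], video[stem]) for stem in audio.keys() & video.keys()}
        print(f"{len(pairs)} matched audio/video clips")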

    MAVD:
    video/

    TAU:
    video/


    where location_id in both cases includes the city and an ID number.


    city       | places | clips | mins | frames  | labeled mins
    -----------|--------|-------|------|---------|-------------
    Montevideo |      8 |  4085 |  681 |  980400 |           92
    Stockholm  |      3 |    91 |   15 |   21840 |            2
    Barcelona  |      4 |   144 |   24 |   34560 |           24
    Helsinki   |      4 |   144 |   24 |   34560 |           16
    Lisbon     |      4 |   144 |   24 |   34560 |           19
    Lyon       |      4 |   144 |   24 |   34560 |            6
    Paris      |      4 |   144 |   24 |   34560 |            2
    Prague     |      4 |   144 |   24 |   34560 |            2
    Vienna     |      4 |   144 |   24 |   34560 |            6
    London     |      5 |   144 |   24 |   34560 |            4
    Milan      |      6 |   144 |   24 |   34560 |            6
    Total      |     50 |  5472 |  912 |    1.3M |          180


    Annotations


    Of the 15 hours of audio and video, 3 hours of data (1.5 hours TAU, 1.5 hours MAVD) are manually annotated by our team both in audio and image, along with 12 hours of unlabeled data (2.5 hours TAU, 9.5 hours of unpublished material) for the benefit of unsupervised models. The distribution of clips across locations was selected to maximize variance across different scenes. The annotations were collected at 2 frames per second (FPS) as it provided a balance between temporal granularity and clip coverage.

    The annotation data is contained in video_annotations.csv and audio_annotations.csv.

    Video Annotations

Each row in the video annotations represents a single object in a single frame of the video. The annotation schema is as follows (a loading sketch follows the list):

    • frame_id: The index of the frame within the clip the annotation is associated with. This index is 0-based and goes up to 19 (assuming 10-second clips with annotations at 2 FPS)
    • track_id: The ID of the detected instance that identifies the same object across different frames. These IDs are guaranteed to be unique within a clip.
    • x, y, w, h: The top-left corner and width and height of the object’s bounding box in the video. The values are given in absolute coordinates with respect to the image size (1280x720).
• class_id: The class index, one of [0, 1, 2, 3, -1]; see label for the index mapping. The value -1 marks rows that carry only clip-level annotations (such as night and city) when there are no events. When operating on bounding boxes, rows with a class_id of -1 should be filtered out.
    • label: The label text, equivalent to LABELS[class_id] where LABELS=[car, bus, motorbike, truck, -1]. The -1 label has the same role as above.
    • visibility: The visibility of the object. This is 1 unless the object becomes obstructed, in which case it changes to 0.
    • filename: The file ID of the associated file. This is the file’s path minus the parent directory and extension.
• city: The city where the clip was collected.
    • location_id: The specific name of the location. This may include an integer ID following the city name for cases where there are multiple collection points.
    • time: The time (in seconds) of the annotation, relative to the start of the file. Equivalent to frame_id / fps.
    • night: Whether the clip takes place during the day or at night. This value is constant within a clip.
    • subset: Which data source the data originally belongs to (TAU or MAVD).
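    A hedged pandas loading sketch, assuming a local copy of video_annotations.csv: the clip-level -1 rows are easy to trip over, so drop them before any bounding-box work, per the schema above.

        import pandas as pd

        ann = pd.read_csv("video_annotations.csv")
        boxes = ann[ann["class_id"] != -1].copy()    # keep only real detections

        # Recover each annotation's time from its frame index (2 FPS).
        boxes["time_from_frame"] = boxes["frame_id"] / 2.0

        # One trajectory per object: group by clip and track.
        trajectories = boxes.groupby(["filename", "track_id"])
        print(trajectories.size().describe())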

    Audio Annotations

Each row represents a single object instance, along with the time range during which it exists within the clip. The annotation schema is as follows (a tallying sketch follows the list):

• filename: The file ID of the associated audio file. See filename above.
    • class_id, label: See above. Audio has an additional class_id of 4 (label=offscreen) which indicates an off-screen vehicle - meaning a vehicle that is heard but not seen. A class_id of -1 indicates a clip-level annotation for a clip that has no object annotations (an empty scene).
    • non_identifiable_vehicle_sound: True if the region contains the sound of vehicles where individual instances cannot be uniquely identified.
    • start, end: The start and end times (in seconds) of the annotation relative to the file.
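    Under the same assumptions as the video sketch, a short example tallying annotated seconds per label, including the audio-only "offscreen" class (class_id 4):

        import pandas as pd

        audio_ann = pd.read_csv("audio_annotations.csv")
        events = audio_ann[audio_ann["class_id"] != -1].copy()   # drop empty-scene rows
        events["duration"] = events["end"] - events["start"]
        print(events.groupby("label")["duration"].sum().sort_values(ascending=False))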

    Conditions of use

    Dataset created by Magdalena Fuentes, Bea Steers, Pablo Zinemanas, Martín Rocamora, Luca Bondi, Julia Wilkins, Qianyi Shi, Yao Hou, Samarjit Das, Xavier Serra, and Juan Pablo Bello.

    The Urbansas dataset is offered free of charge under the following terms:

• Urbansas annotations are released under the CC BY 4.0 license
    • Urbansas video and audio inherit the licenses of the original sources:
      • the MAVD subset is released under CC BY 4.0
      • the TAU subset is released under a Non-Commercial license

    Feedback

    Please help us improve Urbansas by sending your feedback to:

    • Magdalena Fuentes: mfuentes@nyu.edu
    • Bea Steers: bsteers@nyu.edu

    In case of a problem, please include as many details as possible.

    Acknowledgments

This work was partially supported by the National Science Foundation.

  3. SAS: Semantic Artist Similarity Dataset

    • live.european-language-grid.eu
    • zenodo.org
    txt
    Updated Oct 28, 2023
    Cite
    (2023). SAS: Semantic Artist Similarity Dataset [Dataset]. https://live.european-language-grid.eu/catalogue/corpus/7418
Available download formats: txt
    Dataset updated
    Oct 28, 2023
    License

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

The Semantic Artist Similarity dataset consists of two datasets of artist entities with their corresponding biography texts, and the list of the top-10 most similar artists within each dataset used as ground truth. The dataset comprises a corpus of 268 artists and a slightly larger one of 2,336 artists, both gathered from Last.fm in March 2015. The former is mapped to the MIREX Audio and Music Similarity evaluation dataset, so that its similarity judgments can be used as ground truth; for the latter corpus we use the similarity between artists as provided by the Last.fm API. For every artist there is a list of the top-10 most related artists. In the MIREX dataset, 188 artists have at least 10 similar artists; the other 80 artists have fewer than 10. In the Last.fm API dataset, all artists have a list of 10 similar artists.

    There are 4 files in the dataset. mirex_gold_top10.txt and lastfmapi_gold_top10.txt contain the top-10 lists for every artist of both datasets. Artists are identified by MusicBrainz ID. The format is one line per artist: the artist mbid, a tab, then the list of top-10 related artists identified by their mbids separated by spaces:

    artist_mbid \t artist_mbid_top10_list_separated_by_spaces

    mb2uri_mirex and mb2uri_lastfmapi.txt contain the list of artists. Each line has three tab-separated fields: the MusicBrainz ID, the Last.fm name of the artist, and the DBpedia URI:

    artist_mbid \t lastfm_name \t dbpedia_uri

    There are also 2 folders in the dataset with the biography texts of each dataset. Each .txt file in the biography folders is named with the MusicBrainz ID of the biographied artist. Biographies were gathered from the Last.fm wiki page of every artist.

    Using this dataset: we would highly appreciate it if scientific publications of works partly based on the Semantic Artist Similarity dataset cite the following publication: Oramas, S., Sordo, M., Espinosa-Anke, L., & Serra, X. (In Press). A Semantic-based Approach for Artist Similarity. 16th International Society for Music Information Retrieval Conference.

    We are interested in knowing if you find our datasets useful! If you use our dataset please email us at mtg-info@upf.edu and tell us about your research. https://www.upf.edu/web/mtg/semantic-similarity
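    The two line formats above parse in a few lines of Python; a hedged sketch:

        def load_top10(path):
            """artist_mbid \t space-separated top-10 related mbids"""
            top10 = {}
            with open(path, encoding="utf-8") as f:
                for line in f:
                    artist_mbid, rest = line.rstrip("\n").split("\t")
                    top10[artist_mbid] = rest.split()
            return top10

        def load_mb2uri(path):
            """mbid \t lastfm_name \t dbpedia_uri"""
            with open(path, encoding="utf-8") as f:
                return [tuple(line.rstrip("\n").split("\t")) for line in f]

        gold = load_top10("mirex_gold_top10.txt")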

  4. DHS data extractors for Stata

    • search.dataone.org
    • dataverse.harvard.edu
    Updated Nov 21, 2023
    Cite
    Emily Oster (2023). DHS data extractors for Stata [Dataset]. http://doi.org/10.7910/DVN/RRX3QD
    Dataset updated
    Nov 21, 2023
    Dataset provided by
    Harvard Dataverse
    Authors
    Emily Oster
    Description

This package contains two files designed to help read individual-level DHS data into Stata. The first file addresses the problem that versions of Stata before Version 7/SE read in only up to 2,047 variables, while most of the individual-level files have more variables than that. It reads in the .do, .dct, and .dat files and outputs new .do and .dct files with only a user-specified subset of the variables. The second file deals with earlier DHS surveys for which .do and .dct files do not exist and only .sps and .sas files are provided. It reads in the .sas and .sps files and outputs a .dct and .do file. If necessary, the first file can then be run again to select a subset of variables.

  5. SAS-2 Map Product Catalog

    • catalog.data.gov
    • s.cnmilf.com
    Updated Jul 4, 2025
    + more versions
    Cite
    High Energy Astrophysics Science Archive Research Center (2025). SAS-2 Map Product Catalog [Dataset]. https://catalog.data.gov/dataset/sas-2-map-product-catalog
    Dataset updated
    Jul 4, 2025
    Dataset provided by
    High Energy Astrophysics Science Archive Research Center
    Description

This database is a collection of maps created from the 28 SAS-2 observation files. The original observation files can be accessed within BROWSE by changing to the SAS2RAW database. For each of the SAS-2 observation files, the analysis package FADMAP was run, and the resulting maps, plus GIF images created from these maps, were collected into this database. Each map is a 60 x 60 pixel FITS-format image with 1-degree pixels. The user may reconstruct any of these maps within the captive account by running FADMAP from the command line after extracting a file from the SAS2RAW database. The parameters used for selecting data for these product map files are embedded as keywords in the FITS maps themselves. These parameters are set in FADMAP and, for the maps in this database, are set as 'wide open' as possible: except for selecting on each of 3 energy ranges, all other FADMAP parameters were set using broad criteria. To find more information about how to run FADMAP on the raw events file, the user can access help files within the SAS2RAW database or can use the 'fhelp' facility from the command line. This is a service provided by NASA HEASARC.

  6. Survey of Income and Program Participation (SIPP)

    • dataverse.harvard.edu
    Updated May 30, 2013
    Cite
    Anthony Damico (2013). Survey of Income and Program Participation (SIPP) [Dataset]. http://doi.org/10.7910/DVN/I0FFJV
Available download formats: Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
    Dataset updated
    May 30, 2013
    Dataset provided by
    Harvard Dataverse
    Authors
    Anthony Damico
    License

CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

analyze the survey of income and program participation (sipp) with r. if the census bureau's budget was gutted and only one complex sample survey survived, pray it's the survey of income and program participation (sipp). it's giant. it's rich with variables. it's monthly. it follows households over three, four, now five year panels. the congressional budget office uses it for their health insurance simulation. analysts read that sipp has person-month files, get scurred, and retreat to inferior options. the american community survey may be the mount everest of survey data, but sipp is most certainly the amazon. questions swing wild and free through the jungle canopy i mean core data dictionary. legend has it that there are still species of topical module variables that scientists like you have yet to analyze. ponce de león would've loved it here. ponce. what a name. what a guy.

    the sipp 2008 panel data started from a sample of 105,663 individuals in 42,030 households. once the sample gets drawn, the census bureau surveys one-fourth of the respondents every four months, over four or five years (panel durations vary). you absolutely must read and understand pdf pages 3, 4, and 5 of this document before starting any analysis (start at the header 'waves and rotation groups'). if you don't comprehend what's going on, try their survey design tutorial. since sipp collects information from respondents regarding every month over the duration of the panel, you'll need to be hyper-aware of whether you want your results to be point-in-time, annualized, or specific to some other period. the analysis scripts below provide examples of each. at every four-month interview point, every respondent answers every core question for the previous four months. after that, wave-specific addenda (called topical modules) get asked, but generally only regarding a single prior month. to repeat: core wave files contain four records per person, topical modules contain one. if you stacked every core wave, you would have one record per person per month for the duration of the panel. mmmassive. ~100,000 respondents x 12 months x ~4 years. have an analysis plan before you start writing code so you extract exactly what you need, nothing more. better yet, modify something of mine. cool?

    this new github repository contains eight, you read me, eight scripts:

    1996 panel - download and create database.R
    2001 panel - download and create database.R
    2004 panel - download and create database.R
    2008 panel - download and create database.R
    • since some variables are character strings in one file and integers in another, initiate an r function to harmonize variable class inconsistencies in the sas importation scripts
    • properly handle the parentheses seen in a few of the sas importation scripts, because the SAScii package currently does not
    • create an rsqlite database, initiate a variant of the read.SAScii function that imports ascii data directly into a sql database (.db)
    • download each microdata file - weights, topical modules, everything - then read 'em into sql

    2008 panel - full year analysis examples.R
    • define which waves and specific variables to pull into ram, based on the year chosen
    • loop through each of twelve months, constructing a single-year temporary table inside the database
    • read that twelve-month file into working memory, then save it for faster loading later if you like
    • read the main and replicate weights columns into working memory too, merge everything
    • construct a few annualized and demographic columns using all twelve months' worth of information
    • construct a replicate-weighted complex sample design with a fay's adjustment factor of one-half, again save it for faster loading later, only if you're so inclined
    • reproduce census-published statistics, not precisely (due to topcoding described here on pdf page 19)

    2008 panel - point-in-time analysis examples.R
    • define which wave(s) and specific variables to pull into ram, based on the calendar month chosen
    • read that interview point (srefmon)- or calendar month (rhcalmn)-based file into working memory
    • read the topical module and replicate weights files into working memory too, merge it like you mean it
    • construct a few new, exciting variables using both core and topical module questions
    • construct a replicate-weighted complex sample design with a fay's adjustment factor of one-half
    • reproduce census-published statistics, not exactly cuz the authors of this brief used the generalized variance formula (gvf) to calculate the margin of error - see pdf page 4 for more detail - the friendly statisticians at census recommend using the replicate weights whenever possible. oh hayy, now it is.

    2008 panel - median value of household assets.R
    • define which wave(s) and specific variables to pull into ram, based on the topical module chosen
    • read the topical module and replicate weights files into working memory too, merge once again
    • construct a replicate-weighted complex sample design with a...
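    the fay's-adjusted replicate design mentioned above boils down to a one-line variance formula; a hedged numpy sketch (the replicate values below are made up, not sipp output):

        # fay's brr variance with adjustment factor k = 0.5:
        #   var(theta) = sum_r (theta_r - theta_full)^2 / (R * (1 - k)^2)
        import numpy as np

        def fay_brr_variance(theta_full, theta_reps, k=0.5):
            theta_reps = np.asarray(theta_reps, dtype=float)
            r = theta_reps.size
            return ((theta_reps - theta_full) ** 2).sum() / (r * (1 - k) ** 2)

        # standard error = sqrt(variance)
        se = np.sqrt(fay_brr_variance(10_000.0, [9_900.0, 10_150.0, 10_050.0]))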

  7. SAS-3 Y-Axis Pointed Obs Log

    • catalog.data.gov
    • s.cnmilf.com
    Updated Jul 11, 2025
    + more versions
    Cite
    High Energy Astrophysics Science Archive Research Center (2025). SAS-3 Y-Axis Pointed Obs Log [Dataset]. https://catalog.data.gov/dataset/sas-3-y-axis-pointed-obs-log
    Dataset updated
    Jul 11, 2025
    Dataset provided by
    High Energy Astrophysics Science Archive Research Center
    Description

This database is the Third Small Astronomy Satellite (SAS-3) Y-Axis Pointed Observation Log. It identifies possible pointed observations of celestial X-ray sources which were performed with the y-axis detectors of the SAS-3 X-Ray Observatory. This log was compiled (by R. Kelley, P. Goetz and L. Petro) from notes made at the time of the observations, and it is expected to be neither complete nor fully accurate. Possible errors in the log are (i) the misclassification of an observation as a pointed observation when it was either a spinning or dither observation, and (ii) inaccuracy of the dates and times of the start and end of an observation. In addition, as described in the HEASARC_Updates section, the HEASARC added some additional information when creating this database. Further information about the SAS-3 detectors and their fields of view can be found at: http://heasarc.gsfc.nasa.gov/docs/sas3/sas3_about.html Disclaimer: The HEASARC is aware of certain inconsistencies between the Start_date, End_date, and Duration fields for a number of rows in this database table. They appear to be errors present in the original table. Except for one entry, where the HEASARC corrected an error because it was nearly certain which parameter was incorrect (as noted in the 'HEASARC_Updates' section of this documentation), these inconsistencies have been left as they were in the original table. This database table was released by the HEASARC in June 2000, based on the SAS-3 Y-Axis Pointed Observation Log (available from the NSSDC as dataset ID 75-037A-02B), together with some additional information provided by the HEASARC itself. This is a service provided by NASA HEASARC.

  8. Current Population Survey (CPS)

    • search.dataone.org
    • dataverse.harvard.edu
    Updated Nov 21, 2023
    Cite
    Damico, Anthony (2023). Current Population Survey (CPS) [Dataset]. http://doi.org/10.7910/DVN/AK4FDD
    Dataset updated
    Nov 21, 2023
    Dataset provided by
    Harvard Dataverse
    Authors
    Damico, Anthony
    Description

analyze the current population survey (cps) annual social and economic supplement (asec) with r. the annual march cps-asec has been supplying the statistics for the census bureau's report on income, poverty, and health insurance coverage since 1948. wow. the us census bureau and the bureau of labor statistics (bls) tag-team on this one. until the american community survey (acs) hit the scene in the early aughts (2000s), the current population survey had the largest sample size of all the annual general demographic data sets outside of the decennial census - about two hundred thousand respondents. this provides enough sample to conduct state- and a few large metro area-level analyses. your sample size will vanish if you start investigating subgroups by state - consider pooling multiple years. county-level is a no-no. despite the american community survey's larger size, the cps-asec contains many more variables related to employment, sources of income, and insurance - and can be trended back to harry truman's presidency. aside from questions specifically asked about an annual experience (like income), many of the questions in this march data set should be treated as point-in-time statistics. cps-asec generalizes to the united states non-institutional, non-active-duty military population.

    the national bureau of economic research (nber) provides sas, spss, and stata importation scripts to create a rectangular file (rectangular data means only person-level records; household- and family-level information gets attached to each person). to import these files into r, the parse.SAScii function uses nber's sas code to determine how to import the fixed-width file, then RSQLite to put everything into a schnazzy database. you can try reading through the nber march 2012 sas importation code yourself, but it's a bit of a proc freak show.

    this new github repository contains three scripts:

    2005-2012 asec - download all microdata.R
    • download the fixed-width file containing household, family, and person records
    • import by separating this file into three tables, then merge 'em together at the person-level
    • download the fixed-width file containing the person-level replicate weights
    • merge the rectangular person-level file with the replicate weights, then store it in a sql database
    • create a new variable - one - in the data table

    2012 asec - analysis examples.R
    • connect to the sql database created by the 'download all microdata' program
    • create the complex sample survey object, using the replicate weights
    • perform a boatload of analysis examples

    replicate census estimates - 2011.R
    • connect to the sql database created by the 'download all microdata' program
    • create the complex sample survey object, using the replicate weights
    • match the sas output shown in the png file 2011 asec replicate weight sas output.png: the statistic and standard error generated from the replicate-weighted example sas script contained in this census-provided person replicate weights usage instructions document

    click here to view these three scripts

    for more detail about the current population survey - annual social and economic supplement (cps-asec), visit:
    • the census bureau's current population survey page
    • the bureau of labor statistics' current population survey page
    • the current population survey's wikipedia article

    notes: interviews are conducted in march about experiences during the previous year. the file labeled 2012 includes information (income, work experience, health insurance) pertaining to 2011. when you use the current population survey to talk about america, subtract a year from the data file name. as of the 2010 file (the interview focusing on america during 2009), the cps-asec contains exciting new medical out-of-pocket spending variables most useful for supplemental (medical spending-adjusted) poverty research.

    confidential to sas, spss, stata, sudaan users: why are you still rubbing two sticks together after we've invented the butane lighter? time to transition to r. :D
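    the household/family/person merge the download script performs is easy to picture with toy tables; a hedged pandas sketch (the link keys are placeholders, not the official cps field names):

        import pandas as pd

        # toy stand-ins for the three record types split out of the fixed-width file
        household = pd.DataFrame({"hh_id": [1], "hh_income": [52_000]})
        family = pd.DataFrame({"hh_id": [1], "fam_id": [1], "fam_size": [3]})
        person = pd.DataFrame({"hh_id": [1, 1, 1], "fam_id": [1, 1, 1], "age": [41, 39, 9]})

        # 'rectangular' = one row per person, household/family columns repeated
        rect = (person
                .merge(family, on=["hh_id", "fam_id"], how="left")
                .merge(household, on="hh_id", how="left"))
        print(rect)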

  9. SAS Programs - Claims-Based Frailty Index

    • search.dataone.org
    • dataverse.harvard.edu
    Updated Sep 25, 2024
    Cite
    Kim, Dae Hyun; Gautam, Nileesa (2024). SAS Programs - Claims-Based Frailty Index [Dataset]. http://doi.org/10.7910/DVN/HM8DOI
    Dataset updated
    Sep 25, 2024
    Dataset provided by
    Harvard Dataverse
    Authors
    Kim, Dae Hyun; Gautam, Nileesa
    Description

    This SAS program calculates CFI for each patient from analytic data files containing information on patient identifiers, ICD-9-CM diagnosis codes (version 32), ICD-10-CM Diagnosis Codes (version 2020), CPT codes, and HCPCS codes. NOTE: When downloading, store "CFI_ICD9CM_V32.tab", "CFI_ICD10CM_V2020.tab", and "PX_CODES.tab" as csv files (these files are originally stored as csv files, but Dataverse automatically converts them to tab files). Please read "Frailty-Index-SAS-code-Guide" before proceeding. Interpretation, validation data, and annotated references are provided in "Research Background - Claims-Based Frailty Index".
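    A hedged Python helper for the NOTE above, re-saving the Dataverse-converted .tab files under the .csv names the SAS program expects (assumes pandas and the three files downloaded to the working directory):

        import pandas as pd

        for name in ["CFI_ICD9CM_V32", "CFI_ICD10CM_V2020", "PX_CODES"]:
            # Dataverse serves these as tab-delimited; write them back out as csv.
            pd.read_csv(f"{name}.tab", sep="\t").to_csv(f"{name}.csv", index=False)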

  10. daily-historical-stock-price-data-for-atland-sas-20002025

    • huggingface.co
    Cite
    Khaled Ben Ali, daily-historical-stock-price-data-for-atland-sas-20002025 [Dataset]. https://huggingface.co/datasets/khaledxbenali/daily-historical-stock-price-data-for-atland-sas-20002025
    Authors
    Khaled Ben Ali
    Description

    📈 Daily Historical Stock Price Data for Atland SAS (2000–2025)

    A clean, ready-to-use dataset containing daily stock prices for Atland SAS from 2000-12-12 to 2025-05-28. This dataset is ideal for use in financial analysis, algorithmic trading, machine learning, and academic research.

      🗂️ Dataset Overview
    

Company: Atland SAS
    Ticker Symbol: ATLD.PA
    Date Range: 2000-12-12 to 2025-05-28
    Frequency: Daily
    Total Records: 6282 rows (one per trading day)

      🔢 Columns… See the full description on the dataset page: https://huggingface.co/datasets/khaledxbenali/daily-historical-stock-price-data-for-atland-sas-20002025.
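    A minimal loading sketch with the Hugging Face datasets library; everything beyond the dataset id (splits, column names) is an assumption, so check the dataset page:

        from datasets import load_dataset

        ds = load_dataset(
            "khaledxbenali/daily-historical-stock-price-data-for-atland-sas-20002025"
        )
        print(ds)   # expect ~6282 daily rows, 2000-12-12 through 2025-05-28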
    
  11. MCSP Monarch and Plant Monitoring - SAS Output Summarizing 2018 Monarch...

    • catalog.data.gov
    • gimi9.com
    Updated Feb 21, 2025
    Cite
    U.S. Fish and Wildlife Service (2025). MCSP Monarch and Plant Monitoring - SAS Output Summarizing 2018 Monarch Butterfly Abundance from SOP 2 Data [Dataset]. https://catalog.data.gov/dataset/mcsp-monarch-and-plant-monitoring-sas-output-summarizing-2018-monarch-butterfly-abundance-
    Dataset updated
    Feb 21, 2025
    Dataset provided by
    U.S. Fish and Wildlife Service
    Description

    Output from programming code written to summarize 2018 monarch butterfly abundance from monitoring data acquired using a modified Pollard walk at custom 2017 GRTS draw sites within select monitoring areas (see SOP 2 in ServCat reference 103367 for methods) of FWS Legacy Regions 2 and 3. Areas monitored included Balcones Canyonlands (TX), Hagerman (TX), Washita (OK), Neal Smith (IA) NWRs and several locations near the town of Lamoni, Iowa and northern Missouri. Input data file is named 'FWS_2018_MM_SOP2_for_SAS.csv' and is stored in ServCat reference 136485. See SM 5 (ServCat reference 103388) for dictionary of data fields in the input data file.

  12. Key files for Spoofing and Anti-Spoofing (SAS) corpus v1.0

    • find.data.gov.scot
    • dtechtive.com
    • +1more
    txt, zip
    Updated Jun 22, 2017
    Cite
    University of Edinburgh. The Centre for Speech Technology Research (CSTR) (2017). Key files for Spoofing and Anti-Spoofing (SAS) corpus v1.0 [Dataset]. http://doi.org/10.7488/ds/2072
Available download formats: txt (0.0019 MB), zip (110.2 MB), txt (0.0166 MB)
    Dataset updated
    Jun 22, 2017
    Dataset provided by
    University of Edinburgh. The Centre for Speech Technology Research (CSTR)
    License

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    These files are complementary to the fileset: Wu et al. (2015). Spoofing and Anti-Spoofing (SAS) corpus v1.0, [dataset]. University of Edinburgh. The Centre for Speech Technology Research (CSTR). https://doi.org/10.7488/ds/252. These two filesets should be considered two complementary parts of a single dataset.

  13. Code for merging National Neighborhood Data Archive ZCTA level datasets with...

    • linkagelibrary.icpsr.umich.edu
    Updated Oct 15, 2020
    + more versions
    Cite
    Megan Chenoweth; Anam Khan (2020). Code for merging National Neighborhood Data Archive ZCTA level datasets with the UDS Mapper ZIP code to ZCTA crosswalk [Dataset]. http://doi.org/10.3886/E124461V4
    Dataset updated
    Oct 15, 2020
    Dataset provided by
    University of Michigan. Institute for Social Research
    Authors
    Megan Chenoweth; Anam Khan
    License

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The sample SAS and Stata code provided here is intended for use with certain datasets in the National Neighborhood Data Archive (NaNDA). NaNDA (https://www.openicpsr.org/openicpsr/nanda) contains some datasets that measure neighborhood context at the ZIP Code Tabulation Area (ZCTA) level. They are intended for use with survey or other individual-level data containing ZIP codes. Because ZIP codes do not exactly match ZIP code tabulation areas, a crosswalk is required to use ZIP-code-level geocoded datasets with ZCTA-level datasets from NaNDA. A ZIP-code-to-ZCTA crosswalk was previously available on the UDS Mapper website, which is no longer active. An archived copy of the ZIP-code-to-ZCTA crosswalk file has been included here. Sample SAS and Stata code are provided for merging the UDS mapper crosswalk with NaNDA datasets.
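    The repository ships SAS and Stata code; below is an analogous, hedged pandas sketch of the same two-step merge. The file and column names are assumptions, so check the crosswalk file and the NaNDA codebooks for the real ones.

        import pandas as pd

        crosswalk = pd.read_csv("zip_to_zcta_crosswalk.csv", dtype=str)       # hypothetical name
        survey = pd.read_csv("my_survey.csv", dtype={"zip": str})             # your ZIP-coded data
        nanda = pd.read_csv("nanda_zcta_measures.csv", dtype={"zcta": str})   # a NaNDA ZCTA file

        # ZIP -> ZCTA via the crosswalk, then ZCTA -> neighborhood measures
        merged = (survey
                  .merge(crosswalk, left_on="zip", right_on="zip_code", how="left")
                  .merge(nanda, on="zcta", how="left"))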

  14. MCSP Monarch and Plant Monitoring - SAS Output Summarizing 2018 Immature...

    • catalog.data.gov
    • gimi9.com
    Updated Feb 22, 2025
    + more versions
    Cite
    U.S. Fish and Wildlife Service (2025). MCSP Monarch and Plant Monitoring - SAS Output Summarizing 2018 Immature Monarch Butterfly and Plant Abundance from SOP 3 Data [Dataset]. https://catalog.data.gov/dataset/mcsp-monarch-and-plant-monitoring-sas-output-summarizing-2018-immature-monarch-butterfly-a
    Dataset updated
    Feb 22, 2025
    Dataset provided by
    U.S. Fish and Wildlife Service
    Description

    Output from programming code written to summarize immature monarch butterfly, milkweed and nectar plant abundance from monitoring data acquired using a grid of 1 square-meter quadrats at custom 2017 GRTS draw sites within select monitoring areas (see SOP 3 in ServCat reference 103368 for methods) of FWS Legacy Regions 2 and 3. Areas monitored included Balcones Canyonlands (TX), Hagerman (TX), Washita (OK), Neal Smith (IA) NWRs and several locations near the town of Lamoni, Iowa and northern Missouri. Input data file is named 'FWS_2018_MonMonSOP3DS1_forSAS.csv' and is stored in ServCat reference 137698. See SM 5 (ServCat reference 103388) for dictionary of data fields in the input data file.

  15. Sub-state Autonomy Scale (SAS)

    • datacatalogue.cessda.eu
    • sodha.be
    Updated Aug 1, 2023
    + more versions
    Cite
    Niessen, Christoph (2023). Sub-state Autonomy Scale (SAS) [Dataset]. http://doi.org/10.34934/DVN/LSXXZV
    Dataset updated
    Aug 1, 2023
    Dataset provided by
    Université catholique de Louvain & European University Institute
    Authors
    Niessen, Christoph
    Description

    This dataset comprises the data collected for the Sub-state Autonomy Scale (SAS). The SAS is an indicator measuring the autonomy demands and statutes of sub-state communities in kind (whether competences are administrative or legislative), in degree (how much each dimension is present) and by competences (as a function of the extent of comprised policy domains).

    Definitions:
-By 'sub-state community', I refer to sub-state entities within countries for which autonomous institutions have been demanded by a significant regionalist or traditional (centrist, liberal or socialist mainstream) political party (>5%), or to which autonomous institutions have been conferred.
    -By 'autonomy statutes', I refer to the legal autonomy prerogatives obtained by sub-state communities.
    -For 'autonomy demands', I distinguish between the legal autonomy prerogatives demanded by the regionalist party with the highest vote share and those demanded by the traditional party with the largest autonomy demand.

    Detailed conceptual presentation: see the Regional Studies article cited below (the open access author version can be found in the files section).

    Specifications:
    -Unit of analysis: sub-state communities by yearly intervals.
    -Country coverage: Belgium, Spain, United Kingdom (31 sub-state communities).
    -Time coverage: 1707-2020 (starting dates vary across sub-state communities).
    *For the full list of sub-state communities and their respective time coverage, see the codebook.

    Citation and acknowledgement: when using the data, please cite the Regional Studies article listed below.

    Latest version: 1.0 [01.02.2022].

  16. Human vs Machine Spoofing

    • dtechtive.com
    • find.data.gov.scot
    gz, txt
    Updated Jun 9, 2015
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    University of Edinburgh (2015). Human vs Machine Spoofing [Dataset]. http://doi.org/10.7488/ds/258
Available download formats: gz (1906.688 MB), txt (0.0166 MB), txt (0.0022 MB), gz (718.7 MB)
    Dataset updated
    Jun 9, 2015
    Dataset provided by
    University of Edinburgh
    License

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Listening test materials for 'Human vs Machine Spoofing Detection on Wideband and Narrowband data.' They include lists of the speech material selected from the SAS spoofing database and the listeners' responses. The main data file has been split into five smaller files (labelled 'aa' to 'ae') for ease of download.

  17. Health and Retirement Study (HRS)

    • search.dataone.org
    Updated Nov 21, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Damico, Anthony (2023). Health and Retirement Study (HRS) [Dataset]. http://doi.org/10.7910/DVN/ELEKOY
    Dataset updated
    Nov 21, 2023
    Dataset provided by
    Harvard Dataverse
    Authors
    Damico, Anthony
    Description

analyze the health and retirement study (hrs) with r. the hrs is the one and only longitudinal survey of american seniors. with a panel starting its third decade, the current pool of respondents includes older folks who have been interviewed every two years as far back as 1992. unlike cross-sectional or shorter panel surveys, respondents keep responding until, well, death do us part. paid for by the national institute on aging and administered by the university of michigan's institute for social research. if you apply for an interviewer job with them, i hope you like werther's original.

    figuring out how to analyze this data set might trigger your fight-or-flight synapses if you just start clicking around on michigan's website. instead, read pages numbered 10-17 (pdf pages 12-19) of this introduction pdf and don't touch the data until you understand figure a-3 on that last page. if you start enjoying yourself, here's the whole book. after that, it's time to register for access to the (free) data. keep your username and password handy, you'll need it for the top of the download automation r script. next, look at this data flowchart to get an idea of why the data download page is such a righteous jungle. but wait, good news: umich recently farmed out its data management to the rand corporation, who promptly constructed a giant consolidated file with one record per respondent across the whole panel. oh so beautiful. the rand hrs files make much of the older data and syntax examples obsolete, so when you come across stuff like instructions on how to merge years, you can happily ignore them - rand has done it for you.

    the health and retirement study only includes noninstitutionalized adults when new respondents get added to the panel (as they were in 1992, 1993, 1998, 2004, and 2010) but once they're in, they're in - respondents have a weight of zero for interview waves when they were nursing home residents; but they're still responding and will continue to contribute to your statistics so long as you're generalizing about a population from a previous wave (for example: it's possible to compute "among all americans who were 50+ years old in 1998, x% lived in nursing homes by 2010"). my source for that 411? page 13 of the design doc. wicked.

    this new github repository contains five scripts:

    1992 - 2010 download HRS microdata.R
    • loop through every year and every file, download, then unzip everything in one big party

    import longitudinal RAND contributed files.R
    • create a SQLite database (.db) on the local disk
    • load the rand, rand-cams, and both rand-family files into the database (.db) in chunks (to prevent overloading ram)

    longitudinal RAND - analysis examples.R
    • connect to the sql database created by the 'import longitudinal RAND contributed files' program
    • create two database-backed complex sample survey objects, using a taylor-series linearization design
    • perform a mountain of analysis examples with wave weights from two different points in the panel

    import example HRS file.R
    • load a fixed-width file using only the sas importation script directly into ram with SAScii (http://blog.revolutionanalytics.com/2012/07/importing-public-data-with-sas-instructions-into-r.html)
    • parse through the IF block at the bottom of the sas importation script, blank out a number of variables
    • save the file as an R data file (.rda) for fast loading later

    replicate 2002 regression.R
    • connect to the sql database created by the 'import longitudinal RAND contributed files' program
    • create a database-backed complex sample survey object, using a taylor-series linearization design
    • exactly match the final regression shown in this document provided by analysts at RAND as an update of the regression on pdf page B76 of this document

    click here to view these five scripts

    for more detail about the health and retirement study (hrs), visit:
    • michigan's hrs homepage
    • rand's hrs homepage
    • the hrs wikipedia page
    • a running list of publications using hrs

    notes: exemplary work making it this far. as a reward, here's the detailed codebook for the main rand hrs file. note that rand also creates 'flat files' for every survey wave, but really, most every analysis you can think of is possible using just the four files imported with the rand importation script above. if you must work with the non-rand files, there's an example of how to import a single hrs (umich-created) file, but if you wish to import more than one, you'll have to write some for loops yourself.

    confidential to sas, spss, stata, and sudaan users: a tidal wave is coming. you can get water up your nose and be dragged out to sea, or you can grab a surf board. time to transition to r. :D
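    the chunked database load described above has a direct python analogue; a hedged sketch with pandas and the standard-library sqlite3 module (the csv filename is hypothetical):

        import sqlite3
        import pandas as pd

        # stream a large extract into sqlite without holding it all in ram
        con = sqlite3.connect("hrs.db")
        for chunk in pd.read_csv("rand_hrs_extract.csv", chunksize=50_000):
            chunk.to_sql("rand_hrs", con, if_exists="append", index=False)
        con.close()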

  18. Time Diary Study (CAPS-DIARY module)

    • dataverse-staging.rdmc.unc.edu
    • datasearch.gesis.org
    Updated May 18, 2009
    Cite
    UNC Dataverse (2009). Time Diary Study (CAPS-DIARY module) [Dataset]. https://dataverse-staging.rdmc.unc.edu/dataset.xhtml?persistentId=hdl:1902.29/CAPS-DIARY
Available download formats: txt, tsv, application/x-sas-transport, application/x-spss-por, text/x-sas-syntax (many files of varying sizes)
    Dataset updated
    May 18, 2009
    Dataset provided by
    UNC Dataverse
    License

Custom dataset terms: https://dataverse-staging.rdmc.unc.edu/api/datasets/:persistentId/versions/1.0/customlicense?persistentId=hdl:1902.29/CAPS-DIARY

    Description

The purpose of this project is to determine how college students distribute their activities in time (with a particular focus on academic and athletic activities) and to examine the factors that influence such distributions. Each R reported once about each of the seven days of the week and an additional time about either Saturday or Sunday. Rs were told the week before they were to report which day was assigned and were given a report form to complete during that day. They entered the information from that form when they returned the next week.

    The activity codes included were: 0: Sleeping. 1: Attending classes. 2: Studying or preparing classroom assignments. 3: Working at a job (including CAPS). 4: Cooking, home chores, laundry, grocery shopping. 5: Errands, non-grocery shopping, gardening, animal care. 6: Eating. 7: Bathing, getting dressed, etc. 8: Sports, exercising, other physical activities. 9: Playing competitive games (cards, darts, videogames, frisbee, chess, Trivial Pursuit, etc.). 10: Participating in UNC-sponsored organizations (student government, band, sorority, etc.). 11: Listening to the radio. 12: Watching TV. 13: Reading for pleasure (not studying or reading for class). 14: Going to a movie. 15: Attending a cultural event (such as a play, concert, or museum). 16: Attending a sports event as a spectator. 17: Partying. 18: Religious activities. 19: Conversation. 20: Travel. 21: Resting. 22: Doing other things.

    DIARY1-8: These datasets contain a matrix of activities by times for a particular day. Included is time period, activity code (see above), # of friends present, # of others present. (Rs were allowed to report doing two activities at once. In these cases they were also asked to report the % of time during the time period affected which was allocated to the first of the two activities listed.) THE DIARY DATASETS ARE STORED IN RAW FORM. SUMMARY FILES, CALLED TIMEREP, CONTAIN MOST SUMMARY INFORMATION WHICH MIGHT BE USED IN ANALYSES. THE DIARY DATASETS CAN BE LISTED TO ALLOW UNIQUE CODING OF THE ORIGINAL DATA.

    TIMEREP: The TIMEREP dataset is a summary file which gives the amount of time spent on each activity during each of the eight reporting periods and also includes more detailed information about many of the activities from follow-up questions which were asked if the respondent reported having engaged in certain activities. Data from additional questions asked of every respondent after each diary entry are also included: contact with family members, number of alcoholic drinks consumed during the 24-hour period reported on, number of friends and others present while drinking, number of cigarettes smoked on the day reported about, and number of classes skipped on the day reported about. Follow-up questions include detail about kind of physical activity or sports participation, kind of university organization, kind of radio program listened to and place of listening, kind of TV program watched and place of watching, kind of reading material read and topic, alcohol consumed while partying and place of partying, conversation topics, kind of travel, and activities included in the 'other' category.

    Special processing is required to put the dataset into SAS format. See spec for details.
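    The activity codes above transcribe directly into a lookup table; a sketch for recoding the raw DIARY matrices in Python (labels shortened slightly):

        ACTIVITY_CODES = {
            0: "Sleeping", 1: "Attending classes", 2: "Studying",
            3: "Working at a job (including CAPS)",
            4: "Cooking/home chores/laundry/grocery",
            5: "Errands/shopping/gardening/animal care", 6: "Eating",
            7: "Bathing, getting dressed", 8: "Sports/exercise",
            9: "Competitive games", 10: "UNC-sponsored organizations",
            11: "Radio", 12: "TV", 13: "Reading for pleasure", 14: "Movie",
            15: "Cultural event", 16: "Sports spectator", 17: "Partying",
            18: "Religious activities", 19: "Conversation", 20: "Travel",
            21: "Resting", 22: "Other",
        }
        assert len(ACTIVITY_CODES) == 23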

  19. Consumer Expenditure Survey (CE)

    • dataverse.harvard.edu
    Updated May 30, 2013
    Cite
    Anthony Damico (2013). Consumer Expenditure Survey (CE) [Dataset]. http://doi.org/10.7910/DVN/UTNJAH
Available download formats: Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
    Dataset updated
    May 30, 2013
    Dataset provided by
    Harvard Dataverse
    Authors
    Anthony Damico
    License

CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

analyze the consumer expenditure survey (ce) with r. the consumer expenditure survey (ce) is the primo data source to understand how americans spend money. participating households keep a running diary about every little purchase over the year. those diaries are then summed up into precise expenditure categories. how else are you gonna know that the average american household spent $34 (±2) on bacon, $826 (±17) on cellular phones, and $13 (±2) on digital e-readers in 2011? an integral component of the market basket calculation in the consumer price index, this survey recently became available as public-use microdata and they're slowly releasing historical files back to 1996. hooray!

    for a taste of what's possible with ce data, look at the quick tables listed on their main page - these tables contain approximately a bazillion different expenditure categories broken down by demographic groups. guess what? i just learned that americans living in households with $5,000 to $9,999 of annual income spent an average of $283 (±90) on pets, toys, hobbies, and playground equipment (pdf page 3). you can often get close to your statistic of interest from these web tables. but say you wanted to look at domestic pet expenditure among only households with children between 12 and 17 years old. another one of the thirteen web tables - the consumer unit composition table - shows a few different breakouts of households with kids, but none matching that exact population of interest. the bureau of labor statistics (bls) (the survey's designers) and the census bureau (the survey's administrators) have provided plenty of the major statistics and breakouts for you, but they're not psychic. if you want to comb through this data for specific expenditure categories broken out by a you-defined segment of the united states' population, then let a little r into your life. fun starts now.

    fair warning: only analyze the consumer expenditure survey if you are nerd to the core. the microdata ship with two different survey types (interview and diary), each containing five or six quarterly table formats that need to be stacked, merged, and manipulated prior to a methodologically-correct analysis. the scripts in this repository contain examples to prepare 'em all, just be advised that magnificent data like this will never be no-assembly-required. the folks at bls have posted an excellent summary of what's available - read it before anything else. after that, read the getting started guide. don't skim. a few of the descriptions below refer to sas programs provided by the bureau of labor statistics. you'll find these in the C:\My Directory\CES\2011\docs directory after you run the download program.

    this new github repository contains three scripts:

    2010-2011 - download all microdata.R
    • loop through every year and download every file hosted on the bls's ce ftp site
    • import each of the comma-separated value files into r with read.csv
    • depending on user-settings, save each table as an r data file (.rda) or stata-readable file (.dta)

    2011 fmly intrvw - analysis examples.R
    • load the r data files (.rda) necessary to create the 'fmly' table shown in the ce macros program documentation.doc file
    • construct that 'fmly' table, using five quarters of interviews (q1 2011 thru q1 2012)
    • initiate a replicate-weighted survey design object
    • perform some lovely li'l analysis examples
    • replicate the %mean_variance() macro found in "ce macros.sas" and provide some examples of calculating descriptive statistics using unimputed variables
    • replicate the %compare_groups() macro found in "ce macros.sas" and provide some examples of performing t-tests using unimputed variables
    • create an rsqlite database (to minimize ram usage) containing the five imputed variable files, after identifying which variables were imputed based on pdf page 3 of the user's guide to income imputation
    • initiate a replicate-weighted, database-backed, multiply-imputed survey design object
    • perform a few additional analyses that highlight the modified syntax required for multiply-imputed survey designs
    • replicate the %mean_variance() macro found in "ce macros.sas" and provide some examples of calculating descriptive statistics using imputed variables
    • replicate the %compare_groups() macro found in "ce macros.sas" and provide some examples of performing t-tests using imputed variables
    • replicate the %proc_reg() and %proc_logistic() macros found in "ce macros.sas" and provide some examples of regressions and logistic regressions using both unimputed and imputed variables

    replicate integrated mean and se.R
    • match each step in the bls-provided sas program "integrated mean and se.sas" but with r instead of sas
    • create an rsqlite database when the expenditure table gets too large for older computers to handle in ram
    • export a table "2011 integrated mean and se.csv" that exactly matches the contents of the sas-produced "2011 integrated mean and se.lst" text file

    click here to view these three scripts for...
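    the replicate-weight macros referenced above reduce to one line of arithmetic; a hedged numpy sketch (verify the exact divisor against "ce macros.sas" before relying on it):

        import numpy as np

        # with R replicate estimates x_r and a full-sample estimate x_full:
        #   se = sqrt( sum_r (x_r - x_full)^2 / R )
        def replicate_se(x_full, x_reps):
            x_reps = np.asarray(x_reps, dtype=float)
            return float(np.sqrt(((x_reps - x_full) ** 2).sum() / x_reps.size))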

  20. Spoofing and Anti-Spoofing (SAS) corpus v1.0

    • find.data.gov.scot
    • search.datacite.org
    gz, pdf, txt
    Updated May 27, 2015
    Cite
    University of Edinburgh. The Centre for Speech Technology Research (CSTR) (2015). Spoofing and Anti-Spoofing (SAS) corpus v1.0 [Dataset]. http://doi.org/10.7488/ds/252
Available download formats: gz (6674.432 MB), gz (9935.872 MB), gz (3306.496 MB), txt (0.001 MB), gz (7763.968 MB), gz (10393.6 MB), gz (7773.184 MB), gz (9846.784 MB), gz (10280.96 MB), gz (7478.272 MB), gz (6644.736 MB), txt (0.0166 MB), gz (10240 MB), gz (7985.152 MB), gz (10065.92 MB), gz (7974.912 MB), pdf (0.1048 MB)
    Dataset updated
    May 27, 2015
    Dataset provided by
    University of Edinburgh. The Centre for Speech Technology Research (CSTR)
    License

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

This dataset is associated with the paper 'SAS: A speaker verification spoofing database containing diverse attacks', which presents the first version of a speaker verification spoofing and anti-spoofing database, named the SAS corpus. The corpus includes nine spoofing techniques, two of which are speech synthesis and seven of which are voice conversion. We design two protocols, one for standard speaker verification evaluation and the other for producing spoofing materials. Hence, they allow the speech synthesis community to produce spoofing materials incrementally without knowledge of speaker verification spoofing and anti-spoofing. To provide a set of preliminary results, we conducted speaker verification experiments using two state-of-the-art systems. Without any anti-spoofing techniques, the two systems are extremely vulnerable to the spoofing attacks implemented in our SAS corpus. N.B. the files in the following fileset should also be taken as part of the same dataset as those provided here: Wu et al. (2017). Key files for Spoofing and Anti-Spoofing (SAS) corpus v1.0, [dataset]. University of Edinburgh. The Centre for Speech Technology Research (CSTR). http://hdl.handle.net/10283/2741
