6 datasets found
  1. d

    DHS data extractors for Stata

    • search.dataone.org
    • dataverse.harvard.edu
    Updated Nov 21, 2023
    Cite
    Emily Oster (2023). DHS data extractors for Stata [Dataset]. http://doi.org/10.7910/DVN/RRX3QD
    Explore at:
    Dataset updated
    Nov 21, 2023
    Dataset provided by
    Harvard Dataverse
    Authors
    Emily Oster
    Description

    This package contains two files designed to help read individual-level DHS data into Stata. The first file addresses the problem that versions of Stata before Version 7/SE read in at most 2047 variables, while most of the individual files have more variables than that. It reads in the .do, .dct and .dat files and outputs new .do and .dct files containing only a user-specified subset of the variables. The second file deals with earlier DHS surveys for which .do and .dct files do not exist and only .sps and .sas files are provided. It reads in the .sas and .sps files and outputs a .dct and a .do file. If necessary, the first file can then be run again to select a subset of variables.
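The variable-subsetting step described above can be illustrated with a minimal Python sketch. This is not the package's own code, which is written for Stata. The function name `subset_dictionary`, the sample dictionary text, and the assumed shape of a variable line (`<type> <name> <start>-<end>`, with the type optional) are all assumptions made for illustration; real DHS dictionary files may differ.

```python
import re

# Assumed shape of one variable line in a fixed-width dictionary file:
# an optional storage type, the variable name, then a "start-end" column range.
VAR_LINE = re.compile(r"^\s*(?:\w+\s+)?(?P<name>\w+)\s+\d+\s*-\s*\d+\s*$")


def subset_dictionary(dct_text, keep):
    """Return the dictionary text with only the variables named in `keep`.

    Lines that do not look like variable definitions (the header and the
    closing brace, for example) are passed through unchanged.
    """
    wanted = {name.lower() for name in keep}
    kept_lines = []
    for line in dct_text.splitlines():
        match = VAR_LINE.match(line)
        if match is None or match.group("name").lower() in wanted:
            kept_lines.append(line)
    return "\n".join(kept_lines)


# Hypothetical dictionary content, used only to exercise the function.
SAMPLE = """infix dictionary using SAMPLE.DAT {
    str15 caseid 1-15
    str3 v000 16-18
    int v001 19-26
}"""

print(subset_dictionary(SAMPLE, ["caseid", "v001"]))
```

A complete workflow would also need to drop the matching label commands from the .do file, which this sketch does not attempt.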

  2. E

    SAS: Semantic Artist Similarity Dataset

    • live.european-language-grid.eu
    txt
    Updated Oct 28, 2023
    Cite
    (2023). SAS: Semantic Artist Similarity Dataset [Dataset]. https://live.european-language-grid.eu/catalogue/corpus/7418
    Explore at:
    txt (available download formats)
    Dataset updated
    Oct 28, 2023
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The Semantic Artist Similarity dataset consists of two datasets of artist entities with their corresponding biography texts, plus the list of top-10 most similar artists within each dataset, used as ground truth. It is composed of a corpus of 268 artists and a slightly larger one of 2,336 artists, both gathered from Last.fm in March 2015. The former is mapped to the MIREX Audio and Music Similarity evaluation dataset, so that its similarity judgments can be used as ground truth. For the latter corpus, the similarity between artists is the one provided by the Last.fm API. For every artist there is a list of the top-10 most related artists. In the MIREX dataset, 188 artists have at least 10 similar artists; the other 80 artists have fewer than 10. In the Last.fm API dataset, all artists have a list of 10 similar artists.

    There are 4 files in the dataset.

    mirex_gold_top10.txt and lastfmapi_gold_top10.txt contain the top-10 lists of artists for every artist of both datasets. Artists are identified by MusicBrainz ID. Each file has one line per artist: the artist mbid, a tab, then the list of top-10 related artists identified by their mbids and separated by spaces.

    artist_mbid \t artist_mbid_top10_list_separated_by_spaces

    mb2uri_mirex and mb2uri_lastfmapi.txt contain the list of artists. Each line has three fields separated by tabs: the first is the MusicBrainz ID, the second is the Last.fm name of the artist, and the third is the DBpedia URI.

    artist_mbid \t lastfm_name \t dbpedia_uri

    There are also 2 folders in the dataset with the biography texts of each dataset. Each .txt file in the biography folders is named with the MusicBrainz ID of the artist it describes. Biographies were gathered from the Last.fm wiki page of every artist.

    Using this dataset

    We would highly appreciate it if scientific publications of works partly based on the Semantic Artist Similarity dataset cite the following publication:

    Oramas, S., Sordo M., Espinosa-Anke L., & Serra X. (In Press). A Semantic-based Approach for Artist Similarity. 16th International Society for Music Information Retrieval Conference.

    We are interested in knowing if you find our datasets useful! If you use our dataset please email us at mtg-info@upf.edu and tell us about your research.

    https://www.upf.edu/web/mtg/semantic-similarity
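The two tab-separated layouts described in this entry (`artist_mbid \t artist_mbid_top10_list_separated_by_spaces` and `artist_mbid \t lastfm_name \t dbpedia_uri`) could be read with a short Python sketch like the one below. The function names and the placeholder identifiers in the example lines are hypothetical, and UTF-8 encoding is an assumption that the description does not confirm.

```python
def parse_top10_line(line):
    """Split one gold-standard line into (artist_mbid, list of related mbids)."""
    artist_mbid, _, related = line.rstrip("\n").partition("\t")
    return artist_mbid, related.split()


def parse_mb2uri_line(line):
    """Split one artist-list line into its three tab-separated fields."""
    mbid, lastfm_name, dbpedia_uri = line.rstrip("\n").split("\t")
    return {"mbid": mbid, "lastfm_name": lastfm_name, "dbpedia_uri": dbpedia_uri}


def load_top10(path, encoding="utf-8"):
    """Read a whole gold-standard file into a dict keyed by artist mbid."""
    with open(path, encoding=encoding) as handle:
        return dict(parse_top10_line(line) for line in handle if line.strip())


# Placeholder identifiers, not real MusicBrainz IDs.
example_mbid, example_related = parse_top10_line("mbid-a\tmbid-b mbid-c mbid-d\n")
example_artist = parse_mb2uri_line("mbid-a\tSome Artist\thttp://dbpedia.org/resource/Some_Artist\n")
```

Because artists in the MIREX file may have fewer than 10 similar artists, code consuming these lists should not assume a fixed length.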

  3. J

    Data associated with: Study to Understand Fall Reduction and Vitamin D in...

    • archive.data.jhu.edu
    Updated May 28, 2025
    Cite
    Lawrence J. Appel; Erin D. Michos; Edgar R. Miller III (2025). Data associated with: Study to Understand Fall Reduction and Vitamin D in You (STURDY) randomized clinical trial [Dataset]. http://doi.org/10.7281/T1/PXEROL
    Explore at:
    Dataset updated
    May 28, 2025
    Dataset provided by
    Johns Hopkins Research Data Repository
    Authors
    Lawrence J. Appel; Erin D. Michos; Edgar R. Miller III
    License

    https://archive.data.jhu.edu/api/datasets/:persistentId/versions/1.1/customlicense?persistentId=doi:10.7281/T1/PXEROL

    Dataset funded by
    Johns Hopkins Institute for Clinical and Translation Research
    National Institutes of Health
    Mid-Atlantic Nutrition Obesity Research Center
    Description

    This is the limited access database for the Study to Understand Fall Reduction and Vitamin D in You (STURDY) randomized response-adaptive clinical trial. The database includes baseline, treatment and post randomization data. This Database includes a set of files pertaining to the full study population (688 randomized participants plus screenees who were not randomized) and a set of files pertaining to the burn-in cohort (the 406 participants randomized prior to the first adjustment of the randomization probabilities). The Database also includes files that support the analyses included in the primary outcome paper published by the Annals of Internal Medicine (2021;174:(2):145-156). Each data file in the Database corresponds to a specific data collection form or type of data. This documentation notebook includes a SAS PROC CONTENTS listing for each SAS file and a copy of the relevant form if applicable. Each variable on each SAS data file has an associated SAS label. Several STURDY documents, including the final versions of the screening and trial consent statements, the Protocol, and the Manual of Procedures, are included with this documentation notebook to assist with understanding and navigation of STURDY data. Notes on analysis questions and issues are also included, as is a list of STURDY publications.

  4. g

    Evaluation of Seven Second Chance Act Adult Demonstration Grantees, December...

    • search.gesis.org
    + more versions
    Cite
    D'Amico, Ronald, Evaluation of Seven Second Chance Act Adult Demonstration Grantees, December 2001-September 2014 - Version 1 [Dataset]. http://doi.org/10.3886/ICPSR36992.v1
    Explore at:
    Dataset provided by
    Inter-University Consortium for Political and Social Research
    GESIS search
    Authors
    D'Amico, Ronald
    License

    https://search.gesis.org/research_data/datasearch-httpwww-da-ra-deoaip--oaioai-da-ra-de651513

    Description

    Abstract (en): These data are part of NACJD's Fast Track Release and are distributed as they were received from the data depositor. The files have been zipped by NACJD for release, but not checked or processed except for the removal of direct identifiers. Users should refer to the accompanying readme file for a brief description of the files available with this collection and consult the investigator(s) if further information is needed.

    This study evaluates the impacts of re-entry programs developed by seven grantees awarded funds under the Second Chance Act (SCA) Adult Demonstration Program to reduce recidivism by addressing the challenges faced by adults returning to their communities after incarceration. The collection contains 3 SAS data files, admin30.sas (n=966; 111 variables), MIS.sas (n=606; 48 variables), and survey.sas (n=789; 273 variables), and 1 SAS syntax file.

    This evaluation estimates the impacts of programs developed by seven agencies that were awarded grants through the first round of funding under the SCA Adult Demonstration Program; these grants were awarded in fiscal year (FY) 2009. The Adult Demonstration Program represents one of a number of separate grant programs authorized through SCA. The seven grantees were purposively selected and drawn from only one grant program. In estimating impacts, the evaluation used a randomized controlled trial, whereby 966 individuals eligible for SCA were randomly assigned to either a program group, whose members could participate in individualized SCA services, or a control group, whose members could receive all re-entry services otherwise available but not individualized SCA services. Each study participant was measured on a range of outcomes at 18 months after random assignment and again approximately one year later.

    The grantees selected by BJA for the study include:

    State Agencies
    1. Kentucky Department of Corrections
    2. Oklahoma Department of Corrections
    3. South Dakota Department of Corrections

    Local Agencies
    4. Allegheny County (PA) Department of Human Services
    5. Marion County (OR) Sheriff's Office
    6. San Francisco (CA) Department of Public Health
    7. San Mateo County (CA) Division of Health and Recovery Services

    The outcomes at 18 months, measured through a survey of study participants and from administrative data, included services received, recidivism (re-arrest, reconviction, and re-incarceration), employment and earnings, housing stability, and self-reported health, among others. The outcomes measured one year later were drawn solely from administrative data and included recidivism and employment and earnings. Crime-related variables include the number and nature of convictions and time spent incarcerated. Other demographic variables include gender, age, race, ethnicity, education, income, marital status, and number of children.

    Presence of Common Scales: Several Likert-type scales were used.

    Response Rates: 82 percent (18 Month Follow-up Survey)

    Datasets: DS1: Dataset

    Adults who have been imprisoned in a state, local, or tribal prison who were convicted as an adult and are classified as being at medium or high risk of recidivism.

    Smallest Geographic Unit: none

    Those determined eligible for SCA were randomly assigned to either a program group or a control group. The study allowed each grantee to establish its own criteria for determining who was eligible for SCA. All those eligible were at medium or high risk of recidivism.

    Funding institution(s): United States Department of Justice. Office of Justice Programs. National Institute of Justice (2010-RY-BX-0003).

    record abstracts; computer-assisted personal interview (CAPI); computer-assisted telephone interview (CATI)

  5. i

    Seasonal Agriculture Survey 2015-2016 - Rwanda

    • catalog.ihsn.org
    Updated Sep 19, 2018
    + more versions
    Cite
    National Institute of Statistics of Rwanda (2018). Seasonal Agriculture Survey 2015-2016 - Rwanda [Dataset]. https://catalog.ihsn.org/index.php/catalog/7334
    Explore at:
    Dataset updated
    Sep 19, 2018
    Dataset authored and provided by
    National Institute of Statistics of Rwanda
    Time period covered
    2015 - 2016
    Area covered
    Rwanda
    Description

    Abstract

    The main objective of the Seasonal Agriculture Survey (SAS) 2015 was to provide timely, accurate, credible and comprehensive agricultural statistics that would describe the structure of agriculture in Rwanda in terms of land use, crop production and livestock, and that could be used both for food and agriculture policy formulation and planning and for the compilation of national accounts statistics.

    In this regard, the National Institute of Statistics of Rwanda (NISR) conducted the Seasonal Agriculture Survey (SAS) from November 2015 to October 2016 to gather up-to-date information for monitoring progress on agriculture programs and policies in Rwanda, including the Second Economic Development and Poverty Reduction Strategy (EDPRS II) and Vision 2020. This 2016 RSAS covered three agricultural seasons (A, B and C) and provides data on background characteristics of the agricultural operators, farm characteristics (area, yield and production), agricultural practices, agricultural equipment, and use of crop production by agricultural operators and by large-scale farmers.

    Geographic coverage

    National coverage

    Analysis unit

    This seasonal agriculture survey focused on the following units of analysis: agricultural operators and large-scale farmers.

    Universe

    The SAS 2016 targeted agricultural operators and large scale farmers operating in Rwanda.

    Kind of data

    Sample survey data [ssd]

    Sampling procedure

    The Seasonal Agriculture Survey (SAS) sample was composed of two categories of respondents: agricultural operators and large-scale farmers (LSF).

    For the 2016 SAS, NISR used a dual frame sampling design combining selected area frame sample segments and a list of large-scale farmers. NISR also used imagery from RNRA with a very high resolution of 25 centimeters to divide the total land of the country into twelve strata. A total of 540 segments were spread throughout the country as coverage of the survey, with 25,346 and 23,286 agricultural operators in Season A and Season B respectively. From these agricultural operators, sub-samples were selected during the second phases of Seasons A and B.

    It is important to note that in each of agricultural Seasons A and B, data collection was undertaken in two phases. Phase I was mainly used to collect data on demographic and social characteristics of interviewees, area under crops, crops planted, rainfall, livestock, etc. Phase II was mainly devoted to the collection of data on yield and production of crops.

    Phase I serves to collect data on area under different types of crops in the screening process, whereas Phase II is mainly devoted to the collection of data on demographic and social characteristics of interviewees, together with yields of the different crops produced. A total of 558 large-scale farmers (LSF) were enumerated in both 2015 Season A and B. The LSF were engaged in either crop farming activities only, livestock farming activities only, or both crop and livestock farming activities. Agricultural operators are the small-scale farmers within the sample segments. Every selected segment was first screened using the appropriate materials such as the segment maps, GIS devices and the screening form. Using these devices, the enumerators accounted for every plot inside the sample segments. All tracts were classified as either agricultural (cultivated land, pasture, and fallow land) or non-agricultural land (water, forests, roads, rocky and bare soils, and buildings). During Phase I, a complete enumeration of all farmers having agricultural land and operating within the 540 selected segments was undertaken, and a total of 25,495 and 24,911 agricultural operators were enumerated in Seasons A and B respectively. Season C considered only 152 segments, involving 3,445 agricultural operators.

    In Phase II, 50% of the large-scale farmers undertaking crop farming activities only and 50% of the large-scale farmers undertaking both crop and livestock farming were selected for interview. A sample of 199 and 194 large-scale farmers were interviewed in Seasons A and B, respectively, using a farm questionnaire. From the agricultural operators enumerated in the sample segments during Phase I, a sample of agricultural operators was designed for Phase II as follows: 5,502 for Season A, 5,337 for Season B and 644 for Season C. The method of probability proportional to size (PPS) sampling at the national level was used. Furthermore, the total number of enumerated large-scale farmers was 774 in 2016 Season A and 622 in Season B.

    Season C considered 152 segments containing 8,987 agricultural operators, from which 963 agricultural operators were selected for survey interviews.
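The sampling description above mentions probability proportional to size (PPS) selection without giving the procedure or the size measure. As a rough illustration only, the following Python sketch shows systematic PPS selection, one common variant. The function name, the example size values, and the choice of the systematic variant are assumptions and are not taken from the survey documentation.

```python
import random


def pps_systematic_sample(sizes, n, rng=None):
    """Pick n unit indices with probability proportional to each unit's size.

    Units are laid end to end on a line of length sum(sizes); n equally
    spaced points with a random start are dropped on the line, and each
    point selects the unit it falls in. A unit larger than the sampling
    interval can therefore be selected more than once.
    """
    total = sum(sizes)
    if n <= 0 or total <= 0:
        raise ValueError("need a positive sample size and positive total size")
    rng = rng or random.Random()
    step = total / n
    start = rng.random() * step  # random start in [0, step)
    points = [start + k * step for k in range(n)]

    selected = []
    cumulative = 0.0
    next_point = 0
    for index, size in enumerate(sizes):
        cumulative += size
        # Every point that falls before the running total belongs to this unit.
        while next_point < n and points[next_point] < cumulative:
            selected.append(index)
            next_point += 1
    return selected


# Hypothetical size measures (for example, operators per segment).
example_sizes = [10, 20, 30, 40]
example_pick = pps_systematic_sample(example_sizes, 2, random.Random(0))
```

Units with a size of zero can never be selected under this scheme, which is the expected behaviour for PPS designs.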

    Mode of data collection

    Face-to-face [f2f]

    Research instrument

    Two types of questionnaires were used for this survey: a Screening Questionnaire and Farm Questionnaires. The Screening Questionnaire was used to collect information that enabled identification of an agricultural operator or large-scale farmer and his or her land use.

    Farm questionnaires were of two types: a) the Phase I Farm Questionnaire, used to collect data on characteristics of agricultural operators, crop identification and area, and inputs (seeds, fertilizers, labor) for agricultural operators and large-scale farmers; b) the Phase II Farm Questionnaire, used to collect data on crop production and use of production.

    It is important to mention that all these farm questionnaires were subjected to two/three rounds of data quality checking. The first round was conducted by the enumerator and the second round was conducted by the team leader to check if questionnaires had been well completed by enumerators.

    For season C, after screening, an interview was conducted for each selected tract/agricultural operator using one consolidated Farm Questionnaire. All the survey questionnaires used were published in both English and Kinyarwanda languages.

    Cleaning operations

    Data editing took place at different stages. First, the completed questionnaires were returned to NISR for office editing and coding before data entry started. Data entry of the completed and checked questionnaires was undertaken at the NISR office by 20 staff trained in using the CSPro software. To ensure appropriate matching of data in the completed questionnaires and plot area measurements from the GIS unit, a "lookup" file was integrated in the CSPro data entry program to confirm the identification of each agricultural operator or LSF before starting data entry. Thereafter, data were entered in computers, edited and summarized in tables using SPSS and Excel.

    Response rate

    The response rate for the Seasonal Agriculture Survey is 98%.

    Data appraisal

    All Farm questionnaires were subjected to two/three rounds of data quality checking. The first round was conducted by the enumerator and the second round was conducted by the team leader to check if questionnaires had been well completed by enumerators. In most cases, questionnaires completed by one enumerator were peer-reviewed by another enumerator before being checked by the team leader.

  6. Z

    Database of Nightside, High-latitude Ionosphere Meso-scale Flow...

    • data-staging.niaid.nih.gov
    Updated Jan 21, 2020
    Cite
    Gabrielse, Christine; Nishimura, Toshi; Lyons, Larry; Gallardo-Lacourt, Bea; Deng, Yue; Pinto, Victor; Donovan, Eric (2020). Database of Nightside, High-latitude Ionosphere Meso-scale Flow Characteristics [Dataset]. https://data-staging.niaid.nih.gov/resources?id=zenodo_2539828
    Explore at:
    Dataset updated
    Jan 21, 2020
    Dataset provided by
    UCLA
    University of Texas at Arlington
    Boston University
    University of Calgary
    Authors
    Gabrielse, Christine; Nishimura, Toshi; Lyons, Larry; Gallardo-Lacourt, Bea; Deng, Yue; Pinto, Victor; Donovan, Eric
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This database is a compilation of nightside, high-latitude ionosphere meso-scale flow characteristics built on those used in Gabrielse et al. 2018. It is the most complete version. If you would like to use the database, please contact Christine Gabrielse (cgabrielse@ucla.edu, cgabrielse@gmail.com, and/or christine.gabrielse@aero.org). Depending on how the results are used, the main authors request co-authorship on publications that utilize this database.

    The methodology and selection criteria can be found in Gabrielse et al. 2018.

    The following list describes the columns in each data file labeled ***_FLOW-DATA-PCvsAO_YYYY.txt. The first three letters (RNK or SAS) designate the station used (Rankin Inlet or Saskatoon). Files named ***_FLOW-DATA-PCvsAO_poleward_YYYY.txt are for poleward-directed flows. Each text file is for a different year (YYYY).

    AO=Auroral Oval for Rankin Inlet; equatorward of the auroral oval for Saskatoon (not used)
    PC=Polar Cap for Rankin Inlet; Auroral Oval for Saskatoon

    (Note: the data files for RNK and SAS have the same format, so the PC designator means flows above the pertinent boundary (polar cap boundary for RNK, auroral oval equatorward boundary at SAS) and the AO designator means flows below the pertinent boundary.)

     time [YYYYMMDDhhmmss]
     flagAO [-1=flow could not be observed. 0=flow could be observed, but was not. 1=flow was observed]
     flagPC [-1=flow could not be observed. 0=flow could be observed, but was not. 1=flow was observed]
     FWHMavg_AO [degrees]
     FWHMkmavg_AO [km]
     longtestranges [ignore]
     Velmaxavg_AO [m/s, actual average of max V in each range gate used]
     VelmaxFITavg_AO [m/s, determined from the Gaussian fits]
     FWHMavg_PC [degrees]
     FWHMkmavg_PC [km]
     Velmaxavg_PC [m/s, actual average of max V in each range gate used]
     VelmaxFITavg_PC [m/s, determined from the Gaussian fits]
    

    For the bearings/orientation, see the orientation text files. The following four variables were calculated in a first step but are not those used in the paper. They were not found with the strict selection criteria. Please do not use.

     mbearingAO [degrees in magnetic coordinates; a negative value is South of East (clockwise from East), a positive value is North of East (counterclockwise)]
     mbearingPC [degrees in magnetic coordinates; a negative value is South of East (clockwise from East), a positive value is North of East (counterclockwise)]
     gbearingAO [degrees in geographic coordinates; a negative value is South of East (clockwise from East), a positive value is North of East (counterclockwise)]
     gbearingPC [degrees in geographic coordinates; a negative value is South of East (clockwise from East), a positive value is North of East (counterclockwise)]

     minlatAO [degrees, min geographic latitude of the flow]
     maxlatAO [degrees, max geographic latitude of the flow]
     minlatPC [degrees, min geographic latitude of the flow]
     maxlatPC [degrees, max geographic latitude of the flow]
     mltAO [degrees (MLT)]
     mltPC [degrees (MLT)]
     AE [nT]
     AL [nT]
     SYMH [nT]
     IMFBy [nT]
     IMFBz [nT]
     F107 [sfu]
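The file naming scheme, the `time` format, and the flag codes described above can be decoded with a short Python sketch such as the following. It is an illustration rather than tooling supplied with the database: the function names are hypothetical, the years in the example file names are made up, and the sketch assumes the `time` field is available as a 14-digit string, since the description does not state how the columns are delimited or stored.

```python
import re
from datetime import datetime

# Station codes and flag codes as given in the dataset description.
STATIONS = {"RNK": "Rankin Inlet", "SAS": "Saskatoon"}
FLAG_MEANING = {
    -1: "flow could not be observed",
    0: "flow could be observed, but was not",
    1: "flow was observed",
}

FLOW_FILENAME = re.compile(
    r"^(?P<station>RNK|SAS)_FLOW-DATA-PCvsAO_(?P<poleward>poleward_)?(?P<year>\d{4})\.txt$"
)


def parse_flow_filename(name):
    """Return (station name, is_poleward, year) for a flow data file name."""
    match = FLOW_FILENAME.match(name)
    if match is None:
        raise ValueError(f"not a recognised flow data file name: {name!r}")
    station = STATIONS[match.group("station")]
    is_poleward = match.group("poleward") is not None
    return station, is_poleward, int(match.group("year"))


def parse_time(stamp):
    """Convert a YYYYMMDDhhmmss string into a datetime object."""
    return datetime.strptime(stamp, "%Y%m%d%H%M%S")


# Hypothetical file names and timestamp, used only to exercise the functions.
example_file = parse_flow_filename("RNK_FLOW-DATA-PCvsAO_2014.txt")
example_poleward = parse_flow_filename("SAS_FLOW-DATA-PCvsAO_poleward_2016.txt")
example_time = parse_time("20140102030405")
```

Note that for Saskatoon files the PC columns refer to the auroral oval and the AO columns are not used, so any per-station analysis should relabel the regions accordingly.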

    The following list describes the columns in each data file labeled ***_orientation_YYYY.txt. Files named ***_orientation_poleward_YYYY.txt are for poleward-directed flows. Each text file is for a different year (YYYY). The orientation was determined when enough bearings between RGs were available. See Gabrielse et al. [2018] for description: https://doi.org/10.1029/2018JA025440

    AO=auroral oval
    PC=polar cap

     time [YYYYMMDDhhmmss]
     mbearingAO [degrees clockwise from magnetic North]
     gbearingAO [degrees clockwise from geographic North]
     mbearingPC [degrees clockwise from magnetic North]
     gbearingPC [degrees clockwise from geographic North]
    

    The following list describes the columns in each data file labeled, _SPEC_TEST__noRG1-2.txt

     time [YYYYMMDDhhmmss]
     RG [the range gate number at which the polar cap boundary was determined at RNK, or the auroral oval's equatorial boundary at SAS]
    