17 datasets found
  1. WAVES

    • stanford.redivis.com
    • redivis.com
    application/jsonl +7
    Updated Apr 18, 2021
    Cite
    Stanford WAVES initiative (2021). WAVES [Dataset]. http://doi.org/10.57761/5tdn-yy04
    Available download formats: avro, stata, csv, sas, arrow, spss, parquet, application/jsonl
    Dataset updated
    Apr 18, 2021
    Dataset provided by
    Redivis Inc.
    Authors
    Stanford WAVES initiative
    Description

    Abstract

    WAVES is a pediatric physiological waveform dataset containing ECG, respiratory, plethysmogram, arterial blood pressure, and a variety of other high-frequency waveforms extracted from bedside monitors.

    Usage

    For code examples and documentation, please refer to the WAVES utilities python package or its associated source code.

    Waveform data is stored in compressed and base-64 encoded .csv files that cannot be properly loaded and decompressed using standard csv libraries. The utility codebase provides data loaders to interface with the raw data, and usage examples like basic plotting.
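
    As a rough illustration of why a standard csv reader is not enough, the sketch below base-64 decodes a field and then decompresses it. The compression codec is not stated here, so zlib is purely an assumption, and `decode_waveform_field` is a hypothetical helper; the data loaders in the WAVES utilities package remain the supported way to read the real files.

```python
import base64
import zlib


def decode_waveform_field(encoded: str) -> bytes:
    """Base-64 decode a field, then decompress it.

    zlib is an assumption for illustration; the actual WAVES codec
    is defined by the WAVES utilities package, not by this sketch.
    """
    compressed = base64.b64decode(encoded)
    return zlib.decompress(compressed)


# Round-trip demo on synthetic data (not real WAVES samples).
raw = b"0.12,0.15,0.11,0.09"
encoded = base64.b64encode(zlib.compress(raw)).decode("ascii")
decoded = decode_waveform_field(encoded)
```

    The encoded string is what a plain csv reader would hand back for such a column, which is why an extra decoding step is needed before the samples can be used.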

    As an unrestricted preview of the dataset, the WAVES utilities code includes a very small sample dataset .csv file in the same format you would receive when extracting, filtering, and downloading a waveform dataset .csv file from Redivis. The "Supporting files" section of the WAVES dataset on Redivis also includes a larger subset of ~25 samples running for roughly 8 hours each.

    BY DOWNLOADING THE SAMPLE DATA FILE, YOU ARE AGREEING TO THE TERMS OF THE PROVIDED DATA USE AGREEMENT (DUA)

    Documentation

    Initial release of WAVES to validate and document user access

  2. Regrid nationwide parcel data

    • redivis.com
    application/jsonl +7
    Updated Mar 20, 2023
    Cite
    Stanford University Libraries (2023). Regrid nationwide parcel data [Dataset]. http://doi.org/10.57761/h0be-7n22
    Available download formats: parquet, spss, avro, csv, sas, stata, application/jsonl, arrow
    Dataset updated
    Mar 20, 2023
    Dataset provided by
    Redivis Inc.
    Authors
    Stanford University Libraries
    Description

    Abstract

    The data comprises all the nationwide parcel shapes, owner information, address, and any columns related to ownership or public/private land designations, added during the term of the license agreement. The attributes available may vary from parcel to parcel. Data are updated monthly and made available in GeoJSON format.

    Methodology

    Refer to LoveLand’s FAQs on Parcel Data for more information on the data, how it is collected, standardized, etc.

    Usage

    Parcel data are made available as tables for each state and the District of Columbia.

    Table names use the abbreviation for each state followed by the month and year of the update.

    Bulk data access

    Individual parcel data files are available in four formats: GeoJSON, Geopackage, Geodatabase and Shapefile. Instructions for accessing data can be found here:

    https://code.stanford.edu/sul-socialsciences/landgrid-parcel-data#access

  3. OLD-INSPECT: A Multimodal Dataset for Pulmonary Embolism Diagnosis and...

    • stanfordaimi.azurewebsites.net
    Updated May 30, 2024
    + more versions
    Cite
    Microsoft Research (2024). OLD-INSPECT: A Multimodal Dataset for Pulmonary Embolism Diagnosis and Prognosis [Dataset]. https://stanfordaimi.azurewebsites.net/datasets/318f3464-c4b6-4006-9856-6f48ba40ad67
    Dataset updated
    May 30, 2024
    Dataset authored and provided by
    Microsoft Research
    License

    https://aimistanford-web-api.azurewebsites.net/licenses/f1f352a6-243f-4905-8e00-389edbca9e83/view

    Description

    Synthesizing information from various data sources plays a crucial role in the practice of modern medicine. Current applications of artificial intelligence in medicine often focus on single-modality data due to a lack of publicly available, multimodal medical datasets. To address this limitation, we introduce INSPECT, which contains de-identified longitudinal records from a large cohort of pulmonary embolism (PE) patients, along with ground truth labels for multiple outcomes. INSPECT contains data from 19,438 patients, including CT images, sections of radiology reports, and structured electronic health record (EHR) data (including demographics, diagnoses, procedures, and vitals). Using our provided dataset, we develop and release a benchmark for evaluating several baseline modeling approaches on a variety of important PE-related tasks. We evaluate image-only, EHR-only, and fused models. Trained models and the de-identified dataset are made available for non-commercial use under a data use agreement. To the best of our knowledge, INSPECT is the largest multimodal dataset for enabling reproducible research on strategies for integrating 3D medical imaging and EHR data.

    NOTE: This is the first part of the release, due to PHI review. This release has 20,078 CT scans and 21,266 impression sections; the EHR modality data will be uploaded to the Stanford Redivis website (https://redivis.com/Stanford).

  4. Data from: In an age of open access to research policies: physician and...

    • data.niaid.nih.gov
    • datadryad.org
    • +1 more
    zip
    Updated Jul 2, 2016
    Cite
    Laura L. Moorhead; Cheryl Holzmeyer; Lauren A. Maggio; Ryan M. Steinberg; John M. Willinsky; John Willinsky (2016). In an age of open access to research policies: physician and public health NGO staff research use and policy awareness [Dataset]. http://doi.org/10.5061/dryad.7g984
    Available download formats: zip
    Dataset updated
    Jul 2, 2016
    Dataset provided by
    Stanford University School of Medicine
    Stanford University
    Authors
    Laura L. Moorhead; Cheryl Holzmeyer; Lauren A. Maggio; Ryan M. Steinberg; John M. Willinsky; John Willinsky
    License

    https://spdx.org/licenses/CC0-1.0.html

    Description

    Introduction: Through funding agency and publisher policies, an increasing proportion of the health sciences literature is being made open access. Such an increase in access raises questions about the awareness and potential utilization of this literature by those working in health fields.

    Methods: A sample of physicians (N=336) and public health non-governmental organization (NGO) staff (N=92) were provided with relatively complete access to the research literature indexed in PubMed, as well as access to the point-of-care service UpToDate, for up to one year, with their usage monitored through the tracking of web-log data. The physicians also participated in a one-month trial of relatively complete or limited access.

    Results: The study found that participants' research interests were not satisfied by article abstracts alone nor, in the case of the physicians, by a clinical summary service such as UpToDate. On average, a third of the physicians viewed research a little more frequently than once a week, while two-thirds of the public health NGO staff viewed more than three articles a week. Those articles were published since the 2008 adoption of the NIH Public Access Policy, as well as prior to 2008 and during the maximum 12-month embargo period. A portion of the articles in each period was already open access, but complete access encouraged a viewing of more research articles.

    Conclusion: Those working in health fields will utilize more research in the course of their work as a result of (a) increasing open access to research, (b) improving awareness of and preparation for this access, and (c) adjusting public and open access policies to maximize the extent of potential access, through reduction in embargo periods and access to pre-policy literature.

  5. Data from: chexpert

    • tensorflow.org
    • opendatalab.com
    • +1 more
    Updated Feb 1, 2019
    Cite
    (2019). chexpert [Dataset]. https://www.tensorflow.org/datasets/catalog/chexpert
    Dataset updated
    Feb 1, 2019
    Description

    CheXpert is a large dataset of chest X-rays and competition for automated chest x-ray interpretation, which features uncertainty labels and radiologist-labeled reference standard evaluation sets. It consists of 224,316 chest radiographs of 65,240 patients, where the chest radiographic examinations and the associated radiology reports were retrospectively collected from Stanford Hospital. Each report was labeled for the presence of 14 observations as positive, negative, or uncertain. We decided on the 14 observations based on the prevalence in the reports and clinical relevance.

    The CheXpert dataset must be downloaded separately after reading and agreeing to a Research Use Agreement. To do so, please follow the instructions on the website, https://stanfordmlgroup.github.io/competitions/chexpert/.

    To use this dataset:

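    import tensorflow_datasets as tfds

    # CheXpert must already be downloaded manually (see the Research Use Agreement above).
    ds = tfds.load('chexpert', split='train')
    # Print the first four examples.
    for ex in ds.take(4):
        print(ex)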
    

    See the guide for more information on tensorflow_datasets.

  6. Contracted Williamson Act Parcels: Santa Clara County, California, 2015

    • searchworks.stanford.edu
    zip
    Updated Dec 14, 2020
    Cite
    (2020). Contracted Williamson Act Parcels: Santa Clara County, California, 2015 [Dataset]. https://searchworks.stanford.edu/view/qr197tf8065
    Available download formats: zip
    Dataset updated
    Dec 14, 2020
    Area covered
    Santa Clara County, California
    Description

    This polygon shapefile depicts Williamson Act Parcels with an ongoing contract within the unincorporated areas of the County of Santa Clara, California. The California Land Conservation Act, better known as the Williamson Act, has been California’s agricultural land protection program since its enactment in 1965. The Williamson Act preserves agricultural and open space lands through property tax incentives and voluntary restrictive use contracts. Private landowners voluntarily restrict their land to agricultural and compatible open space uses under minimum ten-year rolling-term contracts with local governments. In return, restricted parcels are assessed for property tax purposes at a rate consistent with their actual use, rather than potential market value. The purpose of the Williamson Act is to preserve the County’s prime soils and intensive high-value agricultural operations, and to discourage premature and unnecessary conversion of agricultural land to urban use. This layer is part of a collection of GIS data for Santa Clara County, California.

  7. New York State HCUP

    • redivis.com
    avro, csv, ndjson +4
    Updated Jan 23, 2023
    Cite
    Orthopaedic Biostatistics Data Portal (2023). New York State HCUP [Dataset]. https://redivis.com/datasets/hn7z-4fz9mdfh3
    Available download formats: stata, sas, csv, spss, parquet, avro, ndjson
    Dataset updated
    Jan 23, 2023
    Authors
    Orthopaedic Biostatistics Data Portal
    Time period covered
    Jan 1, 2016 - Dec 31, 2020
    Area covered
    New York
    Description

    Abstract

    Currently, we have 2016 - 2020 New York State HCUP databases. Additional data will be added as it becomes available.

    Documentation

    Data Documentation is on the HCUP Website:

    State Inpatient Databases (SID)

    SID Overview:

    https://www.hcup-us.ahrq.gov/sidoverview.jsp

    NY SID Description:

    https://www.hcup-us.ahrq.gov/db/state/siddist/siddist_filecompny.jsp

    SID Data Elements by State:

    https://www.hcup-us.ahrq.gov/db/state/siddist/siddistvarnote2016.jsp

    https://www.hcup-us.ahrq.gov/db/state/siddist/siddistvarnote2017.jsp

    State Ambulatory Surgery & Services Databases (SASD)

    SASD Overview:

    https://www.hcup-us.ahrq.gov/db/state/sasddbdocumentation.jsp

    NY SASD Description:

    https://www.hcup-us.ahrq.gov/db/state/sasddist/sasddist_filecompny.jsp

    SASD Data Elements by State:

    https://www.hcup-us.ahrq.gov/db/state/sasddist/sasddistvarnote2016.jsp

    https://www.hcup-us.ahrq.gov/db/state/sasddist/sasddistvarnote2017.jsp

    State Emergency Department Databases (SEDD)

    SEDD Overview:

    https://www.hcup-us.ahrq.gov/db/state/sedddbdocumentation.jsp

    NY SEDD Description:

    https://www.hcup-us.ahrq.gov/db/state/sedddist/sedddist_filecompny.jsp

    SEDD Data Elements by State:

    https://www.hcup-us.ahrq.gov/db/state/sedddist/sedddistvarnote2016.jsp

    https://www.hcup-us.ahrq.gov/db/state/sedddist/sedddistvarnote2017.jsp

    Exports are approved on a case-by-case basis; allow 24 hours for approval. The use of exported data is only permitted for approved projects. An updated data use agreement must be submitted to HCUP by the dataset administrator before proceeding to work with the data for new projects. To initiate a new project or line of inquiry, email ortho_biostats@stanford.edu.

    Passing of data to other users outside of the system is not permitted without permission of the data administrator (Jayme Koltsov). All users must complete the HCUP tutorial and data use agreement prior to data access. Email ortho_biostats@stanford.edu to obtain data access.

  8. Sentiment Analysis Test Dataset Created from Two COVID-19 Surveys: National...

    • figshare.com
    xlsx
    Updated Jan 9, 2024
    Cite
    Juan Antonio Lossio-Ventura; Rachel Weger; Angela Lee; Emily Guinee; Joyce Chung; Atlas, Lauren; Eleni Linos; Francisco Pereira (2024). Sentiment Analysis Test Dataset Created from Two COVID-19 Surveys: National Institutes of Health (NIH) and Stanford University [Dataset]. http://doi.org/10.6084/m9.figshare.24560584.v2
    Available download formats: xlsx
    Dataset updated
    Jan 9, 2024
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Juan Antonio Lossio-Ventura; Rachel Weger; Angela Lee; Emily Guinee; Joyce Chung; Atlas, Lauren; Eleni Linos; Francisco Pereira
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Two COVID-19 surveys were used to create the test dataset, both collected by teams from the National Institutes of Health (NIH) and Stanford University. The collected data were intended to assess the general topics experienced by participants during the pandemic lockdown. The test dataset comprises a total of 1,000 randomly chosen sentences, with 500 sentences selected from each survey. Each set was annotated by three separate and independent annotators. The annotators were instructed to assess the polarity of each sentence on a scale of -1 (negative), 0 (neutral), or 1 (positive). We then followed a three-step procedure to determine the final labels. First, if all three annotators agreed on a label (full agreement), that label was accepted. Second, if two out of the three agreed on a label (partial agreement), that label was also accepted. Third, if there was no agreement, the label was set as neutral (no agreement).
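
    The three-step labeling rule described above amounts to a majority vote with a neutral fallback. A minimal sketch, assuming each sentence carries exactly three annotations in {-1, 0, 1}; the function name `final_label` is hypothetical and not taken from the dataset's own code:

```python
from collections import Counter


def final_label(annotations):
    """Resolve three annotator labels (-1, 0, or 1) into one final label."""
    label, count = Counter(annotations).most_common(1)[0]
    # Full agreement (3 of 3) or partial agreement (2 of 3): accept that label.
    if count >= 2:
        return label
    # No agreement: the label is set to neutral.
    return 0


full = final_label([1, 1, 1])
partial = final_label([-1, -1, 0])
none = final_label([-1, 0, 1])
```

    Note that the fallback and a genuine neutral majority both yield 0, so the final label alone does not reveal whether the annotators agreed.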

  9. CoreLogic Loan-Level Market Analytics

    • redivis.com
    application/jsonl +7
    Updated Aug 15, 2024
    Cite
    Stanford University Libraries (2024). CoreLogic Loan-Level Market Analytics [Dataset]. http://doi.org/10.57761/a96q-1j33
    Available download formats: avro, sas, spss, stata, arrow, parquet, csv, application/jsonl
    Dataset updated
    Aug 15, 2024
    Dataset provided by
    Redivis Inc.
    Authors
    Stanford University Libraries
    Description

    Abstract

    The CoreLogic Loan-Level Market Analytics (LLMA) for primary mortgages dataset contains detailed loan data, including origination, events, performance, forbearance and inferred modification data.

    Methodology

    CoreLogic sources the Loan-Level Market Analytics data directly from loan servicers. CoreLogic cleans and augments the contributed records with modeled data. The Data Dictionary indicates which fields are contributed and which are inferred.

    The Loan-Level Market Analytics data is aimed at providing lenders, servicers, investors, and advisory firms with the insights they need to make trustworthy assessments and accurate decisions. Stanford Libraries has purchased the Loan-Level Market Analytics data for researchers interested in housing, economics, finance and other topics related to prime and subprime first lien data.

    CoreLogic provided the data to Stanford Libraries as pipe-delimited text files, which we have uploaded to Data Farm (Redivis) for preview, extraction and analysis.
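
    For readers unfamiliar with pipe-delimited text, the sketch below parses such a file with Python's standard csv module. The column names and values are invented for illustration and are not taken from the LLMA Data Dictionary.

```python
import csv
import io

# Hypothetical two-line sample; real LLMA column names will differ.
sample = io.StringIO(
    "loan_id|origination_date|original_balance\n"
    "1001|2015-03-01|250000\n"
)

# delimiter="|" tells the reader to split fields on the pipe character.
reader = csv.DictReader(sample, delimiter="|")
rows = list(reader)
```

    Every value is read as a string, so numeric and date fields need an explicit conversion step afterwards.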

    For more information about how the data was prepared for Redivis, please see CoreLogic 2024 GitLab.

    Usage

    Per the End User License Agreement, the LLMA Data cannot be commingled (i.e. merged, mixed or combined) with Tax and Deed Data that Stanford University has licensed from CoreLogic, or other data which includes the same or similar data elements or that can otherwise be used to identify individual persons or loan servicers.

    The 2015 major release of CoreLogic Loan-Level Market Analytics (for primary mortgages) was intended to enhance the CoreLogic servicing consortium through data quality improvements and integrated analytics. See CL_LLMA_ReleaseNotes.pdf for more information about these changes.

    For more information about included variables, please see CL_LLMA_Data_Dictionary.pdf.

    For more information about how the database was set up, please see LLMA_Download_Guide.pdf.

    Bulk Data Access

    Data access is required to view this section.

  10. Replication Data for: Monetary Policy and the Redistribution Channel

    • openicpsr.org
    stata
    Updated Apr 6, 2019
    + more versions
    Cite
    Adrien Auclert (2019). Replication Data for: Monetary Policy and the Redistribution Channel [Dataset]. http://doi.org/10.3886/E109242V3
    Available download formats: stata
    Dataset updated
    Apr 6, 2019
    Dataset provided by
    Stanford University
    Authors
    Adrien Auclert
    Description

    Top-level PSID Stata dataset used for the analysis in A. Auclert, "Monetary Policy and the Redistribution Channel", American Economic Review, June 2019.

  11. ImageNet 1K TFRecords 256x256

    • kaggle.com
    Updated Sep 21, 2022
    Cite
    John Park (2022). ImageNet 1K TFRecords 256x256 [Dataset]. https://www.kaggle.com/datasets/parkjohnychae/imagenet1k-tfrecords-256x256
    Dataset updated
    Sep 21, 2022
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    John Park
    Description

    "ImageNet is an image database organized according to the WordNet hierarchy (currently only the nouns), in which each node of the hierarchy is depicted by hundreds and thousands of images. The project has been instrumental in advancing computer vision and deep learning research. The data is available for free to researchers for non-commercial use." (https://www.image-net.org/index.php)

    I do not hold any copyright to this dataset. This data is just a re-distribution of the data Imagenet.org shared on Kaggle. Please note that some of the ImageNet1K images are under copyright.

    This version of the data is directly sourced from Kaggle, excluding the bounding box annotations. Therefore, only images and class labels are included.

    All images are resized to 256 x 256.

    Integer labels are assigned after ordering the class names alphabetically.
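
    The label assignment can be pictured as sorting the class names and using each name's position as its integer label. A minimal sketch with three example class names; the exact sort key used for this dataset (for instance case handling, or WordNet IDs versus readable names) is not stated here, so plain string ordering is an assumption:

```python
# Three example class names standing in for the full list of 1,000.
class_names = ["tench", "goldfish", "great white shark"]

# Sort alphabetically, then use each name's position as its integer label.
label_of = {name: index for index, name in enumerate(sorted(class_names))}
```

    With this mapping, `label_of["goldfish"]` is 0 and `label_of["tench"]` is 2, regardless of the order in which the names were originally listed.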

    Please note that anyone using this data abides by the original terms:

    RESEARCHER_FULLNAME has requested permission to use the ImageNet database (the "Database") at Princeton University and Stanford University. In exchange for such permission, Researcher hereby agrees to the following terms and conditions:

    1. Researcher shall use the Database only for non-commercial research and educational purposes.
    2. Princeton University and Stanford University make no representations or warranties regarding the Database, including but not limited to warranties of non-infringement or fitness for a particular purpose.
    3. Researcher accepts full responsibility for his or her use of the Database and shall defend and indemnify the ImageNet team, Princeton University, and Stanford University, including their employees, Trustees, officers and agents, against any and all claims arising from Researcher's use of the Database, including but not limited to Researcher's use of any copies of copyrighted images that he or she may create from the Database.
    4. Researcher may provide research associates and colleagues with access to the Database provided that they first agree to be bound by these terms and conditions.
    5. Princeton University and Stanford University reserve the right to terminate Researcher's access to the Database at any time.
    6. If Researcher is employed by a for-profit, commercial entity, Researcher's employer shall also be bound by these terms and conditions, and Researcher hereby represents that he or she is fully authorized to enter into this agreement on behalf of such employer.
    7. The law of the State of New Jersey shall apply to all disputes under this agreement.
    
    The images are processed using TPU VM (https://cloud.google.com/tpu/docs/users-guide-tpu-vm) with the support of Google's TPU Research Cloud (https://sites.research.google/trc/about/).
    
  12. HUD Section 202 Properties, 2023

    • searchworks.stanford.edu
    zip
    Updated Jul 19, 2025
    + more versions
    Cite
    (2025). HUD Section 202 Properties, 2023 [Dataset]. https://searchworks.stanford.edu/view/yd619hz3033
    Available download formats: zip
    Dataset updated
    Jul 19, 2025
    Description

    This map denotes the locations of HUD assisted Multi-Family properties that primarily serve elderly residents. In addition, each property illustrated through this service has at least one active Service Coordinator contract or grant, Section 236 loan, Section 8 202 contract, Section 8 Farmers Home Administration (FMHA) 515 contract, Section 8 New Construction contract, Section 202 Project Assistance Contracts (PAC) contract, and Section 202 Project Rental Assistance Contract (PRAC).

    Please note that the data provided through this map only includes location data and attributes for those addresses that can be geocoded to an interpolated point along a street segment, or to a ZIP+4 centroid location. While not all records are able to be geocoded and mapped, we are continuously working to improve the address data quality and enhance coverage. Please consider this issue when using any datasets provided by HUD.

    To learn more about the Section 202 Program visit: https://www.hud.gov/program_offices/housing/mfh/progdesc/eld202

    Data Dictionary: DD_Multifamily Properties
    Date of Coverage: 12/2019
    Data Updated: Quarterly

  13. Occurrence of polymorphisms at drug resistance sites.

    • figshare.com
    xls
    Updated Jun 1, 2023
    Cite
    Kahsay Huruy; Andargachew Mulu; Uwe Gerd Liebert; Maier Melanie (2023). Occurrence of polymorphisms at drug resistance sites. [Dataset]. http://doi.org/10.1371/journal.pone.0205119.t004
    Available download formats: xls
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Kahsay Huruy; Andargachew Mulu; Uwe Gerd Liebert; Maier Melanie
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Occurrence of polymorphisms at drug resistance sites.

  14. TDR frequency according to different algorithms.

    • plos.figshare.com
    xls
    Updated Jun 1, 2023
    Cite
    Kahsay Huruy; Andargachew Mulu; Uwe Gerd Liebert; Maier Melanie (2023). TDR frequency according to different algorithms. [Dataset]. http://doi.org/10.1371/journal.pone.0205119.t003
    Available download formats: xls
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Kahsay Huruy; Andargachew Mulu; Uwe Gerd Liebert; Maier Melanie
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    TDR frequency according to different algorithms.

  15. DAMP-VP1k: Digital Archive of Mobile Performances - Smule Vocal Performances...

    • zenodo.org
    Updated Jan 24, 2020
    + more versions
    Cite
    Inc. Smule; Inc. Smule (2020). DAMP-VP1k: Digital Archive of Mobile Performances - Smule Vocal Performances 100x10 [Dataset]. http://doi.org/10.5281/zenodo.2533419
    Dataset updated
    Jan 24, 2020
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Inc. Smule; Inc. Smule
    Description

    DAMP-VP1k (Digital Archive of Mobile Performances) is an archive of Smule vocal performances: 100 singers, 10 performances each.

    The dataset contains sung musical performances from the Smule app. Data files include audio.zip with all the compressed audio files, and metadata.csv describing some metadata about each performance, including unique identifiers for each recording, song, and singer, as well as binary gender labels, region labels, and social "love" counts from the Smule app.

    This archive is a subset of the DAMP-Multiple Songs archive hosted at https://ccrma.stanford.edu/damp/, which contains multiple performances from each of multiple singers, singing different songs, without much verification of the data. This subset has been reduced to a subset of 10 performances per singer, with cleaner recordings for each singer preselected.

    Users of this dataset must read and accept Smule's Research Data License Agreement (LICENSE.txt).

  16. Bidding and Tendering Data of China 中国各地区招投标数据

    • redivis.com
    application/jsonl +7
    Updated Oct 18, 2024
    Cite
    Stanford University Libraries (2024). Bidding and Tendering Data of China 中国各地区招投标数据 [Dataset]. http://doi.org/10.57761/t87y-6b63
    Available download formats: csv, spss, avro, stata, sas, application/jsonl, parquet, arrow
    Dataset updated
    Oct 18, 2024
    Dataset provided by
    Redivis Inc.
    Authors
    Stanford University Libraries
    Area covered
    China
    Description

    Abstract

    The Bidding and Tendering Data of China was released on the Chinese Open Data Platform (CnOpenData). The dataset integrates bidding information from more than 100 bidding websites. It also includes a large amount of bidding and tendering information from unofficial platforms. Variables include unit, agency, winning bidder, process status, process status annotation, creation date, tender opening date, industry type, project number, project title, province, budget, etc.

    Methodology

    The raw data were wrangled for inclusion in Data Farm. For more information, please see CnOpenData GitLab.

    Bulk Data Access

    Data access is required to view this section.

  17. Unrestricted Data and Code for Hwang, J. and B. Shrimali. 2022. "Shared and...

    • purl.stanford.edu
    Updated Nov 3, 2022
    Cite
    Jackelyn Hwang; Bina Shrimali (2022). Unrestricted Data and Code for Hwang, J. and B. Shrimali. 2022. "Shared and Crowded Housing in the Bay Area: Where Gentrification and the Housing Crisis Meet COVID-19" [Dataset]. http://doi.org/10.25740/cw226nt8831
    Dataset updated
    Nov 3, 2022
    Authors
    Jackelyn Hwang; Bina Shrimali
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    San Francisco Bay Area
    Description

    Replication material for Jackelyn Hwang & Bina Patel Shrimali (2022) Shared and Crowded Housing in the Bay Area: Where Gentrification and the Housing Crisis Meet COVID-19, Housing Policy Debate, DOI: 10.1080/10511482.2022.2099934

    Paper Abstract: Amid the growing affordable housing crisis and widespread gentrification over the last decade, people have been moving less than before and increasingly live in shared and often crowded households across the U.S. Crowded housing has various negative health implications, including stress, sleep disorders, and infectious diseases. Difference-in-difference analysis of a unique, large-scale longitudinal consumer credit database of over 450,000 San Francisco Bay Area residents from 2002 to 2020 shows gentrification affects the probability of residents shifting to crowded households across the socioeconomic spectrum but in different ways than expected. Gentrification is negatively associated with low-socioeconomic status (SES) residents’ probability of entering crowded households, and this is largely explained by increased shifts to crowded households in neighborhoods outside of major cities showing early signs of gentrification. Conversely, gentrification is associated with increases in the probability that middle-SES residents enter crowded households, primarily in Silicon Valley. Lastly, crowding is positively associated with COVID-19 case rates, beyond density and socioeconomic and racial composition in neighborhoods, although the role of gentrification remains unclear. Housing policies that mitigate crowding can serve as early interventions in displacement prevention and reducing health inequities.
