2 datasets found
  1. Medicare 20% [2006-2018] Enrollment/Summary (MBSF)

    • redivis.com
    application/jsonl +7
    Updated Dec 17, 2021
    Cite
    Stanford Center for Population Health Sciences (2021). Medicare 20% [2006-2018] Enrollment/Summary (MBSF) [Dataset]. http://doi.org/10.57761/wnn9-b060
    Explore at:
    Available download formats: avro, spss, sas, application/jsonl, csv, arrow, parquet, stata
    Dataset updated
    Dec 17, 2021
    Dataset provided by
    Redivis Inc.
    Authors
    Stanford Center for Population Health Sciences
    Time period covered
    Jan 1, 1999 - Dec 31, 2018
    Description

    Abstract

    Master Beneficiary Summary Files (MBSF)

    Usage

    This dataset page includes some of the tables from the Medicare Data in PHS's possession. Other Medicare tables are included on other dataset pages on the PHS Data Portal. Depending upon your research question and your DUA with CMS, you may only need tables from a subset of the Medicare dataset pages, or you may need tables from all of them.

    The location of each of the Medicare tables (i.e. a chart of which tables are included in each Medicare dataset page) is shown here.

    Before Manuscript Submission

    All manuscripts (and other items you'd like to publish) must be submitted to phsdatacore@stanford.edu for approval prior to journal submission.

    We will check your cell sizes and citations.

    For more information about how to cite PHS and PHS datasets, please visit:

    https://phsdocs.developerhub.io/need-help/citing-phs-data-core

    Documentation

    Metadata access is required to view this section.

    Section 2

    Metadata access is required to view this section.

    Usage Notes

    Metadata access is required to view this section.

  2. Dockerfiles

    • kaggle.com
    Updated Jun 22, 2018
    Cite
    Stanford Research Computing Center (2018). Dockerfiles [Dataset]. https://www.kaggle.com/datasets/stanfordcompute/dockerfiles
    Explore at:
    Croissant: a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Jun 22, 2018
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Stanford Research Computing Center
    Description

    Context

    The Dockerfiles dataset is a set of approximately 130,000 Dockerfiles extracted in early summer 2018 across a sampling of search prefixes. This dataset is released under an MIT license.

    $ find data -type f -name Dockerfile | wc -l
    129,519
    

    The files are hosted as public images on Docker Hub and thus freely available for download and parsing.

    Content

    The files are currently provided in their raw format, each named Dockerfile and organized by Docker Hub username. For example, here is the top level of folders under "data" in the repository:

    data
    ├── 0
    ├── 1
    ├── 2
    ├── 3
    ├── 4
    ├── 5
    ├── 6
    ├── 7
    ├── 8
    ├── 9
    ├── a
    ├── b
    ├── c
    ...
    
    ├── w
    ├── x
    ├── y
    └── z
    36 directories, 0 files
    

    and within each, we have folders that represent Docker Hub usernames:

    data/a
    ├── a13r
    ├── a13xx
    ├── a1exanderjung
    ...
    ├── azuresdk
    ├── azzanatsu
    └── azzra
    

    Each Docker Hub username then has subfolders named for its containers, and those subfolders contain the Dockerfiles (no pun intended).

    data/a/a13r
    ├── waecm-2018-group-16-bsp-1-backend
    │  └── Dockerfile
    ├── waecm-2018-group-16-bsp-1-frontend
    │  └── Dockerfile
    └── waecm-2018-group-16-bsp-1-revproxy
      └── Dockerfile
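
The layout above can be walked programmatically. The following is a minimal sketch, assuming a local copy of the repository whose `data` folder follows the `data/<prefix>/<username>/<container>/Dockerfile` pattern shown; the names `DockerfileEntry` and `iter_dockerfiles` are hypothetical helpers for illustration, not part of the dataset.

```python
from pathlib import Path
from typing import Iterator, NamedTuple


class DockerfileEntry(NamedTuple):
    """One Dockerfile, identified by the folder names above it (hypothetical helper)."""
    username: str
    container: str
    path: Path


def iter_dockerfiles(root: Path) -> Iterator[DockerfileEntry]:
    """Yield an entry for each <root>/<prefix>/<username>/<container>/Dockerfile."""
    # Assumption: exactly three folder levels sit between the root and each file.
    for path in sorted(root.glob("*/*/*/Dockerfile")):
        if path.is_file():
            container_dir = path.parent
            yield DockerfileEntry(container_dir.parent.name, container_dir.name, path)
```

For example, `sum(1 for _ in iter_dockerfiles(Path("data")))` gives a file count comparable to the `find` command shown earlier, provided every file sits at that depth.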
    

    Download

    Since this dataset (despite the huge number of files!) still fits in a GitHub repository, the files are provided as is under version control, and don't require any special downloading aside from cloning the repo or downloading the archive.

    git clone https://www.github.com/vsoch/datasets
    wget https://github.com/vsoch/dockerfiles/archive/1.0.0.zip
    wget https://github.com/vsoch/dockerfiles/archive/1.0.0.tar.gz
    

    Acknowledgements

    Thanks for reading! If you have other questions, or want help for your project, please don't hesitate to reach out. If the dataset is useful to you, we have a Zenodo reference:

    DOI

    Inspiration

    Many of the same questions about signatures of software can be tested on, or are generally relevant to, this dataset. Additionally, we might ask the following:

    • How do containers relate to (or inherit from) one another? For example, if we use the FROM statements to build a graph, what interesting things do we find?
    • What signatures (of installation routines?) are common across different containers?
    • Can we classify different operating systems, domains of science, or package managers?
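
As a starting point for the first question, the sketch below extracts base images from FROM statements and collects them as (image, base) edges. It is a rough illustration only: the regular expression handles plain `FROM image` lines and an optional `--platform=` flag, and it does not resolve build arguments or multi-stage aliases. The function names are hypothetical, and treating `<username>/<container>` as the image name is an assumption about how the folders map to images.

```python
import re
from collections import Counter
from typing import Iterable, List, Tuple

# Matches a FROM instruction at the start of a line (case-insensitive),
# skipping an optional --platform=... flag before the image reference.
FROM_RE = re.compile(
    r"^[ \t]*FROM[ \t]+(?:--platform=\S+[ \t]+)?(\S+)",
    re.IGNORECASE | re.MULTILINE,
)


def base_images(dockerfile_text: str) -> List[str]:
    """Return the image referenced by each FROM line, in file order."""
    return FROM_RE.findall(dockerfile_text)


def from_edges(dockerfiles: Iterable[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Turn (image_name, dockerfile_text) pairs into (image_name, base) edges."""
    edges = []
    for name, text in dockerfiles:
        for base in base_images(text):
            edges.append((name, base))
    return edges


def most_common_bases(edges: Iterable[Tuple[str, str]], n: int = 10) -> List[Tuple[str, int]]:
    """Count how often each base image appears as the target of an edge."""
    return Counter(base for _, base in edges).most_common(n)
```

The resulting edge list can be loaded into any graph library; counting edge targets alone already shows which base images are inherited from most often.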

    Resources

