2 datasets found
  1. TravisTorrent

    • figshare.com
    application/gzip
    Updated Mar 6, 2022
    Cite
    Moritz Beller; Georgios Gousios; Andy Zaidman (2022). TravisTorrent [Dataset]. http://doi.org/10.6084/m9.figshare.19314170.v1
    Available download formats: application/gzip
    Dataset updated
    Mar 6, 2022
    Dataset provided by
    figshare
    Authors
    Moritz Beller; Georgios Gousios; Andy Zaidman
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The TravisTorrent treasure trove data sets from the publication: Beller, Moritz, Georgios Gousios, and Andy Zaidman. "TravisTorrent: Synthesizing Travis CI and GitHub for full-stack research on continuous integration." 2017 IEEE/ACM 14th International Conference on Mining Software Repositories (MSR). IEEE, 2017.

  2. Data from: PTMTorrent: A Dataset for Mining Open-source Pre-trained Model...

    • figshare.com
    pdf
    Updated May 30, 2023
    Cite
    Wenxin Jiang; Nicholas Synovic; Purvish Jajal; Taylor R. Schorlemmer; Arav Tewari; Bhavesh Pareek; George K. Thiruvathukal; James C. Davis (2023). PTMTorrent: A Dataset for Mining Open-source Pre-trained Model Packages [Dataset]. http://doi.org/10.6084/m9.figshare.22009880.v4
    Available download formats: pdf
    Dataset updated
    May 30, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Wenxin Jiang; Nicholas Synovic; Purvish Jajal; Taylor R. Schorlemmer; Arav Tewari; Bhavesh Pareek; George K. Thiruvathukal; James C. Davis
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Due to the cost of developing and training deep learning models from scratch, machine learning engineers have begun to reuse pre-trained models (PTMs) and fine-tune them for downstream tasks. PTM registries known as “model hubs” support engineers in distributing and reusing deep learning models. PTM packages include pre-trained weights, documentation, model architectures, datasets, and metadata. Mining the information in PTM packages will enable the discovery of engineering phenomena and tools to support software engineers. However, accessing this information is difficult — there are many PTM registries, and both the registries and the individual packages may have rate limiting for accessing the data.

    We present an open-source dataset, PTMTorrent, to facilitate the evaluation and understanding of PTM packages. This paper describes the creation, structure, usage, and limitations of the dataset. The dataset includes a snapshot of 5 model hubs and a total of 15,913 PTM packages. These packages are represented in a uniform data schema for cross-hub mining. We describe prior uses of this data and suggest research opportunities for mining using our dataset.

    We provide links to the PTM Dataset and PTM Torrent Source Code.



TravisTorrent

229 scholarly articles cite this dataset (View in Google Scholar)
