100+ datasets found
  1. GitHub Programming Languages Data

    • kaggle.com
    zip
    Updated Jan 2, 2022
  2. CodeContests Dataset

    • paperswithcode.com
    • opendatalab.com
    Updated May 23, 2023
  3. Developers and programming languages

    • kaggle.com
    zip
    Updated Dec 3, 2017
  4. Programming Languages

    • figshare.com
    zip
    Updated Jun 1, 2023
  5. APPS Dataset

    • paperswithcode.com
    Updated Mar 30, 2023
  6. Most widely utilized programming languages among developers worldwide 2023

    • statista.com
    Updated Jul 19, 2023
  7. Codeforces Competitive Programming Dataset

    • kaggle.com
    zip
    Updated Jun 27, 2023
  8. programming-languages-keywords

    • huggingface.co
    Updated Apr 27, 2023
  9. Python Programming Puzzles (P3) Dataset

    • paperswithcode.com
    Updated Jun 9, 2021
  10. Programming languages used for software development worldwide 2022

    • statista.com
    Updated Feb 20, 2023
  11. Data from: An Empirical Evaluation of Competitive Programming AI: A Case...

    • zenodo.org
    zip
    Updated Jul 12, 2022
  12. Project CodeNet Dataset

    • paperswithcode.com
    Updated May 24, 2021
  13. Programming Homework Dataset for Plagiarism Detection

    • ieee-dataport.org
    Updated May 8, 2020
  14. xlcost-text-to-code

    • huggingface.co
    Updated Apr 3, 2023
  15. Replication Kit: "Skill Models for Programming Language Concepts"

    • zenodo.org
    zip
    Updated Jan 18, 2019
  16. Mostly Basic Python Problems

    • research.google
    Updated Sep 17, 2021
  17. code_search_net

    • huggingface.co
    Updated Dec 12, 2020
  18. Most frequently required software languages in job offers in Poland 2022

    • statista.com
    Updated Oct 27, 2022
  19. Data from: A Programming Framework for Physics

    • figshare.com
    pdf
    Updated Jan 20, 2016
  20. Data from: A Decision Model for Programming Language Ecosystem Selection:...

    • data.mendeley.com
    • commons.datacite.org
    Updated Nov 1, 2020
Cite
Isaac Wen (2022). GitHub Programming Languages Data [Dataset]. https://www.kaggle.com/datasets/isaacwen/github-programming-languages-data

GitHub Programming Languages Data

Statistics for Programming Languages used on GitHub

Available download formats: zip (41198 bytes)
Dataset updated
Jan 2, 2022
Authors
Isaac Wen
License

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Description

Context

A common question among both newcomers and those familiar with computer science and software engineering is which programming language is the best and/or most popular. It is very difficult to give a definitive answer, as there is a seemingly endless number of metrics that can define the 'best' or 'most popular' programming language.

One metric that can be used to define a 'popular' programming language is the number of projects and files written in that language. As GitHub is the most popular public collaboration and file-sharing platform, analyzing the languages used for repositories, PRs, and issues on GitHub can be a good indicator of a language's popularity.

Content

This dataset contains statistics about the programming languages used for repositories, PRs, and issues on GitHub. The data is from 2011 to 2021.

Source

This data was queried and aggregated from BigQuery's public github_repos and githubarchive datasets.
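
As a rough illustration of the kind of aggregation described above, the following Python sketch computes each language's share of activity per year. The row layout (language, year, count), the function names, and every number in `ROWS` are hypothetical and made up for illustration; they are not taken from the dataset or from the BigQuery tables.

```python
from collections import defaultdict

# Hypothetical rows: (language, year, pull_request_count).
# These values are invented for illustration, not real dataset figures.
ROWS = [
    ("Python", 2020, 120),
    ("Python", 2021, 150),
    ("JavaScript", 2020, 200),
    ("JavaScript", 2021, 180),
    ("Go", 2021, 70),
]


def yearly_share(rows):
    """Return {year: {language: share of that year's total count}}."""
    totals = defaultdict(int)
    per_language = defaultdict(lambda: defaultdict(int))
    for language, year, count in rows:
        totals[year] += count
        per_language[year][language] += count
    return {
        year: {lang: n / totals[year] for lang, n in langs.items()}
        for year, langs in per_language.items()
    }


def top_language(rows, year):
    """Return the language with the largest share in the given year."""
    shares = yearly_share(rows)[year]
    return max(shares, key=shares.get)
```

Calling `top_language(ROWS, 2021)` on the sample rows picks the language with the highest count for that year. Using shares rather than raw counts keeps years with different overall activity levels comparable.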

Limitations

Only public GitHub repositories, and their corresponding PRs/issues, have data available publicly. Thus, this dataset is based only on public repositories, which may not be fully representative of all repositories on GitHub.
