Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
A common question among both newcomers and experienced practitioners in computer science and software engineering is which programming language is the best and/or most popular. It is very difficult to give a definitive answer, as there is a seemingly endless number of metrics that can define the 'best' or 'most popular' programming language.
One metric that can be used to define a 'popular' programming language is the number of projects and files written in that language. As GitHub is the most popular public collaboration and file-sharing platform, analyzing the languages used for repositories, PRs, and issues on GitHub can be a good indicator of a language's popularity.
This dataset contains statistics about the programming languages used for repositories, PRs, and issues on GitHub. The data is from 2011 to 2021.
This data was queried and aggregated from BigQuery's public github_repos and githubarchive datasets.
Only public GitHub repositories, and their corresponding PRs/issues, have data available publicly. Thus, this dataset is based only on public repositories, which may not be fully representative of all repositories on GitHub.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The first revision of the dataset was published in the paper "Matching Problem Statements to Editorials in Competitive Programming" (ICALT 2024): https://ieeexplore.ieee.org/abstract/document/10645920
The second revision, which is 7 times bigger, was published in the paper "Domain Adaptation for Automated Tag Prediction in Competitive Programming" (AIAI 2025).
If you use this dataset, please cite one of the papers in your research.
The repositories for the papers can be found at: 1. https://github.com/DinuGeorge0019/MatchingProblemStatementsToEditorialsInCP 2. https://github.com/DinuGeorge0019/MLCP
Competitive programming is a challenging task that demands proficiency in computer science concepts and strong problem-solving skills.
A significant limitation in the field of competitive programming, in the context of machine learning, is the lack of available datasets that include the problem statement, the editorial, and the source code for research purposes. This limitation hinders the development of new algorithms and techniques to improve the efficiency and accuracy of selecting or creating suitable editorials for given problems.
To address this problem, we have introduced a comprehensive collection of over 7,000 competitive programming problems that include editorial solutions, source code, and other metadata.
Note: The PSG-named datasets in the 01_TASK_DATASETS directory come from the paper https://arxiv.org/abs/2310.05791, whose public repository is https://github.com/sronger/PSG_Predicting_Algorithm_Tags_and_Difficulty
As of 2025, JavaScript and HTML/CSS are the most commonly used programming languages among software developers around the world, with more than 66 percent of respondents stating that they used JavaScript and around 61.9 percent using HTML/CSS. Python, SQL, and Bash/Shell rounded out the top five most widely used programming languages around the world.
Programming languages
At a very basic level, programming languages serve as sets of instructions that direct computers on how to behave and carry out tasks. Thanks to the increased prevalence of, and reliance on, computers and electronic devices in today’s society, these languages play a crucial role in the everyday lives of people around the world. An increasing number of people are interested in furthering their understanding of these tools through courses and bootcamps, while current developers are constantly seeking new languages and resources to add to their skills. Furthermore, programming knowledge is becoming an important skill within various industries throughout the business world. Job seekers with skills in Python, R, and SQL will find these to be among the most highly desirable data science skills, which will likely assist in their search for employment.
JavaScript and Java were some of the most tested programming languages on the DevSkiller platform as of 2024. SQL and Python ranked second and fourth, with ** percent and ** percent of respondents testing these languages in 2024, respectively. Nevertheless, the tech skill developers wanted to learn the most in 2024 was related to artificial intelligence, machine learning, and deep learning. At the same time, the fastest growing IT skills among DevSkiller customers were C/C++ and data science, while cybersecurity ranked third.
Software skills
When it came to the most used programming language among developers worldwide, JavaScript took the top spot, chosen by 62 percent of surveyed respondents. Most software developers learn how to code between 11 and 17 years old, with some of them writing their first line of code by the age of 5. Moreover, seven out of 10 developers learned how to program by accessing online resources such as videos and blogs.
Software skills pay
In 2024, the average annual software developer’s salary in the U.S. amounted to nearly ** thousand U.S. dollars, while in Germany, it totaled above ** thousand U.S. dollars. The programming languages associated with the highest salaries worldwide in 2024 were Clojure and Erlang.
https://choosealicense.com/licenses/other/
Dataset Card for The Stack
Changelog
| Release | Description |
|---|---|
| v1.0 | Initial release of the Stack. Included 30 programming languages and 18 permissive licenses. Note: Three included licenses (MPL/EPL/LGPL) are considered weak copyleft licenses. The resulting near-deduplicated dataset is 3TB in size. |
| v1.1 | The three copyleft licenses (MPL/EPL/LGPL) were excluded and the list of permissive licenses extended to 193 licenses in total. The list of programming languages… |

See the full description on the dataset page: https://huggingface.co/datasets/bigcode/the-stack.
As of 2024, JavaScript is the programming language most used by software developers worldwide over the past 12 months, reported by ** percent of the software developers surveyed. It is followed by Python, at ** percent of respondents.
Dataset Card for "programming-languages-keywords"
Structured version of https://github.com/e3b0c442/keywords

Generated using:

    r = requests.get("https://raw.githubusercontent.com/e3b0c442/keywords/main/README.md")
    keywords = r.text.split("### ")[1:]
    keywords = [i for i in keywords if not i.startswith("Sources")]
    keywords = {i.split(" ")[0]:[j for j in re.findall("[a-zA-Z]*", i.split(" ",1)[1]) if j] for i in keywords}
    keywords =…

See the full description on the dataset page: https://huggingface.co/datasets/bigcode/programming-languages-keywords.
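The generation snippet above is truncated, so the following is only a minimal sketch of the same idea: split a README-style document on `### ` headings, skip the "Sources" section, and collect alphabetic tokens per language. It runs on a small inline sample, not the real README. The sample text, the function name `parse_keywords`, and the assumption that each heading is followed by a newline are all hypothetical.

```python
import re

# Hypothetical stand-in for the README; the real file's layout may differ.
SAMPLE_README = (
    "# Keywords\n\n"
    "### Python (3 shown)\n\n- and\n- or\n- not\n\n"
    "### Go (2 shown)\n\n- func\n- chan\n\n"
    "### Sources\n\n- example\n"
)


def parse_keywords(markdown_text):
    """Map each language heading to the alphabetic tokens listed under it."""
    # Everything before the first "### " heading is preamble and is dropped.
    sections = markdown_text.split("### ")[1:]
    result = {}
    for section in sections:
        if section.startswith("Sources"):
            continue  # the sources section lists references, not keywords
        heading, _, body = section.partition("\n")
        language = heading.split(" ")[0]
        # "+" (not "*") avoids the empty matches the original filters out.
        result[language] = re.findall(r"[a-zA-Z]+", body)
    return result


if __name__ == "__main__":
    for language, words in parse_keywords(SAMPLE_README).items():
        print(language, words)
```

Keeping the parser separate from the download step makes it easy to check against a known sample before pointing it at the live file.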
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
## Overview
Object Oriented Programming is a dataset for object detection tasks - it contains Pedestrian annotations for 6,990 images.
## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
Open Data Commons Attribution License (ODC-By) v1.0: https://www.opendatacommons.org/licenses/by/1.0/
License information was derived automatically
This dataset was created during the Programming Language Ecosystem project from TU Wien using the code inside the repository https://github.com/ValentinFutterer/UsageOfProgramminglanguages2011-2023?tab=readme-ov-file.
The centerpiece of this repository is the usage_of_programming_languages_2011-2023.csv. This csv file shows the popularity of programming languages over the last 12 years in yearly increments. The repository also contains graphs created with the dataset. To get an accurate estimate on the popularity of programming languages, this dataset was created using 3 vastly different sources.
The dataset was created using the GitHub repository above. As input data, three public datasets were used.
Taken from https://www.kaggle.com/datasets/pelmers/github-repository-metadata-with-5-stars/ by Peter Elmers. It is licensed under CC BY 4.0 https://creativecommons.org/licenses/by/4.0/. It contains metadata (no code) for all GitHub repositories with more than 5 stars.
Taken from https://github.com/pypl/pypl.github.io/tree/master, put online by the user pcarbonn. It is licensed under CC BY 3.0 https://creativecommons.org/licenses/by/3.0/. It shows, for each month from 2004 to 2023, the share of programming-related Google searches per language.
Taken from https://insights.stackoverflow.com/survey. It is licensed under Open Data Commons Open Database License (ODbL) v1.0 https://opendatacommons.org/licenses/odbl/1-0/. It contains the results of the yearly Stack Overflow developer survey from 2011 to 2023.
All these datasets were downloaded on 12.12.2023. They are all included in the GitHub repository above.
The dataset contains a column for the year, followed by one column per language giving its usage in percent. Additionally, vertical bar charts and pie charts for each year, plus a line graph for each language over the whole timespan, are provided as PNG files.
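As a rough illustration of working with this wide layout (one year column, then one percentage column per language), the sketch below finds the highest-usage language for each year. The header names, the `Year` column label, and every number in the sample are placeholders chosen for the example. They are not taken from usage_of_programming_languages_2011-2023.csv.

```python
import csv
import io

# Hypothetical sample in the wide layout described above; header names and
# percentages are placeholders, not values from the real file.
SAMPLE_CSV = """Year,Python,Java,JavaScript
2011,10.0,30.0,20.0
2012,15.0,28.0,22.0
"""


def top_language_per_year(csv_text, year_column="Year"):
    """Return {year: (language, percent)} for the top language of each year."""
    result = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        year = int(row[year_column])
        # Skip the year column and any empty cells before comparing.
        usage = {
            language: float(value)
            for language, value in row.items()
            if language != year_column and value not in ("", None)
        }
        best = max(usage, key=usage.get)
        result[year] = (best, usage[best])
    return result


if __name__ == "__main__":
    for year, (language, percent) in top_language_per_year(SAMPLE_CSV).items():
        print(year, language, percent)
```

To try it on the real file, read its contents into a string and pass the actual name of the year column as `year_column`.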
The languages considered for the project are:
- Python
- C
- C++
- Java
- C#
- JavaScript
- PHP
- SQL
- Assembly
- Scratch
- Fortran
- Go
- Kotlin
- Delphi
- Swift
- Rust
- Ruby
- R
- COBOL
- F#
- Perl
- TypeScript
- Haskell
- Scala
This project is licensed under the Open Data Commons Open Database License (ODbL) v1.0 https://opendatacommons.org/licenses/odbl/1-0/.
TLDR: You are free to share, adapt, and create derivative works from this dataset as long as you attribute me, keep the database open (if you redistribute it), and continue to share-alike any adapted database under the ODbL.
Thanks go out to
- Stack Overflow, https://insights.stackoverflow.com/survey, for providing the data from the yearly Stack Overflow developer survey.
- the PYPL survey, https://github.com/pypl/pypl.github.io/tree/master, for providing Google search data.
- Peter Elmers, for crawling metadata on GitHub repositories and providing the data https://www.kaggle.com/datasets/pelmers/github-repository-metadata-with-5-stars/.
The dataset comes from the Programming Languages Database.
languages.csv
The full data dictionary is available from PLDB.com.
| variable | class | description |
|---|---|---|
| pldb_id | character | A standardized, uniquified version of the language name, used as an ID on the PLDB site. |
| title | character | The official title of the language. |
| description | character | Description of the repo on GitHub. |
| type | character | Which category in PLDB's subjective ontology does this entity fit into. |
| appeared | double | What year was the language publicly released and/or announced? |
| creators | character | Name(s) of the original creators of the language delimited by " and " |
| website | character | URL of the official homepage for the language project. |
| domain_name | character | If the project website is on its own domain. |
| domain_name_registered | double | When was this domain first registered? |
| reference | character | A link to more info about this entity. |
| isbndb | double | Books about this language from ISBNdb. |
| book_count | double | Computed; the number of books found for this language at isbndb.com |
| semantic_scholar | integer | Papers about this language from Semantic Scholar. |
| language_rank | double | Computed; A rank for the language, taking into account various online rankings. The computation for this column is not currently clear. |
| github_repo | character | URL of the official GitHub repo for the project if it is hosted there. |
| github_repo_stars | double | How many stars does the repo have? |
| github_repo_forks | double | How many forks does the repo have? |
| github_repo_updated | double | What year was the last commit made? |
| github_repo_subscribers | double | How many subscribers does the repo have? |
| github_repo_created | double | When was the GitHub repo for this entity created? |
| github_repo_description | character | Description of the repo on GitHub. |
| github_repo_issues | double | How many issues are on the repo? |
| github_repo_first_commit | double | What year was the first commit made in this git repo? |
| github_language | character | GitHub has a set of supported languages as defined here |
| github_language_tm_scope | character | The TextMate scope that represents this programming language. |
| github_language_type | character | Either data, programming, markup, prose, or nil. |
| github_language_ace_mode | character | A String name of the Ace Mode used for highlighting whenever a file is edited. This must match one of the filenames in http://git.io/3XO_Cg. Use "text" if a mode does not exist. |
| github_language_file_extensions | character | An Array of associated extensions (the first one is considered the primary extension, the others should be listed alphabetically). |
| github_language_repos | double | How many repos for this language does GitHub report? |
| wikipedia | character | URL of the entity on Wikipedia, if and only if it has a page dedicated to it. |
| wikipedia_daily_page_views | double | How many page views per day does this Wikipedia page get? Useful as a signal for rankings. Available via WP api. |
| wikipedia_backlinks_count | double | How many pages on WP link to this page? |
| wikipedia_summary | character | What is the text summary of the language from the Wikipedia page? |
| wikipedia_page_id | double | What is the internal ID for this entity on WP? |
| wikipedia_appeared | double | When does Wikipedia claim this entity first appeared? |
| wikipedia_created | double | When was the Wikipedia page for this entity created? |
| wikipedia_revision_count | double | How many revisions does this page have? |
| wikipedia_related | character | What languages does Wikipedia have as related? |
| features_has_comments | logical | Does this language have a comment character? |
| features_has_semantic_indentation | logical | Does indentation have semantic meaning in this language? |
| features_has_line_comments | logical | Does this language support inline comments (as opposed to comments that must span an entire line)? |
| line_comment_token | character | ... |
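A small sketch of how the columns in the data dictionary might be used: it reads a few rows and lists the best-ranked languages by `language_rank`. The column names come from the table above, but the rows are invented placeholders. The sketch also assumes that a lower rank means a more prominent language and that missing values appear as `NA` or an empty string, and neither assumption is confirmed by the source.

```python
import csv
import io

# Placeholder rows: column names follow the data dictionary above, but the
# languages and values are invented for illustration.
SAMPLE = """pldb_id,title,appeared,language_rank,features_has_comments
lang-a,Lang A,1990,2,TRUE
lang-b,Lang B,2005,0,TRUE
lang-c,Lang C,2015,1,NA
lang-d,Lang D,2020,NA,FALSE
"""


def parse_optional_int(value):
    """Convert a cell to int, treating "" and "NA" (assumed) as missing."""
    return None if value in ("", "NA") else int(value)


def top_ranked(csv_text, n=2):
    """Return (title, appeared) for the n rows with the lowest language_rank."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        rank = parse_optional_int(row["language_rank"])
        if rank is not None:  # rows without a rank cannot be ordered
            rows.append((rank, row["title"], parse_optional_int(row["appeared"])))
    rows.sort(key=lambda item: item[0])
    return [(title, appeared) for _, title, appeared in rows[:n]]


if __name__ == "__main__":
    for title, appeared in top_ranked(SAMPLE):
        print(title, appeared)
```

If the real file stores years or ranks with a decimal part, `parse_optional_int` would need to go through `float` first.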
https://www.enterpriseappstoday.com/privacy-policy
Programming Languages Statistics: The tech market, which is booming alongside digital marketing, is a strong source of income. Among many other things, it encompasses programming languages, which are the basis for building websites, games, software, mobile applications, and more. There are nearly 9,000 programming languages around the world, each with its own features. These statistics cover facts and general knowledge about the programming languages available worldwide.
Programming Languages Statistics (Editor’s Choice)
- There are 8,945 programming languages, according to the most popular programming languages statistics.
- As of 2022, JavaScript is one of the most popular programming languages, with around 47.86% of recruiters demanding JavaScript skills.
- A basic Python developer earns between $70,000 to $1,00,00 a year.
- According to the same statistics, Python ranks number 1 in the United States of America, India, Germany, France, and the United Kingdom.
https://www.datainsightsmarket.com/privacy-policy
The global market for programming language learning platforms is experiencing robust growth, driven by the increasing demand for skilled software developers and the proliferation of online learning resources. The market's expansion is fueled by several key factors, including the rising adoption of digital technologies across various industries, the increasing accessibility of internet connectivity, and the growing preference for flexible and convenient online learning options. Furthermore, the continuous evolution of programming languages and the need for professionals to upskill and reskill are significant contributors to market growth. While precise figures for market size and CAGR are unavailable, reasonable estimations based on industry reports suggest a market valued at approximately $10 billion in 2025, exhibiting a compound annual growth rate (CAGR) of around 15% from 2025 to 2033. This growth is further segmented across various learning platforms, catering to diverse learning styles and technical proficiencies. The competitive landscape is characterized by established players like Coursera, Udemy, and Udacity, alongside specialized platforms such as DataCamp and smaller, niche providers. The market's growth trajectory is expected to remain positive throughout the forecast period (2025-2033), though certain restraints may influence the rate of expansion. These restraints could include challenges associated with maintaining high-quality educational content, ensuring platform accessibility for diverse learners, managing competitive pricing strategies, and addressing concerns related to the effectiveness of online learning compared to traditional classroom settings. However, continuous innovation in educational technologies, the integration of immersive learning experiences (such as virtual reality and gamification), and the development of personalized learning pathways are poised to mitigate these constraints and sustain the market's upward momentum. 
The geographical distribution of the market is likely to be diverse, with North America and Europe currently holding significant market share, followed by Asia-Pacific and other regions experiencing rapidly increasing adoption rates.
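To make the compound annual growth rate (CAGR) arithmetic concrete, the sketch below applies the standard compounding formula, value × (1 + rate)^years, to the approximate figures quoted above ($10 billion in 2025, around 15% per year through 2033). It only illustrates how the formula works and is not a forecast from the report. The function name `project_value` is hypothetical.

```python
def project_value(start_value, annual_rate, years):
    """Compound start_value at annual_rate for the given number of years."""
    return start_value * (1 + annual_rate) ** years


if __name__ == "__main__":
    # Approximate inputs quoted in the text: $10 billion, 15% CAGR, 2025 to 2033.
    projected = project_value(10.0, 0.15, 2033 - 2025)
    print(f"Implied 2033 value: about ${projected:.1f} billion")
```

Because growth compounds, eight years at 15% roughly triples the starting value; simple (non-compounding) growth would add only 120%.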
JavaScript was the most frequently used coding language in Russia, used by around ********** of the surveyed software companies in 2024. Furthermore, over ******** of the companies reported using Python and Java.
According to the survey, the JavaScript community comprises roughly **** percent of software developers as of 2023, making it the most popular programming language in the world. Python also has a large community, with **** percent of developers.
https://creativecommons.org/publicdomain/zero/1.0/
This dataset contains code snippets from 22 different programming languages, with 100 code files per language (total 2,200 samples). Each entry includes metadata such as file name, number of lines, number of characters, a code preview, and the source URLs from GitHub repositories.
Column descriptions:
id: Unique identifier
language: Programming language name
file_name: Name of the file
num_lines: Number of lines in the code
num_chars: Total number of characters
code_preview: Partial preview of the code
repo_url: GitHub repository link
source_url: Raw file link from GitHub
This dataset is useful for programming language detection and related code analysis tasks.
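Before training a language detector on this data, it may help to confirm that the classes are balanced as described (100 files per language). The sketch below counts rows per `language` value and reports any language that deviates from an expected count. The column names follow the list above, while the sample rows, URLs, and function names are placeholders.

```python
import csv
import io
from collections import Counter

# Placeholder rows using the column names listed above; the values are invented.
SAMPLE = """id,language,file_name,num_lines,num_chars,code_preview,repo_url,source_url
1,Python,a.py,10,200,print(1),https://example.com/r1,https://example.com/r1/a.py
2,Python,b.py,5,90,import os,https://example.com/r2,https://example.com/r2/b.py
3,Go,main.go,20,400,package main,https://example.com/r3,https://example.com/r3/main.go
"""


def samples_per_language(csv_text):
    """Count how many rows each language contributes."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row["language"] for row in reader)


def unbalanced_languages(counts, expected):
    """Return the languages whose sample count differs from the expected one."""
    return {language: n for language, n in counts.items() if n != expected}


if __name__ == "__main__":
    counts = samples_per_language(SAMPLE)
    print(counts)
    print(unbalanced_languages(counts, expected=2))
```

For the full dataset, `expected=100` would match the description of 100 code files per language.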
https://www.datainsightsmarket.com/privacy-policy
The global Programming Language Learning Platform market is poised for significant expansion, projected to reach an estimated $15,000 million by 2025 with a compound annual growth rate (CAGR) of 18% through 2033. This robust growth is primarily fueled by the escalating demand for skilled professionals across various sectors, driven by rapid digital transformation and the ubiquitous integration of technology in daily life. The platform’s ability to cater to both individual skill enhancement for career advancement (Worker segment) and academic pursuit (Student segment) positions it as a crucial educational tool. The increasing adoption of in-demand languages like Python, JavaScript, and C further propels market expansion. Key players are investing heavily in innovative learning methodologies, personalized learning paths, and interactive content to enhance user engagement and learning outcomes, contributing to the market's upward trajectory. The market's growth is further augmented by several critical trends, including the rise of microlearning, gamification in education, and the increasing preference for flexible, self-paced online courses. The widespread availability of affordable internet access and personal computing devices globally democratizes access to quality programming education. However, the market also faces certain restraints, such as the high cost of developing sophisticated learning platforms and the challenge of maintaining learner engagement over extended periods. Despite these hurdles, the sustained need for digital literacy and specialized coding skills across industries, coupled with the continuous evolution of programming languages and tools, ensures a bright future for the Programming Language Learning Platform market. The Asia Pacific region, with its burgeoning tech industry and large student population, is expected to be a significant growth engine, closely followed by North America and Europe. 
This comprehensive report provides an in-depth analysis of the global Programming Language Learning Platform market, encompassing a study period from 2019 to 2033, with a base year of 2025. The analysis focuses on the forecast period of 2025-2033 and includes a detailed examination of the historical period from 2019-2024. The estimated market size for 2025 is projected to be in the millions, with significant growth anticipated throughout the forecast period.
https://crawlfeeds.com/privacy_policy
Programming YouTube videos dataset. More than 300 records were extracted in total. Last extracted on 24 Jan 2022.
Get in touch with the crawlfeeds team for large datasets and customized YouTube datasets.
Multi-Round Programming Conversations
Based on the previous evol-codealpaca-v1 dataset, with added sampled questions from Stack Overflow and Cross Validated, and made multi-round. It should be better suited to training a code assistant that works side by side.
Tasks included:
- Data science, statistics, and programming questions
- Code translation: translate a short function from Python, Golang, C++, Java, Javascript
- Code fixing: Fix randomly corrupts characters with no tab…
See the full description on the dataset page: https://huggingface.co/datasets/theblackcat102/multiround-programming-convo.
https://www.datainsightsmarket.com/privacy-policy
The global Programming Education market is poised for substantial expansion, projected to reach a valuation of approximately $120 million by 2025 and to grow at a Compound Annual Growth Rate (CAGR) of around 15% through 2033. This robust growth is fueled by an escalating demand for skilled developers across diverse industries, driven by digital transformation initiatives and the increasing integration of technology into everyday life. The "Paid Learning" segment, encompassing online courses, bootcamps, and specialized training programs, is expected to dominate the market, reflecting a growing preference for structured and comprehensive skill development. Furthermore, the "Vertical Community" segment, which includes platforms fostering peer-to-peer learning and specialized developer networks like GitHub and CSDN, will also witness significant traction. The rise of accessible, user-friendly platforms such as Coursera, Udacity, and Tynker, alongside the gamified learning experiences offered by platforms like Roblox, are democratizing programming education and making it more engaging for a broader audience. Key market drivers include the continuous evolution of programming languages and technologies, necessitating upskilling and reskilling of the workforce. The burgeoning tech industry, with its relentless innovation in areas like AI, machine learning, and data science, creates a persistent need for programmers. Emerging economies, particularly in Asia Pacific and Latin America, are increasingly investing in digital infrastructure and education, presenting significant growth opportunities. While the market benefits from strong demand, potential restraints such as the high cost of specialized training in some regions and the challenge of keeping curricula constantly updated with rapidly changing technology trends need to be addressed. 
Nonetheless, the overarching trend towards lifelong learning and the increasing recognition of programming skills as essential for career advancement will continue to propel the market forward. This report provides an in-depth analysis of the global Programming Education market, encompassing a study period from 2019 to 2033. With 2025 as the base and estimated year, the forecast period spans from 2025 to 2033, building upon the historical data from 2019-2024. The market is poised for substantial growth, with projected revenues reaching hundreds of millions of dollars by 2025 and continuing to ascend. This analysis delves into the intricate dynamics of this rapidly evolving sector, exploring its core characteristics, prevailing trends, regional dominance, product innovations, and the key players shaping its future.
In the fourth quarter of 2024, the most popular programming languages in published job offers in Poland were ***********, and Java.