100+ datasets found

the-stack
huggingface.co
opendatalab.com
Updated Oct 27, 2022
Cite
BigCode (2022). the-stack [Dataset]. https://huggingface.co/datasets/bigcode/the-stack
Dataset updated
Oct 27, 2022
Dataset authored and provided by
BigCode
License
https://choosealicense.com/licenses/other/
Description
Dataset Card for The Stack

Changelog

Release Description

v1.0 Initial release of the Stack. Included 30 programming languages and 18 permissive licenses. Note: Three included licenses (MPL/EPL/LGPL) are considered weak copyleft licenses. The resulting near-deduplicated dataset is 3TB in size.

v1.1 The three copyleft licenses (MPL/EPL/LGPL) were excluded and the list of permissive licenses was extended to 193 licenses in total. The list of programming languages… See the full description on the dataset page: https://huggingface.co/datasets/bigcode/the-stack.
+5 Million Python & Bash Programming Submissions for 5 Courses & Grades for...
figshare.com
txt
Updated May 31, 2023
Cite
David Azcona; Alan Smeaton (2023). +5 Million Python & Bash Programming Submissions for 5 Courses & Grades for Computer-Based Exams over 3 academic years. [Dataset]. http://doi.org/10.6084/m9.figshare.12610958.v1
Explore at:
Available download formats: txt
Unique identifier
https://doi.org/10.6084/m9.figshare.12610958.v1
Dataset updated
May 31, 2023
Dataset provided by
Figshare (http://figshare.com/)
Authors
David Azcona; Alan Smeaton
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
At Dublin City University, students learn how to code by taking a variety of programming modules. Students develop code algorithms for problems proposed by Faculty. Many of these courses or modules are delivered through a custom Virtual Learning Environment (VLE) built for the purpose of teaching and learning computer programming. This custom VLE enables students to access course information, material and slides for each module. In addition, our system integrates an automatic grading platform where students can verify their code submissions for programming exercises. Students typically develop solutions locally for laboratory sheets for the computer programming courses. Then, they submit their programs online to the automatic grading platform, which runs a number of testcases specified by the lecturer on each exercise. This provides instant feedback to students based on the suite of testcases run and ultimately tells the student whether the program is considered correct, or incorrect if any of the testcases fail. This information is invaluable to their learning, and such a platform is needed to verify that their programs work as expected. The computer programming grading system has been used for several years on a variety of programming courses at our University. This allowed researchers and Faculty to gather a fine-grained digital footprint of students learning programming at our University. Recently, research in Learning Analytics has focused on Predictive Modelling and identifying those students having difficulties with course material, also in programming courses, and offering remediation, personalized feedback and interventions to students using Machine Learning techniques.
Prior work has reported that customized notifications sent to students regarding their performance and offering resources such as further learning material, code solutions from peers in their class and university support services helped students to increase their differential performance and engagement on these programming courses. However, there is a limit to this prior work where most of the models use little or no programming work as features for the learning algorithms or feedback sent to students. In this work we explore different mechanisms to represent students’ code to predict its correctness and to better analyze students’ progress using their interactions which can be exploited to provide effective feedback and support better recommendations. Every time a student submits a code solution for verification, the system stores the code submission, the student identifier, the IP used on the network for the upload, the results of the testcases run with inputs and outputs, the course the submission belongs to, the exercise and the task name the student is attempting by using the submission’s filename. In total, we collected more than half a million programming submissions (591,707) for 666 students from 5 Python programming courses over 3 academic years.
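As a rough illustration of the grading rule described above (a submission counts as correct only if every testcase passes), here is a minimal Python sketch. It is not the platform's actual implementation: the `grade` function, the testcase format of (input, expected output) pairs, and the two toy submissions are all hypothetical.

```python
from typing import Callable, Iterable

def grade(solution: Callable[[str], str],
          testcases: Iterable[tuple[str, str]]) -> dict:
    """Run a submission against (input, expected_output) pairs.

    Hypothetical helper: the submission is correct only if all testcases pass.
    """
    results = []
    for given_input, expected in testcases:
        try:
            actual = solution(given_input)
            passed = actual == expected
        except Exception as exc:  # a crashing submission fails that testcase
            actual = f"error: {exc}"
            passed = False
        results.append({"input": given_input, "expected": expected,
                        "actual": actual, "passed": passed})
    return {"correct": all(r["passed"] for r in results), "results": results}

# Two toy submissions for a made-up "double the number" exercise.
def good_submission(text: str) -> str:
    return str(int(text) * 2)

def buggy_submission(text: str) -> str:
    return str(int(text) + 2)  # only right when the input is 2

TESTCASES = [("2", "4"), ("5", "10")]
report_good = grade(good_submission, TESTCASES)
report_buggy = grade(buggy_submission, TESTCASES)
print(report_good["correct"], report_buggy["correct"])
```

The per-testcase records mirror the kind of information the description says is stored (inputs, outputs, and results), which is what makes the feedback to the student immediate.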
Object Oriented Programming Dataset
universe.roboflow.com
zip
Updated Jan 5, 2024
Cite
school (2024). Object Oriented Programming Dataset [Dataset]. https://universe.roboflow.com/school-92uwf/object-oriented-programming
Explore at:
Available download formats: zip
Dataset updated
Jan 5, 2024
Dataset authored and provided by
school
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Variables measured
Pedestrian Bounding Boxes
Description
Object Oriented Programming

## Overview
Object Oriented Programming is a dataset for object detection tasks. It contains Pedestrian annotations for 6,990 images.

## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.

## License
This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/CC BY 4.0).
programming-languages-keywords
huggingface.co
Updated Nov 13, 2023
Cite
BigCode (2023). programming-languages-keywords [Dataset]. https://huggingface.co/datasets/bigcode/programming-languages-keywords
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Nov 13, 2023
Dataset authored and provided by
BigCode
Description
Dataset Card for "programming-languages-keywords"

Structured version of https://github.com/e3b0c442/keywords

Generated using:

    r = requests.get("https://raw.githubusercontent.com/e3b0c442/keywords/main/README.md")
    keywords = r.text.split("### ")[1:]
    keywords = [i for i in keywords if not i.startswith("Sources")]
    keywords = {i.split(" ")[0]:[j for j in re.findall("[a-zA-Z]*", i.split(" ",1)[1]) if j] for i in keywords}
    keywords =…

See the full description on the dataset page: https://huggingface.co/datasets/bigcode/programming-languages-keywords.
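The generation snippet in this dataset card is truncated, so the following is only a sketch of the same idea (split a README on "### " headings, drop the "Sources" section, collect alphabetic tokens per language). The sample text and the `parse_keywords` helper are hypothetical, and the layout of the real README is an assumption; the sketch runs on an inline string instead of fetching anything.

```python
import re

# Hypothetical sample imitating a README with one "### <Language>" section
# per language; the real README may be laid out differently.
SAMPLE = """# Keywords
### Go
- break
- case
### Python
- def
- class
### Sources
- https://example.com
"""

def parse_keywords(text: str) -> dict[str, list[str]]:
    """Map each language heading to the alphabetic tokens in its section."""
    sections = text.split("### ")[1:]          # drop everything before the first heading
    sections = [s for s in sections if not s.startswith("Sources")]
    result = {}
    for section in sections:
        header, _, body = section.partition("\n")
        result[header.strip()] = re.findall(r"[A-Za-z]+", body)
    return result

keywords = parse_keywords(SAMPLE)
print(keywords)
```

Using `[A-Za-z]+` rather than `[a-zA-Z]*` avoids the empty matches that the original comprehension has to filter out with `if j`.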
Programming Language Ecosystem Project TU Wien
test.researchdata.tuwien.at
csv, text/markdown
Updated Jun 25, 2024
Cite
Valentin Futterer (2024). Programming Language Ecosystem Project TU Wien [Dataset]. http://doi.org/10.70124/gnbse-ts649
Explore at:
Available download formats: text/markdown, csv
Unique identifier
https://doi.org/10.70124/gnbse-ts649
Dataset updated
Jun 25, 2024
Dataset provided by
TU Wien
Authors
Valentin Futterer
License
Open Data Commons Attribution License (ODC-By) v1.0, https://www.opendatacommons.org/licenses/by/1.0/
License information was derived automatically
Time period covered
Dec 12, 2023
Area covered
Vienna
Description
About Dataset
This dataset was created during the Programming Language Ecosystem project from TU Wien using the code inside the repository https://github.com/ValentinFutterer/UsageOfProgramminglanguages2011-2023?tab=readme-ov-file.
The centerpiece of this repository is the usage_of_programming_languages_2011-2023.csv. This csv file shows the popularity of programming languages over the last 12 years in yearly increments. The repository also contains graphs created with the dataset. To get an accurate estimate on the popularity of programming languages, this dataset was created using 3 vastly different sources.

About Data collection methodology
The dataset was created using the GitHub repository above. As input data, three public datasets were used.
github_metadata
Taken from https://www.kaggle.com/datasets/pelmers/github-repository-metadata-with-5-stars/ by Peter Elmers. It is licensed under CC BY 4.0 https://creativecommons.org/licenses/by/4.0/. It contains metadata (no code) for all GitHub repositories with more than 5 stars.
PYPL_survey_2004-2023
Taken from https://github.com/pypl/pypl.github.io/tree/master, put online by the user pcarbonn. It is licensed under CC BY 3.0 https://creativecommons.org/licenses/by/3.0/. For each month from 2004 to 2023, it shows the share of programming-related Google searches per language.
stack_overflow_developer_survey
Taken from https://insights.stackoverflow.com/survey. It is licensed under Open Data Commons Open Database License (ODbL) v1.0 https://opendatacommons.org/licenses/odbl/1-0/. It contains the results of the yearly Stack Overflow developer survey from 2011 to 2023.
All these datasets were downloaded on 12.12.2023 and are available in the GitHub repository above.

Description of the data
The dataset contains a column for the year, followed by one column per language giving its usage in percent. Additionally, vertical bar charts and pie charts for each year, plus a line graph for each language over the whole timespan, are provided as PNG files.
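A file with this shape (one year column, then one percentage column per language) could be read along the following lines. This is a sketch under assumptions: the header names `year`, `Python` and `Java` and all numbers below are made-up placeholders, not values from usage_of_programming_languages_2011-2023.csv, and `load_usage` is a hypothetical helper.

```python
import csv
import io

# Placeholder content: the column names and every number here are invented
# purely to show the layout; they are NOT taken from the real dataset.
SAMPLE_CSV = """year,Python,Java
2011,10.0,20.0
2012,12.5,19.0
"""

def load_usage(text: str) -> dict[int, dict[str, float]]:
    """Return {year: {language: usage_percent}} from CSV text."""
    usage = {}
    for row in csv.DictReader(io.StringIO(text)):
        year = int(row.pop("year"))            # assumed name of the year column
        usage[year] = {lang: float(pct) for lang, pct in row.items()}
    return usage

usage = load_usage(SAMPLE_CSV)
# Language with the highest share in a given year of the placeholder data.
top_2012 = max(usage[2012], key=usage[2012].get)
print(top_2012)
```

For the real file, `io.StringIO(text)` would be replaced by an opened file handle, and the actual header names should be checked first.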

The languages that are going to be considered for the project can be seen here:
- Python
- C
- C++
- Java
- C#
- JavaScript
- PHP
- SQL
- Assembly
- Scratch
- Fortran
- Go
- Kotlin
- Delphi
- Swift
- Rust
- Ruby
- R
- COBOL
- F#
- Perl
- TypeScript
- Haskell
- Scala

License
This project is licensed under the Open Data Commons Open Database License (ODbL) v1.0 https://opendatacommons.org/licenses/odbl/1-0/ license.
TLDR: You are free to share, adapt, and create derivative works from this dataset as long as you attribute me, keep the database open (if you redistribute it), and continue to share-alike any adapted database under the ODbL.

Acknowledgments
Thanks go out to
- stackoverflow https://insights.stackoverflow.com/survey for providing the data from the yearly stackoverflow developer survey.
- the PYPL survey, https://github.com/pypl/pypl.github.io/tree/master for providing google search data.
- Peter Elmers, for crawling metadata on github repositories and providing the data https://www.kaggle.com/datasets/pelmers/github-repository-metadata-with-5-stars/.
Most Popular Programming Languages Statistics
enterpriseappstoday.com
Updated Jan 5, 2023
Cite
EnterpriseAppsToday (2023). Most Popular Programming Languages Statistics [Dataset]. https://www.enterpriseappstoday.com/stats/programming-languages-statistics.html
Dataset updated
Jan 5, 2023
Dataset authored and provided by
EnterpriseAppsToday
License
https://www.enterpriseappstoday.com/privacy-policy
Time period covered
2022 - 2032
Area covered
Global
Description
Programming languages statistics: The tech market, which is booming alongside digital marketing, can be a good source of income. Programming languages are one part of that market, and they are the basis for websites, games, software, mobile applications, and more. There are nearly 9,000 programming languages around the world, each with its own features. These most popular programming language statistics cover statistical information and general knowledge about the programming languages available worldwide. Programming Languages Statistics (Editor’s Choice): There are 8,945 programming languages, according to the most popular programming languages statistics. As of 2022, JavaScript is one of the most popular programming languages, with around 47.86% of recruiters demanding JavaScript skills. A basic Python developer earns between $70,000 to $1,00,00 a year. According to the most popular programming languages statistics, Python ranks number 1 in the United States of America, India, Germany, France, and the United Kingdom.
Programming Language Learning Platform Report
datainsightsmarket.com
doc, pdf, ppt
Updated Oct 16, 2025
Cite
Data Insights Market (2025). Programming Language Learning Platform Report [Dataset]. https://www.datainsightsmarket.com/reports/programming-language-learning-platform-1458395
Explore at:
Available download formats: pdf, doc, ppt
Dataset updated
Oct 16, 2025
Dataset authored and provided by
Data Insights Market
License
https://www.datainsightsmarket.com/privacy-policy
Time period covered
2025 - 2033
Area covered
Global
Variables measured
Market Size
Description
The global Programming Language Learning Platform market is poised for significant expansion, projected to reach an estimated $15,000 million by 2025 with a compound annual growth rate (CAGR) of 18% through 2033. This robust growth is primarily fueled by the escalating demand for skilled professionals across various sectors, driven by rapid digital transformation and the ubiquitous integration of technology in daily life. The platform’s ability to cater to both individual skill enhancement for career advancement (Worker segment) and academic pursuit (Student segment) positions it as a crucial educational tool. The increasing adoption of in-demand languages like Python, JavaScript, and C further propels market expansion. Key players are investing heavily in innovative learning methodologies, personalized learning paths, and interactive content to enhance user engagement and learning outcomes, contributing to the market's upward trajectory. The market's growth is further augmented by several critical trends, including the rise of microlearning, gamification in education, and the increasing preference for flexible, self-paced online courses. The widespread availability of affordable internet access and personal computing devices globally democratizes access to quality programming education. However, the market also faces certain restraints, such as the high cost of developing sophisticated learning platforms and the challenge of maintaining learner engagement over extended periods. Despite these hurdles, the sustained need for digital literacy and specialized coding skills across industries, coupled with the continuous evolution of programming languages and tools, ensures a bright future for the Programming Language Learning Platform market. The Asia Pacific region, with its burgeoning tech industry and large student population, is expected to be a significant growth engine, closely followed by North America and Europe. 
This comprehensive report provides an in-depth analysis of the global Programming Language Learning Platform market, encompassing a study period from 2019 to 2033, with a base year of 2025. The analysis focuses on the forecast period of 2025-2033 and includes a detailed examination of the historical period from 2019-2024. The estimated market size for 2025 is projected to be in the millions, with significant growth anticipated throughout the forecast period.
Key programming languages of employees of Warsaw AI companies 2024
statista.com
Updated Jul 24, 2025
Cite
Statista (2025). Key programming languages of employees of Warsaw AI companies 2024 [Dataset]. https://www.statista.com/statistics/1557383/warsaw-ai-companies-employees-programming-skills/
Dataset updated
Jul 24, 2025
Dataset authored and provided by
Statista (http://statista.com/)
Time period covered
2024
Area covered
Poland
Description
In 2024 in Poland, over ** percent of employees of Warsaw-based AI companies possessed knowledge of the Python programming language.
Programming Education Market Report | Global Forecast From 2025 To 2033
dataintelo.com
csv, pdf, pptx
Updated Sep 23, 2024
Cite
Dataintelo (2024). Programming Education Market Report | Global Forecast From 2025 To 2033 [Dataset]. https://dataintelo.com/report/global-programming-education-market
Explore at:
Available download formats: csv, pptx, pdf
Dataset updated
Sep 23, 2024
Dataset authored and provided by
Dataintelo
License
https://dataintelo.com/privacy-and-policy
Time period covered
2024 - 2032
Area covered
Global
Description
Programming Education Market Outlook

The global programming education market size is anticipated to expand significantly, exhibiting a compound annual growth rate (CAGR) of 11.8% from 2024 to 2032. The market was valued at approximately USD 12.5 billion in 2023 and is projected to reach USD 29.5 billion by 2032. This growth is driven by several factors, including the increasing demand for coding proficiency across multiple sectors, the rise of digital transformation initiatives, and the growing availability of online learning platforms.

One of the critical growth factors in the programming education market is the burgeoning demand for digital skills across various industries. As businesses increasingly adopt digital technologies, there is a growing necessity for employees proficient in programming and coding. This demand is not confined to the IT sector alone but extends to healthcare, finance, manufacturing, and even creative industries. Moreover, the rise of emerging technologies such as artificial intelligence, machine learning, and big data analytics is further amplifying the need for skilled programmers who can develop, deploy, and maintain sophisticated software solutions.

Another significant driver is the increasing availability and sophistication of online learning platforms. Over the past decade, online education has revolutionized the way people acquire new skills, making learning more accessible, flexible, and affordable. Platforms such as Coursera, Udacity, and edX offer a plethora of programming courses catering to different skill levels, from beginners to advanced programmers. These platforms often collaborate with prestigious universities and tech companies, providing high-quality content that meets industry standards. Additionally, the advent of interactive and gamified learning experiences has made programming education more engaging and effective.

The emphasis on STEM (Science, Technology, Engineering, and Mathematics) education in the K-12 segment is also propelling the programming education market. Governments and educational institutions worldwide are increasingly recognizing the importance of integrating coding and programming into the school curriculum. Initiatives such as Hour of Code and Code.org are making significant strides in this direction. By introducing programming at an early age, these initiatives aim to equip students with critical problem-solving skills and prepare them for future careers in technology-driven fields.

In terms of regional outlook, North America leads the programming education market, driven by strong demand from the corporate sector and the presence of several leading educational technology companies. Europe is also witnessing substantial growth, supported by government initiatives to promote digital literacy. The Asia Pacific region is emerging as a lucrative market, owing to the increasing penetration of internet and mobile devices, coupled with a young, tech-savvy population. Latin America and the Middle East & Africa, though smaller in comparison, are showing promising growth, driven by increasing investments in educational infrastructure and a growing awareness of the importance of digital skills.

Component Analysis

In the programming education market, components are broadly categorized into software and services. The software segment comprises various tools and platforms used for coding education, including integrated development environments (IDEs), coding simulators, and gamified learning platforms. The services segment includes instructor-led training, online tutorials, workshops, and bootcamps. Both segments are critical to the overall growth and adoption of programming education.

The software segment is experiencing robust growth due to the increasing availability of sophisticated coding tools and platforms. Integrated Development Environments (IDEs) like Visual Studio Code, PyCharm, and Eclipse offer comprehensive environments where learners can write, test, and debug their code. These tools often come with built-in tutorials, sample projects, and community support, making them ideal for self-paced learning. Additionally, gamified learning platforms like Codecademy and CodeCombat are gaining traction for their interactive and engaging approach to coding education. These platforms use game mechanics to teach programming concepts, making learning fun and effective.

On the services side, instructor-led training and coding bootcamps are becoming increasingly popular, especially among professionals looking to upskill or switch careers. Coding bootcamps like Gen
1M Chinese Coding Questions Dataset – Python/Java/C++
nexdata.ai
Updated Mar 6, 2025
Cite
Nexdata (2025). 1M Chinese Coding Questions Dataset – Python/Java/C++ [Dataset]. https://www.nexdata.ai/datasets/llm/1738
Dataset updated
Mar 6, 2025
Dataset authored and provided by
Nexdata
Variables measured
Format, Content, Language, Data Size, Data Fields, Data Categories, Data processing
Description
This dataset contains 1 million Chinese programming questions with corresponding answers, detailed parses (explanations), and programming language labels. It includes a wide range of questions in C, C++, Python, Java, and JavaScript, making it ideal for training large language models (LLMs) on multilingual code understanding and generation. The questions cover fundamental to advanced topics, supporting AI applications such as code completion, bug fixing, and programming reasoning. This structured dataset enhances model performance in natural language programming tasks and helps reinforce code logic skills in AI systems. All data complies with international privacy regulations including GDPR, CCPA, and PIPL.
Leading programming languages worldwide 2022, by share of users
statista.com
Updated Jul 10, 2025
Cite
Statista (2025). Leading programming languages worldwide 2022, by share of users [Dataset]. https://www.statista.com/statistics/1343059/top-programming-languages-worldwide-by-share-of-users/
Dataset updated
Jul 10, 2025
Dataset authored and provided by
Statista (http://statista.com/)
Time period covered
Dec 2021 - Feb 2022
Area covered
Worldwide
Description
According to a survey conducted between late 2021 and early 2022, JavaScript is the most used programming language worldwide, with ** percent of respondents reporting that they use the language. Python was the second most used language at **** percent.
Online Programming Learn Platform Report
datainsightsmarket.com
doc, pdf, ppt
Updated Apr 17, 2025
Cite
Data Insights Market (2025). Online Programming Learn Platform Report [Dataset]. https://www.datainsightsmarket.com/reports/online-programming-learn-platform-1987579
Explore at:
Available download formats: ppt, pdf, doc
Dataset updated
Apr 17, 2025
Dataset authored and provided by
Data Insights Market
License
https://www.datainsightsmarket.com/privacy-policy
Time period covered
2025 - 2033
Area covered
Global
Variables measured
Market Size
Description
The online programming learning platform market is experiencing robust growth, driven by the increasing demand for skilled programmers across various industries and the accessibility of online learning resources. The market, estimated at $20 billion in 2025, is projected to exhibit a Compound Annual Growth Rate (CAGR) of 15% from 2025 to 2033, reaching approximately $60 billion by 2033. This expansion is fueled by several key factors: the rising adoption of online learning models, particularly amongst younger demographics (teens and adults), the increasing affordability and accessibility of online courses compared to traditional education, and the diverse range of programming languages and specializations offered (Python, JavaScript, web development being particularly popular). The market is segmented by application (adult, child, teen learners) and course type (Python, JavaScript, Web Development, and other). Key players like Coursera, Udemy, and Codecademy are driving innovation through interactive learning environments, personalized learning paths, and project-based learning, enhancing the learning experience and attracting a broader learner base. However, market growth faces certain challenges. Competition among numerous platforms necessitates continuous innovation and adaptation to maintain a competitive edge. Ensuring the quality and relevance of the courses offered remains crucial, as does addressing the digital divide and ensuring equitable access to online learning opportunities across different regions and socioeconomic backgrounds. Furthermore, the evolution of technology and the emergence of new programming languages require platforms to consistently update their offerings to meet evolving industry demands. Geographical variations in market penetration exist; North America and Europe currently dominate market share, but significant growth potential lies in rapidly developing economies in Asia-Pacific and other regions. 
Therefore, strategic expansion into these regions, coupled with localized content and pricing strategies, will be vital for continued market growth.
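The growth figures quoted in this description ($20 billion in 2025, a 15% CAGR, roughly $60 billion by 2033) can be sanity-checked with the standard compound-growth formula. The sketch below is illustrative only: the `project` helper is hypothetical, and it assumes growth compounds once per year over 2033 - 2025 = 8 years, which the report does not state explicitly.

```python
def project(start_value: float, annual_rate: float, years: int) -> float:
    """Compound start_value at annual_rate for the given number of years."""
    return start_value * (1.0 + annual_rate) ** years

# Figures quoted in the description: $20 billion in 2025, 15% CAGR, horizon 2033.
# Assumption: annual compounding over 8 years (2025 -> 2033).
projected_2033 = project(20.0, 0.15, 2033 - 2025)
print(f"Projected 2033 market size: ${projected_2033:.1f} billion")
```

Under that assumption the projection comes out at about $61 billion, in line with the "approximately $60 billion" quoted above.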
Globally sought-after programming languages among software developers 2022
statista.com
Updated Jul 11, 2025
Cite
Statista (2025). Globally sought-after programming languages among software developers 2022 [Dataset]. https://www.statista.com/statistics/793631/worldwide-developer-survey-most-wanted-languages/
Dataset updated
Jul 11, 2025
Dataset authored and provided by
Statista (http://statista.com/)
Time period covered
May 11, 2022 - Jun 1, 2022
Area covered
Worldwide
Description
According to the survey, Rust was the most desired language in 2022: over ** percent of respondents who were not developing with it expressed interest in doing so. Python ranked second, followed by TypeScript.
Programming Education Report
datainsightsmarket.com
doc, pdf, ppt
Updated Jan 25, 2025
Cite
Data Insights Market (2025). Programming Education Report [Dataset]. https://www.datainsightsmarket.com/reports/programming-education-1928988
Explore at:
Available download formats: ppt, pdf, doc
Dataset updated
Jan 25, 2025
Dataset authored and provided by
Data Insights Market
License
https://www.datainsightsmarket.com/privacy-policy
Time period covered
2025 - 2033
Area covered
Global
Variables measured
Market Size
Description
The programming education market is projected to witness significant growth in the coming years. In 2025, the market was valued at million, and it is estimated to reach million by 2033, exhibiting a CAGR of % during the forecast period. The growth of this market is primarily driven by the increasing demand for skilled programmers in various industries, the shift towards online learning, and the growing popularity of coding as a hobby. Key trends in the programming education market include the rise of online learning platforms such as Coursera, Udemy, and Pluralsight; the development of new programming languages and technologies; the increasing adoption of gamified learning approaches; and the growing focus on STEM education in schools and universities. The market is dominated by a few major players such as Coursera, Roblox, CSDN, Github, Udacity, Tynker, and Programming Hub. However, there are also a number of small and medium-sized companies offering programming education services. The market is expected to continue to grow in the coming years, as the demand for skilled programmers remains high.
Number of programming languages used software development organizations...
statista.com
Updated Jun 24, 2025
Cite
Statista (2025). Number of programming languages used software development organizations global 2024 [Dataset]. https://www.statista.com/statistics/1616230/share-of-programming-languages-used-globally/
Dataset updated
Jun 24, 2025
Dataset authored and provided by
Statista (http://statista.com/)
Time period covered
Nov 22, 2024 - Dec 9, 2024
Area covered
Worldwide
Description
In 2024, around ** percent of software development organizations used over ** programming languages. In the majority of companies with fewer than ***** employees and up to ***** employees, around *** languages were being utilized. Organizations with over 10,000 employees used differing numbers of programming languages in almost equal shares.
Programming Languages
datasetcatalog.nlm.nih.gov
Updated Apr 8, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Gagniuc, Paul A. (2023). Programming Languages [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001046864
Dataset updated
Apr 8, 2023
Authors
Gagniuc, Paul A.
Description
These files accompany the book entitled: An Introduction to Programming Languages: Simultaneous Learning in Multiple Coding Environments. This work is an introductory textbook in several computer languages. It describes the most well-known and popular programming environments such as: C#, C++, Java, JavaScript, PERL, PHP, Python, Ruby, and Visual Basic (VB) or Visual Basic for Applications (VBA). Therefore, the main objective of this unique guide is to provide code examples reflected in these nine computer languages. Readers can easily understand the connection and universality between the syntax of different environments and be adept at translating code. This learning experience can be ideal for upper-undergraduate introductory courses, researchers, doctoral students, and sociologists or engineers charged with implementing data analysis. Graphical illustrations are used for technical details about the computation examples to aid in an in-depth understanding of their inner workings. Moreover, the book contains original material that has been class-tested by the author and numerous cases are examined. Readers will also benefit from the inclusion of: a) Historical and philosophical perspectives on the past, present and future of computer languages. b) A total of 448 additional files freely available online, from which a total of 44 files are poster presentations (i.e. PowerPoint and PDF files). c) A total of 404 code examples reflected in nine computer languages, namely: C#, C++, Java, JavaScript, PERL, PHP, Python, Ruby and VB. This work first begins with a general introduction to history and presents the natural inevitable pathway from mechanical automatons to present electronic computers. Following this historical introduction, an in-detail look is made on philosophical questions, implementations, entropy and life. More often than not, there is a genuine amazement of the younger generations regarding the advancement of computer technology. 
Historical events that led to the development of technologies have been distilled down to the essence. However, the essence of any story is made with massive loss of detailed information. The essence of essences even more so. Over time, the lack of detail leads to a collective amnesia that can prevent us from understanding the naturalness by which technology has evolved. Thus, new constructs are always built upon older constructs to fit the evolutionary chain of technological progress, which boils down to the same fundamental rules as biological evolution. In the first stage, this book discusses the natural path of programming constructs by starting from time immemorial and ending with examples up to the present times. In the end, naturally driven constructs of all kinds also drive our society today. In the second part, the emphasis is made on the technical side where a total of nine computer languages are used simultaneously for mirrored examples. Simultaneous learning of multiple computer languages can be regarded as an asset in the world of science and technology. Thus, the reader can get used to the majority of known programming or scripting languages. Moreover, a basic knowledge of software implementation in several computer languages, even in an introductory way, helps the versatility and adaptability of the reader to new situations that may arise in industry, education, or research. Thus, this work is meant to bring a more concrete understanding of the similarities and differences between computer languages. Paul A. Gagniuc. An Introduction to Programming Languages: Simultaneous Learning in Multiple Coding Environments. Synthesis Lectures on Computer Science. Springer International Publishing, 2023, pp. 1-280.
Codeforces Competitive Programming Dataset
kaggle.com
zip
Updated Jul 4, 2025
Cite
Dinu Ion George (2025). Codeforces Competitive Programming Dataset [Dataset]. https://www.kaggle.com/datasets/dinuiongeorge/codeforces-competitive-programming-dataset/versions/5
Available download formats: zip (538337548 bytes)
Dataset updated
Jul 4, 2025
Authors
Dinu Ion George
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The first revision of the dataset was published in the paper "Matching Problem Statements to Editorials in Competitive Programming" - ICALT 2024 https://ieeexplore.ieee.org/abstract/document/10645920

The second revision, which is 7 times bigger, was published in the paper "Domain Adaptation for Automated Tag Prediction in Competitive Programming" - AIAI 2025

If you are interested in this dataset, cite one of the papers in your research.

The repositories for the papers can be found at: 1. https://github.com/DinuGeorge0019/MatchingProblemStatementsToEditorialsInCP 2. https://github.com/DinuGeorge0019/MLCP

Competitive programming is a challenging task that demands proficiency in computer science concepts and strong problem-solving skills.

A significant limitation in the field of competitive programming, in the context of machine learning, is the lack of available datasets that include the problem statement, the editorial, and the source code for research purposes. This limitation hinders the development of new algorithms and techniques to improve the efficiency and accuracy of selecting or creating suitable editorials for given problems.

To address this problem, we have introduced a comprehensive series of over 7000 competitive programming problems that encompass editorial solutions, source code and other metadata.

Note: PSG named datasets from 01_TASK_DATASETS directory are provided from the paper https://arxiv.org/abs/2310.05791 with the public repository https://github.com/sronger/PSG_Predicting_Algorithm_Tags_and_Difficulty
Online Programming Courses Report
archivemarketresearch.com
doc, pdf, ppt
Updated Mar 15, 2025
Cite
Archive Market Research (2025). Online Programming Courses Report [Dataset]. https://www.archivemarketresearch.com/reports/online-programming-courses-59301
Available download formats: pdf, doc, ppt
Dataset updated
Mar 15, 2025
Dataset authored and provided by
Archive Market Research
License
https://www.archivemarketresearch.com/privacy-policy
Time period covered
2025 - 2033
Area covered
Global
Variables measured
Market Size
Description
The online programming courses market is experiencing robust growth, driven by the increasing demand for skilled software developers and the accessibility of online learning platforms. The market size in 2025 is estimated at $25 billion, exhibiting a Compound Annual Growth Rate (CAGR) of 15% from 2025 to 2033. This significant growth is fueled by several factors: the rising adoption of online learning methodologies, particularly among professionals seeking upskilling or reskilling opportunities; the proliferation of diverse programming languages, creating a continuous need for specialized training; and the increasing affordability and accessibility of high-quality online courses. The market's segmentation reveals a strong preference for Java, Python, and C/C++ languages, reflecting their widespread industry applications. Web development and software development remain the dominant application segments, highlighting the substantial industry demand. Leading players like Coursera, Udemy, and edX leverage their established brand reputation and course quality to maintain their market share. However, new entrants and specialized platforms focusing on specific niche languages or technologies are also emerging, creating a competitive yet dynamic market landscape. The projected CAGR of 15% suggests a substantial market expansion, with the market size anticipated to reach approximately $70 billion by 2033. This growth trajectory is expected to be influenced by ongoing technological advancements, the expanding digital economy, and government initiatives promoting digital literacy. The regional distribution of the market is expected to be geographically diverse, with North America and Europe holding significant market shares initially, followed by a steady increase in demand from Asia-Pacific regions like India and China, driven by a burgeoning tech industry and a growing young population eager to embrace digital skills. 
However, challenges such as inconsistent internet access in certain regions and concerns about the quality and authenticity of online courses might pose some restraints on the market's overall growth. Nevertheless, the overall forecast points to sustained and significant growth in the online programming courses market over the next decade.
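The growth figures quoted above can be checked with a few lines of arithmetic. The sketch below assumes annual compounding over eight periods (2025 to 2033); the report does not state its compounding convention, and the function names are illustrative only. Under that assumption, a $25 billion base at 15% compounds to somewhat more than the quoted "approximately $70 billion", and the rate implied by the two quoted endpoints comes out below 15%.

```python
def project_market_size(base_size: float, cagr: float, years: int) -> float:
    """Compound a base-year market size forward by `years` at annual rate `cagr`."""
    return base_size * (1.0 + cagr) ** years


def implied_cagr(start_size: float, end_size: float, years: int) -> float:
    """Annual growth rate that takes `start_size` to `end_size` in `years` periods."""
    return (end_size / start_size) ** (1.0 / years) - 1.0


# Figures quoted in the report description (USD billions).
base_2025 = 25.0
quoted_2033 = 70.0
quoted_cagr = 0.15

# Assumption: eight annual compounding periods between 2025 and 2033.
years = 2033 - 2025

projected_2033 = project_market_size(base_2025, quoted_cagr, years)
rate_from_endpoints = implied_cagr(base_2025, quoted_2033, years)

print(f"Projected 2033 size at 15% CAGR: ${projected_2033:.1f} billion")
print(f"CAGR implied by $25B -> $70B: {rate_from_endpoints:.1%}")
```

With seven compounding periods instead of eight, the projection falls below $70 billion, so the quoted figure is consistent with a rounded estimate between the two conventions.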
programming-jokes-dataset
huggingface.co
Updated Aug 3, 2024
Cite
Asfandyar Azhar (2024). programming-jokes-dataset [Dataset]. https://huggingface.co/datasets/asfandyarazhar/programming-jokes-dataset
Available formats: Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
Dataset updated
Aug 3, 2024
Authors
Asfandyar Azhar
Description
Programming Jokes Dataset

Dataset Summary

This dataset contains programming-related jokes scraped from the website Punny Funny. The jokes are organized into different categories based on the structure of the original webpage. The dataset is intended for use in natural language processing tasks, such as fine-tuning language models to generate humor or analyze textual content in the programming domain. Number of Jokes: [220]

Usage

This dataset is suitable for… See the full description on the dataset page: https://huggingface.co/datasets/asfandyarazhar/programming-jokes-dataset.
Online Programming Learn Platform Report
datainsightsmarket.com
doc, pdf, ppt
Updated Oct 21, 2025
Cite
Data Insights Market (2025). Online Programming Learn Platform Report [Dataset]. https://www.datainsightsmarket.com/reports/online-programming-learn-platform-1446246
Available download formats: ppt, pdf, doc
Dataset updated
Oct 21, 2025
Dataset authored and provided by
Data Insights Market
License
https://www.datainsightsmarket.com/privacy-policy
Time period covered
2025 - 2033
Area covered
Global
Variables measured
Market Size
Description
The global Online Programming Learn Platform market is projected for robust expansion, with an estimated market size of approximately $15,000 million in 2025, poised for significant growth at a Compound Annual Growth Rate (CAGR) of around 18% through 2033. This surge is primarily driven by the escalating demand for digital skills across all sectors, the increasing adoption of remote learning solutions, and the continuous evolution of technology necessitating constant upskilling. The platform offers a diverse range of applications, catering to adult learners seeking career advancement or pivots, as well as younger demographics like children and teens exploring early exposure to coding. Key types of learning encompass Python, JavaScript, and comprehensive Web Development, alongside "Other" specialized programming languages and frameworks. Leading companies such as Coursera, Pluralsight, Udemy, and Codecademy are actively shaping this landscape with their extensive course offerings, interactive learning modules, and flexible pricing models, fostering a competitive yet collaborative environment for learners worldwide. The market's trajectory is further bolstered by emerging trends like the integration of Artificial Intelligence (AI) in personalized learning paths, the growing emphasis on practical, project-based learning, and the proliferation of micro-credentialing and bootcamps designed for rapid skill acquisition. While the market enjoys strong growth, certain restraints exist, including the need for reliable internet access and a consistent learning environment, particularly for younger learners, and the challenge of ensuring the quality and relevance of rapidly evolving course content. Geographically, North America and Europe currently dominate the market share, driven by established tech industries and a strong culture of continuous learning. 
However, the Asia Pacific region, particularly China and India, is expected to witness the most rapid growth due to a burgeoning young population, increasing internet penetration, and a strong governmental push towards digital literacy and STEM education. This report provides an in-depth analysis of the global Online Programming Learn Platform market, encompassing a comprehensive study of its evolution from the historical period of 2019-2024, through the base year of 2025, and projecting its trajectory up to 2033. With a meticulous examination of market dynamics, industry trends, and strategic landscapes, this report offers invaluable insights for stakeholders seeking to capitalize on the burgeoning opportunities within this sector. The market size is projected to reach several million dollars by the end of the forecast period, driven by increasing demand for digital skills across various demographics and industries.


BigCode (2022). the-stack [Dataset]. https://huggingface.co/datasets/bigcode/the-stack


70 scholarly articles cite this dataset (View in Google Scholar)

