17 datasets found
  1. Most Popular Programming Languages 2004-2024

    • kaggle.com
    zip
    Updated Sep 15, 2024
    Cite
    Muhammad Roshan Riaz (2024). Most Popular Programming Languages 2004-2024 [Dataset]. https://www.kaggle.com/datasets/muhammadroshaanriaz/most-popular-programming-languages-2004-2024/code
    Explore at:
    zip (3491 bytes). Available download formats
    Dataset updated
    Sep 15, 2024
    Authors
    Muhammad Roshan Riaz
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    This dataset contains the following columns:

    - Month: The date (in year-month format) when the data was recorded.
    - Python Worldwide(%): The percentage of global popularity for Python during that month.
    - JavaScript Worldwide(%): The percentage of global popularity for JavaScript.
    - Java Worldwide(%): The percentage of global popularity for Java.
    - C# Worldwide(%): The percentage of global popularity for C#.
    - PHP Worldwide(%): The percentage of global popularity for PHP.
    - Flutter Worldwide(%): The percentage of global popularity for Flutter.
    - React Worldwide(%): The percentage of global popularity for React.
    - Swift Worldwide(%): The percentage of global popularity for Swift.
    - TypeScript Worldwide(%): The percentage of global popularity for TypeScript.
    - Matlab Worldwide(%): The percentage of global popularity for Matlab.

    Each row represents data for a particular month, starting from January 2004, tracking the popularity trends of these programming languages worldwide.
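
    A minimal sketch (pandas) of loading and summarizing this table, assuming the Kaggle archive contains a single CSV with the columns listed above; the file name used here is an assumption.

    import pandas as pd

    # Load the monthly popularity table; the file name is illustrative.
    df = pd.read_csv("most_popular_programming_languages_2004_2024.csv",
                     parse_dates=["Month"])

    # Average worldwide share per language over the full 2004-2024 span.
    language_cols = [c for c in df.columns if c.endswith("Worldwide(%)")]
    print(df[language_cols].mean().sort_values(ascending=False))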

  2. Developer Expertise Dataset on JavaScript Libraries

    • data.niaid.nih.gov
    Updated Jan 24, 2020
    Cite
    Montandon, João Eduardo; Silva, Luciana Lourdes; Valente, Marco Tulio (2020). Developer Expertise Dataset on JavaScript Libraries [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_1484497
    Explore at:
    Dataset updated
    Jan 24, 2020
    Dataset provided by
    UFMG
    IFMG
    Authors
    Montandon, João Eduardo; Silva, Luciana Lourdes; Valente, Marco Tulio
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset contains an anonymized list of surveyed developers who provided their expertise level on three popular JavaScript libraries:

    ReactJS, a library for building enriched web interfaces

    MongoDB, a driver for accessing MongoDB databases

    Socket.IO, a library for realtime communication

  3. Computer language popularity

    • kaggle.com
    Updated Oct 6, 2025
    Cite
    LIUYUMING1 (2025). Computer language popularity [Dataset]. https://www.kaggle.com/datasets/liuyuming1/computer-language-popularity/code
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Oct 6, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    LIUYUMING1
    Description

    As shown in the chart, Python ranks first with a usage rate of 28.7%, demonstrating its continued advantage in the fields of data science and artificial intelligence. JavaScript follows closely at 19.3%, reflecting its widespread use in front-end and full-stack development. Traditional languages such as Java and C# still maintain a stable market share, while emerging languages like Go and Rust show significant growth potential. Overall, the popularity of programming languages is closely related to technological trends. The leading positions of Python and JavaScript indicate a shift in development focus towards data-driven and web-oriented directions. In the future, with the further development of cloud computing and artificial intelligence, the usage of emerging languages such as Go and Rust is expected to continue increasing.

  4. Programming Language Ecosystem Project TU Wien

    • test.researchdata.tuwien.at
    csv, text/markdown
    Updated Jun 25, 2024
    Cite
    Valentin Futterer (2024). Programming Language Ecosystem Project TU Wien [Dataset]. http://doi.org/10.70124/gnbse-ts649
    Explore at:
    text/markdown, csv. Available download formats
    Dataset updated
    Jun 25, 2024
    Dataset provided by
    TU Wien
    Authors
    Valentin Futterer
    License

    Open Data Commons Attribution License (ODC-By) v1.0: https://www.opendatacommons.org/licenses/by/1.0/
    License information was derived automatically

    Time period covered
    Dec 12, 2023
    Area covered
    Vienna
    Description

    About Dataset

    This dataset was created during the Programming Language Ecosystem project at TU Wien, using the code in the repository https://github.com/ValentinFutterer/UsageOfProgramminglanguages2011-2023?tab=readme-ov-file.

    The centerpiece of this repository is usage_of_programming_languages_2011-2023.csv. This CSV file shows the popularity of programming languages over the last 12 years in yearly increments. The repository also contains graphs created with the dataset. To get an accurate estimate of the popularity of programming languages, this dataset was created from three vastly different sources.

    About Data collection methodology

    The dataset was created using the GitHub repository above. As input data, three public datasets were used.

    github_metadata

    Taken from https://www.kaggle.com/datasets/pelmers/github-repository-metadata-with-5-stars/ by Peter Elmers. It is licensed under CC BY 4.0 (https://creativecommons.org/licenses/by/4.0/). It contains metadata (no code) for all GitHub repositories with more than 5 stars.

    PYPL_survey_2004-2023

    Taken from https://github.com/pypl/pypl.github.io/tree/master, put online by the user pcarbonn. It is licensed under CC BY 3.0 (https://creativecommons.org/licenses/by/3.0/). It shows, for each month from 2004 to 2023, the share of programming-related Google searches per language.

    stack_overflow_developer_survey

    Taken from https://insights.stackoverflow.com/survey. It is licensed under the Open Data Commons Open Database License (ODbL) v1.0 (https://opendatacommons.org/licenses/odbl/1-0/). It contains the results of the yearly Stack Overflow developer survey from 2011 to 2023.

    All three datasets were downloaded on 12 December 2023 and are included in the GitHub repository above.

    Description of the data

    The dataset contains a column for the year and then many columns for the different languages, denoting their usage in percent. Additionally, vertical bar charts and pie charts for each year, plus a line graph for each language over the whole timespan, are provided as PNGs.
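
    A minimal sketch of reading the CSV described above, assuming a "year" column plus one column per language holding usage in percent (the exact column names are assumptions).

    import pandas as pd

    df = pd.read_csv("usage_of_programming_languages_2011-2023.csv").set_index("year")

    # Trend of a single language over the whole timespan.
    print(df["Python"])

    # Languages ranked by usage in the most recent year.
    print(df.loc[df.index.max()].sort_values(ascending=False).head(10))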

    The languages that are going to be considered for the project can be seen here:

    - Python

    - C

    - C++

    - Java

    - C#

    - JavaScript

    - PHP

    - SQL

    - Assembly

    - Scratch

    - Fortran

    - Go

    - Kotlin

    - Delphi

    - Swift

    - Rust

    - Ruby

    - R

    - COBOL

    - F#

    - Perl

    - TypeScript

    - Haskell

    - Scala

    License

    This project is licensed under the Open Data Commons Open Database License (ODbL) v1.0 (https://opendatacommons.org/licenses/odbl/1-0/).

    TLDR: You are free to share, adapt, and create derivative works from this dataset as long as you attribute me, keep the database open (if you redistribute it), and continue to share alike any adapted database under the ODbL.

    Acknowledgments

    Thanks go out to

    - Stack Overflow (https://insights.stackoverflow.com/survey) for providing the data from the yearly Stack Overflow developer survey.

    - the PYPL survey (https://github.com/pypl/pypl.github.io/tree/master) for providing Google search data.

    - Peter Elmers, for crawling metadata on GitHub repositories and providing the data at https://www.kaggle.com/datasets/pelmers/github-repository-metadata-with-5-stars/.

  5. Stack Overflow tags

    • kaggle.com
    zip
    Updated Jan 6, 2021
    Cite
    Abid Ali Awan (2021). Stack Overflow tags [Dataset]. https://www.kaggle.com/datasets/kingabzpro/stack-overflow-tags/code
    Explore at:
    zip (273306 bytes). Available download formats
    Dataset updated
    Jan 6, 2021
    Authors
    Abid Ali Awan
    License

    GNU GPL v2.0: http://www.gnu.org/licenses/old-licenses/gpl-2.0.en.html

    Description

    Context

    How can we tell what programming languages and technologies are used by the most people? How about what languages are growing and which are shrinking, so that we can tell which are most worth investing time in?

    One excellent source of data is Stack Overflow, a programming question and answer site with more than 16 million questions on programming topics. By measuring the number of questions about each technology, we can get an approximate sense of how many people are using it. We're going to use open data from the Stack Exchange Data Explorer to examine how the relative popularity of languages like R, Python, Java, and JavaScript has changed over time.

    Content

    Each Stack Overflow question has a tag, which describes its topic or technology. For instance, there's a tag for languages like R or Python, and for packages like ggplot2 or pandas.

    We'll be working with a dataset with one observation for each tag in each year. The dataset includes both the number of questions asked in that tag in that year, and the total number of questions asked in that year.
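
    A minimal sketch of the per-tag share described above, assuming columns named year, tag, number (questions with that tag), and year_total (all questions that year); the file and column names in the Kaggle copy are assumptions.

    import pandas as pd

    tags = pd.read_csv("by_tag_year.csv")  # file name is an assumption

    # Share of all questions asked in a year that carry each tag.
    tags["fraction"] = tags["number"] / tags["year_total"]

    # Compare a few languages over time.
    langs = tags[tags["tag"].isin(["r", "python", "java", "javascript"])]
    print(langs.pivot(index="year", columns="tag", values="fraction").round(4))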

    Acknowledgements

    DataCamp

  6. Data from: E2EGit: A Dataset of End-to-End Web Tests in Open Source Projects...

    • zenodo.org
    bin, txt
    Updated May 20, 2025
    Cite
    Sergio Di Meglio; Valeria Pontillo; Coen De Roover; Luigi Libero Lucio Starace; Sergio Di Martino; Ruben Opdebeeck (2025). E2EGit: A Dataset of End-to-End Web Tests in Open Source Projects [Dataset]. http://doi.org/10.5281/zenodo.14221860
    Explore at:
    txt, bin. Available download formats
    Dataset updated
    May 20, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Sergio Di Meglio; Valeria Pontillo; Coen De Roover; Luigi Libero Lucio Starace; Sergio Di Martino; Ruben Opdebeeck
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    ABSTRACT
    End-to-End (E2E) testing is a comprehensive approach to validating the functionality of a software application by testing its entire workflow from the user’s perspective, ensuring that all integrated components work together as expected. It is crucial for ensuring the quality and reliability of applications, especially in the web domain, which is often bound by Service Level Agreements (SLAs). This testing involves two key activities:
    Graphical User Interface (GUI) testing, which simulates user interactions through browsers, and performance testing, which evaluates system workload handling. Despite its importance, E2E testing is often neglected, and the lack of reliable datasets for Web GUI and performance testing has slowed research progress. This paper addresses these limitations by constructing E2EGit, a comprehensive dataset, cataloging non-trivial open-source web projects on GITHUB that adopt GUI or performance testing.
    The dataset construction process involved analyzing over 5k non-trivial web repositories based on popular programming languages (JAVA, JAVASCRIPT, TYPESCRIPT, PYTHON) to identify: 1) GUI tests based on popular browser automation frameworks (SELENIUM, PLAYWRIGHT, CYPRESS, PUPPETEER), and 2) performance tests written with the most popular open-source tools (JMETER, LOCUST). After this analysis, we identified 472 repositories using web GUI testing, with over 43,000 tests, and 84 repositories using performance testing, with 410 tests.


    DATASET DESCRIPTION
    The dataset is provided as an SQLite database, whose structure is illustrated in Figure 3 (in the paper) and consists of five tables, each serving a specific purpose.

    The repository table contains information on 1.5 million repositories collected using the SEART tool on May 4. It includes 34 fields detailing repository characteristics. The non_trivial_repository table is a subset of the previous one, listing repositories that passed the two filtering stages described in the pipeline. For each repository, it specifies whether it is a web repository using JAVA, JAVASCRIPT, TYPESCRIPT, or PYTHON frameworks. A repository may use multiple frameworks, with the corresponding fields (e.g., is_web_java) set to true and the web_dependencies field listing the detected web frameworks.

    For web GUI testing, the dataset includes two additional tables: gui_testing_test_details, where each row represents a test file and provides the file path, the browser automation framework used, the test engine employed, and the number of tests implemented in the file; and gui_testing_repo_details, which aggregates data from the previous table at the repository level. Each of the 472 repositories has a row summarizing the number of test files using frameworks like SELENIUM or PLAYWRIGHT, test engines like JUNIT, and the total number of tests identified.

    For performance testing, the performance_testing_test_details table contains 410 rows, one for each test identified. Each row includes the file path, whether the test uses JMETER or LOCUST, and extracted details such as the number of thread groups, concurrent users, and requests. Notably, some fields may be absent, for instance if external files (e.g., CSVs defining workloads) were unavailable, or in the case of Locust tests, where parameters like duration and concurrent users are specified via the command line.
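
    A minimal sketch of querying the SQLite database with Python's standard sqlite3 module; the database file name and the repository column in gui_testing_test_details are assumptions based on the description above.

    import sqlite3

    con = sqlite3.connect("e2egit.sqlite")  # file name is an assumption
    cur = con.cursor()

    # List the five tables described in the paper.
    cur.execute("SELECT name FROM sqlite_master WHERE type='table'")
    print([row[0] for row in cur.fetchall()])

    # Example: GUI-test files recorded per repository (column name assumed).
    cur.execute("""
        SELECT repository, COUNT(*) AS test_files
        FROM gui_testing_test_details
        GROUP BY repository
        ORDER BY test_files DESC
        LIMIT 10
    """)
    print(cur.fetchall())
    con.close()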

    To cite this article, refer to this citation:

    @inproceedings{di2025e2egit,
    title={E2EGit: A Dataset of End-to-End Web Tests in Open Source Projects},
    author={Di Meglio, Sergio and Starace, Luigi Libero Lucio and Pontillo, Valeria and Opdebeeck, Ruben and De Roover, Coen and Di Martino, Sergio},
    booktitle={2025 IEEE/ACM 22nd International Conference on Mining Software Repositories (MSR)},
    pages={10--15},
    year={2025},
    organization={IEEE/ACM}
    }

    This work has been partially supported by the Italian PNRR MUR project PE0000013-FAIR.

  7. Salaries of developers in Ukraine

    • kaggle.com
    zip
    Updated Nov 17, 2022
    Cite
    Mysha Rysh (2022). Salaries of developers in Ukraine [Dataset]. https://www.kaggle.com/datasets/mysha1rysh/salaries-of-developers-in-ukraine
    Explore at:
    zip (24303 bytes). Available download formats
    Dataset updated
    Nov 17, 2022
    Authors
    Mysha Rysh
    Area covered
    Ukraine
    Description

    This data was collected by the team at https://dou.ua/. This resource is very popular in Ukraine. It provides salary statistics, shows current vacancies, and publishes useful articles related to the life of an IT specialist. This dataset was taken from the public repository https://github.com/devua/csv/tree/master/salaries. The dataset includes the following data for each developer: salary, position (e.g., Junior, Middle), experience, city, and tech (e.g., C#/.NET, JavaScript, Python). I think this dataset will be useful to our community. Thank you.
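
    A minimal sketch of summarizing the salary survey, assuming a CSV export with columns matching the fields listed above (salary, position, experience, city, tech); the file and column names are assumptions.

    import pandas as pd

    salaries = pd.read_csv("salaries.csv")

    # Median salary by technology and seniority level.
    print(salaries.groupby(["tech", "position"])["salary"].median().unstack())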

  8. Multi-language Open Source Code Identifier Dataset

    • kaggle.com
    zip
    Updated Jul 8, 2025
    Cite
    Bharat Mane (2025). Multi-language Open Source Code Identifier Dataset [Dataset]. https://www.kaggle.com/datasets/bharatmane/multi-language-open-source-code-identifier-dataset/data
    Explore at:
    zip (3690401 bytes). Available download formats
    Dataset updated
    Jul 8, 2025
    Authors
    Bharat Mane
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    This dataset was created to support research and tool development in the areas of code readability, identifier naming, program comprehension, and code mining. It contains 362,886 unique identifier names (including classes, functions/methods, and variables) extracted from 21 widely used and actively maintained open-source projects.

    Projects were carefully selected from four major programming language ecosystems: Java, Python, C#, and JavaScript/TypeScript. The repositories span popular libraries and frameworks in domains such as data science, web development, backend systems, dependency injection, and more. These projects are widely recognized as benchmarks in their respective communities, ensuring that the dataset represents industry best practices in naming and code style.

    Context & Motivation: Good identifier naming is fundamental for code readability and maintainability, yet cross-language empirical datasets are rare. This dataset enables comparative studies of naming conventions, training and benchmarking of AI models, and reproducible research on identifier readability. It is designed to be both a large-scale resource and a realistic reflection of naming in production-quality code.

    Sources:
    - Java: commons-lang, guava, hibernate-orm, logging-log4j2, spring-framework
    - Python: django, flask, numpy, pandas, requests
    - C#: Autofac, Dapper, Hangfire, IdentityServer, NLog
    - JavaScript/TypeScript: react, vue, d3, lodash, express, angular, angular-cli, ngx-bootstrap, TypeScript, NestJS

    Each identifier is labelled with its project, language, type, and name. We encourage use for academic research, code intelligence, machine learning, and developer education.
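
    A minimal sketch of exploring the identifier table, assuming a single CSV with the project, language, type, and name labels described above (the file and column names are assumptions).

    import pandas as pd

    ids = pd.read_csv("identifiers.csv")

    # Identifiers of each kind extracted per language ecosystem.
    print(ids.groupby(["language", "type"]).size().unstack(fill_value=0))

    # A simple readability proxy: average identifier length per language.
    print(ids["name"].str.len().groupby(ids["language"]).mean().round(1))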

  9. mnist

    • tensorflow.org
    • universe.roboflow.com
    • +4more
    Updated Jun 1, 2024
    Cite
    (2024). mnist [Dataset]. https://www.tensorflow.org/datasets/catalog/mnist
    Explore at:
    Dataset updated
    Jun 1, 2024
    Description

    The MNIST database of handwritten digits.

    To use this dataset:

    import tensorflow_datasets as tfds
    
    ds = tfds.load('mnist', split='train')
    for ex in ds.take(4):
     print(ex)
    

    See the guide for more information on tensorflow_datasets.

    Visualization: https://storage.googleapis.com/tfds-data/visualization/fig/mnist-3.0.1.png

  10. oop

    • huggingface.co
    Updated Jul 8, 2024
    Cite
    CodeAI (2024). oop [Dataset]. https://huggingface.co/datasets/codeai-dteam/oop
    Explore at:
    Dataset updated
    Jul 8, 2024
    Dataset authored and provided by
    CodeAI
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    MultiOOP: A Multi-Language Object-Oriented Programming Benchmark for Large Language Models

      Dataset Summary
    

    MultiOOP is a multi-language object-oriented programming benchmark designed to establish fair and robust evaluations for intelligent code generation by large language models (LLMs). It addresses major imbalances in existing benchmarks by covering six popular programming languages: Python, PHP, C++, C#, Java, and JavaScript. The benchmark features 267 tasks per… See the full description on the dataset page: https://huggingface.co/datasets/codeai-dteam/oop.
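
    A minimal sketch of loading the benchmark with the Hugging Face datasets library; the repository id comes from the URL above, while the available splits and record fields are assumptions.

    from datasets import load_dataset

    ds = load_dataset("codeai-dteam/oop")
    print(ds)                      # available splits and features
    first_split = next(iter(ds))
    print(ds[first_split][0])      # peek at one task record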

  11. Imagery and Map Services

    • catalog.data.gov
    • data.cityofnewyork.us
    Updated Nov 1, 2024
    Cite
    data.cityofnewyork.us (2024). Imagery and Map Services [Dataset]. https://catalog.data.gov/dataset/imagery-and-map-services
    Explore at:
    Dataset updated
    Nov 1, 2024
    Dataset provided by
    data.cityofnewyork.us
    Description

    The Department of Information Technology and Telecommunications, GIS Unit, has created a series of Map Tile Services for use in public web mapping and desktop applications. The link below describes the Basemap, Labels, and Aerial Photographic map services, as well as how to utilize them in popular JavaScript web mapping libraries and desktop GIS applications. A showcase application, NYC Then&Now (https://maps.nyc.gov/then&now/), is also included on that page.

  12. Job Descriptions 2025 – Tech & Non-Tech Roles

    • kaggle.com
    Updated Aug 31, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Aditya Raj Srivastava (2025). Job Descriptions 2025 – Tech & Non-Tech Roles [Dataset]. https://www.kaggle.com/datasets/adityarajsrv/job-descriptions-2025-tech-and-non-tech-roles
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Aug 31, 2025
    Dataset provided by
    Kaggle
    Authors
    Aditya Raj Srivastava
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Synthetic Job Descriptions Dataset – Tech & Non-Tech Roles (2025)

    This dataset contains 1,100 synthetic job descriptions (JDs) spanning 55 diverse roles, designed to facilitate career guidance, resume building, ATS (Applicant Tracking System) simulation, and research in NLP/ML.

    All job descriptions are synthetically generated based on curated references from publicly available job postings, career guides, and professional role descriptions. They are not real job postings but represent realistic expectations, responsibilities, and skills for each role.

    Roles Covered

    Tech Roles (Core, Popular, and Niche)

    1. Software Engineer
    2. Software Developer
    3. Full Stack Developer
    4. Backend Developer
    5. Frontend Developer
    6. Data Scientist
    7. Machine Learning Engineer
    8. AI Engineer
    9. DevOps Engineer
    10. Cloud Engineer
    11. Data Analyst
    12. Business Intelligence Analyst
    13. QA Engineer
    14. Test Automation Engineer
    15. iOS Mobile App Developer
    16. Android Mobile App Developer
    17. Vibe Coder
    18. UI Designer
    19. UX Designer
    20. Product Designer
    21. Cybersecurity Analyst
    22. Python Developer
    23. Data Engineer
    24. Network Engineer
    25. Cloud Architect
    26. Systems Engineer
    27. Java Developer
    28. .NET Developer
    29. Web Developer
    30. Software Tester (SDET)
    31. Solutions Architect
    32. Big Data Specialist
    33. Fintech Engineer
    34. AI Prompt Engineer
    35. Blockchain Developer
    36. Robotics Engineer
    37. Javascript Developer
    38. AR/VR Developer
    39. IoT Engineer
    40. Ethical Hacker
    41. Site Reliability Engineer (SRE)
    42. Game Developer

    Non-Tech Roles (Business, Creative, Operations, and Niche)

    1. Product Manager
    2. Project Manager
    3. Marketing Specialist
    4. Digital Marketing Specialist
    5. SEO Specialist
    6. Content Writer
    7. Copywriter
    8. Business Analyst
    9. Operations Manager
    10. Sales Executive
    11. Technical Writer
    12. Market Research Analyst
    13. Graphic Designer

    Dataset Structure

    • Total Job Descriptions: 1,100 (20 per role)
    • Fields per JD:
      - JobID: Unique identifier for each job description
      - Title: Job role/title
      - ExperienceLevel: Fresher / Junior / Experienced / Lead / Senior
      - YearsOfExperience: Numeric range of years (e.g., 0-1, 3-5)
      - Skills: List of required skills (JSON array or semicolon-separated in CSV)
      - Responsibilities: Key responsibilities (JSON array or semicolon-separated in CSV)
      - Keywords: Role-specific focus areas (JSON array or semicolon-separated in CSV)

    Key Features & Insights

    • Balanced Experience Distribution: ~50% entry-level / fresher, ~50% experienced / senior roles.
    • Top Skills Across Roles: Python, JavaScript, React, TensorFlow, ML/NLP, Docker, Collaboration, Problem-solving, Figma.
    • Keyword Trends: AI, ML, Data Analytics, Cloud, DevOps, Prompt Engineering, UX/UI.
    • Comprehensive Coverage: Tech roles cover core development, data, AI, cloud, security, and niche specialties, while non-tech roles cover business, creative, operations, marketing, and analytics.

    Potential Use Cases

    • Resume Builders: Generate role-specific resumes highlighting relevant skills and responsibilities.
    • ATS Simulation / Scoring: Test applicant tracking systems with realistic job descriptions.
    • Career Analytics: Analyze trends in skills, responsibilities, and popular roles.
    • Machine Learning & NLP Projects: Use for text classification, skill extraction, recommendation systems, and job matching.
    • Educational Purposes: Ideal for learning about job role requirements across tech and non-tech domains.

    File Formats

    • JSON: job_dataset.json – structured array of job objects.
    • CSV: job_dataset.csv – arrays flattened with semicolons for easy viewing in Excel or Pandas.
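
    A minimal sketch of loading the CSV export described above and re-expanding the semicolon-flattened list fields; the field names follow the table in the Dataset Structure section.

    import pandas as pd

    jobs = pd.read_csv("job_dataset.csv")

    # Re-expand the flattened list fields.
    for col in ["Skills", "Responsibilities", "Keywords"]:
        jobs[col] = jobs[col].str.split(";")

    # Most frequently required skills across all 1,100 descriptions.
    print(jobs["Skills"].explode().str.strip().value_counts().head(15))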

    Licensing

    • Free to use, share, and adapt for research, educational, or personal projects.
    • Please cite the dataset in publications or projects:

    Synthetic Job Descriptions Dataset (2025) – Curated & Generated by Aditya Raj Srivastava (https://www.kaggle.com/adityarajsrv)

  13. Most demanded tech skills worldwide 2023

    • statista.com
    Updated Nov 28, 2025
    Cite
    Statista (2025). Most demanded tech skills worldwide 2023 [Dataset]. https://www.statista.com/statistics/1296668/top-in-demand-tech-skills-worldwide/
    Explore at:
    Dataset updated
    Nov 28, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    2022
    Area covered
    Worldwide
    Description

    In 2023, the tech skill most in demand by recruiters was web development. This was closely followed by DevOps and database software skills. Interestingly, over ** percent of recruiters were actively seeking individuals with cybersecurity skills. Not far behind, AI/Machine learning/Deep learning ranked fourth, with approximately ** percent of respondents identifying it as their most sought-after tech skill. These preferences align with the skills that developers worldwide are keen to acquire, particularly web development and AI/Machine learning/Deep learning. AI at the forefront of IT skills Since the release of ChatGPT in late 2022, demand for AI and automation skills has increased across all sectors. In 2023, ChatGPT was the leading technology skill globally according to topic consumption on Udemy Business, experiencing a massive growth of over ***** percent in global topic consumption. In the same year, over ** percent of software developers reported using AI to help write code in the development workflow, while another ** percent said they currently use it for debugging code. Different languages for different needs JavaScript and Java, commonly used for back-end and front-end web development, were the most demanded programming languages worldwide in 2022, followed by SQL and Python. By industry, JavaScript and Java hold the fort in the IT services and aviation industries, while SQL was more popular in the healthcare sector as well as the marketing and advertising industries. Python, well suited for data science applications, was more commonly used in the manufacturing, education, and energy industries.

  14. A Personalized Activity-based Spatiotemporal Risk Mapping Approach to...

    • figshare.com
    tiff
    Updated Mar 18, 2021
    Cite
    Jing Li; Xuantong Wang; Hexuan Zheng; Tong Zhang (2021). A Personalized Activity-based Spatiotemporal Risk Mapping Approach to COVID-19 Pandemic [Dataset]. http://doi.org/10.6084/m9.figshare.13517105.v1
    Explore at:
    tiff. Available download formats
    Dataset updated
    Mar 18, 2021
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Jing Li; Xuantong Wang; Hexuan Zheng; Tong Zhang
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The datasets used for this manuscript were derived from multiple sources: Denver Public Health, Esri, Google, and SafeGraph. Any reuse or redistribution of the datasets is subject to the restrictions of those data providers; consult the relevant parties for permissions.

    1. COVID-19 case data were retrieved from Denver Public Health (link: https://storymaps.arcgis.com/stories/50dbb5e7dfb6495292b71b7d8df56d0a).
    2. Point of Interest (POI) data were retrieved from Esri and SafeGraph (link: https://coronavirus-disasterresponse.hub.arcgis.com/datasets/6c8c635b1ea94001a52bf28179d1e32b/data?selectedAttribute=naics_code) and verified with the Google Places Service (link: https://developers.google.com/maps/documentation/javascript/reference/places-service).
    3. The activity risk information is accessible from the Texas Medical Association (TMA) (link: https://www.texmed.org/TexasMedicineDetail.aspx?id=54216).

    The datasets for risk assessment and mapping are included in a geodatabase. Per SafeGraph data sharing guidelines, raw data cannot be shared publicly. To view the content of the geodatabase, users should have ArcGIS Pro 2.7 installed. The geodatabase includes the following:

    1. POI. Major attributes are location, name, and daily popularity.
    2. Denver neighborhoods with weekly COVID-19 cases and computed regional risk levels.
    3. Four simulated travel logs with anchor points provided. Each is a separate point layer.

  15. Pysäköintivirheet Helsingissä (Parking violations in Helsinki)

    • hri.fi
    Updated Mar 4, 2024
    Cite
    (2024). Pysäköintivirheet Helsingissä [Dataset]. https://hri.fi/data/dataset/pysakointivirheet-helsingissa
    Explore at:
    Dataset updated
    Mar 4, 2024
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Parking violations recorded by parking enforcement and the police in Helsinki since January 2014. The data contains the time of the violation (at month and year precision), the address where the fine or warning was issued, the stage of the fine, the main reason for the violation, who recorded the violation, the postal code, and area information. The "todellinen sijainti" (actual location) attribute indicates whether the recorded location is the actual location of the parking violation. The data is loaded into the database roughly every six months.

    Data preview in the kartta.hel.fi service: Year 2023. Coordinate system(s): ETRS-GK25 (EPSG:3879).

    API service addresses: WFS: https://kartta.hel.fi/ws/geoserver/avoindata/wfs?request=getCapabilities. Published layers: Pysakointivirheet.

    Attributes and data types of the Pysakointivirheet layer: id (int): unique identifier of the feature; kuukausi (string): month, spelled out; vuosi (int): year; osoite (string): street name and possible street number; virhemaksun_vaihe (string): stage, either a warning or a parking fine; virheen_paasyy_ja_paaluokka (string): reason for the parking violation; virheen_kirjaaja (string): who recorded the violation, parking enforcement or the police; easting (int): E coordinate; northing (int): N coordinate; postinumero (string): postal code; postitoimipaikka (string): postal district; suurpiiri (string): name of the major district; kunta (string): municipality name; kunta_nro (string): municipality code; kaupunginosa (string): name of the city district; osa_alue (string): name of the sub-district; todellinen_sijainti (string): indicates whether this is the actual location of the feature (Kyllä/Yes) or an approximate location (Ei/No).

    The files for 2014-2022 contain the following fields: Virheen tekokuukausi = the month the parking fine or warning was written; Virheen tekovuosi = the year the parking fine or warning was written; Osoite = the address where the fine or warning was issued (not necessarily present or complete, e.g., for handwritten fines or warnings issued by the police); Virhemaksun vaihe = parking fine or warning; Virheen pääluokka / Pääsyy = one to three different violation categories; Virheen kirjaaja = parking inspector or the police; Kaupunginosa = city district.

    In addition, the 2014-2017 geospatial data (SHP) includes postal code area information (y, x, postinumero, postitoimipaikka, alue, kunta, kunta_nro). The source was the 2015 postal code area division of the Helsinki metropolitan area. The data contains some points that fall outside Helsinki. The tabular data (CSV) includes some parking violations without an address; these are not present in the geospatial data (SHP). The 2018-2021 geospatial datasets were geocoded from the tabular data with the QGIS Digitransit Geocoding plugin (see the additional information for each resource) and contain errors. For 2022, only tabular data (CSV) is available. From 2023 onwards, new data is available only via the WFS interface.
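
    A minimal sketch of a WFS GetFeature request against the endpoint listed above; the layer name follows the published-layer name in the description, while the "avoindata:" namespace prefix and GeoJSON output support are assumptions of the sketch.

    import requests

    WFS_URL = "https://kartta.hel.fi/ws/geoserver/avoindata/wfs"
    params = {
        "service": "WFS",
        "version": "2.0.0",
        "request": "GetFeature",
        "typeName": "avoindata:Pysakointivirheet",  # namespace prefix assumed
        "outputFormat": "application/json",         # GeoJSON support assumed
        "count": 10,
    }
    resp = requests.get(WFS_URL, params=params, timeout=30)
    resp.raise_for_status()
    for feature in resp.json().get("features", []):
        props = feature["properties"]
        print(props.get("vuosi"), props.get("kuukausi"), props.get("osoite"))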

  16. Base de contribuição popular - Programa de Metas 2017-2020

    • dados.prefeitura.sp.gov.br
    Updated Jul 25, 2017
    + more versions
    Cite
    (2017). Base de contribuição popular - Programa de Metas 2017-2020 [Dataset]. http://dados.prefeitura.sp.gov.br/dataset/base-de-contribuicao-popular-programa-de-metas-2017-2020
    Explore at:
    Dataset updated
    Jul 25, 2017
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically


  17. US Congress Members' Tweets

    • kaggle.com
    zip
    Updated Dec 20, 2023
    Cite
    The Devastator (2023). US Congress Members' Tweets [Dataset]. https://www.kaggle.com/datasets/thedevastator/us-congress-members-tweets
    Explore at:
    zip (5451625 bytes). Available download formats
    Dataset updated
    Dec 20, 2023
    Authors
    The Devastator
    Area covered
    United States
    Description

    US Congress Members' Tweets

    August 2017 Tweets from US Congress Members

    By Social Media Data [source]

    About this dataset

    In-depth Description of the Dataset

    This dataset is a comprehensive compilation of tweets from members of the United States Congress, focusing on the month of August 2017. It contains a wealth of information covering over one thousand accounts tied to various political entities, including Congress members' personal accounts, their office accounts, campaign accounts, and any committee- and party-related handles associated with them.

    The creator behind this project, Alex Litel, undertook an ambitious initiative to compile and present all daily tweets originating from both chambers (the House and Senate) using an automated process referred to as 'Tweets of Congress'. This system is programmed to check Twitter systematically at fixed intervals, ensuring that every tweet within the time frame is accounted for.

    To make these vast amounts of data manageable and easily navigable for users and researchers, the complete collection has been curated and presented in raw form as JavaScript Object Notation (JSON). The datasets are hosted in GitHub repositories, with data produced daily around midnight Eastern Standard Time (EST).

    Furthermore, each aspect involved in collating this dataset, from the front end that forms the visual facade for users to the mechanical parts responsible for generating data within the repositories, works together through the Congressional Tweet Automator. For more insight into how each aspect functions, together or individually, visit the automation section of the official GitHub repo.

    For further convenience, an additional 'users-filtered.JSON' dataset is included, containing metadata for every account utilized by this project during tweet collection.

    Despite this granular detail about these digital interactions, it is worth noting that, due to sheer size limitations, there is a cutoff point at which the archives stop collecting data to make room for new incoming entries and keep the repository manageable.

    Aspirants who wish to explore computational social science projects may find high value here since they can use various statistical analysis strategies like content visualization, time-series analysis, and sentiment analysis to reveal and understand underlying patterns within the tweets. Additionally, it can also be used in fields like Natural Language Processing (NLP) for various linguistic studies.

    The 'Tweets of Congress' project appreciates the contribution of John Otander's Pixyll theme, which has been used extensively in building the front end of the site. Much credit is also owed to the 'unitedstates/congress-legislators' project, which greatly assisted in procuring data, among a wealth of other contributors.

    Finally, it is vital to mention that this dataset is released under the MIT license, permitting any person obtaining a copy to use, modify, and distribute it under the license terms.

    How to use the dataset

    Exploratory Data Analysis:

    Start with a basic exploratory data analysis (EDA) to find trends, patterns, and outliers in the tweet texts.

    • Analyze tweet lengths: Check whether there is any noticeable difference between tweets from different members.

    • Examine tweet timings: Are most tweets sent during work hours or is there significant activity outside normal business hours?

    • Delve into the frequency of hashtags and mentions: Identify the ratio or percentage of tweets that include other users' handles or hashtags; this could suggest whether Congress members are conversing with constituents via Twitter or broadcasting messages.

    • Sentiment analysis: Use NLP tools to perform sentiment analysis on the tweet text to gauge the overall sentiment expressed by members of Congress over time.
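
    A minimal sketch of the mention/hashtag ratios suggested above, assuming one of the daily JSON files can be read into a DataFrame with a "text" column (the file name and column name are assumptions).

    import pandas as pd

    tweets = pd.read_json("2017-08-01.json")  # one daily file; name is illustrative

    has_mention = tweets["text"].str.contains(r"@\w+", regex=True)
    has_hashtag = tweets["text"].str.contains(r"#\w+", regex=True)
    print(f"Tweets with mentions: {has_mention.mean():.1%}")
    print(f"Tweets with hashtags: {has_hashtag.mean():.1%}")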

    Social Network Analysis:

    Social Network Analysis (SNA) is a popular approach for identifying influential individuals in social networks like Twitter.

    • Graph theory techniques could be employed to identify clusters and communities among Congress members based on whom they mention in their tweets (indicating possible relationships between users).

    • Centrality measures can help identify influential Twitter handles that serve as important information hubs or bridges in communication paths.

    There’s also potential for studying Congressional relationships through the frequency of communication between members, which could reveal alliances.
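
    A minimal sketch of the mention-graph and centrality idea above, using networkx; it assumes each tweet record has "screen_name" and "text" fields (the file and column names are assumptions).

    import re

    import networkx as nx
    import pandas as pd

    tweets = pd.read_json("2017-08-01.json")  # file name is illustrative

    G = nx.DiGraph()
    for _, row in tweets.iterrows():
        for mentioned in re.findall(r"@(\w+)", str(row["text"])):
            G.add_edge(row["screen_name"], mentioned)

    # Accounts that act as communication hubs (most mentioned).
    centrality = nx.in_degree_centrality(G)
    print(sorted(centrality.items(), key=lambda kv: kv[1], reverse=True)[:10])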

    ...
