100+ datasets found

Data from: Investigating the Use of AI-Generated Exercises for Beginner and...
data.niaid.nih.gov
zenodo.org
Updated Oct 6, 2023
Cite
Sandro Speth (2023). Investigating the Use of AI-Generated Exercises for Beginner and Intermediate Programming Courses: A ChatGPT Case Study [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7763310
Explore at:
Dataset updated
Oct 6, 2023
Dataset provided by
Steffen Becker
Niklas Meißner
Sandro Speth
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
In recent years, artificial intelligence (AI) has been increasingly used in education and supports teachers in creating educational material and students in their learning progress. AI-driven learning support has recently been further strengthened by the release of ChatGPT, in which users can retrieve explanations for various concepts in a few minutes through chat. However, to what extent the use of AI models, such as ChatGPT, is suitable for the creation of didactically and content-wise good exercises for programming courses is not yet known. Therefore, in this paper, we investigate the use of AI-generated exercises for beginner and intermediate programming courses in higher education using ChatGPT. We created 12 exercise sheets with ChatGPT for a beginner to intermediate programming course focusing on the objects-first approach. We report our process, prompts, and experience using ChatGPT for this task and outline good practices we identified. The generated exercises were assessed and revised, primarily using ChatGPT, until they met the requirements of the programming course. We assessed the quality of these exercises by using them in our course as an external teaching assignment at the University of Education Ludwigsburg and letting the students evaluate them. Results indicate the quality of the generated exercises and the time saved by creating them with ChatGPT. However, our experience showed that while it is fast to generate a good version of an exercise, almost every exercise requires minor manual changes to improve its quality.
Artificial Intelligence (AI) Training Dataset Market Research Report 2033
growthmarketreports.com
csv, pdf, pptx
Updated Jun 30, 2025
Cite
Growth Market Reports (2025). Artificial Intelligence (AI) Training Dataset Market Research Report 2033 [Dataset]. https://growthmarketreports.com/report/artificial-intelligence-training-dataset-market-global-industry-analysis
Explore at:
Available download formats: pptx, csv, pdf
Dataset updated
Jun 30, 2025
Dataset authored and provided by
Growth Market Reports
Time period covered
2024 - 2032
Area covered
Global
Description
Artificial Intelligence (AI) Training Dataset Market Outlook

According to our latest research, the global Artificial Intelligence (AI) Training Dataset market size reached USD 3.15 billion in 2024, reflecting robust industry momentum. The market is expanding at a notable CAGR of 20.8% and is forecasted to attain USD 20.92 billion by 2033. This impressive growth is primarily attributed to the surging demand for high-quality, annotated datasets to fuel machine learning and deep learning models across diverse industry verticals. The proliferation of AI-driven applications, coupled with rapid advancements in data labeling technologies, is further accelerating the adoption and expansion of the AI training dataset market globally.

One of the most significant growth factors propelling the AI training dataset market is the exponential rise in data-driven AI applications across industries such as healthcare, automotive, retail, and finance. As organizations increasingly rely on AI-powered solutions for automation, predictive analytics, and personalized customer experiences, the need for large, diverse, and accurately labeled datasets has become critical. Enhanced data annotation techniques, including manual, semi-automated, and fully automated methods, are enabling organizations to generate high-quality datasets at scale, which is essential for training sophisticated AI models. The integration of AI in edge devices, smart sensors, and IoT platforms is further amplifying the demand for specialized datasets tailored for unique use cases, thereby fueling market growth.

Another key driver is the ongoing innovation in machine learning and deep learning algorithms, which require vast and varied training data to achieve optimal performance. The increasing complexity of AI models, especially in areas such as computer vision, natural language processing, and autonomous systems, necessitates the availability of comprehensive datasets that accurately represent real-world scenarios. Companies are investing heavily in data collection, annotation, and curation services to ensure their AI solutions can generalize effectively and deliver reliable outcomes. Additionally, the rise of synthetic data generation and data augmentation techniques is helping address challenges related to data scarcity, privacy, and bias, further supporting the expansion of the AI training dataset market.

The market is also benefiting from the growing emphasis on ethical AI and regulatory compliance, particularly in data-sensitive sectors like healthcare, finance, and government. Organizations are prioritizing the use of high-quality, unbiased, and diverse datasets to mitigate algorithmic bias and ensure transparency in AI decision-making processes. This focus on responsible AI development is driving demand for curated datasets that adhere to strict quality and privacy standards. Moreover, the emergence of data marketplaces and collaborative data-sharing initiatives is making it easier for organizations to access and exchange valuable training data, fostering innovation and accelerating AI adoption across multiple domains.

From a regional perspective, North America currently dominates the AI training dataset market, accounting for the largest revenue share in 2024, driven by significant investments in AI research, a mature technology ecosystem, and the presence of leading AI companies and data annotation service providers. Europe and Asia Pacific are also witnessing rapid growth, with increasing government support for AI initiatives, expanding digital infrastructure, and a rising number of AI startups. While North America sets the pace in terms of technological innovation, Asia Pacific is expected to exhibit the highest CAGR during the forecast period, fueled by the digital transformation of emerging economies and the proliferation of AI applications across various industry sectors.

Data Type Analysis

The AI training dataset market is segmented by data type into Text, Image/Video, Audio, and Others, each playing a crucial role in powering different AI applications. Text da
AI Training Dataset Market Report | Global Forecast From 2025 To 2033
dataintelo.com
csv, pdf, pptx
Updated Jan 7, 2025
Cite
Dataintelo (2025). AI Training Dataset Market Report | Global Forecast From 2025 To 2033 [Dataset]. https://dataintelo.com/report/global-ai-training-dataset-market
Explore at:
Available download formats: csv, pptx, pdf
Dataset updated
Jan 7, 2025
Dataset authored and provided by
Dataintelo
License
https://dataintelo.com/privacy-and-policy
Time period covered
2024 - 2032
Area covered
Global
Description
AI Training Dataset Market Outlook

The global AI training dataset market size was valued at approximately USD 1.2 billion in 2023 and is projected to reach USD 6.5 billion by 2032, growing at a compound annual growth rate (CAGR) of 20.5% from 2024 to 2032. This substantial growth is driven by the increasing adoption of artificial intelligence across various industries, the necessity for large-scale and high-quality datasets to train AI models, and the ongoing advancements in AI and machine learning technologies.
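The start value, end value, and growth rate quoted above are tied together by the standard compound-growth relation, end = start × (1 + CAGR)^years. The following is a minimal Python sketch of that arithmetic, assuming nine compounding periods between the 2023 base year and 2032; the function names are illustrative and do not come from the report.

```python
def project_value(start: float, cagr: float, years: int) -> float:
    """Compound `start` forward by `years` periods at annual rate `cagr`."""
    return start * (1.0 + cagr) ** years


def implied_cagr(start: float, end: float, years: int) -> float:
    """Annual growth rate that takes `start` to `end` over `years` periods."""
    return (end / start) ** (1.0 / years) - 1.0


# Figures quoted in the description, in USD billions. Treating 2023 -> 2032
# as nine compounding periods is an assumption, not stated in the report.
projected_2032 = project_value(1.2, 0.205, 9)
rate = implied_cagr(1.2, 6.5, 9)

print(f"projected 2032 value: {projected_2032:.2f}")
print(f"implied CAGR: {rate:.1%}")
```

Under that assumption the projection lands close to the quoted USD 6.5 billion, so the three figures appear mutually consistent to within rounding.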

One of the primary growth factors in the AI training dataset market is the exponential increase in data generation across multiple sectors. With the proliferation of internet usage, the expansion of IoT devices, and the digitalization of industries, there is an unprecedented volume of data being generated daily. This data is invaluable for training AI models, enabling them to learn and make more accurate predictions and decisions. Moreover, the need for diverse and comprehensive datasets to improve AI accuracy and reliability is further propelling market growth.

Another significant factor driving the market is the rising investment in AI and machine learning by both public and private sectors. Governments around the world are recognizing the potential of AI to transform economies and improve public services, leading to increased funding for AI research and development. Simultaneously, private enterprises are investing heavily in AI technologies to gain a competitive edge, enhance operational efficiency, and innovate new products and services. These investments necessitate high-quality training datasets, thereby boosting the market.

The proliferation of AI applications in various industries, such as healthcare, automotive, retail, and finance, is also a major contributor to the growth of the AI training dataset market. In healthcare, AI is being used for predictive analytics, personalized medicine, and diagnostic automation, all of which require extensive datasets for training. The automotive industry leverages AI for autonomous driving and vehicle safety systems, while the retail sector uses AI for personalized shopping experiences and inventory management. In finance, AI assists in fraud detection and risk management. The diverse applications across these sectors underline the critical need for robust AI training datasets.

As the demand for AI applications continues to grow, the role of AI data resource services becomes increasingly vital. These services provide the necessary infrastructure and tools to manage, curate, and distribute datasets efficiently. By using them, organizations can ensure that their AI models are trained on high-quality and relevant data, which is crucial for achieving accurate and reliable outcomes. Such a service acts as a bridge between raw data and AI applications, streamlining the process of data acquisition, annotation, and validation. This not only enhances the performance of AI systems but also accelerates the development cycle, enabling faster deployment of AI-driven solutions across various sectors.

Regionally, North America currently dominates the AI training dataset market due to the presence of major technology companies and extensive R&D activities in the region. However, Asia Pacific is expected to witness the highest growth rate during the forecast period, driven by rapid technological advancements, increasing investments in AI, and the growing adoption of AI technologies across various industries in countries like China, India, and Japan. Europe and Latin America are also anticipated to experience significant growth, supported by favorable government policies and the increasing use of AI in various sectors.

Data Type Analysis

The data type segment of the AI training dataset market encompasses text, image, audio, video, and others. Each data type plays a crucial role in training different types of AI models, and the demand for specific data types varies based on the application. Text data is extensively used in natural language processing (NLP) applications such as chatbots, sentiment analysis, and language translation. As the use of NLP is becoming more widespread, the demand for high-quality text datasets is continually rising. Companies are investing in curated text datasets that encompass diverse languages and dialects to improve the accuracy and efficiency of NLP models.

Image data is critical for computer vision application
The bAbI tasks data
kaggle.com
Updated May 16, 2020
Cite
roblex nana (2020). The bAbI tasks data [Dataset]. https://www.kaggle.com/datasets/roblexnana/the-babi-tasks-for-nlp-qa-system/discussion
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
May 16, 2020
Dataset provided by
Kaggle http://kaggle.com/
Authors
roblex nana
License
Attribution 3.0 (CC BY 3.0) https://creativecommons.org/licenses/by/3.0/
License information was derived automatically
Description
Context

This dataset presents the first set of 20 tasks for testing text understanding and reasoning in the bAbI project. The tasks are described in detail in the paper: Jason Weston, Antoine Bordes, Sumit Chopra, Alexander M. Rush, Bart van Merriënboer, Armand Joulin and Tomas Mikolov. Towards AI Complete Question Answering: A Set of Prerequisite Toy Tasks, arXiv:1502.05698.

Please also see the following slides: Antoine Bordes Artificial Tasks for Artificial Intelligence, ICLR keynote, 2015.

The aim is that each task tests a unique aspect of text and reasoning, and hence tests different capabilities of learning models. More tasks are planned in the future to capture more aspects.

Content

Training Set Size: For each task, there are 1000 questions for training, and 1000 for testing. However, we emphasize that the goal is to use as little data as possible to do well on the tasks (i.e. if you can use less than 1000 that’s even better) — and without resorting to engineering task-specific tricks that will not generalize to other tasks, as they may not be of much use subsequently. Note that the aim during evaluation is to use the same learner across all tasks to evaluate its skills and capabilities.

Supervision Signal: Further, while the MemNN results in the paper use full supervision (including of the supporting facts), results with weak supervision would ultimately be preferable, as this kind of data is easier to collect. Hence results of that form are very welcome. E.g. this paper does include weakly supervised results.

For the reasons above there are currently several directories:

1) en/: the tasks in English, readable by humans.
2) hn/: the tasks in Hindi, readable by humans.
3) shuffled/: the same tasks with shuffled letters so they are not readable by humans, and so that existing parsers and taggers cannot be used in a straightforward fashion to leverage extra resources. In this case the learner is forced to rely more on the given training data. This mimics a learner being first presented a language and having to learn from scratch.
4) en-10k/, shuffled-10k/ and hn-10k/: the same tasks in the three formats, but with 10,000 training examples rather than 1000. Note the results in the paper use 1000 training examples.
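For orientation, the task files in these directories are plain text. The sketch below assumes the commonly documented bAbI layout, which this description does not spell out: every line starts with a number that restarts at 1 for each new story, and question lines carry a tab-separated question, answer, and supporting-fact numbers. The `parse_babi` name and the sample text are illustrative, not taken from the dataset.

```python
from typing import List, NamedTuple


class Example(NamedTuple):
    story: List[str]        # statements seen so far in the current story
    question: str
    answer: str
    supporting: List[int]   # line numbers of the supporting facts


def parse_babi(text: str) -> List[Example]:
    """Parse bAbI-style text (assumed format) into question examples."""
    examples: List[Example] = []
    story: List[str] = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        number, _, rest = line.partition(" ")
        if int(number) == 1:
            story = []  # numbering restarts, so a new story begins
        if "\t" in rest:
            # Question line: question <TAB> answer <TAB> supporting facts
            parts = rest.split("\t")
            facts = parts[2] if len(parts) > 2 else ""
            examples.append(
                Example(list(story), parts[0].strip(), parts[1].strip(),
                        [int(f) for f in facts.split()])
            )
        else:
            story.append(rest)
    return examples


# Hypothetical sample in the assumed format, not copied from the dataset.
SAMPLE = (
    "1 Mary moved to the bathroom.\n"
    "2 John went to the hallway.\n"
    "3 Where is Mary?\tbathroom\t1\n"
    "1 Sandra travelled to the office.\n"
    "2 Where is Sandra?\toffice\t1\n"
)

examples = parse_babi(SAMPLE)
for ex in examples:
    print(ex.question, "->", ex.answer)
```

If the format assumption holds, the same function should apply to each directory variant, since the variants differ in language, letter shuffling, and training-set size rather than in layout.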

Versions: Some small updates since the original release have been made (see the README in the data download for more details). You can also get v1.0 and v1.1 here.

Acknowledgements

We wouldn't be here without the help of others. If you owe any attributions or thanks, include them here along with any citations of past research.

Inspiration

The aim is to encourage the machine learning community to work on, and develop more, of these tasks.

References

https://research.fb.com/downloads/babi/
Data_Sheet_1_Advanced large language models and visualization tools for data...
frontiersin.figshare.com
txt
Updated Aug 8, 2024
Cite
Jorge Valverde-Rebaza; Aram González; Octavio Navarro-Hinojosa; Julieta Noguez (2024). Data_Sheet_1_Advanced large language models and visualization tools for data analytics learning.csv [Dataset]. http://doi.org/10.3389/feduc.2024.1418006.s001
Explore at:
Available download formats: txt
Unique identifier
https://doi.org/10.3389/feduc.2024.1418006.s001
Dataset updated
Aug 8, 2024
Dataset provided by
Frontiers
Authors
Jorge Valverde-Rebaza; Aram González; Octavio Navarro-Hinojosa; Julieta Noguez
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Introduction: In recent years, numerous AI tools have been employed to equip learners with diverse technical skills such as coding, data analysis, and other competencies related to computational sciences. However, the desired outcomes have not been consistently achieved. This study aims to analyze the perspectives of students and professionals from non-computational fields on the use of generative AI tools, augmented with visualization support, to tackle data analytics projects. The focus is on promoting the development of coding skills and fostering a deep understanding of the solutions generated. Consequently, our research seeks to introduce innovative approaches for incorporating visualization and generative AI tools into educational practices.

Methods: This article examines how learners perform and their perspectives when using traditional tools vs. LLM-based tools to acquire data analytics skills. To explore this, we conducted a case study with a cohort of 59 participants among students and professionals without computational thinking skills. These participants developed a data analytics project in the context of a Data Analytics short session. Our case study focused on examining the participants' performance using traditional programming tools, ChatGPT, and LIDA with GPT as an advanced generative AI tool.

Results: The results show the transformative potential of approaches based on integrating advanced generative AI tools like GPT with specialized frameworks such as LIDA. The higher levels of participant preference indicate the superiority of these approaches over traditional development methods. Additionally, our findings suggest that the learning curves for the different approaches vary significantly, since learners encountered technical difficulties in developing the project and interpreting the results. Our findings suggest that the integration of LIDA with GPT can significantly enhance the learning of advanced skills, especially those related to data analytics. We aim to establish this study as a foundation for the methodical adoption of generative AI tools in educational settings, paving the way for more effective and comprehensive training in these critical areas.

Discussion: It is important to highlight that when using general-purpose generative AI tools such as ChatGPT, users must be aware of the data analytics process and take responsibility for filtering out potential errors or incompleteness in the requirements of a data analytics project. These deficiencies can be mitigated by using more advanced tools specialized in supporting data analytics tasks, such as LIDA with GPT. However, users still need advanced programming knowledge to properly configure this connection via API. There is a significant opportunity for generative AI tools to improve their performance, providing accurate, complete, and convincing results for data analytics projects, thereby increasing user confidence in adopting these technologies. We hope this work underscores the opportunities and needs for integrating advanced LLMs into educational practices, particularly in developing computational thinking skills.
The National Artificial Intelligence Research And Development Strategic Plan...
catalog.data.gov
datadiscoverystudio.org
+3more
Updated May 14, 2025
Cite
NCO NITRD (2025). The National Artificial Intelligence Research And Development Strategic Plan [Dataset]. https://catalog.data.gov/dataset/the-national-artificial-intelligence-research-and-development-strategic-plan
Explore at:
Dataset updated
May 14, 2025
Dataset provided by
NCO NITRD
Description
Executive Summary: Artificial intelligence (AI) is a transformative technology that holds promise for tremendous societal and economic benefit. AI has the potential to revolutionize how we live, work, learn, discover, and communicate. AI research can further our national priorities, including increased economic prosperity, improved educational opportunities and quality of life, and enhanced national and homeland security. Because of these potential benefits, the U.S. government has invested in AI research for many years. Yet, as with any significant technology in which the Federal government has interest, there are not only tremendous opportunities but also a number of considerations that must be taken into account in guiding the overall direction of Federally-funded R&D in AI.

On May 3, 2016, the Administration announced the formation of a new NSTC Subcommittee on Machine Learning and Artificial Intelligence, to help coordinate Federal activity in AI. This Subcommittee, on June 15, 2016, directed the Subcommittee on Networking and Information Technology Research and Development (NITRD) to create a National Artificial Intelligence Research and Development Strategic Plan. A NITRD Task Force on Artificial Intelligence was then formed to define the Federal strategic priorities for AI R&D, with particular attention on areas that industry is unlikely to address.

This National Artificial Intelligence R&D Strategic Plan establishes a set of objectives for Federally-funded AI research, both research occurring within the government as well as Federally-funded research occurring outside of government, such as in academia. The ultimate goal of this research is to produce new AI knowledge and technologies that provide a range of positive benefits to society, while minimizing the negative impacts. To achieve this goal, this AI R&D Strategic Plan identifies the following priorities for Federally-funded AI research:

Strategy 1: Make long-term investments in AI research. Prioritize investments in the next generation of AI that will drive discovery and insight and enable the United States to remain a world leader in AI.
Strategy 2: Develop effective methods for human-AI collaboration. Rather than replace humans, most AI systems will collaborate with humans to achieve optimal performance. Research is needed to create effective interactions between humans and AI systems.
Strategy 3: Understand and address the ethical, legal, and societal implications of AI. We expect AI technologies to behave according to the formal and informal norms to which we hold our fellow humans. Research is needed to understand the ethical, legal, and social implications of AI, and to develop methods for designing AI systems that align with ethical, legal, and societal goals.
Strategy 4: Ensure the safety and security of AI systems. Before AI systems are in widespread use, assurance is needed that the systems will operate safely and securely, in a controlled, well-defined, and well-understood manner. Further progress in research is needed to address this challenge of creating AI systems that are reliable, dependable, and trustworthy.
Strategy 5: Develop shared public datasets and environments for AI training and testing. The depth, quality, and accuracy of training datasets and resources significantly affect AI performance. Researchers need to develop high-quality datasets and environments and enable responsible access to high-quality datasets as well as to testing and training resources.
Strategy 6: Measure and evaluate AI technologies through standards and benchmarks. Essential to advancements in AI are standards, benchmarks, testbeds, and community engagement that guide and evaluate progress in AI. Additional research is needed to develop a broad spectrum of evaluative techniques.
Strategy 7: Better understand the national AI R&D workforce needs. Advances in AI will require a strong community of AI researchers. An improved understanding of current and future R&D workforce demands in AI is needed to help ensure that sufficient AI experts are available to address the strategic R&D areas outlined in this plan.

The AI R&D Strategic Plan closes with two recommendations:

Recommendation 1: Develop an AI R&D implementation framework to identify S&T opportunities and support effective coordination of AI R&D investments, consistent with Strategies 1-6 of this plan.
Recommendation 2: Study the national landscape for creating and sustaining a healthy AI R&D workforce, consistent with Strategy 7 of this plan.
Final Data Set For Model Training Merged Dataset
universe.roboflow.com
zip
Updated Apr 27, 2024
Cite
Equinox Lawn AI Tasks (2024). Final Data Set For Model Training Merged Dataset [Dataset]. https://universe.roboflow.com/equinox-lawn-ai-tasks/final-data-set-for-model-training-merged
Explore at:
Available download formats: zip
Dataset updated
Apr 27, 2024
Dataset authored and provided by
Equinox Lawn AI Tasks
License
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Variables measured
Lawn Sidewalk Driveway House Bfi1 Lawn Polygons
Description
Final Data Set For Model Training Merged

## Overview
Final Data Set For Model Training Merged is a dataset for instance segmentation tasks - it contains Lawn Sidewalk Driveway House Bfi1 Lawn annotations for 1,488 images.

## Getting Started
You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.

## License
This dataset is available under the [Public Domain license](https://creativecommons.org/licenses/Public Domain).
Polish Open Ended Question Answer Text Dataset
futurebeeai.com
wav
Updated Aug 1, 2022
Cite
FutureBee AI (2022). Polish Open Ended Question Answer Text Dataset [Dataset]. https://www.futurebeeai.com/dataset/prompt-response-dataset/polish-open-ended-question-answer-text-dataset
Explore at:
Available download formats: wav
Dataset updated
Aug 1, 2022
Dataset provided by
FutureBeeAI
Authors
FutureBee AI
License
https://www.futurebeeai.com/policies/ai-data-license-agreement
Dataset funded by
FutureBeeAI
Description
What’s Included
The Polish Open-Ended Question Answering Dataset is a meticulously curated collection of comprehensive Question-Answer pairs. It serves as a valuable resource for training Large Language Models (LLMs) and Question-answering models in the Polish language, advancing the field of artificial intelligence.
Dataset Content:
This QA dataset comprises a diverse set of open-ended questions paired with corresponding answers in Polish. There is no context paragraph given to choose an answer from, and each question is answered without any predefined context content. The questions cover a broad range of topics, including science, history, technology, geography, literature, current affairs, and more.
Each question is accompanied by an answer, providing valuable information and insights to enhance the language model training process. Both the questions and answers were manually curated by native Polish speakers, and references were taken from diverse sources such as books, news articles, websites, and other reliable references.
This question-answer prompt completion dataset contains different types of prompts, including instruction type, continuation type, and in-context learning (zero-shot, few-shot) type. The dataset also contains questions and answers with different types of rich text, including tables, code, JSON, etc., with proper markdown.
Question Diversity:
To ensure diversity, this Q&A dataset includes questions with varying complexity levels, ranging from easy to medium and hard. Different types of questions, such as multiple-choice, direct, and true/false, are included. Additionally, questions are further classified into fact-based and opinion-based categories, creating a comprehensive variety. The QA dataset also contains questions with constraints and persona restrictions, which makes it even more useful for LLM training.
Answer Formats:
To accommodate varied learning experiences, the dataset incorporates different types of answer formats. These formats include single-word, short-phrase, single-sentence, and paragraph answers. The answers contain text strings, numerical values, and date and time formats as well. Such diversity strengthens the language model's ability to generate coherent and contextually appropriate answers.
Data Format and Annotation Details:
This fully labeled Polish Open Ended Question Answer Dataset is available in JSON and CSV formats. It includes annotation details such as id, language, domain, question_length, prompt_type, question_category, question_type, complexity, answer_type, rich_text.
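As an illustration of how the annotation fields listed above might be used, the sketch below filters records by `complexity` and `question_type`. Only the annotation field names come from the description; the record layout, the `question` and `answer` keys, the `filter_records` helper, and all sample values are assumptions, since the actual schema is not shown here.

```python
import json
from typing import Any, Dict, List

# Hypothetical records: only the annotation field names are taken from the
# dataset description; the values and the question/answer keys are invented.
SAMPLE_JSON = """
[
  {"id": "1", "language": "pl", "domain": "geography", "question_length": 4,
   "prompt_type": "instruction", "question_category": "fact-based",
   "question_type": "direct", "complexity": "easy",
   "answer_type": "single-word", "rich_text": "",
   "question": "Jaka jest stolica Polski?", "answer": "Warszawa"},
  {"id": "2", "language": "pl", "domain": "science", "question_length": 5,
   "prompt_type": "instruction", "question_category": "fact-based",
   "question_type": "true/false", "complexity": "medium",
   "answer_type": "single-word", "rich_text": "",
   "question": "Czy Ziemia krąży wokół Słońca?", "answer": "Tak"}
]
"""


def filter_records(records: List[Dict[str, Any]], **criteria: Any) -> List[Dict[str, Any]]:
    """Keep the records whose fields equal every given criterion."""
    return [
        r for r in records
        if all(r.get(field) == value for field, value in criteria.items())
    ]


records = json.loads(SAMPLE_JSON)
easy_direct = filter_records(records, complexity="easy", question_type="direct")
print([r["id"] for r in easy_direct])
```

The same pattern would work on rows read from the CSV variant, provided the column names match the annotation fields.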
Quality and Accuracy:
The dataset upholds the highest standards of quality and accuracy. Each question undergoes careful validation, and the corresponding answers are thoroughly verified. To prioritize inclusivity, the dataset incorporates questions and answers representing diverse perspectives and writing styles, ensuring it remains unbiased and avoids perpetuating discrimination.
Both the questions and the answers in Polish are free of spelling and grammatical errors. No copyrighted, toxic, or harmful content was used in building this dataset.
Continuous Updates and Customization:
The entire dataset was prepared with the assistance of human curators from the FutureBeeAI crowd community. Continuous efforts are made to add more assets to this dataset, ensuring its growth and relevance. Additionally, FutureBeeAI offers the ability to collect custom question-answer data tailored to specific needs, providing flexibility and customization options.
License:
The dataset, created by FutureBeeAI, is now ready for commercial use. Researchers, data scientists, and developers can use this fully labeled and ready-to-deploy Polish Open Ended Question Answer Dataset to enhance the language understanding capabilities of their generative AI models, improve response generation, and explore new approaches to NLP question-answering tasks.
Data-Centric-Visual-AI-Challenge-Train-Set
huggingface.co
Updated Sep 24, 2024
Cite
Voxel51 (2024). Data-Centric-Visual-AI-Challenge-Train-Set [Dataset]. https://huggingface.co/datasets/Voxel51/Data-Centric-Visual-AI-Challenge-Train-Set
Explore at:
Dataset updated
Sep 24, 2024
Dataset authored and provided by
Voxel51
Description
Dataset Card for Data-Centric-Visual-AI-Train-Set

This is a FiftyOne dataset with 30,000 samples.

Installation

If you haven't already, install FiftyOne: pip install -U fiftyone

Usage

import fiftyone as fo
import fiftyone.utils.huggingface as fouh

# Load the dataset
# Note: other available arguments include 'max_samples', etc
dataset = fouh.load_from_hub("Voxel51/Data-Centric-Visual-AI-Challenge-Train-Set")

# Launch the App
session = fo.launch_app(dataset)

… See the full description on the dataset page: https://huggingface.co/datasets/Voxel51/Data-Centric-Visual-AI-Challenge-Train-Set.
Global Fraud Detection Data | AI Training Data for Damaged Cars | 10K+...
datarade.ai
Updated Nov 3, 2022
Cite
Pixta AI (2022). Global Fraud Detection Data | AI Training Data for Damaged Cars | 10K+ Images | Classified-Segmented Datasets for Custom Requirements [Dataset]. https://datarade.ai/data-products/3-000-damaged-car-images-for-ai-ml-model-pixta-ai
Explore at:
Available download formats: .json, .xml, .csv, .txt
Dataset updated
Nov 3, 2022
Dataset authored and provided by
Pixta AI
Area covered
Thailand, Italy, Hungary, Netherlands, Malaysia, Australia, Hong Kong, New Zealand, France, Norway
Description
Overview

This dataset is a collection of 10,000+ images of damaged cars in multiple scenes that are ready to use for optimizing the accuracy of computer vision models. The dataset includes car images with 9 types of small damage (dent, scratch, gouge, crack, glass_shatter, lamp_broken, tire_flat, hail, rust) and balance classification.

Annotated Imagery Data of damaged car images

This dataset contains 10,000+ images of damaged cars. The dataset is annotated for classification (9 car damage labels) and instance segmentation. Each dataset is supported by both AI and human review processes to ensure labelling consistency and accuracy. Contact us for more custom datasets.

About PIXTA

PIXTASTOCK is the largest Asian-featured stock platform providing data, contents, tools and services since 2005. PIXTA has 15 years of experience integrating advanced AI technology in managing, curating, and processing over 100M visual materials and serving leading global brands for their creative and data demands.
Artificial Intelligence Programmer Report
datainsightsmarket.com
doc, pdf, ppt
Updated Apr 24, 2025
Cite
Data Insights Market (2025). Artificial Intelligence Programmer Report [Dataset]. https://www.datainsightsmarket.com/reports/artificial-intelligence-programmer-1496204
Explore at:
Available download formats: ppt, pdf, doc
Dataset updated
Apr 24, 2025
Dataset authored and provided by
Data Insights Market
License
https://www.datainsightsmarket.com/privacy-policy
Time period covered
2025 - 2033
Area covered
Global
Variables measured
Market Size
Description
The Artificial Intelligence (AI) Programmer market is experiencing robust growth, driven by the increasing adoption of AI across diverse sectors. The market, estimated at $50 billion in 2025, is projected to grow at a Compound Annual Growth Rate (CAGR) of 20% from 2025 to 2033, reaching approximately $200 billion by 2033. This surge is fueled by several key factors, including the escalating demand for AI-powered solutions in information technology, financial services, and medical insurance. The rise of big data, coupled with advancements in machine learning and deep learning algorithms, is further accelerating the need for skilled AI programmers to develop and implement sophisticated AI systems. Large enterprises are leading the adoption, followed by a rapidly growing segment of Small and Medium-sized Enterprises (SMEs) recognizing the transformative potential of AI. The North American market currently holds the largest share, driven by significant investments in AI research and development and a mature technological infrastructure. However, regions like Asia-Pacific, particularly India and China, are emerging as significant growth drivers due to their expanding tech industries and increasing government support for AI initiatives.
Despite its rapid growth, the AI Programmer market faces challenges such as a shortage of skilled professionals and the high cost of AI development and deployment. The complexity of AI programming and the need for specialized expertise necessitate ongoing investments in education and training programs to bridge the skills gap. Further restraining factors include concerns regarding data privacy and security, as well as the ethical implications of deploying increasingly autonomous AI systems. Nevertheless, the long-term outlook for the AI Programmer market remains highly positive, with continued innovation and increasing demand expected to drive substantial growth throughout the forecast period. Companies like Cognition Labs are at the forefront of this technological revolution, actively contributing to its advancements and shaping the future of AI programming.
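The projection quoted above follows the standard compound-growth relation, end = start × (1 + CAGR)^years. The sketch below is only an illustration: it assumes 2025 is the base year and that growth compounds over eight annual periods to 2033, and the function name `project_value` is hypothetical rather than taken from the report.

```python
def project_value(start_value: float, cagr: float, years: int) -> float:
    """Compound start_value at the annual rate cagr for the given number of years."""
    return start_value * (1.0 + cagr) ** years


# Figures quoted in the description: $50 billion in 2025, 20% CAGR through 2033.
# Assumption: eight compounding periods (2033 - 2025).
projected_2033 = project_value(50.0, 0.20, 2033 - 2025)
print(f"Projected 2033 market size: ${projected_2033:.1f} billion")
```

Under these assumptions the result is about $215 billion, broadly in line with the rounded "approximately $200 billion" in the description.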
AI Basic Data Service Report
datainsightsmarket.com
doc, pdf, ppt
Updated Apr 28, 2025
Cite
Data Insights Market (2025). AI Basic Data Service Report [Dataset]. https://www.datainsightsmarket.com/reports/ai-basic-data-service-1390958
Explore at:
Available download formats: pdf, doc, ppt
Dataset updated
Apr 28, 2025
Dataset authored and provided by
Data Insights Market
License
https://www.datainsightsmarket.com/privacy-policy
Time period covered
2025 - 2033
Area covered
Global
Variables measured
Market Size
Description
The AI Basic Data Service market is experiencing robust growth, driven by the increasing adoption of artificial intelligence across diverse sectors. The market, valued at approximately $15 billion in 2025, is projected to exhibit a Compound Annual Growth Rate (CAGR) of 25% from 2025 to 2033, reaching an estimated market size of $75 billion by 2033. This expansion is fueled by several key factors: the burgeoning demand for high-quality data to train and improve AI models across applications like autonomous driving, smart security, and finance; the rise of data-centric businesses reliant on readily available, accurate datasets; and the ongoing development of innovative data collection, processing, and annotation services. The market's segmentation reveals significant opportunities within customized data services, catering to the specific needs of individual businesses, and data set products, offering pre-packaged solutions for broader applications. Key players, including Baidu, Alibaba, Tencent, and several specialized data providers, are actively shaping market dynamics through strategic partnerships, acquisitions, and technological advancements. Geographic distribution indicates strong growth across North America and Asia Pacific, fueled by significant investments in AI infrastructure and technological innovation within these regions. Market restraints include concerns surrounding data privacy and security, the high cost of data acquisition and processing, and the need for robust data governance frameworks to ensure data quality and ethical AI development. Nevertheless, the substantial investments in AI infrastructure, coupled with continuous improvements in data annotation and processing technologies, are poised to mitigate these challenges. 
The market's future trajectory will likely be shaped by advancements in synthetic data generation, the increasing adoption of cloud-based AI solutions, and the emergence of innovative business models that address data accessibility and affordability. The continued growth in applications of AI across various industries will further fuel the demand for basic data services, ensuring sustained market expansion in the coming decade.
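A quick way to sanity-check headline figures like these is to compare the quoted growth rate with the rate implied by the two endpoints. The sketch below assumes eight annual compounding periods (2025 to 2033); `implied_cagr` is a hypothetical helper name, and the report may use a different base year or rounding.

```python
def implied_cagr(start_value: float, end_value: float, years: int) -> float:
    """Return the constant annual growth rate that takes start_value to end_value."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


years = 2033 - 2025  # assumption: eight compounding periods

# Rate implied by the quoted endpoints ($15 billion -> $75 billion).
rate = implied_cagr(15.0, 75.0, years)

# End value implied by the quoted 25% rate from the same starting point.
end_at_25 = 15.0 * 1.25 ** years

print(f"Implied CAGR: {rate:.1%}")
print(f"2033 value at 25% CAGR: ${end_at_25:.1f} billion")
```

With these inputs the endpoints imply a rate a little over 22%, while a 25% rate would reach roughly $89 billion, so the quoted figures reconcile only approximately under this assumption.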
Trojan Detection Software Challenge -...
catalog.data.gov
Updated Sep 30, 2023
Cite
National Institute of Standards and Technology (2023). Trojan Detection Software Challenge - nlp-sentiment-classification-mar2021-train [Dataset]. https://catalog.data.gov/dataset/trojan-detection-software-challenge-round-5-train-dataset-39fdb
Explore at:
Dataset updated
Sep 30, 2023
Dataset provided by
National Institute of Standards and Technology (http://www.nist.gov/)
Description
Round 5 Train Dataset

The data being generated and disseminated is the training data used to construct trojan detection software solutions. This data, generated at NIST, consists of natural language processing (NLP) AIs trained to perform text sentiment classification on English text. A known percentage of these trained AI models have been poisoned with a known trigger which induces incorrect behavior. This data will be used to develop software solutions for detecting which trained AI models have been poisoned via embedded triggers. This dataset consists of 1656 adversarially trained sentiment classification AI models using a small set of model architectures. The models were trained on text data drawn from movie and product reviews. Half (50%) of the models have been poisoned with an embedded trigger which causes misclassification of the input text when the trigger is present.

Errata: The following models were contaminated during dataset packaging, which caused nominally clean models to have a trigger. Please avoid using these models. Due to the similarity between the Round5 and Round6 datasets (both contain similarly trained sentiment classification AI models), the dataset authors suggest ignoring the Round5 data and only using the Round6 dataset.
Corrupted Models: [id-00000007, id-00000014, id-00000030, id-00000036, id-00000047, id-00000074, id-00000080, id-00000088, id-00000089, id-00000097, id-00000103, id-00000105, id-00000122, id-00000123, id-00000124, id-00000127, id-00000148, id-00000151, id-00000154, id-00000162, id-00000165, id-00000181, id-00000184, id-00000185, id-00000193, id-00000197, id-00000198, id-00000207, id-00000230, id-00000236, id-00000239, id-00000240, id-00000244, id-00000251, id-00000256, id-00000258, id-00000265, id-00000272, id-00000284, id-00000321, id-00000336, id-00000364, id-00000389, id-00000391, id-00000396, id-00000423, id-00000425, id-00000446, id-00000449, id-00000463, id-00000468, id-00000479, id-00000499, id-00000516, id-00000524, id-00000532, id-00000537, id-00000563, id-00000575, id-00000577, id-00000583, id-00000592, id-00000629, id-00000635, id-00000643, id-00000644, id-00000685, id-00000710, id-00000720, id-00000724, id-00000730, id-00000735, id-00000780, id-00000784, id-00000794, id-00000798, id-00000802, id-00000808, id-00000818, id-00000828, id-00000841, id-00000864, id-00000867, id-00000923, id-00000970, id-00000971, id-00000973, id-00000989, id-00000990, id-00000996, id-00001000, id-00001036, id-00001040, id-00001041, id-00001044, id-00001048, id-00001053, id-00001059, id-00001063, id-00001116, id-00001131, id-00001139, id-00001146, id-00001159, id-00001163, id-00001166, id-00001171, id-00001183, id-00001188, id-00001201, id-00001211, id-00001233, id-00001251, id-00001262, id-00001291, id-00001300, id-00001302, id-00001305, id-00001312, id-00001314, id-00001327, id-00001341, id-00001344, id-00001346, id-00001364, id-00001365, id-00001373, id-00001389, id-00001390, id-00001391, id-00001392, id-00001399, id-00001414, id-00001418, id-00001425, id-00001449, id-00001470, id-00001486, id-00001516, id-00001517, id-00001518, id-00001532, id-00001533, id-00001537, id-00001542, id-00001549, id-00001579, id-00001580, id-00001581, id-00001586, id-00001591, id-00001599, 
id-00001600, id-00001604, id-00001610, id-00001618, id-00001643, id-00001650]
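Anyone who still uses the Round 5 data would need to exclude the corrupted models before training a detector. The sketch below shows one way to do that. It is not part of the NIST tooling: the directory listing is hypothetical, and `CORRUPTED_IDS` holds only the first three IDs from the errata list for brevity.

```python
# Illustrative subset of the corrupted IDs listed in the errata; a real
# filter would include the complete list.
CORRUPTED_IDS = frozenset({"id-00000007", "id-00000014", "id-00000030"})


def keep_clean_models(model_ids, corrupted=CORRUPTED_IDS):
    """Return model IDs that are not flagged as corrupted, preserving order."""
    return [model_id for model_id in model_ids if model_id not in corrupted]


# Hypothetical listing of model folders, using the zero-padded ID pattern above.
available = [f"id-{n:08d}" for n in range(5, 16)]
clean = keep_clean_models(available)
print(f"Kept {len(clean)} of {len(available)} models")
```

A set lookup keeps the filter fast even with the full errata list, and preserving input order makes the result easy to compare against a directory listing.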
Notable AI Models
epoch.ai
csv
Cite
Epoch AI, Notable AI Models [Dataset]. https://epoch.ai/data/notable-ai-models
Explore at:
Available download formats: csv
Dataset authored and provided by
Epoch AI
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
Global
Variables measured
https://epoch.ai/data/notable-ai-models-documentation#records
Measurement technique
https://epoch.ai/data/notable-ai-models-documentation#records
Description
Our most comprehensive database of AI models, containing over 800 models that are state of the art, highly cited, or otherwise historically notable. It tracks key factors driving machine learning progress and includes over 300 training compute estimates.
AI and machine learning: most used API types 2019
statista.com
Updated Mar 17, 2022
Cite
Statista (2022). AI and machine learning: most used API types 2019 [Dataset]. https://www.statista.com/statistics/1069198/worldwide-ai-machine-learning-api/
Explore at:
Dataset updated
Mar 17, 2022
Dataset authored and provided by
Statista (http://statista.com/)
Time period covered
2019
Area covered
Worldwide
Description
Language APIs were the most widely used API type among global AI and machine learning developers as of 2019: 55.9 percent of surveyed developers said that their organizations relied on them. The strong presence of conversation and data discovery APIs indicates the importance of voice-activated assistants in the development of mainstream AI and machine learning software.
Annotated Imagery Data | AI Training Data| Face ID + 106 key points facial...
datarade.ai
Updated Nov 25, 2022
Cite
Pixta AI (2022). Annotated Imagery Data | AI Training Data| Face ID + 106 key points facial landmark images | 30,000 Stock Images [Dataset]. https://datarade.ai/data-products/unique-face-ids-with-facial-landmark-106-key-points-pixta-ai
Explore at:
Available download formats: .json, .xml, .csv, .txt
Dataset updated
Nov 25, 2022
Dataset authored and provided by
Pixta AI
Area covered
Canada, New Zealand, Korea (Republic of), Spain, Portugal, Belgium, Poland, Australia, Malaysia, Vietnam
Description
Overview

This dataset is a collection of 30,000+ Face ID images with 106-key-point facial landmarks, ready to use for improving the accuracy of computer vision models. Images in the dataset show people and meet the following requirements:

Age: above 20

Race: various

Angle: no more than 90 degrees

All of the content is sourced from PIXTA's stock library of 100M+ Asian-featured images and videos.

Annotated Imagery Data of Face ID + 106 key points facial landmark

This dataset contains 30,000+ images of Face ID + 106 key points facial landmark. The dataset has been annotated with face bounding boxes; race, gender, age, and skin tone attributes; and 106-key-point facial landmarks. Each dataset is backed by both AI and human review to ensure labelling consistency and accuracy.

About PIXTA

PIXTASTOCK is the largest Asian-featured stock platform, providing data, content, tools and services since 2005. PIXTA has 15 years of experience integrating advanced AI technology to manage, curate, and process over 100M visual materials, and serves leading global brands for their creative and data demands.
DMSP Particle Precipitation AI-ready Data
data.niaid.nih.gov
Updated Jul 13, 2021
Cite
Kristina Lynch (2021). DMSP Particle Precipitation AI-ready Data [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4281121
Explore at:
Dataset updated
Jul 13, 2021
Dataset provided by
Susan Skone
Jack Ziegler
Mathew Owens
Téo Bloch
Enrico Camporeale
Jesper Gjerloev
Ryan M. McGranaghan
Kristina Lynch
Binzheng Zhang
Spencer Hatch
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The dataset ‘DMSP Particle Precipitation AI-ready Data’ accompanies the manuscript “Next generation particle precipitation: Mesoscale prediction through machine learning (a case study and framework for progress)” submitted to AGU Space Weather Journal and used to produce new machine learning models of particle precipitation from the magnetosphere to the ionosphere. Note that we have attempted to make these data ready to be used in artificial intelligence/machine learning explorations following a community definition of ‘AI-ready’ provided at https://github.com/rmcgranaghan/data_science_tools_and_resources/wiki/Curated-Reference%7CChallenge-Data-Sets

The purpose of publishing these data is two-fold:

To allow reuse of the data that led to the manuscript and extension, rather than reinvention, of the research produced there; and

To be an ‘AI-ready’ challenge data set to which the artificial intelligence/machine learning community can apply novel methods.

These data were compiled, curated, and explored by: Ryan McGranaghan, Enrico Camporeale, Kristina Lynch, Jack Ziegler, Téo Bloch, Mathew Owens, Jesper Gjerloev, Spencer Hatch, Binzheng Zhang, and Susan Skone

Pipeline for creation:

The steps to create the data were as follows (note that we do not provide intermediate datasets):

1. Access NASA-provided DMSP data at https://cdaweb.gsfc.nasa.gov/pub/data/dmsp/
2. Read CDF files for a given satellite (e.g., F-16)
3. Collect the following variables at one-second cadence: SC_AACGM_LAT, SC_AACGM_LTIME, ELE_TOTAL_ENERGY_FLUX, ELE_TOTAL_ENERGY_FLUX_STD, ELE_AVG_ENERGY, ELE_AVG_ENERGY_STD, ID_SC
4. Sub-sample the variables to one-minute cadence and eliminate any rows for which ELE_TOTAL_ENERGY_FLUX is NaN
5. Combine all individual satellites into single yearly files
6. For each yearly file, use nasaomnireader to obtain solar wind and geomagnetic index data programmatically and timehist2 to calculate the time histories of each parameter. Collate with the DMSP observations and remove rows for which any solar wind or geomagnetic index data are missing.
7. For each row, calculate cyclical time variables (e.g., local time -> sin(LT) and cos(LT))
8. Merge all years
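The cyclical time variables in the pipeline above can be illustrated with a short sketch. It is not the authors' code: it assumes local time is expressed in hours with a 24-hour period and scales it to an angle before taking the sine and cosine, and `encode_cyclical` is a hypothetical helper name.

```python
import math


def encode_cyclical(value: float, period: float) -> tuple[float, float]:
    """Map a periodic value onto the unit circle as (sin, cos)."""
    angle = 2.0 * math.pi * (value % period) / period
    return math.sin(angle), math.cos(angle)


# Local time in hours; the 24-hour period is an assumption of this sketch.
for local_time in (0.0, 6.0, 23.9):
    sin_lt, cos_lt = encode_cyclical(local_time, 24.0)
    print(f"LT={local_time:4.1f} h -> sin={sin_lt:+.3f}, cos={cos_lt:+.3f}")
```

Encoding both components places 23.9 h next to 0.0 h in feature space, which a raw hour value would not do.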

How to use:

The Github repository https://github.com/rmcgranaghan/precipNet is provided to detail the use of these data and to provide Jupyter notebooks to facilitate getting started. The code is implemented in Python 3 and is licensed under the GNU General Public License v3.0

Citation:

For anyone using these data, please cite each of the following papers:

McGranaghan, R. M., Ziegler, J., Bloch, T., Hatch, S., Camporeale, E., Lynch, K., et al. (2021). Toward a next generation particle precipitation model: Mesoscale prediction through machine learning (a case study and framework for progress). Space Weather, 19, e2020SW002684. https://doi.org/10.1029/2020SW002684

McGranaghan, R. (2019), Eight lessons I learned leading a scientific “design sprint”, Eos, 100, https://doi.org/10.1029/2019EO136427. Published on 11 November 2019.

For questions or comments please contact Ryan McGranaghan (ryan.mcgranaghan@gmail.com)
Share of IT professionals who use AI tools daily worldwide 2023, by...
statista.com
Updated Jul 1, 2025
Cite
Statista (2025). Share of IT professionals who use AI tools daily worldwide 2023, by profession [Dataset]. https://www.statista.com/statistics/1440332/it-professionals-who-use-ai-tools-daily-worldwide/
Explore at:
Dataset updated
Jul 1, 2025
Dataset authored and provided by
Statista (http://statista.com/)
Time period covered
Jun 3, 2023 - Jun 22, 2023
Area covered
Worldwide
Description
In 2023, AI tools were used daily by IT professionals across various fields. In that year, over ** percent of machine learning engineers globally reported using these tools every day, while data scientists followed closely, with around ** percent stating daily usage. Back-end developers and full-stack developers reported slightly lower usage, with **** percent and **** percent respectively stating that they use AI tools daily.

Global Artificial Intelligence as a Service Market Research Report: By...

wiseguyreports.com

Updated Dec 3, 2024


Cite

Wiseguy Research Consultants Pvt Ltd (2024). Global Artificial Intelligence as a Service Market Research Report: By Application (Natural Language Processing, Image Recognition, Machine Learning, Speech Recognition, Data Analytics), By Deployment Model (Public Cloud, Private Cloud, Hybrid Cloud), By Service Type (Model Development, Application Programming Interface, Managed Services, Model Training, Data Storage), By Industry Vertical (Healthcare, Finance, Retail, Manufacturing, Telecommunications) and By Regional (North America, Europe, South America, Asia Pacific, Middle East and Africa) - Forecast to 2032. [Dataset]. https://www.wiseguyreports.com/reports/artificial-intelligence-as-a-service-market

Explore at:

Dataset updated

Dec 3, 2024

Dataset authored and provided by

Wiseguy Research Consultants Pvt Ltd

License

https://www.wiseguyreports.com/pages/privacy-policy

Area covered

Global

Description

BASE YEAR	2024
HISTORICAL DATA	2019 - 2024
REPORT COVERAGE	Revenue Forecast, Competitive Landscape, Growth Factors, and Trends
MARKET SIZE 2023	14.38 (USD Billion)
MARKET SIZE 2024	18.2 (USD Billion)
MARKET SIZE 2032	120.0 (USD Billion)
SEGMENTS COVERED	Application, Deployment Model, Service Type, Industry Vertical, Regional
COUNTRIES COVERED	North America, Europe, APAC, South America, MEA
KEY MARKET DYNAMICS	Increasing demand for automation, Rising cloud adoption, Growing need for data analytics, Enhanced customer experiences, Competition among service providers
MARKET FORECAST UNITS	USD Billion
KEY COMPANIES PROFILED	Intel, Salesforce, Microsoft, IBM, Google, NVIDIA, Amazon Web Services, Oracle, Alibaba Cloud, DataRobot, H2O.ai, Zebra Medical Vision, SAP, Palo Alto Networks, C3.ai
MARKET FORECAST PERIOD	2025 - 2032
KEY MARKET OPPORTUNITIES	Rapid adoption of cloud solutions, Increased demand for automation, Growth in data analytics needs, Expansion of machine learning applications, Rising interest in AI-powered solutions
COMPOUND ANNUAL GROWTH RATE (CAGR)	26.59% (2025 - 2032)
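Key-value rows like those above are easy to read programmatically. The sketch below parses the market-size rows and recomputes the growth rate. The row text is transcribed from the table, the helper name `parse_market_sizes` is hypothetical, and compounding from the 2024 figure over eight periods is an assumption, since the table labels the rate as covering 2025 - 2032.

```python
import re

# Rows transcribed from the table; keys and values are tab-separated.
ROWS = """MARKET SIZE 2024\t18.2(USD Billion)
MARKET SIZE 2032\t120.0(USD Billion)"""


def parse_market_sizes(text: str) -> dict[int, float]:
    """Extract {year: size in USD billion} from 'MARKET SIZE <year>' rows."""
    pattern = re.compile(r"MARKET SIZE (\d{4})\s+([\d.]+)\s*\(USD Billion\)")
    return {int(year): float(size) for year, size in pattern.findall(text)}


sizes = parse_market_sizes(ROWS)
periods = 2032 - 2024  # assumption: compounding from the 2024 base year
cagr = (sizes[2032] / sizes[2024]) ** (1 / periods) - 1
print(f"Implied CAGR 2024-2032: {cagr:.2%}")
```

Under this assumption the implied rate comes out at roughly 26.6%, close to the 26.59% stated in the table.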

Artificial Intelligence In Marketing Market Analysis North America, APAC,...
technavio.com
Updated Jul 15, 2024
Cite
Technavio (2024). Artificial Intelligence In Marketing Market Analysis North America, APAC, Europe, Middle East and Africa, South America - US, China, UK, Japan, Germany - Size and Forecast 2024-2028 [Dataset]. https://www.technavio.com/report/artificial-intelligence-in-marketing-market-industry-analysis
Explore at:
Dataset updated
Jul 15, 2024
Dataset provided by
TechNavio
Authors
Technavio
Time period covered
2021 - 2025
Area covered
Global, United States
Description
Artificial Intelligence In Marketing Market Size 2024-2028

The artificial intelligence in marketing market size is forecast to increase by USD 41.02 billion, at a CAGR of 30.9% between 2023 and 2028.

The Artificial Intelligence (AI) market in marketing is experiencing significant growth, driven by the increasing adoption of cloud-based applications and services. This shift towards cloud solutions enables businesses to leverage AI technologies more efficiently and cost-effectively, enhancing their marketing capabilities. Furthermore, the ongoing digitalization and expanding internet penetration are fueling the demand for AI solutions in marketing, as companies seek to engage with customers more effectively in the digital space. However, the market's growth is not without challenges. The lack of skilled professionals poses a significant obstacle to wider AI adoption in marketing. As AI applications become more complex, the need for specialized expertise in areas such as machine learning, data analytics, and programming grows. Companies must invest in upskilling their workforce or partner with external experts to overcome this challenge and fully capitalize on the opportunities presented by AI in marketing.

What will be the Size of the Artificial Intelligence In Marketing Market during the forecast period?

Explore in-depth regional segment analysis with market size data - historical 2018-2022 and forecasts 2024-2028 - in the full report.

Artificial intelligence (AI) continues to reshape marketing landscapes, with dynamic market activities unfolding across various sectors. Machine learning models optimize digital marketing strategies, enabling predictive analytics for marketing ROI and customer engagement. Brands build stronger connections through AI-powered personalization and sentiment analysis. Data privacy regulations necessitate transparency and accountability, influencing marketing technology stacks and Data Security measures. A/B testing and conversion rate optimization are enhanced through AI-driven insights, while marketing automation workflows streamline customer relationship management. Marketing analytics software and dashboards provide data-driven insights, enabling marketing budget allocation and multi-channel marketing strategies. Behavioral targeting and customer journey mapping are refined through AI, enhancing marketing attribution models and email marketing automation.

Virtual assistants and chatbots facilitate seamless customer experiences, while marketing automation platforms optimize search engine optimization, pay-per-click advertising, and social media advertising. Natural language processing and AI marketing consultants aid content marketing strategies, ensuring algorithmic bias and ethical AI considerations remain at the forefront. Marketing dynamics remain in a constant state of evolution, with AI-driven innovations continuing to transform the industry. Data Governance, marketing attribution models, and programmatic advertising are among the many areas where AI is making an impact. The ongoing integration of AI into marketing technologies and strategies ensures a continuously adaptive and effective marketing landscape.

How is this Artificial Intelligence In Marketing Industry segmented?

The artificial intelligence in marketing industry research report provides comprehensive data (region-wise segment analysis), with forecasts and estimates in 'USD million' for the period 2024-2028, as well as historical data from 2018-2022 for the following segments.

Deployment: On-premises, Cloud
Application: Social Media Advertising, Search Engine Marketing/Search Advertising, Virtual Assistant, Content Curation, Sales & Marketing Automation, Analytics Platform, Others
Technology: Machine Learning, Natural Language Processing, Computer Vision, Others
Geography: North America (US, Canada), Europe (Germany, UK), APAC (China, Japan, Australia, India), South America (Brazil, Argentina), Middle East and Africa (UAE), Rest of World (ROW)

By Deployment Insights

The on-premises segment is estimated to witness significant growth during the forecast period.

Artificial Intelligence (AI) is revolutionizing marketing, with machine learning models at its core. Brands are building stronger connections with consumers through AI-driven personalization and predictive analytics. A/B testing and marketing analytics software enable data-driven insights, while conversion rate optimization and marketing automation workflows streamline campaigns. Data privacy regulations ensure transparency and accountability, shaping marketing strategies. Behavioral targeting and sentiment analysis provide deeper customer understanding, enhancing customer engagement. Predictive analytics and marketing ROI are key performance indicators, driving marketing budget allo

