CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This repository contains two datasets used in the study exploring the impact of Generative AI, specifically ChatGPT, on the public sector workforce in the United States. The datasets provide detailed information on the core tasks of public sector occupations and their estimated performance metrics, including potential for automation and augmentation by ChatGPT. These estimates were generated by OpenAI’s GPT-4 model (GPT-4-1106-preview) through the OpenAI API.
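As an illustration of how such task-level estimates can be collected, the sketch below sends a single occupational task to the named GPT-4 model through the OpenAI Python SDK; the prompt wording, scoring scale, and response format are illustrative assumptions, not the study's actual protocol.

```python
# Minimal sketch: scoring one occupational task for automation/augmentation
# potential with GPT-4 via the OpenAI API. Prompt text and the 0-100 scale
# are illustrative assumptions, not the study's actual rubric.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

task = "Prepare meeting agendas and record minutes for city council sessions."
prompt = (
    "On a scale of 0-100, rate how much of the following public-sector task "
    "ChatGPT could (a) automate and (b) augment. Reply as 'automate=<n>, augment=<n>'.\n"
    f"Task: {task}"
)

response = client.chat.completions.create(
    model="gpt-4-1106-preview",
    messages=[{"role": "user", "content": prompt}],
    temperature=0,
)
print(response.choices[0].message.content)
```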
The rapid advancements in generative AI models present new opportunities in the education sector. However, it is imperative to acknowledge and address the potential risks and concerns that may arise with their use. We collected Twitter data to identify key concerns related to the use of ChatGPT in education. This dataset is used to support the study "ChatGPT in education: A discourse analysis of worries and concerns on social media."
In this study, we particularly explored two research questions. RQ1 (Concerns): What are the key concerns that Twitter users perceive with using ChatGPT in education? RQ2 (Accounts): Which accounts are implicated in the discussion of these concerns? In summary, our study underscores the importance of responsible and ethical use of AI in education and highlights the need for collaboration among stakeholders to regulate AI policy.
Analysis by WebFX of 13,252 publicly shared ChatGPT conversations to uncover usage statistics: prompt length, message count, question vs. command distribution, and use-case categories.
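As an illustration of how such statistics can be derived from exported conversations, the sketch below assumes a simple JSON layout (a list of conversations, each holding messages with a role and content); this schema and the question/command heuristic are assumptions, not WebFX's documented method.

```python
# Sketch: per-conversation usage statistics from shared ChatGPT conversations.
# The input schema (list of {"messages": [{"role", "content"}, ...]}) is assumed.
import json
from statistics import mean

with open("conversations.json", encoding="utf-8") as f:
    conversations = json.load(f)

prompt_lengths, message_counts, questions, commands = [], [], 0, 0
for convo in conversations:
    user_msgs = [m["content"] for m in convo["messages"] if m["role"] == "user"]
    message_counts.append(len(convo["messages"]))
    prompt_lengths.extend(len(m.split()) for m in user_msgs)
    for m in user_msgs:
        if m.strip().endswith("?"):
            questions += 1
        else:
            commands += 1  # crude heuristic: non-questions treated as commands

print(f"mean prompt length (words): {mean(prompt_lengths):.1f}")
print(f"mean messages per conversation: {mean(message_counts):.1f}")
print(f"questions vs commands: {questions} / {commands}")
```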
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Discover how AI code interpreters are revolutionizing data visualization, reducing chart creation time from 20 to 5 minutes while simplifying complex statistical analysis.
ChatGPT and AI have the potential to revolutionize the way we interact with computers, specifically in the field of medical diagnostics. ChatGPT can make conversations between doctors and patients more natural, while AI can analyze vast amounts of patient data to identify trends and estimate a patient’s health. Patients can use ChatGPT to better understand their medical conditions, and both ChatGPT and AI can be used to automate tasks such as scheduling appointments and processing test results. However, there are limitations to using AI, including data bias, complex results, and analysis errors. To reduce errors, it is important to validate findings using various techniques and ensure that data is accurate and up-to-date. ChatGPT also employs security measures to protect patient data privacy and confidentiality.
https://www.reddit.com/wiki/api
Here you can find about 50K comments from Reddit regarding ChatGPT. The comments are gathered from posts in four subreddits.
The data includes the fields comment_id, comment_parent_id, comment_body, and subreddit.
The date and other comment-related information will be added in the next version. This dataset is useful for getting insight into the public take on ChatGPT, and also for text analysis, text visualization, inline question answering, text summarization, NER, and other tasks such as clustering.
Please note that this dataset is not cleaned or preprocessed, so if you want to get your hands dirty with the data, it's a good opportunity to level up your data-cleaning skills too :)
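As a starting point for that cleaning, the sketch below loads the comments with pandas and applies a very light first pass; the file name and cleaning rules are illustrative assumptions.

```python
# Sketch: load the Reddit comments and do a first, very light cleaning pass.
# "chatgpt_reddit_comments.csv" is a placeholder file name.
import pandas as pd

df = pd.read_csv("chatgpt_reddit_comments.csv")

# Drop deleted/removed comments and obvious duplicates.
df = df[~df["comment_body"].isin(["[deleted]", "[removed]"])]
df = df.drop_duplicates(subset="comment_id")

# Normalise whitespace and strip URLs before downstream text analysis.
df["clean_body"] = (
    df["comment_body"]
    .str.replace(r"http\S+", "", regex=True)
    .str.replace(r"\s+", " ", regex=True)
    .str.strip()
)

print(df.groupby("subreddit")["comment_id"].count())
```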
And please don't forget to UPVOTE if you find it useful.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Objective: To assess the competence of students and academic staff to use generative artificial intelligence (GenAI) as a tool in epidemiological data analyses in a randomised controlled trial (RCT).
Methods: We invited postgraduate students and academic staff at the Swiss Tropical and Public Health Institute to the RCT. Participants were randomized to analyse a simulated cross-sectional dataset using ChatGPT’s code interpreter (integrated analysis arm) vs. statistical software (R/Stata) with ChatGPT as a support tool (distributed analysis arm). The primary outcome was the trial task score (out of 17, using an assessment rubric). The secondary outcome was the time to complete the task.
Results: We invited 338 individuals, randomized 31 participants equally to the two study arms, and 30 participants submitted results. Overall, there was no statistically significant difference in mean task scores between the distributed analysis arm (8.5, ±4.6) and the integrated analysis arm (9.4, ±3.8), with a mean difference of 0.93 (p = 0.55). Mean task completion time was significantly shorter in the integrated analysis arm compared to the distributed analysis arm.
Conclusion: While ChatGPT offers advantages, its effective use requires a careful balance of GenAI capabilities and human expertise.
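The arm comparison reported above is the kind of analysis a standard two-sample t-test covers; the sketch below shows it on made-up score arrays, not the trial data.

```python
# Sketch: comparing task scores between two study arms with a two-sample
# t-test (scipy). The scores below are made-up placeholders, not trial data.
import numpy as np
from scipy import stats

integrated = np.array([9, 12, 7, 11, 10, 8, 13, 6, 9, 11, 10, 8, 12, 9, 7])
distributed = np.array([8, 10, 5, 9, 7, 12, 6, 11, 8, 9, 4, 10, 7, 13, 8])

t_stat, p_value = stats.ttest_ind(integrated, distributed, equal_var=False)
diff = integrated.mean() - distributed.mean()
print(f"mean difference: {diff:.2f}, t = {t_stat:.2f}, p = {p_value:.2f}")
```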
https://tickertrends.io/terms
Monthly dataset tracking topic frequency, keyword volume, and conversation patterns across ChatGPT discussions. Data is normalized on a 0 to 100 scale for easy comparison. Aggregates millions of AI interactions to reveal emerging trends, user interests, and discussion momentum across technology, finance, health, education, and business categories.
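One common way to produce such a 0 to 100 scale is min-max normalisation; the sketch below applies it to made-up monthly keyword volumes, without implying this is the provider's exact method.

```python
# Sketch: min-max normalisation of monthly keyword volumes onto a 0-100 scale.
# The raw volumes are illustrative placeholders.
import pandas as pd

volumes = pd.Series(
    [1200, 3400, 2800, 5100, 4600, 6900],
    index=["Jan", "Feb", "Mar", "Apr", "May", "Jun"],
    name="keyword_volume",
)

normalised = 100 * (volumes - volumes.min()) / (volumes.max() - volumes.min())
print(normalised.round(1))
```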
https://creativecommons.org/publicdomain/zero/1.0/
This dataset contains a CSV file of tweets related to ChatGPT, OpenAI's conversational AI model, collected using the keywords (chatgpt, chat gpt), #hashtags, and @mentions about ChatGPT. The file includes information on 500,000 tweets. The dataset aims to help understand public opinion, trends, and potential applications of ChatGPT by analyzing tweet volume, sentiment, user engagement, and the influence of key AI events. It offers valuable insights for companies, researchers, and policymakers, allowing them to make informed decisions and shape the future of AI-powered conversational technologies.
Check out my comprehensive analysis of this dataset in the Medium article "Cracking the ChatGPT Code: A Deep Dive into 500,000 Tweets using Advanced NLP Techniques".
Learn about the collection process in the Medium article "Effortlessly Scraping Massive Twitter Data".
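As one illustration of sentiment scoring on such tweets, the sketch below uses NLTK's VADER analyser; the file path, column name, and choice of VADER are assumptions and do not necessarily reflect the methods in the linked articles.

```python
# Sketch: sentiment scoring of tweet text with NLTK's VADER analyser.
# The CSV path and "tweet_text" column are placeholders.
import nltk
import pandas as pd
from nltk.sentiment import SentimentIntensityAnalyzer

nltk.download("vader_lexicon", quiet=True)
sia = SentimentIntensityAnalyzer()

tweets = pd.read_csv("chatgpt_tweets.csv")
tweets["compound"] = tweets["tweet_text"].astype(str).map(
    lambda t: sia.polarity_scores(t)["compound"]
)
tweets["sentiment"] = pd.cut(
    tweets["compound"], bins=[-1, -0.05, 0.05, 1],
    labels=["negative", "neutral", "positive"],
)
print(tweets["sentiment"].value_counts())
```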
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This repository contains links to all the experiments I ran related to my article titled "Using LLM for finding security vulnerabilities."
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This repository contains the dataset and code used in the study titled “Academic Discourse on ChatGPT in Social Sciences: A Topic Modeling and Sentiment Analysis of Research Article Abstracts.” The study explores how social science scholars frame and evaluate ChatGPT by analyzing 1,227 SSCI-indexed abstracts using Latent Dirichlet Allocation (LDA) topic modeling and lexicon-based sentiment analysis. The data include the collected abstracts (with metadata), while the code files provide the full analytical pipeline in Python and R, covering preprocessing, topic modeling, sentiment scoring using the NRC Emotion Lexicon, and visualization scripts. This repository supports transparency, reproducibility, and reuse of the study’s computational methods and underlying materials.
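As an illustration of the topic-modeling step, the sketch below fits an LDA model to abstract text with scikit-learn; the file name, column name, and number of topics are assumptions, and the repository's own pipeline may use different libraries.

```python
# Sketch: LDA topic modeling on article abstracts with scikit-learn.
# File name, column name, and the number of topics are illustrative assumptions.
import pandas as pd
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

abstracts = pd.read_csv("ssci_abstracts.csv")["abstract"].dropna()

vectorizer = CountVectorizer(stop_words="english", max_df=0.95, min_df=5)
dtm = vectorizer.fit_transform(abstracts)

lda = LatentDirichletAllocation(n_components=8, random_state=42)
lda.fit(dtm)

terms = vectorizer.get_feature_names_out()
for k, topic in enumerate(lda.components_):
    top_terms = [terms[i] for i in topic.argsort()[-10:][::-1]]
    print(f"Topic {k}: {', '.join(top_terms)}")
```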
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Artificial Intelligence (AI) applications are expected to promote government service delivery and quality, more efficient handling of cases, and bias reduction in decision-making. One potential benefit of the AI tool ChatGPT is that it may support governments in the anonymization of data. However, it is not clear whether ChatGPT is appropriate to support data anonymization for public organizations. Hence, this study examines the possibilities, risks, and ethical implications for government organizations to employ ChatGPT in the anonymization of personal data. We use a case study approach, combining informal conversations, formal interviews, a literature review, document analysis and experiments to conduct a three-step study. First, we describe the technology behind ChatGPT and its operation. Second, experiments with three types of data (fake data, original literature and modified literature) show that ChatGPT exhibits strong performance in anonymizing these three types of texts. Third, an overview of significant risks and ethical issues related to ChatGPT and its use for anonymization within a specific government organization was generated, including themes such as privacy, responsibility, transparency, bias, human intervention, and sustainability. One significant risk in the current form of ChatGPT is a privacy risk, as inputs are stored and forwarded to OpenAI and potentially other parties. This is unacceptable if texts containing personal data are anonymized with ChatGPT. We discuss several potential solutions to address these risks and ethical issues. This study contributes to the scarce scientific literature on the potential value of employing ChatGPT for personal data anonymization in government. In addition, this study has practical value for civil servants who face the challenges of data anonymization in practice including resource-intensive and costly processes.
ChatGPT is widely used for writing tasks, yet its effects on medical students’ academic writing remain underexplored. This study aims to elucidate ChatGPT’s impact on academic writing efficiency and quality among medical students, while also evaluating students’ attitudes towards its use in academic writing. We collected systematic reviews from 130 third-year medical students and administered a questionnaire to assess ChatGPT usage and student attitudes. Three independent reviewers graded the papers using EASE guidelines, and statistical analysis compared articles generated with or without ChatGPT assistance across various parameters, with rigorous quality control ensuring survey reliability and validity. In this study, 33 students (25.8%) utilized ChatGPT for writing (ChatGPT group) and 95 (74.2%) did not (Control group). The ChatGPT group exhibited significantly higher daily technology use and prior experience with ChatGPT (p < 0.05). Writing time was significantly reduced in the ChatGPT group (p = 0.04), with 69.7% completing tasks within 2–3 days compared to 48.4% in the control group. They also achieved higher article quality scores (p < 0.0001) with improvements in completeness, credibility, and scientific content. Self-assessment indicated enhanced writing skills (p < 0.01), confidence (p < 0.001), satisfaction (p < 0.001) and a positive attitude toward its future use in the ChatGPT group. Integrating ChatGPT in medical academic writing, with proper guidance, improves efficiency and quality, illustrating artificial intelligence’s potential in shaping medical education methodologies.
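The completion-rate comparison above (69.7% vs. 48.4%) can be illustrated with a two-proportion z-test; the sketch below back-calculates counts from the reported percentages and group sizes, and the choice of test is an assumption rather than the study's documented analysis.

```python
# Sketch: two-proportion z-test on task completion within 2-3 days.
# Counts are back-calculated from the percentages in the abstract (23/33 vs 46/95);
# the test itself is illustrative and may not match the study's own analysis.
import numpy as np
from statsmodels.stats.proportion import proportions_ztest

completed = np.array([23, 46])   # ChatGPT group, control group
totals = np.array([33, 95])

z_stat, p_value = proportions_ztest(completed, totals)
print(f"z = {z_stat:.2f}, p = {p_value:.3f}")
```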
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Background: Clinical data is instrumental to medical research, machine learning (ML) model development, and advancing surgical care, but access is often constrained by privacy regulations and missing data. Synthetic data offers a promising solution to preserve privacy while enabling broader data access. Recent advances in large language models (LLMs) provide an opportunity to generate synthetic data with reduced reliance on domain expertise, computational resources, and pre-training.
Objective: This study aims to assess the feasibility of generating realistic tabular clinical data with OpenAI’s GPT-4o using zero-shot prompting, and evaluate the fidelity of LLM-generated data by comparing its statistical properties to the Vital Signs DataBase (VitalDB), a real-world open-source perioperative dataset.
Methods: In Phase 1, GPT-4o was prompted to generate a dataset with qualitative descriptions of 13 clinical parameters. The resultant data was assessed for general errors, plausibility of outputs, and cross-verification of related parameters. In Phase 2, GPT-4o was prompted to generate a dataset using descriptive statistics of the VitalDB dataset. Fidelity was assessed using two-sample t-tests, two-sample proportion tests, and 95% confidence interval (CI) overlap.
Results: In Phase 1, GPT-4o generated a complete and structured dataset comprising 6,166 case files. The dataset was plausible in range and correctly calculated body mass index for all case files based on respective heights and weights. Statistical comparison between the LLM-generated datasets and VitalDB revealed that Phase 2 data achieved significant fidelity. Phase 2 data demonstrated statistical similarity in 12/13 (92.31%) parameters, whereby no statistically significant differences were observed in 6/6 (100.0%) categorical/binary and 6/7 (85.71%) continuous parameters. Overlap of 95% CIs was observed in 6/7 (85.71%) continuous parameters.
Conclusion: Zero-shot prompting with GPT-4o can generate realistic tabular synthetic datasets, which can replicate key statistical properties of real-world perioperative data. This study highlights the potential of LLMs as a novel and accessible modality for synthetic data generation, which may address critical barriers in clinical data access and eliminate the need for technical expertise, extensive computational resources, and pre-training. Further research is warranted to enhance fidelity and investigate the use of LLMs to amplify and augment datasets, preserve multivariate relationships, and train robust ML models.
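As an illustration of the fidelity checks described above for a single continuous parameter, the sketch below runs a two-sample t-test and a 95% CI overlap check on simulated placeholder values, not VitalDB data or the study's actual pipeline.

```python
# Sketch: fidelity check for one continuous parameter between a synthetic and a
# real dataset: two-sample t-test plus 95% CI overlap.
# The values are simulated placeholders, not VitalDB or GPT-4o output.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
real = rng.normal(loc=58.0, scale=15.0, size=6000)       # "real" sample
synthetic = rng.normal(loc=58.5, scale=15.3, size=6000)  # "synthetic" sample

t_stat, p_value = stats.ttest_ind(real, synthetic, equal_var=False)

def ci95(x):
    se = stats.sem(x)
    return x.mean() - 1.96 * se, x.mean() + 1.96 * se

lo_r, hi_r = ci95(real)
lo_s, hi_s = ci95(synthetic)
overlap = max(lo_r, lo_s) <= min(hi_r, hi_s)

print(f"t = {t_stat:.2f}, p = {p_value:.3f}, 95% CI overlap: {overlap}")
```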
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains all available conversations from chatlogs.net between users and ChatGPT. Version 1 contains all conversations available up to the cutoff date of April 4, 2023; Version 2 contains all conversations available up to the cutoff date of April 20, 2023.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Background: ChatGPT, developed by OpenAI, is an artificial intelligence software designed to generate text-based responses. The objective of this study is to evaluate the accuracy and consistency of ChatGPT’s responses to single-choice questions pertaining to carbon monoxide poisoning. This evaluation will contribute to our understanding of the reliability of ChatGPT-generated information in the medical field.
Methods: The questions utilized in this study were selected from the "Medical Exam Assistant (Yi Kao Bang)" application and encompassed a range of topics related to carbon monoxide poisoning. A total of 44 single-choice questions were included in the study following a screening process. Each question was entered into ChatGPT ten times in Chinese and, after translation into English, ten times in English. The responses generated by ChatGPT were subjected to statistical analysis with the objective of assessing their accuracy and consistency in both languages. In this assessment process, the "Medical Exam Assistant (Yi Kao Bang)" reference responses were employed as benchmarks. The data analysis was conducted using Python.
Results: In approximately 50% of the cases, the responses generated by ChatGPT exhibited a high degree of consistency, whereas in approximately one-third of the cases, the responses exhibited unacceptable blurring of the answers. Meanwhile, the accuracy of these responses was less favorable, with an accuracy rate of 61.1% in Chinese and 57% in English. This indicates that ChatGPT could be enhanced with respect to both consistency and accuracy in responding to queries pertaining to carbon monoxide poisoning.
Conclusions: It is currently evident that the consistency and accuracy of responses generated by ChatGPT regarding carbon monoxide poisoning are inadequate. Although it offers significant insights, it should not supersede the role of healthcare professionals in making clinical decisions.
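As an illustration of how accuracy and consistency can be summarised when each question is posed ten times per language, the sketch below assumes a long-format response table; the file and column names are placeholders, not this dataset's documented structure.

```python
# Sketch: accuracy and consistency of repeated ChatGPT answers per question.
# Assumes one row per (question, repetition) holding the chosen option and the
# reference answer; the file and column names are placeholders.
import pandas as pd

df = pd.read_csv("chatgpt_co_poisoning_responses.csv")
# expected columns: question_id, language, chatgpt_answer, reference_answer

def summarise(group):
    modal_share = group["chatgpt_answer"].value_counts(normalize=True).iloc[0]
    accuracy = (group["chatgpt_answer"] == group["reference_answer"]).mean()
    return pd.Series({"consistency": modal_share, "accuracy": accuracy})

summary = df.groupby(["language", "question_id"]).apply(summarise)
print(summary.groupby("language").mean())
```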
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This is the raw data used in the publication "ChatGPT as an education and learning tool for engineering, technology and general studies: performance analysis of ChatGPT 3.0 on CSE, GATE and JEE examinations of India."
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset corresponds to a study carried out to analyse 10 bibliographic references for each of 10 Spanish authors in the field of Information Sciences, requested from the ChatGPT chatbot.
The file "Bibliographic_references_ analysis" contains the 10 references returned by ChatGPT for each of the 10 authors (a total of 100 references), together with the variables analysed to check their authenticity.
The "Keywords_analysis" file contains the normalisation carried out on the words considered to be key words extracted from the titles of the works, according to which a word cloud showing the frequency of occurrence could be drawn up.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
ChatGPT has forever changed the way that many industries operate. Much of the focus on Artificial Intelligence (AI) has been on its ability to generate text. However, it is likely that its ability to generate computer code and scripts will also have a major impact. We demonstrate the use of ChatGPT to generate Python scripts to perform hydrological analyses and highlight the opportunities, limitations and risks that AI poses in the hydrological sciences.
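As a flavour of the kind of script involved, the sketch below computes a flow duration curve from a daily streamflow series; it is an illustrative example in the same spirit, not one of the worked examples in this dataset, and the file and column names are assumptions.

```python
# Sketch: flow duration curve from daily streamflow, the kind of routine
# hydrological analysis a ChatGPT-generated script might perform.
# "daily_flow.csv" with columns date, discharge_m3s is a placeholder.
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

flow = pd.read_csv("daily_flow.csv", parse_dates=["date"])
q = flow["discharge_m3s"].dropna().sort_values(ascending=False).to_numpy()

# Exceedance probability: rank / (n + 1), expressed as a percentage.
exceedance = 100 * np.arange(1, len(q) + 1) / (len(q) + 1)

plt.semilogy(exceedance, q)
plt.xlabel("Exceedance probability (%)")
plt.ylabel("Discharge (m$^3$ s$^{-1}$)")
plt.title("Flow duration curve")
plt.savefig("flow_duration_curve.png", dpi=200)
```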
Here, we provide four worked examples of the use of ChatGPT to generate scripts to conduct hydrological analyses. We also provide a full list of the libraries available to the ChatGPT Advanced Data Analysis plugin (only available in the paid version). These files relate to a manuscript that is to be submitted to Hydrological Processes. The authors of the manuscript are Dylan J. Irvine, Landon J.S. Halloran and Philip Brunner.
If you find these examples useful and/or use them, we would appreciate if you could cite the associated publication in Hydrological Processes. Details to be made available upon final publication.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This study explores the potential of ChatGPT, a large language model, in scientometrics by assessing its ability to predict citation counts, Mendeley readers, and social media engagement. In this study, 2222 abstracts from PLOS ONE articles published during the initial months of 2022 were analyzed with ChatGPT-4, which assessed each abstract against a set of 60 criteria. Using a principal component analysis, three components were identified: Quality and Reliability, Accessibility and Understandability, and Novelty and Engagement. The Accessibility and Understandability of the abstracts correlated with higher Mendeley readership, while Novelty and Engagement and Accessibility and Understandability were linked to citation counts (Dimensions, Scopus, Google Scholar) and social media attention. Quality and Reliability showed minimal correlation with citation and altmetrics outcomes. Finally, it was found that the predictive correlations of the ChatGPT-based assessments surpassed traditional readability metrics. The findings highlight the potential of large language models in scientometrics and possibly pave the way for AI-assisted peer review.
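As an illustration of the dimension-reduction step, the sketch below standardises per-abstract criterion scores and extracts three principal components with scikit-learn; the input file and column layout are assumptions, not the study's actual materials.

```python
# Sketch: PCA on ChatGPT-assigned criterion scores (one row per abstract,
# one column per criterion). File name and layout are placeholders.
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

scores = pd.read_csv("abstract_criterion_scores.csv", index_col="abstract_id")

X = StandardScaler().fit_transform(scores)   # 60 criteria, standardised
pca = PCA(n_components=3)
components = pca.fit_transform(X)

print("explained variance ratio:", pca.explained_variance_ratio_.round(3))
loadings = pd.DataFrame(pca.components_.T, index=scores.columns,
                        columns=["PC1", "PC2", "PC3"])
print(loadings.abs().idxmax())   # criterion with the largest loading per component
```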