Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Discover how AI code interpreters are revolutionizing data visualization, reducing chart creation time from 20 to 5 minutes while simplifying complex statistical analysis.
https://www.reddit.com/wiki/api
This dataset contains about 50K Reddit comments about ChatGPT, gathered from posts in 4 subreddits.
The data includes comment_id, comment_parent_id, comment_body, and subreddit.
The date and other comment metadata will be added in the next version. The dataset is useful for gauging public opinion on ChatGPT and for text analysis, text visualization, inline question answering, text summarization, NER, and other tasks such as clustering.
Please note that this dataset is not cleaned or preprocessed, so if you want to get your hands dirty with data, it is also good practice for leveling up your data cleaning skills :)
And please don't forget to UPVOTE it if you find it useful.
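Because each row carries both comment_id and comment_parent_id, reply threads can be reconstructed from the flat table. The following is a minimal sketch, assuming parent ids match comment ids directly (real Reddit ids may carry type prefixes that need stripping first); the rows and the function names `build_reply_index` and `thread_depth` are made up for illustration.

```python
from collections import defaultdict

# Toy rows using the four documented fields; all values are invented.
comments = [
    {"comment_id": "c1", "comment_parent_id": "p1", "comment_body": "ChatGPT is great", "subreddit": "example"},
    {"comment_id": "c2", "comment_parent_id": "c1", "comment_body": "Agreed", "subreddit": "example"},
    {"comment_id": "c3", "comment_parent_id": "c1", "comment_body": "Not for math", "subreddit": "example"},
    {"comment_id": "c4", "comment_parent_id": "c3", "comment_body": "Fair point", "subreddit": "example"},
]


def build_reply_index(rows):
    """Map each parent id to the list of comment ids that reply to it."""
    children = defaultdict(list)
    for row in rows:
        children[row["comment_parent_id"]].append(row["comment_id"])
    return children


def thread_depth(children, root_id):
    """Return the number of reply levels below root_id (0 if it has no replies)."""
    replies = children.get(root_id, [])
    if not replies:
        return 0
    return 1 + max(thread_depth(children, reply) for reply in replies)


reply_index = build_reply_index(comments)
```

Calling `thread_depth(reply_index, "p1")` walks from the post id down through its nested replies, which is one way to spot long discussions before any text cleaning.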
The rapid advancements in generative AI models present new opportunities in the education sector. However, it is imperative to acknowledge and address the potential risks and concerns that may arise with their use. We collected Twitter data to identify key concerns related to the use of ChatGPT in education. This dataset is used to support the study "ChatGPT in education: A discourse analysis of worries and concerns on social media."
In this study, we particularly explored two research questions. RQ1 (Concerns): What are the key concerns that Twitter users perceive with using ChatGPT in education? RQ2 (Accounts): Which accounts are implicated in the discussion of these concerns? In summary, our study underscores the importance of responsible and ethical use of AI in education and highlights the need for collaboration among stakeholders to regulate AI policy.
https://creativecommons.org/publicdomain/zero/1.0/
The dataset contains a total of 193,154 unique reviews about ChatGPT collected between 2023-07-25 and 2024-08-23, providing valuable insights into user feedback and ratings. The reviews are categorized into three groups, with 4% classified as "Good," 1% as "Nice," and the remaining 95% falling into the "Other" category. Ratings in the dataset range from 1 to 5, with 5 being the maximum and 1 the minimum, reflecting user satisfaction levels. The dataset is structured into 4 columns, likely including the review content, category, date, and rating. This data offers a comprehensive overview of user sentiment towards ChatGPT, making it ideal for analyzing satisfaction trends, identifying areas for improvement, and understanding user experiences with the AI service.
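Category shares and the average rating described above can be recomputed from the raw rows. This is a rough sketch only: the description says the four columns are "likely" review content, category, date, and rating, so the column order, the sample rows, and the helper names `share_by_key` and `mean_rating` are all assumptions.

```python
from collections import Counter

# Hypothetical rows: (review text, category, date, rating). Order is assumed.
reviews = [
    ("Works well", "Good", "2023-07-25", 5),
    ("Nice app", "Nice", "2023-08-01", 4),
    ("Crashes on login", "Other", "2024-01-15", 1),
    ("Helpful answers", "Other", "2024-08-23", 5),
]


def share_by_key(rows, index):
    """Return each distinct value's share of rows, as a fraction of the total."""
    counts = Counter(row[index] for row in rows)
    total = sum(counts.values())
    return {key: count / total for key, count in counts.items()}


def mean_rating(rows):
    """Average the 1-5 rating stored in the fourth position of each row."""
    return sum(row[3] for row in rows) / len(rows)


category_shares = share_by_key(reviews, 1)
average = mean_rating(reviews)
```

The same `share_by_key` helper can be pointed at the rating column (index 3) to get the rating distribution instead of the category split.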
Analysis of 13,252 publicly shared ChatGPT conversations by WebFX to uncover usage statistics: prompt length, message count, question vs. command distribution, and use-case categories.
ChatGPT and AI have the potential to change the way we interact with computers, specifically in the field of medical diagnostics. ChatGPT can make conversations between doctors and patients more natural, while AI can analyze vast amounts of patient data to identify trends and estimate a patient’s health. Patients can use ChatGPT to better understand their medical conditions, and both ChatGPT and AI can be used to automate tasks such as scheduling appointments and processing test results. However, there are limitations to using AI, including data bias, complex results, and analysis errors. To reduce errors, it is important to validate findings using various techniques and ensure that data is accurate and up-to-date. ChatGPT also employs security measures to protect patient data privacy and confidentiality.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This repository contains the dataset and code used in the study titled “Academic Discourse on ChatGPT in Social Sciences: A Topic Modeling and Sentiment Analysis of Research Article Abstracts.” The study explores how social science scholars frame and evaluate ChatGPT by analyzing 1,227 SSCI-indexed abstracts using Latent Dirichlet Allocation (LDA) topic modeling and lexicon-based sentiment analysis. The data include the collected abstracts (with metadata), while the code files provide the full analytical pipeline in Python and R, covering preprocessing, topic modeling, sentiment scoring using the NRC Emotion Lexicon, and visualization scripts. This repository supports transparency, reproducibility, and reuse of the study’s computational methods and underlying materials.
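To give a feel for what lexicon-based sentiment scoring involves, here is a minimal sketch. It is not the repository's pipeline: the four-word lexicon below is a toy stand-in and is NOT the NRC Emotion Lexicon, and the function names and the scoring rule (positive hits minus negative hits, divided by token count) are assumptions chosen for illustration.

```python
import re
from collections import Counter

# Toy lexicon for illustration only; it is NOT the NRC Emotion Lexicon.
TOY_LEXICON = {
    "promising": "positive",
    "useful": "positive",
    "risk": "negative",
    "bias": "negative",
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def sentiment_counts(text, lexicon=TOY_LEXICON):
    """Count lexicon hits per label for one abstract."""
    return Counter(lexicon[token] for token in tokenize(text) if token in lexicon)


def net_sentiment(text, lexicon=TOY_LEXICON):
    """Positive hits minus negative hits, divided by the number of tokens."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    counts = sentiment_counts(text, lexicon)
    return (counts["positive"] - counts["negative"]) / len(tokens)


abstract = "ChatGPT is a promising and useful tool, but bias remains a risk."
```

Normalizing by token count keeps long and short abstracts comparable; a real lexicon would supply many more words and, in the NRC case, emotion categories in addition to polarity.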
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Artificial Intelligence (AI) applications are expected to promote government service delivery and quality, more efficient handling of cases, and bias reduction in decision-making. One potential benefit of the AI tool ChatGPT is that it may support governments in the anonymization of data. However, it is not clear whether ChatGPT is appropriate to support data anonymization for public organizations. Hence, this study examines the possibilities, risks, and ethical implications for government organizations to employ ChatGPT in the anonymization of personal data. We use a case study approach, combining informal conversations, formal interviews, a literature review, document analysis and experiments to conduct a three-step study. First, we describe the technology behind ChatGPT and its operation. Second, experiments with three types of data (fake data, original literature and modified literature) show that ChatGPT exhibits strong performance in anonymizing these three types of texts. Third, an overview of significant risks and ethical issues related to ChatGPT and its use for anonymization within a specific government organization was generated, including themes such as privacy, responsibility, transparency, bias, human intervention, and sustainability. One significant risk in the current form of ChatGPT is a privacy risk, as inputs are stored and forwarded to OpenAI and potentially other parties. This is unacceptable if texts containing personal data are anonymized with ChatGPT. We discuss several potential solutions to address these risks and ethical issues. This study contributes to the scarce scientific literature on the potential value of employing ChatGPT for personal data anonymization in government. In addition, this study has practical value for civil servants who face the challenges of data anonymization in practice including resource-intensive and costly processes.
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
This dataset consists of daily-updated user reviews and ratings for the ChatGPT Android App. The dataset includes several key attributes that capture various aspects of the reviews, providing insights into user experiences and feedback over time.
As large language models (LLMs) have become more deeply integrated into various sectors, understanding how they make moral judgements has become crucial, particularly in the realm of autonomous driving. This study used the moral machine framework to investigate the ethical decision-making tendencies of prominent LLMs, including GPT-3.5, GPT-4, PaLM 2 and Llama 2, to compare their responses with human preferences. While LLMs' and humans' preferences such as prioritizing humans over pets and favouring saving more lives are broadly aligned, PaLM 2 and Llama 2, especially, evidence distinct deviations. Additionally, despite the qualitative similarities between the LLM and human preferences, there are significant quantitative disparities, suggesting that LLMs might lean toward more uncompromising decisions, compared with the milder inclinations of humans. These insights elucidate the ethical frameworks of LLMs and their potential implications for autonomous driving.
Using the Moral Machine (MM) methodology detailed in the supplementary information of https://www.nature.com/articles/s41586-018-0637-6, we implemented code for generating Moral Machine scenarios. After generating the MM scenarios, responses from GPT-3.5, GPT-4, PaLM 2, and Llama 2 were collected using the application programming interface (API) and relevant code. We applied the conjoint analysis framework to evaluate the relative importance of the nine preferences.
# Data and Code on the Moral Machine Experiment on Large Language Models
https://doi.org/10.5061/dryad.d7wm37q6v
pip install -r requirements.txt
NOTE: The script run_chatgpt.py requires an OpenAI API key. Please obtain your API key by following OpenAI's instructions. To run the script run_palm2.py, setup is required. Please refer to the Google Cloud instructions. Specifically, follow these sections in the given order: 1) Set up a project and a development environment and 2) Install the Vertex AI SDK for Python. Before running run_llama2.py, the Llama2 model files must be downloaded. Please follow [the instructi...
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The dataset contains the results of developing alternative text for images using chatbots based on large language models. The study was carried out in April-June 2024. Microsoft Copilot, Google Gemini, and YandexGPT chatbots were used to generate 108 text descriptions for 12 images. Descriptions were generated by chatbots using keywords specified by a person. The experts then rated the resulting descriptions on a Likert scale (from 1 to 5). The data set is presented in a Microsoft Excel table on the “Data” sheet with the following fields: record number; image number; chatbot; image type (photo, logo); request date; list of keywords; number of keywords; length of keywords; time of compilation of keywords; generated descriptions; required length of descriptions; actual length of descriptions; description generation time; usefulness; reliability; completeness; accuracy; literacy. The “Images” sheet contains links to the original images. The data set is in Russian.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
ChatGPT has forever changed the way that many industries operate. Much of the focus on Artificial Intelligence (AI) tools has been on their ability to generate text. However, it is likely that their ability to generate computer code and scripts will also have a major impact. We demonstrate the use of ChatGPT to generate Python scripts to perform hydrological analyses and highlight the opportunities, limitations and risks that AI poses in the hydrological sciences.
Here, we provide four worked examples of the use of ChatGPT to generate scripts to conduct hydrological analyses. We also provide a full list of the libraries available to the ChatGPT Advanced Data Analysis plugin (only available in the paid version). These files relate to a manuscript that is to be submitted to Hydrological Processes. The authors of the manuscript are Dylan J. Irvine, Landon J.S. Halloran and Philip Brunner.
If you find these examples useful and/or use them, we would appreciate it if you could cite the associated publication in Hydrological Processes. Details will be made available upon final publication.
https://spdx.org/licenses/CC0-1.0.html
Objective: Our objective is to evaluate the efficacy of ChatGPT 4 in accurately and effectively delivering genetic information, building on previous findings with ChatGPT 3.5. We focus on assessing the utility, limitations, and ethical implications of using ChatGPT in medical settings.
Materials and Methods: A structured questionnaire, including the Brief User Survey (BUS-15) and custom questions, was developed to assess ChatGPT 4's clinical value. An expert panel of genetic counselors and clinical geneticists independently evaluated ChatGPT 4's responses to these questions. We also performed a comparative analysis with ChatGPT 3.5, using descriptive statistics and R for data analysis.
Results: ChatGPT 4 demonstrated improvements over 3.5 in context recognition, relevance, and informativeness. However, performance variability and concerns about the naturalness of the output were noted. No significant difference in accuracy was found between ChatGPT 3.5 and 4.0. Notably, the efficacy of ChatGPT 4 varied significantly across different genetic conditions, with specific differences identified between responses related to BRCA1 and HFE.
Discussion and Conclusion: This study highlights ChatGPT 4's potential in genomics, noting significant advancements over its predecessor. Despite these improvements, challenges remain, including the risk of outdated information and the necessity of ongoing refinement. The variability in performance across different genetic conditions underscores the need for expert oversight and continuous AI training. ChatGPT 4, while showing promise, emphasizes the importance of balancing technological innovation with ethical responsibility in healthcare information delivery.
Methods
Study Design: This study was conducted to evaluate the performance of ChatGPT 4 (March 23rd, 2023 Model) in the context of genetic counseling and education. The evaluation involved a structured questionnaire, which included questions selected from the Brief User Survey (BUS-15) and additional custom questions designed to assess the clinical value of ChatGPT 4's responses.
Questionnaire Development: The questionnaire was built on Qualtrics, which comprised twelve questions: seven selected from the BUS-15 preceded by two additional questions that we designed. The initial questions focused on quality and answer relevancy:
1. The overall quality of the Chatbot’s response is: (5-point Likert: Very poor to Very Good)
2. The Chatbot delivered an answer that provided the relevant information you would include if asked the question. (5-point Likert: Strongly disagree to Strongly agree)
The BUS-15 questions (7-point Likert: Strongly disagree to Strongly agree) focused on:
1. Recognition and facilitation of users’ goal and intent: Chatbot seems able to recognize the user’s intent and guide the user to its goals.
2. Relevance of information: The chatbot provides relevant and appropriate information/answer to people at each stage to make them closer to their goal.
3. Maxim of quantity: The chatbot responds in an informative way without adding too much information.
4. Resilience to failure: Chatbot seems able to find ways to respond appropriately even when it encounters situations or arguments it is not equipped to handle.
5. Understandability and politeness: The chatbot seems able to understand input and convey correct statements and answers without ambiguity and with acceptable manners.
6. Perceived conversational credibility: The chatbot responds in a credible and informative way without adding too much information.
7. Meet the neurodiverse needs: Chatbot seems able to meet needs and be used by users independently from their health conditions, well-being, age, etc.
Expert Panel and Data Collection: A panel of experts (two genetic counselors and two clinical geneticists) was provided with a link to the survey containing the questions. They independently evaluated the responses from ChatGPT 4 without discussing the questions or answers among themselves until after the survey submission. This approach ensured unbiased evaluation.
https://www.marketreportanalytics.com/privacy-policy
The text analytics market is experiencing robust growth, projected to reach $10.49 billion in 2025 and exhibiting a remarkable Compound Annual Growth Rate (CAGR) of 39.90% from 2019 to 2033. This expansion is fueled by several key drivers. The increasing volume of unstructured data generated across various industries, including healthcare, finance, and customer service, necessitates sophisticated tools for extracting actionable insights. Furthermore, advancements in natural language processing (NLP), machine learning (ML), and artificial intelligence (AI) are empowering text analytics solutions with enhanced capabilities, such as sentiment analysis, topic modeling, and entity recognition. The rising adoption of cloud-based solutions also contributes to market growth, offering scalability, cost-effectiveness, and ease of access. Major industry players like IBM, Microsoft, and SAP are actively investing in research and development, driving innovation and expanding the market's capabilities. Competitive pressures are fostering a continuous improvement in the accuracy and efficiency of text analytics tools, making them increasingly attractive to businesses of all sizes. The growing demand for real-time insights and improved customer experience further propels market expansion. While the market enjoys significant growth momentum, certain challenges persist. Data security and privacy concerns remain paramount, necessitating robust security measures within text analytics platforms. The complexity of implementing and integrating these solutions into existing IT infrastructures can also pose a barrier to adoption, particularly for smaller businesses lacking dedicated data science teams. Furthermore, the accuracy and reliability of text analytics outputs can be affected by the quality and consistency of the input data. 
Overcoming these challenges through improved data governance, user-friendly interfaces, and robust customer support will be crucial for continued market expansion. Despite these restraints, the overall market outlook remains positive, driven by the continuous evolution of technology and the growing reliance on data-driven decision-making across diverse sectors. Recent developments include: In January 2023, Microsoft announced a new multibillion-dollar investment in ChatGPT maker OpenAI. ChatGPT automatically generates text based on written prompts in a more creative and advanced way than other chatbots. Through this investment, the company will accelerate breakthroughs in AI, and both companies will commercialize advanced technologies. In November 2022, Tntra and Invenio partnered to develop a platform that offers comprehensive data analysis on a firm. Throughout the process, Tntra offered complete engineering support and cooperation to Invenio. Tntra offers feeds, knowledge graphs, intelligent text extraction, and analytics, which enables Invenio to give information on seven parts of the business, such as false news identification, subject categorization, dynamic data extraction, article summaries, sentiment analysis, and keyword extraction. Key drivers for this market are: Growing Demand for Social Media Analytics, Rising Practice of Predictive Analytics. Potential restraints include: Growing Demand for Social Media Analytics, Rising Practice of Predictive Analytics. Notable trends are: Retail and E-commerce to Hold a Significant Share in Text Analytics Market.
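For readers who want to sanity-check growth figures like the CAGR quoted in this report, the standard compound-growth relationship can be sketched as follows. The function names `cagr` and `project` are illustrative, and the numbers used are made up for a round-trip check; they are not the market figures from the report.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate between two values separated by `years` years."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1


def project(value, rate, years):
    """Value after compounding `rate` once per year for `years` years."""
    return value * (1 + rate) ** years


# Round-trip check on made-up numbers, not on the report's market figures.
rate = cagr(100.0, 121.0, 2)
future = project(100.0, rate, 2)
```

Applying `project` to a reported base value requires knowing which year that value refers to, which is worth confirming against the report before extrapolating.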
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Given the importance of conversation practice in language learning, chatbots, especially ChatGPT, have attracted considerable attention for their ability to converse with learners using natural language. This review contributes to the literature by examining the currently unclear overall effect of using chatbots on language learning performance and comprehensively identifying important study characteristics that affect the overall effectiveness. We meta-analyzed 70 effect sizes from 28 studies, using robust variance estimation. The effects were assessed based on 18 study characteristics about learners, chatbots, learning objectives, context, communication/interaction, and methodological and pedagogical designs. Results indicated that using chatbots produced a positive overall effect on language learning performance (g = 0.486), compared to non-chatbot conditions. Moreover, four characteristics (i.e., educational level, language level, interface design, and interaction capability) affected the overall effectiveness. In an in-depth discussion on how the 18 characteristics are related to the effectiveness, future implications for practice and research are presented.
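The overall effect is reported as g, which is assumed here to be Hedges' g, a standardized mean difference with a small-sample correction. The sketch below shows how a single effect size of that kind is commonly computed from group summaries; it does not reproduce the study's robust variance estimation, and the function name and any input values are hypothetical.

```python
import math


def hedges_g(mean_t, mean_c, sd_t, sd_c, n_t, n_c):
    """Standardized mean difference with the small-sample correction factor J."""
    # Pooled variance across the treatment (chatbot) and control groups.
    pooled_var = ((n_t - 1) * sd_t**2 + (n_c - 1) * sd_c**2) / (n_t + n_c - 2)
    d = (mean_t - mean_c) / math.sqrt(pooled_var)
    # Common approximation of the correction: J = 1 - 3 / (4 * df - 1), df = n_t + n_c - 2.
    j = 1 - 3 / (4 * (n_t + n_c) - 9)
    return j * d
```

The correction factor J is always slightly below 1, so g shrinks the raw standardized difference a little, with the shrinkage becoming negligible as group sizes grow.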
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Background: ChatGPT, developed by OpenAI, is an artificial intelligence software designed to generate text-based responses. The objective of this study is to evaluate the accuracy and consistency of ChatGPT’s responses to single-choice questions pertaining to carbon monoxide poisoning. This evaluation will contribute to our understanding of the reliability of ChatGPT-generated information in the medical field.
Methods: The questions utilized in this study were selected from the "Medical Exam Assistant (Yi Kao Bang)" application and encompassed a range of topics related to carbon monoxide poisoning. A total of 44 single-choice questions were included in the study following a screening process. Each question was entered into ChatGPT ten times in Chinese, followed by a translation into English, where it was also entered ten times. The responses generated by ChatGPT were subjected to statistical analysis with the objective of assessing their accuracy and consistency in both languages. In this assessment process, the "Medical Exam Assistant (Yi Kao Bang)" reference responses were employed as benchmarks. The data analysis was conducted using Python.
Results: In approximately 50% of the cases, the responses generated by ChatGPT exhibited a high degree of consistency, whereas in approximately one-third of the cases, the responses exhibited unacceptable blurring of the answers. Meanwhile, the accuracy of these responses was less favorable, with an accuracy rate of 61.1% in Chinese and 57% in English. This indicates that ChatGPT could be enhanced with respect to both consistency and accuracy in responding to queries pertaining to carbon monoxide poisoning.
Conclusions: It is currently evident that the consistency and accuracy of responses generated by ChatGPT regarding carbon monoxide poisoning are inadequate. Although it offers significant insights, it should not supersede the role of healthcare professionals in making clinical decisions.
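One simple way to score repeated answers to a single-choice question is sketched below. The study's exact consistency measure is not given in this description, so treating consistency as the share of runs that agree with the most frequent answer is an assumption, as are the function names and the sample run.

```python
from collections import Counter


def modal_answer(answers):
    """Return the most frequent answer and the share of runs that gave it."""
    label, count = Counter(answers).most_common(1)[0]
    return label, count / len(answers)


def accuracy(answers, reference):
    """Fraction of runs whose answer matches the reference answer."""
    return sum(answer == reference for answer in answers) / len(answers)


# Made-up example: one question asked ten times, with reference answer "B".
runs = ["B", "B", "A", "B", "B", "C", "B", "B", "A", "B"]
label, agreement = modal_answer(runs)
```

Keeping the two measures separate matters because a model can be highly consistent while being consistently wrong, which accuracy alone would not reveal.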
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This study investigates the potential of ChatGPT 4 in the assessment of personality traits based on written texts. Using two publicly available datasets containing both written texts and self-assessments of the authors’ psychological traits based on the Big Five model, we aimed to evaluate the predictive performance of ChatGPT 4. For each sample text, we asked for numerical predictions on an eleven-point scale and compared them with the self-assessments. We also asked for ChatGPT 4 confidence scores on an eleven-point scale for each prediction. To keep the study within a manageable scope, a zero-prompt modality was chosen, although more sophisticated prompting strategies could potentially improve performance. The results show that ChatGPT 4 has moderate but significant abilities to automatically infer personality traits from written text. However, it also shows limitations in recognizing whether the input text is appropriate or representative enough to make accurate inferences, which could hinder practical applications. Furthermore, the results suggest that improved benchmarking methods could increase the efficiency and reliability of the evaluation process. These results pave the way for a more comprehensive evaluation of the capabilities of Large Language Models in assessing personality traits from written texts.
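Comparing numerical predictions with self-assessments usually comes down to an agreement statistic. The description does not say which one the study used, so the Pearson correlation below is an assumption, shown only as a minimal sketch of how predicted and self-reported trait scores on the same scale could be compared.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences of scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)
```

A correlation near zero would mean the predictions carry little information about the self-assessments, whereas the "moderate" abilities reported above would correspond to a clearly positive but far from perfect value.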
https://creativecommons.org/publicdomain/zero/1.0/
This dataset contains a CSV file of tweets about ChatGPT, OpenAI's conversational AI model, matched on keywords (chatgpt, chat gpt), #hashtags, and @mentions. The file includes information on 500,000 tweets. The dataset aims to help understand public opinion, trends, and potential applications of ChatGPT by analyzing tweet volume, sentiment, user engagement, and the influence of key AI events. The dataset offers valuable insights for companies, researchers, and policymakers, allowing them to make informed decisions and shape the future of AI-powered conversational technologies.
Check out my Comprehensive Analysis on this dataset: Medium article "Cracking the ChatGPT Code: A Deep Dive into 500,000 Tweets using Advanced NLP Techniques"
Learn about the collection process in Medium article "Effortlessly Scraping Massive Twitter Data"
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Please be advised that this project is intended solely for instructional purposes and should not be used for actual research. This dataset is intended to complement the instructional material and provide a hands-on learning experience for the workshop: Handling and Sharing Qualitative Data Responsibly and Effectively.
This hypothetical research project is designed to demonstrate key concepts related to human subject qualitative data management and thematic analysis coding. It includes interview transcripts generated with ChatGPT 4.0 Mini for a fictional graduate student in Communication named Sarah, whose main research question is: How do content creators/digital influencers view their role in shaping their followers' consumer behavior, and what ethical dilemmas do they face when promoting products?
Given the novelty of this research topic and the limited academic literature available, Sarah hopes that the insights gained from this small-scale qualitative exploratory study will help identify key variables for a larger survey study with a representative sample of content creators/digital influencers across the U.S.
Sarah has previous experience with quantitative methods but is very new to qualitative research and could use our help to handle the data better. Having already conducted six short structured interviews with subjects from top revenue niches (i.e., Home Decor and DIY, Travel & Adventure, Fashion & Style, Health & Wellness, Finance & Investment, Beauty & Skincare) and planning to conduct a dozen more, Sarah is eager to begin engaging with the data she has collected so far and deciding how to best organize and interpret it. We’ll be walking her through this process, providing the necessary guidance and support for effective and responsible data management.
Interviews were conducted over Zoom and audio recorded with participants' consent. Each interview included four main questions, which were consistent across all interviews:
Q1. Please tell me a little about your work as a content creator/digital influencer, how it started, and how you have established yourself in your current niche.
Q2. In what ways do you believe content creators/digital influencers shape consumer behavior? Could you share any examples?
Q3. What strategies would you say content creators/digital influencers typically use to increase sales of sponsored products and services? Which ones have you used? What worked and what did not work for you? Why?
Q4. In your view, what are the essential ethical responsibilities that content creators and digital influencers should uphold? Can you share any personal experiences that illustrate these responsibilities in action?
Each interview generated approximately 15 minutes of audio recording, which Sarah manually transcribed. Sarah decided to keep the transcription true to the recordings and seek assistance to mitigate any risk of identification.
https://creativecommons.org/publicdomain/zero/1.0/
Demographic Analysis of Shopping Behavior: Insights and Recommendations
Dataset Information: The Shopping Mall Customer Segmentation Dataset comprises 15,079 unique entries, featuring Customer ID, age, gender, annual income, and spending score. This dataset assists in understanding customer behavior for strategic marketing planning.
Cleaned Data Details: The data were cleaned and standardized, leaving 15,079 unique entries with the attributes Customer ID, age, gender, annual income, and spending score. Marketing analysts can use the data to develop better mall-specific marketing strategies.
Challenges Faced: 1. Data Cleaning: Overcoming inconsistencies and missing values required meticulous attention. 2. Statistical Analysis: Interpreting demographic data accurately demanded collaborative effort. 3. Visualization: Crafting informative visuals to convey insights effectively posed design challenges.
Research Topics: 1. Consumer Behavior Analysis: Exploring psychological factors driving purchasing decisions. 2. Market Segmentation Strategies: Investigating effective targeting based on demographic characteristics.
Suggestions for Project Expansion: 1. Incorporate External Data: Integrate social media analytics or geographic data to enrich customer insights. 2. Advanced Analytics Techniques: Explore advanced statistical methods and machine learning algorithms for deeper analysis. 3. Real-Time Monitoring: Develop tools for agile decision-making through continuous customer behavior tracking. This summary outlines the demographic analysis of shopping behavior, highlighting key insights, dataset characteristics, team contributions, challenges, research topics, and suggestions for project expansion. Leveraging these insights can enhance marketing strategies and drive business growth in the retail sector.
References
OpenAI. (2022). ChatGPT [Computer software]. Retrieved from https://openai.com/chatgpt
Mustafa, Z. (2022). Shopping Mall Customer Segmentation Data [Data set]. Kaggle. Retrieved from https://www.kaggle.com/datasets/zubairmustafa/shopping-mall-customer-segmentation-data
Donkeys. (n.d.). Kaggle Python API [Jupyter Notebook]. Kaggle. Retrieved from https://www.kaggle.com/code/donkeys/kaggle-python-api/notebook
Pandas-Datareader. (n.d.). Retrieved from https://pypi.org/project/pandas-datareader/