This dataset was collected by me and my friends during my college days. It mostly contains data from my friends and family members, and records survey responses about the types of fitness practices that people follow.
This dataset wouldn't be here without the help of my friends. So, thanks to them!
This dataset is from the 2013 California Dietary Practices Survey of Adults. This survey has been discontinued. Adults were asked a series of eight questions about their physical activity practices in the last month. These questions were borrowed from the Behavioral Risk Factor Surveillance System. Data displayed in this table represent California adults who met the aerobic recommendation for physical activity, as defined by the 2008 U.S. Department of Health and Human Services Physical Activity Guidelines for Americans and Objectives 2.1 and 2.2 of Healthy People 2020.
The California Dietary Practices Survey (CDPS), now discontinued, was the most extensive dietary and physical activity assessment of adults 18 years and older in the state of California. CDPS was designed in 1989 and was administered biennially in odd years through 2013. It was designed to monitor dietary trends, especially fruit and vegetable consumption, among California adults in order to evaluate their progress toward meeting the 2010 Dietary Guidelines for Americans and the Healthy People 2020 Objectives.
For the data in this table, adults were asked a series of eight questions about their physical activity practices in the last month. Questions included:
1) During the past month, other than your regular job, did you participate in any physical activities or exercise such as running, calisthenics, golf, gardening or walking for exercise?
2) What type of physical activity or exercise did you spend the most time doing during the past month?
3) How many times per week or per month did you take part in this activity during the past month?
4) And when you took part in this activity, for how many minutes or hours did you usually keep at it?
5) During the past month, how many times per week or per month did you do physical activities or exercises to strengthen your muscles?
Questions 2, 3, and 4 were repeated to collect a second activity.
Data were collected using a list of participating CalFresh households and random digit dial. Approximately 1,400-1,500 adults (ages 18 and over) were interviewed via phone survey between the months of June and October. Demographic data included gender, age, ethnicity, education level, income, physical activity level, overweight status, and food stamp eligibility status. Low-income adults were oversampled to provide greater sensitivity for analyzing trends among our target population.
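As a rough illustration of how answers to questions 3 and 4 could be turned into weekly minutes of activity, the sketch below converts a reported frequency and session length into minutes per week and compares the total with the 150 minutes per week of moderate-intensity aerobic activity named in the 2008 Physical Activity Guidelines. The function names, the weeks-per-month conversion, and the choice to treat every activity as moderate intensity are assumptions made for illustration; the scoring rules actually used by CDPS are not described here.

```python
WEEKS_PER_MONTH = 52 / 12       # assumption: average month length in weeks
MODERATE_MINUTES_TARGET = 150   # 2008 guideline: moderate-intensity minutes per week


def weekly_minutes(times, per, minutes_per_session):
    """Convert a reported frequency and session length to minutes per week.

    per is "week" or "month", matching how the respondent chose to answer.
    """
    if per == "week":
        sessions_per_week = times
    elif per == "month":
        sessions_per_week = times / WEEKS_PER_MONTH
    else:
        raise ValueError(f"unknown period: {per!r}")
    return sessions_per_week * minutes_per_session


def meets_aerobic_target(activities):
    """Sum weekly minutes over the reported activities (hypothetical rule)."""
    total = sum(weekly_minutes(*activity) for activity in activities)
    return total >= MODERATE_MINUTES_TARGET


# Hypothetical respondent: walking 5 times a week for 30 minutes,
# plus gardening 4 times a month for 60 minutes.
respondent = [(5, "week", 30), (4, "month", 60)]
print(meets_aerobic_target(respondent))
```

A fuller version would need the survey's own intensity coding for each activity type, since the guideline counts vigorous minutes differently from moderate ones.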
https://creativecommons.org/publicdomain/zero/1.0/
This dataset contains detailed employee engagement survey responses collected voluntarily from employees of Pierce County Government in Washington State. The survey measures employees’ agreement levels on various workplace statements to assess overall engagement and satisfaction.
The dataset was provided by Pierce County, WA. Licensed under Public Domain.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
The Workout/Exercises Video Dataset contains diverse videos of people performing various exercises. Each folder corresponds to a specific workout name, enabling researchers and developers to train models for exercise recognition, motion tracking, and virtual training systems.
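Because each folder is named after a workout, the folder name can serve directly as a class label. The following is a minimal sketch of that idea; the function name, the list of video extensions, and the directory layout beyond "one folder per workout" are assumptions rather than documented properties of the dataset.

```python
from pathlib import Path

VIDEO_SUFFIXES = {".mp4", ".mov", ".avi"}  # assumption: common video extensions


def index_videos(root):
    """Return (video_path, label) pairs, using each subfolder name as the label."""
    samples = []
    for class_dir in sorted(Path(root).iterdir()):
        if not class_dir.is_dir():
            continue
        for video in sorted(class_dir.iterdir()):
            if video.suffix.lower() in VIDEO_SUFFIXES:
                samples.append((video, class_dir.name))
    return samples
```

Calling `index_videos` on the dataset's root folder would give a list that can be split into training and validation sets for an exercise-recognition model.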
https://digital.nhs.uk/about-nhs-digital/terms-and-conditions
Note 08/07/13: Errata regarding two variables incorrectly labelled with the same description in the Health Survey for England - 2008 dataset deposited in the UK Data Archive.
Author: Health and Social Care Information Centre, Lifestyle Statistics
Responsible Statistician: Paul Eastwood, Lifestyles Section Head
Version: 1
Original date of publication: 17th December 2009
Date of errata: 11th June 2013
· Two physical activity variables (NSWA201 and WEPWA201) in the Health Survey for England - 2008 dataset deposited in the Data Archive had the same description of 'on weekdays in the last week have you done any cycling (not to school)?'. This is correct for NSWA201, but incorrect for WEPWA201.
· The correct descriptions are:
· NSWA201 - 'on weekdays in the last week have you done any cycling (not to school)?'
· WEPWA201 - 'on weekends in the last week have you done any cycling (not to school)?'
· This has been corrected and the amended dataset has been deposited in the UK Data Archive. NatCen Social Research and the Health and Social Care Information Centre apologise for any inconvenience this may have caused.
Note 18/12/09: Please note that a slightly amended version of the Health Survey for England 2008 report, Volume 1, has been made available on this page on 18 December 2009. This was in order to correct the legend and title of figure 13G on page 321 of this volume. The NHS IC apologises for any inconvenience caused.
The Health Survey for England is a series of annual surveys designed to measure health and health-related behaviours in adults and children living in private households in England. The survey was commissioned originally by the Department of Health and, from April 2005, by the NHS Information Centre for health and social care.
The Health Survey for England has been designed and carried out since 1994 by the Joint Health Surveys Unit of the National Centre for Social Research (NatCen) and the Department of Epidemiology and Public Health at the University College London Medical School (UCL). The 2008 Health Survey for England focused on physical activity and fitness. Adults and children were asked to recall their physical activity over recent weeks, and objective measures of physical activity and fitness were also obtained. A secondary objective was to examine results on childhood obesity and other factors affecting health, including fruit and vegetable consumption, drinking and smoking.
Presents data related to the number and percentage of people who successfully complete workforce development training with one of the partnering community benefit organizations (CBOs), also known as "Community Partners" in the Austin Metro Area Master Community Workforce Plan.
LifeSnaps Dataset Documentation
Ubiquitous self-tracking technologies have penetrated various aspects of our lives, from physical and mental health monitoring to fitness and entertainment. Yet, limited data exist on the association between in-the-wild large-scale physical activity patterns, sleep, stress, and overall health, and behavioral patterns and psychological measurements due to challenges in collecting and releasing such datasets, such as waning user engagement, privacy considerations, and diversity in data modalities. In this paper, we present the LifeSnaps dataset, a multi-modal, longitudinal, and geographically-distributed dataset, containing a plethora of anthropological data, collected unobtrusively for the total course of more than 4 months by n=71 participants, under the European H2020 RAIS project. LifeSnaps contains more than 35 different data types from second to daily granularity, totaling more than 71M rows of data. The participants contributed their data through numerous validated surveys, real-time ecological momentary assessments, and a Fitbit Sense smartwatch, and consented to make these data available openly to empower future research. We envision that releasing this large-scale dataset of multi-modal real-world data will open novel research opportunities and potential applications in the fields of medical digital innovations, data privacy and valorization, mental and physical well-being, psychology and behavioral sciences, machine learning, and human-computer interaction.
The following instructions will get you started with the LifeSnaps dataset and are complementary to the original publication.
Data Import: Reading CSV
For ease of use, we provide CSV files containing Fitbit, SEMA, and survey data at daily and/or hourly granularity. You can read the files via any programming language. For example, in Python, you can read the files into a Pandas DataFrame with the pandas.read_csv() command.
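As a minimal sketch of the pandas route mentioned above, the snippet below reads a small in-memory stand-in for a daily-granularity CSV and computes a per-participant average. The column names (`id`, `date`, `steps`) and the values are hypothetical; the real files may use different headers.

```python
import io

import pandas as pd

# Stand-in for a daily-granularity CSV; column names and values are hypothetical.
sample_csv = io.StringIO(
    "id,date,steps\n"
    "user_a,2021-05-24,8042\n"
    "user_a,2021-05-25,10310\n"
    "user_b,2021-05-24,5120\n"
)

# With a real file, pass its path to pd.read_csv instead of the StringIO object.
daily = pd.read_csv(sample_csv, parse_dates=["date"])

# Average daily steps per participant.
mean_steps = daily.groupby("id")["steps"].mean()
print(mean_steps)
```

Parsing the date column up front makes later resampling or joining by day more straightforward.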
Data Import: Setting up a MongoDB (Recommended)
To take full advantage of the LifeSnaps dataset, we recommend that you use the raw, complete data via importing the LifeSnaps MongoDB database.
To do so, open the terminal/command prompt and run the following command for each collection in the DB. Ensure you have MongoDB Database Tools installed from here.
For the Fitbit data, run the following:
mongorestore --host localhost:27017 -d rais_anonymized -c fitbit
For the SEMA data, run the following:
mongorestore --host localhost:27017 -d rais_anonymized -c sema
For surveys data, run the following:
mongorestore --host localhost:27017 -d rais_anonymized -c surveys
If you have access control enabled, then you will need to add the --username and --password parameters to the above commands.
Data Availability
The MongoDB database contains three collections, fitbit, sema, and surveys, containing the Fitbit, SEMA3, and survey data, respectively. Similarly, the CSV files contain related information to these collections. Each document in any collection follows the format shown below:
{
  _id: <MongoDB primary key>,
  id (or user_id): <user-specific ID>,
  type: <data type>,
  data: <embedded object>
}
Each document consists of four fields: _id, id (also found as user_id in the sema and surveys collections), type, and data. The _id field is the MongoDB-defined primary key and can be ignored. The id field refers to a user-specific ID used to uniquely identify each user across all collections. The type field refers to the specific data type within the collection, e.g., steps, heart rate, calories, etc. The data field contains the actual information about the document, e.g., the step count for a specific timestamp for the steps type, in the form of an embedded object. The contents of the data object are type-dependent, meaning that the fields within the data object differ between types of data. As mentioned previously, all times are stored in local time, and user IDs are common across different collections. For more information on the available data types, see the related publication.
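To make the document layout concrete, here is a small sketch that flattens documents of this shape into plain rows and filters them by type. The sample documents and the field names inside `data` are invented for illustration; only the top-level keys follow the description above.

```python
def flatten_document(doc):
    """Turn one {_id, id/user_id, type, data} document into a flat dict."""
    # Start from the type-dependent fields of the embedded data object.
    row = dict(doc.get("data", {}))
    # The user ID is stored as "id" in fitbit and "user_id" in sema and surveys.
    row["user"] = doc.get("id", doc.get("user_id"))
    row["type"] = doc["type"]
    return row


# Hypothetical documents; the fields inside "data" are illustrative only.
docs = [
    {"_id": 1, "id": "u1", "type": "steps",
     "data": {"dateTime": "2021-05-24T10:00:00", "value": 12}},
    {"_id": 2, "user_id": "u1", "type": "example_survey",
     "data": {"submitdate": "2021-05-24T18:30:00"}},
]

rows = [flatten_document(d) for d in docs]
steps_rows = [r for r in rows if r["type"] == "steps"]
print(steps_rows)
```

Against a restored database, the same function could be applied to the documents returned by a driver query such as pymongo's `collection.find({"type": "steps"})`.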
Surveys Encoding
BREQ2
Why do you engage in exercise?
Code | Text
engage[SQ001] | I exercise because other people say I should
engage[SQ002] | I feel guilty when I don’t exercise
engage[SQ003] | I value the benefits of exercise
engage[SQ004] | I exercise because it’s fun
engage[SQ005] | I don’t see why I should have to exercise
engage[SQ006] | I take part in exercise because my friends/family/partner say I should
engage[SQ007] | I feel ashamed when I miss an exercise session
engage[SQ008] | It’s important to me to exercise regularly
engage[SQ009] | I can’t see why I should bother exercising
engage[SQ010] | I enjoy my exercise sessions
engage[SQ011] | I exercise because others will not be pleased with me if I don’t
engage[SQ012] | I don’t see the point in exercising
engage[SQ013] | I feel like a failure when I haven’t exercised in a while
engage[SQ014] | I think it is important to make the effort to exercise regularly
engage[SQ015] | I find exercise a pleasurable activity
engage[SQ016] | I feel under pressure from my friends/family to exercise
engage[SQ017] | I get restless if I don’t exercise regularly
engage[SQ018] | I get pleasure and satisfaction from participating in exercise
engage[SQ019] | I think exercising is a waste of time
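When working with survey responses, it can help to swap the column codes for their item text. The sketch below shows one way to do that with a plain dictionary; it copies three entries from the BREQ2 list above, and the response record and its answer values are hypothetical.

```python
# A few BREQ2 codes and their item texts, copied from the list above.
BREQ2_ITEMS = {
    "engage[SQ001]": "I exercise because other people say I should",
    "engage[SQ003]": "I value the benefits of exercise",
    "engage[SQ004]": "I exercise because it’s fun",
}


def label_responses(response, items):
    """Replace known survey codes with their item text; keep other keys as-is."""
    return {items.get(code, code): answer for code, answer in response.items()}


# Hypothetical response record; the answer values are made up.
response = {"engage[SQ001]": 1, "engage[SQ004]": 4, "user_id": "u1"}
print(label_responses(response, BREQ2_ITEMS))
```

The same pattern applies to the other questionnaires by building one dictionary per code prefix.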
PANAS
Indicate the extent you have felt this way over the past week
Code | Text
P1[SQ001] | Interested
P1[SQ002] | Distressed
P1[SQ003] | Excited
P1[SQ004] | Upset
P1[SQ005] | Strong
P1[SQ006] | Guilty
P1[SQ007] | Scared
P1[SQ008] | Hostile
P1[SQ009] | Enthusiastic
P1[SQ010] | Proud
P1[SQ011] | Irritable
P1[SQ012] | Alert
P1[SQ013] | Ashamed
P1[SQ014] | Inspired
P1[SQ015] | Nervous
P1[SQ016] | Determined
P1[SQ017] | Attentive
P1[SQ018] | Jittery
P1[SQ019] | Active
P1[SQ020] | Afraid
Personality
How Accurately Can You Describe Yourself?
Code | Text
ipip[SQ001] | Am the life of the party.
ipip[SQ002] | Feel little concern for others.
ipip[SQ003] | Am always prepared.
ipip[SQ004] | Get stressed out easily.
ipip[SQ005] | Have a rich vocabulary.
ipip[SQ006] | Don't talk a lot.
ipip[SQ007] | Am interested in people.
ipip[SQ008] | Leave my belongings around.
ipip[SQ009] | Am relaxed most of the time.
ipip[SQ010] | Have difficulty understanding abstract ideas.
ipip[SQ011] | Feel comfortable around people.
ipip[SQ012] | Insult people.
ipip[SQ013] | Pay attention to details.
ipip[SQ014] | Worry about things.
ipip[SQ015] | Have a vivid imagination.
ipip[SQ016] | Keep in the background.
ipip[SQ017] | Sympathize with others' feelings.
ipip[SQ018] | Make a mess of things.
ipip[SQ019] | Seldom feel blue.
ipip[SQ020] | Am not interested in abstract ideas.
ipip[SQ021] | Start conversations.
ipip[SQ022] | Am not interested in other people's problems.
ipip[SQ023] | Get chores done right away.
ipip[SQ024] | Am easily disturbed.
ipip[SQ025] | Have excellent ideas.
ipip[SQ026] | Have little to say.
ipip[SQ027] | Have a soft heart.
ipip[SQ028] | Often forget to put things back in their proper place.
ipip[SQ029] | Get upset easily.
ipip[SQ030] | Do not have a good imagination.
ipip[SQ031] | Talk to a lot of different people at parties.
ipip[SQ032] | Am not really interested in others.
ipip[SQ033] | Like order.
ipip[SQ034] | Change my mood a lot.
ipip[SQ035] | Am quick to understand things.
ipip[SQ036] | Don't like to draw attention to myself.
ipip[SQ037] | Take time out for others.
ipip[SQ038] | Shirk my duties.
ipip[SQ039] | Have frequent mood swings.
ipip[SQ040] | Use difficult words.
ipip[SQ041] | Don't mind being the centre of attention.
ipip[SQ042] | Feel others' emotions.
ipip[SQ043] | Follow a schedule.
ipip[SQ044] | Get irritated easily.
ipip[SQ045] | Spend time reflecting on things.
ipip[SQ046] | Am quiet around strangers.
ipip[SQ047] | Make people feel at ease.
ipip[SQ048] | Am exacting in my work.
ipip[SQ049] | Often feel blue.
ipip[SQ050] | Am full of ideas.
STAI
Indicate how you feel right now
Code | Text
STAI[SQ001] | I feel calm
STAI[SQ002] | I feel secure
STAI[SQ003] | I am tense
STAI[SQ004] | I feel strained
STAI[SQ005] | I feel at ease
STAI[SQ006] | I feel upset
STAI[SQ007] | I am presently worrying over possible misfortunes
STAI[SQ008] | I feel satisfied
STAI[SQ009] | I feel frightened
STAI[SQ010] | I feel comfortable
STAI[SQ011] | I feel self-confident
STAI[SQ012] | I feel nervous
STAI[SQ013] | I am jittery
STAI[SQ014] | I feel indecisive
STAI[SQ015] | I am relaxed
STAI[SQ016] | I feel content
STAI[SQ017] | I am worried
STAI[SQ018] | I feel confused
STAI[SQ019] | I feel steady
STAI[SQ020] | I feel pleasant
TTM
Do you engage in regular physical activity according to the definition above? How frequently did each event or experience occur in the past month?
Code | Text
processes[SQ002] | I read articles to learn more about physical
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
LifeSnaps Dataset Documentation
http://opendatacommons.org/licenses/dbcl/1.0/
It sounds like you have a substantial amount of personal exercise and health data accumulated over 150 days. This data can provide valuable insights into your fitness journey and overall well-being. Here are some suggestions on how you can analyze and make the most of this information:
Exercise Types: Identify the types of exercises you've been engaging in. Categorize them into cardiovascular, strength training, flexibility, and other categories. Note the frequency and duration of each type of exercise.
Intensity Levels: Assess the intensity of your workouts. This can be measured in terms of heart rate, perceived exertion, or weight lifted. Determine if there are patterns in intensity levels over time.
Progress and Setbacks: Look for trends in your progress. Are you consistently improving, or have you encountered any setbacks? Identify factors that contribute to your success or challenges.
Rest and Recovery: Analyze your rest days and recovery strategies. Ensure that you're allowing your body enough time to recover between intense workouts. Look for patterns in your energy levels and performance related to rest.
Nutrition and Hydration: Correlate your exercise data with your nutrition and hydration habits. Consider whether certain eating patterns impact your workouts positively or negatively.
Sleep Patterns: Examine your sleep data if available. Adequate sleep is crucial for recovery and overall health. Identify any correlations between your sleep patterns and exercise performance.
Mood and Stress Levels: Reflect on your mood and stress levels on different days. Exercise can have a significant impact on mental well-being. Consider whether there are connections between your exercise routine and your emotional state.
Injury Analysis: If you've experienced any injuries during this period, analyze the circumstances surrounding them. This can help in understanding potential risk factors.
Goal Alignment: Evaluate whether your exercise routine aligns with your initial goals. Are you progressing toward your desired outcomes?
Adjustment of Exercise Routine: Based on the analysis, consider adjustments to your exercise routine. This might involve modifying the types of exercises, intensity, or frequency.
Remember, the goal of analyzing this data is to make informed decisions about your fitness routine, identify areas of improvement, and celebrate your successes. If you have specific questions about the data or need guidance on certain aspects, feel free to provide more details for personalized advice.
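A first pass at the frequency and duration analysis described above could look like the sketch below. The log format, the category names, and the sample entries are hypothetical, since the structure of the underlying data is not described here.

```python
from collections import defaultdict

# Hypothetical log entries: (day number, exercise, category, minutes).
log = [
    (1, "running", "cardiovascular", 30),
    (2, "squats", "strength", 20),
    (3, "yoga", "flexibility", 40),
    (8, "running", "cardiovascular", 45),
    (9, "bench press", "strength", 25),
]


def summarize_by_category(entries):
    """Return {category: (session count, total minutes)}."""
    totals = defaultdict(lambda: [0, 0])
    for _day, _exercise, category, minutes in entries:
        totals[category][0] += 1
        totals[category][1] += minutes
    return {category: tuple(values) for category, values in totals.items()}


def minutes_per_week(entries):
    """Return {week number: total minutes}, counting days 1-7 as week 1."""
    weeks = defaultdict(int)
    for day, _exercise, _category, minutes in entries:
        weeks[(day - 1) // 7 + 1] += minutes
    return dict(weeks)


print(summarize_by_category(log))
print(minutes_per_week(log))
```

Comparing the weekly totals over time gives a simple view of progress and setbacks before bringing in sleep, mood, or nutrition data.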
ABSTRACT
Introduction: Studies have shown that physical exercise is beneficial to people’s overall physical and mental health, but few studies report on the effects of different physical exercises on human health. Objective: The paper explores the difference in human health function between people who adhere to traditional health sports and those who rarely exercise, and provides a scientific basis for applying and promoting traditional health sports in TCM “prevention of disease”. Methods: The paper surveyed 526 people, including those who regularly participate in physical exercise and those who rarely exercise. The exercise groups are the Tai Chi/Tai Chi sword group, Health Qigong Baduanjin group, Health Qigong Wuqinxi group, Health Qigong Yijin Warp group, and walking/jogging group. Results: There are differences in the mental indicators of the people in different exercise groups. The overall average percentage levels of and NK cells in each exercise group and the group that rarely exercises are different, and the difference is statistically significant (P<0.05). Conclusions: Persisting in physical exercise is beneficial to the balance of health and function of the population. Level of evidence II; Therapeutic studies - investigation of treatment results.
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
This dataset gives a game-by-game attendance to every NCAA FBS game from 2001 to today. Big thanks to the SportsDataVerse whose cfbfastR package was used to get a majority of this data. NCAA Statistics was used to get current year attendance data.
This dataset was built for training and validating terrain classification models for Mars, which may be useful in future autonomous rover efforts. It consists of ~326K semantic segmentation full image labels on 35K images from Curiosity, Opportunity, and Spirit rovers, collected through crowdsourcing. Each image was labeled by 10 people to ensure greater quality and agreement of the crowdsourced labels. It also includes ~1.5K validation labels annotated by the rover planners and scientists from NASA’s MSL (Mars Science Laboratory) mission, which operates the Curiosity rover, and MER (Mars Exploration Rovers) mission, which operated the Spirit and Opportunity rovers.
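One common way to combine several crowdsourced annotations of the same image is a per-pixel majority vote. The sketch below illustrates that general technique only; it is an assumption and not necessarily how this dataset's labels were merged, and the class names and agreement threshold are hypothetical.

```python
from collections import Counter


def majority_label(votes, min_agreement=0.5):
    """Return the most common label if enough annotators agree, else None."""
    label, count = Counter(votes).most_common(1)[0]
    return label if count / len(votes) > min_agreement else None


def merge_annotations(annotations):
    """Merge per-pixel labels from several annotators of the same image.

    annotations is a list of equally sized 2D lists, one per annotator.
    """
    height, width = len(annotations[0]), len(annotations[0][0])
    return [
        [majority_label([a[y][x] for a in annotations]) for x in range(width)]
        for y in range(height)
    ]


# Three hypothetical annotators labeling a 1x3 image strip.
annotators = [
    [["soil", "sand", "rock"]],
    [["soil", "sand", "sand"]],
    [["soil", "rock", "bedrock"]],
]
print(merge_annotations(annotators))
```

Pixels without sufficient agreement are left as `None` here, so a training pipeline could mask them out rather than learn from an uncertain label.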
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Here are a few use cases for this project:
Fitness and Training: The "poseAction" model can be used in fitness apps or gym equipment to analyze and correct postures during exercises. For instance, trainers can track and correct users doing single leg squats, lunges or back bridges, enhancing the effectiveness of the workout and reducing injury risks.
Virtual Physical Therapy: The model can help in developing applications for virtual physical therapy, providing feedback on patient activities, such as back bridges or lunges, to ensure that exercises are done correctly, thus accelerating recovery.
Remote Coaching: Sports coaches or personal trainers may use a platform equipped with "poseAction" to supervise athletes or clients' exercises remotely and provide real-time feedback.
Augmented Reality Gaming: In AR fitness games, "poseAction" could be used to recognize player movements and translate them into in-game actions, ensuring physical involvement of the player in the game.
Human Behavior Analysis: The model can aid in developing systems that study ergonomics, workplace safety, or human behaviors, helping understand how people perform certain physical activities, such as how well they maintain posture in a back bridge or a lunge.
Open Government Licence 3.0 http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
This release provides estimates of young people (aged from 16 to 24) who are NEET (not in education, employment or training) broken down by age, sex and by labour market status (unemployed and economically inactive). Source agency: Office for National Statistics Designation: Official Statistics not designated as National Statistics Language: English Alternative title: NEET
Open Government Licence 3.0 http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
Quarterly estimates for young people (aged 16 to 24 years) who are not in education, employment or training (NEET) in the UK. These are official statistics in development.
https://www.futurebeeai.com/policies/ai-data-license-agreement
The Spanish Open-Ended Question Answering Dataset is a meticulously curated collection of comprehensive Question-Answer pairs. It serves as a valuable resource for training Large Language Models (LLMs) and Question-answering models in the Spanish language, advancing the field of artificial intelligence.
This QA dataset comprises a diverse set of open-ended questions paired with corresponding answers in Spanish. There is no context paragraph given to choose an answer from, and each question is answered without any predefined context content. The questions cover a broad range of topics, including science, history, technology, geography, literature, current affairs, and more.
Each question is accompanied by an answer, providing valuable information and insights to enhance the language model training process. Both the questions and answers were manually curated by native Spanish speakers, and references were taken from diverse sources like books, news articles, websites, and other reliable references.
This question-answer prompt completion dataset contains different types of prompts, including instruction type, continuation type, and in-context learning (zero-shot, few-shot) type. The dataset also contains questions and answers with different types of rich text, including tables, code, JSON, etc., with proper markdown.
To ensure diversity, this Q&A dataset includes questions with varying complexity levels, ranging from easy to medium and hard. Different types of questions, such as multiple-choice, direct, and true/false, are included. Additionally, questions are further classified into fact-based and opinion-based categories, creating a comprehensive variety. The QA dataset also contains questions with constraints and persona restrictions, which makes it even more useful for LLM training.
To accommodate varied learning experiences, the dataset incorporates different types of answer formats. These formats include single-word, short-phrase, single-sentence, and paragraph answers. The answers contain text strings, numerical values, and date and time formats as well. Such diversity strengthens the language model's ability to generate coherent and contextually appropriate answers.
This fully labeled Spanish Open Ended Question Answer Dataset is available in JSON and CSV formats. It includes annotation details such as id, language, domain, question_length, prompt_type, question_category, question_type, complexity, answer_type, rich_text.
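The annotation fields make it possible to slice the dataset, for example by complexity or question category. The sketch below assumes the JSON release is a list of objects keyed by the field names listed above; that layout, the `select` helper, and every value in the sample records are hypothetical.

```python
import json

# Hypothetical records; only the key names come from the field list above.
raw = """
[
  {"id": "q1", "language": "Spanish", "domain": "science", "question_length": 12,
   "prompt_type": "instruction", "question_category": "fact-based",
   "question_type": "direct", "complexity": "easy",
   "answer_type": "single sentence", "rich_text": false},
  {"id": "q2", "language": "Spanish", "domain": "history", "question_length": 20,
   "prompt_type": "instruction", "question_category": "fact-based",
   "question_type": "multiple-choice", "complexity": "hard",
   "answer_type": "single word", "rich_text": false},
  {"id": "q3", "language": "Spanish", "domain": "literature", "question_length": 17,
   "prompt_type": "continuation", "question_category": "opinion-based",
   "question_type": "direct", "complexity": "hard",
   "answer_type": "paragraph", "rich_text": true}
]
"""
records = json.loads(raw)


def select(items, **criteria):
    """Keep the records whose fields equal every given criterion."""
    return [r for r in items if all(r.get(k) == v for k, v in criteria.items())]


hard_fact = select(records, complexity="hard", question_category="fact-based")
print([r["id"] for r in hard_fact])
```

Filtering this way makes it easy to build balanced training subsets, for instance an equal number of easy, medium, and hard questions.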
The dataset upholds the highest standards of quality and accuracy. Each question undergoes careful validation, and the corresponding answers are thoroughly verified. To prioritize inclusivity, the dataset incorporates questions and answers representing diverse perspectives and writing styles, ensuring it remains unbiased and avoids perpetuating discrimination.
Both the questions and answers in Spanish are grammatically accurate, without any spelling or grammatical errors. No copyrighted, toxic, or harmful content is used while building this dataset.
The entire dataset was prepared with the assistance of human curators from the FutureBeeAI crowd community. Continuous efforts are made to add more assets to this dataset, ensuring its growth and relevance. Additionally, FutureBeeAI offers the ability to collect custom question-answer data tailored to specific needs, providing flexibility and customization options.
The dataset, created by FutureBeeAI, is now ready for commercial use. Researchers, data scientists, and developers can utilize this fully labeled and ready-to-deploy Spanish Open Ended Question Answer Dataset to enhance the language understanding capabilities of their generative AI models, improve response generation, and explore new approaches to NLP question-answering tasks.
https://www.futurebeeai.com/policies/ai-data-license-agreement
Welcome to the Filipino Chain of Thought prompt-response dataset, a meticulously curated collection containing 3000 comprehensive prompt and response pairs. This dataset is an invaluable resource for training Language Models (LMs) to generate well-reasoned answers and minimize inaccuracies. Its primary utility lies in enhancing LLMs' reasoning skills for solving arithmetic, common sense, symbolic reasoning, and complex problems.
This COT dataset comprises a diverse set of instructions and questions paired with corresponding answers and rationales in the Filipino language. These prompts and completions cover a broad range of topics and questions, including mathematical concepts, common sense reasoning, complex problem-solving, scientific inquiries, puzzles, and more.
Each prompt is meticulously accompanied by a response and rationale, providing essential information and insights to enhance the language model training process. These prompts, completions, and rationales were manually curated by native Filipino speakers, drawing references from various sources, including open-source datasets, news articles, websites, and other reliable references.
Our chain-of-thought prompt-completion dataset includes various prompt types, such as instructional prompts, continuations, and in-context learning (zero-shot, few-shot) prompts. Additionally, the dataset contains prompts and completions enriched with various forms of rich text, such as lists, tables, code snippets, JSON, and more, with proper markdown format.
To ensure a wide-ranging dataset, we have included prompts from a plethora of topics related to mathematics, common sense reasoning, and symbolic reasoning. These topics encompass arithmetic, percentages, ratios, geometry, analogies, spatial reasoning, temporal reasoning, logic puzzles, patterns, and sequences, among others.
These prompts vary in complexity, spanning easy, medium, and hard levels. Various question types are included, such as multiple-choice, direct queries, and true/false assessments.
To accommodate diverse learning experiences, our dataset incorporates different types of answers depending on the prompt and provides step-by-step rationales. The detailed rationale aids the language model in building a reasoning process for complex questions.
These responses encompass text strings, numerical values, and date and time formats, enhancing the language model's ability to generate reliable, coherent, and contextually appropriate answers.
This fully labeled Filipino Chain of Thought Prompt Completion Dataset is available in JSON and CSV formats. It includes annotation details such as a unique ID, prompt, prompt type, prompt complexity, prompt category, domain, response, rationale, response type, and rich text presence.
Quality and Accuracy
Our dataset upholds the highest standards of quality and accuracy. Each prompt undergoes meticulous validation, and the corresponding responses and rationales are thoroughly verified. We prioritize inclusivity, ensuring that the dataset incorporates prompts and completions representing diverse perspectives and writing styles, maintaining an unbiased and discrimination-free stance.
The Filipino text is free of spelling and grammatical errors. No copyrighted, toxic, or harmful content was used in constructing this dataset.
Continuous Updates and Customization
The entire dataset was prepared with the assistance of human curators from the FutureBeeAI crowd community. Ongoing efforts are made to add more assets to this dataset, ensuring its growth and relevance. Additionally, FutureBeeAI offers the ability to gather custom chain of thought prompt completion data tailored to specific needs, providing flexibility and customization options.
License
The dataset, created by FutureBeeAI, is now available for commercial use. Researchers, data scientists, and developers can use this fully labeled and ready-to-deploy Filipino Chain of Thought Prompt Completion Dataset to improve the rationale generation and response accuracy of their generative AI models and to explore new approaches to NLP tasks.
https://www.futurebeeai.com/policies/ai-data-license-agreement
Welcome to the Urdu Chain of Thought prompt-response dataset, a meticulously curated collection containing 3000 comprehensive prompt and response pairs. This dataset is an invaluable resource for training language models (LMs) to generate well-reasoned answers and minimize inaccuracies. Its primary utility lies in enhancing LMs' reasoning skills for solving arithmetic, common sense, symbolic reasoning, and complex problems.
This COT dataset comprises a diverse set of instructions and questions paired with corresponding answers and rationales in the Urdu language. These prompts and completions cover a broad range of topics and questions, including mathematical concepts, common sense reasoning, complex problem-solving, scientific inquiries, puzzles, and more.
Each prompt is accompanied by a response and rationale, providing essential information and insights to enhance the language model training process. These prompts, completions, and rationales were manually curated by native Urdu speakers, drawing on various sources, including open-source datasets, news articles, websites, and other reliable references.
Our chain-of-thought prompt-completion dataset includes various prompt types, such as instructional prompts, continuations, and in-context learning (zero-shot, few-shot) prompts. Additionally, the dataset contains prompts and completions enriched with various forms of rich text, such as lists, tables, code snippets, JSON, and more, with proper markdown format.
To ensure a wide-ranging dataset, we have included prompts from a plethora of topics related to mathematics, common sense reasoning, and symbolic reasoning. These topics encompass arithmetic, percentages, ratios, geometry, analogies, spatial reasoning, temporal reasoning, logic puzzles, patterns, and sequences, among others.
These prompts vary in complexity, spanning easy, medium, and hard levels. Various question types are included, such as multiple-choice, direct queries, and true/false assessments.
To accommodate diverse learning experiences, our dataset incorporates different types of answers depending on the prompt and provides step-by-step rationales. The detailed rationale helps the language model build a reasoning process for complex questions.
These responses encompass text strings, numerical values, and date and time formats, enhancing the language model's ability to generate reliable, coherent, and contextually appropriate answers.
This fully labeled Urdu Chain of Thought Prompt Completion Dataset is available in JSON and CSV formats. It includes annotation details such as a unique ID, prompt, prompt type, prompt complexity, prompt category, domain, response, rationale, response type, and rich text presence.
Quality and Accuracy
Our dataset upholds the highest standards of quality and accuracy. Each prompt undergoes meticulous validation, and the corresponding responses and rationales are thoroughly verified. We prioritize inclusivity, ensuring that the dataset incorporates prompts and completions representing diverse perspectives and writing styles, maintaining an unbiased and discrimination-free stance.
The Urdu text is free of spelling and grammatical errors. No copyrighted, toxic, or harmful content was used in constructing this dataset.
Continuous Updates and Customization
The entire dataset was prepared with the assistance of human curators from the FutureBeeAI crowd community. Ongoing efforts are made to add more assets to this dataset, ensuring its growth and relevance. Additionally, FutureBeeAI offers the ability to gather custom chain of thought prompt completion data tailored to specific needs, providing flexibility and customization options.
License
The dataset, created by FutureBeeAI, is now available for commercial use. Researchers, data scientists, and developers can use this fully labeled and ready-to-deploy Urdu Chain of Thought Prompt Completion Dataset to improve the rationale generation and response accuracy of their generative AI models and to explore new approaches to NLP tasks.
https://www.futurebeeai.com/policies/ai-data-license-agreement
Welcome to the Arabic Closed Ended Classification Prompt-Response Dataset, an extensive collection of 3000 meticulously curated prompt and response pairs. This dataset is a valuable resource for training Language Models (LMs) to classify input text accurately, a crucial aspect in advancing generative AI.
This closed-ended classification dataset comprises a diverse set of prompts and responses. Each prompt contains the input text to be classified and may also contain a task instruction, context, constraints, and restrictions, while the completion contains the best classification category as the response. Both prompts and completions are in Arabic. Because this is a closed-ended dataset, the prompt includes the options from which the correct classification category is chosen.
These prompt and completion pairs cover a broad range of topics, including science, history, technology, geography, literature, current affairs, and more. Each prompt is accompanied by a response, providing valuable information and insights to enhance the language model training process. Both the prompts and the responses were manually curated by native Arabic speakers, drawing on diverse sources such as books, news articles, websites, and other reliable references.
This closed-ended classification prompt and completion dataset contains different types of prompts, including instruction type, continuation type, and in-context learning (zero-shot, few-shot) type. The dataset also contains prompts and responses with different types of rich text, including tables, code, JSON, etc., with proper markdown.
To ensure diversity, this closed-ended classification dataset includes prompts with varying complexity levels, ranging from easy to medium and hard. Different types of prompts, such as multiple-choice, direct, and true/false, are included. Additionally, prompts are diverse in terms of length from short to medium and long, creating a comprehensive variety. The classification dataset also contains prompts with constraints and persona restrictions, which makes it even more useful for LLM training.
To accommodate diverse learning experiences, our dataset incorporates different types of responses depending on the prompt. Response formats include single words, short phrases, and single sentences. These responses encompass text strings, numerical values, and date and time formats, enhancing the language model's ability to generate reliable, coherent, and contextually appropriate answers.
This fully labeled Arabic Closed Ended Classification Prompt Completion Dataset is available in JSON and CSV formats. It includes annotation details such as a unique ID, prompt, prompt type, prompt length, prompt complexity, domain, response, response type, and rich text presence.
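To get a quick sense of how such a file is distributed across the labeled attributes, a CSV export could be tallied by one column, for example prompt complexity. The sketch below uses only the Python standard library. The column names and the three sample rows are assumptions made for illustration; the real CSV header may differ.

```python
import csv
import io
from collections import Counter


def count_by_column(csv_text, column):
    """Count how many rows carry each distinct value in the given column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row[column] for row in reader)


# Invented rows with hypothetical column names, for illustration only.
sample_csv = (
    "id,prompt_complexity,response_type\n"
    "1,easy,single-word\n"
    "2,hard,short phrase\n"
    "3,easy,single sentence\n"
)

complexity_counts = count_by_column(sample_csv, "prompt_complexity")
```

With the sample rows above, `complexity_counts` reports two `easy` prompts and one `hard` prompt. The same function works for any other labeled column, such as the response type.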
Our dataset upholds the highest standards of quality and accuracy. Each prompt undergoes meticulous validation, and the corresponding responses are thoroughly verified. We prioritize inclusivity, ensuring that the dataset incorporates prompts and completions representing diverse perspectives and writing styles, maintaining an unbiased and discrimination-free stance.
The Arabic text is free of spelling and grammatical errors. No copyrighted, toxic, or harmful content was used in constructing this dataset.
The entire dataset was prepared with the assistance of human curators from the FutureBeeAI crowd community. Ongoing efforts are made to add more assets to this dataset, ensuring its growth and relevance. Additionally, FutureBeeAI offers the ability to gather custom closed-ended classification prompt and completion data tailored to specific needs, providing flexibility and customization options.
The dataset, created by FutureBeeAI, is now available for commercial use. Researchers, data scientists, and developers can use this fully labeled and ready-to-deploy Arabic Closed Ended Classification Prompt-Completion Dataset to improve the classification ability and response accuracy of their generative AI models and to explore new approaches to NLP tasks.
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
I created this dataset myself. It contains videos of people doing workouts. Each folder is named after the workout shown in the videos it contains.
Video format: .mp4
Some of the videos are muted.
What is the video resolution? Resolution varies greatly between videos. I tried to find the highest resolution available so that you can downscale to whatever you need later.
How about the duration of the videos? It also varies, but each video contains at least 1 rep.
What are the data sources? Mostly YouTube, but I also recorded some of the videos myself with my friends.
Need the extracted frames of each video? Check my other dataset of workout/exercise images here.
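Because the folder name doubles as the workout label, labels can be recovered directly from the file paths. The following Python sketch assumes one folder per workout with .mp4 files inside it. The function name and the exact folder depth are assumptions, not something the dataset description specifies.

```python
from pathlib import Path


def label_videos(root):
    """Map each .mp4 file under root to the name of its parent folder."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): path.parent.name
        for path in sorted(root.rglob("*.mp4"))
    }
```

Calling `label_videos` on the extracted dataset directory returns a dictionary whose keys are relative video paths and whose values are the workout names taken from the containing folders.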
This dataset was collected by me, along with my friends, during my college days. The dataset mostly contains data from my friends and family members. This dataset has the survey data for the type of fitness practices that people follow.
This dataset wouldn't be here without the help of my friends. So, thanks to them!