MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
This dataset contains basic demographic and performance information for a small group of individuals. Each entry includes an ID, name, age, country, and score. It was created as a simple example for practicing data analysis, visualization, and basic machine learning tasks such as sorting, filtering, and calculating statistics. The dataset is designed to be lightweight and easy to understand, making it suitable for beginners learning data handling and exploratory analysis techniques.
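The sorting, filtering, and statistics tasks mentioned above could be sketched as follows. This is only an illustration: the rows, the lowercase field names, and the age threshold are hypothetical and are not taken from the actual file.

```python
from statistics import mean

# Hypothetical rows; the real file's values and exact column names may differ.
records = [
    {"id": 1, "name": "Ana", "age": 23, "country": "Brazil", "score": 88},
    {"id": 2, "name": "Ben", "age": 31, "country": "Canada", "score": 72},
    {"id": 3, "name": "Chen", "age": 27, "country": "China", "score": 95},
    {"id": 4, "name": "Dina", "age": 19, "country": "Brazil", "score": 64},
]

# Sorting: order entries by score, highest first.
by_score = sorted(records, key=lambda r: r["score"], reverse=True)

# Filtering: keep entries aged 21 or older (threshold chosen for illustration).
adults = [r for r in records if r["age"] >= 21]

# Statistics: mean score across all entries.
average_score = mean(r["score"] for r in records)

print([r["name"] for r in by_score])
print(len(adults), average_score)
```

The same three steps carry over directly to a spreadsheet or a dataframe library once the real file is loaded.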
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Purpose. This dataset contains anonymised raw responses (n = 55, 31 variables) from a cross-sectional survey investigating factors that influence the adoption of data-analytics tools (Excel/Sheets, Power BI/Tableau, Python notebooks, Google Analytics) among graduate students and early-career professionals in Uzbekistan.
Instrument. Items operationalise seven UTAUT/TAM-based constructs: Performance Expectancy, Effort Expectancy, Behavioural Intention, Familiarity & Usage, Task–Technology Fit, Barriers to Adoption, plus Demographics (age, gender, study programme, prior stats courses, work experience). All Likert items use a five-point scale.
Collection & cleaning. Data were collected via Google Forms between 02 Apr 2025 and 22 Apr 2025 through university e-mail lists, Telegram study channels, and LinkedIn posts. Five partial records (> 20 % missing) were removed; remaining open-text answers were lower-cased, spell-checked, and stemmed. The file is provided exactly as analysed in the accompanying thesis; no further processing (e.g., recoding) has been performed.
File contents. survey_responses.xlsx – one worksheet (“Form Responses 1”) with 55 rows × 31 columns. Column A (“Timestamp”) shows submission time in UTC+5. Variable names follow the original question stems for transparency.
Ethics & privacy. All participants gave informed e-consent; no personal identifiers (names, e-mails, IPs) are included. Ethical approval: Silk Road University REC # 2025-DX-012.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset was generated within the Horizon 2020 project "InnORBIT: Empowering innovation intermediaries to generate sustainable initiatives to accelerate the commercialisation of space innovation" (innorbit.eu). It describes the InnORBIT project's dissemination and communication plan and includes the data collected from dissemination and communication activities to measure progress against the project's outreach targets during the first 18 months of project implementation (January 1st, 2021 - June 30th, 2022). The dataset will be updated in a second and final version after the end of InnORBIT's grant duration in July 2023. The final version will provide a full dataset accounting for the project's outreach activities. This first version of the dataset contains the following files and documents:
[InnORBIT-DisseminationCommunicationPlan_v2_20220929.pdf]: Final version of the project's Dissemination, Awareness raising and Communication Plan (DACP), which describes the key target audiences, the key messages, the value InnORBIT offers in terms of knowledge, services and solutions boosting entrepreneurship in the space industry, and the digital tools offered via the InnORBIT digital toolbox. The DACP also describes the channels, tools and activities employed to reach the project's target groups effectively. It outlines the core Key Performance Indicators (KPIs) that indicate the performance level of the project's strategy and the areas for improvement. The updated version also outlines the achievements of the project's dissemination for the first 18 months of implementation (January 2021 - June 2022).
[InnORBIT_DisseminationActivities_Data_20220929.xlsx]: A spreadsheet used to collect raw data about the project's dissemination activities and to calculate InnORBIT's KPIs for Dissemination and Communication in order to track progress against targets. The data span from January 1st, 2021 to June 30th, 2022.
[InnORBIT-WebsiteAnalytics-AudienceOverview_20220929.pdf]: A Google Analytics report summarising the InnORBIT website's audience demographics and overall page performance (visits, sessions, users). The data span from January 1st, 2021 to June 30th, 2022.
[InnORBIT-WebsiteAnalytics-AudienceAcquisition_20220929.pdf]: A Google Analytics report summarising the main sources generating traffic for the InnORBIT website and the behaviour of users coming from each source. The data span from January 1st, 2021 to June 30th, 2022.
[InnORBIT-WebsiteAnalytics-AudienceBehaviour_20220929.pdf]: A Google Analytics report providing further insight into users' behaviour when using the InnORBIT website. The data span from January 1st, 2021 to June 30th, 2022.
https://www.dataflix.com/data360/license/
The Dataflix COVID dataset is a centralized repository of up-to-date, curated data focused on key tracking metrics and U.S. census data. The dataset is publicly readable and accessible on Google BigQuery, ready for analysis, analytics and machine learning initiatives. It is built on data from trusted sources such as CSSE at Johns Hopkins University and government agencies, and covers a wide range of metrics including confirmed cases, new cases, % population, mortality rate and deaths, aggregated at various geographic levels including city, county, state and country. New data is published on a daily basis. Our objective is to make structured COVID data available for organizations and individuals to help in the fight against COVID-19. For example, health authorities will be able to build reports and dashboards to efficiently deploy vital resources like hospital beds and ventilators as they track the spread of the disease. Epidemiologists can use the dataset to complement their existing models and datasets and generate better forecasts of hotspots and trends.
Contains Gallup data from countries that are home to more than 98% of the world's population through a state-of-the-art Web-based portal. Gallup Analytics puts Gallup's best global intelligence in users' hands to help them better understand the strengths and challenges of the world's countries and regions. Users can access Gallup's U.S. Daily tracking and World Poll data to compare residents' responses region by region and nation by nation to questions on topics such as economic conditions, government and business, health and wellbeing, infrastructure, and education.
The Gallup Analytics Database is accessed through the Cornell University Libraries here. In addition, a CUL subscription also allows access to the Gallup Respondent Level Data. For access please refer to the documentation below and then request the variables you need here.
Before requesting data from the World Poll, please see the Getting Started guide and the Worldwide Research Methodology and Codebook (You will need to request access). The Codebook will give you information about all available variables in the datasets. There are other guides available as well in the google folder. You can also access information about questions asked and variables using the Gallup World Poll Reference Tool. You will need to create your user account to access the tool. This will only give you access to information about the questions asked and variables. It will not give you access to the data.
For further documentation and information see this site from New York University Libraries. The Gallup documentation for the World Poll methodology is also available under the Data and Documentation tab.
In addition to the World Poll and Daily Tracking Poll, also available are the Gallup Covid-19 Survey, Gallup Poll Social Series Surveys, Race Relations Survey, Confidence in Institutions Survey, Honesty and Ethics in Professions Survey, and Religion Battery.
The process for getting access to respondent-level data from the Gallup U.S. Daily Tracking is similar to the World Poll Survey. There is no comparable discovery tool for U.S. Daily Tracking poll questions, however. Users need to consult the codebooks and available variables across years.
The COVID-19 web survey began on March 13, 2020 with daily random samples of U.S. adults, aged 18 and older who are members of the Gallup Panel. Before requesting data, please see the Gallup Panel COVID-19 Survey Methodology and Codebook.
The Gallup Poll Social Series (GPSS) dataset is a set of public opinion surveys designed to monitor U.S. adults’ views on numerous social, economic, and political topics. More information is available on the Gallup website: https://www.gallup.com/175307/gallup-poll-social-series-methodology.aspx As each month has a unique codebook, contact CCSS-ResearchSupport@cornell.edu to discuss your interests and start the data request process.
Starting in 1973, Gallup began measuring confidence in several US institutions, such as Congress, the Presidency, the Supreme Court, and the police, among others. The dataset includes data beginning in 1973, collected once per year. Users should consult the list of available variables.
The Race Relations Poll includes topics that were previously represented in the GPSS Minority Relations Survey that ran through 2016. The Race Relations Survey was conducted November 2018. Users should consult the codebook for this poll before making their request.
The Honesty and Ethics in Professions Survey: Starting in 1976, Gallup began measuring US perceptions of the honesty and ethics of a list of professions. The dataset was added to the collection in March 2023 and includes data from 1976 to 2022. Documentation for this collection is located here and will require you to request access.
Religion Battery: Consolidated list of items focused on religion in the US from 1999-2022. Documentation for this collection is located here and will require you to request access.
As global communities responded to COVID-19, we heard from public health officials that the same type of aggregated, anonymized insights we use in products such as Google Maps would be helpful as they made critical decisions to combat COVID-19. These Community Mobility Reports aimed to provide insights into what changed in response to policies aimed at combating COVID-19. The reports charted movement trends over time by geography, across different categories of places such as retail and recreation, groceries and pharmacies, parks, transit stations, workplaces, and residential.
This dataset is the result of cleaning and aggregating data as part of the Process and Analyze phases of the Google Data Analytics Capstone Project Cyclistic Case Study 2013-2021 with demographics. The original data for the project has been made available by Motivate International Inc. under this license.
https://creativecommons.org/publicdomain/zero/1.0/
Comprehensive retail footfall and commercial property analysis for Dehradun's major shopping areas. This dataset provides actionable business intelligence for retail location planning, covering 8 prime retail nodes with detailed footfall patterns, rental costs, and customer demographics.
Target Market: Women's retail business planning in Dehradun, India's fastest-growing Tier-2 city
Coverage: 8 major retail locations with 500+ daily data points
Time Period: 2024-2025 with seasonal patterns
✅ Retail Location Selection - Compare footfall vs rent across 8 prime areas
✅ Footfall Optimization - Peak hours and seasonal planning
✅ Rental Budgeting - Detailed cost analysis by location type
✅ Target Demographics - Customer profile matching by area
✅ Competition Analysis - Market saturation and opportunity gaps
✅ Seasonal Planning - Monthly demand forecasting
First comprehensive retail footfall analysis for Dehradun combining traditional markets (Paltan Bazaar) with modern retail (Pacific Mall). Essential for entrepreneurs planning retail entry in India's emerging Tier-2 cities.
Geographic Scope: Dehradun city, Uttarakhand, India
Last Updated: June 2025
Data Type: Commercial footfall & property analysis
https://creativecommons.org/publicdomain/zero/1.0/
HR analytics, also referred to as people analytics, workforce analytics, or talent analytics, involves gathering, analyzing, and reporting HR data. It is the collection and application of talent data to improve critical talent and business outcomes. It enables your organization to measure the impact of a range of HR metrics on overall business performance and make decisions based on data. HR analysts are primarily responsible for interpreting and analyzing vast datasets.
Download the data CSV files here: https://drive.google.com/drive/folders/18mQalCEyZypeV8TJeP3SME_R6qsCS2Og
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
From World Health Organization - On 31 December 2019, WHO was alerted to several cases of pneumonia in Wuhan City, Hubei Province of China. The virus did not match any other known virus. This raised concern because when a virus is new, we do not know how it affects people.
So daily level information on the affected people can give some interesting insights when it is made available to the broader data science community.
Johns Hopkins University has made an excellent dashboard using the affected cases data. Data is extracted from the google sheets associated and made available here.
Now data is available as csv files in the Johns Hopkins Github repository. Please refer to the github repository for the Terms of Use details. Uploading it here for using it in Kaggle kernels and getting insights from the broader DS community.
2019 Novel Coronavirus (2019-nCoV) is a virus (more specifically, a coronavirus) identified as the cause of an outbreak of respiratory illness first detected in Wuhan, China. Early on, many of the patients in the outbreak in Wuhan, China reportedly had some link to a large seafood and animal market, suggesting animal-to-person spread. However, a growing number of patients reportedly have not had exposure to animal markets, indicating person-to-person spread is occurring. At this time, it’s unclear how easily or sustainably this virus is spreading between people - CDC
This dataset has daily level information on the number of affected cases, deaths and recoveries from the 2019 novel coronavirus. Please note that this is time series data, so the number of cases on any given day is the cumulative number.
The data is available from 22 Jan, 2020.
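Because the counts are cumulative, new cases per day have to be derived by differencing consecutive days. A minimal sketch follows, assuming the values for one location are already sorted by date. The numbers are hypothetical and the function name is made up for illustration.

```python
def daily_from_cumulative(cumulative):
    """Convert a cumulative series into per-day increments.

    The first value is kept as-is, which assumes the count before the
    first recorded day was zero.
    """
    daily = []
    previous = 0
    for total in cumulative:
        daily.append(total - previous)
        previous = total
    return daily


# Hypothetical cumulative confirmed counts for four consecutive days.
cumulative_confirmed = [555, 653, 941, 1438]
new_cases = daily_from_cumulative(cumulative_confirmed)
print(new_cases)
```

A negative increment would indicate that a cumulative figure was revised downward in the source, which is worth checking before plotting daily counts.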
This dataset contains time-series and case-level records of the COVID-19 pandemic. The primary file is covid_19_data.csv, with supporting files for earlier records and individual-level line list data.
This is the primary dataset and contains aggregated COVID-19 statistics by location and date.
This file contains earlier COVID-19 records. It is no longer updated and is provided only for historical reference. For current analysis, please use covid_19_data.csv.
This file provides individual-level case information, obtained from an open data source. It includes patient demographics, travel history, and case outcomes.
Another individual-level case dataset, also obtained from public sources, with detailed patient-level information useful for micro-level epidemiological analysis.
✅ Use covid_19_data.csv for up-to-date aggregated global trends.
✅ Use the line list datasets for detailed, individual-level case analysis.
If you are interested in knowing country level data, please refer to the following Kaggle datasets:
India - https://www.kaggle.com/sudalairajkumar/covid19-in-india
South Korea - https://www.kaggle.com/kimjihoo/coronavirusdataset
Italy - https://www.kaggle.com/sudalairajkumar/covid19-in-italy
Brazil - https://www.kaggle.com/unanimad/corona-virus-brazil
USA - https://www.kaggle.com/sudalairajkumar/covid19-in-usa
Switzerland - https://www.kaggle.com/daenuprobst/covid19-cases-switzerland
Indonesia - https://www.kaggle.com/ardisragen/indonesia-coronavirus-cases
Johns Hopkins University for making the data available for educational and academic research purposes
MoBS lab - https://www.mobs-lab.org/2019ncov.html
World Health Organization (WHO): https://www.who.int/
DXY.cn. Pneumonia. 2020. http://3g.dxy.cn/newh5/view/pneumonia.
BNO News: https://bnonews.com/index.php/2020/02/the-latest-coronavirus-cases/
National Health Commission of the People’s Republic of China (NHC): http://www.nhc.gov.cn/xcs/yqtb/list_gzbd.shtml
China CDC (CCDC): http://weekly.chinacdc.cn/news/TrackingtheEpidemic.htm
Hong Kong Department of Health: https://www.chp.gov.hk/en/features/102465.html
Macau Government: https://www.ssm.gov.mo/portal/
Taiwan CDC: https://sites.google....
By Inder Sethi [source]
This District Information System for Education (DISE) dataset collects district-level educational statistics in India and provides the most up-to-date data on the nation's schools. The project tracks and compiles data on primary and upper primary school students, teachers, institutions, infrastructure and more from all districts in India. It has drastically reduced the time lag between data collection and analysis, from seven to eight years down to only a few months at both district and state levels. DISE is fully supported by the Ministry of Human Resource Development (MHRD) as well as UNICEF, so precise regional insights are available regarding Indian education standards. Raw data is collected, verified at Block Education Offices/Coordinators, computerized at the district level, and eventually aggregated into state-level analysis, which makes it easier to understand where educational improvements need to be made. From tracking key performance indicators among students of all ages to measuring access to teacher resources, this DISE dataset is a valuable resource for understanding the Indian education system.
Guide: How to Use the Indian District Level School Data 2015-16
Familiarize yourself with the features of this data set. The dataset consists of five columns, which provide an overview of district-level educational statistics in India for the year 2015-16. Each row contains district-level data with corresponding educational information and statistics such as Total Number of Schools, Number of Girls' Schools, Enrolment and more for each district in India during that year.
Understand what kind of analysis can be done with this dataset once it is imported into a statistical software program or a spreadsheet program such as Microsoft Excel or Google Sheets. You can use it to analyze many aspects of education in India at the district level, including total number of schools, number and percentage of girls enrolled, teacher qualifications and more, across districts in all states of India for 2015-16.
Load the data into a statistical program such as SPSS or an online tool such as Tableau Public, depending on your preference and analysis needs. Set the tool up beforehand, then use its upload or import option, select the downloaded “report card” CSV file from the local directory where you saved it, and wait until the upload completes. If no encoding errors appear, you can begin your analysis and visualization.
Once inside your preferred visualization environment, try different methods for analyzing individual rows, each of which corresponds to a specific district. Refer back to the column descriptions if you get lost, take time to understand what each district contributes, and account for missing values rather than overlooking them.
Then look for relationships between similar districts based on the figures gathered, note patterns across locations, and choose visualizations that represent those patterns clearly.
- Analyzing literacy rates and measuring the educational advancement of different districts in India.
- Tracking the progress of various Governmental programs like Sarva Shiksha Abhiyan that focus on improving access to education for children across districts.
- Predicting trends in the quality of school resources, educational infrastructure and student performance to guide district-level decision making processes for improved education outcomes
https://creativecommons.org/publicdomain/zero/1.0/
This is the dataset that I created as part of the Google Data Analytics Professional Certificate capstone project. The MyAnimeList website has a vast repository of ratings and rankings of viewership data that could be used for various methods. I extracted several datasets from the detail API from MyAnimeList (MAL) https://myanimelist.net/apiconfig/references/api/v2 and plan to potentially update data every two weeks.
Possible uses for this data include tracking what anime viewers are watching most within a particular time period, and which titles are being scored well (out of 10) and which aren't.
My visualization for this data will be part of a Tableau dashboard located here. This dashboard allows fans to explore the dataset and locate top-scored or popular titles by genre, time period, and demographic (although this field isn't always entered).
The extraction and cleaning process is outlined on github here.
I plan on updating this potentially every 2 weeks, this depends on my availability and the interest in this dataset.
Extracting and loading this data involved some transformations that should be noted:
- alternative_title field in the anime_table. This uses the English version of the name unless it is null; if the value is null, it uses the default name. This was in an effort to make the title accessible to English speakers. The original title field can be used if desired.
- genres field. MyAnimeList includes demographic information (shounen, seinen etc.) in the genres field. I've extracted it so that it could be used as its own field. However, many of those fields are null, making it somewhat difficult to use.
- start_date have been used. I will continue to use this method as long as it is viable.
- The primary keys in all of the tables (with the exclusion of the tm_ky table) are foreign keys to other tables. As a result, the tables have 2 or more primary keys.
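The alternative_title fallback described above can be sketched as a simple null-coalescing step. This is an illustration rather than the author's extraction code: the function name is hypothetical, the second row is made up, and treating an empty string as null is an added assumption.

```python
def pick_alternative_title(title, english_title):
    """Return the English title when present, otherwise the default title."""
    # Assumption: both None and an empty string count as "null".
    if english_title:
        return english_title
    return title


# Rows shaped like the title and alternative_titles.en fields.
# The second row is a made-up placeholder with no English title.
rows = [
    {"title": "Shingeki no Kyojin", "alternative_titles.en": "Attack on Titan"},
    {"title": "Hypothetical Title", "alternative_titles.en": None},
]

for row in rows:
    row["alternative_title"] = pick_alternative_title(
        row["title"], row["alternative_titles.en"]
    )

print([row["alternative_title"] for row in rows])
```

Keeping the original title column alongside the derived one, as the dataset does, means the fallback can be redone later with a different rule.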
| Field | Type | Primary Key |
|---|---|---|
| tm_ky | int | PK |
| mal_id | int | PK |
| demo_id | int | |

| Field | Type | Primary Key |
|---|---|---|
| tm_ky | int | PK |
| mal_id | int | PK |
| genres_id | int | PK |

| Field | Type | Primary Key |
|---|---|---|
| tm_ky | int | PK |
| mal_id | int | PK |
| mean | dbl | |
| rank | int | |
| popularity | int | |
| num_scoring_users | int | |
| statistics.watching | int | |
| statistics.completed | int | |
| statistics.on_hold | int | |
| statistics.dropped | int | |
| statistics.plan_to_watch | int | |
| statistics.num_scoring_users | int | |

| Field | Type | Primary Key |
|---|---|---|
| tm_ky | int | PK |
| mal_id | int | PK |
| studio_id | int | PK |

| Field | Type | Primary Key |
|---|---|---|
| tm_ky | int | PK |
| mal_id | int | PK |
| synonyms | chr | |

| Field | Type | Primary Key |
|---|---|---|
| tm_ky | int | PK |
| mal_id | int | PK |
| title | chr | |
| main_picture.medium | chr | |
| main_picture.large | chr | |
| alternative_titles.en | chr | |
| alternative_titles.ja | chr | |
| start_date | chr | |
| end_date | chr | |
| synopsis | chr | |
| media_type | chr | |
| status | chr | |
| num_episodes | int | |
| start_season.year | int | |
| start_season.season | chr | |
| rating | chr | |
| nsfw | chr | |
| demo_de | chr ... |
This data set was selected to be used for my capstone project for the Google Data Analytics Certificate. It allowed me to showcase the skills I have developed throughout the courses leading up to the certificate.
The dataset contains demographic, geographic, and crime-related information about individuals arrested by the Tucson Police Department in 2020. The data is in long format, with each observation in its own row and each column representing a variable.
I would like to thank everyone who contributed to the creation of the dataset as well as those who made it open to the public. I would also like to thank the Google Data Analytics Certificate team for guiding me through the well-designed courses leading up to certification.
My hope was to gain insight that would lead to data-driven recommendations, which would be used to develop strategies for the design of a crime prevention program.
https://creativecommons.org/publicdomain/zero/1.0/
These graphs were created in Looker Studio, Tableau and Power BI:
https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F16731800%2Faa30bfda8161a2ccb5532fb461d5c5ca%2Fgraph1.png?generation=1717963934031440&alt=media
https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F16731800%2F698cf099eee5fd39d7357707c23b9f83%2Fgraph2.jpg?generation=1717963939898552&alt=media
https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F16731800%2Ffbf5fac4f84f95d65738cb0e3df61df8%2Fgraph3.jpg?generation=1717963944992929&alt=media
In a large-scale consumer survey across the UK population on the perceptions of vegetable oils, palm oil was deemed to be the least environmentally friendly.[1]
It wasn’t even close. 41% of people thought palm oil was ‘environmentally unfriendly,’ compared to 15% for soybean oil, 9% for rapeseed, 5% for sunflower, and 2% for olive oil. 43% also answered ‘Don’t know,’ meaning that almost no one thought it was environmentally friendly.
Retailers know that this is becoming an important driver of consumer choices. From shampoos to detergents and from chocolate to cookies, companies are trying to eliminate palm oil from their products. There are now long lists of companies that have done so [Google ‘palm oil free’ and you will find an endless supply]. Many online grocery stores now offer the option to apply a ‘palm-oil free’ filter when browsing their products.[2]
Why are consumers turning their back on palm oil? And is this reputation justified?
In this article, I address some key questions about palm oil production: how has it changed, where is it grown, and how has this affected deforestation and biodiversity? The story of palm oil is more complex than it is often portrayed.
Global demand for vegetable oils has increased rapidly over the last 50 years. As palm oil is the most productive oil crop, it has taken up a lot of this production. This has had a negative impact on the environment, particularly in Indonesia and Malaysia. But it’s not clear that the alternatives would have fared any better. In fact, because we can produce up to 20 times as much oil per hectare from palm versus the alternatives, it has probably spared a lot of environmental impacts from elsewhere.
Description of the Dataset
1. Dataset Overview
Name: Wellness Technology Market Analysis Dataset
Purpose: This dataset is designed to analyze various factors influencing the success of wellness technology companies. It aims to identify strategic opportunities and challenges in the wellness tech industry by evaluating market trends, customer behavior, and competitive dynamics.
2. Key Attributes
Company ID: A unique identifier for each wellness technology company.
Company Name: The name of the company.
Product Categories: Types of wellness products offered (e.g., wearables, fitness apps, mental health platforms).
Market Share: Percentage of market share held by the company in different regions.
Revenue: Annual revenue generated by the company (numerical, in USD).
Customer Satisfaction Score: Average customer satisfaction ratings (numerical, e.g., 1 to 10 scale).
Investment Amount: Total investment received by the company (numerical, in USD).
Product Features: Key features of each product (categorical, e.g., heart rate monitoring, sleep tracking).
Competitive Position: Assessment of the company’s position relative to competitors (categorical, e.g., leader, challenger, niche).
Innovation Index: An index score representing the level of innovation in the company’s product offerings (numerical).
Marketing Spend: Annual expenditure on marketing and promotional activities (numerical, in USD).
User Demographics: Age, gender, and location of the users (categorical and numerical).
3. Data Collection Method
Sources: The data was collected from a combination of primary and secondary sources:
Industry Reports: Data was sourced from market research reports and industry analysis published by organizations like Gartner, IDC, and Statista.
Company Financial Statements: Financial information and market share data were obtained from public financial reports and investor relations sections of company websites.
Customer Reviews and Ratings: Customer satisfaction scores and feedback were collected from review platforms such as Trustpilot, Google Reviews, and app store ratings.
Surveys and Interviews: Direct surveys and interviews with industry experts, company executives, and customers were conducted to gather qualitative insights into product features and competitive positioning.
Market Analysis Tools: Tools like Google Trends and social media analytics were used to assess market trends and consumer sentiment.
Collection Tools and Techniques:
Web Scraping: Automated scripts were used to extract data from online reviews and financial websites.
APIs: Data was pulled from APIs provided by financial databases and market analysis tools.
Surveys: Surveys were administered using platforms like SurveyMonkey to gather direct feedback from stakeholders.
Data Quality Assurance:
Data Cleaning: Involves handling missing values, correcting data inconsistencies, and ensuring accurate data entry.
Validation: Data was cross-verified with multiple sources to ensure reliability and accuracy.
4. Dataset Size and Format
Size: The dataset comprises data from [number of companies, e.g., 50] wellness technology companies and covers [number of records, e.g., 500] individual data points.
Format: The data is stored in [format, e.g., Excel spreadsheets, SQL database] for ease of analysis and integration with analytical tools.
5. Privacy and Compliance
Data Privacy: All data collected is anonymized to ensure the privacy of individuals and companies.
Compliance: The data collection process adheres to relevant data protection regulations such as GDPR and CCPA, ensuring proper consent and secure handling of data.