https://creativecommons.org/publicdomain/zero/1.0/
This dataset contains 6 columns and 10k rows describing the demographics of an app's users:
- UID: User ID, a unique identifier for every app user.
- reg_date: Date on which the user registered.
- device: Operating system of the user's device.
- Gender: Gender of the user.
- Country: Country where the user downloaded the app.
- Age: Age of the user.
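A minimal sketch of how the six documented columns could be loaded into typed records. The header spelling, the ISO date format, and the sample rows are all assumptions made for illustration; none of them is taken from the dataset file itself.

```python
import csv
import io
from collections import Counter
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class AppUser:
    uid: str
    reg_date: date
    device: str
    gender: str
    country: str
    age: int


def read_users(handle):
    """Parse rows carrying the six documented columns into AppUser records."""
    reader = csv.DictReader(handle)
    return [
        AppUser(
            uid=row["UID"],
            reg_date=date.fromisoformat(row["reg_date"]),  # assumes YYYY-MM-DD
            device=row["device"],
            gender=row["Gender"],
            country=row["Country"],
            age=int(row["Age"]),
        )
        for row in reader
    ]


# Made-up rows for illustration only; not taken from the dataset.
SAMPLE = """UID,reg_date,device,Gender,Country,Age
1001,2017-06-29,and,M,USA,21
1002,2017-07-01,iOS,F,BRA,34
1003,2017-07-01,and,F,USA,19
"""

users = read_users(io.StringIO(SAMPLE))
users_per_device = Counter(u.device for u in users)
print(users_per_device)
```

To read a real file, pass an open file handle instead of the in-memory sample, adjusting the column names if the actual header differs.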
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
We surveyed 10,208 people from more than 15 countries on their mobile app usage behavior. The countries include USA, China, Japan, Germany, France, Brazil, UK, Italy, Russia, India, Canada, Spain, Australia, Mexico, and South Korea. We asked respondents about: (1) their mobile app usage, including the app stores they use, what triggers them to look for apps, why they download apps, why they abandon apps, and the types of apps they download; (2) their demographics, including gender, age, marital status, nationality, country of residence, first language, ethnicity, education level, occupation, and household income; and (3) their personality, measured using the Big-Five personality traits. This dataset contains the results of the survey.
https://crawlfeeds.com/privacy_policy
This comprehensive iOS application reviews dataset contains thousands of authentic user reviews from the Apple App Store in English. The dataset provides valuable insights for app developers, marketers, and researchers studying mobile application performance and user sentiment.
Key Features:
Applications: Perfect for sentiment analysis, app store optimization, mobile app development research, user experience studies, and competitive analysis. This dataset enables businesses to understand user preferences, identify app improvement opportunities, and develop better mobile applications.
Data Quality: All reviews are genuine user feedback collected from the official Apple App Store, ensuring authenticity and reliability for research and business intelligence purposes. The dataset covers various app categories including fitness, shopping, education, entertainment, and productivity applications.
At MFour, our Behavioral Data stands out for its uniqueness and depth of insights. What makes our data genuinely exceptional is the combination of several key factors:
First-Party Opt-In Data: Our data is sourced directly from our opt-in panel of consumers who willingly participate in research and provide observed behaviors. This ensures the highest data quality and eliminates privacy concerns. CCPA compliant.
Unparalleled Data Coverage: With access to 3B+ events, we have an extensive pool of participants who allow us to observe their brick + mortar location visitation, app + web smartphone usage, or both. This large-scale coverage provides robust and reliable insights.
Our data is generally sourced through our Surveys On The Go (SOTG) mobile research app, where consumers are incentivized with cash rewards to participate in surveys and share their observed behaviors. This incentivized approach ensures a willing and engaged panel, leading to the highest-quality data.
The primary use cases and verticals of our Behavioral Data Product are diverse and varied. Some key applications include:
Data Acquisition and Modeling: Our data helps businesses acquire valuable insights into consumer behavior and enables modeling for various research objectives.
Shopper Data Analysis: By understanding purchase behavior and patterns, businesses can optimize their strategies, improve targeting, and enhance customer experiences.
Media Consumption Insights: Our data provides a deep understanding of viewer behavior and patterns across popular platforms like YouTube, Amazon Prime, Netflix, and Disney+, enabling effective media planning and content optimization.
App Performance Optimization: Analyzing app behavior allows businesses to monitor usage patterns, track key performance indicators (KPIs), and optimize app experiences to drive user engagement and retention.
Location-Based Targeting: With our detailed location data, businesses can map out consumer visits to physical venues and combine them with web and app behavior to create predictive ad targeting strategies.
Audience Creation for Ad Placement: Our data enables the creation of highly targeted audiences for ad campaigns, ensuring better reach and engagement with relevant consumer segments.
The Behavioral Data Product complements our comprehensive suite of data solutions in the broader context of our data offering. It provides granular and event-level insights into consumer behaviors, which can be combined with other data sets such as survey responses, demographics, or custom profiling questions to offer a holistic understanding of consumer preferences, motivations, and actions.
MFour's Behavioral Data empowers businesses with unparalleled consumer insights, allowing them to make data-driven decisions, uncover new opportunities, and stay ahead in today's dynamic market landscape.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This is the dataset used for the paper "A Recommender System of Buggy App Checkers for App Store Moderators", published at the International Conference on Mobile Software Engineering and Systems (MOBILESoft) in 2015.
Dataset Collection
We built a dataset that consists of a random sample of Android app metadata and user reviews available on the Google Play Store in January and March 2014. Since the Google Play Store is continuously evolving (adding, removing and/or updating apps), we updated the dataset twice. The dataset D1 contains the apps available in the Google Play Store in January 2014. Then, we created a new snapshot (D2) of the Google Play Store in March 2014.
The apps belong to the 27 different categories defined by Google (at the time of writing the paper), and the 4 predefined subcategories (free, paid, new_free, and new_paid). For each category-subcategory pair (e.g. tools-free, tools-paid, sports-new_free, etc.), we collected a maximum of 500 samples, resulting in a median number of 1,978 apps per category.
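The sampling caps implied by this scheme can be worked out directly. The sketch below only restates the arithmetic from the figures given above (27 categories, 4 subcategories, at most 500 samples per pair); it assumes each sampled app is counted in exactly one category-subcategory pair, which the description does not state.

```python
CATEGORIES = 27
SUBCATEGORIES = ("free", "paid", "new_free", "new_paid")
MAX_SAMPLES_PER_PAIR = 500

# Number of category-subcategory pairs that were sampled.
pairs = CATEGORIES * len(SUBCATEGORIES)

# Upper bound on apps in a single category (all four subcategories at the cap).
max_apps_per_category = len(SUBCATEGORIES) * MAX_SAMPLES_PER_PAIR

# Upper bound on the whole snapshot if every pair reached the cap.
max_apps_total = pairs * MAX_SAMPLES_PER_PAIR

print(pairs, max_apps_per_category, max_apps_total)
```

Under that assumption, a per-category count can never exceed 2,000 apps, and a snapshot can never exceed 54,000 apps.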
For each app, we retrieved the following metadata: name, package, creator, version code, version name, number of downloads, size, upload date, star rating, star counting, and the set of permission requests.
In addition, for each app, we collected up to the latest 500 reviews posted by users in the Google Play Store. For each review, we retrieved its metadata: title, description, device, and version of the app. None of these fields were mandatory, so several reviews lack some of these details. From all the reviews attached to an app, we only considered the reviews associated with the latest version of the app (i.e., we discarded unversioned and old-versioned reviews). This resulted in a corpus of 1,402,717 reviews (Jan. 2014).
Dataset Stats
Some stats about the datasets:
D1 (Jan. 2014) contains 38,781 apps requesting 7,826 different permissions, and 1,402,717 user reviews.
D2 (Mar. 2014) contains 46,644 apps and 9,319 different permission requests, and 1,361,319 user reviews.
Additional stats about the datasets are available here.
Dataset Description
To store the dataset, we created a graph database with Neo4j. This dataset therefore consists of a graph describing the apps as nodes and edges. We chose a graph database because the graph visualization helps to identify connections among data (e.g., clusters of apps sharing similar sets of permission requests).
In particular, our dataset graph contains six types of nodes:
- APP nodes containing metadata of each app,
- PERMISSION nodes describing permission types,
- CATEGORY nodes describing app categories,
- SUBCATEGORY nodes describing app subcategories,
- USER_REVIEW nodes storing user reviews,
- TOPIC nodes describing topics mined from user reviews (using LDA).
Furthermore, there are five types of relationships between APP nodes and each of the remaining nodes:
Dataset Files Info
Neo4j 2.0 Databases
googlePlayDB1-Jan2014_neo4j_2_0.rar
googlePlayDB2-Mar2014_neo4j_2_0.rar
We provide two Neo4j databases containing the 2 snapshots of the Google Play Store (January and March 2014). These are the original databases created for the paper. The databases were created with Neo4j 2.0, specifically with 'Neo4j 2.0.0-M06 Community Edition' (the latest version available when the paper was implemented in 2014).
Neo4j 3.5 Databases
googlePlayDB1-Jan2014_neo4j_3_5_28.rar
googlePlayDB2-Mar2014_neo4j_3_5_28.rar
Neo4j 2.0 is now deprecated and is no longer available for download from the official Neo4j Download Center. We have therefore migrated the original databases (Neo4j 2.0) to Neo4j 3.5.28. The databases can be opened with 'Neo4j Community Edition 3.5.28', which can be downloaded from the official Neo4j Download page.
In order to open the databases with more recent versions of Neo4j, the databases must be first migrated to the corresponding version. Instructions about the migration process can be found in the Neo4j Migration Guide.
The first time you connect to the Neo4j database, it may request credentials. The username and password are: neo4j/neo4j
This data set contains some basic statistics about user count and user growth as well as crash count for a real mobile app. The dataset contains a basic timeseries of 1 hour resolution for a period of one week.
The data set contains columns for total concurrent user count, new users acquired in that period of time, number of sessions and crash count.
This data set would not be available without the Real User Monitoring capabilities of Dynatrace and its flexibility to export and expose this data for scientific experiments.
The data set was intended to play around with seasonality, trend and prediction of timeseries.
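As a rough illustration of the kind of seasonality and prediction experiment described above, the sketch below builds a synthetic one-week hourly series (7 x 24 = 168 points), averages it by hour of day, and applies a seasonal-naive forecast. The synthetic values and the function names are assumptions for illustration; they are not part of the dataset.

```python
import math
from statistics import mean

HOURS_PER_DAY = 24
DAYS = 7


def hourly_profile(series):
    """Average value for each hour of day across all days in the series."""
    if len(series) % HOURS_PER_DAY != 0:
        raise ValueError("series must cover whole days")
    return [mean(series[hour::HOURS_PER_DAY]) for hour in range(HOURS_PER_DAY)]


def seasonal_naive_forecast(series, horizon):
    """Forecast each future hour with the value observed 24 hours earlier."""
    history = list(series)
    for _ in range(horizon):
        history.append(history[-HOURS_PER_DAY])
    return history[len(series):]


# Synthetic stand-in for a concurrent-user column: a slow daily trend plus a
# 24-hour cycle. Real values would come from the exported time series.
user_count = [
    1000 + 10 * (t // HOURS_PER_DAY)
    + 300 * math.sin(2 * math.pi * (t % HOURS_PER_DAY) / HOURS_PER_DAY)
    for t in range(DAYS * HOURS_PER_DAY)
]

profile = hourly_profile(user_count)
next_day = seasonal_naive_forecast(user_count, HOURS_PER_DAY)
print(len(user_count), len(profile), len(next_day))
```

The seasonal-naive forecast is only a baseline: it repeats the last observed day and ignores trend, which makes it a useful reference point when trying more elaborate models.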
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This research conducted an online survey to investigate the relationship between dating app use and hookup intention. It measured dating app use, perceived descriptive norms, injunctive norms, fear of negative evaluation, hookup intention, and demographic information including age, gender, sexual orientation, and relationship status.
Our consumer data is gathered and aggregated via surveys, digital services, and public data sources. We use powerful profiling algorithms to collect and ingest only fresh and reliable data points.
Our comprehensive data enrichment solution includes a variety of data sets that can help you address gaps in your customer data, gain a deeper understanding of your customers, and power superior client experiences.
Consumer Graph Schema & Reach: Our data reach represents the total number of counts available within various categories and comprises attributes such as country location, MAU, DAU & Monthly Location Pings:
Data Export Methodology: Since we collect data dynamically, we provide the most updated data and insights via a best-suited method on a suitable interval (daily/weekly/monthly).
Consumer Graph Use Cases:
360-Degree Customer View: Get a comprehensive picture of customers by aggregating internal and external data.
Data Enrichment: Leverage online-to-offline consumer profiles to build holistic audience segments and improve campaign targeting through user data enrichment.
Fraud Detection: Use multiple digital (web and mobile) identities to verify real users and detect anomalies or fraudulent activity.
Advertising & Marketing: Understand audience demographics, interests, lifestyle, hobbies, and behaviors to build targeted marketing campaigns.
Using Factori Consumer Data graph you can solve use cases like:
Acquisition Marketing
Expand your reach to new users and customers by using lookalike modeling with your first-party audiences to reach other potential consumers with similar traits and attributes.
Lookalike Modeling
Build lookalike audience segments using your first-party audiences as a seed to extend your reach when running marketing campaigns to acquire new users or customers.
And also: CRM Data Enrichment, Consumer Data Enrichment, B2B Data Enrichment, B2C Data Enrichment, Customer Acquisition, Audience Segmentation, 360-Degree Customer View, Consumer Profiling, and Consumer Behaviour Data.
Here's the schema of Consumer Data:
person_id
first_name
last_name
age
gender
linkedin_url
twitter_url
facebook_url
city
state
address
zip
zip4
country
delivery_point_bar_code
carrier_route
walk_seuqence_code
fips_state_code
fips_country_code
country_name
latitude
longtiude
address_type
metropolitan_statistical_area
core_based+statistical_area
census_tract
census_block_group
census_block
primary_address
pre_address
streer
post_address
address_suffix
address_secondline
address_abrev
census_median_home_value
home_market_value
property_build+year
property_with_ac
property_with_pool
property_with_water
property_with_sewer
general_home_value
property_fuel_type
year
month
household_id
Census_median_household_income
household_size
marital_status
length+of_residence
number_of_kids
pre_school_kids
single_parents
working_women_in_house_hold
homeowner
children
adults
generations
net_worth
education_level
occupation
education_history
credit_lines
credit_card_user
newly_issued_credit_card_user
credit_range_new
credit_cards
loan_to_value
mortgage_loan2_amount
mortgage_loan_type
mortgage_loan2_type
mortgage_lender_code
mortgage_loan2_render_code
mortgage_lender
mortgage_loan2_lender
mortgage_loan2_ratetype
mortgage_rate
mortgage_loan2_rate
donor
investor
interest
buyer
hobby
personal_email
work_email
devices
phone
employee_title
employee_department
employee_job_function
skills
recent_job_change
company_id
company_name
company_description
technologies_used
office_address
office_city
office_country
office_state
office_zip5
office_zip4
office_carrier_route
office_latitude
office_longitude
office_cbsa_code
office_census_block_group
office_census_tract
office_county_code
company_phone
company_credit_score
company_csa_code
company_dpbc
company_franchiseflag
company_facebookurl
company_linkedinurl
company_twitterurl
company_website
company_fortune_rank
company_government_type
company_headquarters_branch
company_home_business
company_industry
company_num_pcs_used
company_num_employees
company_firm_individual
company_msa
company_msa_name
company_naics_code
company_naics_description
company_naics_code2
company_naics_description2
company_sic_code2
company_sic_code2_desc...
The following datasets are based on the children and youth (under age 21) beneficiary population and consist of aggregate Mental Health Service data derived from Medi-Cal claims, encounter, and eligibility systems. These datasets were developed in accordance with California Welfare and Institutions Code (WIC) § 14707.5 (added as part of Assembly Bill 470 on 10/7/17). Please contact BHData@dhcs.ca.gov for any questions or to request previous years’ versions of these datasets. Note: The Performance Dashboard AB 470 Report Application Excel tool development has been discontinued. Please see the Behavioral Health reporting data hub at https://behavioralhealth-data.dhcs.ca.gov/ for access to dashboards utilizing these datasets and other behavioral health data.
Table from the American Community Survey (ACS) 5-year series on age and gender related topics for City of Seattle Council Districts, Comprehensive Plan Growth Areas and Community Reporting Areas. Table includes B01001 Sex by Age, B01002 Median Age by Sex. Data is pulled from block group tables for the most recent ACS vintage and summarized to the neighborhoods based on block group assignment.
Table created for and used in the Neighborhood Profiles application.
Vintages: 2023
ACS Table(s): B01001, B01002
Data downloaded from: Census Bureau's Explore Census Data
The United States Census Bureau's American Community Survey (ACS): About the Survey, Geography & ACS, Technical Documentation, News & Updates
This ready-to-use layer can be used within ArcGIS Pro, ArcGIS Online, its configurable apps, dashboards, Story Maps, custom apps, and mobile apps. Data can also be exported for offline workflows. Please cite the Census and ACS when using this data.
Data Note from the Census: Data are based on a sample and are subject to sampling variability. The degree of uncertainty for an estimate arising from sampling variability is represented through the use of a margin of error. The value shown here is the 90 percent margin of error. The margin of error can be interpreted as providing a 90 percent probability that the interval defined by the estima
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
A dataset covering Spotify usage and artist performance in 2025, including metrics like monthly active users, premium subscriber counts, demographic breakdowns, and playlist analytics.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
There has been an increased emphasis on plant-based foods and diets. Although mobile technology has the potential to be a convenient and innovative tool to help consumers adhere to dietary guidelines, little is known about the content and quality of free, popular mobile health (mHealth) plant-based diet apps. The objective of the study was to assess the content and quality of free, popular mHealth apps supporting plant-based diets for Canadians. Free mHealth apps with high user ratings, a high number of user ratings, available on both Apple App and GooglePlay stores, and primarily marketed to help users follow plant-based diet were included. Using pre-defined search terms, Apple App and GooglePlay App stores were searched on December 22, 2020; the top 100 returns for each search term were screened for eligibility. Included apps were downloaded and assessed for quality by three dietitians/nutrition research assistants using the Mobile App Rating Scale (MARS) and the App Quality Evaluation (AQEL) scale. Of the 998 apps screened, 16 apps (mean user ratings±SEM: 4.6±0.1) met the eligibility criteria, comprising 10 recipe managers and meal planners, 2 food scanners, 2 community builders, 1 restaurant identifier, and 1 sustainability assessor. All included apps targeted the general population and focused on changing behaviors using education (15 apps), skills training (9 apps), and/or goal setting (4 apps). Although MARS (scale: 1–5) revealed overall adequate app quality scores (3.8±0.1), domain-specific assessments revealed high functionality (4.0±0.1) and aesthetic (4.0±0.2), but low credibility scores (2.4±0.1). The AQEL (scale: 0–10) revealed overall low score in support of knowledge acquisition (4.5±0.4) and adequate scores in other nutrition-focused domains (6.1–7.6). 
Despite a variety of free plant-based apps available with different focuses to help Canadians follow plant-based diets, our findings suggest a need for increased credibility and additional resources to complement the low support of knowledge acquisition among currently available plant-based apps. This research received no specific grant from any funding agency.
The following datasets are based on the adult (age 21 and over) beneficiary population and consist of aggregate MHS data derived from Medi-Cal claims, encounter, and eligibility systems. These datasets were developed in accordance with California Welfare and Institutions Code (WIC) § 14707.5 (added as part of Assembly Bill 470 on 10/7/17). Please contact BHData@dhcs.ca.gov for any questions or to request previous years’ versions of these datasets. Note: The Performance Dashboard AB 470 Report Application Excel tool development has been discontinued. Please see the Behavioral Health reporting data hub at https://behavioralhealth-data.dhcs.ca.gov/ for access to dashboards utilizing these datasets and other behavioral health data.
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
This dataset contains information on application install interactions of users in the Myket android application market. The dataset was created for the purpose of evaluating interaction prediction models, requiring user and item identifiers along with timestamps of the interactions. Hence, the dataset can be used for interaction prediction and building a recommendation system. Furthermore, the data forms a dynamic network of interactions, and we can also perform network representation learning on the nodes in the network, which are users and applications.
Data Creation
The dataset was initially generated by the Myket data team, and later cleaned and subsampled by Erfan Loghmani, a master's student at Sharif University of Technology at the time. The data team focused on a two-week period and randomly sampled 1/3 of the users with interactions during that period. They then selected install and update interactions for three months before and after the two-week period, resulting in interactions spanning about 6 months and two weeks.
We further subsampled and cleaned the data to focus on application download interactions. We identified the top 8000 most installed applications and selected interactions related to them. We retained users with more than 32 interactions, resulting in 280,391 users. From this group, we randomly selected 10,000 users, and the data was filtered to include only interactions for these users. The detailed procedure can be found here.
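The subsampling steps above can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the authors' actual script: the tuple layout `(user, app, timestamp)`, the function name, and the fixed random seed are all hypothetical, and the activity threshold is applied after the popular-app filter because that is the order in which the steps are described.

```python
import random
from collections import Counter


def subsample(interactions, top_apps=8000, min_interactions=32,
              n_users=10_000, seed=0):
    """Filter (user, app, timestamp) tuples following the described steps."""
    # 1. Keep only interactions with the most installed applications.
    app_counts = Counter(app for _, app, _ in interactions)
    popular = {app for app, _ in app_counts.most_common(top_apps)}
    kept = [row for row in interactions if row[1] in popular]

    # 2. Retain users with MORE than `min_interactions` remaining interactions.
    user_counts = Counter(user for user, _, _ in kept)
    eligible = sorted(u for u, c in user_counts.items() if c > min_interactions)

    # 3. Randomly select up to `n_users` of the eligible users.
    rng = random.Random(seed)
    chosen = set(rng.sample(eligible, min(n_users, len(eligible))))

    # 4. Keep only the interactions of the selected users.
    return [row for row in kept if row[0] in chosen]


# Toy data with small thresholds so that the effect of each step is visible.
toy = [
    ("u1", "a", 1), ("u1", "a", 2), ("u1", "b", 3),
    ("u2", "b", 4), ("u2", "c", 5),
]
result = subsample(toy, top_apps=2, min_interactions=1, n_users=1)
print(result)
```

In the toy run, app `c` falls outside the top two apps, user `u2` is left with a single interaction and is dropped, and only the three `u1` rows remain.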
Data Structure
The dataset has two main files.
myket.csv: This file contains the interaction information and follows the same format as the datasets used in the "JODIE: Predicting Dynamic Embedding Trajectory in Temporal Interaction Networks" (ACM SIGKDD 2019) project. However, this data does not contain state labels and interaction features, resulting in associated columns being all zero.
app_info_sample.csv: This file comprises features associated with applications present in the sample. For each individual application, information such as the approximate number of installs, average rating, count of ratings, and category are included. These features provide insights into the applications present in the dataset.
Dataset Details
For a detailed summary of the data's statistics, including information on users, applications, and interactions, please refer to the Python notebook available at summary-stats.ipynb. The notebook provides an overview of the dataset's characteristics and can be helpful for understanding the data's structure before using it for research or analysis.
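As a starting point before opening the notebook, the sketch below reads an interaction file laid out in the JODIE style, assuming one header row followed by rows of user id, item id, timestamp, state label, and a trailing list of feature values. That column order is an assumption based on the JODIE datasets and should be checked against the actual header of myket.csv; the sample rows are made up.

```python
import csv
import io
from typing import NamedTuple


class Interaction(NamedTuple):
    user_id: str
    item_id: str
    timestamp: float
    state_label: int
    features: tuple


def read_interactions(handle):
    """Parse JODIE-style rows: user, item, timestamp, label, then features."""
    reader = csv.reader(handle)
    next(reader)  # skip the header row
    rows = []
    for user, item, timestamp, label, *features in reader:
        rows.append(
            Interaction(
                user_id=user,
                item_id=item,
                timestamp=float(timestamp),
                state_label=int(float(label)),
                features=tuple(float(f) for f in features),
            )
        )
    return rows


# Made-up rows for illustration; labels and features are all zero, matching
# the description of the interaction file.
SAMPLE = """user_id,item_id,timestamp,state_label,comma_separated_list_of_features
0,0,0.0,0,0.0
0,1,36.0,0,0.0
1,0,77.0,0,0.0
"""

interactions = read_interactions(io.StringIO(SAMPLE))
n_users = len({row.user_id for row in interactions})
avg_interactions_per_user = len(interactions) / n_users
print(n_users, avg_interactions_per_user)
```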
Top 20 Most Installed Applications
Package Name | Count of Interactions |
---|---|
com.instagram.android | 15292 |
ir.resaneh1.iptv | 12143 |
com.tencent.ig | 7919 |
com.ForgeGames.SpecialForcesGroup2 | 7797 |
ir.nomogame.ClutchGame | 6193 |
com.dts.freefireth | 6041 |
com.whatsapp | 5876 |
com.supercell.clashofclans | 5817 |
com.mojang.minecraftpe | 5649 |
com.lenovo.anyshare.gps | 5076 |
ir.medu.shad | 4673 |
com.firsttouchgames.dls3 | 4641 |
com.activision.callofduty.shooter | 4357 |
com.tencent.iglite | 4126 |
com.aparat | 3598 |
com.kiloo.subwaysurf | 3135 |
com.supercell.clashroyale | 2793 |
co.palang.QuizOfKings | 2589 |
com.nazdika.app | 2436 |
com.digikala | 2413 |
Comparison with SNAP Datasets
The Myket dataset introduced in this repository exhibits distinct characteristics compared to the real-world datasets used by the JODIE project. The table below provides a comparative overview of the key dataset characteristics:
Dataset | #Users | #Items | #Interactions | Average Interactions per User | Average Unique Items per User |
---|---|---|---|---|---|
Myket | 10,000 | 7,988 | 694,121 | 69.4 | 54.6 |
LastFM | 980 | 1,000 | 1,293,103 | 1,319.5 | 158.2 |
Reddit | 10,000 | 984 | 672,447 | 67.2 | 7.9 |
Wikipedia | 8,227 | 1,000 | 157,474 | 19.1 | 2.2 |
MOOC | 7,047 | 97 | 411,749 | 58.4 | 25.3 |
The Myket dataset stands out by having an ample number of both users and items, highlighting its relevance for real-world, large-scale applications. Unlike LastFM, Reddit, and Wikipedia datasets, where users exhibit repetitive item interactions, the Myket dataset contains a comparatively lower amount of repetitive interactions. This unique characteristic reflects the diverse nature of user behaviors in the Android application market environment.
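The "Average Interactions per User" column can be recomputed from the user and interaction counts in the table, as a quick consistency check. The sketch below uses only figures from the table; the dictionary layout and function name are illustrative.

```python
# (users, interactions, reported average interactions per user)
REPORTED = {
    "Myket": (10_000, 694_121, 69.4),
    "LastFM": (980, 1_293_103, 1319.5),
    "Reddit": (10_000, 672_447, 67.2),
    "Wikipedia": (8_227, 157_474, 19.1),
    "MOOC": (7_047, 411_749, 58.4),
}


def average_interactions_per_user(users, interactions):
    """Total interactions divided by the number of users."""
    return interactions / users


recomputed = {
    name: round(average_interactions_per_user(users, interactions), 1)
    for name, (users, interactions, _) in REPORTED.items()
}

for name, value in recomputed.items():
    print(f"{name}: {value} (reported {REPORTED[name][2]})")
```

The "Average Unique Items per User" column cannot be checked this way, since it depends on the per-user interaction lists rather than on the totals.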
Citation
If you use this dataset in your research, please cite the following preprint:
@misc{loghmani2023effect,
title={Effect of Choosing Loss Function when Using T-batching for Representation Learning on Dynamic Networks},
author={Erfan Loghmani and MohammadAmin Fazli},
year={2023},
eprint={2308.06862},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
https://dataful.in/terms-and-conditions
The dataset contains year-, month- and payment application-wise UPI Apps Transaction Statistics such as Customer Initiated Transactions, B2C Transactions, B2B Transactions and On-us Transactions.
Note:
1) Unified Payments Interface (UPI) is an instant real-time payment system developed by National Payments Corporation of India. The interface facilitates inter-bank peer-to-peer and person-to-merchant transactions.
2) From January 2021 onwards, 'On-us Transactions' in UPI that are not processed and settled through the UPI Central System are shown under the 'On-us Transactions' column.
3) Apps with a volume of less than 10,000 are included under 'Other Apps'.
4) App volume in the table is based on the Payer App logic, i.e. the financial transaction is attributed to the PSP in UPI on the Payer's side.
5) BHIM volume is inclusive of *99# volume.
6) For WhatsApp, the maximum registered user base in UPI is one hundred (100) million.
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Social media has become a part of our day-to-day routine, keeping users from across the world well-connected through digital platforms. With each passing year, social media evolves rapidly and the number of social media users grows quickly. Reports also suggest the number of social media users will reach a milestone of 5.85 billion in 2027.
In 2024, 62.6% of the world’s population will access social media, which clearly indicates the dominance of social media platforms in today’s world. In this article, we will examine social media statistics for 2024, uncovering monthly active users, daily time spent by users, most downloaded social media apps, etc.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Social networks are tied to population dynamics; interactions are driven by population density and demographic structure, while social relationships can be key determinants of survival and reproductive success. However, difficulties integrating models used in demography and network analysis have limited research at this interface. We introduce the R package genNetDem for simulating integrated network-demographic datasets. It can be used to create longitudinal social networks and/or capture-recapture datasets with known properties. It incorporates the ability to generate populations and their social networks, generate grouping events using these networks, simulate social network effects on individual survival, and flexibly sample these longitudinal datasets of social associations. By generating co-capture data with known statistical relationships it provides functionality for methodological research. We demonstrate its use with case studies testing how imputation and sampling design influence the success of adding network traits to conventional Cormack-Jolly-Seber (CJS) models. We show that incorporating social network effects in CJS models generates qualitatively accurate results, but with downward-biased parameter estimates when network position influences survival. Biases are greater when fewer interactions are sampled or fewer individuals are observed in each interaction. While our results indicate the potential of incorporating social effects within demographic models, they show that imputing missing network measures alone is insufficient to accurately estimate social effects on survival, pointing to the importance of incorporating network imputation approaches. genNetDem provides a flexible tool to aid these methodological advancements and help researchers test other sampling considerations in social network studies.
The number of social media users in Saudi Arabia was forecast to increase continuously between 2024 and 2029 by a total of six million users (+28.05 percent). After the ninth consecutive year of growth, the social media user base is estimated to reach 27.42 million users, and therefore a new peak, in 2029. Notably, the number of social media users has been increasing continuously over the past years.
The figures shown for social media users have been derived from survey data that has been processed to estimate missing demographics.
The data shown are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to 150 countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press, and they are processed to generate comparable data sets (see supplementary notes under details for more information).
Find more key insights for the number of social media users in countries like Israel and Kuwait.
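The three forecast figures (a rise of about six million users, +28.05 percent, and 27.42 million users in 2029) can be cross-checked with simple arithmetic. The sketch below derives the implied 2024 base from the 2029 value and the percentage; the function and variable names are illustrative.

```python
def implied_base(final_value, pct_increase):
    """Starting value that grows by pct_increase percent to reach final_value."""
    return final_value / (1 + pct_increase / 100)


FORECAST_2029_MILLIONS = 27.42
INCREASE_PERCENT = 28.05

base_2024_millions = implied_base(FORECAST_2029_MILLIONS, INCREASE_PERCENT)
absolute_increase_millions = FORECAST_2029_MILLIONS - base_2024_millions

print(round(base_2024_millions, 2), round(absolute_increase_millions, 2))
```

The implied increase comes out at roughly six million users, so the three reported figures are mutually consistent up to rounding.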
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
For years, we have relied on population surveys to keep track of regional public health statistics, including the prevalence of non-communicable diseases. Because of the cost and limitations of such surveys, we often do not have the up-to-date data on health outcomes of a region. In this paper, we examined the feasibility of inferring regional health outcomes from socio-demographic data that are widely available and timely updated through national censuses and community surveys. Using data for 50 American states (excluding Washington DC) from 2007 to 2012, we constructed a machine-learning model to predict the prevalence of six non-communicable disease (NCD) outcomes (four NCDs and two major clinical risk factors), based on population socio-demographic characteristics from the American Community Survey. We found that regional prevalence estimates for non-communicable diseases can be reasonably predicted. The predictions were highly correlated with the observed data, in both the states included in the derivation model (median correlation 0.88) and those excluded from the development for use as a completely separated validation sample (median correlation 0.85), demonstrating that the model had sufficient external validity to make good predictions, based on demographics alone, for areas not included in the model development. This highlights both the utility of this sophisticated approach to model development, and the vital importance of simple socio-demographic characteristics as both indicators and determinants of chronic disease.
TagX Web Browsing Clickstream Data: Unveiling Digital Behavior Across North America and EU Unique Insights into Online User Behavior TagX Web Browsing clickstream Data offers an unparalleled window into the digital lives of 1 million users across North America and the European Union. This comprehensive dataset stands out in the market due to its breadth, depth, and stringent compliance with data protection regulations. What Makes Our Data Unique?
Extensive Geographic Coverage: Spanning two major markets, our data provides a holistic view of web browsing patterns in developed economies. Large User Base: With 300K active users, our dataset offers statistically significant insights across various demographics and user segments. GDPR and CCPA Compliance: We prioritize user privacy and data protection, ensuring that our data collection and processing methods adhere to the strictest regulatory standards. Real-time Updates: Our clickstream data is continuously refreshed, providing up-to-the-minute insights into evolving online trends and user behaviors. Granular Data Points: We capture a wide array of metrics, including time spent on websites, click patterns, search queries, and user journey flows.
Data Sourcing: Ethical and Transparent
Our web browsing clickstream data is sourced through a network of partnered websites and applications. Users explicitly opt in to data collection, ensuring transparency and consent. We employ advanced anonymization techniques to protect individual privacy while maintaining the integrity and value of the aggregated data. Key aspects of our data sourcing process include:
Voluntary user participation through clear opt-in mechanisms
Regular audits of data collection methods to ensure ongoing compliance
Collaboration with privacy experts to implement best practices in data anonymization
Continuous monitoring of regulatory landscapes to adapt our processes as needed
Primary Use Cases and Verticals
TagX Web Browsing clickstream Data serves a multitude of industries and use cases, including but not limited to:
Digital Marketing and Advertising:
Audience segmentation and targeting
Campaign performance optimization
Competitor analysis and benchmarking
E-commerce and Retail:
Customer journey mapping
Product recommendation enhancements
Cart abandonment analysis
Media and Entertainment:
Content consumption trends
Audience engagement metrics
Cross-platform user behavior analysis
Financial Services:
Risk assessment based on online behavior
Fraud detection through anomaly identification
Investment trend analysis
Technology and Software:
User experience optimization
Feature adoption tracking
Competitive intelligence
Market Research and Consulting:
Consumer behavior studies
Industry trend analysis
Digital transformation strategies
Integration with Broader Data Offering
TagX Web Browsing clickstream Data is a cornerstone of our comprehensive digital intelligence suite. It seamlessly integrates with our other data products to provide a 360-degree view of online user behavior:
Social Media Engagement Data: Combine clickstream insights with social media interactions for a holistic understanding of digital footprints.
Mobile App Usage Data: Cross-reference web browsing patterns with mobile app usage to map the complete digital journey.
Purchase Intent Signals: Enrich clickstream data with purchase intent indicators to power predictive analytics and targeted marketing efforts.
Demographic Overlays: Enhance web browsing data with demographic information for more precise audience segmentation and targeting.
By leveraging these complementary datasets, businesses can unlock deeper insights and drive more impactful strategies across their digital initiatives.
Data Quality and Scale
We pride ourselves on delivering high-quality, reliable data at scale:
Rigorous Data Cleaning: Advanced algorithms filter out bot traffic, VPNs, and other non-human interactions.
Regular Quality Checks: Our data science team conducts ongoing audits to ensure data accuracy and consistency.
Scalable Infrastructure: Our robust data processing pipeline can handle billions of daily events, ensuring comprehensive coverage.
Historical Data Availability: Access up to 24 months of historical data for trend analysis and longitudinal studies.
Customizable Data Feeds: Tailor the data delivery to your specific needs, from raw clickstream events to aggregated insights.
Empowering Data-Driven Decision Making
In today's digital-first world, understanding online user behavior is crucial for businesses across all sectors. TagX Web Browsing clickstream Data empowers organizations to make informed decisions, optimize their digital strategies, and stay ahead of the competition. Whether you're a marketer looking to refine your targeting, a product manager seeking to enhance user experience, or a researcher exploring digital trends, our cli...