License: MIT License, https://opensource.org/licenses/MIT
License information was derived automatically
The rise of online dating apps has transformed how Gen Z in India explores relationships, social interactions, and casual dating. This analysis investigates dating app usage patterns, preferences, and challenges faced by individuals aged 18-25 across major Indian cities.
Topics covered:
- ✅ Most popular dating apps
- ✅ Frequency & reasons for usage
- ✅ User satisfaction levels
- ✅ Challenges like safety concerns & time-wasting
- ✅ Preferences for features & communication methods
The study employs data visualization, statistical insights, and correlation analysis to understand the evolving landscape of online dating in India. 🚀
User_ID: Unique identifier for each participant.
Age: Age of the user (18-25 range).
Gender: Gender identity (Male, Female, Non-binary, etc.).
Location: City of residence (e.g., Delhi, Mumbai).
Education: Education level (Undergraduate, Graduate, Postgraduate).
Occupation: Occupation type (e.g., Student, Freelancer, Intern).
Primary_App: The main dating app used by the user (e.g., Tinder, Bumble, Hinge).
Secondary_Apps: Other dating apps used, if any.
Usage_Frequency: How often they use dating apps (Daily, Weekly, Monthly).
Daily_Usage_Time: Time spent daily on dating apps (e.g., 1 hour, 2 hours).
Reason_for_Using: Purpose for using the apps (e.g., Casual Dating, Finding a Partner).
Satisfaction: Satisfaction level with the primary app (e.g., 1 to 5 scale).
Challenges: Challenges faced during usage (e.g., Safety Concerns, Lack of Matches).
Desired_Features: Features users want in dating apps (e.g., Video Calls, Compatibility Insights).
Preferred_Communication: Communication preferences (e.g., Text, Voice Notes, Video Calls).
Partner_Priorities: Attributes prioritized in a partner (e.g., Personality > Interests > Appearance).
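As a rough illustration of how the columns above could be used, the sketch below averages Satisfaction per Primary_App with the Python standard library. The sample rows are invented for the example and are not taken from the dataset, the function name is hypothetical, and it assumes Satisfaction is stored as an integer on the 1 to 5 scale.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Hypothetical rows for illustration only; they are NOT taken from the dataset.
SAMPLE_CSV = """User_ID,Age,Primary_App,Usage_Frequency,Satisfaction
1,21,Tinder,Daily,3
2,24,Bumble,Weekly,4
3,19,Tinder,Monthly,5
4,22,Hinge,Daily,2
"""


def mean_satisfaction_by_app(csv_text):
    """Return {app: mean satisfaction} from CSV text using the documented column names."""
    scores = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Assumption: Satisfaction is an integer on the 1 to 5 scale.
        scores[row["Primary_App"]].append(int(row["Satisfaction"]))
    return {app: mean(values) for app, values in scores.items()}


result = mean_satisfaction_by_app(SAMPLE_CSV)
for app, score in sorted(result.items()):
    print(app, score)
```

With the real file, the same function could be fed the file's text instead of SAMPLE_CSV, provided the header names match the list above.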
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This is a public dataset called OKCupid, collected by Kirkegaard and Bjerrekaer. The dataset is composed of 68,371 records and 2,626 variables. It is shared for educational purposes and is formatted in Arrow Parquet.
Description from the authors:
"A very large dataset (N=68,371, 2,620 variables) from the dating site OKCupid is presented and made publicly available for use by others. As an example of the analyses one can do with the dataset, a cognitive ability test is constructed from 14 suitable items. To validate the dataset and the test, the relationship of cognitive ability to religious beliefs and political interest/participation is examined. Cognitive ability is found to be negatively related to all measures of religious belief (latent correlations -.26 to -.35), and found to be positively related to all measures of political interest and participation (latent correlations .19 to .32). To further validate the dataset, we examined the relationship between Zodiac sign and every other variable. We found very scant evidence of any influence (the distribution of p-values from chi square tests was flat). Limitations of the dataset are discussed."
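To make the chi-square tests mentioned in the authors' description more concrete, here is a minimal sketch that computes the Pearson chi-square statistic for a contingency table of observed counts. The counts are made up, the function name is hypothetical, and this is not the authors' code. Turning the statistic into a p-value would additionally need a chi-square distribution function, which the standard library does not provide.

```python
def chi_square_statistic(table):
    """Return (statistic, degrees_of_freedom) for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of the row and column variables.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected

    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Hypothetical 2x2 table, e.g. two groups (rows) by two answer options (columns).
observed = [[10, 20], [20, 10]]
statistic, dof = chi_square_statistic(observed)
print(statistic, dof)
```

A flat distribution of p-values across many such tests, as the authors report for Zodiac sign, is what one would expect if the variable has no real association with the others.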
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The Dataset provides a comprehensive view into the dynamics of online matchmaking interactions. It captures essential variables that influence the likelihood of successful matches across different genders. This dataset allows researchers and analysts to explore how factors such as VIP subscription status, income levels, parental status, age, and self-perceived attractiveness contribute to the outcomes of online dating endeavors.
The occurrence of zero matches for certain users within the dataset can be attributed to the presence of "ghost users." These are users who create an account but subsequently abandon the app without engaging further. Consequently, their profiles do not participate in any matching activities, leading to a recorded match count of zero. This phenomenon should be taken into account when analyzing user activity and match data, as it impacts the overall interpretation of user engagement and match success rates.
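One way to account for this when summarizing match data is to report statistics both with and without the zero-match records. The sketch below does so on invented match counts. The function name is hypothetical, and it treats every zero as a ghost user, which is a simplifying assumption, since an engaged user could also have no matches.

```python
from statistics import mean

# Hypothetical match counts; 0 marks accounts with no recorded matches.
match_counts = [0, 0, 12, 7, 0, 30, 5]


def summarize_matches(counts):
    """Summarize matches with and without zero-match ("ghost") accounts."""
    active = [c for c in counts if c > 0]
    return {
        "ghost_share": (len(counts) - len(active)) / len(counts),
        "mean_all": mean(counts),
        # None when every account has zero matches.
        "mean_active": mean(active) if active else None,
    }


summary = summarize_matches(match_counts)
print(summary)
```

Comparing mean_all with mean_active shows how strongly the zero-match accounts pull the average down.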
This dataset contains 1000 records, which is considered relatively low within this category of datasets. Additionally, the dataset may not accurately reflect reality as it was captured intermittently over different periods of time.
Furthermore, certain match categories are missing due to confidentiality constraints, and several other crucial variables are also absent for the same reason. Consequently, the machine learning models employed may not achieve high accuracy in predicting the number of matches.
It is important to acknowledge these limitations when interpreting the results derived from this dataset. Careful consideration of these factors is advised when drawing conclusions or making decisions based on the findings of any analyses conducted using this data.
Due to confidentiality constraints, only a small amount of data was collected. Additionally, only users with variables showing high correlation with the matching variable were included in the dataset.
As a result, the high performance of machine learning models on this dataset is primarily due to the data collection method (i.e., only high-correlation data was included).
Therefore, the findings you may derive from manipulating this dataset are not representative of the real dating world.
The source of this dataset is confidential, and it may be released in the future. For the present, this dataset can be utilized under the terms of the license visible on the dataset's card.
Users are advised to review and adhere to the terms specified in the dataset's license when using the data for any purpose.
This dataset provides insights into the dynamics of online dating interactions, allowing for predictive modeling and analysis of factors influencing matchmaking success.
This dataset, shared by Rabie El Kharoua, is original and has never been shared before. It is made available under the CC BY 4.0 license, allowing anyone to use the dataset in any form as long as proper citation is given to the author. A DOI is provided for proper referencing. Please note that duplication of this work within Kaggle is not permitted.
This dataset is synthetic and was generated for educational purposes, making it ideal for data science and machine learning projects. It is an original dataset, owned by Mr. Rabie El Kharoua, and has not been previously shared. You are free to use it under the license outlined on the data card. The dataset is offered without any guarantees. Details about the data provider will be shared soon.
License: https://creativecommons.org/publicdomain/zero/1.0/
By Jeffrey Mvutu Mabilama [source]
When Dating apps like Tinder began to become more popular, users wanted to create the best profiles possible in order to maximize their chances of being noticed and gain more potential encounters. Unlike traditional dating platforms, these new ones required mutual attraction before allowing two people to chat, making it all the more important for users to create a great profile that would give them an advantage over others.
It was amidst this scene that we humans began paying attention to how charismatic and inspiring people presented themselves online. The most charismatic individuals tended to be the ones with the most followers or friends on social networks. This made us question what makes a great user profile and how one could make a lasting first impression in order to find true love or even just some new friendships. How do we recognize a truly charismatic person from their presentation on social media? Is there any way of quantifying charisma?
In 2015 I set out to research all this using the newest version of Lovoo's dating app at the time, V3 (the iOS version), gathering user profile data such as age demographics, interest types (friendship, chatting, or dating), and language preferences, as well as usually unavailable metrics like the number of profile visits and kisses received. I was also able to collect pictures of those user profiles in order to discern any correlations between appeal and reputation that may have existed at that time among Lovoo's user base.
My hope is that this dataset will help you answer questions related not just to romantic success but also to popularity and charisma, census and demographic studies, and even the detection of influential figures both within and outside Lovoo's platform. A starter analysis accompanies this dataset and can be used as a reference point when working with the data. Using this dataset you can run your own investigations into:
* What type of person has attracted more visitors or potential matches than others?
* Which criteria can be used to determine someone's charm or likability?
* How can users optimize the visibility of their dating app profile so that they do not remain unseen among other users?
Grab this opportunity now and kick-start your journey towards understanding the inner workings behind success in online relationships!
To get started with this dataset, first download it from Kaggle. Once downloaded, take a look at the column names to get an idea of what information is available. The data includes fields such as gender, age, name (and nickname), number of pictures uploaded, profile visits, kisses, fans, and gifts received, and flirt interests (chatting or making friends). It also contains language specifics, such as the detected languages for each user, as well as country and city of residence.
The most interesting section for your research is likely the set of details that have been filled in for each user, such as whether they are interested in chatting or making friends. These data points usually allow us to infer more about a person's character, from jokester to serious individualist (or anything else!). The same holds true for their language preferences, which might reveal aspects of their cultural orientation or habits.
You may also want data that was left out here: the imagery associated with users' profiles. Please contact JfreexDatasets_bot on Telegram if you would like access to this imagery, which has not yet been uploaded to Kaggle but is an integral part of understanding what makes a user profile attractive on these platforms. Aesthetics Theory can be applied here in an authentic way by considering how each image adds sentimental appeal through its perspective and content focus, be it visually descriptive, an emotive narrative, or personality coupled with expression and mood. Alternatively, simply download the relevant images yourself using the ready-made automated scripts from the Grammak website, which has a GitHub repo: https://github.com/grammak580542008/Lovoo-v3-Profiles-Data
Finally, keep in mind that there are other ways to gather data besides downloading it from Kaggle, such as Messenger bots or Customer Relationship Management systems, which help companies serve...
"Dating apps have revolutionized the dating game for a new generation of singles. Making finding dates and future love theoretically easier than ever before, it's no surprise that 'online' has become the most common way for people to meet nowadays."
"The Statista Global Consumer Survey reveals that the country with the largest share of singles utilizing this modern matchmaking method is Sweden. Here, 25 percent said they used online dating services. Brazil was close to the top of the ranking with 22 percent. People in South Korea and Russia seem to get along fine without much help from online dating however, with just 7 and 4 percent of singles saying they were using such services."
Photo by Mika Baumeister on Unsplash
Users who have been defrauded (financially) by other users they met on a dating service site. The site itself is not at fault, since it is just another tool for people to do what they were already doing before the Internet.
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
OkCupid (often abbreviated as OKC, but officially OkC) is a U.S.-based, internationally operating online dating, friendship, and formerly also a social networking website and application. It features multiple-choice questions to match members. Registration is free. OkCupid is owned by Match Group, which also owns Tinder, Hinge, Plenty of Fish, and many other popular dating apps and sites. [Source: [Wikipedia](https://en.wikipedia.org/wiki/OkCupid)]
This dataset covers the OkCupid app available on the Google Play Store. It mostly contains user reviews and the comments made by users.
The content of the various columns is listed below. Please find the description for each column.
| Column Name | Column Description |
|---|---|
| userName | Name of a User |
| userImage | Profile Image that a user has |
| content | This represents the comments made by a user |
| score | Scores/Rating between 1 to 5 |
| thumbsUpCount | Number of Thumbs up received by a person |
| reviewCreatedVersion | Version number on which the review is created |
| at | Created At |
| replyContent | Reply to the comment by the Company |
| repliedAt | Date and time of the above reply |
| reviewId | unique identifier |
Banner image - OkCupid
Online dating services have increased in popularity around the world, but a lack of quality data hinders our understanding of their role in family formation. This paper studies the effect of online dating services on marital sorting, using a novel dataset with verified information on people and their spouses. Estimates based on matching techniques suggest that, relative to other spouse search methods, online dating promotes marriages that exhibit weaker sorting along occupation and geographical proximity but stronger sorting along education and other demographic traits. Sensitivity analysis, including the Rosenbaum Bounds approach, suggests that online dating's impact on marital sorting is robust to potential selection bias.
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Background: University campus clinics provide crucial sexual health services to students, including STI/HIV screening, testing, contraception, and counseling. These clinics are essential for engaging young adults who may lack access to primary care or have difficulty reaching off-campus services. Dating apps are widely used by young adults, yet there is a lack of studies on how they affect sexual practices. This study aimed to evaluate the use of dating apps, engagement in condomless sexual activity, and the prevalence of STIs among young adult college students in Northern Texas.
Methods: A cross-sectional survey was conducted from August to December 2022 among undergraduate and graduate students aged 18–35 at a large university in Northern Texas. A total of 122 eligible participants completed the survey, which assessed demographics, sexual behaviors, dating app use, and STI/HIV testing practices. Descriptive statistics, bivariate analyses, and multivariate Poisson regression analyses with robust variance were performed to identify factors associated with dating app use and condomless sexual activity.
Results: Two-thirds of participants reported using dating apps. Significant differences were found between app users and non-users regarding demographic factors and unprotected sexual behaviors. Dating app users were more likely to report multiple sexual partners, inconsistent condom use, and a higher likelihood of engaging in unprotected sex. Poisson regression analysis indicated that app use was associated with residing in large urban areas, frequent use of campus STI/HIV screening services, and having multiple sexual partners (p
License: https://creativecommons.org/publicdomain/zero/1.0/
I was interested in learning about the growing trend in what dating apps are used for in India over the years.
The data covers 2017-2022. I acquired it from the Google Play Store using google_play_scraper. The scrape returned more columns than the ones shown here, but the others were unnecessary.
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Net-Income-Applicable-To-Common-Shares Time Series for Bumble Inc. Bumble Inc. provides online dating and social networking applications in North America, Europe, and internationally. It owns and operates websites and applications that offer subscriptions and in-app purchases of products. The company operates apps including Bumble, a dating app built with women at the center, where women make the first move; Badoo, the web and mobile free-to-use dating app; Bumble BFF and Bumble Bizz Modes, which have a format similar to the date mode, requiring users to set up profiles and matching users through yes and no votes, similar to the dating platform; and Bumble for Friends, a friendship app where people in all stages of life can meet people nearby and create meaningful platonic connections, as well as the Geneva app, where users can create and join chat, forum, audio, video, and broadcast rooms. The company was founded in 2020 and is headquartered in Austin, Texas.
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Please cite the following paper when using this dataset:
N. Thakur, “A Large-Scale Dataset of Twitter Chatter about Online Learning during the Current COVID-19 Omicron Wave,” Journal of Data, vol. 7, no. 8, p. 109, Aug. 2022, doi: 10.3390/data7080109
Abstract
The COVID-19 Omicron variant, reported to be the most immune evasive variant of COVID-19, is resulting in a surge of COVID-19 cases globally. This has caused schools, colleges, and universities in different parts of the world to transition to online learning. As a result, social media platforms such as Twitter are seeing an increase in conversations, centered around information seeking and sharing, related to online learning. Mining such conversations, such as Tweets, to develop a dataset can serve as a data resource for interdisciplinary research related to the analysis of interest, views, opinions, perspectives, attitudes, and feedback towards online learning during the current surge of COVID-19 cases caused by the Omicron variant. Therefore, this work presents a large-scale public Twitter dataset of conversations about online learning since the first detected case of the COVID-19 Omicron variant in November 2021. The dataset is compliant with the privacy policy, developer agreement, and guidelines for content redistribution of Twitter and with the FAIR principles (Findability, Accessibility, Interoperability, and Reusability) for scientific data management.
Data Description
The dataset comprises a total of 52,984 Tweet IDs (that correspond to the same number of Tweets) about online learning that were posted on Twitter from 9th November 2021 to 13th July 2022. The earliest date was selected as 9th November 2021, as the Omicron variant was detected for the first time in a sample that was collected on this date. 13th July 2022 was the most recent date as per the time of data collection and publication of this dataset.
The dataset consists of 9 .txt files. An overview of these dataset files along with the number of Tweet IDs and the date range of the associated tweets is as follows. Table 1 shows the list of all the synonyms or terms that were used for the dataset development.
Filename: TweetIDs_November_2021.txt (No. of Tweet IDs: 1283, Date Range of the associated Tweet IDs: November 1, 2021 to November 30, 2021)
Filename: TweetIDs_December_2021.txt (No. of Tweet IDs: 10545, Date Range of the associated Tweet IDs: December 1, 2021 to December 31, 2021)
Filename: TweetIDs_January_2022.txt (No. of Tweet IDs: 23078, Date Range of the associated Tweet IDs: January 1, 2022 to January 31, 2022)
Filename: TweetIDs_February_2022.txt (No. of Tweet IDs: 4751, Date Range of the associated Tweet IDs: February 1, 2022 to February 28, 2022)
Filename: TweetIDs_March_2022.txt (No. of Tweet IDs: 3434, Date Range of the associated Tweet IDs: March 1, 2022 to March 31, 2022)
Filename: TweetIDs_April_2022.txt (No. of Tweet IDs: 3355, Date Range of the associated Tweet IDs: April 1, 2022 to April 30, 2022)
Filename: TweetIDs_May_2022.txt (No. of Tweet IDs: 3120, Date Range of the associated Tweet IDs: May 1, 2022 to May 31, 2022)
Filename: TweetIDs_June_2022.txt (No. of Tweet IDs: 2361, Date Range of the associated Tweet IDs: June 1, 2022 to June 30, 2022)
Filename: TweetIDs_July_2022.txt (No. of Tweet IDs: 1057, Date Range of the associated Tweet IDs: July 1, 2022 to July 13, 2022)
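As a quick consistency check, the per-file counts listed above can be summed and compared with the stated total of 52,984 Tweet IDs. In the sketch below, `count_ids` is a hypothetical helper for checking a downloaded file, and it assumes each .txt file holds one Tweet ID per line, which the description does not state explicitly.

```python
from pathlib import Path

# Counts copied from the file overview above.
FILE_COUNTS = {
    "TweetIDs_November_2021.txt": 1283,
    "TweetIDs_December_2021.txt": 10545,
    "TweetIDs_January_2022.txt": 23078,
    "TweetIDs_February_2022.txt": 4751,
    "TweetIDs_March_2022.txt": 3434,
    "TweetIDs_April_2022.txt": 3355,
    "TweetIDs_May_2022.txt": 3120,
    "TweetIDs_June_2022.txt": 2361,
    "TweetIDs_July_2022.txt": 1057,
}
STATED_TOTAL = 52984


def count_ids(path):
    """Count non-empty lines in a Tweet ID file (assumes one ID per line)."""
    with Path(path).open(encoding="utf-8") as handle:
        return sum(1 for line in handle if line.strip())


total = sum(FILE_COUNTS.values())
print(total, total == STATED_TOTAL)
```

After downloading the files, calling `count_ids` on each filename and comparing the result with FILE_COUNTS would confirm that nothing was truncated in transit.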
The dataset contains only Tweet IDs in compliance with the terms and conditions mentioned in the privacy policy, developer agreement, and guidelines for content redistribution of Twitter. The Tweet IDs need to be hydrated to be used. For hydrating this dataset the Hydrator application (link to download and a step-by-step tutorial on how to use Hydrator) may be used.
Table 1. List of commonly used synonyms, terms, and phrases for online learning and COVID-19 that were used for the dataset development
| Terminology | List of synonyms and terms |
|---|---|
| COVID-19 | Omicron, COVID, COVID19, coronavirus, coronaviruspandemic, COVID-19, corona, coronaoutbreak, omicron variant, SARS CoV-2, corona virus |
| online learning | online education, online learning, remote education, remote learning, e-learning, elearning, distance learning, distance education, virtual learning, virtual education, online teaching, remote teaching, virtual teaching, online class, online classes, remote class, remote classes, distance class, distance classes, virtual class, virtual classes, online course, online courses, remote course, remote courses, distance course, distance courses, virtual course, virtual courses, online school, virtual school, remote school, online college, online university, virtual college, virtual university, remote college, remote university, online lecture, virtual lecture, remote lecture, online lectures, virtual lectures, remote lectures |
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The dataset contains information about the contents of 100 Terms of Service (ToS) of online platforms. The documents were analyzed and evaluated from the point of view of the European Union consumer law. The main results have been presented in the table titled "Terms of Service Analysis and Evaluation_RESULTS." This table is accompanied by the instruction followed by the annotators, titled "Variables Definitions," allowing for the interpretation of the assigned values. In addition, we provide the raw data (analyzed ToS, in the folder "Clear ToS") and the annotated documents (in the folder "Annotated ToS," further subdivided).
SAMPLE: The sample contains 100 contracts of digital platforms operating in sixteen market sectors: Cloud storage, Communication, Dating, Finance, Food, Gaming, Health, Music, Shopping, Social, Sports, Transportation, Travel, Video, Work, and Various. The selected companies' main headquarters span four legal surroundings: the US, the EU, Poland specifically, and Other jurisdictions. The chosen platforms are both privately held and publicly listed and offer both fee-based and free services. Although the sample cannot be treated as representative of all online platforms, it nevertheless accounts for the most popular consumer services in the analyzed sectors and contains a diverse and heterogeneous set.
CONTENT: Each ToS has been assigned the following information:
1. Metadata: 1.1. the name of the service; 1.2. the URL; 1.3. the effective date; 1.4. the language of ToS; 1.5. the sector; 1.6. the number of words in ToS; 1.7–1.8. the jurisdiction of the main headquarters; 1.9. if the company is public or private; 1.10. if the service is paid or free.
2. Evaluative Variables: remedy clauses (2.1–2.5); dispute resolution clauses (2.6–2.10); unilateral alteration clauses (2.11–2.15); rights to police the behavior of users (2.16–2.17); regulatory requirements (2.18–2.20); and various (2.21–2.25).
3. Count Variables: the number of clauses seen as unclear (3.1) and the number of other documents referred to by the ToS (3.2).
4. Pull-out Text Variables: rights and obligations of the parties (4.1) and descriptions of the service (4.2).
ACKNOWLEDGEMENT: The research leading to these results has received funding from the Norwegian Financial Mechanism 2014-2021, project no. 2020/37/K/HS5/02769, titled “Private Law of Data: Concepts, Practices, Principles & Politics.”
License: https://creativecommons.org/publicdomain/zero/1.0/
By Reddit [source]
This dataset provides an in-depth exploration of the world of online dating, based on data mined from Reddit's Tinder subreddit. Through analysis of the six columns titled title, score, id, url, comms_num and created (which include information such as social norms and user behaviors related to online dating), this dataset can yield valuable insights into how people are engaging with digital media and their attitudes towards it. The data can also help uncover potential dangers, such as safety risks and scams, that can arise from online dating activities. Its findings are relevant for anyone interested in understanding how relationships develop on a digital platform, both for researchers uncovering the sociotechnical aspects of online dating behavior and for companies seeking further insight into their users' perspectives. All in all, this dataset might just hold the missing pieces to understanding our current relationship dynamic!
This dataset provides a comprehensive overview of online dating trends and behaviors observed on Reddit's Tinder subreddit. This data can be used to analyze user opinions, investigate user experiences, and discover online dating trends. To utilize this dataset effectively, there are several steps an individual can take to gain insights from the data:
- Using the dataset to examine how online dating trends vary geographically and by demographics (gender, age, race etc.)
- Analyzing the language used in posts for insights into user attitudes towards online dating.
- Creating a machine learning model to predict a post's score based on its title, body and other features of the data set can help digital media companies better target their marketing efforts towards more successful posts on Tinder subreddits
If you use this dataset in your research, please credit the original authors.
License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.
File: Tinder.csv
| Column name | Description |
|:--------------|:--------------------------------------------------------|
| title | The title of the post. (String) |
| score | The number of upvotes the post has received. (Integer) |
| url | The URL of the post. (String) |
| comms_num | The number of comments the post has received. (Integer) |
| created | The date and time the post was created. (DateTime) |
| body | The body of the post. (String) |
| timestamp | The timestamp of the post. (Integer) |
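A minimal sketch of working with these columns follows. It ranks posts by score using only the Python standard library. The three rows are invented placeholders rather than real Reddit posts, the function name is hypothetical, and the example assumes that score and comms_num parse as integers, as the column descriptions suggest.

```python
import csv
import io

# Hypothetical rows for illustration only; they are NOT real Reddit posts.
SAMPLE = """title,score,url,comms_num,created,body,timestamp
First post,120,https://example.com/a,14,2021-01-01 10:00:00,,1
Second post,45,https://example.com/b,3,2021-01-02 11:30:00,Some text,2
Third post,300,https://example.com/c,52,2021-01-03 09:15:00,,3
"""


def top_posts(csv_text, n=2):
    """Return the n highest-scoring rows as dicts, with numeric fields converted."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    for row in rows:
        row["score"] = int(row["score"])
        row["comms_num"] = int(row["comms_num"])
    return sorted(rows, key=lambda r: r["score"], reverse=True)[:n]


top = top_posts(SAMPLE)
for post in top:
    print(post["score"], post["title"])
```

For the real Tinder.csv, the file's contents would replace SAMPLE. Posts with an empty body column are kept here, since many link or image posts have no text.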
If you use this dataset in your research, please credit Reddit.
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset is a preview of a bigger dataset. My Telegram bot will answer your queries for more data and also allow you to contact me.
When dating apps like Tinder were going viral, people wanted to have the best profile in order to get more matches and more potential encounters. Unlike previous dating platforms, the new ones required mutual attraction before allowing any two people to get in touch and chat. This made it all the more important to create the best profile in order to make the best first impression.
In parallel, we humans have always been in awe of charismatic and inspiring people. The more charismatic people tend to be followed and listened to by more people. Through metrics such as the number of friends or followers, social networks give some ways of "measuring" the potential charisma of some people.
In regard to all that, one can then ask:
- What makes a great user profile?
- How does one make the best first impression in order to get more matches (and ultimately find love, or new friendships)?
- What makes a person charismatic?
- How do charismatic people present themselves?
In order to try to understand those different social questions, I decided to create a dataset of user profile information using the social network Lovoo when it came out. By using different methodologies, I was able to gather user profile data, as well as some usually unavailable metrics (such as the number of profile visits).
The dataset contains user profile information for users of the website Lovoo.
The dataset was gathered during spring 2015 (April, May). At that time, Lovoo was expanding in European countries (among others), while Tinder was trending both in America and in Europe. At that time the iOS version of the Lovoo app was in version 3.
The dataset references pictures (field pictureId) of user profiles. These pictures are also available for a fraction of users but have not been uploaded and must be requested separately.
The idea when gathering the profile pictures was to determine whether some correlations could be identified between a profile picture and the reputation or success of a given profile. Since first impression matters, a sound hypothesis to make is that the profile picture might have a great influence on the number of profile visits, matches and so on. Do not forget that only a fraction of a user's profile is seen when browsing through a list of users.
![App preview of browsing profiles](https://s1.dmcdn.net/v/BnWkG1M7WuJDq2PKP/x480)
In order to gather the data, I developed a set of tools that would save the data while browsing through profiles and doing searches. Because of this approach (and the constraints that forced me to develop it), I could only gather user profiles that were recommended by Lovoo's algorithm for 2 profiles I created for this purpose (male, open to friends & chats & dates). That is why there are only female users in the dataset. Further work could be done to fetch similar data for both genders or other age ranges.
Regarding the number of user profiles: it turned out that the recommendation algorithm always seemed to output the same set of user profiles. This meant Lovoo's algorithm was probably relying heavily on settings like location (to recommend more people nearby than people in different places or countries) and maybe cookies. This reduced the number of different user profiles that would be presented and included in the dataset.
As mentioned in the introduction, there are a lot of questions we can answer using a dataset such as this one. Some questions are related to:
- Popularity and charisma.
- Census and demographic studies.
- Statistics about the interests of people joining dating apps (making friends, finding someone to date, finding true love, ...).
- Detecting influencers or potential influencers and studying them.
Previously mentioned:
- What makes a great user profile?
- How does one make the best first impression in order to get more matches (and ultimately find love, or new friendships)?
- What makes a person charismatic?
- How do charismatic people present themselves?
Other works:
- A starter analysis is available on my data.world account, made using a SQL query. Another file has been created by that means on the dataset page.
- The Kaggle version of the dataset might contain a starter kernel.
fiNd HoT SiNgLeS iN yOuR aReA. Not really: this dataset is anonymous, but you can still explore dating aspects.
OkCupid is a mobile dating app. It sets itself apart from other dating apps by using a precomputed compatibility score, calculated from optional questions that users may choose to answer.
In this dataset, there are 60k records containing structured information such as age, sex, and orientation, as well as text data from open-ended descriptions.
- Lover Recommendation with Unsupervised Learning
- Explore dating profiles and preferences
If you use this dataset in your research, please credit the authors.
Citation
@article{article, author = {Kim, Albert and Escobedo-Land, Adriana}, year = {2015}, month = {07}, pages = {}, title = {OkCupid Data for Introductory Statistics and Data Science Courses}, volume = {23}, journal = {Journal of Statistics Education}, doi = {10.1080/10691898.2015.11889737} }
Notes
License
License was not specified at the source
Splash banner
Photo by Giorgio Trovato on Unsplash
Splash icon
Logo by OkCupid available for download on their website.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset is about book subjects. It has 2 rows and is filtered to the book Poker face : how to win poker at the table and online. It features 10 columns, including number of authors, number of books, earliest publication date, and latest publication date.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Bumble is an online dating application. Profiles of potential matches are displayed to users, who can "swipe left" to reject a candidate or "swipe right" to indicate interest. In heterosexual matches, only female users can make the first contact with matched male users, while in same-sex matches either person can send a message first. The app is a product of Bumble Inc.
Users can sign up using their phone number or Facebook profile, and have options of searching for romantic matches or, in "BFF mode", friends. Bumble Bizz facilitates business communications. Bumble was founded by Whitney Wolfe Herd shortly after she left Tinder, a dating app she says she co-founded, due to growing tensions with other company executives. Wolfe Herd has described Bumble as a "feminist dating app". As of January 2021, with a monthly user base of 42 million, Bumble is the second-most popular dating app in the U.S. after Tinder. According to a June 2016 survey, 46.2% of its users are female. According to Forbes, by 2017 the company was valued at more than $1 billion, and the company reports having over 55 million users in 150 countries as of 2019. [Source: Wikipedia]
This dataset covers the Bumble app available on the Google Play Store. It consists mostly of user reviews and comments.
The content of the various columns is listed below. Please find the description for each column.
| Column Name | Column Description |
|---|---|
| userName | Name of the user |
| userImage | Profile image of the user |
| content | Text of the comment made by the user |
| score | Rating from 1 to 5 |
| thumbsUpCount | Number of thumbs-up received |
| reviewCreatedVersion | App version on which the review was created |
| at | Date and time the review was created |
| replyContent | Company's reply to the comment |
| repliedAt | Date and time of the above reply |
| reviewId | Unique identifier of the review |
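As a minimal sketch of how a table with these columns could be summarized, the snippet below computes the score distribution, the mean score, and the most upvoted review. The sample rows, the timestamp format in `at`, and the helper names are hypothetical and only mirror the column names above; a real export would be read from a file instead of the inline string.

```python
import csv
import io
from collections import Counter

# Hypothetical sample rows that reuse the column names from the table above.
# The values (and the timestamp format) are invented for illustration only.
SAMPLE_CSV = """reviewId,userName,content,score,thumbsUpCount,at
r1,User A,Great app,5,3,2021-01-04 10:00:00
r2,User B,Too many fake profiles,1,12,2021-01-05 11:30:00
r3,User C,Okay overall,3,0,2021-01-06 09:15:00
r4,User D,Keeps crashing,1,7,2021-01-07 18:45:00
"""


def score_distribution(rows):
    """Count how many reviews gave each star score."""
    return Counter(int(row["score"]) for row in rows)


def mean_score(rows):
    """Average star rating across all reviews."""
    scores = [int(row["score"]) for row in rows]
    return sum(scores) / len(scores)


def most_helpful(rows, n=1):
    """Return the n reviews with the highest thumbsUpCount."""
    return sorted(rows, key=lambda r: int(r["thumbsUpCount"]), reverse=True)[:n]


# csv.DictReader yields every field as a string, hence the int() conversions above.
rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))

print(score_distribution(rows))
print(mean_score(rows))
print(most_helpful(rows)[0]["content"])
```

Swapping `io.StringIO(SAMPLE_CSV)` for an open file handle should be enough to run the same summary on an actual download, provided the column names match.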
Banner image - Bumble
https://creativecommons.org/publicdomain/zero/1.0/
Description:
The "Daily Social Media Active Users" dataset provides a comprehensive and dynamic look into the digital presence and activity of global users across major social media platforms. The data was generated to simulate real-world usage patterns for popular platforms (13 according to the description, although 14 are named): Facebook, YouTube, WhatsApp, Instagram, WeChat, TikTok, Telegram, Snapchat, X (formerly Twitter), Pinterest, Reddit, Threads, LinkedIn, and Quora. This dataset contains 10,000 rows and includes several key fields that offer insights into user demographics, engagement, and usage habits.
Dataset Breakdown:
Platform: The name of the social media platform where the user activity is tracked. It includes globally recognized platforms, such as Facebook, YouTube, and TikTok, that are known for their large, active user bases.
Owner: The company or entity that owns and operates the platform. Examples include Meta for Facebook, Instagram, and WhatsApp, Google for YouTube, and ByteDance for TikTok.
Primary Usage: This category identifies the primary function of each platform. Social media platforms differ in their primary usage, whether it's for social networking, messaging, multimedia sharing, professional networking, or more.
Country: The geographical region where the user is located. The dataset simulates global coverage, showcasing users from diverse locations and regions. It helps in understanding how user behavior varies across different countries.
Daily Time Spent (min): This field tracks how much time a user spends on a given platform on a daily basis, expressed in minutes. Time spent data is critical for understanding user engagement levels and the popularity of specific platforms.
Verified Account: Indicates whether the user has a verified account. This feature mimics real-world patterns where verified users (often public figures, businesses, or influencers) have enhanced status on social media platforms.
Date Joined: The date when the user registered or started using the platform. This data simulates user account history and can provide insights into user retention trends or platform growth over time.
Context and Use Cases:
Researchers, data scientists, and developers can use this dataset to:
Model User Behavior: By analyzing patterns in daily time spent, verified status, and country of origin, users can model and predict social media engagement behavior.
Test Analytics Tools: Social media monitoring and analytics platforms can use this dataset to simulate user activity and optimize their tools for engagement tracking, reporting, and visualization.
Train Machine Learning Algorithms: The dataset can be used to train models for various tasks like user segmentation, recommendation systems, or churn prediction based on engagement metrics.
Create Dashboards: This dataset can serve as the foundation for creating user-friendly dashboards that visualize user trends, platform comparisons, and engagement patterns across the globe.
Conduct Market Research: Business intelligence teams can use the data to understand how various demographics use social media, offering valuable insights into the most engaged regions, platform preferences, and usage behaviors.
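One way to start on the use cases above is a simple grouped average of daily time spent. This is a minimal sketch: the records are invented, the "Yes"/"No" encoding of Verified Account is an assumption, and the helper names are hypothetical. Only the field names come from the dataset breakdown.

```python
from collections import defaultdict

# Hypothetical records that mirror the field names from the dataset breakdown.
# All values, and the "Yes"/"No" encoding of Verified Account, are assumptions.
records = [
    {"Platform": "Facebook", "Country": "India", "Daily Time Spent (min)": 40, "Verified Account": "No"},
    {"Platform": "Facebook", "Country": "Brazil", "Daily Time Spent (min)": 60, "Verified Account": "Yes"},
    {"Platform": "TikTok", "Country": "India", "Daily Time Spent (min)": 90, "Verified Account": "No"},
    {"Platform": "TikTok", "Country": "Japan", "Daily Time Spent (min)": 70, "Verified Account": "No"},
]


def mean_minutes_by(records, key):
    """Average 'Daily Time Spent (min)' for each distinct value of `key`."""
    totals = defaultdict(lambda: [0, 0])  # group -> [sum of minutes, row count]
    for rec in records:
        bucket = totals[rec[key]]
        bucket[0] += rec["Daily Time Spent (min)"]
        bucket[1] += 1
    return {group: total / count for group, (total, count) in totals.items()}


def verified_share(records):
    """Fraction of rows whose Verified Account field equals 'Yes'."""
    return sum(rec["Verified Account"] == "Yes" for rec in records) / len(records)


print(mean_minutes_by(records, "Platform"))
print(mean_minutes_by(records, "Country"))
print(verified_share(records))
```

The same helper groups by Platform or Country, so platform comparisons and regional comparisons can share one code path.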
Sources of Inspiration: This dataset is inspired by public data from industry reports, such as those from Statista, DataReportal, and other market research platforms. These sources provide insights into the global user base and usage statistics of popular social media platforms. The synthetic nature of this dataset allows for the use of realistic engagement metrics without violating any privacy concerns, making it an ideal tool for educational, analytical, and research purposes.
The structure and design of the dataset are based on real-world usage patterns and aim to represent a variety of users from different backgrounds, countries, and activity levels. This diversity makes it an ideal candidate for testing data-driven solutions and exploring social media trends.
Future Considerations:
As the social media landscape continues to evolve, this dataset can be updated or extended to include new platforms, engagement metrics, or user behaviors. Future iterations may incorporate features like post frequency, follower counts, engagement rates (likes, comments, shares), or even sentiment analysis from user-generated content.
By leveraging this dataset, analysts and data scientists can create better, more effective strategies ...
Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Dataset: Online Shopping Dataset

| Column Name | Column Description | Data Type |
|---|---|---|
| CustomerID | Unique identifier for each customer | Numeric |
| Gender | Gender of the customer (e.g., Male, Female) | Categorical |
| Location | Location or address information of the customer | Text |
| Tenure_Months | Number of months the customer has been associated with the platform | Numeric |
| Transaction_ID | Unique identifier for each transaction | Numeric |
| Transaction_Date | Date of the transaction | Date |
| Product_SKU | Stock Keeping Unit (SKU) identifier for the product | Text |
| Product_Description | Description of the product | Text |
| Product_Category | Category to which the product belongs | Categorical |
| Quantity | Quantity of the product purchased in the transaction | Numeric |
| Avg_Price | Average price of the product | Numeric |
| Delivery_Charges | Charges associated with the delivery of the product | Numeric |
| Coupon_Status | Status of the coupon associated with the transaction | Categorical |
| GST | Goods and Services Tax associated with the transaction | Numeric |
| Date | Date of the transaction (potentially redundant with Transaction_Date) | Date |
| Offline_Spend | Amount spent offline by the customer | Numeric |
| Online_Spend | Amount spent online by the customer | Numeric |
| Month | Month of the transaction | Categorical |
| Coupon_Code | Code associated with a coupon, if applicable | Text |
| Discount_pct | Percentage of discount applied to the transaction | Numeric |
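The fields above suggest how a per-transaction amount might be derived, although the dataset description does not define a formula. The sketch below is an assumption throughout: it treats GST as a fractional rate (e.g., 0.18), applies Discount_pct only when Coupon_Status equals "Used", and adds Delivery_Charges after tax. The function name `line_total` and the sample row are hypothetical.

```python
def line_total(row):
    """Estimate the amount charged for one transaction row.

    Assumed formula (not confirmed by the dataset description):
    Quantity * Avg_Price, reduced by Discount_pct when the coupon was used,
    increased by GST as a fractional rate, plus Delivery_Charges.
    """
    gross = row["Quantity"] * row["Avg_Price"]
    # Assumption: "Used" is the Coupon_Status value that means the discount applied.
    discount = row["Discount_pct"] / 100 if row["Coupon_Status"] == "Used" else 0.0
    net = gross * (1 - discount)
    return net * (1 + row["GST"]) + row["Delivery_Charges"]


# Hypothetical row using the field names listed above; values are invented.
example = {
    "Quantity": 2,
    "Avg_Price": 100.0,
    "Discount_pct": 10,
    "Coupon_Status": "Used",
    "GST": 0.18,
    "Delivery_Charges": 6.5,
}

print(round(line_total(example), 2))
```

If GST is stored as a percentage or as an absolute amount in the actual file, the tax step would need to change accordingly, so it is worth checking a few rows by hand before relying on the result.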
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Plenty of Fish is a Canadian online dating service, popular primarily in Canada, the United Kingdom, the Republic of Ireland, Australia, New Zealand, Spain, Brazil, and the United States. It is available in nine languages. The company, which is based in Vancouver, British Columbia, generates revenue through advertising and premium memberships. While it is free to use, Plenty of Fish offers premium services as part of its upgraded membership, such as allowing users to see who has "liked" a member through the service's MeetMe feature and whether a message has been read and/or deleted. [Source: Wikipedia]
This dataset covers the Plenty of Fish app available on the Google Play Store. It consists mostly of user reviews and comments.
The content of the various columns is listed below. Please find the description for each column.
| Column Name | Column Description |
|---|---|
| userName | Name of the user |
| userImage | Profile image of the user |
| content | Text of the comment made by the user |
| score | Rating from 1 to 5 |
| thumbsUpCount | Number of thumbs-up received |
| reviewCreatedVersion | App version on which the review was created |
| at | Date and time the review was created |
| replyContent | Company's reply to the comment |
| repliedAt | Date and time of the above reply |
| reviewId | Unique identifier of the review |
Banner image - Plenty of Fish Website