MyDigitalFootprint (MDF) is a novel large-scale dataset composed of smartphone embedded sensor data, physical proximity information, and Online Social Network interactions, aimed at supporting multimodal context recognition and social relationship modelling in mobile environments. The dataset includes two months of measurements and information collected from the personal mobile devices of 31 volunteer users following an in-the-wild data collection approach: the data was collected in the users' natural environment, without limiting their usual behaviour. Existing public datasets generally consist of a limited set of context data, aimed at optimising specific application domains (human activity recognition is the most common example). In contrast, MDF contains a comprehensive set of information describing the user context in the mobile environment.
The complete analysis of the data contained in MDF has been presented in the following publication:
https://www.sciencedirect.com/science/article/abs/pii/S1574119220301383?via%3Dihub
The full anonymised dataset is contained in the folder MDF. Moreover, to demonstrate the efficacy of MDF, there are three proof-of-concept context-aware applications based on different machine learning tasks:
For the sake of reproducibility, the data used to evaluate the proof-of-concept applications are contained in the folders link-prediction, context-recognition, and cars, respectively.
Digital Footprint Statistics: A digital footprint is the trail of information people leave behind when using the internet. It includes everything from social media posts to online searches, websites visited, and emails sent. Some of this data is shared intentionally, like posting on Facebook, while other parts are collected automatically, like tracking cookies from websites.
A digital footprint can be active, meaning data is shared by choice, or passive, meaning it is collected without you realizing it. It's important to manage your digital footprint because it can affect your privacy, reputation, and even job opportunities in the future. Understanding it helps you stay safe online.
The digital era allows us to send messages and e-mails or share pictures at the touch of a button, but our online habits have a surprising impact on the environment. In 2021, the environmental impact of the Internet became an important subject for most French people. The source therefore asked ***** respondents what measures they take to reduce digital footprint emissions. The results show that between ** and ** percent of people regularly delete their e-mails and always close applications and software programs after use.
Open Government Licence - Canada 2.0 https://open.canada.ca/en/open-government-licence-canada
License information was derived automatically
Your organization uses the Internet to carry out business activities, provide employees with remote work capabilities, and offer services to clients. As your employees and partners carry out activities on different online platforms and applications, consider the digital footprint they leave behind. Digital footprints contain sensitive information that is valuable to cyber threat actors. Through the use of tracking and monitoring techniques, threat actors can access and exfiltrate this sensitive information, jeopardizing its confidentiality and security.
According to a study conducted in the United Kingdom in 2022, internet users post an average of ***** online photos in their lifetime, and ****** social media posts. Additionally, the average internet user leaves behind *** email addresses in their online footprint.
As of August 2019, around **** percent of Taiwanese respondents stated that they knew the concept of a digital footprint rather well. The average number of digital devices owned by internet users in Taiwan increased drastically over the past two years.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
In order to analyse university students' activity patterns, non-personal student user data (15,488 users) was enriched with the Teams user activity report for the 30 days since going remote during the coronavirus pandemic.
Data fields description:
Campus location (0, 1, 2, 3)
Form of education (On campus, Part-Time, Extramural)
Degree pursued (Bachelor, Master, Diplom, PhD)
Field of study Code
ISCED Code (1, 2, 3, 4, 5)
Year of study (1-6)
Status (1 - Active, 0 - Sabbatical)
Sex (M, F, N/A)
Microsoft license type (Student, Alumni, Faculty)
Assigned products (Office 365, Exchange, Flow, Stream)
Microsoft Teams user activity report in the last 30 days (see Reference):
    ChannelMessages
    ReplyMessages
    PostMessages
    ChatMessages
    MeetingsOrganized
    MeetingsParticipated
    1:1 Calls
    GroupCalls
    AudioTime (Minutes)
    VideoTime (Minutes)
    ScreenShareTime (Minutes)
Time since last activity (d hh:mm:ss)
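The "Time since last activity" field is described as `d hh:mm:ss`. A minimal sketch of converting such a value into a duration is shown here. It assumes the days and the clock part are separated by whitespace (for example `3 04:05:06`), which the description suggests but does not confirm; the function name `parse_since_last_activity` is hypothetical.

```python
from datetime import timedelta


def parse_since_last_activity(value: str) -> timedelta:
    """Parse a 'd hh:mm:ss' string into a timedelta.

    Assumption: days and clock time are separated by whitespace,
    e.g. '3 04:05:06'. Adjust the split if the real export differs.
    """
    days_part, clock_part = value.split()
    hours, minutes, seconds = (int(part) for part in clock_part.split(":"))
    return timedelta(days=int(days_part), hours=hours,
                     minutes=minutes, seconds=seconds)


# Example with a made-up value (not taken from the dataset).
elapsed = parse_since_last_activity("3 04:05:06")
print(elapsed.total_seconds())
```

Working with `timedelta` values makes it straightforward to compare or bin users by inactivity without hand-written arithmetic.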
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Much of the world’s data are stored, managed, and distributed by data centers. Data centers require a tremendous amount of energy to operate, accounting for around 1.8% of electricity use in the United States. Large amounts of water are also required to operate data centers, both directly for liquid cooling and indirectly to produce electricity. For the first time, we calculate spatially detailed carbon and water footprints of data centers operating within the United States, which is home to around one-quarter of all data center servers globally. Our bottom-up approach reveals that one-fifth of data center servers' direct water footprint comes from moderately to highly water-stressed watersheds, while nearly half of servers are fully or partially powered by power plants located within water-stressed regions. Approximately 0.5% of total US greenhouse gas emissions are attributed to data centers. We investigate tradeoffs and synergies between data centers' water and energy utilization by strategically locating data centers in areas of the country that will minimize one or more environmental footprints. Our study quantifies the environmental implications of our data creation and storage and shows a path to decrease the environmental footprint of our increasing digital footprint.
As of February 2025, 5.56 billion individuals worldwide were internet users, which amounted to 67.9 percent of the global population. Of this total, 5.24 billion, or 63.9 percent of the world's population, were social media users.
Global internet usage
Connecting billions of people worldwide, the internet is a core pillar of the modern information society. Northern Europe ranked first among worldwide regions by the share of the population using the internet in 2025. In the Netherlands, Norway and Saudi Arabia, 99 percent of the population used the internet as of February 2025. North Korea was at the opposite end of the spectrum, with virtually no internet usage penetration among the general population, ranking last worldwide. Eastern Asia was home to the largest number of online users worldwide, over 1.34 billion at the latest count. Southern Asia ranked second, with around 1.2 billion internet users. China, India, and the United States rank ahead of other countries worldwide by the number of internet users.
Worldwide internet user demographics
As of 2024, the share of female internet users worldwide was 65 percent, five percent less than that of men. Gender disparity in internet usage was bigger in African countries, with around a ten percent difference. Worldwide regions such as the Commonwealth of Independent States and Europe showed a smaller usage gap between these two genders. As of 2024, global internet usage was higher among individuals between 15 and 24 years old across all regions, with young people in Europe representing the most significant usage penetration, 98 percent. In comparison, the worldwide average for the age group 15–24 years was 79 percent. The income level of the countries was also an essential factor for internet access, as 93 percent of the population of high-income countries reportedly used the internet, as opposed to only 27 percent in low-income markets.
https://www.reddit.com/wiki/api
Each row contains an anonymized reddit user's MBTI personality type. Each column represents how much a user posts or comments in a particular subreddit. Specifically, 'posts_examplesubreddit' refers to how many of the user's top 100 posts of all time are in 'r/examplesubreddit', and 'comments_examplesubreddit' refers to how many of the user's most recent 100 comments are in 'r/examplesubreddit'.
This data was obtained using PRAW (the Python Reddit API Wrapper) to scrape a list of reddit users who comment on the r/mbti subreddit along with their self-identified MBTI type (as shown in their flair). Then, for each user whose MBTI type is known, we go through their top 100 posts and newest 100 comments to record how often they interact with various subreddits, thus creating a user-footprint matrix.
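The construction of the user-footprint matrix described above can be sketched as follows. This is an illustrative reconstruction, not the author's scraping code: it assumes the subreddit names of each user's posts and comments have already been fetched into plain lists, and the helper names `footprint_row` and `footprint_matrix` are hypothetical.

```python
from collections import Counter


def footprint_row(top_posts, recent_comments, limit=100):
    """Count subreddit occurrences among a user's posts and comments."""
    row = Counter()
    for subreddit in top_posts[:limit]:
        row[f"posts_{subreddit}"] += 1
    for subreddit in recent_comments[:limit]:
        row[f"comments_{subreddit}"] += 1
    return row


def footprint_matrix(users):
    """Build a dense count matrix.

    users: dict mapping user id -> (top_posts, recent_comments),
    where both values are lists of subreddit names.
    Returns (user ids, column names, rows of counts).
    """
    rows = {uid: footprint_row(posts, comments)
            for uid, (posts, comments) in users.items()}
    columns = sorted({col for row in rows.values() for col in row})
    # A Counter returns 0 for missing keys, so absent subreddits become 0.
    matrix = [[rows[uid][col] for col in columns] for uid in rows]
    return list(rows), columns, matrix


# Toy input with made-up users; real data would come from the Reddit API.
users = {
    "u1": (["mbti", "mbti", "python"], ["mbti", "askreddit"]),
    "u2": (["askreddit"], ["python", "python"]),
}
ids, columns, matrix = footprint_matrix(users)
```

Each row of `matrix` lines up with one entry of `ids`, and each column with one `posts_…` or `comments_…` name, mirroring the layout of the dataset.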
The purpose of this data set is to see how well MBTI personality types (or even just specific traits i.e. extraversion vs. introversion) can be predicted on the basis of a user's subreddit interactions.
You will almost certainly need to perform some kind of dimensionality reduction in order to develop an effective classification model.
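One common option for this step is principal component analysis (PCA). The sketch below is a minimal NumPy version applied to a synthetic count matrix standing in for the real user-footprint data. The matrix shape, the Poisson-generated counts, and the choice of 10 components are all assumptions for illustration only, and other reduction methods may suit sparse count data better.

```python
import numpy as np


def pca_reduce(X, n_components):
    """Project rows of X onto the top principal components via SVD.

    Returns the reduced scores and the fraction of variance
    explained by each retained component.
    """
    X = np.asarray(X, dtype=float)
    X_centered = X - X.mean(axis=0)
    U, S, _ = np.linalg.svd(X_centered, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]
    explained = (S ** 2) / np.sum(S ** 2)
    return scores, explained[:n_components]


# Synthetic stand-in: 50 users x 200 subreddit-count columns.
rng = np.random.default_rng(0)
X = rng.poisson(lam=1.0, size=(50, 200))

scores, ratio = pca_reduce(X, n_components=10)
print(scores.shape)
```

The reduced `scores` can then be passed to any classifier in place of the original wide matrix.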
The MBTI personality test is controversial and some consider it illegitimate. However, both extraversion/introversion and sensing/intuition correlate strongly with extraversion and openness as measured in the much more widely accepted Big Five model of personality. As such, it might be best to focus efforts on attempting to classify these traits based on the data provided.
With a growing worry about human impact on the environment, internet technologies have become a potential solution to reduce the footprint of physical technologies. However, according to the source, which assessed the impact of the fabrication and usage of digital and internet technologies on energy depletion, greenhouse gas emissions, water and other resources, internet users accounted for ** percent of the total energy consumption of digital technology in France in 2020. Furthermore, the users were also responsible for ** percent of water consumption of digital technology production and usage. Networks accounted for ** percent of the depletion of abiotic resources.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Result datasets for "Assessing accuracy improvement of integrating digital footprints into gridded population mapping: spatiotemporal variations and data bias":
At some point in the future I am going to die. When this happens, I can donate my body to science but I’m currently unable to donate my data or even my metadata to research. I will present a scenario where an end of life service exists for people to donate their data. Over the next three months I will examine the relationship that members of the public have with the concept of digital legacy and their willingness to want to donate their data. I will briefly outline the concept of an end of life data donation service and explore the feasibility of such a service, including the readiness and willingness of data holding organisations to supply this data. This study draws on existing literature (Bellamy, 2013) and examines the broader implications of archiving your personal digital footprint; for example, can my digital self commit post-mortem crime?
U.S. Government Works https://www.usa.gov/government-works
License information was derived automatically
Polygon shapefile showing the footprint boundaries, source agency origins, and resolutions of compiled bathymetric digital elevation models (DEMs) used to construct a continuous, high-resolution DEM of the southern portion of San Francisco Bay.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The prevalence of mental distress among young adults, including those at university, has increased. In this context, learning analytics, students’ digital trace data, are increasingly being used to understand student mental health. In line with calls for more research on learning analytics from student perspectives, as part of a broader focus group study, 44 undergraduate students from three United Kingdom universities were invited to consider how they felt about having a digital footprint on their virtual learning environment (VLE). Two main themes were constructed using reflexive thematic analysis. First, students’ responses depended on the perceived threat to their privacy and identity. Some students were indifferent if no threat was perceived, but expressed unease if there was. Second, some students expressed personal preference for autonomy over use of their VLE data. Two uses identified were for non-judgmental personalized support, and using aggregated data to improve student learning. These themes suggest how the use of educational digital data can, under some circumstances, impact wellbeing negatively. The students’ perspectives garnered from the focus groups could have implications for policy and practice concerning privacy and surveillance, the possibility for misuse or misinterpretation of data, and informed consent. This small study supports the importance of partnering with students to develop and implement guidance for how VLE learning analytics data are used and interpreted by students and staff, including lecturers, to protect and enhance student mental wellbeing.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Digital footprints represent a valuable source of information for the analysis of tourist behavior. In this context, the dataset titled “Digital Footprints of Tourism in the Department of Sucre, Colombia: Dataset Based on Foursquare Reviews and Profiles” compiles user-generated content about their tourist experiences in the region. Data collection was carried out through the official Foursquare API in three phases: extraction of tourist sites in the 26 municipalities of the department, retrieval of user profiles, and collection of reviews. The final dataset includes information on 66 tourist sites, 355 unique users, and a total of 7,989 reviews. To ensure data quality, a manual validation was conducted with the support of 36 students from the Engineering program at the Universidad Tecnológica de Bolívar, who verified geographic accuracy, removed duplicates, and ensured data consistency. This dataset enables the analysis of tourist behavior, the implementation of natural language processing techniques, and the development of tools aimed at smart tourism. It thus constitutes a valuable resource for researchers, public policy makers, and stakeholders involved in the planning of data-driven sustainable tourism.
Introducing a comprehensive and meticulously curated dataset: "European Interest Groups' Social Media Engagement Dataset." This dataset offers a panoramic view of the digital footprint and social media presence of various interest groups within Europe, encompassing a diverse range of platforms including Twitter, Facebook, Instagram, TikTok, and YouTube. These are the variables:
With a focus on transparency and relevance, this dataset presents a wealth of information that delves into the strategies, content, and reach of interest groups across these dynamic online platforms. Researchers, policymakers, and analysts can explore trends, patterns, and correlations between online activities and real-world influence, shedding light on the evolving landscape of digital interaction within the realm of European interest groups.
Weebly Statistics: Weebly is a simple website builder designed for individuals and small businesses with limited technical skills. It has been especially helpful for users who want to create a website without needing extensive programming knowledge. Since its acquisition in 2018 by Square, which later rebranded as Block, Inc., Weebly has expanded its offerings by integrating e-commerce solutions and a variety of tools tailored for aspiring entrepreneurs. As of 2024, Weebly has millions of users globally, establishing itself as a significant player in the digital space. Its ease of use and affordable pricing options have made it a popular choice for those looking to establish an online presence quickly. The platform continues to innovate with features that cater to both beginners and those looking to expand their digital footprint. Weebly’s emphasis on simplicity and versatility ensures it remains a go-to option for small businesses and individual users looking to build a professional website without complicated coding.
In 2023, India had over 1.2 billion internet users. This figure was projected to grow to over 1.6 billion users by 2050, indicating large market potential for internet services in the South Asian country. In fact, India was ranked as the second largest online market worldwide in 2022, behind only China. The number of internet users was estimated to increase in both urban and rural regions, indicating dynamic growth in access to the internet.
Mobile connectivity
Of the total internet users in the country, a majority access the internet via their mobile phones. There were nearly as many smartphone users as internet users across the country. Cheap mobile data, a growing smartphone user base, and the utility value of smartphones compared to desktops and tablets are some of the factors contributing to the mobile-heavy internet access in India.
Growth is on the cards
Despite the large number of internet users in the country, internet penetration has been slower to spread evenly across the population. The number of women who have access to the internet is much lower than the number of men, and the gap is even more evident in rural India. Similarly, internet usage is lower among older adults in the country due to limited internet literacy and technological know-how. By encouraging internet accessibility among marginalized groups including women, older people and rural inhabitants, India’s digital footprint has significant headroom to grow.