100+ datasets found

Time Spent with Relationships by Age - USA
kaggle.com
zip
Updated Nov 18, 2022
Cite
Niccole Martinez (2022). Time Spent with Relationships by Age - USA [Dataset]. https://www.kaggle.com/datasets/niccolem/time-spent-with-relationships-by-age-usa
Available download formats: zip (2705 bytes)
Dataset updated
Nov 18, 2022
Authors
Niccole Martinez
Area covered
United States
Description
From adolescence to old age: who do we spend our time with?

To understand how social connections evolve throughout our lives, we can look at survey data on how much time people spend with others and who that time is spent with.

This dataset shows the amount of time people in the US report spending in the company of others, based on their age. The data comes from time-use surveys, where people are asked to list all the activities they perform over a full day and the people who were present during each activity. Currently, there is only data with this granularity for the US – time-use surveys are common across many countries, but what is special about the US is that respondents of the American Time Use Survey are asked to list everyone present for each activity.

The numbers in this chart are based on averages for a cross-section of US society – people are only interviewed once, but the dataset represents a decade of surveys, tabulating the average amount of time survey respondents of different ages report spending with other people.

Source

https://ourworldindata.org/time-with-others-lifetime by Esteban Ortiz-Ospina December 11, 2020
American Time Use Survey: Daily Activities
kaggle.com
zip
Updated Dec 12, 2023
Cite
The Devastator (2023). American Time Use Survey: Daily Activities [Dataset]. https://www.kaggle.com/datasets/thedevastator/american-time-use-survey-daily-activities
Available download formats: zip (17763 bytes)
Dataset updated
Dec 12, 2023
Authors
The Devastator
Description
American Time Use Survey: Daily Activities

Americans' Daily Activities: Education, Employment, Gender, and Leisure Time

By Throwback Thursday [source]

About this dataset

The American Time Use Survey dataset provides comprehensive information on how individuals in America allocate their time throughout the day. It includes demographic and economic attributes such as education level, age, employment status, gender, number of children, weekly earnings and hours worked. The dataset also includes data on specific activities individuals engage in, like sleeping, grooming, housework, food and drink preparation, caring for children, playing with children, job searching, shopping, and eating and drinking. Additionally, it captures time spent on leisure activities like socializing and relaxing, as well as specific hobbies such as watching television or golfing. The dataset also records the amount of time spent volunteering or running for exercise.

Each entry is organized by categorical variables such as education level (ranging from lower levels to higher degrees), age (captured in age brackets), employment status (including employed full-time or part-time), gender (male or female) and the number of children an individual has. Furthermore, it provides information on an individual's weekly earnings and hours worked.

This extensive dataset aims to provide insights into how Americans prioritize their time across various aspects of their lives. Whether the focus is on work-related tasks or recreational activities, it offers a comprehensive look at the allocation of time among different demographic groups within American society.

This dataset can be used to understand trends in daily activity patterns across demographic groups over multiple years without directly referencing specific dates.

How to use the dataset

How to use this dataset: American Time Use Survey - Daily Activities

Welcome to the American Time Use Survey dataset! This dataset provides valuable information on how Americans spend their time on a daily basis. Here's a guide on how to effectively utilize this dataset for your analysis:

Familiarize yourself with the columns:

Education Level: The level of education attained by the individual.

Age: The age of the individual.

Age Range: The age range the individual falls into.

Employment Status: The employment status of the individual.

Gender: The gender of the individual.

Children: The number of children that an individual has.

Weekly Earnings: The amount of money earned by an individual on a weekly basis.

Year: The year in which the data was collected.

Weekly Hours Worked: The number of hours worked by an individual on a weekly basis.

Identify variables related to daily activities: This dataset provides information about various daily activities undertaken by individuals. Some important variables related to daily activities include:

Sleeping

Grooming

Housework

Food & Drink Prep

Caring for Children

Playing with Children

Job Searching …and many more!

Analyze time spent on different activities: This dataset includes numerical values representing time spent in minutes for specific activities such as sleeping, grooming, housework, food and drink preparation, etc. You can use this data to analyze and compare how different groups of individuals allocate their time throughout the day.

Explore demographic factors: In addition to daily activities, this dataset also includes columns such as education level, age range, employment status, gender, and number of children. You can cross-reference these demographic factors with activity data to gain insights into how different population subgroups spend their time differently.

Identify trends and patterns: You can use this dataset to identify trends and patterns in how Americans allocate their time over the years. By analyzing data from different years, you may discover changes in certain activities and how they relate to demographic factors or societal shifts.

Visualize the data: Creating visualizations such as bar graphs, line plots, or pie charts can provide a clear representation of how time is allocated for different activities among various groups of individuals. Visualizations help in understanding the distribution of time spent on different activities and identifying any significant differences or similarities across demographics.

Remember that each column represents a specific variable, whi...
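
The cross-referencing of demographic columns with activity columns described in the guide above can be sketched as follows. This is a minimal illustration using only the Python standard library. The header names are taken from the column list in the description, but the exact headers in the real file, the category labels, and all sample values are assumptions.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Hypothetical sample rows; the real file's header names and values may differ.
SAMPLE_CSV = """Gender,Employment Status,Sleeping,Housework
Female,Employed,480,60
Male,Employed,450,30
Female,Unemployed,540,90
Male,Unemployed,510,45
"""


def mean_minutes_by_group(rows, group_col, activity_col):
    """Average the minutes in activity_col for each value of group_col."""
    buckets = defaultdict(list)
    for row in rows:
        buckets[row[group_col]].append(float(row[activity_col]))
    return {group: mean(values) for group, values in buckets.items()}


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
sleep_by_gender = mean_minutes_by_group(rows, "Gender", "Sleeping")
housework_by_status = mean_minutes_by_group(rows, "Employment Status", "Housework")
print(sleep_by_gender)
print(housework_by_status)
```

To run this against the downloaded file, replace the in-memory sample with `open(...)` on the CSV path and adjust the column names to match its header row.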
How Does Daily Yoga Impact Screen Time Habits
kaggle.com
zip
Updated Dec 14, 2022
Cite
The Devastator (2022). How Does Daily Yoga Impact Screen Time Habits [Dataset]. https://www.kaggle.com/datasets/thedevastator/how-does-daily-yoga-impact-screen-time-habits
Available download formats: zip (742 bytes)
Dataset updated
Dec 14, 2022
Authors
The Devastator
Description
How Does Daily Yoga Impact Screen Time Habits

A Study of Daily Screen Time Behavior

By Taylor L Bailey [source]

About this dataset

This dataset contains data on daily minutes of screen time between April 17th and May 14th. With this dataset, you can gain insights into daily phone usage habits and determine the effect that regular yoga practice has on reducing phone use. By recording the amount of time spent using different types of apps -- such as social media, reading, productivity and entertainment -- you can understand how phone habits have changed over time. Moreover, this dataset captures my attempt to do at least 10 minutes of yoga every day for a period of 15 days from April 29th to May 13th. Did this experiment successfully reduce my screen time overall? Dive in deep and find out!

How to use the dataset

How to use this dataset

This dataset contains information on daily minutes of screen time habits, categorized by type of usage, as well as the effect of yoga on those habits. This is useful for gaining insights into an individual's screen time habits and its variability with respect to doing yoga.

To start with, there are a few key columns to check out: Date (to keep track of the days in view), Week Day (to identify which day it is precisely), Social Networking/Reading and Reference/Other/Productivity/Health and Fitness (to determine how much time was spent in each category) and Yoga (whether or not any yoga was done that day).

You may find it helpful to analyze the daily data over a certain duration by creating separate datasets grouped by weeks or months. Additionally, tallying total minutes per week or per month can show changes over longer periods. As you will notice right away when viewing this dataset, consistency is important: if someone tracked their smartphone use only twice in a month, or skipped days without recording any reference points beforehand, it would be difficult to draw conclusions from the experiment. It would be especially informative to track additional factors such as sleep hygiene, the evolution of the practice (for example, more advanced yoga sequences tried over time), and different approaches to making screens off-limits during mealtimes. All of these could bring interesting insight into our relationship with technology when looking at screen time fluctuations before and after such practices become part of our daily routine.

Research Ideas

Track the impact of daily yoga on overall and category-specific screen time.

Explore the relationship between day of the week and overall or category-specific screen time.

Investigate how long it takes to establish a healthy habit, such as decreased phone usage, by looking at changes in average daily screen time over a period of one or two months before and after beginning yoga practice, adjusting for day-of-week effects.
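
The first research idea, comparing screen time on yoga and non-yoga days, can be sketched as below. The category names follow the column description for this dataset. The sample values are invented, and the 0/1 coding of the Yoga column is an assumption about the file.

```python
from statistics import mean

# Category columns named in the dataset description.
CATEGORIES = [
    "Social Networking",
    "Reading and Reference",
    "Other",
    "Productivity",
    "Health and Fitness",
]

# Hypothetical daily records (minutes per category); "Yoga" coded as 0/1 by assumption.
days = [
    {"Social Networking": 60, "Reading and Reference": 20, "Other": 10,
     "Productivity": 15, "Health and Fitness": 5, "Yoga": 0},
    {"Social Networking": 80, "Reading and Reference": 10, "Other": 10,
     "Productivity": 15, "Health and Fitness": 5, "Yoga": 0},
    {"Social Networking": 40, "Reading and Reference": 20, "Other": 10,
     "Productivity": 10, "Health and Fitness": 10, "Yoga": 1},
    {"Social Networking": 50, "Reading and Reference": 15, "Other": 5,
     "Productivity": 20, "Health and Fitness": 10, "Yoga": 1},
]


def total_minutes(day):
    """Sum screen time across all usage categories for one day."""
    return sum(day[category] for category in CATEGORIES)


def mean_total_by_yoga(records):
    """Return (mean total on yoga days, mean total on non-yoga days)."""
    yoga = [total_minutes(d) for d in records if d["Yoga"] == 1]
    no_yoga = [total_minutes(d) for d in records if d["Yoga"] == 0]
    return mean(yoga), mean(no_yoga)


yoga_mean, no_yoga_mean = mean_total_by_yoga(days)
difference = yoga_mean - no_yoga_mean
print(f"yoga days: {yoga_mean}, other days: {no_yoga_mean}, difference: {difference}")
```

With a single person and a few weeks of data, a difference like this is descriptive only. Day-of-week effects, as the research ideas note, would need to be accounted for before reading it as an effect of yoga.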

Acknowledgements

If you use this dataset in your research, please credit the original authors. Data Source

License

License: Dataset copyright by authors

You are free to:
- Share - copy and redistribute the material in any medium or format for any purpose, even commercially.
- Adapt - remix, transform, and build upon the material for any purpose, even commercially.

You must:
- Give appropriate credit - Provide a link to the license, and indicate if changes were made.
- ShareAlike - You must distribute your contributions under the same license as the original.
- Keep intact - all notices that refer to this license, including copyright notices.

Columns

File: Screen Time Data.csv

| Column name | Description |
|:--------------------------|:------------------------------------------------------------------------------------------|
| Date | The date of the data entry. (Date) |
| Week Day | The day of the week of the data entry. (String) |
| Social Networking | The amount of time spent on social networking. (Integer) |
| Reading and Reference | The amount of time spent on reading and reference activities. (Integer) |
| Other ...

Average daily time spent on social media worldwide 2012-2024

statista.com
de.statista.com

Cite

Stacy Jo Dixon, Average daily time spent on social media worldwide 2012-2024 [Dataset]. https://www.statista.com/topics/1164/social-networks/

Dataset provided by

Statista (http://statista.com/)

Authors

Stacy Jo Dixon

Description

How much time do people spend on social media?

As of 2024, the average daily social media usage of internet users worldwide amounted to 143 minutes per day, down from 151 minutes in the previous year. Currently, the country with the most time spent on social media per day is Brazil, with online users spending an average of three hours and 49 minutes on social media each day. In comparison, the daily time spent with social media in the U.S. was just two hours and 16 minutes.

Global social media usage
Currently, the global social network penetration rate is 62.3 percent. Northern Europe had an 81.7 percent social media penetration rate, topping the ranking of global social media usage by region. Eastern and Middle Africa closed the ranking with 10.1 and 9.6 percent usage reach, respectively.
People access social media for a variety of reasons. Users like to find funny or entertaining content and enjoy sharing photos and videos with friends, but mainly use social media to stay in touch with current events and friends.

Global impact of social media
Social media has a wide-reaching and significant impact on not only online activities but also offline behavior and life in general.
During a global online user survey in February 2019, a significant share of respondents stated that social media had increased their access to information, ease of communication, and freedom of expression. On the flip side, respondents also felt that social media had worsened their personal privacy, increased polarization in politics and heightened everyday distractions.

Daily Social Media Active Users
kaggle.com
zip
Updated May 5, 2025
Cite
Shaik Barood Mohammed Umar Adnaan Faiz (2025). Daily Social Media Active Users [Dataset]. https://www.kaggle.com/datasets/umeradnaan/daily-social-media-active-users
Available download formats: zip (126814 bytes)
Dataset updated
May 5, 2025
Authors
Shaik Barood Mohammed Umar Adnaan Faiz
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
Description:

The "Daily Social Media Active Users" dataset provides a comprehensive and dynamic look into the digital presence and activity of global users across major social media platforms. The data was generated to simulate real-world usage patterns for 13 popular platforms, including Facebook, YouTube, WhatsApp, Instagram, WeChat, TikTok, Telegram, Snapchat, X (formerly Twitter), Pinterest, Reddit, Threads, LinkedIn, and Quora. This dataset contains 10,000 rows and includes several key fields that offer insights into user demographics, engagement, and usage habits.

Dataset Breakdown:

Platform: The name of the social media platform where the user activity is tracked. It includes globally recognized platforms, such as Facebook, YouTube, and TikTok, that are known for their large, active user bases.

Owner: The company or entity that owns and operates the platform. Examples include Meta for Facebook, Instagram, and WhatsApp, Google for YouTube, and ByteDance for TikTok.

Primary Usage: This category identifies the primary function of each platform. Social media platforms differ in their primary usage, whether it's for social networking, messaging, multimedia sharing, professional networking, or more.

Country: The geographical region where the user is located. The dataset simulates global coverage, showcasing users from diverse locations and regions. It helps in understanding how user behavior varies across different countries.

Daily Time Spent (min): This field tracks how much time a user spends on a given platform on a daily basis, expressed in minutes. Time spent data is critical for understanding user engagement levels and the popularity of specific platforms.

Verified Account: Indicates whether the user has a verified account. This feature mimics real-world patterns where verified users (often public figures, businesses, or influencers) have enhanced status on social media platforms.

Date Joined: The date when the user registered or started using the platform. This data simulates user account history and can provide insights into user retention trends or platform growth over time.

Context and Use Cases:

This synthetic dataset is designed to offer a privacy-friendly alternative for analytics, research, and machine learning purposes. Given the complexities and privacy concerns around using real user data, especially in the context of social media, this dataset offers a clean and secure way to develop, test, and fine-tune applications, models, and algorithms without the risks of handling sensitive or personal information.

Researchers, data scientists, and developers can use this dataset to:

Model User Behavior: By analyzing patterns in daily time spent, verified status, and country of origin, users can model and predict social media engagement behavior.

Test Analytics Tools: Social media monitoring and analytics platforms can use this dataset to simulate user activity and optimize their tools for engagement tracking, reporting, and visualization.

Train Machine Learning Algorithms: The dataset can be used to train models for various tasks like user segmentation, recommendation systems, or churn prediction based on engagement metrics.

Create Dashboards: This dataset can serve as the foundation for creating user-friendly dashboards that visualize user trends, platform comparisons, and engagement patterns across the globe.

Conduct Market Research: Business intelligence teams can use the data to understand how various demographics use social media, offering valuable insights into the most engaged regions, platform preferences, and usage behaviors.

Sources of Inspiration: This dataset is inspired by public data from industry reports, such as those from Statista, DataReportal, and other market research platforms. These sources provide insights into the global user base and usage statistics of popular social media platforms. The synthetic nature of this dataset allows for the use of realistic engagement metrics without violating any privacy concerns, making it an ideal tool for educational, analytical, and research purposes.

The structure and design of the dataset are based on real-world usage patterns and aim to represent a variety of users from different backgrounds, countries, and activity levels. This diversity makes it an ideal candidate for testing data-driven solutions and exploring social media trends.

Future Considerations:

As the social media landscape continues to evolve, this dataset can be updated or extended to include new platforms, engagement metrics, or user behaviors. Future iterations may incorporate features like post frequency, follower counts, engagement rates (likes, comments, shares), or even sentiment analysis from user-generated content.

By leveraging this dataset, analysts and data scientists can create better, more effective strategies ...
AB Testing
kaggle.com
zip
Updated Nov 8, 2024
Cite
Adarsh Anil Kumar (2024). AB Testing [Dataset]. https://www.kaggle.com/datasets/adarsh0806/ab-testing-practice
Available download formats: zip (37563 bytes)
Dataset updated
Nov 8, 2024
Authors
Adarsh Anil Kumar
License
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
Description
Please refer to the notebooks attached to this dataset to create your own A/B testing dataset or to look up the solution to the A/B testing analysis.

The AB Testing Dataset provided here is a self-generated synthetic dataset created using random sampling techniques provided by the NumPy package. The dataset emulates information regarding visits made by users to an imaginary retail website around the United Kingdom. The users are split into two groups, A and B, which represent the control group and the treatment group, respectively. Imagine that the retail company needs to test a change to the website: "Does the website's background color (White or Black) affect how much time people spend on it?" This question is asked to achieve the end goal of the analysis, which is to improve user engagement, whether through a purchase, a sign-up, etc.

So, in this scenario, let the color 'White' be assigned to Group A, the control group, since it is the default background color of the website. Let the color 'Black' be assigned to Group B, the newer setting to be tested. The main goal is to understand whether there is a significant improvement in website views if the newer setting is applied. This can be answered through A/B testing.

This dataset is provided for practicing A/B testing, an important topic for aspiring data analysts. The column descriptions are as follows:

User ID: Serves as an identifier for each user.

Group: Contains both the control group (A) and treatment group (B).

Page Views: Number of pages the user viewed during their session.

Time Spent: The total amount of time, in seconds, that the user spent on the site during the session.

Conversion: Indicates whether a user has completed a desired action (Yes/No).

Device: Type of device used to access the website.

Location: The country in the UK where the user is based.

The dataset can also be used to derive segment-based insights through appropriate data visualization based on device type and location.
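
One way to test whether the two groups differ is a two-proportion z-test on the Conversion column, sketched below with the standard library only. The choice of this test is an assumption and not necessarily the method used in the attached notebooks. The counts are invented, and only the `Group` and `Conversion` column names come from the description.

```python
import math


def count_conversions(rows, group):
    """Return (conversions, visitors) for one group."""
    members = [r for r in rows if r["Group"] == group]
    converted = sum(1 for r in members if r["Conversion"] == "Yes")
    return converted, len(members)


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value under the standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical data: 1000 users per group, 50 conversions in A and 80 in B.
rows = (
    [{"Group": "A", "Conversion": "Yes"}] * 50
    + [{"Group": "A", "Conversion": "No"}] * 950
    + [{"Group": "B", "Conversion": "Yes"}] * 80
    + [{"Group": "B", "Conversion": "No"}] * 920
)

conv_a, n_a = count_conversions(rows, "A")
conv_b, n_b = count_conversions(rows, "B")
z, p_value = two_proportion_z_test(conv_a, n_a, conv_b, n_b)
print(f"z = {z:.2f}, two-sided p = {p_value:.4f}")
```

For the continuous columns (Page Views, Time Spent), a two-sample comparison of means such as Welch's t-test would be the analogous choice.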
Cyprus Phone Number Data
listtodata.com
.csv, .xls, .txt
Updated Jul 17, 2025
Cite
List to Data (2025). Cyprus Phone Number Data [Dataset]. https://listtodata.com/cyprus-number-data
Available download formats: .csv, .xls, .txt
Dataset updated
Jul 17, 2025
Authors
List to Data
License
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
License information was derived automatically
Time period covered
Jan 1, 2025 - Dec 31, 2025
Area covered
Cyprus
Variables measured
phone numbers, email address, full name, address, city, state, gender, age, income, IP address
Description
Cyprus phone number database helps you save both time and money. When you use real data, you spend less time trying to find the right contacts. This number library helps you reach the right people. It makes your marketing efforts faster and more efficient. In Cyprus, many people use their phones every day, so you are sure to reach active users. The library is easy to operate, and you can start reaching out to potential customers right away. Moreover, there is no need to spend hours looking for contact numbers online. Cyprus mobile number data helps you connect with people across Cyprus quickly and easily. Whether you are doing cold calling or sending out SMS promotions, this database helps you reach people quickly. We even offer excellent customer support. If you ever have questions about using Cyprus mobile number data, our team is ready to help. We help you find the right numbers. If you need guidance using the database, we are here for you. Overall, this will help people reach more customers and grow their business faster. Indeed, simply visit our List to Data website, and you can get started right away.
Geodemographic Data | Asia/ MENA | Latest Estimates on Population, Consuming...
datarade.ai
.json, .csv
Updated Mar 1, 2025
Cite
GapMaps (2025). Geodemographic Data | Asia/ MENA | Latest Estimates on Population, Consuming Class, Demographics, Retail Spend | GIS Data | Map Data [Dataset]. https://datarade.ai/data-products/gapmaps-premium-geodemographic-data-asia-mena-150m-x-150-gapmaps
Available download formats: .json, .csv
Dataset updated
Mar 1, 2025
Dataset authored and provided by
GapMaps
Area covered
Singapore, India, Indonesia, Malaysia, Philippines, Saudi Arabia, Asia
Description
Sourcing accurate and up-to-date geodemographic data across Asia and MENA has historically been difficult for retail brands looking to expand their store networks in these regions. Either the data does not exist or it isn't readily accessible or updated regularly.

GapMaps uses known population data combined with billions of mobile device location points to provide highly accurate and globally consistent geodemographic datasets across Asia and MENA at 150m x 150m grid levels in major cities and 1km grids outside of major cities.

With this information, brands can get a detailed understanding of who lives in a catchment, where they work and their spending potential which allows you to:

Better understand your customers

Identify optimal locations to expand your retail footprint

Define sales territories for franchisees

Run targeted marketing campaigns.

Premium geodemographics data for Asia and MENA includes the latest estimates (updated annually) on:

Population (how many people live in your local catchment)

Demographics (who lives within your local catchment)

Worker population (how many people work within your local catchment)

Consuming Class and Premium Consuming Class (who can afford to buy goods & services beyond their basic needs and/or shop at premium retailers)

Retail Spending (Food & Beverage, Grocery, Apparel, Other). How much are consumers spending on retail goods and services by category.

Primary Use Cases for GapMaps Geodemographic Data:

Retail (eg. Fast Food/ QSR, Cafe, Fitness, Supermarket/Grocery)

Customer Profiling: get a detailed understanding of the demographic profile of your customers, where they work and their spending potential

Analyse your trade areas at granular 150m x 150m grid levels using all the key metrics

Site Selection: Identify optimal locations for future expansion and benchmark performance across existing locations.

Target Marketing: Develop effective marketing strategies to acquire more customers.

Integrate GapMaps demographic data with your existing GIS or BI platform to generate powerful visualizations.

Commercial Real-Estate (Brokers, Developers, Investors, Single & Multi-tenant O/O)

Tenant Recruitment

Target Marketing

Market Potential / Gap Analysis

Marketing / Advertising (Billboards/OOH, Marketing Agencies, Indoor Screens)

Customer Profiling

Target Marketing

Market Share Analysis

Number of global social network users 2017-2028

statista.com
de.statista.com

Cite

Stacy Jo Dixon, Number of global social network users 2017-2028 [Dataset]. https://www.statista.com/topics/1164/social-networks/

Dataset provided by

Statista (http://statista.com/)

Authors

Stacy Jo Dixon

Description

How many people use social media?

Social media usage is one of the most popular online activities. In 2024, over five billion people were using social media worldwide, a number projected to increase to over six billion in 2028.

Who uses social media?
Social networking is one of the most popular digital activities worldwide and it is no surprise that social networking penetration across all regions is constantly increasing. As of January 2023, the global social media usage rate stood at 59 percent. This figure is anticipated to grow as less developed digital markets catch up with other regions when it comes to infrastructure development and the availability of cheap mobile devices. In fact, most of social media’s global growth is driven by the increasing usage of mobile devices. Mobile-first market Eastern Asia topped the global ranking of mobile social networking penetration, followed by established digital powerhouses such as the Americas and Northern Europe.

How much time do people spend on social media?
Social media is an integral part of daily internet usage. On average, internet users spend 151 minutes per day on social media and messaging apps, an increase of 40 minutes since 2015. Internet users in Latin America had the highest average time spent per day on social media.

What are the most popular social media platforms?
Market leader Facebook was the first social network to surpass one billion registered accounts and currently boasts approximately 2.9 billion monthly active users, making it the most popular social network worldwide. In June 2023, the top social media apps in the Apple App Store included mobile messaging apps WhatsApp and Telegram Messenger, as well as the ever-popular app version of Facebook.

Data Use in Academia Dataset
datacatalog.worldbank.org
csv, utf-8
Updated Nov 27, 2023
Cite
Semantic Scholar Open Research Corpus (S2ORC) (2023). Data Use in Academia Dataset [Dataset]. https://datacatalog.worldbank.org/search/dataset/0065200/data_use_in_academia_dataset
Available download formats: utf-8, csv
Dataset updated
Nov 27, 2023
Dataset provided by
Semantic Scholar Open Research Corpus (S2ORC)
Brian William Stacy
License
https://datacatalog.worldbank.org/public-licenses?fragment=cc
Description
This dataset contains metadata (title, abstract, date of publication, field, etc) for around 1 million academic articles. Each record contains additional information on the country of study and whether the article makes use of data. Machine learning tools were used to classify the country of study and data use.

Our data source of academic articles is the Semantic Scholar Open Research Corpus (S2ORC) (Lo et al. 2020). The corpus contains more than 130 million English language academic papers across multiple disciplines. The papers included in the Semantic Scholar corpus are gathered directly from publishers, from open archives such as arXiv or PubMed, and crawled from the internet.

We placed some restrictions on the articles to make them usable and relevant for our purposes. First, only articles with an abstract and a parsed PDF or LaTeX file are included in the analysis. The full text of the abstract is necessary to classify the country of study and whether the article uses data. The parsed PDF and LaTeX file are important for extracting information like the date of publication and field of study. This restriction eliminated a large number of articles in the original corpus. Around 30 million articles remain after keeping only articles with a parsable (i.e., suitable for digital processing) PDF, and around 26% of those 30 million are eliminated when removing articles without an abstract. Second, only articles from the year 2000 to 2020 were considered. This restriction eliminated an additional 9% of the remaining articles. Finally, articles from the following fields of study were excluded, as we aim to focus on fields that are likely to use data produced by countries’ national statistical system: Biology, Chemistry, Engineering, Physics, Materials Science, Environmental Science, Geology, History, Philosophy, Math, Computer Science, and Art. Fields that are included are: Economics, Political Science, Business, Sociology, Medicine, and Psychology. This third restriction eliminated around 34% of the remaining articles. From an initial corpus of 136 million articles, this resulted in a final corpus of around 10 million articles.

Due to the intensive computer resources required, a set of 1,037,748 articles were randomly selected from the 10 million articles in our restricted corpus as a convenience sample.

The empirical approach employed in this project utilizes text mining with Natural Language Processing (NLP). The goal of NLP is to extract structured information from raw, unstructured text. In this project, NLP is used to extract the country of study and whether the paper makes use of data. We will discuss each of these in turn.

To determine the country or countries of study in each academic article, two approaches are employed based on information found in the title, abstract, or topic fields. The first approach uses regular expression searches based on the presence of ISO3166 country names. A defined set of country names is compiled, and the presence of these names is checked in the relevant fields. This approach is transparent, widely used in social science research, and easily extended to other languages. However, there is a potential for exclusion errors if a country’s name is spelled non-standardly.
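
The regular expression approach can be sketched as follows. This is a minimal illustration, not the project's actual code: the five-name list is a small stand-in for the full set of ISO 3166 country names, and the function name `find_countries` is hypothetical.

```python
import re

# Small illustrative subset; the source describes using ISO 3166 country names.
COUNTRY_NAMES = ["Kenya", "Brazil", "India", "Viet Nam", "United States of America"]

# Map lower-cased names back to their canonical spelling.
CANONICAL = {name.lower(): name for name in COUNTRY_NAMES}

# Word boundaries prevent matches inside longer words (e.g. "Indian").
# Longer names are listed first so they win over any shorter overlapping name.
PATTERN = re.compile(
    r"\b("
    + "|".join(re.escape(n) for n in sorted(COUNTRY_NAMES, key=len, reverse=True))
    + r")\b",
    re.IGNORECASE,
)


def find_countries(*fields):
    """Return the sorted country names found in any of the given text fields."""
    found = set()
    for text in fields:
        for match in PATTERN.finditer(text or ""):
            found.add(CANONICAL[match.group(1).lower()])
    return sorted(found)


title = "Household surveys in Kenya and Brazil"
abstract = "We compare outcomes with a panel from india."
print(find_countries(title, abstract))

# A non-standard spelling is missed, which is the exclusion error noted above.
print(find_countries("Evidence from Vietnam"))
```

The second call shows why a complementary method is useful: "Vietnam" does not match the ISO 3166 short name "Viet Nam", so the article would be left unlinked by this approach alone.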

The second approach is based on Named Entity Recognition (NER), which uses machine learning to identify entities in text, here via the spaCy Python library. The NER algorithm splits text into named entities, and it is used in this project to identify countries of study in the academic articles. spaCy supports multiple languages and has been trained on multiple spellings of countries, overcoming some of the limitations of the regular expression approach. If a country is identified by either the regular expression search or NER, it is linked to the article. Note that one article can be linked to more than one country.

The second task is to classify whether the paper uses data. A supervised machine learning approach is employed, where 3,500 publications were first randomly selected and manually labeled by human raters using the Mechanical Turk service (Paszke et al. 2019).[1] To make sure the human raters had a similar and appropriate definition of data in mind, they were given the following instructions before seeing their first paper:

Each of these documents is an academic article. The goal of this study is to measure whether a specific academic article is using data and from which country the data came.
There are two classification tasks in this exercise:
1. Identifying whether an academic article is using data from any country.
2. Identifying from which country that data came.
For task 1, we are looking specifically at the use of data. Data is any information that has been collected, observed, generated or created to produce research findings. As an example, a study that reports findings or analysis using survey data uses data. Some clues that a study does use data include whether a survey or census is described, a statistical model is estimated, or a table of means or summary statistics is reported.
After an article is classified as using data, please note the type of data used. The options are population or business census, survey data, administrative data, geospatial data, private sector data, and other data. If no data is used, then mark "Not applicable". In cases where multiple data types are used, please click multiple options.[2]
For task 2, we are looking at the country or countries that are studied in the article. In some cases, no country may be applicable. For instance, if the research is theoretical and has no specific country application. In some cases, the research article may involve multiple countries. In these cases, select all countries that are discussed in the paper.
We expect between 10 and 35 percent of all articles to use data.

The median amount of time that a worker spent on an article, measured as the time between when the article was accepted to be classified by the worker and when the classification was submitted, was 25.4 minutes. If human raters were used exclusively rather than machine learning tools, then the corpus of 1,037,748 articles examined in this study would take around 50 years of human work time to review at a cost of $3,113,244, which assumes a cost of $3 per article as was paid to MTurk workers.
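The figures above can be reproduced with a short calculation. The sketch below assumes that the median of 25.4 minutes is applied to every article and that "work time" is counted as continuous clock time (24 hours a day, 365 days a year). Both are assumptions of this example, chosen because they recover the stated figure of around 50 years.

```python
N_ARTICLES = 1_037_748       # sample size reported in the text
MINUTES_PER_ARTICLE = 25.4   # median time per article reported in the text
COST_PER_ARTICLE = 3         # dollars paid per article to MTurk workers

total_minutes = N_ARTICLES * MINUTES_PER_ARTICLE
# Assumption: continuous work, 24 hours a day and 365 days a year.
years = total_minutes / 60 / 24 / 365
total_cost = N_ARTICLES * COST_PER_ARTICLE

print(f"{years:.1f} years of review time, ${total_cost:,} total cost")
```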

A model is next trained on the 3,500 labelled articles. We use a distilled version of the BERT (Bidirectional Encoder Representations from Transformers) model to encode raw text into a numeric format suitable for predictions (Devlin et al. 2018). BERT is pre-trained on a large corpus comprising the Toronto Book Corpus and Wikipedia. The distilled version (DistilBERT) is a compressed model that is 60% the size of BERT, retains 97% of its language understanding capabilities, and is 60% faster (Sanh, Debut, Chaumond, and Wolf 2019). We use PyTorch to produce a model to classify articles based on the labeled data. Of the 3,500 articles that were hand coded by the MTurk workers, 900 are fed to the machine learning model. This number was chosen because of computational limitations in training the NLP model. A classification of “uses data” was assigned if the model predicted an article used data with at least 90% confidence.
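The 90% confidence rule can be illustrated with a small sketch. It assumes the classifier outputs two raw scores (logits) per article in the order [no data, uses data] and that "confidence" means the softmax probability of the "uses data" class. The source does not spell out either detail, so both are assumptions of this example.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def uses_data(logits, threshold=0.90):
    """Label an article "uses data" only if P(uses data) >= threshold.

    Assumes logits are ordered [no_data, uses_data].
    """
    return softmax(logits)[1] >= threshold


print(uses_data([0.2, 3.1]))  # True: probability of "uses data" is about 0.95
print(uses_data([0.2, 1.5]))  # False: probability is about 0.79, below 0.90
```

A high threshold like this trades recall for precision: borderline articles are left unlabelled as "uses data" rather than risk false positives.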

The performance of the models classifying articles to countries and as using data or not can be compared to the classification by the human raters. We consider the human raters as giving us the ground truth. This may underestimate the model performance if the workers at times got the allocation wrong in a way that would not apply to the model. For instance, a human rater could mistake the Republic of Korea for the Democratic People’s Republic of Korea. If both humans and the model make the same kinds of errors, then the performance reported here will be overestimated.

The model was able to predict whether an article made use of data with 87% accuracy evaluated on the set of articles held out of the model training. The correlation between the number of articles written about each country using data estimated under the two approaches is given in the figure below. The number of articles represents an aggregate total of
Map Data | Asia & MENA | Premium Demographics & Point-of-Interest Data To...
datarade.ai
.json, .csv
Cite
GapMaps, Map Data | Asia & MENA | Premium Demographics & Point-of-Interest Data To Optimise Business Decisions | GIS Data | Demographic Data [Dataset]. https://datarade.ai/data-products/gapmaps-global-map-data-asia-mena-150m-x-150m-grids-cu-gapmaps
Explore at:
Available download formats: .json, .csv
Dataset authored and provided by
GapMaps
Area covered
India, Saudi Arabia, Philippines, Singapore, Malaysia, Indonesia
Description
Sourcing accurate and up-to-date map data across Asia and MENA has historically been difficult for retail brands looking to expand their store networks in these regions. Either the data does not exist or it isn't readily accessible or updated regularly.

GapMaps Map Data uses known population data combined with billions of mobile device location points to provide highly accurate and globally consistent demographics data across Asia and MENA at 150m x 150m grid levels in major cities and 1km grids outside of major cities.

GapMaps Map Data also includes the latest Point-of-Interest (POI) Data for leading retail brands across a range of categories including Fast Food/ QSR, Health & Fitness, Supermarket/Grocery and Cafe sectors which is updated monthly.

With this information, brands can get a detailed understanding of who lives in a catchment, where they work and their spending potential which allows you to:

Better understand your customers

Identify optimal locations to expand your retail footprint

Define sales territories for franchisees

Run targeted marketing campaigns.

GapMaps Map Data for Asia and MENA can be utilized in any GIS platform and includes the latest estimates (updated annually) on:

Population (how many people live in your local catchment)

Demographics (who lives within your local catchment)

Worker population (how many people work within your local catchment)

Consuming Class and Premium Consuming Class (who can afford to buy goods & services beyond their basic needs and/or shop at premium retailers)

Retail Spending (Food & Beverage, Grocery, Apparel, Other). How much are consumers spending on retail goods and services by category.

Primary Use Cases for GapMaps Map Data:

Retail Site Selection - identify optimal locations for future expansion and benchmark performance across existing locations.

Customer Profiling: get a detailed understanding of the demographic profile of your customers, where they work and their spending potential

Analyse your trade areas at granular 150m x 150m grid levels using all the key metrics

Target Marketing: Develop effective marketing strategies to acquire more customers.

Integrate GapMaps demographic data with your existing GIS or BI platform to generate powerful visualizations.

Marketing / Advertising (Billboards/OOH, Marketing Agencies, Indoor Screens)

Customer Profiling

Target Marketing

Market Share Analysis
Coffee Shop Daily Revenue Prediction Dataset
kaggle.com
zip
Updated Feb 7, 2025
Cite
Himel Sarder (2025). Coffee Shop Daily Revenue Prediction Dataset [Dataset]. https://www.kaggle.com/datasets/himelsarder/coffee-shop-daily-revenue-prediction-dataset
Explore at:
Available download formats: zip (30259 bytes)
Dataset updated
Feb 7, 2025
Authors
Himel Sarder
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
Dataset Overview

This dataset contains 2,000 rows of data from coffee shops, offering detailed insights into factors that influence daily revenue. It includes key operational and environmental variables that provide a comprehensive view of how business activities and external conditions affect sales performance. Designed for use in predictive analytics and business optimization, this dataset is a valuable resource for anyone looking to understand the relationship between customer behavior, operational decisions, and revenue generation in the food and beverage industry.

Columns & Variables

The dataset features a variety of columns that capture the operational details of coffee shops, including customer activity, store operations, and external factors such as marketing spend and location foot traffic.

Number of Customers Per Day

The total number of customers visiting the coffee shop on any given day.

Range: 50 - 500 customers.

Average Order Value ($)

The average dollar amount spent by each customer during their visit.

Range: $2.50 - $10.00.

Operating Hours Per Day

The total number of hours the coffee shop is open for business each day.

Range: 6 - 18 hours.

Number of Employees

The number of employees working on a given day. This can influence service speed, customer satisfaction, and ultimately, sales.

Range: 2 - 15 employees.

Marketing Spend Per Day ($)

The amount of money spent on marketing campaigns or promotions on any given day.

Range: $10 - $500 per day.

Location Foot Traffic (people/hour)

The number of people passing by the coffee shop per hour, a variable indicative of the shop's location and its potential to attract customers.

Range: 50 - 1000 people per hour.

Target Variable

Daily Revenue ($)

This is the dependent variable representing the total revenue generated by the coffee shop each day.

It is calculated as a combination of customer visits, average spending, and other operational factors like marketing spend and staff availability.

Range: $200 - $10,000 per day.

Data Distribution & Insights

The dataset spans a wide variety of operational scenarios, from small neighborhood coffee shops with limited traffic to larger, high-traffic locations with extensive marketing budgets. This variety allows for exploring different predictive modeling strategies. Key insights that can be derived from the data include:

The effect of marketing spend on daily revenue.

The correlation between customer count and daily sales.

The relationship between staffing levels and revenue generation.

The influence of foot traffic and operating hours on customer behavior.

Use Cases & Applications

The dataset offers a wide range of applications, especially in predictive analytics, business optimization, and forecasting:

Predictive Modeling: Use machine learning models such as regression, decision trees, or neural networks to predict daily revenue based on operational data.

Business Strategy Development: Analyze how changes in marketing spend, staff numbers, or operating hours can optimize revenue and improve efficiency.

Customer Insights: Identify patterns in customer behavior related to shop operations and external factors like foot traffic and marketing campaigns.

Resource Allocation: Determine optimal staffing levels and marketing budgets based on predicted sales, improving overall profitability.
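As a rough illustration of the regression use case listed above, the sketch below fits a one-variable least-squares line that predicts daily revenue from customers multiplied by average order value. The rows are invented toy values within the stated column ranges, not records from the dataset, and the single engineered feature is an assumption made for this example.

```python
# Toy rows: (customers per day, average order value in $, daily revenue in $).
# These values are invented for illustration and are NOT taken from the dataset.
rows = [
    (120, 4.50, 610.0),
    (300, 6.00, 1930.0),
    (450, 8.25, 3850.0),
    (80, 3.00, 300.0),
    (220, 5.50, 1290.0),
]


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Single engineered feature: customers multiplied by average order value.
xs = [customers * order_value for customers, order_value, _ in rows]
ys = [revenue for _, _, revenue in rows]
intercept, slope = fit_line(xs, ys)


def predict(customers, order_value):
    """Predicted daily revenue in dollars for one shop-day."""
    return intercept + slope * customers * order_value


print(round(predict(200, 5.0), 2))
```

With the real data, the remaining columns (operating hours, employees, marketing spend, foot traffic) would be added as further predictors, and the fit would be evaluated on held-out rows.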

Real-World Applications in the Food & Beverage Industry

For coffee shop owners, managers, and analysts in the food and beverage industry, this dataset provides an essential tool for refining daily operations and boosting profitability. Insights gained from this data can help:

Optimize Marketing Campaigns: Evaluate the effectiveness of daily or seasonal marketing campaigns on revenue.

Staff Scheduling: Predict busy days and ensure that the right number of employees are scheduled to maximize efficiency.

Revenue Forecasting: Provide accurate revenue projections that can assist with financial planning and decision-making.

Operational Efficiency: Discover the most profitable operating hours and adjust business hours accordingly.

This dataset is also ideal for aspiring data scientists and machine learning practitioners looking to apply their skills to real-world business problems in the food and beverage sector.

Conclusion

The Coffee Shop Revenue Prediction Dataset is a versatile and comprehensive resource for understanding the dynamics of daily sales performance in coffee shops. With a focus on key operational factors, it is perfect for building predictive models, ...
RCS Data Laos
listtodata.com
st.listtodata.com
.csv, .xls, .txt
Updated Jul 17, 2025
Cite
List to Data (2025). RCS Data Laos [Dataset]. https://listtodata.com/rcs-data-laos
Explore at:
Available download formats: .csv, .xls, .txt
Dataset updated
Jul 17, 2025
Authors
List to Data
License
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Time period covered
Jan 1, 2025 - Dec 31, 2025
Area covered
Laos
Variables measured
phone numbers, email address, full name, address, city, state, gender, age, income, IP address
Description
RCS Data Laos helps businesses connect with customers quickly. This service allows companies to send messages to many people at once. It is fast, easy, and cost-effective. Additionally, we include source URLs, so you can see exactly where the information came from. You can use it at any time, thanks to our 24/7 support. Each user in this data set has opted in, meaning they have permitted their information to be shared. This makes the data legal, safe, and easy to use. Additionally, RCS Data Laos is a valuable tool for those looking to learn more about RCS users with ease. Also, our team checks to keep all info current and make sure we use reliable sources for everything we share. This means you can rely on the data for any project, whether for research, business, or customer connections. The opt-in process ensures safety. Additionally, customer support provides help. Laos RCS data is an amazing way to send messages to many people at once. Businesses use this data to reach customers quickly. It allows them to share important information like sales, updates, and promotions. People receive these messages on their phones. This helps them stay updated easily. Many businesses choose this data because it is fast and efficient. This saves time and effort. For example, a bakery can send a message about a special offer on cupcakes. Customers will see this offer right away and may come to buy some. Moreover, Laos RCS data is cost-effective. Businesses can send many messages for a low price. This means they can promote their products without spending too much money. It helps small businesses stand up to bigger ones. With this data, everyone has a chance to grow their audience. Another great thing about RCS data is that it has a high open rate.
Demographic Data | Asia & MENA | Make Informed Business Decisions with High...
datarade.ai
.json, .csv
Updated Jul 2, 2024
Cite
GapMaps (2024). Demographic Data | Asia & MENA | Make Informed Business Decisions with High Quality and Granular Insights [Dataset]. https://datarade.ai/data-products/gapmaps-premium-demographics-data-asia-mena-accurate-and-gapmaps
Explore at:
Available download formats: .json, .csv
Dataset updated
Jul 2, 2024
Dataset authored and provided by
GapMaps
Area covered
Malaysia, Singapore, Philippines, Indonesia, India, Saudi Arabia
Description
Sourcing accurate and up-to-date demographic data across Asia and MENA has historically been difficult for retail brands looking to expand their store networks in these regions. Either the data does not exist or it isn't readily accessible or updated regularly.

GapMaps uses known population data combined with billions of mobile device location points to provide highly accurate and globally consistent demographic datasets across Asia and MENA at 150m x 150m grid levels in major cities and 1km grids outside of major cities.

With this information, brands can get a detailed understanding of who lives in a catchment, where they work and their spending potential which allows you to:

Better understand your customers

Identify optimal locations to expand your retail footprint

Define sales territories for franchisees

Run targeted marketing campaigns.

Premium demographics data for Asia and MENA includes the latest estimates (updated annually) on:

Population (how many people live in your local catchment)

Demographics (who lives within your local catchment)

Worker population (how many people work within your local catchment)

Consuming Class and Premium Consuming Class (who can afford to buy goods & services beyond their basic needs and/or shop at premium retailers)

Retail Spending (Food & Beverage, Grocery, Apparel, Other). How much are consumers spending on retail goods and services by category.

Primary Use Cases for GapMaps Demographic Data:

Retail (eg. Fast Food/ QSR, Cafe, Fitness, Supermarket/Grocery)

Customer Profiling: get a detailed understanding of the demographic profile of your customers, where they work and their spending potential

Analyse your trade areas at granular 150m x 150m grid levels using all the key metrics

Site Selection: Identify optimal locations for future expansion and benchmark performance across existing locations.

Target Marketing: Develop effective marketing strategies to acquire more customers.

Integrate GapMaps demographic data with your existing GIS or BI platform to generate powerful visualizations.

Commercial Real-Estate (Brokers, Developers, Investors, Single & Multi-tenant O/O)

Tenant Recruitment

Target Marketing

Market Potential / Gap Analysis

Marketing / Advertising (Billboards/OOH, Marketing Agencies, Indoor Screens)

Customer Profiling

Target Marketing

Market Share Analysis
Data from: Macaques preferentially attend to intermediately surprising...
data.niaid.nih.gov
datadryad.org
zip
Updated Apr 26, 2022
Cite
Shengyi Wu; Tommy Blanchard; Emily Meschke; Richard Aslin; Ben Hayden; Celeste Kidd (2022). Macaques preferentially attend to intermediately surprising information [Dataset]. http://doi.org/10.6078/D15Q7Q
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.6078/D15Q7Q
Dataset updated
Apr 26, 2022
Dataset provided by
Yale University
University of Minnesota
University of California, Berkeley
Klaviyo
Authors
Shengyi Wu; Tommy Blanchard; Emily Meschke; Richard Aslin; Ben Hayden; Celeste Kidd
License
https://spdx.org/licenses/CC0-1.0.html
Description
Normative learning theories dictate that we should preferentially attend to informative sources, but only up to the point that our limited learning systems can process their content. Humans, including infants, show this predicted strategic deployment of attention. Here we demonstrate that rhesus monkeys, much like humans, attend to events of moderate surprisingness over both more and less surprising events. They do this in the absence of any specific goal or contingent reward, indicating that the behavioral pattern is spontaneous. We suggest this U-shaped attentional preference represents an evolutionarily preserved strategy for guiding intelligent organisms toward material that is maximally useful for learning.

Methods

How the data were collected: In this project, we collected gaze data from 5 macaques while they watched sequential visual displays designed to elicit probabilistic expectations. The gaze data were collected using the Eyelink Toolbox and sampled at 1000 Hz by an infrared eye-monitoring camera system.

Dataset:

"csv-combined.csv" is an aggregated dataset that includes one pop-up event per row for all original datasets for each trial. Here are descriptions of each column in the dataset:

subj: subject_ID = {"B":104, "C":102, "H":101, "J":103, "K":203}
trialtime: start time of current trial in seconds
trial: current trial number (each trial featured one of 80 possible visual-event sequences) (in order)
seq current: sequence number (one of 80 sequences)
seq_item: current item number in a seq (in order)
active_item: pop-up item (active box)
pre_active: prior pop-up item (active box) {-1: "the first active object in the sequence/ no active object before the currently active object in the sequence"}
next_active: next pop-up item (active box) {-1: "the last active object in the sequence/ no active object after the currently active object in the sequence"}
firstappear: {0: "not first", 1: "first appear in the seq"}
looks_blank: csv: total amount of time look at blank space for current event (ms); csv_timestamp: {1: "look blank at timestamp", 0: "not look blank at timestamp"}
looks_offscreen: csv: total amount of time look offscreen for current event (ms); csv_timestamp: {1: "look offscreen at timestamp", 0: "not look offscreen at timestamp"}
time till target: time spent to first start looking at the target object (ms) {-1: "never look at the target"}
looks target: csv: time spent to look at the target object (ms); csv_timestamp: look at the target or not at current timestamp (1 or 0)
look1,2,3: time spent look at each object (ms)
location 123X, 123Y: location of each box (location of the three boxes for a given sequence were chosen randomly, but remained static throughout the sequence)
item123id: pop-up item ID (remained static throughout a sequence)
event time: total time spent for the whole event (pop-up and go back) (ms)
eyeposX,Y: eye position at current timestamp

"csv-surprisal-prob.csv" is an output file from Monkilock_Data_Processing.ipynb. Surprisal values for each event were calculated and added to the "csv-combined.csv". Here are descriptions of each additional column:

rt: time till target {-1: "never look at the target"}. In data analysis, we included data that have rt > 0.
already_there: {NA: "never look at the target object"}. In data analysis, we included events that are not the first event in a sequence, are not repeats of the previous event, and already_there is not NA.
looks_away: {TRUE: "the subject was looking away from the currently active object at this time point", FALSE: "the subject was not looking away from the currently active object at this time point"}
prob: the probability of the occurrence of object
surprisal: unigram surprisal value
bisurprisal: transitional surprisal value
std_surprisal: standardized unigram surprisal value
std_bisurprisal: standardized transitional surprisal value
binned_surprisal_means: the means of unigram surprisal values binned to three groups of evenly spaced intervals according to surprisal values.
binned_bisurprisal_means: the means of transitional surprisal values binned to three groups of evenly spaced intervals according to surprisal values.
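To make the surprisal columns concrete, the sketch below uses the standard information-theoretic definition, surprisal = -log2(probability), followed by a z-score standardization. The base-2 logarithm, the population standard deviation, and the example probabilities are assumptions of this illustration. The dataset's own notebook may estimate probabilities and standardize differently.

```python
import math
from statistics import mean, pstdev


def surprisal(prob):
    """Surprisal in bits: rarer events (smaller prob) are more surprising."""
    return -math.log2(prob)


def standardize(values):
    """Z-score: subtract the mean and divide by the standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


probs = [0.5, 0.25, 0.125]            # hypothetical event probabilities
raw = [surprisal(p) for p in probs]   # 1, 2 and 3 bits
z = standardize(raw)

print(raw)  # [1.0, 2.0, 3.0]
print(z)    # centered on 0, with the rarest event most positive
```

The transitional ("bi") surprisal follows the same formula, with the probability of an item conditioned on the previous item in the sequence.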

"csv-surprisal-prob_updated.csv" is a ready-for-analysis dataset generated by Analysis_Code_final.Rmd after standardizing controlled variables, changing data types for categorical variables for analysts, etc.

"AllSeq.csv" includes event information of all 80 sequences.

Empty Values in Datasets:

There are no missing values in the original dataset "csv-combined.csv". Missing values (marked as NA in datasets) occur in the columns "prev_active", "next_active", "already_there", "bisurprisal", "std_bisurprisal", "sq_std_bisurprisal" in "csv-surprisal-prob.csv" and "csv-surprisal-prob_updated.csv". NAs in the columns "prev_active" and "next_active" mean that the event is the first or the last active object in the sequence, i.e., there is no active object before or after the currently active object in the sequence. When we analyzed the variable "already_there", we excluded rows whose "prev_active" value is NA. NAs in the column "already_there" mean that the subject never looks at the target object in the current event. When we analyzed the variable "already_there", we also excluded rows whose "already_there" value is NA. Missing values occur in the columns "bisurprisal", "std_bisurprisal", "sq_std_bisurprisal" when the event is the first in its sequence, so the transitional probability cannot be computed because no event precedes it in the sequence. When we fitted models for transitional statistics, we excluded rows whose "bisurprisal", "std_bisurprisal", and "sq_std_bisurprisal" values are NA.

Codes:

In "Monkilock_Data_Processing.ipynb", we processed raw fixation data of 5 macaques and explored the relationship between their fixation patterns and the "surprisal" of events in each sequence. We computed the following variables which are necessary for further analysis, modeling, and visualizations in this notebook (see above for details): active_item, pre_active, next_active, firstappear, looks_blank, looks_offscreen, time till target, looks target, look1,2,3, prob, surprisal, bisurprisal, std_surprisal, std_bisurprisal, binned_surprisal_means, binned_bisurprisal_means.

"Analysis_Code_final.Rmd" is the main script, in which we further processed the data, built models, and created visualizations. We evaluated the statistical significance of variables using mixed effect linear and logistic regressions with random intercepts. The raw regression models include standardized linear and quadratic surprisal terms as predictors. The controlled regression models include covariate factors, such as whether an object is a repeat, the distance between the current and previous pop-up object, and trial number. A generalized additive model (GAM) was used to visualize the relationship between the surprisal estimate from the computational model and the behavioral data.

"helper-lib.R" includes helper functions used in Analysis_Code_final.Rmd.
GIS Data | Asia & MENA | 150m x 150m Grids| Accurate and Granular...
datarade.ai
.json, .csv
Cite
GapMaps, GIS Data | Asia & MENA | 150m x 150m Grids| Accurate and Granular Demographics & Point of Interest (POI) Data | Map Data | Demographic Data [Dataset]. https://datarade.ai/data-products/gapmaps-global-gis-data-asia-mena-150m-x-150m-grids-cu-gapmaps
Explore at:
Available download formats: .json, .csv
Dataset authored and provided by
GapMaps
Area covered
Philippines, Indonesia, Singapore, India, Saudi Arabia, Malaysia
Description
Sourcing accurate and up-to-date GIS data across Asia and MENA has historically been difficult for retail brands looking to expand their store networks in these regions. Either the data does not exist or it isn't readily accessible or updated regularly.

GapMaps uses known population data combined with billions of mobile device location points to provide highly accurate and globally consistent GIS data across Asia and MENA at 150m x 150m grid levels in major cities and 1km grids outside of major cities.

With this information, brands can get a detailed understanding of who lives in a catchment, where they work and their spending potential which allows you to:

Better understand your customers

Identify optimal locations to expand your retail footprint

Define sales territories for franchisees

Run targeted marketing campaigns.

GapMaps GIS data for Asia and MENA can be utilized in any GIS platform and includes the latest Demographic estimates (updated annually) including:

Population (how many people live in your local catchment)

Census Demographics (who lives within your local catchment)

Worker population (how many people work within your local catchment)

Consuming Class and Premium Consuming Class (who can afford to buy goods & services beyond their basic needs and/or shop at premium retailers)

Retail Spending (Food & Beverage, Grocery, Apparel, Other). How much are consumers spending on retail goods and services by category.

GapMaps GIS Data also includes Point-Of-Interest (POI) Data updated monthly across a range of categories including Fast Food, Cafe, Health & Fitness and Supermarket/ Grocery

Primary Use Cases for GapMaps GIS Data:

Retail Site Selection - identify optimal locations for future expansion and benchmark performance across existing locations.

Customer Profiling: get a detailed understanding of the demographic profile of your customers, where they work and their spending potential

Analyse your trade areas at granular 150m x 150m grid levels using all the key metrics

Target Marketing: Develop effective marketing strategies to acquire more customers.

Integrate GapMaps GIS data with your existing GIS or BI platform to generate powerful visualizations.
Satellite US Construction Materials Dataset Package (Cemex, Vulcan, Martin...
datarade.ai
.csv
Updated Jan 18, 2023
Cite
Space Know (2023). Satellite US Construction Materials Dataset Package (Cemex, Vulcan, Martin Marietta) [Dataset]. https://datarade.ai/data-products/satellite-us-construction-materials-dataset-package-cemex-v-space-know
Explore at:
Available download formats: .csv
Dataset updated
Jan 18, 2023
Dataset authored and provided by
Space Know
Area covered
United States of America
Description
This dataset package is focused on U.S. construction materials and three construction companies: Cemex, Martin Marietta & Vulcan.

In this package, SpaceKnow tracks manufacturing and processing facilities for construction material products all over the US. By tracking these facilities, we are able to give you near-real-time data on spending on these materials, which helps to predict residential and commercial real estate construction and spending in the US.

The dataset includes 40 indices focused on asphalt, cement, concrete, and building materials in general. You can look forward to receiving country-level and regional data (activity in the North, East, West, and South of the country) and the aforementioned company data.

SpaceKnow uses satellite (SAR) data to capture activity at building material manufacturing and processing facilities in the US.

Data is updated daily, has an average lag of 4-6 days, and history back to 2017.

The insights provide you with level and change data for refineries, storage, manufacturing, logistics, and employee parking-based locations.

SpaceKnow offers 3 delivery options: CSV, API, and Insights Dashboard

Available Indices

Companies:

Cemex (CX): Construction Materials (covers all manufacturing facilities of the company in the US), Concrete, Cement (refinery and storage) indices, and aggregates

Martin Marietta (MLM): Construction Materials (covers all manufacturing facilities of the company in the US), Concrete, Cement (refinery and storage) indices, and aggregates

Vulcan (VMC): Construction Materials (covers all manufacturing facilities of the company in the US), Concrete, Cement (refinery and storage) indices, and aggregates

USA Indices:

Aggregates USA

Asphalt USA

Cement USA

Cement Refinery USA

Cement Storage USA

Concrete USA

Construction Materials USA

Construction Mining USA

Construction Parking Lots USA

Construction Materials Transfer Hub US

Cement - Midwest, Northeast, South, West

Cement Refinery - Midwest, Northeast, South, West

Cement Storage - Midwest, Northeast, South, West

Why get SpaceKnow's U.S. Construction Materials Package?

Monitor Construction Market Trends: Near-real-time insights into the construction industry allow clients to understand and anticipate market trends better.

Track Company Performance: Monitor operational activities, such as the volume of sales.

Assess Risk: Use satellite activity data to assess the risks associated with investing in the construction industry.

Index Methodology Summary

Continuous Feed Index (CFI) is a daily aggregation of the area of metallic objects in square meters. There are two types of CFI indices. The CFI-R index gives the data in levels: it shows how many square meters are covered by metallic objects (for example, employee cars at a facility). The CFI-S index gives the change in the data: it shows how many square meters have changed within the locations between two consecutive satellite images.
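
The difference between a level index and a change index can be illustrated with a toy example. This is only a sketch under stated assumptions, not SpaceKnow's actual computation: the grids, the per-cell area, and the function names `level_area` and `changed_area` are all hypothetical.

```python
# Toy illustration of "level" versus "change" measures.
# ASSUMPTION: each image is a small grid where 1 marks a cell classified
# as metallic and 0 marks anything else. Real satellite processing is
# far more involved; this only mirrors the two definitions in the text.

CELL_AREA_M2 = 100.0  # hypothetical area of one grid cell in square meters


def level_area(mask, cell_area=CELL_AREA_M2):
    """Area covered by metallic cells in one image (level, CFI-R style)."""
    return sum(sum(row) for row in mask) * cell_area


def changed_area(prev_mask, curr_mask, cell_area=CELL_AREA_M2):
    """Area of cells whose state differs between two consecutive images
    (change, CFI-S style)."""
    changed = sum(
        1
        for prev_row, curr_row in zip(prev_mask, curr_mask)
        for prev_cell, curr_cell in zip(prev_row, curr_row)
        if prev_cell != curr_cell
    )
    return changed * cell_area


# Two hypothetical consecutive images of the same location.
image_day1 = [[1, 1, 0],
              [0, 0, 0]]
image_day2 = [[1, 0, 0],
              [0, 1, 1]]

level_day1 = level_area(image_day1)
level_day2 = level_area(image_day2)
change = changed_area(image_day1, image_day2)
```

In this toy case the level rises by only one cell between the two images, while three cells change state, which is why a change measure can show activity that a level measure understates.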

How to interpret the data

SpaceKnow indices can be compared with related economic indicators or KPIs. If the economic indicator is monthly, compute a 30-day rolling sum and take the value on the last day of the month; each data point then approximates the monthly total. If the economic indicator is quarterly, compute a 90-day rolling sum and take the value on the last day of the 90-day window; each data point then approximates the quarterly total.
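
The monthly alignment described above can be sketched in plain Python. This is a minimal illustration, assuming a gap-free daily series; the sample data and the helper names `rolling_sum` and `month_end_values` are hypothetical and not part of any SpaceKnow API.

```python
import calendar
from datetime import date, timedelta


def rolling_sum(values, window):
    """Trailing rolling sum. Positions with fewer than `window`
    observations yield None."""
    out = []
    running = 0.0
    for i, value in enumerate(values):
        running += value
        if i >= window:
            running -= values[i - window]  # drop the value leaving the window
        out.append(running if i >= window - 1 else None)
    return out


def month_end_values(dates, values, window=30):
    """Keep the rolling sum observed on the last calendar day of each month."""
    sums = rolling_sum(values, window)
    result = {}
    for day, total in zip(dates, sums):
        last_day = calendar.monthrange(day.year, day.month)[1]
        if day.day == last_day and total is not None:
            result[(day.year, day.month)] = total
    return result


# Hypothetical daily index: a constant 1.0 for 2023-01-01 through 2023-03-31.
start = date(2023, 1, 1)
dates = [start + timedelta(days=i) for i in range(90)]
values = [1.0] * 90

monthly = month_end_values(dates, values)
```

For a quarterly indicator, the same helper can be called with `window=90`, and the filter would need to keep only quarter-end dates. If the series has missing days, a calendar-based window would be more appropriate than the observation-count window used here.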

Where the data comes from

SpaceKnow brings you the data edge by applying machine learning and AI algorithms to synthetic aperture radar and optical satellite imagery. The company’s infrastructure searches for and downloads new imagery every day, and the data is computed in less than 24 hours.

In contrast to traditional economic data, which are released in monthly and quarterly terms, SpaceKnow data is high-frequency and available daily. It is possible to observe the latest movements in the construction industry with just a 4-6 day lag, on average.

The construction materials data help you to estimate the performance of the construction sector and the business activity of the selected companies.

Delivering high-quality data depends on accurately defining each location from which data is observed and extracted. All locations are thoroughly researched and validated by an in-house team of annotators and data analysts.

See below how our Construction Materials index performs against the US Non-residential construction spending benchmark

Each individual location is precisely defined to avoid noise in the data, which may arise from traffic or changing vegetation due to seasonal reasons.

SpaceKnow uses radar imagery and its own unique algorithms, so the indices do not lose their significance in bad weather conditions such as rain or heavy clouds.

→ Reach out to get a free trial

...
Pricing Trends for communications services - Dataset - data.gov.uk
ckan.publishing.service.gov.uk
Updated Feb 5, 2020
+ more versions
Cite
ckan.publishing.service.gov.uk (2020). Pricing Trends for communications services - Dataset - data.gov.uk [Dataset]. https://ckan.publishing.service.gov.uk/dataset/pricing-trends-for-communications-services
Dataset updated
Feb 5, 2020
Dataset provided by
CKAN (https://ckan.org/)
License
Open Government Licence 3.0 (http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/)
License information was derived automatically
Description
This report looks at pricing trends for residential phone, broadband and TV services in the UK. It examines the prices of standalone and bundled services and what consumers spend on them. It also looks at how consumers engage with the market, illustrating how the prices paid by engaged consumers (those who shop around and are aware of their contractual status) who are currently within a minimum contract period, differ from prices paid by consumers outside a minimum contractual period. The report also includes the results of our consumer engagement research, which looked at the reasons why consumers may not engage with the market.
United States Consumer Spending
tradingeconomics.com
tr.tradingeconomics.com
+13 more
csv, excel, json, xml
Updated Aug 15, 2025
Cite
TRADING ECONOMICS (2025). United States Consumer Spending [Dataset]. https://tradingeconomics.com/united-states/consumer-spending
Explore at:
Available download formats: xml, json, excel, csv
Dataset updated
Aug 15, 2025
Dataset authored and provided by
TRADING ECONOMICS
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Time period covered
Mar 31, 1947 - Jun 30, 2025
Area covered
United States
Description
Consumer Spending in the United States increased to 16445.70 USD Billion in the second quarter of 2025 from 16345.80 USD Billion in the first quarter of 2025. This dataset provides the latest reported value for - United States Consumer Spending - plus previous releases, historical high and low, short-term forecast and long-term prediction, economic calendar, survey consensus and news.
Open Budget Revenue - Current Year Totals
data.cstx.gov
csv, xlsx, xml
Updated Dec 1, 2025
+ more versions
Cite
(2025). Open Budget Revenue - Current Year Totals [Dataset]. https://data.cstx.gov/Finance/Open-Budget-Revenue-Current-Year-Totals/geui-qkfe
Explore at:
Available download formats: xlsx, csv, xml
Dataset updated
Dec 1, 2025
Description
This dataset provides our current fiscal year's revenue budget and a transparent look at how we allocate public funds. There is also a total at the bottom of the dataset. Datasets will update every Friday by 11 p.m. (CST).

