Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Here are a few use cases for this project:
Use Case 1: Gender-Based Retail Analytics. By analyzing customer demographics in retail stores, the "man vrouw dataset 1" model can help retailers understand the gender distribution of their shoppers and make informed decisions on store layout, marketing strategies, and product placement.
Use Case 2: Crowd Monitoring and Event Management. This model can help enhance safety and optimize visitor experience at crowded events, such as concerts or festivals, by identifying the gender distribution of attendees, enabling promoters to customize services, restroom allocation, and security measures accordingly.
Use Case 3: Digital Advertising and Marketing. Using the "man vrouw dataset 1" model, businesses can better target their digital advertisements by understanding the key demographic visiting specific websites or engaging with specific content, allowing for ad campaigns tailored to male or female audiences.
Use Case 4: Smart Surveillance and Security Systems. The model can be used in surveillance and security systems to help identify and track people by their HU classes (man or vrouw) in premises like airports or corporate buildings, allowing security teams to analyze patterns and prevent potential threats.
Use Case 5: Social Media Image Analysis. The "man vrouw dataset 1" model can be used to analyze the gender composition of social media images, providing insights into trends, preferences, and behaviors of different gender groups on social platforms. This information can then be used for targeted marketing or social research purposes.
We collect, validate, model, and segment raw data signals from more than 900 sources globally to deliver thousands of mobile audience segments. We then combine that data with other public and private data sources to derive interests, intent, and behavioral attributes. Our proprietary algorithms then clean, enrich, unify, and aggregate these data sets for use in our products. We have categorized our audience data into consumable categories such as interest, demographics, behavior, and geography.
Audience Data Categories: The data categories below include consumer behavioral data and consumer profiles (available for the US and Australia).
Brand Shoppers: Methodology: This category has been created based on the high intent of users in terms of their visits to Brand outlets in the real world. To create segments containing users with a high-affinity index, we use a precise determination of the number of occurrences at a given time.
Place Category Visitors: Methodology: This category has been created based on the high intent of users visiting specific places of interest in the real world. To create segments containing users with a high-affinity index, we use a precise determination of the number of occurrences at a given time.
Demographics: This category has been created based on deterministic data that we receive from apps based on the declared gender and age data. Marital Status, Education, Party affiliation, and State residency are available in the US.
Geo-Behavioural: This category has been created based on the high intent of users in terms of the frequency of their visits to specific granular places of interest in the real world. To create segments containing users with a high-affinity index, we use a precise determination of the number of occurrences at a given time.
Interests: This segment is created based on users' interest in a specific subject while browsing the internet when the visited website category is clearly focused on a specific subject such as cars, cooking, or traveling. We use a deterministic model to assign a proper profile and the time for which that information is valid. The recency of data can range from 14 to 30 days, depending on the topic.
Intent: Factori receives data from many partners to deliver high-quality pieces of information about users’ shopping intent. We collect data from sources connected to the eCommerce sector and we also receive data connected to online transactions from affiliate networks to deliver the most accurate segments with purchase intentions, such as laptops, mobile phones, or cars. The recency of data can range from 7 to 14 days depending on the product category.
Events: This category was created based on the high interest of users in terms of content related to specific global events - sports, culture, and gaming. Among the event segments, we also distinguish categories related to the interest in certain lifestyle choices and behaviors. To create segments containing users with a high-affinity index, we use a precise determination of the number of occurrences at a given time.
App Usage: The mobile category is a branch of the taxonomy that is dedicated only to the data that is based on mobile advertising IDs. It is based on the categorization of the mobile apps that the user has installed on the device.
Auto Ownership: Consumer Profiles - Available for US and Australia. This audience has been created based on users declaring that they own a certain brand of automobile and other automotive attributes via a survey or registration. These audiences are currently available in the USA.
Motorcycle Ownership: Consumer Profiles - Available for US and Australia. This audience has been created based on users declaring that they own a certain brand of motorcycle and other motorcycle-based attributes via a survey or registration. These audiences are currently available for the USA.
Household: Consumer Profiles - Available for the US and Australia. This audience has been created based on users declaring their marital status, parental status, and the overall number of children via a survey or registration. These audiences are currently available in the USA.
Financial: Consumer Profiles - Available for the US and Australia. This audience has been created based on users' behavior in different financial services, like property ownership, mortgage, investing behavior, and wealth, and on users declaring their estimated net worth via a survey or registration.
Purchase/Spending Behavior: Consumer Profiles - Available for the US and Australia. This audience has been created based on users' spending behavior in different business verticals available in the USA.
Clusters: Consumer Profiles - Available for the US and Australia. Clusters are groups of consumers who exhibit similar demographic, lifestyle, and media consumption characteristics, empowering marketers to understand the unique attributes that comprise their most profitable consumer segments. Armed with this rich data, data scientists can drive analytics and modeling to power their brand’s unique marketing initiatives.
B2B Audiences: Consumer Profiles - Available for US and Australia. This audience has been created based on users declaring their employee credentials, designations, and companies they work in, further specifying business verticals, revenue breakdowns, and headquarters locations.
Customizable Audiences Data Segment: Brands can choose the appropriate pre-made audience segments or ask our data experts to create a custom segment that is precisely tailored to their brief, in order to reach their target customers and boost the campaign's effectiveness.
Location Query Granularity: Minimum area: HEX 8. Maximum area: QuadKey 17/City.
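The visit-based categories above describe segments built from the number of occurrences in a given time window. A minimal sketch of that idea follows; the record layout, the `high_affinity_users` helper, and the `min_visits` threshold are illustrative assumptions, not the provider's actual methodology.

```python
from collections import Counter
from datetime import date


def high_affinity_users(visits, category, start, end, min_visits=3):
    """Return sorted user IDs with at least `min_visits` visits to `category`
    between `start` and `end` (inclusive).

    `visits` is an iterable of (user_id, place_category, visit_date) tuples.
    The threshold is a hypothetical stand-in for an affinity index.
    """
    counts = Counter(
        user_id
        for user_id, visited_category, visit_date in visits
        if visited_category == category and start <= visit_date <= end
    )
    return sorted(user for user, n in counts.items() if n >= min_visits)


# Toy records; all values are made up for illustration.
visits = [
    ("u1", "coffee_shop", date(2024, 1, 2)),
    ("u1", "coffee_shop", date(2024, 1, 9)),
    ("u1", "coffee_shop", date(2024, 1, 16)),
    ("u2", "coffee_shop", date(2024, 1, 3)),
    ("u2", "gym", date(2024, 1, 4)),
    ("u3", "coffee_shop", date(2023, 12, 30)),
]

segment = high_affinity_users(
    visits, "coffee_shop", date(2024, 1, 1), date(2024, 1, 31)
)
print(segment)
```

Only the user who reaches the threshold inside the window lands in the segment; lowering `min_visits` or widening the window admits more users.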
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
This anonymized data set consists of one month's (October 2018) web tracking data of 2,148 German users. For each user, the data contain the anonymized URL of each webpage the user visited, the domain of the webpage, and the category of the domain (one of 41 distinct categories). In total, these 2,148 users made 9,151,243 URL visits, spanning 49,918 unique domains. For each user in our data set, we have self-reported information (collected via a survey) about their gender and age.
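As a rough illustration of how visit records of this shape might be summarized, the sketch below computes, for each user, the share of visits falling into each domain category. The tuple layout, the example domains, and the `category_shares` helper are assumptions made for the example; the actual file format is not described here.

```python
from collections import Counter, defaultdict


def category_shares(visits):
    """Map each user to the fraction of their visits in each domain category.

    `visits` is an iterable of (user_id, domain, category) tuples
    (a hypothetical layout, not the data set's documented schema).
    """
    counts = defaultdict(Counter)
    for user_id, _domain, category in visits:
        counts[user_id][category] += 1

    shares = {}
    for user_id, per_category in counts.items():
        total = sum(per_category.values())
        shares[user_id] = {c: n / total for c, n in per_category.items()}
    return shares


# Toy records; domains and categories are invented.
visits = [
    ("user_a", "news.example", "news"),
    ("user_a", "shop.example", "shopping"),
    ("user_a", "daily.example", "news"),
    ("user_a", "mail.example", "email"),
    ("user_b", "shop.example", "shopping"),
]

shares = category_shares(visits)
print(shares["user_a"])
```

Per-user shares like these could then be joined with the self-reported gender and age for analysis.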
We acknowledge the support of Respondi AG, which provided the web tracking and survey data free of charge for research purposes, with special thanks to François Erner and Luc Kalaora at Respondi for their insights and help with data extraction.
The data set is analyzed in the following paper:
The code used to analyze the data is also available at https://github.com/gesiscss/web_tracking.
If you use data or code from this repository, please cite the paper above and the Zenodo link.
https://creativecommons.org/publicdomain/zero/1.0/
Uplift modeling is an important yet novel area of research in machine learning which aims to explain and to estimate the causal impact of a treatment at the individual level. In the digital advertising industry, the treatment is exposure to different ads, and uplift modeling is used to direct marketing efforts towards users for whom it is the most efficient. The data is a collection of 13 million samples from a randomized control trial, scaling up previously available datasets by a factor of 590.
The dataset was created by the Criteo AI Lab. It consists of 13M rows, each one representing a user with 12 features, a treatment indicator and 2 binary labels (visits and conversions). Positive labels mean the user visited/converted on the advertiser website during the test period (2 weeks). The global treatment ratio is 84.6%. Advertisers usually keep only a small control population, as it costs them potential revenue.
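With a treatment indicator and binary labels from a randomized trial, the simplest quantity to compute is the average uplift: the response rate among treated users minus the response rate among control users. The sketch below shows this on toy rows; the column names `treatment`, `visit`, and `conversion` are assumptions for illustration, and this population-level difference is only a baseline, not an individual-level uplift model.

```python
def average_uplift(rows, label="visit"):
    """Difference in mean `label` between treated and control rows.

    `rows` is a list of dicts with a binary `treatment` key and binary labels.
    Key names are assumed for this example.
    """
    treated = [r[label] for r in rows if r["treatment"] == 1]
    control = [r[label] for r in rows if r["treatment"] == 0]
    if not treated or not control:
        raise ValueError("need at least one treated and one control row")
    return sum(treated) / len(treated) - sum(control) / len(control)


def treatment_ratio(rows):
    """Share of rows that received the treatment."""
    return sum(r["treatment"] for r in rows) / len(rows)


# Toy rows; values are invented and far smaller than the real data.
rows = [
    {"treatment": 1, "visit": 1, "conversion": 0},
    {"treatment": 1, "visit": 0, "conversion": 0},
    {"treatment": 1, "visit": 1, "conversion": 1},
    {"treatment": 1, "visit": 1, "conversion": 0},
    {"treatment": 0, "visit": 0, "conversion": 0},
    {"treatment": 0, "visit": 1, "conversion": 0},
]

print(average_uplift(rows, "visit"))
print(average_uplift(rows, "conversion"))
print(treatment_ratio(rows))
```

Because assignment is randomized, this difference estimates the average causal effect of ad exposure; uplift models go further and estimate how that effect varies with the user features.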
Following is a detailed description of the features:
The data were provided for the paper "A Large Scale Benchmark for Uplift Modeling":
https://s3.us-east-2.amazonaws.com/criteo-uplift-dataset/large-scale-benchmark.pdf
For privacy reasons the data has been sub-sampled non-uniformly so that the original incrementality level cannot be deduced from the dataset while preserving a realistic, challenging benchmark. Feature names have been anonymized and their values randomly projected so as to keep predictive power while making it practically impossible to recover the original features or user context.
We can foresee related usages such as but not limited to:
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The data, collected in 2024, provides a comprehensive snapshot of travel patterns and preferences within the mosque site area in Solo. This data, gathered across five distinct locations within the mosque complex, delves into the motivations and choices of individuals visiting the site.
The dataset encompasses a range of factors influencing travel decisions. It meticulously records travel characteristics, such as the primary purpose of the trip, the distance traveled to reach the mosque, and the duration of the journey. Additionally, it captures the parking fee incurred by visitors, offering insights into the economic considerations associated with travel to the mosque.
Beyond travel details, the dataset also profiles the respondents themselves. It captures demographic information, including gender, age, and occupation, providing a nuanced understanding of the diverse population visiting the mosque. Furthermore, it delves into economic indicators, such as monthly income and vehicle ownership, revealing the socioeconomic factors that influence travel choices.
This rich dataset serves as a valuable resource for understanding travel behavior within the mosque site area. By analyzing the collected data, researchers can gain valuable insights into the factors influencing travel choices, identify potential areas for improvement in accessibility and convenience, and develop strategies to enhance the overall experience for visitors.
The dataset collection includes information on accidental deaths (excluding road traffic deaths) among the population aged 1-7 per 100,000 persons of the same age in Finland. The dataset table, titled 'Accidental Deaths Excluding Road Traffic Deaths Among Population Aged 1-7 per 100,000 Persons of Same Age in Finland', is sourced from the website 'Sotkanet' in Finland.
Dataset for the textbook Computational Methods and GIS Applications in Social Science (3rd Edition), 2023
Fahui Wang, Lingbo Liu
Main Book Citation: Wang, F., & Liu, L. (2023). Computational Methods and GIS Applications in Social Science (3rd ed.). CRC Press. https://doi.org/10.1201/9781003292302
KNIME Lab Manual Citation: Liu, L., & Wang, F. (2023). Computational Methods and GIS Applications in Social Science - Lab Manual. CRC Press. https://doi.org/10.1201/9781003304357
KNIME Hub Dataset and Workflow for Computational Methods and GIS Applications in Social Science-Lab Manual
Update Log
If a Python package is not found in Package Management, use ArcGIS Pro's Python Command Prompt to install it, e.g., conda install -c conda-forge python-igraph leidenalg
NetworkCommDetPro in CMGIS-V3-Tools was updated on July 10, 2024
Add spatial adjacency table into Florida on June 29, 2024
The dataset and tool for ABM Crime Simulation were updated on August 3, 2023
The toolkits in CMGIS-V3-Tools were updated on August 3, 2023
Report Issues on GitHub: https://github.com/UrbanGISer/Computational-Methods-and-GIS-Applications-in-Social-Science
Following the website of Fahui Wang: http://faculty.lsu.edu/fahui
Contents
Chapter 1. Getting Started with ArcGIS: Data Management and Basic Spatial Analysis Tools
Case Study 1: Mapping and Analyzing Population Density Pattern in Baton Rouge, Louisiana
Chapter 2. Measuring Distance and Travel Time and Analyzing Distance Decay Behavior
Case Study 2A: Estimating Drive Time and Transit Time in Baton Rouge, Louisiana
Case Study 2B: Analyzing Distance Decay Behavior for Hospitalization in Florida
Chapter 3. Spatial Smoothing and Spatial Interpolation
Case Study 3A: Mapping Place Names in Guangxi, China
Case Study 3B: Area-Based Interpolations of Population in Baton Rouge, Louisiana
Case Study 3C: Detecting Spatiotemporal Crime Hotspots in Baton Rouge, Louisiana
Chapter 4. Delineating Functional Regions and Applications in Health Geography
Case Study 4A: Defining Service Areas of Acute Hospitals in Baton Rouge, Louisiana
Case Study 4B: Automated Delineation of Hospital Service Areas in Florida
Chapter 5. GIS-Based Measures of Spatial Accessibility and Application in Examining Healthcare Disparity
Case Study 5: Measuring Accessibility of Primary Care Physicians in Baton Rouge
Chapter 6. Function Fittings by Regressions and Application in Analyzing Urban Density Patterns
Case Study 6: Analyzing Population Density Patterns in Chicago Urban Area
Chapter 7. Principal Components, Factor and Cluster Analyses and Application in Social Area Analysis
Case Study 7: Social Area Analysis in Beijing
Chapter 8. Spatial Statistics and Applications in Cultural and Crime Geography
Case Study 8A: Spatial Distribution and Clusters of Place Names in Yunnan, China
Case Study 8B: Detecting Colocation Between Crime Incidents and Facilities
Case Study 8C: Spatial Cluster and Regression Analyses of Homicide Patterns in Chicago
Chapter 9. Regionalization Methods and Application in Analysis of Cancer Data
Case Study 9: Constructing Geographical Areas for Mapping Cancer Rates in Louisiana
Chapter 10. System of Linear Equations and Application of Garin-Lowry in Simulating Urban Population and Employment Patterns
Case Study 10: Simulating Population and Service Employment Distributions in a Hypothetical City
Chapter 11. Linear and Quadratic Programming and Applications in Examining Wasteful Commuting and Allocating Healthcare Providers
Case Study 11A: Measuring Wasteful Commuting in Columbus, Ohio
Case Study 11B: Location-Allocation Analysis of Hospitals in Rural China
Chapter 12. Monte Carlo Method and Applications in Urban Population and Traffic Simulations
Case Study 12A: Examining Zonal Effect on Urban Population Density Functions in Chicago by Monte Carlo Simulation
Case Study 12B: Monte Carlo-Based Traffic Simulation in Baton Rouge, Louisiana
Chapter 13. Agent-Based Model and Application in Crime Simulation
Case Study 13: Agent-Based Crime Simulation in Baton Rouge, Louisiana
Chapter 14. Spatiotemporal Big Data Analytics and Application in Urban Studies
Case Study 14A: Exploring Taxi Trajectory in ArcGIS
Case Study 14B: Identifying High Traffic Corridors and Destinations in Shanghai
Dataset File Structure
1 BatonRouge Census.gdb BR.gdb
2A BatonRouge BR_Road.gdb Hosp_Address.csv TransitNetworkTemplate.xml BR_GTFS Google API Pro.tbx
2B Florida FL_HSA.gdb R_ArcGIS_Tools.tbx (RegressionR)
3A China_GX GX.gdb
3B BatonRouge BR.gdb
3C BatonRouge BRcrime R_ArcGIS_Tools.tbx (STKDE)
4A BatonRouge BRRoad.gdb
4B Florida FL_HSA.gdb HSA Delineation Pro.tbx Huff Model Pro.tbx FLplgnAdjAppend.csv
5 BRMSA BRMSA.gdb Accessibility Pro.tbx
6 Chicago ChiUrArea.gdb R_ArcGIS_Tools.tbx (RegressionR)
7 Beijing BJSA.gdb bjattr.csv R_ArcGIS_Tools.tbx (PCAandFA, BasicClustering)
8A Yunnan YN.gdb R_ArcGIS_Tools.tbx (SaTScanR)
8B Jiangsu JS.gdb
8C Chicago ChiCity.gdb cityattr.csv ...
https://spdx.org/licenses/CC0-1.0.html
Environmental volunteering can benefit participants and nature through improving physical and mental wellbeing while encouraging environmental stewardship. To enhance achievement of these outcomes, conservation organisations need to reach different groups of people to increase participation in environmental volunteering. This paper explores what engages communities searching online for environmental volunteering.
We conducted a literature review of 1032 papers to determine key factors fostering participation by existing volunteers in environmental projects. We found the most important factor was to tailor projects to the motivations of participants. Also important were: promoting projects to people with relevant interests; meeting the perceived benefits of volunteers and removing barriers to participation.
We then assessed the composition and factors fostering participation of the NatureVolunteers’s online community (n = 2216) of potential environmental volunteers and compared findings with those from the literature review. We asked whether projects advertised by conservation organisations meet motivations and interests of this online community.
Using Facebook insights and Google Analytics we found that the online community were on average younger than extant communities observed in studies of environmental volunteering. Their motivations were also different as they were more interested in physical activity and using skills and less in social factors. They also exhibited preference for projects which are outdoor based, and which offer close contact with wildlife. Finally, we found that the online community showed a stronger preference for habitat improvement projects over those involving species-survey based citizen science.
Our results demonstrate mis-matches between what our online community are looking for and what is advertised by conservation organisations. The online community are looking for projects which are more solitary, more physically active and more accessible by organised transport. We discuss how our results may be used by conservation organisations to better engage with more people searching for environmental volunteering opportunities online.
We conclude that there is a pool of young people attracted to environmental volunteering projects whose interests are different to those of current volunteers. If conservation organisations can develop projects that meet these interests, they can engage larger and more diverse communities in nature volunteering.
Methods The data set consists of separate sheets for each set of results presented in the paper. Each sheet contains the full data, summary descriptive statistics analysis and graphs presented in the paper. The method for collection and processing of the dataset in each sheet is as follows:
The data set for results presented in Figure 1 in the paper - Sheet: "Literature"
We conducted a review of literature on improving participation within nature conservation projects. This enabled us to determine what the most important factors were for participating in environmental projects, the composition of the populations sampled and the methods by which data were collected. The search terms used were (Environment* OR nature OR conservation) AND (Volunteer* OR “citizen science”) AND (Recruit* OR participat* OR retain* OR interest*). We reviewed all articles identified in the Web of Science database and the first 50 articles sorted for relevance in Google Scholar on the 22nd October 2019. Articles were first reviewed by title, secondly by abstract and thirdly by full text. They were retained or excluded according to criteria agreed by the authors of this paper. These criteria were as follows - that the paper topic was volunteering in the environment, including citizen science, community-based projects and conservation abroad, and included the study of factors which could improve participation in projects. Papers were excluded for topics irrelevant to this study, the most frequent being the outcomes of volunteering for participants (such as behavioural change and knowledge gain), improving citizen science data and the usefulness of citizen science data. The remaining final set of selected papers was then read to extract information on the factors influencing participation, the population sampled and the data collection methods. In total 1032 papers were reviewed of which 31 comprised the final selected set read in full. Four factors were identified in these papers which improve volunteer recruitment and retention. These were: tailoring projects to the motivations of participants, promoting projects to people with relevant hobbies and interests, meeting the perceived benefits of volunteers and removing barriers to participation.
The data set for results presented in Figure 2 and Figure 3 in the paper - Sheet "Demographics"
To determine if the motivations and interests expressed by volunteers in literature were representative of wider society, NatureVolunteers was exhibited at three UK public engagement events during May and June 2019; Hullabaloo Festival (Isle of Wight), The Great Wildlife Exploration (Bournemouth) and Festival of Nature (Bristol). This allowed us to engage with people who may not have ordinarily considered volunteering and encourage people to use the website. A combination of surveys and semi-structured interviews were used to collect information from the public regarding demographics and volunteering. In line with our ethics approval, no personal data were collected that could identify individuals and all participants gave informed consent for their anonymous information to be used for research purposes. The semi-structured interviews consisted of conducting the survey in a conversation with the respondent, rather than the respondent filling in the questionnaire privately and responses were recorded immediately by the interviewer. Hullabaloo Festival was a free discovery and exploration event where NatureVolunteers had a small display and surveys available. The Great Wildlife Exploration was a Bioblitz designed to highlight the importance of urban greenspaces where we had a stall with wildlife crafts promoting NatureVolunteers. The Festival of Nature was the UK’s largest nature-based festival in 2019 where we again had wildlife crafts available promoting NatureVolunteers. The surveys conducted at these events sampled a population of people who already expressed an interest in nature and the environment by attending the events and visiting the NatureVolunteers stand. In total 100 completed surveys were received from the events NatureVolunteers exhibited at; 21 from Hullabaloo Festival, 25 from the Great Wildlife Exploration and 54 from the Festival of Nature. 
At Hullabaloo Festival information on gender was not recorded for all responses and was consequently entered as “unrecorded”.
OVERALL DESCRIPTION OF METHOD DATA COLLECTION FOR ALL OTHER RESULTS (Figures 4-7 and Tables 1-2)
The remaining data were all collected from the NatureVolunteers website. The NatureVolunteers website https://www.naturevolunteers.uk/ was set up in 2018 with funding support from the Higher Education Innovation Fund to expand the range of people accessing nature volunteering opportunities in the UK. It is designed to particularly appeal to people who are new to nature volunteering, including young adults wishing to expand their horizons, families looking for ways to connect with nature to enhance well-being, and older people wishing to share their time and life experiences to help nature. In addition, it was designed to be helpful to professionals working in the countryside and wildlife conservation sectors who wish to enhance their skills through volunteering. As part of the website’s development we created and used an online project database, www.naturevolunteers.uk (hereafter referred to as NatureVolunteers), to assess the needs and interests of our online community. Our research work was granted ethical approval by the Bournemouth University Ethics Committee. The website collects entirely anonymous data on our online community of website users that enables us to evaluate what sort of projects and project attributes most appeal to our online community. Visitors using the website to find projects are informed, as part of the guidance on using the search function, that this fully anonymous information is collected by the website to enhance and share research understanding of how conservation organisations can tailor their future projects to better match the interests of potential volunteers.
Our online community was built up over 2018-2019 through open advertising of the website nationally through the social media channels of our partner conservation organisations, through a range of public engagement in science events and nature-based festivals across southern England, and through our extended network of friends and families, their own social media networks and the NatureVolunteers website’s own social network on Facebook and Twitter. There were 2216 searches for projects on NatureVolunteers from January 1st to October 25th, 2019.
The data set for results presented in Figure 2 and Figure 3 in the paper - Sheet "Demographics"
On the website, users searching for projects were firstly asked to specify their expectations of projects. These expectations encompass the benefits of volunteering by asking whether the project includes social interaction, whether particular skills are required or can be developed, and whether physical activity is involved. The barriers to participation are incorporated by asking whether the project is suitable for families, and whether organised transport is provided. Users were asked to rate the importance of the five project expectations on a Likert scale of 1 to 5 (Not at all = 1, Not really = 2, Neutral = 3, It
The population share with mobile internet access in North America was forecast to increase by a total of 2.9 percentage points between 2024 and 2029. The increase is not continuous, notably not in 2028 and 2029. Mobile internet penetration is estimated to reach 84.21 percent in 2029. Notably, the population share with mobile internet access had been increasing continuously over the preceding years. The penetration rate refers to the share of the total population with access to the internet via a mobile broadband connection. The data shown are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to 150 countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press, and they are processed to generate comparable data sets (see supplementary notes under details for more information). Find more key insights for the population share with mobile internet access in regions such as the Caribbean and Europe.
Home to NTIA data and analysis on computer and Internet use in the United States. Since November 1994, NTIA has periodically sponsored data collections on Internet use and the devices Americans use to go online as a supplement to the Census Bureau’s annual Current Population Survey (CPS); analyzed the data; and reported the findings. In recent years, NTIA has also linked to the raw datasets on the Census Bureau website. To facilitate the public’s access to the CPS Internet use data, NTIA is now making these data available here, and has developed an important tool to help site visitors find information quickly. Our Data Explorer tool enables users to select from dozens of metrics tracked over time, as well as a number of demographic characteristics, and charts the requested data. NTIA invites your feedback at data@ntia.doc.gov as we continually improve Data Central.
https://www.icpsr.umich.edu/web/ICPSR/studies/32721/terms
The Study of Women's Health Across the Nation (SWAN) is a multi-site, longitudinal, epidemiologic study designed to examine the health of women during their middle years. The study examines the physical, biological, psychological and social changes during this transitional period. The goal of SWAN's research is to help scientists, health care providers and women learn how mid-life experiences affect health and quality of life during aging. Data were collected about doctor visits, medical conditions, medications, treatments, medical procedures, relationships, smoking, and menopause-related information such as age at pre-, peri- and post-menopause, self-attitudes, feelings, and common physical problems associated with menopause. The study began in 1994. Between 2005 and 2007, 2,255 of the 3,302 women who joined SWAN were seen for their ninth follow-up visit. The research centers are located in the following communities: Ypsilanti and Inkster, MI (University of Michigan); Boston, MA (Massachusetts General Hospital); Chicago, IL (Rush Presbyterian-St. Luke's Medical Center); Alameda and Contra Costa County, CA (University of California-Davis and Kaiser Permanente); Los Angeles, CA (University of California-Los Angeles); Hackensack, NJ (Hackensack University Medical Center); and Pittsburgh, PA (University of Pittsburgh). SWAN participants represent five racial/ethnic groups and a variety of backgrounds and cultures. Though the New Jersey site was still part of the study, data were not collected from this site for the ninth visit. Demographic and background information includes age, language of interview, marital status, household composition, and employment.
https://spdx.org/licenses/CC0-1.0.html
This project investigates air pollution in California communities disproportionately affected by their proximity to transportation corridors, industrial facilities, and logistics centers, focusing on truck-related activities, including idling, parking search, and parking demand, using comprehensive datasets and robust models employing techniques such as Random Forest, Convolutional Neural Network, Bayesian Ridge Regression, and Spatial Error Model. Key findings reveal factors affecting idling times, parking search times, and parking demand, with heavy-duty trucks having the highest idle times and parking search challenges concentrated around transportation arteries and freight yards. The Spatial Error Model highlights relationships between truck activities, socio-economic variables, and air pollution in AB 617 communities. Based on these findings, preliminary policy recommendations include targeted anti-idling campaigns, improved truck parking facilities, cleaner fuels and technologies, enhanced routing efficiency, stricter emission standards, and strengthened land-use planning.
Methods
The data submitted in this dataset originates from various sources, with each source providing unique insights into the study of truck idling and parking in AB 617 Disadvantaged Communities. The dataset submitted here is the result of careful processing and manipulation of the original datasets to create a comprehensive view of truck idling and parking behaviors.
1. Geotab Ignition Platform Data
Though not directly included in this submission due to licensing restrictions, data from the Geotab Ignition platform was instrumental in the creation of this dataset. It includes raw idling data, raw data for searching for parking, and raw truck parking location data. We used these datasets to extract key metrics related to truck idling and parking behaviors. The Geotab data was processed and aggregated to obtain daily idling times and parking search times. This was done by using the geohash provided to group data by location and then computing the daily metrics. Please note that due to licensing restrictions, the raw Geotab data is not included in this submission. For those interested in using the Geotab data, please refer to the Geotab website to access the data directly.
2. CalEnviroScreen 4.0, Census data, and OpenStreetMap (OSM)
These datasets provided contextual information, such as demographics and infrastructure, which were used to enrich the idling and parking data derived from the Geotab datasets. For example, demographic data from the Census and CalEnviroScreen 4.0 was used to identify disadvantaged communities, while data from OpenStreetMap was used to map idling and parking behavior to specific locations.
3. Kern County Traffic Count Data System (TCDS) Data
The TCDS data was used to provide a count of truck traffic at various locations. This data was integrated with the processed Geotab data to provide a more complete picture of truck activity in the study areas.
4. Final Dataset (The Dataset Used for Modeling)
The final dataset was created by merging the processed Geotab data with the relevant data from the other sources. This process involved spatially joining the Geotab and TCDS data based on location and then appending the relevant demographic and infrastructure data from CalEnviroScreen 4.0, Census, and OSM. The result is a comprehensive dataset that provides a detailed view of truck idling and parking behavior in AB 617 Disadvantaged Communities.
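The geohash-based daily aggregation described for the Geotab data can be illustrated with a minimal Python sketch. The record layout, the geohash strings, and the function name `daily_idle_minutes` are hypothetical, since the raw Geotab schema is not part of this submission; the sketch only shows the general idea of grouping events by geohash and calendar day and summing a metric.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical raw idling events: (geohash, ISO timestamp, idle minutes).
# The real Geotab field names and units are not given in the description.
events = [
    ("9q5ctr", "2022-03-01T08:15:00", 12.0),
    ("9q5ctr", "2022-03-01T17:40:00", 8.0),
    ("9q5ctr", "2022-03-02T09:05:00", 5.0),
    ("9qh0b2", "2022-03-01T11:30:00", 20.0),
]

def daily_idle_minutes(rows):
    """Sum idle minutes per (geohash, calendar date)."""
    totals = defaultdict(float)
    for geohash, timestamp, minutes in rows:
        day = datetime.fromisoformat(timestamp).date()
        totals[(geohash, day)] += minutes
    return dict(totals)

daily = daily_idle_minutes(events)
for (geohash, day), minutes in sorted(daily.items()):
    print(geohash, day.isoformat(), minutes)
```

The same grouping key could plausibly be reused for parking search times, with the summed field swapped accordingly.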
Attribution 3.0 (CC BY 3.0)https://creativecommons.org/licenses/by/3.0/
License information was derived automatically
This dataset presents the ratio of tourist visitors to the population of the region for Tourism Regions around Australia for the years 2006/07 to 2014/15. The Tourism Regions covered in the data are from the 2014 release of the Tourism Regions from the Australian Bureau of Statistics.
Tourism Research Australia's (TRA) Tourism Region Profiles provide comprehensive supply and demand tourism data for all of Australia's 2014 tourism regions. The data includes:
Total tourism expenditure
Overnight visitors
Visitor/population ratio
Accommodation (rooms, occupancy and RevPAR)
Aviation (seats available and seat utilisation)
Tourism businesses
Tourism investment (projects and value)
For more information please visit the Website of the TRA.
Please note:
AURIN has spatially enabled the original data.
Where data values were "np" (not published) or "-" (not available) in the original data, they have been set to null.
People aged 15 years and over are used for both visitors and population.
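As a rough illustration of the visitor/population ratio and the null handling noted above, here is a minimal Python sketch. The function names and the input strings are hypothetical, and the ratio is taken simply as visitors divided by population; the exact TRA definition may differ.

```python
def parse_value(raw):
    """Return a float, or None for 'np' (not published) or '-' (not available)."""
    text = str(raw).strip()
    if text in {"np", "-"}:
        return None
    return float(text.replace(",", ""))

def visitor_population_ratio(visitors, population):
    """Visitors divided by population, or None if either value is missing."""
    v = parse_value(visitors)
    p = parse_value(population)
    if v is None or p is None or p == 0:
        return None
    return v / p

print(visitor_population_ratio("1,200", "400"))
print(visitor_population_ratio("np", "400"))
```

Returning `None` instead of zero keeps suppressed values distinguishable from genuinely small ratios.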
ODC Public Domain Dedication and Licence (PDDL) v1.0http://www.opendatacommons.org/licenses/pddl/1.0/
License information was derived automatically
A log of dataset alerts open, monitored or resolved on the open data portal. Alerts can include issues as well as deprecation or discontinuation notices.
https://www.usa.gov/government-works
Reporting of Aggregate Case and Death Count data was discontinued May 11, 2023, with the expiration of the COVID-19 public health emergency declaration. Although these data will continue to be publicly available, this dataset will no longer be updated.
This archived public use dataset has 11 data elements reflecting United States COVID-19 community levels for all available counties.
The COVID-19 community levels were developed using a combination of three metrics — new COVID-19 admissions per 100,000 population in the past 7 days, the percent of staffed inpatient beds occupied by COVID-19 patients, and total new COVID-19 cases per 100,000 population in the past 7 days. The COVID-19 community level was determined by the higher of the new admissions and inpatient beds metrics, based on the current level of new cases per 100,000 population in the past 7 days. New COVID-19 admissions and the percent of staffed inpatient beds occupied represent the current potential for strain on the health system. Data on new cases acts as an early warning indicator of potential increases in health system strain in the event of a COVID-19 surge.
Using these data, the COVID-19 community level was classified as low, medium, or high.
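The classification logic described above can be sketched in Python as follows. The numeric cut-offs (200 cases per 100,000; 10 and 20 admissions per 100,000; 10% and 15% bed occupancy) are assumptions that are not stated in this description and should be checked against CDC's published framework before use; the function and constant names are hypothetical.

```python
LEVELS = ["low", "medium", "high"]

# Assumed thresholds, not taken from this dataset description.
CASE_CUTOFF = 200.0           # new cases per 100,000 in the past 7 days
ADMISSION_CUTOFFS = (10.0, 20.0)  # new admissions per 100,000 in the past 7 days
BED_CUTOFFS = (10.0, 15.0)        # percent of staffed inpatient beds

def community_level(cases_per_100k, admissions_per_100k, bed_pct):
    """Return 'low', 'medium', or 'high' as the higher of the two hospital metrics."""
    if cases_per_100k < CASE_CUTOFF:
        # Lower case rate: each metric can be low, medium, or high.
        adm = sum(admissions_per_100k >= c for c in ADMISSION_CUTOFFS)
        bed = sum(bed_pct >= c for c in BED_CUTOFFS)
    else:
        # Higher case rate: the floor is medium, and the first cut-off means high.
        adm = 1 if admissions_per_100k < ADMISSION_CUTOFFS[0] else 2
        bed = 1 if bed_pct < BED_CUTOFFS[0] else 2
    return LEVELS[max(adm, bed)]

print(community_level(50, 5, 3))
print(community_level(250, 5, 12))
```

The case rate selects which set of bands applies, and the final level is the more severe of the admissions and bed-occupancy results, matching the "higher of" rule in the text.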
COVID-19 Community Levels were used to help communities and individuals make decisions based on their local context and their unique needs. Community vaccination coverage and other local information, like early alerts from surveillance, such as through wastewater or the number of emergency department visits for COVID-19, when available, can also inform decision making for health officials and individuals.
For the most accurate and up-to-date data for any county or state, visit the relevant health department website. COVID Data Tracker may display data that differ from state and local websites. This can be due to differences in how data were collected, how metrics were calculated, or the timing of web updates.
Archived Data Notes:
This dataset was renamed from "United States COVID-19 Community Levels by County as Originally Posted" to "United States COVID-19 Community Levels by County" on March 31, 2022.
March 31, 2022: Column name for county population was changed to “county_population”. No change was made to the data points previously released.
March 31, 2022: New column, “health_service_area_population”, was added to the dataset to denote the total population in the designated Health Service Area based on 2019 Census estimate.
March 31, 2022: FIPS codes for territories American Samoa, Guam, Commonwealth of the Northern Mariana Islands, and United States Virgin Islands were re-formatted to 5-digit numeric for records released on 3/3/2022 to be consistent with other records in the dataset.
March 31, 2022: Changes were made to the text fields in variables “county”, “state”, and “health_service_area” so the formats are consistent across releases.
March 31, 2022: The “%” sign was removed from the text field in column “covid_inpatient_bed_utilization”. No change was made to the data. As indicated in the column description, values in this column represent the percentage of staffed inpatient beds occupied by COVID-19 patients (7-day average).
March 31, 2022: Data values for columns “county_population”, “health_service_area_number”, and “health_service_area” were backfilled for records released on 2/24/2022. These columns were added starting the week of 3/3/2022, so the values were previously missing for records released the week prior.
April 7, 2022: Updates made to data released on 3/24/2022 for Guam, Commonwealth of the Northern Mariana Islands, and United States Virgin Islands to correct a data mapping error.
April 21, 2022: COVID-19 Community Level (CCL) data released for counties in Nebraska for the week of April 21, 2022 have 3 counties identified in the high category and 37 in the medium category. CDC has been working with state officials to verify the data submitted, as other data systems are not providing alerts for substantial increases in disease transmission or severity in the state.
May 26, 2022: COVID-19 Community Level (CCL) data released for McCracken County, KY for the week of May 5, 2022 have been updated to correct a data processing error. McCracken County, KY should have appeared in the low community level category during the week of May 5, 2022. This correction is reflected in this update.
May 26, 2022: COVID-19 Community Level (CCL) data released for several Florida counties for the week of May 19, 2022 have been corrected for a data processing error. Of note, Broward, Miami-Dade, and Palm Beach Counties should have appeared in the high CCL category, and Osceola County should have appeared in the medium CCL category. These corrections are reflected in this update.
May 26, 2022: COVID-19 Community Level (CCL) data released for Orange County, New York for the week of May 26, 2022 displayed an erroneous case rate of zero and a CCL category of low due to a data source error. This county should have appeared in the medium CCL category.
June 2, 2022: COVID-19 Community Level (CCL) data released for Tolland County, CT for the week of May 26, 2022 have been updated to correct a data processing error. Tolland County, CT should have appeared in the medium community level category during the week of May 26, 2022. This correction is reflected in this update.
June 9, 2022: COVID-19 Community Level (CCL) data released for Tolland County, CT for the week of May 26, 2022 have been updated to correct a misspelling. The medium community level category for Tolland County, CT on the week of May 26, 2022 was misspelled as “meduim” in the data set. This correction is reflected in this update.
June 9, 2022: COVID-19 Community Level (CCL) data released for Mississippi counties for the week of June 9, 2022 should be interpreted with caution due to a reporting cadence change over the Memorial Day holiday that resulted in artificially inflated case rates in the state.
July 7, 2022: COVID-19 Community Level (CCL) data released for Rock County, Minnesota for the week of July 7, 2022 displayed an artificially low case rate and CCL category due to a data source error. This county should have appeared in the high CCL category.
July 14, 2022: COVID-19 Community Level (CCL) data released for Massachusetts counties for the week of July 14, 2022 should be interpreted with caution due to a reporting cadence change that resulted in lower than expected case rates and CCL categories in the state.
July 28, 2022: COVID-19 Community Level (CCL) data released for all Montana counties for the week of July 21, 2022 had case rates of 0 due to a reporting issue. The case rates have been corrected in this update.
July 28, 2022: COVID-19 Community Level (CCL) data released for Alaska for all weeks prior to July 21, 2022 included non-resident cases. The case rates for the time series have been corrected in this update.
July 28, 2022: A laboratory in Nevada reported a backlog of historic COVID-19 cases. As a result, the 7-day case count and rate will be inflated in Clark County, NV for the week of July 28, 2022.
August 4, 2022: COVID-19 Community Level (CCL) data was updated on August 2, 2022 in error during performance testing. Data for the week of July 28, 2022 was changed during this update due to additional case and hospital data as a result of late reporting between July 28, 2022 and August 2, 2022. Since the purpose of this data set is to provide point-in-time views of COVID-19 Community Levels on Thursdays, any changes made to the data set during the August 2, 2022 update have been reverted in this update.
August 4, 2022: Case data in the COVID-19 Community Level (CCL) data for the week of July 28, 2022 were missing for 8 counties in Utah (Beaver County, Daggett County, Duchesne County, Garfield County, Iron County, Kane County, Uintah County, and Washington County) due to data collection issues. CDC and its partners have resolved the issue and the correction is reflected in this update.
August 4, 2022: Due to a reporting cadence change, case rates for all Alabama counties will be lower than expected. As a result, the CCL levels published on August 4, 2022 should be interpreted with caution.
August 11, 2022: COVID-19 Community Level (CCL) data for the week of August 4, 2022 for South Carolina have been updated to correct a data collection error that resulted in incorrect case data. CDC and its partners have resolved the issue and the correction is reflected in this update.
August 18, 2022: COVID-19 Community Level (CCL) data for the week of August 11, 2022 for Connecticut have been updated to correct a data ingestion error that inflated the CT case rates. CDC, in collaboration with CT, has resolved the issue and the correction is reflected in this update.
August 25, 2022: A laboratory in Tennessee reported a backlog of historic COVID-19 cases. As a result, the 7-day case count and rate may be inflated in many counties and the CCLs published on August 25, 2022 should be interpreted with caution.
August 25, 2022: Due to a data source error, the 7-day case rate for St. Louis County, Missouri, is reported as zero in the COVID-19 Community Level data released on August 25, 2022. Therefore, the COVID-19 Community Level for this county should be interpreted with caution.
September 1, 2022: Due to a reporting issue, case rates for all Nebraska counties will include 6 days of data instead of 7 days in the COVID-19 Community Level (CCL) data released on September 1, 2022. Therefore, the CCLs for all Nebraska counties should be interpreted with caution.
September 8, 2022: Due to a data processing error, the case rate for Philadelphia County, Pennsylvania,
Welcome to Apiscrapy, your ultimate destination for comprehensive location-based intelligence. As an AI-driven web scraping and automation platform, Apiscrapy excels in converting raw web data into polished, ready-to-use data APIs. With a unique capability to collect Google Address Data, Google Address API, Google Location API, Google Map, and Google Location Data with 100% accuracy, we redefine possibilities in location intelligence.
Key Features:
Unparalleled Data Variety: Apiscrapy offers a diverse range of address-related datasets, including Google Address Data and Google Location Data. Whether you seek B2B address data or detailed insights for various industries, we cover it all.
Integration with Google Address API: Seamlessly integrate our datasets with the powerful Google Address API. This collaboration ensures not just accessibility but a robust combination that amplifies the precision of your location-based insights.
Business Location Precision: Experience a new level of precision in business decision-making with our address data. Apiscrapy delivers accurate and up-to-date business locations, enhancing your strategic planning and expansion efforts.
Tailored B2B Marketing: Customize your B2B marketing strategies with precision using our detailed B2B address data. Target specific geographic areas, refine your approach, and maximize the impact of your marketing efforts.
Use Cases:
Location-Based Services: Companies use Google Address Data to provide location-based services such as navigation, local search, and location-aware advertisements.
Logistics and Transportation: Logistics companies utilize Google Address Data for route optimization, fleet management, and delivery tracking.
E-commerce: Online retailers integrate address autocomplete features powered by Google Address Data to simplify the checkout process and ensure accurate delivery addresses.
Real Estate: Real estate agents and property websites leverage Google Address Data to provide accurate property listings, neighborhood information, and proximity to amenities.
Urban Planning and Development: City planners and developers utilize Google Address Data to analyze population density, traffic patterns, and infrastructure needs for urban planning and development projects.
Market Analysis: Businesses use Google Address Data for market analysis, including identifying target demographics, analyzing competitor locations, and selecting optimal locations for new stores or offices.
Geographic Information Systems (GIS): GIS professionals use Google Address Data as a foundational layer for mapping and spatial analysis in fields such as environmental science, public health, and natural resource management.
Government Services: Government agencies utilize Google Address Data for census enumeration, voter registration, tax assessment, and planning public infrastructure projects.
Tourism and Hospitality: Travel agencies, hotels, and tourism websites incorporate Google Address Data to provide location-based recommendations, itinerary planning, and booking services for travelers.
Discover the difference with Apiscrapy – where accuracy meets diversity in address-related datasets, including Google Address Data, Google Address API, Google Location API, and more. Redefine your approach to location intelligence and make data-driven decisions with confidence. Revolutionize your business strategies today!
The global number of smartphone users was forecast to increase continuously between 2024 and 2029 by a total of 1.8 billion users (+42.62 percent). After the ninth consecutive year of growth, the smartphone user base is estimated to reach 6.1 billion users, and therefore a new peak, in 2029. Notably, the number of smartphone users has increased continuously over the past years. Smartphone users here are limited to internet users of any age using a smartphone. The shown figures have been derived from survey data that has been processed to estimate missing demographics. The shown data are an excerpt of Statista's Key Market Indicators (KMI). The KMI are a collection of primary and secondary indicators on the macro-economic, demographic and technological environment in up to 150 countries and regions worldwide. All indicators are sourced from international and national statistical offices, trade associations and the trade press and they are processed to generate comparable data sets (see supplementary notes under details for more information). Find more key insights for the number of smartphone users in countries and regions like Australia & Oceania and Asia.
Abstract copyright UK Data Service and data collection copyright owner. This is a mixed methods data collection. The main purpose of the research was to understand what users and non-users of public libraries want from the service. Although detailed data exist on the level and frequency of public library use across the population, there was less information on how people regard the quality of service, and on their reasons for using or not using public libraries. If the library service is to retain existing users and encourage new visitors, it is vital to understand the expectations and experiences of the public, irrespective of whether people currently use public libraries or not. The study aimed to provide an up-to-date picture of what the public wants and values in library services, and help leaders to make decisions about the future development of the service. The data were collected in two different phases: Qualitative: This consisted of a series of focus groups across England with a diverse range of people who had differing attitudes to public libraries. These groups examined all aspects of people's relationships with local public libraries and crucially allowed the researchers to explore many issues not covered satisfactorily by existing data. Quantitative: This took the form of a telephone survey with a representative sample of the adult population of England. This phase's principal focus was to corroborate the findings from the qualitative phase and identify further reasons behind usage or not of libraries and the factors that would encourage people to visit libraries more often. Further information is available on the Museums, Libraries and Archives Council's (MLA) Research Evidence Resources website.
Attribution 2.5 (CC BY 2.5)https://creativecommons.org/licenses/by/2.5/
License information was derived automatically
This dataset contains estimates of the resident population and estimates of the components of population change as at 30 June for the years 2001-2019. The data is aggregated to 2016 Australian Statistical Geography Standard (ASGS) Statistical Area Level 2 (SA2). Estimated resident population (ERP) is the official estimate of the Australian population, which links people to a place of usual residence within Australia. Usual residence within Australia refers to that address at which the person has lived or intends to live for six months or more in a given reference year. For the 30 June reference date, this refers to the calendar year around it. Estimated resident population is based on Census counts by place of usual residence (excluding short-term overseas visitors in Australia), with an allowance for Census net undercount, to which are added the estimated number of Australian residents temporarily overseas at the time of the Census. This data is sourced from the Australian Bureau of Statistics (Catalogue Number: 3218.0). For more information please visit the Explanatory Notes.
https://digital.nhs.uk/about-nhs-digital/terms-and-conditions
This is a monthly report on publicly funded community services for children, young people and adults using data from the Community Services Data Set (CSDS) reported in England for November 2017. The CSDS is a patient-level dataset providing information relating to publicly funded community services for children, young people and adults. These services can include district nursing services, school nursing services, health visiting services and occupational therapy services, among others. The data collected includes personal and demographic information, diagnoses including long-term conditions and disabilities, and care events plus screening activities. It has been developed to help achieve better outcomes for children, young people and adults. It provides data that will be used to commission services in a way that improves health, reduces inequalities, and supports service improvement and clinical quality. Prior to October 2017, the predecessor Children and Young People's Health Services (CYPHS) Data Set collected data for children and young people aged 0-18. The CSDS superseded the CYPHS data set to allow adult community data to be submitted, expanding the scope of the existing data set by removing the 0-18 age restriction. The structure and content of the CSDS remain the same as the previous CYPHS data set. Further information about the CYPHS and related statistical reports is available online (https://digital.nhs.uk/data-and-information/data-collections-and-data-sets/data-sets/children-and-young-people-s-health-services-data-set). References to children and young people cover records submitted for 0-18 year olds and references to adults cover records submitted for those aged over 18. Where analysis for both groups has been combined, this is referred to as all patients. These statistics are classified as experimental and should be used with caution. Experimental statistics are new official statistics undergoing evaluation. 
They are published in order to involve users and stakeholders in their development and as a means to build in quality at an early stage. More information about experimental statistics can be found on the UK Statistics Authority website. We hope this information is helpful and would be grateful if you could spare a couple of minutes to complete a short customer satisfaction survey. Please use this form to provide us with any feedback or suggestions for improving the report. Update 6 April 2018: Please note since the removal of the age restriction to include adult data in CSDS, some of our Data Quality measures may not take into account items intended for children only. We are currently reviewing these measures and will look to reflect this in future reports.