https://creativecommons.org/publicdomain/zero/1.0/
The dataset contains information about web requests to a single website. It is a time series dataset: each record is tied to a point in time, which makes it suitable for forecasting and other time-based machine learning analysis.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Explore our detailed website traffic dataset featuring key metrics like page views, session duration, bounce rate, traffic source, and conversion rates.
Daily utilization metrics for data.lacity.org and geohub.lacity.org. Updated monthly.
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
The data set provided (traffic.csv) contains web traffic data ("events") from a few different pages ("links") over 7 days, including various categorical dimensions about the geographic origin of that traffic as well as a page's content: isrc.
Urban SDK is a GIS data management platform and global provider of mobility, urban characteristics, and alt datasets. Urban SDK Traffic data provides traffic volume, average speed, average travel time and congestion for logistics, transportation planning, traffic monitoring, routing and urban planning. Traffic data is generated from cars, trucks and mobile devices for major road networks in the US and Canada.
"With the old data I used, it took me 3-4 weeks to create a presentation. I will be able to do 3-4x the work with your Urban SDK traffic data."
Traffic Volume, Speed and Congestion Data Type Profile:
Industry Solutions include:
Use cases:
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
The dataset represents synthetic traffic data for a certain location over a one-year period. It includes information about the traffic volume, weather conditions, and special events that may affect traffic.
Features:
Timestamp: The date and time of the observation.
Weather: The weather condition at the time of the observation (e.g., Clear, Cloudy, Rain, Snow).
Events: A binary variable indicating whether there was a special event affecting traffic at the time of the observation (True or False).
Traffic Volume: The volume of traffic at the location at the time of the observation.
The dataset is intended for use in analyzing traffic patterns and trends, as well as for developing and testing models related to traffic prediction and management.
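As an illustration of the kind of analysis described above, the following minimal sketch compares mean traffic volume with and without a special event. The column headers (`Timestamp`, `Weather`, `Events`, `Traffic Volume`) are assumed to match the feature names listed above, the function name `mean_volume_by_event` is hypothetical, and the inline sample rows are made up for the example; the real file may differ.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample rows; headers are assumed to match the feature list.
SAMPLE_CSV = """Timestamp,Weather,Events,Traffic Volume
2023-01-01 08:00:00,Clear,False,120
2023-01-01 09:00:00,Rain,True,310
2023-01-01 10:00:00,Cloudy,False,180
2023-01-01 11:00:00,Snow,True,290
"""


def mean_volume_by_event(csv_text):
    """Return {event_flag: mean traffic volume} for each value of Events."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Events is described as a True/False flag; parse it from text.
        has_event = row["Events"].strip().lower() == "true"
        totals[has_event] += float(row["Traffic Volume"])
        counts[has_event] += 1
    return {flag: totals[flag] / counts[flag] for flag in totals}


if __name__ == "__main__":
    print(mean_volume_by_event(SAMPLE_CSV))
```

The same grouping could be applied to the Weather column to compare volumes across weather conditions.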
Traffic Analysis Zones (TAZ) for the COG/TPB Modeled Region from Metropolitan Washington Council of Governments. The TAZ dataset is used to join several types of zone-based transportation modeling data. For more information, visit https://plandc.dc.gov/page/traffic-analysis-zone.
Unlock the Potential of Your Web Traffic with Advanced Data Resolution
In the digital age, understanding and leveraging web traffic data is crucial for businesses aiming to thrive online. Our pioneering solution transforms anonymous website visits into valuable B2B and B2C contact data, offering unprecedented insights into your digital audience. By integrating our unique tag into your website, you unlock the capability to convert 25-50% of your anonymous traffic into actionable contact rows, directly deposited into an S3 bucket for your convenience. This process, known as "Web Traffic Data Resolution," is at the forefront of digital marketing and sales strategies, providing a competitive edge in understanding and engaging with your online visitors.
Comprehensive Web Traffic Data Resolution
Our product stands out by offering a robust solution for "Web Traffic Data Resolution," a process that demystifies the identities behind your website traffic. By deploying a simple tag on your site, our technology goes to work, analyzing visitor behavior and leveraging proprietary data matching techniques to reveal the individuals and businesses behind the clicks. This innovative approach not only enhances your data collection but does so with respect for privacy and compliance standards, ensuring that your business gains insights ethically and responsibly.
Deep Dive into Web Traffic Data
At the core of our solution is the sophisticated analysis of "Web Traffic Data." Our system meticulously collects and processes every interaction on your site, from page views to time spent on each section. This data, once anonymous and perhaps seen as abstract numbers, is transformed into a detailed ledger of potential leads and customer insights. By understanding who visits your site, their interests, and their contact information, your business is equipped to tailor marketing efforts, personalize customer experiences, and streamline sales processes like never before.
Benefits of Our Web Traffic Data Resolution Service
Enhanced Lead Generation: By converting anonymous visitors into identifiable contact data, our service significantly expands your pool of potential leads. This direct enhancement of your lead generation efforts can dramatically increase conversion rates and ROI on marketing campaigns.
Targeted Marketing Campaigns: Armed with detailed B2B and B2C contact data, your marketing team can create highly targeted and personalized campaigns. This precision in marketing not only improves engagement rates but also ensures that your messaging resonates with the intended audience.
Improved Customer Insights: Gaining a deeper understanding of your web traffic enables your business to refine customer personas and tailor offerings to meet market demands. These insights are invaluable for product development, customer service improvement, and strategic planning.
Competitive Advantage: In a digital landscape where understanding your audience can make or break your business, our Web Traffic Data Resolution service provides a significant competitive edge. By accessing detailed contact data that others in your industry may overlook, you position your business as a leader in customer engagement and data-driven strategies.
Seamless Integration and Accessibility: Our solution is designed for ease of use, requiring only the placement of a tag on your website to start gathering data. The contact rows generated are easily accessible in an S3 bucket, ensuring that you can integrate this data with your existing CRM systems and marketing tools without hassle.
How It Works: A Closer Look at the Process
Our Web Traffic Data Resolution process is streamlined and user-friendly, designed to integrate seamlessly with your existing website infrastructure:
Tag Deployment: Implement our unique tag on your website with simple instructions. This tag is lightweight and does not impact your site's loading speed or user experience.
Data Collection and Analysis: As visitors navigate your site, our system collects web traffic data in real-time, analyzing behavior patterns, engagement metrics, and more.
Resolution and Transformation: Using advanced data matching algorithms, we resolve the collected web traffic data into identifiable B2B and B2C contact information.
Data Delivery: The resolved contact data is then securely transferred to an S3 bucket, where it is organized and ready for your access. This process occurs daily, ensuring you have the most up-to-date information at your fingertips.
Integration and Action: With the resolved data now in your possession, your business can take immediate action. From refining marketing strategies to enhancing customer experiences, the possibilities are endless.
Security and Privacy: Our Commitment
Understanding the sensitivity of web traffic data and contact information, our solution is built with security and privacy at its core. We adhere to strict data protection regulat...
Unlock insights with Echo's Activity data, offering views of locations based on visitor behavior. Enhance site selection, urban planning, and real estate with metrics like unique visitors and visits. Our high-quality, global data reveals movement patterns, updated daily and normalized monthly.
This dataset is a structured collection of traffic data extracted from video footage, designed to support machine learning and data analysis projects. It includes attributes such as vehicle counts, average speed, time taken to cross frames, and vehicle types. The dataset is well-suited for traffic prediction, clustering, and classification tasks.
Key Features:
- Frame-wise traffic data, including counts of cars, trucks, bikes, and buses.
- Calculated features such as average speed, crossing time, and total vehicles.
- Supports tasks like PCA, regression, clustering, and classification.
- Extracted using YOLOv8 for object detection and tracking.
Applications:
- Predict traffic density for smart traffic management systems.
- Analyze traffic patterns and vehicle distributions.
- Implement clustering and PCA to identify meaningful patterns in traffic data.
- Train machine learning models for real-time traffic monitoring.
This dataset provides a foundational resource for researchers and developers working on traffic-related machine learning and computer vision projects.
WIGeoGIS offers you access to high-quality traffic data from TomTom. Available for road segments in over 80 countries. Historical traffic data, origin-destination analysis, real-time route monitoring, and junction analysis.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Code:
Packet_Features_Generator.py & Features.py
To run this code:
pkt_features.py [-h] -i TXTFILE [-x X] [-y Y] [-z Z] [-ml] [-s S] -j
-h, --help    show this help message and exit
-i TXTFILE    input text file
-x X          Add first X number of total packets as features.
-y Y          Add first Y number of negative packets as features.
-z Z          Add first Z number of positive packets as features.
-ml           Output to text file all websites in the format of websiteNumber1,feature1,feature2,...
-s S          Generate samples using size s.
-j
Purpose:
Turns a text file containing lists of incoming and outgoing network packet sizes into separate website objects with associated features.
Uses Features.py to calculate the features.
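The `-x`, `-y`, and `-z` options above can be pictured as follows. This sketch is an assumption about what those flags select, not the code in Features.py: the function name `first_packet_features` is hypothetical, and padding short captures with zeros is an illustrative choice that the source does not specify.

```python
def first_packet_features(packets, x=0, y=0, z=0):
    """Build a feature list from signed packet sizes.

    x: first x packets overall; y: first y negative packets;
    z: first z positive packets. Short captures are zero-padded
    (an assumption made for this sketch).
    """
    def take(values, n):
        head = values[:n]
        return head + [0] * (n - len(head))

    negative = [p for p in packets if p < 0]
    positive = [p for p in packets if p > 0]
    return take(packets, x) + take(negative, y) + take(positive, z)


if __name__ == "__main__":
    print(first_packet_features([-1500, 60, -540, 120], x=2, y=3, z=1))
```

Zero-padding keeps every feature vector the same length, which most classifiers require; a real implementation might handle short captures differently.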
startMachineLearning.sh & machineLearning.py
To run this code:
bash startMachineLearning.sh
This code then runs machineLearning.py in a tmux session with the necessary file paths and flags.
Options (to be edited within this file):
--evaluate-only to test 5 fold cross validation accuracy
--test-scaling-normalization to test 6 different combinations of scalers and normalizers
Note: once the best combination is determined, it should be added to the data_preprocessing function in machineLearning.py for future use
--grid-search to test the best grid search hyperparameters - note: the possible hyperparameters must be added to train_model under 'if not evaluateOnly:' - once best hyperparameters are determined, add them to train_model under 'if evaluateOnly:'
Purpose:
Using the .ml file generated by Packet_Features_Generator.py & Features.py, this program trains a RandomForest Classifier on the provided data and provides results using cross validation. These results include the best scaling and normalization options for each data set as well as the best grid search hyperparameters based on the provided ranges.
Data
Encrypted network traffic was collected on an isolated computer visiting different Wikipedia and New York Times articles, different Google search queries (collected in the form of their autocomplete results and their results page), and different actions taken on a Virtual Reality headset.
Data for this experiment was stored and analyzed in the form of a txt file for each experiment which contains:
The first number in each line is a classification number denoting which website, query, or VR action is taking place.
The remaining numbers in each line denote:
The size of a packet,
and the direction it is traveling.
negative numbers denote incoming packets
positive numbers denote outgoing packets
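The line format described above can be read with a few lines of Python. This is a minimal sketch, not the authors' parser: it assumes the numbers on a line are separated by whitespace or commas (the source does not state the delimiter), and the function name `parse_capture_line` is hypothetical.

```python
import re


def parse_capture_line(line):
    """Split one line into (label, incoming_sizes, outgoing_sizes).

    Assumption: values are separated by whitespace and/or commas.
    """
    values = [int(tok) for tok in re.split(r"[,\s]+", line.strip()) if tok]
    label, packets = values[0], values[1:]
    # Negative numbers denote incoming packets, positive numbers outgoing.
    incoming = [abs(p) for p in packets if p < 0]
    outgoing = [p for p in packets if p > 0]
    return label, incoming, outgoing


if __name__ == "__main__":
    print(parse_capture_line("3 -1500 60 -1500 -540 120"))
```

The sign is dropped for incoming packets here so that both lists hold plain sizes; keep the signed values instead if packet order across directions matters.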
Figure 4 Data
This data uses specific lines from the Virtual Reality.txt file.
The action 'LongText Search' refers to a user searching for "Saint Basils Cathedral" with text in the Wander app.
The action 'ShortText Search' refers to a user searching for "Mexico" with text in the Wander app.
The .xlsx and .csv files are identical.
Each file includes (from right to left):
The original packet data,
each line of data organized from smallest to largest packet size in order to calculate the mean and standard deviation of each packet capture,
and the final Cumulative Distribution Function (CDF) calculation that generated the Figure 4 Graph.
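The calculation described above (sort the packet sizes, then take the mean, standard deviation, and CDF) can be sketched as below. This is an illustration using an empirical CDF and the sample standard deviation; the source does not say which standard deviation variant the spreadsheet uses, the function name `summarize_capture` is hypothetical, and the input values are made up.

```python
import statistics


def summarize_capture(sizes):
    """Return (sorted sizes, mean, sample std dev, empirical CDF points)."""
    ordered = sorted(sizes)
    n = len(ordered)
    mean = statistics.mean(ordered)
    stdev = statistics.stdev(ordered)  # sample std dev; an assumption
    # Empirical CDF: cumulative fraction of packets at each sorted position.
    cdf = [(value, (i + 1) / n) for i, value in enumerate(ordered)]
    return ordered, mean, stdev, cdf


if __name__ == "__main__":
    ordered, mean, stdev, cdf = summarize_capture([120, 60, 1500, 540])
    print(ordered, mean, round(stdev, 2))
    print(cdf)
```

Plotting the first element of each CDF pair against the second gives a step curve of the kind a CDF figure typically shows.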
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
This dataset originates from DataCamp. Many users have reposted copies of the CSV on Kaggle, but most of those uploads omit the original instructions, business context, and problem framing. In this upload, I’ve included that missing context in the About Dataset so the reader of my notebook or any other notebook can fully understand how the data was intended to be used and the intended problem framing.
Note: I have also uploaded a visualization of the workflow I personally took to tackle this problem, but it is not part of the dataset itself.
Additionally, I created a PowerPoint presentation based on my work in the notebook, which you can download from here:
PPTX Presentation
From: Head of Data Science
Received: Today
Subject: New project from the product team
Hey!
I have a new project for you from the product team. Should be an interesting challenge. You can see the background and request in the email below.
I would like you to perform the analysis and write a short report for me. I want to be able to review your code as well as read your thought process for each step. I also want you to prepare and deliver the presentation for the product team - you are ready for the challenge!
They want us to predict which recipes will be popular 80% of the time and minimize the chance of showing unpopular recipes. I don't think that is realistic in the time we have, but do your best and present whatever you find.
You can find more details about what I expect you to do here. And information on the data here.
I will be on vacation for the next couple of weeks, but I know you can do this without my support. If you need to make any decisions, include them in your work and I will review them when I am back.
Good Luck!
From: Product Manager - Recipe Discovery
To: Head of Data Science
Received: Yesterday
Subject: Can you help us predict popular recipes?
Hi,
We haven't met before but I am responsible for choosing which recipes to display on the homepage each day. I have heard about what the data science team is capable of and I was wondering if you can help me choose which recipes we should display on the home page?
At the moment, I choose my favorite recipe from a selection and display that on the home page. We have noticed that traffic to the rest of the website goes up by as much as 40% if I pick a popular recipe. But I don't know how to decide if a recipe will be popular. More traffic means more subscriptions so this is really important to the company.
Can your team:
- Predict which recipes will lead to high traffic?
- Correctly predict high traffic recipes 80% of the time?
We need to make a decision on this soon, so I need you to present your results to me by the end of the month. Whatever your results, what do you recommend we do next?
Look forward to seeing your presentation.
Tasty Bytes was founded in 2020 in the midst of the Covid Pandemic. The world wanted inspiration so we decided to provide it. We started life as a search engine for recipes, helping people to find ways to use up the limited supplies they had at home.
Now, over two years on, we are a fully fledged business. For a monthly subscription we will put together a full meal plan to ensure you and your family are getting a healthy, balanced diet whatever your budget. Subscribe to our premium plan and we will also deliver the ingredients to your door.
This is an example of how a recipe may appear on the website. We haven't included all of the steps, but you should get an idea of what visitors to the site see.
Tomato Soup
Servings: 4
Time to make: 2 hours
Category: Lunch/Snack
Cost per serving: $
Nutritional Information (per serving):
- Calories 123
- Carbohydrate 13g
- Sugar 1g
- Protein 4g
Ingredients:
- Tomatoes
- Onion
- Carrot
- Vegetable Stock
Method: 1. Cut the tomatoes into quarters….
The product manager has tried to make this easier for us and provided data for each recipe, as well as whether there was high traffic when the recipe was featured on the home page.
As you will see, they haven't given us all of the information they have about each recipe.
You can find the data here.
I will let you decide how to process it, just make sure you include all your decisions in your report.
Don't forget to double check the data really does match what they say - it might not.
| Column Name | Details |
|---|---|
| recipe | Numeric, unique identifier of recipe |
| calories | Numeric, number of calories |
| carbohydrate | Numeric, amount of carbohydrates in grams |
| sugar | Numeric, amount of sugar in grams |
| protein | Numeric, amount of prote... |
At Echo, our dedication to data curation is unmatched; we focus on providing our clients with an in-depth picture of a physical location based on activity in and around a point of interest over time. Our dataset empowers you to explore the “what” by allowing you to dig deeper into customer movement behaviors, eliminate gaps in your trade area and discover untapped potential. Leverage Echo's Activity datasets to identify new growth opportunities and gain a competitive advantage.
This sample of our Area Activity data provides you insights into the estimated total unique visitors and visits in an area. This helps you understand frequentation dynamics over time, identify emerging trends in people movements and measure the impact of external factors on how people move across a city.
Additional Information:
- Understand the actual movement patterns of consumers without using PII data, gaining a 360-degree consumer view. Complement your online behavior knowledge with actual offline actions, and better attribute intent based on real-world behaviors.
- Echo collects, cleans and updates its footfall on a daily basis. Normalization of the data occurs on a monthly basis.
- We provide data aggregation on a weekly, monthly and quarterly basis.
- Information about our country offering and data schema can be found here:
1) Data Schema: https://docs.echo-analytics.com/activity/data-schema
2) Country Availability: https://docs.echo-analytics.com/activity/country-coverage
3) Methodology: https://docs.echo-analytics.com/activity/methodology
Echo's commitment to customer service is evident in our exceptional data quality and dedicated team, providing 360° support throughout your location intelligence journey. We handle the complex tasks to deliver analysis-ready datasets to you.
Business Needs:
1. Site Selection: Leverage footfall data to identify the best location to open a new store. By analyzing areas with high footfall you can select sites that are likely to attract more customers.
2. Urban Planning Development: City planners can use footfall data to optimize the layout and infrastructure of urban areas, guide the development of commercial areas by indicating where pedestrian traffic is heaviest, and aid in traffic management and safety measures.
3. Real Estate Investment: Leverage footfall data to identify lucrative investment opportunities and optimize property management by analyzing pedestrian traffic patterns.
The census count of vehicles on city streets is normally reported in the form of Average Daily Traffic (ADT) counts. These counts provide a good estimate for the actual number of vehicles on an average weekday at select street segments. Specific block segments are selected for a count because they are deemed representative of a larger segment on the same roadway. ADT counts are used by transportation engineers, economists, real estate agents, planners, and other professionals for planning and operational analysis. The frequency for each count varies depending on City staff’s needs for analysis in any given area. This report covers the counts taken in our City during approximately the past 12 years.
Leverage the most reliable and compliant mobile device location/foot traffic dataset on the market. Veraset Movement (Mobile Device GPS / Foot Traffic Data) offers unparalleled insights into footfall traffic patterns across North America.
Covering the United States, Canada and Mexico, Veraset's Mobile Location Data draws on raw GPS data from tier-1 apps, SDKs, and aggregators of mobile devices to provide customers with accurate, up-to-the-minute information on human movement. Ideal for ad tech, planning, retail analysis, and transportation logistics, Veraset's Movement data helps in shaping strategy and making data-driven decisions.
Veraset’s North American Movement Panel:
- United States: 768M Devices, 70B+ Pings
- Canada: 55M+ Devices, 9B+ Pings
- Mexico: 125M+ Devices, 14B+ Pings
- MAU/Devices and Monthly Pings
Uses for Veraset's Mobile Location Data:
- Advertising
- Ad Placement, Attribution, and Segmentation
- Audience Creation/Building
- Dynamic Ad Targeting
- Infrastructure Plans
- Route Optimization
- Public Transit Optimization
- Credit Card Loyalty
- Competitive Analysis
- Risk assessment, Underwriting, and Policy Personalization
- Enrichment of Existing Datasets
- Trade Area Analysis
- Predictive Analytics and Trend Forecasting
A dataset explaining organic traffic, its importance for SEO, and methods to track it in Google Analytics 4.
Web traffic statistics for several City-Parish websites: brla.gov, city.brla.gov, Red Stick Ready, GIS, Open Data, etc. Information provided by Google Analytics.
ODC Public Domain Dedication and Licence (PDDL) v1.0: http://www.opendatacommons.org/licenses/pddl/1.0/
License information was derived automatically
Traffic-related data collected by the Boston Transportation Department, as well as other City departments and State agencies. Various types of counts: Turning Movement Counts, Automated Traffic Recordings, Pedestrian Counts, Delay Studies, and Gap Studies.
Turning Movement Counts (TMC) present the number of motor vehicles, pedestrians, and cyclists passing through the particular intersection. Specific movements and crossings are recorded for all street approaches involved with the intersection. This data is used in traffic signal retiming programs and for signal requests. Counts are typically conducted for 2-, 4-, 11-, and 12-Hr periods.
Automated Traffic Recordings (ATR) record the volume of motor vehicles traveling along a particular road, measures of travel speeds, and approximations of the class of the vehicles (motorcycle, 2-axle, large box truck, bus, etc). This type of count is conducted only along a street link/corridor, to gather data between two intersections or points of interest. This data is used in travel studies, as well as to review concerns about street use, speeding, and capacity. Counts are typically conducted for 12- & 24-Hr periods.
Pedestrian Counts (PED) record the volume of individual persons crossing a given street, whether at an existing intersection or a mid-block crossing. This data is used to review concerns about crossing safety, as well as for access analysis for points of interest. Counts are typically conducted for 2-, 4-, 11-, and 12-Hr periods.
Delay Studies (DEL) measure the delay experienced by motor vehicles due to the effects of congestion. Counts are typically conducted for a 1-Hr period at a given intersection or point of intersecting vehicular traffic.
Gap Studies (GAP) record the number of gaps which are typically present between groups of vehicles traveling through an intersection or past a point on a street. This data is used to assess opportunities for pedestrians to cross the street and for analyses on vehicular “platooning”. Counts are typically conducted for a specific 1-Hr period at a single point of crossing.