32 datasets found

Most popular websites in the Netherlands 2015
ssh.datastations.nl
csv, tsv, zip
Updated May 9, 2017
Cite
M. Kleppe; H. Bijleveld (2017). Most popular websites in the Netherlands 2015 [Dataset]. http://doi.org/10.17026/DANS-X6H-6QQT
Available download formats: zip (15855), csv (138294), tsv (176359)
Unique identifier
https://doi.org/10.17026/DANS-X6H-6QQT
Dataset updated
May 9, 2017
Dataset provided by
DANS Data Station Social Sciences and Humanities
Authors
M. Kleppe; H. Bijleveld
License
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
License information was derived automatically
Area covered
Netherlands
Dataset funded by
NWO
Description
This dataset contains a list of 3654 Dutch websites that we considered the most popular websites in 2015. This list served as the whitelist for the Newstracker research project, in which we monitored the online web behaviour of a group of respondents.

The research project 'The Newstracker' was a subproject of the NWO-funded project 'The New News Consumer: A User-Based Innovation Project to Meet Paradigmatic Change in News Use and Media Habits'.

For the Newstracker project we aimed to understand the web behaviour of a group of respondents. We created custom-built software to monitor their web browsing behaviour on their laptops and desktops (please find the code in open access at https://github.com/NITechLabs/NewsTracker). For reasons of scale and privacy we created a whitelist of the websites that were the most popular in 2015. We manually compiled this list using data from DDMM, Alexa and our own research.

The dataset consists of 5 columns:
- the URL
- the type of website: We created a list of types of websites, and each website has been manually labeled with 1 category
- Nieuws-regio: When the category was 'News', we subdivided these websites by regional focus: International, National or Local
- Nieuws-onderwerp: Furthermore, each website under the category News was further subdivided by type of news website. For this we created our own list of news categories and manually coded each website
- Bron: For each website we noted which source we used to find this website.

The full description of the research design of the Newstracker, including the set-up of this whitelist, is included in the following article: Kleppe, M., Otte, M. (in print), 'Analysing & understanding news consumption patterns by tracking online user behaviour with a multimodal research design', Digital Scholarship in the Humanities, doi 10.1093/llc/fqx030.
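The five-column layout described above lends itself to a quick tally of news websites per regional focus. The following is only a minimal sketch: the header names "URL" and "Type" are assumptions (the description does not give them), the sample rows are invented for illustration, and the real csv file may use a different delimiter or different category labels.

```python
import csv
import io
from collections import Counter

# Hypothetical sample in the five-column layout described above. The header
# names "URL" and "Type" and every row are assumptions for illustration only.
SAMPLE = """URL,Type,Nieuws-regio,Nieuws-onderwerp,Bron
example-news.nl,News,National,General,Alexa
example-local.nl,News,Local,General,DDMM
example-shop.nl,Shopping,,,Alexa
"""


def count_news_by_region(csv_text, type_col="Type", region_col="Nieuws-regio"):
    """Count websites labeled 'News' per regional focus."""
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = Counter()
    for row in reader:
        # Only rows in the News category carry a regional focus.
        if row[type_col] == "News":
            counts[row[region_col]] += 1
    return dict(counts)


print(count_news_by_region(SAMPLE))
```

For the published file, the same function could be fed the file's text once the actual header names have been checked and passed in as `type_col` and `region_col`.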
1k_Website_Screenshots_and_Metadata
huggingface.co
Updated Apr 13, 2023
Cite
Silatus (2023). 1k_Website_Screenshots_and_Metadata [Dataset]. https://huggingface.co/datasets/silatus/1k_Website_Screenshots_and_Metadata
Dataset updated
Apr 13, 2023
Dataset authored and provided by
Silatus
License
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) (https://creativecommons.org/licenses/by-nc-sa/4.0/)
License information was derived automatically
Description
Dataset Card for 1000 Website Screenshots with Metadata

Dataset Summary

Silatus is sharing, for free, a segment of a dataset that we are using to train a generative AI model for text-to-mockup conversions. This dataset was collected in December 2022 and early January 2023, so it contains very recent data from 1,000 of the world's most popular websites. You can get our larger 10,000 website dataset for free at: https://silatus.com/datasets

This dataset includes: High-res…

See the full description on the dataset page: https://huggingface.co/datasets/silatus/1k_Website_Screenshots_and_Metadata.
Dataset Search WebApp
figshare.com
zip
Updated May 31, 2023
Cite
Angelo Batista Neves Júnior; Luiz André Portes Paes Leme (2023). Dataset Search WebApp [Dataset]. http://doi.org/10.6084/m9.figshare.5217958.v2
Available download formats: zip
Unique identifier
https://doi.org/10.6084/m9.figshare.5217958.v2
Dataset updated
May 31, 2023
Dataset provided by
figshare
Authors
Angelo Batista Neves Júnior; Luiz André Portes Paes Leme
License
https://www.gnu.org/copyleft/gpl.html
Description
Although extensive lists of open datasets are available in catalogues, most data publishers still connect their datasets to other popular datasets, such as DBpedia, Freebase and Geonames. Although linking to popular datasets would allow us to explore external resources, it would fail to cover highly specialized information. Catalogues of linked data describe the content of datasets in terms of update periodicity, authors, SPARQL endpoints, and linksets with other datasets, amongst others, as recommended by the W3C VoID Vocabulary. However, catalogues by themselves do not provide any explicit information to help the URI linkage process.

Searching techniques can rank the available datasets S_i according to the probability that it will be possible to define links between URIs of S_i and a given dataset T to be published, so that most of the links, if not all, could be found by inspecting the most relevant datasets in the ranking. dataset-search is a tool for searching datasets for linkage.
Data from: E2EGit: A Dataset of End-to-End Web Tests in Open Source Projects...
zenodo.org
bin, txt
Updated May 20, 2025
Cite
Sergio Di Meglio; Valeria Pontillo; Coen De Roover; Luigi Libero Lucio Starace; Sergio Di Martino; Ruben Opdebeeck (2025). E2EGit: A Dataset of End-to-End Web Tests in Open Source Projects [Dataset]. http://doi.org/10.5281/zenodo.14221860
Available download formats: txt, bin
Unique identifier
https://doi.org/10.5281/zenodo.14221860
Dataset updated
May 20, 2025
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Sergio Di Meglio; Valeria Pontillo; Coen De Roover; Luigi Libero Lucio Starace; Sergio Di Martino; Ruben Opdebeeck
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
ABSTRACT
End-to-End (E2E) testing is a comprehensive approach to validating the functionality of a software application by testing its entire workflow from the user’s perspective, ensuring that all integrated components work together as expected. It is crucial for ensuring the quality and reliability of applications, especially in the web domain, which is often bound by Service Level Agreements (SLAs). This testing involves two key activities: Graphical User Interface (GUI) testing, which simulates user interactions through browsers, and performance testing, which evaluates system workload handling. Despite its importance, E2E testing is often neglected, and the lack of reliable datasets for Web GUI and performance testing has slowed research progress. This paper addresses these limitations by constructing E2EGit, a comprehensive dataset, cataloging non-trivial open-source web projects on GITHUB that adopt GUI or performance testing.
The dataset construction process involved analyzing over 5k non-trivial web repositories based on popular programming languages (JAVA, JAVASCRIPT, TYPESCRIPT, PYTHON) to identify: 1) GUI tests based on popular browser automation frameworks (SELENIUM, PLAYWRIGHT, CYPRESS, PUPPETEER), 2) performance tests written with the most popular open-source tools (JMETER, LOCUST). After analysis, we identified 472 repositories using web GUI testing, with over 43,000 tests, and 84 repositories using performance testing, with 410 tests.

DATASET DESCRIPTION
The dataset is provided as an SQLite database consisting of five tables, each serving a specific purpose; its structure is illustrated in Figure 3 of the paper.
The repository table contains information on 1.5 million repositories collected using the SEART tool on May 4. It includes 34 fields detailing repository characteristics. The non_trivial_repository table is a subset of the previous one, listing repositories that passed the two filtering stages described in the pipeline. For each repository, it specifies whether it is a web repository using JAVA, JAVASCRIPT, TYPESCRIPT, or PYTHON frameworks. A repository may use multiple frameworks, with corresponding fields (e.g., is web java) set to true, and the field web dependencies listing the detected web frameworks.
For Web GUI testing, the dataset includes two additional tables: gui_testing_test_details, where each row represents a test file, providing the file path, the browser automation framework used, the test engine employed, and the number of tests implemented in the file; and gui_testing_repo_details, which aggregates data from the previous table at the repository level. Each of the 472 repositories has a row summarizing the number of test files using frameworks like SELENIUM or PLAYWRIGHT, test engines like JUNIT, and the total number of tests identified.
For performance testing, the performance_testing_test_details table contains 410 rows, one for each test identified. Each row includes the file path, whether the test uses JMETER or LOCUST, and extracted details such as the number of thread groups, concurrent users, and requests. Notably, some fields may be absent, for instance if external files (e.g., CSVs defining workloads) were unavailable, or in the case of Locust tests, where parameters like duration and concurrent users are specified via the command line.
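As a first sanity check on the five tables described above, one could count the rows in each and compare them with the figures quoted (for example, 410 rows in performance_testing_test_details). The sketch below uses only Python's standard sqlite3 module and the table names given in the description; the database file name is not stated here, so the demo runs against an in-memory stand-in with invented rows.

```python
import sqlite3

# Table names as listed in the dataset description above.
E2EGIT_TABLES = [
    "repository",
    "non_trivial_repository",
    "gui_testing_test_details",
    "gui_testing_repo_details",
    "performance_testing_test_details",
]


def table_row_counts(conn, tables=E2EGIT_TABLES):
    """Return {table: row count} for each expected table that exists."""
    existing = {
        name
        for (name,) in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table'"
        )
    }
    counts = {}
    for table in tables:
        if table in existing:
            # Table names come from the fixed list above, not from user input.
            (count,) = conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()
            counts[table] = count
    return counts


# Demo on an in-memory stand-in with two invented rows; for the real dataset,
# pass the path of the downloaded database file to sqlite3.connect instead.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE performance_testing_test_details (id INTEGER)")
conn.executemany(
    "INSERT INTO performance_testing_test_details VALUES (?)", [(1,), (2,)]
)
print(table_row_counts(conn))
```

Tables that are missing from the file are simply skipped, so the result also shows which of the five expected tables are actually present.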

To cite this article refer to this citation:

@inproceedings{di2025e2egit,
title={E2EGit: A Dataset of End-to-End Web Tests in Open Source Projects},
author={Di Meglio, Sergio and Starace, Luigi Libero Lucio and Pontillo, Valeria and Opdebeeck, Ruben and De Roover, Coen and Di Martino, Sergio},
booktitle={2025 IEEE/ACM 22nd International Conference on Mining Software Repositories (MSR)},
pages={10--15},
year={2025},
organization={IEEE/ACM}
}

This work has been partially supported by the Italian PNRR MUR project PE0000013-FAIR.
‘Imdb Most popular Films and series’ analyzed by Analyst-2
analyst-2.ai
Updated Jan 28, 2022
Cite
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2022). ‘Imdb Most popular Films and series’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-imdb-most-popular-films-and-series-704e/717040ed/?iid=008-281&v=presentation
Dataset updated
Jan 28, 2022
Dataset authored and provided by
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
Analysis of ‘Imdb Most popular Films and series’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/mazenramadan/imdb-most-popular-films-and-series on 28 January 2022.

--- Dataset description provided by original source is as follows ---

Context

IMDB is a popular website for rating films and series. I always go there if I want to watch something new, and many users trust its rankings. The data covers the 7k most popular films and series on IMDB with their rates. It is ideal for exploratory data analysis, and you can also use regression to predict the rate.

Content

These are our columns:
- Name: Name of the film/series
- Date: Creation date
- Rate: IMDB's rate
- Votes: Number of voters
- Genre: Genres (Action, Drama, Romance, etc.)
- Duration: Duration of the episode or film
- Type: Whether it's a film or a series
- Certificate:
  - TV-Y: Designed to be appropriate for all children
  - TV-Y7: Suitable for ages 7 and up
  - G: Suitable for General Audiences
  - TV-G: Suitable for General Audiences
  - PG: Parental Guidance suggested
  - TV-PG: Parental Guidance suggested
  - PG-13: Parents strongly cautioned. May be inappropriate for ages 12 and under.
  - TV-14: Parents strongly cautioned. May not be suitable for ages 14 and under.
  - R: Restricted. May be inappropriate for ages 17 and under.
  - TV-MA: For Mature Audiences. May not be suitable for ages 17 and under.
  - NC-17: Inappropriate for ages 17 and under
- Episodes: Number of episodes (series only)
- Nudity, violence, etc.: How much of these it contains

Acknowledgements

I got this data by web scraping; the code used to collect it is linked from the source dataset page.

Inspiration

The data is ideal for EDA. You can see how various features affect the rate, for example:

How does the date affect the rate?

Does any genre appear more often on specific dates?

How does genre affect the rate?

Do series get higher rates?

Do nudity, fighting, and violence affect the rate?

You can also predict the rate with regression.

--- Original source retains full ownership of the source dataset ---
TED dataset
zenodo.org
data.niaid.nih.gov
application/gzip, txt
Updated Oct 6, 2020
Cite
Nikolaos Pappas; Andrei Popescu-Belis (2020). TED dataset [Dataset]. http://doi.org/10.34777/wqv1-jd60
Available download formats: application/gzip, txt
Unique identifier
https://doi.org/10.34777/wqv1-jd60
Dataset updated
Oct 6, 2020
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Nikolaos Pappas; Andrei Popescu-Belis
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
A dataset for recommendations collected from ted.com which contains metadata fields for TED talks and user profiles with rating and commenting transactions.

The TED dataset contains all the audio-video recordings of the TED talks downloaded from the official TED website, http://www.ted.com, on April 27th 2012 (first version) and on September 10th 2012 (second version). No processing has been done on any of the metadata fields. The metadata was obtained by crawling the HTML source of the list of talks and users, as well as talk and user webpages using scripts written by Nikolaos Pappas at the Idiap Research Institute, Martigny, Switzerland. The dataset is shared under the Creative Commons license (the same as the content of the TED talks) which is stored in the COPYRIGHT file. The dataset is shared for research purposes which are explained in detail in the following papers. The dataset can be used to benchmark systems that perform two tasks, namely personalized recommendations and generic recommendations. Please check the CBMI 2013 paper for a detailed description of each task.

Nikolaos Pappas, Andrei Popescu-Belis, "Combining Content with User Preferences for TED Lecture Recommendation", 11th International Workshop on Content Based Multimedia Indexing, Veszprém, Hungary, IEEE, 2013

Nikolaos Pappas, Andrei Popescu-Belis, Sentiment Analysis of User Comments for One-Class Collaborative Filtering over TED Talks, 36th ACM SIGIR Conference on Research and Development in Information Retrieval, Dublin, Ireland, ACM, 2013

If you use the TED dataset for your research please cite one of the above papers (specifically the 1st paper for the April 2012 version and the 2nd paper for the September 2012 version of the dataset).

TED website

The TED website is a popular online repository of audiovisual recordings of public lectures given by prominent speakers, under a Creative Commons non-commercial license (see www.ted.com). The site provides extended metadata and user-contributed material. The speakers are scientists, writers, journalists, artists, and businesspeople from all over the world who are generally given a maximum of 18 minutes to present their ideas. The talks are given in English and are usually transcribed and then translated into several other languages by volunteer users. The quality of the talks has made TED one of the most popular online lecture repositories, as each talk was viewed on average almost 500,000 times.

Metadata

The dataset contains two main entry types: talks and users. The talks have the following data fields: identifier, title, description, speaker name, TED event at which they were given, transcript, publication date, filming date, number of views. Each talk has a variable number of user comments, organized in threads. In addition, three fields were assigned by TED editorial staff: related tags, related themes, and related talks. Each talk generally has three related talks and 95% of them have a high-quality transcript available. The dataset includes 1,149 talks from 960 speakers and 69,023 registered users that have made about 100,000 favorites and 200,000 comments.
Click Global Data | Web Traffic Data + Transaction Data | Consumer and B2B...
datarade.ai
.csv
Updated Mar 13, 2025
Cite
Consumer Edge (2025). Click Global Data | Web Traffic Data + Transaction Data | Consumer and B2B Shopper Insights | 59 Countries, 3-Day Lag, Daily Delivery [Dataset]. https://datarade.ai/data-products/click-global-data-web-traffic-data-transaction-data-con-consumer-edge
Available download formats: .csv
Dataset updated
Mar 13, 2025
Dataset authored and provided by
Consumer Edge
Area covered
Bermuda, Marshall Islands, Congo, Bosnia and Herzegovina, Finland, South Africa, Sri Lanka, Nauru, El Salvador, Montserrat
Description
Click Web Traffic Combined with Transaction Data: A New Dimension of Shopper Insights

Consumer Edge is a leader in alternative consumer data for public and private investors and corporate clients. Click enhances the unparalleled accuracy of CE Transact by allowing investors to delve deeper and browse further into global online web traffic for CE Transact companies and more. Leverage the unique fusion of web traffic and transaction datasets to understand the addressable market and understand spending behavior on consumer and B2B websites. See the impact of changes in marketing spend, search engine algorithms, and social media awareness on visits to a merchant’s website, and discover the extent to which product mix and pricing drive or hinder visits and dwell time. Plus, Click uncovers a more global view of traffic trends in geographies not covered by Transact. Doubleclick into better forecasting, with Click.

Consumer Edge’s Click is available in machine-readable file delivery and enables:
• Comprehensive Global Coverage: Insights across 620+ brands and 59 countries, including key markets in the US, Europe, Asia, and Latin America.
• Integrated Data Ecosystem: Click seamlessly maps web traffic data to CE entities and stock tickers, enabling a unified view across various business intelligence tools.
• Near Real-Time Insights: Daily data delivery with a 5-day lag ensures timely, actionable insights for agile decision-making.
• Enhanced Forecasting Capabilities: Combining web traffic indicators with transaction data helps identify patterns and predict revenue performance.

Use Case: Analyze Year Over Year Growth Rate by Region

Problem: A public investor wants to understand how a company’s year-over-year growth differs by region.

Solution: The firm leveraged Consumer Edge Click data to:
• Gain visibility into key metrics like views, bounce rate, visits, and addressable spend
• Analyze year-over-year growth rates for a time period
• Break out data by geographic region to see growth trends
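The year-over-year comparison by region in this use case reduces to simple arithmetic: growth = (current period - same period a year earlier) / same period a year earlier. The sketch below illustrates that formula only; the region names and visit counts are invented, and nothing here reflects the actual schema or values of the Click product.

```python
def yoy_growth(current, previous):
    """Year-over-year growth as a fraction: (current - previous) / previous."""
    if previous == 0:
        raise ValueError("previous-period value must be non-zero")
    return (current - previous) / previous


# Hypothetical visit counts per region: (same period last year, this year).
visits = {
    "US": (1000, 1200),
    "Europe": (800, 760),
}

growth_by_region = {
    region: yoy_growth(this_year, last_year)
    for region, (last_year, this_year) in visits.items()
}

for region, growth in growth_by_region.items():
    # Format as a signed percentage with one decimal place.
    print(f"{region}: {growth:+.1%}")
```

With these made-up numbers the first region grows by a fifth while the second shrinks slightly, which is the kind of regional contrast the use case is after.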

Metrics include:
• Spend
• Items
• Volume
• Transactions
• Price Per Volume

Inquire about a Click subscription to perform more complex, near real-time analyses on public tickers and private brands, as well as for industries beyond CPG, for example:
• Monitor web traffic as a leading indicator of stock performance and consumer demand
• Analyze customer interest and sentiment at the brand and sub-brand levels

Consumer Edge offers a variety of datasets covering the US, Europe (UK, Austria, France, Germany, Italy, Spain), and across the globe, with subscription options serving a wide range of business needs.

Consumer Edge is the Leader in Data-Driven Insights Focused on the Global Consumer
UI-Elements-Detection-Dataset
huggingface.co
Updated Nov 26, 2024
Cite
Yash Jain (2024). UI-Elements-Detection-Dataset [Dataset]. https://huggingface.co/datasets/YashJain/UI-Elements-Detection-Dataset
Dataset updated
Nov 26, 2024
Authors
Yash Jain
License
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
Description
Web UI Elements Dataset

Overview

A comprehensive dataset of web user interface elements collected from the world's most visited websites. This dataset is specifically curated for training AI models to detect and classify UI components, enabling automated UI testing, accessibility analysis, and interface design studies.

Key Features

- 300+ popular websites sampled
- 15 essential UI element classes
- High-resolution screenshots (1920x1080)
- Rich accessibility metadata…

See the full description on the dataset page: https://huggingface.co/datasets/YashJain/UI-Elements-Detection-Dataset.
Dashlane Login | How to Login Dashlane Account? Dataset
paperswithcode.com
Updated Jun 17, 2025
Cite
(2025). Dashlane Login | How to Login Dashlane Account? Dataset [Dataset]. https://paperswithcode.com/dataset/dashlane-login-how-to-login-dashlane-account
Dataset updated
Jun 17, 2025
Description
It's hard to keep track of passwords in our digital world. It might be hard to remember logins, keep your accounts safe, and manage many accounts at once. This is where Dashlane comes in. It makes managing passwords easy because it has a simple UI and strong security. This tutorial is for you if you want to know how to log in to your Dashlane account and what makes it different from other password managers.

Why should you use Dashlane to manage your passwords?

It's crucial to know why millions of people around the world choose Dashlane before we get into the login process. Here's how it helps people in their daily lives:

Better security: Dashlane uses strong encryption to keep your credentials safe. It protects your data with AES-256 encryption, a widely adopted industry standard, to guard against hackers.

Easy to use on all devices: Dashlane makes it easy to store passwords on various devices. You can easily get to your login information on any device, whether it's a smartphone, laptop, or tablet.

Easier to log in: After you set it up, Dashlane's autofill function lets you log in to apps and websites without having to type in your login and password. Not only is it faster, but it also gets rid of mistakes.

Keeping an eye on the dark web: Dashlane does more than merely keep track of passwords. It also checks the dark web for leaks of personal information. You will be notified right away if your information has been leaked.

Use a VPN to keep your privacy safe: Dashlane has a virtual private network (VPN) in addition to password management to keep your browsing private on public Wi-Fi.

How to Access Your Dashlane Account

Whether you're new to Dashlane or use it every day, it's easy to log in. To safely log into your account and start managing your passwords, do the following:

Step 1: Get the Dashlane app. Downloading Dashlane is the first thing you need to do if you're new. It works on all of the most popular platforms, including Windows, macOS, iOS, and Android. You can also use Dashlane as a browser extension on popular browsers including Chrome, Firefox, and Edge.

Step 2: Launch Dashlane. Once you've installed the Dashlane app or browser extension, open it.

Step 3: Type in your email address. Type in the email address that is linked to your Dashlane login account. This will take you to the login page.

Step 4: Give your master password. You'll need to make a master password the first time you log in. This is a single, strong password that opens your vault. To get back in, just type in your master password. Tip: Your master password should be hard to guess but easy to remember. Think about using numerals, letters, and special characters in both upper and lower case.

Step 5: Verify (if necessary). If you have two-factor authentication (2FA) set up on your account, you will also need to complete this step. Dashlane may ask you to enter a code that was delivered to your email or generated by an authentication app.

Step 6: Open your vault. Once you sign in, you'll see your password vault. This is where you can manage your stored logins, credit card information, and confidential notes.

Important Security Features of Dashlane

Dashlane puts your safety first with these cutting-edge features:

Encryption with AES-256: Your private information is stored with military-grade encryption, which keeps it safe from hackers.

Zero-knowledge architecture: Dashlane uses a zero-knowledge security model, which means that the company can't see or get to your passwords.

Ways to log in with biometrics: You can make things easier without giving up security by turning on biometric authentication, such as Face ID or fingerprint scanning, on devices that allow it.

Information about the health of your passwords: Dashlane doesn't just keep your passwords safe; it also analyzes them. It flags passwords that are weak or have been reused, which helps you make your accounts stronger.

Access in an emergency: You can let a trustworthy person in if there is an emergency. This makes it easier to keep track of critical accounts.

How to Get the Most Out of Dashlane

To get the most out of your Dashlane login account, follow these tips:

Turn on Autofill: Autofill can help you save time when you log in, especially to sites you visit often.

Change your passwords often: Change your passwords every now and then to make them more secure. You can make strong, unique passwords in seconds using Dashlane's Password Generator.

Turn on two-factor authentication: Always use two-factor authentication (2FA) to add an extra layer of security to your account. This way, even if your password is stolen, your account will still be safe.

Use the Password Health Tool: Check your password health score often and change any credentials that are marked.

What Makes Dashlane Unique

Dashlane is different from other password managers since it is easy to use and has sophisticated capabilities like monitoring the dark web and built-in VPN services. Dashlane keeps your information safe without slowing you down when you check in for work, shop online, or manage your personal accounts.

Last Thoughts

It's easier than ever to keep your internet safety in check. Dashlane makes it easy to get to your accounts, keeps your private data safe, and protects you from breaches before they happen. Now that you know how to log in to Dashlane, why not give it a shot and take charge of your passwords? Dashlane is the greatest way to keep your online life safe because your safety deserves the best.
Recipes dataset from allrecipes
crawlfeeds.com
csv, zip
Updated Jul 3, 2025
Cite
Crawl Feeds (2025). Recipes dataset from allrecipes [Dataset]. https://crawlfeeds.com/datasets/recipes-dataset-from-allrecipes
Available download formats: zip, csv
Dataset updated
Jul 3, 2025
Dataset authored and provided by
Crawl Feeds
License
https://crawlfeeds.com/privacy_policy
Description
Unleash the culinary potential with our comprehensive Recipes dataset from Allrecipes. This dataset provides detailed information on a vast collection of recipes sourced from Allrecipes, one of the world's most popular recipe websites. Ideal for chefs, food enthusiasts, developers, and data scientists, this dataset offers an extensive range of culinary possibilities.

The dataset includes key details such as recipe titles, ingredients, preparation instructions, cooking times, user ratings, and dietary categories. With recipes spanning various cuisines, dietary preferences, and meal types, this dataset is a valuable resource for creating recipe apps, conducting nutritional analysis, or exploring new culinary trends.

Looking for more data to fuel your food-related projects? Check out our Food & Beverage Data for diverse datasets designed to inspire and empower innovation in the food and beverage industry.

Enhance your food-related projects with structured, high-quality data from Allrecipes. Whether developing a recipe recommendation engine, building a food blog, or researching cooking trends, this dataset is your go-to resource for delicious inspiration and data-driven culinary insights.
Global Starlink Web Cache Latency & Traceroute Measurement Dataset
zenodo.org
Updated Feb 6, 2025
Cite
Qi Zhang; Zeqi Lai; Qian Wu; Jihao Li; HEWU LI (2025). Global Starlink Web Cache Latency & Traceroute Measurement Dataset [Dataset]. http://doi.org/10.5281/zenodo.14800115
Explore at:
Unique identifier
https://doi.org/10.5281/zenodo.14800115
Dataset updated
Feb 6, 2025
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Qi Zhang; Zeqi Lai; Qian Wu; Jihao Li; HEWU LI
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
This dataset contains global web cache latency measurements collected via RIPE Atlas probes equipped with Starlink terminals across five continents, spanning over 24 hours and resulting in ~2 Million measurements. The measurements aim to evaluate the user-perceived latency of accessing popular websites through low-earth orbit (LEO) satellite networks.

This dataset is a product of Spache, a research project on web caching from space. Please refer to its WWW'25 paper for more details and analysis results.

Dataset File Content

The dataset includes the following files:

Metadata

Target website list: A list of the top 50 most popular websites according to Alexa ranking.

RIPE Atlas Measurement IDs: For each website, the corresponding RIPE Atlas Measurement IDs for both Ping and Traceroute measurements are provided.

Note: microsoftonline.com (originally ranked 41st) is not included in the list due to its unresolvable domain name.

Measurement results - Raw Data

Ping and Traceroute results: Raw measurement results for each target website, including detailed information on each measurement.

Note: For details on the measurement result formats, please refer to the RIPE Atlas documentation.

Measurement results - Preprocessed Latency

Ping RTT latency: Preprocessed data containing the minimum RTT (Round Trip Time, in milliseconds) for each Ping measurement to all target websites.

Probe information: Corresponding Probe IDs, along with their respective countries and continents at the time of measurement.
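
The preprocessing step described above (one minimum RTT per Ping measurement) can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: it assumes the raw files follow the usual RIPE Atlas ping result layout, where each entry has a "result" list whose items carry an "rtt" value for replies and an "x" marker for timeouts. The function name and the sample values are hypothetical.

```python
from typing import Optional


def min_rtt_ms(ping_result: dict) -> Optional[float]:
    """Return the smallest RTT (ms) among the replies in one ping result.

    Assumes the RIPE Atlas ping layout: a "result" list whose items hold
    an "rtt" number for replies, or an "x"/"error" key when no reply came.
    """
    rtts = [
        reply["rtt"]
        for reply in ping_result.get("result", [])
        if isinstance(reply.get("rtt"), (int, float))
    ]
    # No usable reply at all (e.g. every packet timed out).
    return min(rtts) if rtts else None


# Hypothetical sample entry: two replies and one timeout.
sample = {
    "prb_id": 12345,  # hypothetical probe ID
    "result": [{"rtt": 48.2}, {"x": "*"}, {"rtt": 41.7}],
}
print(min_rtt_ms(sample))
```

Returning None for measurements without any reply keeps timeouts distinguishable from genuinely low latencies when the values are aggregated later.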

This dataset is intended to support research on web caching, particularly in the context of satellite Internet. Please cite both this dataset and the associated paper if you find this data useful.
Cookie Synchronization HTTP Requests
find.data.gov.scot
dtechtive.com
csv, json, pdf, txt
Updated Apr 14, 2022
Cite
University of Edinburgh, School of Informatics, Laboratory for the Foundations of Computer Science (LFCS), Security Group (2022). Cookie Synchronization HTTP Requests [Dataset]. http://doi.org/10.7488/ds/3441
Available download formats: csv (13.12 MB), json (345.9 MB), txt (0.0166 MB), pdf (0.8557 MB), json (27.86 MB)
Unique identifier
https://doi.org/10.7488/ds/3441
Dataset updated
Apr 14, 2022
Dataset provided by
University of Edinburgh, School of Informatics, Laboratory for the Foundations of Computer Science (LFCS), Security Group
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The dataset was created as part of a Master's thesis aiming to automatically detect cookie synchronization. It contains information about all the HTTP requests made when Selenium was used to visit the top 2000 most popular websites.
‘Spotify Past Decades Songs Attributes’ analyzed by Analyst-2
analyst-2.ai
Updated Jan 28, 2022
Cite
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2022). ‘Spotify Past Decades Songs Attributes’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-spotify-past-decades-songs-attributes-57a7/4e9b7dfe/?iid=011-638&v=presentation
Dataset updated
Jan 28, 2022
Dataset authored and provided by
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Analysis of ‘Spotify Past Decades Songs Attributes’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/cnic92/spotify-past-decades-songs-50s10s on 28 January 2022.

--- Dataset description provided by original source is as follows ---

Context

Why do we like some songs more than others? Is there something about a song that pleases our subconscious, making us listen to it on repeat? To understand this, I collected various attributes from a selection of songs available in Spotify's "All out ..s" playlists, from the 50s up to the recently ended 10s. Can you find the secret sauce that makes a song popular?

Content

This data repo contains 7 datasets (.csv files), each representing one of Spotify's "All out ..s" playlists. These playlists collect the most popular/iconic songs from each decade. For each song, a set of attributes is reported to support data analysis. The attributes were scraped from this amazing website. In particular, according to the website, the attributes are:

top genre: genre of the song

year: year of the song (due to re-releases, the year might not correspond to the release year of the original song)

bpm(beats per minute): beats per minute

nrgy(energy): energy of a song, the higher the value the more energetic the song is

dnce(danceability): the higher the value, the easier it is to dance to this song.

dB(loudness): the higher the value, the louder the song.

live(liveness): the higher the value, the more likely the song is a live recording.

val(valence): the higher the value, the more positive mood for the song.

dur(duration): the duration of the song.

acous(acousticness): the higher the value the more acoustic the song is.

spch(speechiness): the higher the value the more spoken word the song contains.

pop(popularity): the higher the value the more popular the song is.
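
As a starting point for the "secret sauce" question, the attributes above can be correlated with popularity. The sketch below is only an illustration: it assumes the CSV headers use the short names listed above (bpm, nrgy, dnce, pop), and the four rows are invented placeholders, not values from the dataset.

```python
import csv
import io
from statistics import mean

# Hypothetical rows; the real files are assumed to use the short column names above.
SAMPLE_CSV = """title,bpm,nrgy,dnce,pop
Song A,120,80,70,75
Song B,95,40,50,52
Song C,130,90,85,81
Song D,100,55,60,60
"""


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def correlation_with_popularity(csv_text, attribute):
    """Correlate one numeric attribute column with the "pop" column."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    xs = [float(row[attribute]) for row in rows]
    ys = [float(row["pop"]) for row in rows]
    return pearson(xs, ys)


for attr in ("bpm", "nrgy", "dnce"):
    print(attr, round(correlation_with_popularity(SAMPLE_CSV, attr), 3))
```

To run this on a real file, the file contents can be read with open(...).read() and passed in place of SAMPLE_CSV, provided the column names match.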

Acknowledgements

I got inspired by the top-notch work by Leonardo Henrique in this dataset. Thanks to him I discovered this website, from which all the data collected here have been scraped.

--- Original source retains full ownership of the source dataset ---
Machine Learning Dataset
brightdata.com
.json, .csv, .xlsx
Updated Jun 19, 2024
Cite
Bright Data (2024). Machine Learning Dataset [Dataset]. https://brightdata.com/products/datasets/machine-learning
Available download formats: .json, .csv, .xlsx
Dataset updated
Jun 19, 2024
Dataset authored and provided by
Bright Data (https://brightdata.com/)
License
https://brightdata.com/license
Area covered
Worldwide
Description
Utilize our machine learning datasets to develop and validate your models. Our datasets are designed to support a variety of machine learning applications, from image recognition to natural language processing and recommendation systems. You can access a comprehensive dataset or tailor a subset to fit your specific requirements, using data from a combination of various sources and websites, including custom ones. Popular use cases include model training and validation, where the dataset can be used to ensure robust performance across different applications. Additionally, the dataset helps in algorithm benchmarking by providing extensive data to test and compare various machine learning algorithms, identifying the most effective ones for tasks such as fraud detection, sentiment analysis, and predictive maintenance. Furthermore, it supports feature engineering by allowing you to uncover significant data attributes, enhancing the predictive accuracy of your machine learning models for applications like customer segmentation, personalized marketing, and financial forecasting.

Instagram: most popular posts as of 2024

statista.com

Cite

Stacy Jo Dixon, Instagram: most popular posts as of 2024 [Dataset]. https://www.statista.com/topics/1164/social-networks/

Dataset provided by

Statista (http://statista.com/)

Authors

Stacy Jo Dixon

Description

Instagram’s most popular post

As of April 2024, the most popular post on Instagram was Lionel Messi and his teammates after winning the 2022 FIFA World Cup with Argentina, posted by the account @leomessi. Messi's post, which racked up over 61 million likes within a day, knocked off the reigning post, which was 'Photo of an Egg'. Originally posted in January 2019, 'Photo of an Egg' surpassed the world’s most popular Instagram post at that time, which was a photo by Kylie Jenner’s daughter totaling 18 million likes.
After several cryptic posts published by the account, World Record Egg revealed itself to be a part of a mental health campaign aimed at the pressures of social media use.

Instagram’s most popular accounts

As of April 2024, the official Instagram account @instagram had the most followers of any account on the platform, with 672 million followers. Portuguese footballer Cristiano Ronaldo (@cristiano) was the most followed individual with 628 million followers, while Selena Gomez (@selenagomez) was the most followed woman on the platform with 429 million. Additionally, Inter Miami CF striker Lionel Messi (@leomessi) had a total of 502 million. Celebrities such as The Rock, Kylie Jenner, and Ariana Grande all had over 380 million followers each.

Instagram influencers

In the United States, the leading content category of Instagram influencers was lifestyle, with 15.25 percent of influencers creating lifestyle content in 2021. Music ranked in second place with 10.96 percent, followed by family with 8.24 percent. Having a large audience can be very lucrative: Instagram influencers in the United States, Canada and the United Kingdom with over 90,000 followers made around 1,221 US dollars per post.

Instagram around the globe

Instagram’s worldwide popularity continues to grow, and India is the leading country in terms of number of users, with over 362.9 million users as of January 2024. The United States had 169.65 million Instagram users and Brazil had 134.6 million users. The social media platform was also very popular in Indonesia and Turkey, with 100.9 million and 57.1 million users, respectively. As of January 2024, Instagram was the fourth most popular social network in the world, behind Facebook, YouTube and WhatsApp.

Data_Sheet_1_Genetic Privacy and Data Protection: A Review of Chinese...
frontiersin.figshare.com
figshare.com
pdf
Updated Jun 3, 2023
Cite
Li Du; Meng Wang (2023). Data_Sheet_1_Genetic Privacy and Data Protection: A Review of Chinese Direct-to-Consumer Genetic Test Services.PDF [Dataset]. http://doi.org/10.3389/fgene.2020.00416.s001
Available download formats: pdf
Unique identifier
https://doi.org/10.3389/fgene.2020.00416.s001
Dataset updated
Jun 3, 2023
Dataset provided by
Frontiers
Authors
Li Du; Meng Wang
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Background: The existing literature has not examined how Chinese direct-to-consumer (DTC) genetic testing providers navigate the issues of informed consent, privacy, and data protection associated with testing services. This research aims to explore these questions by examining the relevant documents and messages published on the websites of Chinese DTC genetic test providers.

Methods: Using Baidu.com, the most popular Chinese search engine, we compiled the websites of providers who offer genetic testing services and analyzed available documents related to informed consent, the terms of services, and the privacy policy. The analyses were guided by the following inquiries as they applied to each DTC provider: the methods available for purchasing testing products; the methods providers used to obtain informed consent; privacy issues and measures for protecting consumers’ health information; the policy for third-party data sharing; consumers’ rights to their data; and the liabilities in the event of a data breach.

Results: 68.7% of providers offer multiple channels for purchasing genetic testing products, and social media has become a popular platform to promote testing services. Informed consent forms are not available on 94% of providers’ websites and a privacy policy is only offered by 45.8% of DTC genetic testing providers. Thirty-nine providers stated that they used measures to protect consumers’ information, of which 29 providers have distinguished consumers’ general personal information from their genetic information. In 33.7% of the cases examined, providers stated that with consumers’ explicit permission, they could reuse and share the clients’ information for non-commercial purposes. Twenty-three providers granted consumers rights to their health information, with the most frequently mentioned right being the consumers’ right to decide how their data can be used by providers. Lastly, 21.7% of providers clearly stated their liabilities in the event of a data breach, placing more emphasis on the providers’ exemption from any liability.

Conclusions: Currently, the Chinese DTC genetic testing business is running in a regulatory vacuum, governed by self-regulation. The government should develop a comprehensive legal framework to regulate DTC genetic testing offerings. Regulatory improvements should be made based on periodical reviews of the supervisory strategy to meet the rapid development of the DTC genetic testing industry.
App + Web Consumer Data | MFour's 1st Party - App + Web Usage Data | 2M...
datarade.ai
.csv
Updated Nov 14, 2023
Cite
mfour (2023). App + Web Consumer Data | MFour's 1st Party - App + Web Usage Data | 2M consumers, 3B+ events verified, US consumers | CCPA Compliant [Dataset]. https://datarade.ai/data-categories/app-data/datasets
Available download formats: .csv
Dataset updated
Nov 14, 2023
Dataset authored and provided by
mfour
Area covered
United States of America
Description
At MFour, our Behavioral Data stands out for its uniqueness and depth of insights. What makes our data genuinely exceptional is the combination of several key factors:

First-Party Opt-In Data: Our data is sourced directly from our opt-in panel of consumers who willingly participate in research and provide observed behaviors. This ensures the highest data quality and eliminates privacy concerns. CCPA compliant.

Unparalleled Data Coverage: With access to 3B+ events, we have an extensive pool of participants who allow us to observe their brick + mortar location visitation, app + web smartphone usage, or both. This large-scale coverage provides robust and reliable insights.

Our data is generally sourced through our Surveys On The Go (SOTG) mobile research app, where consumers are incentivized with cash rewards to participate in surveys and share their observed behaviors. This incentivized approach ensures a willing and engaged panel, leading to the highest-quality data.

The primary use cases and verticals of our Behavioral Data Product are diverse and varied. Some key applications include:

Data Acquisition and Modeling: Our data helps businesses acquire valuable insights into consumer behavior and enables modeling for various research objectives.

Shopper Data Analysis: By understanding purchase behavior and patterns, businesses can optimize their strategies, improve targeting, and enhance customer experiences.

Media Consumption Insights: Our data provides a deep understanding of viewer behavior and patterns across popular platforms like YouTube, Amazon Prime, Netflix, and Disney+, enabling effective media planning and content optimization.

App Performance Optimization: Analyzing app behavior allows businesses to monitor usage patterns, track key performance indicators (KPIs), and optimize app experiences to drive user engagement and retention.

Location-Based Targeting: With our detailed location data, businesses can map out consumer visits to physical venues and combine them with web and app behavior to create predictive ad targeting strategies.

Audience Creation for Ad Placement: Our data enables the creation of highly targeted audiences for ad campaigns, ensuring better reach and engagement with relevant consumer segments.

The Behavioral Data Product complements our comprehensive suite of data solutions in the broader context of our data offering. It provides granular and event-level insights into consumer behaviors, which can be combined with other data sets such as survey responses, demographics, or custom profiling questions to offer a holistic understanding of consumer preferences, motivations, and actions.

MFour's Behavioral Data empowers businesses with unparalleled consumer insights, allowing them to make data-driven decisions, uncover new opportunities, and stay ahead in today's dynamic market landscape.
OPRA Dataset
paperswithcode.com
Updated Mar 26, 2023
Cite
Kuan Fang; Te-Lin Wu; Daniel Yang; Silvio Savarese; Joseph J. Lim (2023). OPRA Dataset [Dataset]. https://paperswithcode.com/dataset/opra
Dataset updated
Mar 26, 2023
Authors
Kuan Fang; Te-Lin Wu; Daniel Yang; Silvio Savarese; Joseph J. Lim
Description
The OPRA Dataset was introduced in Demo2Vec: Reasoning Object Affordances From Online Videos (CVPR'18) for reasoning about object affordances from online demonstration videos. It contains 11,505 demonstration clips and 2,512 object images scraped from 6 popular YouTube product review channels, along with the corresponding affordance annotations. More details can be found on the project page: https://sites.google.com/view/demo2vec/.
Ice Cream Dataset
kaggle.com
Updated Oct 4, 2020
Cite
Tyson Pond (2020). Ice Cream Dataset [Dataset]. https://www.kaggle.com/tysonpo/ice-cream-dataset/tasks
Dataset updated
Oct 4, 2020
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Tyson Pond
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
🍦 Overview

This dataset contains details (including ingredients), images, and reviews of 241 ice cream flavors across 4 brands (Ben & Jerry's, Häagen-Dazs, Breyers, and Talenti). There are a total of 21,674 reviews, with each review containing star ratings and text. The data was collected directly from the brand websites: (i) https://www.benjerry.com/flavors/ice-cream-pints, (ii) https://www.haagendazs.us/products, (iii) https://www.breyers.com/us/en/products.html, (iv) https://www.talentigelato.com/product-category/talenti-gelato-flavors. Below we describe the three components to this dataset.

products.csv -- Descriptive information about each flavor such as: the flavor name, description, average rating, and ingredients list.

The key column matches the key column in reviews.csv and the file name in the images/ directory.

reviews.csv -- Reviews for each flavor. The review information includes: review author, review date, stars (out of 5), review text, upvotes/downvotes (helpful yes/no), etc.

images/ -- The product images.

There are five main directories. Four are for the individual brands: bj/=Ben & Jerry's, hd/=Häagen-Dazs, breyers/=Breyers, talenti/=Talenti. A fifth directory, combined/, contains the merged data. However, since the data reported between the websites is slightly different, there are several NA values in the merged data.
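
Because products.csv and reviews.csv share the key column, the two files can be joined on it. The sketch below is a minimal illustration under assumptions: only the key column is taken from the description above, while the "name" and "stars" column names and all sample rows are hypothetical stand-ins for the real files.

```python
import csv
import io
from collections import defaultdict

# Hypothetical miniature files; only the shared "key" column comes from the description.
PRODUCTS_CSV = """key,name
0_bj,Salted Caramel Core
1_bj,Chocolate Fudge
"""
REVIEWS_CSV = """key,stars
0_bj,5
0_bj,3
1_bj,4
"""


def average_stars_by_product(products_text, reviews_text):
    """Join reviews to products on "key" and average the star ratings."""
    stars = defaultdict(list)
    for review in csv.DictReader(io.StringIO(reviews_text)):
        stars[review["key"]].append(int(review["stars"]))

    averages = {}
    for product in csv.DictReader(io.StringIO(products_text)):
        ratings = stars.get(product["key"], [])
        # Products without any review get None rather than a misleading 0.
        averages[product["name"]] = sum(ratings) / len(ratings) if ratings else None
    return averages


print(average_stars_by_product(PRODUCTS_CSV, REVIEWS_CSV))
```

The same key could also be used to look up the matching file in the images/ directory, although the exact file extension is not stated in the description.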

🍦 Uses

There are several uses for this dataset. You could: (i) determine which flavors are most popular, (ii) investigate why popular flavors are popular (e.g. by extracting info from reviews & ingredients), (iii) suggest a new recipe by examining ingredients list, (iv) apply sentiment analysis, (v) compare brands, etc.

🍦 Considerations

The collection of reviews on the brand websites may not be representative of overall opinion, i.e. there may be review censoring or presence of fake reviews meant to help/harm the image of the brand. We intentionally chose brands that host some negative reviews on their website (see e.g. Enlightened or Rebel for only 4-5 star reviews).

Ben & Jerry's, Breyers, and Talenti are all owned by Unilever. Häagen-Dazs is owned by Froneri. Talenti is distinguished from the other brands as they produce gelato.

The images collection probably isn't large enough to use for any computer vision tasks, but may be useful for EDA presentation. Some products may be missing a corresponding image. Ben & Jerry's and Häagen-Dazs images show actual ice cream scoops. Breyers and Talenti only show containers.

Some reviews (mostly specific to Häagen-Dazs) include "...[This review was collected as part of a promotion]" in the text. Also -- not intended -- Talenti's review text seems to also include company feedback, i.e.: "We appreciate your feedback! ... please feel free to contact us directly at consumer.services@unilever.com".

🍦 Recent and future updates

In version 1 the data consisted only of Ben & Jerry's and Häagen-Dazs. In the latest version I added Breyers and Talenti data.

I do not have other planned updates, but I may try to add data from other brands (such as Dreyer's). Please let me know if you have any suggestions/questions 😊
Data from: Human preferences for dogs and cats in China: the current...
search.dataone.org
datadryad.org
Updated Dec 18, 2024
Cite
Zhang Xu; He Yuansi; Yang Shuai; Wang Daiping (2024). Human preferences for dogs and cats in China: the current situation and influencing factors of watching online videos and pet ownership [Dataset]. http://doi.org/10.5061/dryad.qfttdz0rr
Unique identifier
https://doi.org/10.5061/dryad.qfttdz0rr
Dataset updated
Dec 18, 2024
Dataset provided by
Dryad Digital Repository
Authors
Zhang Xu; He Yuansi; Yang Shuai; Wang Daiping
Description
Dogs and cats have become the most important and successful pets through long-term domestication. People keep them for various reasons, such as their functional roles or for physical or psychological support. However, why humans are so attached to dogs and cats remains unclear. A comprehensive understanding of the current state of human preferences for dogs and cats and the potential influential factors behind it is required. Here, we investigate this question using two independent online datasets and anonymous questionnaires in China. We find that current human preferences for dog and cat videos are relatively higher than for most other interests, with video plays ranking among the top three out of fifteen interests. We also find genetic variations, gender, age, and economic development levels notably influence human preferences for dogs and cats. Specifically, dog and cat ownership are significantly associated with parents’ pet ownership of dogs and cats (Spearman’s rank correlation c...

Human preferences for dogs and cats in China: the current situation and influencing factors of watching online videos and pet ownership

https://doi.org/10.5061/dryad.qfttdz0rr

This dataset contains three CSV data files, each corresponding to one of the three parts described in the study.

Description of the data and file structure

“1, bilibili.csv”: contains data extracted from the Bilibili website. Each row in the dataset represents yearly data for each popular channel. Missing data are indicated with NA.

ID: The serial number for each video, ranging from 1 to 167368.

year: The year the video was published on the website, from 2009 to 2021.

Videourl: The URL of the video.

plays: The total number of plays for the video.

likes: The total number of likes for the video.

sort: The ranking of the video in terms of play count among all popular videos in its channel for that year.

channelID: The I...

