31 datasets found

BrowserART
huggingface.co
Updated Oct 11, 2024
Cite
Scale AI (2024). BrowserART [Dataset]. https://huggingface.co/datasets/ScaleAI/BrowserART
Explore at:
Croissant. Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Oct 11, 2024
Dataset authored and provided by
Scale AI (https://scale.com/)
License
Attribution-NonCommercial 4.0 (CC BY-NC 4.0), https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Description
Refusal-Trained LLMs Are Easily Jailbroken As Browser Agents

Paper PDF

Homepage

Github

This project contains the behavior dataset in BrowserART, a red-teaming test suite tailored to browser agents.

Abstract

For safety reasons, large language models (LLMs) are trained to refuse harmful user instructions, such as assisting dangerous activities. We study an open question in this work: Can the desired safety refusal, typically enforced in chat… See the full description on the dataset page: https://huggingface.co/datasets/ScaleAI/BrowserART.
📣 Ad Click Prediction Dataset
kaggle.com
Updated Sep 7, 2024
Cite
Ciobanu Marius (2024). 📣 Ad Click Prediction Dataset [Dataset]. https://www.kaggle.com/datasets/marius2303/ad-click-prediction-dataset
Explore at:
Croissant. Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Sep 7, 2024
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Ciobanu Marius
License
Apache License, v2.0, https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
About

This dataset provides insights into user behavior and online advertising, specifically focusing on predicting whether a user will click on an online advertisement. It contains user demographic information, browsing habits, and details related to the display of the advertisement. This dataset is ideal for building binary classification models to predict user interactions with online ads.

Features

id: Unique identifier for each user.

full_name: User's name formatted as "UserX" for anonymity.

age: Age of the user (ranging from 18 to 64 years).

gender: The gender of the user (categorized as Male, Female, or Non-Binary).

device_type: The type of device used by the user when viewing the ad (Mobile, Desktop, Tablet).

ad_position: The position of the ad on the webpage (Top, Side, Bottom).

browsing_history: The user's browsing activity prior to seeing the ad (Shopping, News, Entertainment, Education, Social Media).

time_of_day: The time when the user viewed the ad (Morning, Afternoon, Evening, Night).

click: The target label indicating whether the user clicked on the ad (1 for a click, 0 for no click).

Goal

The objective of this dataset is to predict whether a user will click on an online ad based on their demographics, browsing behavior, the context of the ad's display, and the time of day. You will need to clean the data, explore it, and then train and evaluate machine learning models. This is a challenging task for this kind of data. The results can be used to improve ad targeting strategies, optimize ad placement, and better understand user interaction with online advertisements.
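As a first exploratory step before any modelling, the click rate per category can be tabulated. The sketch below uses only the Python standard library and a handful of made-up rows. The column names follow the feature list above, but the values are illustrative assumptions, not records from the dataset, and the sketch assumes the rows have already been cleaned of missing values.

```python
from collections import defaultdict

# Hypothetical rows using column names from the feature list above;
# the values are invented for illustration only.
rows = [
    {"device_type": "Mobile", "ad_position": "Top", "click": 1},
    {"device_type": "Mobile", "ad_position": "Side", "click": 0},
    {"device_type": "Desktop", "ad_position": "Top", "click": 1},
    {"device_type": "Tablet", "ad_position": "Bottom", "click": 0},
    {"device_type": "Desktop", "ad_position": "Top", "click": 0},
]


def click_rate_by(rows, column):
    """Return {category: fraction of rows with click == 1} for one column."""
    clicks = defaultdict(int)
    totals = defaultdict(int)
    for row in rows:
        key = row[column]
        totals[key] += 1
        clicks[key] += row["click"]
    return {key: clicks[key] / totals[key] for key in totals}


rates = click_rate_by(rows, "ad_position")
for position, rate in sorted(rates.items()):
    print(f"{position}: {rate:.2f}")
```

The same helper can be pointed at any categorical column (for example `time_of_day`) to see which features separate clickers from non-clickers before a classifier is trained.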
Internet security and privacy related practices, frequency of deleting...
open.canada.ca
www150.statcan.gc.ca
+4 more
csv, html, xml
Updated Jan 17, 2023
Cite
Statistics Canada (2023). Internet security and privacy related practices, frequency of deleting browser history by age group, level of education and household income [Dataset]. https://open.canada.ca/data/en/dataset/73f7c222-d7fc-4d64-a784-593690f5fd05
Explore at:
Available download formats: html, xml, csv
Dataset updated
Jan 17, 2023
Dataset provided by
Statistics Canada
License
Open Government Licence - Canada 2.0, https://open.canada.ca/en/open-government-licence-canada
License information was derived automatically
Description
Canadian Internet use survey, internet security and privacy related practices, frequency of deleting browser history by age group, level of education and household income quartile for Canada from 2010 and 2012.
Register of green-roof activities: database application - Dataset - B2FIND
b2find.eudat.eu
Updated Aug 5, 2024
Cite
(2024). Register of green-roof activities: database application - Dataset - B2FIND [Dataset]. https://b2find.eudat.eu/dataset/b562ad4e-924f-5281-a809-f0a2435b2a76
Explore at:
Dataset updated
Aug 5, 2024
Description
This repository contains a web application (written in Flask, Python) that allows green roofs (and other locations) to register the activities they hold. All entries are automatically added to an SQLite database. In its presented form, six green roofs of Barcelona are included, but this can be customised as needed. An empty database file is provided along with the SQLite schema to generate the tables. The Python code is also included so that the application can be served from a server. The application registers internal and external activities, alongside different fields of data that allow the characterisation of indicators related to the social performance of the roof. The menu is offered in Catalan, but can be customised for other languages. The list of external activity types can be expanded and edited. Moreover, each activity entry can be edited after its creation. The application needs to be run on a Linux/BSD server. Once on a server, the application runs in any browser. The resulting database is a single SQLite file that can be edited and processed with any SQLite application, such as DB Browser for SQLite.
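Because the result is a single SQLite file, it can also be inspected with Python's built-in `sqlite3` module. The sketch below is a minimal illustration that builds an in-memory stand-in; the table name `activity` and the columns `roof` and `kind` are hypothetical, and the real names should be taken from the schema file shipped with the application.

```python
import sqlite3

# In-memory stand-in for the application's database file. The table and
# column names are hypothetical; consult the schema shipped with the app.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE activity (id INTEGER PRIMARY KEY, roof TEXT, kind TEXT)")
conn.executemany(
    "INSERT INTO activity (roof, kind) VALUES (?, ?)",
    [("Roof A", "internal"), ("Roof A", "external"), ("Roof B", "external")],
)

# Count registered activities per roof.
counts = dict(
    conn.execute("SELECT roof, COUNT(*) FROM activity GROUP BY roof ORDER BY roof")
)
print(counts)
conn.close()
```

To work with a real export, the `":memory:"` argument would be replaced by the path to the database file and the query adapted to the actual schema.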
Swash Web Browsing Clickstream Data - 1.5M Worldwide Users - GDPR Compliant
datarade.ai
.csv, .xls
Updated Jun 27, 2023
+ more versions
Cite
Swash (2023). Swash Web Browsing Clickstream Data - 1.5M Worldwide Users - GDPR Compliant [Dataset]. https://datarade.ai/data-products/swash-blockchain-bitcoin-and-web3-enthusiasts-swash
Explore at:
Available download formats: .csv, .xls
Dataset updated
Jun 27, 2023
Dataset authored and provided by
Swash
Area covered
Saint Vincent and the Grenadines, Liechtenstein, Russian Federation, Monaco, Belarus, Jamaica, Latvia, Jordan, Uzbekistan, India
Description
Unlock the Power of Behavioural Data with GDPR-Compliant Clickstream Insights.

Swash clickstream data offers a comprehensive and GDPR-compliant dataset sourced from users worldwide, encompassing both desktop and mobile browsing behaviour. Here's an in-depth look at what sets us apart and how our data can benefit your organisation.

User-Centric Approach: Unlike traditional data collection methods, we take a user-centric approach by rewarding users for the data they willingly provide. This unique methodology ensures transparent data collection practices, encourages user participation, and establishes trust between data providers and consumers.

Wide Coverage and Varied Categories: Our clickstream data covers diverse categories, including search, shopping, and URL visits. Whether you are interested in understanding user preferences in e-commerce, analysing search behaviour across different industries, or tracking website visits, our data provides a rich and multi-dimensional view of user activities.

GDPR Compliance and Privacy: We prioritise data privacy and strictly adhere to GDPR guidelines. Our data collection methods are fully compliant, ensuring the protection of user identities and personal information. You can confidently leverage our clickstream data without compromising privacy or facing regulatory challenges.

Market Intelligence and Consumer Behaviour: Gain deep insights into market intelligence and consumer behaviour using our clickstream data. Understand trends, preferences, and user behaviour patterns by analysing the comprehensive user-level, time-stamped raw or processed data feed. Uncover valuable information about user journeys, search funnels, and paths to purchase to enhance your marketing strategies and drive business growth.

High-Frequency Updates and Consistency: We provide high-frequency updates and consistent user participation, offering both historical data and ongoing daily delivery. This ensures you have access to up-to-date insights and a continuous data feed for comprehensive analysis. Our reliable and consistent data empowers you to make accurate and timely decisions.

Custom Reporting and Analysis: We understand that every organisation has unique requirements. That's why we offer customisable reporting options, allowing you to tailor the analysis and reporting of clickstream data to your specific needs. Whether you need detailed metrics, visualisations, or in-depth analytics, we provide the flexibility to meet your reporting requirements.

Data Quality and Credibility: We take data quality seriously. Our data sourcing practices are designed to ensure responsible and reliable data collection. We implement rigorous data cleaning, validation, and verification processes, guaranteeing the accuracy and reliability of our clickstream data. You can confidently rely on our data to drive your decision-making processes.
Octo Browser: Your Ultimate Web Browsing Solution
kaggle.com
Updated Apr 15, 2025
Cite
tahir tabassum (2025). Octo Browser: Your Ultimate Web Browsing Solution [Dataset]. https://www.kaggle.com/datasets/tahirtabassum/octo-browser-your-ultimate-web-browsing-solution/code
Explore at:
Croissant. Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Apr 15, 2025
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
tahir tabassum
Description

Octo Browser: Your Ultimate Web Browsing Solution
In today's digital world, having a good web browser is key. The Octo Browser is here to help. It offers a top-notch browsing experience unlike any other.

This browser has cool features and is easy to use. It's perfect for anyone, whether you're just browsing or need it for work. It's made to make your online time better.
The Octo Browser uses the latest tech. It loads pages quickly, keeps you safe, and is easy to get around. It's the best choice for anyone looking for a great browser.

Key Takeaways
- Advanced features for a seamless browsing experience
- Robust security to protect your online activities
- Fast page loading and intuitive navigation
- User-centric design for enhanced usability
- Ideal for both casual users and professionals

Introducing Octo Browser
Octo Browser is changing how we browse the web. It's a top-notch web browser that makes browsing fast. It's perfect for those who want quick and reliable results.
Octo Browser has cool features that make browsing better. It's easy to use and works great.

Key Features at a Glance
Octo Browser has some key features:
- High-speed page loading
- Advanced security protocols
- Intuitive interface design

These features make browsing smooth and safe. Experts say it's a game-changer:
"Octo Browser's blend of speed and security sets a new standard in the world of web browsers."

How Octo Browser Stands Out
Octo Browser is different because it focuses on speed and security. It offers a better browsing experience than others.

Blazing-Fast Performance
Octo Browser brings you the future of web browsing with its lightning-fast speed. It uses a top-notch rendering engine and smart resource management.

Optimized Rendering Engine
Octo Browser has an optimized rendering engine that makes pages load much faster. This means you can quickly move through your favorite websites.

Efficient Resource Management
The browser's efficient resource management makes sure your system runs smoothly. It prevents slowdowns and crashes. Key features include:
- Intelligent memory allocation
- Background process optimization
- Prioritization of active tabs

Speed Comparison with Leading Browsers
Octo Browser is the fastest among leading browsers. Here's why:
- Loads pages up to 30% faster than the average browser
- Maintains speed even with multiple tabs open
- Outperforms competitors in both JavaScript and page rendering tests

Uncompromising Security and Privacy
In today's digital world, security and privacy are key. Octo Browser is built with these in mind. It's a secure browser that protects your data from cyber threats.
Octo Browser is all about keeping your online activities safe. It has strong features to do just that.

Built-in Privacy Protection
Octo Browser has privacy features to keep your browsing private. It stops tracking and profiling, so your habits stay hidden.
It uses advanced anti-tracking tech. This blocks third-party cookies and other tracking tools.

Advanced Data Encryption
Data encryption is vital for online safety. Octo Browser uses advanced encryption protocols to secure your data.
This means your data is safe from unauthorized access. It's protected when you send or store it.

Automatic Security Updates
Octo Browser also has automatic security updates. This keeps your browser current with the latest security fixes.
This way, you're always safe from new threats. You don't have to manually update the browser.

Seamless User Experience
Octo Browser is designed with the user in mind. It offers a seamless user experience. This means users can easily explore their favorite websites.

Intuitive Interface Design
The Octo Browser has an intuitive interface design. It's easy to use and navigate. The layout is clean and simple, focusing on your browsing experience.

Extensive Customization Options
Octo Browser gives you extensive customization options. You can personalize your browsing experience. Choose from various themes, customize toolbar layouts, and more.
- Choose from multiple theme options
- Customize toolbar layouts
- Personalize your browsing experience

Cross-Device Synchronization
Octo Browser's cross-device synchronization lets you access your data on different devices. This means you ...
Data from: E2EGit: A Dataset of End-to-End Web Tests in Open Source Projects...
zenodo.org
bin, pdf, txt
Updated May 20, 2025
+ more versions
Cite
Sergio Di Meglio; Valeria Pontillo; Coen De Roover; Luigi Libero Lucio Starace; Sergio Di Martino; Ruben Opdebeeck (2025). E2EGit: A Dataset of End-to-End Web Tests in Open Source Projects [Dataset]. http://doi.org/10.5281/zenodo.14988988
Explore at:
Available download formats: txt, bin, pdf
Unique identifier
https://doi.org/10.5281/zenodo.14988988
Dataset updated
May 20, 2025
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Sergio Di Meglio; Valeria Pontillo; Coen De Roover; Luigi Libero Lucio Starace; Sergio Di Martino; Ruben Opdebeeck
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
ABSTRACT
End-to-end (E2E) testing is a software validation approach that simulates realistic user scenarios throughout the entire workflow of an application. In the context of web applications, E2E testing involves two activities: Graphical User Interface (GUI) testing, which simulates user interactions with the web app’s GUI through web browsers, and performance testing, which evaluates system workload handling. Despite its recognized importance in delivering high-quality web applications, the availability of large-scale datasets featuring real-world E2E web tests remains limited, hindering research in the field.
To address this gap, we present E2EGit, a comprehensive dataset of non-trivial open-source web projects collected on GitHub that adopt E2E testing. By analyzing over 5,000 web repositories across popular programming languages (JAVA, JAVASCRIPT, TYPESCRIPT, and PYTHON), we identified 472 repositories implementing 43,670 automated Web GUI tests with popular browser automation frameworks (SELENIUM, PLAYWRIGHT, CYPRESS, PUPPETEER), and 84 repositories that featured 271 automated performance tests implemented leveraging the most popular open-source tools (JMETER, LOCUST). Among these, 13 repositories implemented both types of testing for a total of 786 Web GUI tests and 61 performance tests.

DATASET DESCRIPTION
The dataset is provided as an SQLite database consisting of five tables, each serving a specific purpose. Its structure is illustrated in Figure 3 of the paper.

The repository table contains information on 1.5 million repositories collected using the SEART tool on May 4. It includes 34 fields detailing repository characteristics.

The non_trivial_repository table is a subset of the previous one, listing repositories that passed the two filtering stages described in the pipeline. For each repository, it specifies whether it is a web repository using JAVA, JAVASCRIPT, TYPESCRIPT, or PYTHON frameworks. A repository may use multiple frameworks, with corresponding fields (e.g., is web java) set to true, and the field web dependencies listing the detected web frameworks.

For Web GUI testing, the dataset includes two additional tables: gui_testing_test_details, where each row represents a test file, providing the file path, the browser automation framework used, the test engine employed, and the number of tests implemented in the file; and gui_testing_repo_details, which aggregates data from the previous table at the repository level. Each of the 472 repositories has a row summarizing the number of test files using frameworks like SELENIUM or PLAYWRIGHT, test engines like JUNIT, and the total number of tests identified.

For performance testing, the performance_testing_test_details table contains 410 rows, one for each test identified. Each row includes the file path, whether the test uses JMETER or LOCUST, and extracted details such as the number of thread groups, concurrent users, and requests. Notably, some fields may be absent: for instance, if external files (e.g., CSVs defining workloads) were unavailable, or in the case of Locust tests, where parameters like duration and concurrent users are specified via the command line.
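Since the dataset ships as an SQLite database, per-framework totals can be computed with a plain SQL aggregation. The sketch below is a hedged illustration on a tiny in-memory stand-in. The table name gui_testing_test_details comes from the description above, but the column names (`repo`, `file_path`, `framework`, `test_engine`, `num_tests`) and all rows are assumptions made for this example and may not match the published schema.

```python
import sqlite3

# Toy stand-in for the E2EGit database. Column names and rows are
# hypothetical; only the table name is taken from the description.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE gui_testing_test_details (
           repo TEXT, file_path TEXT, framework TEXT,
           test_engine TEXT, num_tests INTEGER)"""
)
conn.executemany(
    "INSERT INTO gui_testing_test_details VALUES (?, ?, ?, ?, ?)",
    [
        ("org/app", "e2e/login.spec.ts", "PLAYWRIGHT", "JEST", 4),
        ("org/app", "e2e/cart.spec.ts", "PLAYWRIGHT", "JEST", 6),
        ("org/shop", "src/test/LoginIT.java", "SELENIUM", "JUNIT", 3),
    ],
)

# Sum the number of tests per browser automation framework.
tests_per_framework = dict(
    conn.execute(
        "SELECT framework, SUM(num_tests) FROM gui_testing_test_details "
        "GROUP BY framework"
    )
)
print(tests_per_framework)
conn.close()
```

Against the real file, the same query shape would reproduce repository-level summaries like those stored in gui_testing_repo_details, once the actual column names are substituted.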

To cite this article, use the following reference:

@inproceedings{di2025e2egit,
title={E2EGit: A Dataset of End-to-End Web Tests in Open Source Projects},
author={Di Meglio, Sergio and Starace, Luigi Libero Lucio and Pontillo, Valeria and Opdebeeck, Ruben and De Roover, Coen and Di Martino, Sergio},
booktitle={2025 IEEE/ACM 22nd International Conference on Mining Software Repositories (MSR)},
pages={10--15},
year={2025},
organization={IEEE/ACM}
}

This work has been partially supported by the Italian PNRR MUR project PE0000013-FAIR.
SSA
catalog.data.gov
data.cityofchicago.org
+2 more
Updated Jun 8, 2024
+ more versions
Cite
data.cityofchicago.org (2024). SSA [Dataset]. https://catalog.data.gov/dataset/ssa-e2981
Explore at:
Dataset updated
Jun 8, 2024
Dataset provided by
data.cityofchicago.org
Description
Special Service Areas (SSA) boundaries in Chicago. The Special Service Area program is a mechanism used to fund expanded services and programs through a localized property tax levy within contiguous industrial, commercial and residential areas. The enhanced services and programs are in addition to services and programs currently provided through the city. SSA-funded projects could include, but are not limited to, security services, area marketing and advertising assistance, promotional activities such as parades and festivals, or any variety of small scale capital improvements that could be supported through a modest property tax levy. The data can be viewed on the Chicago Data Portal with a web browser. However, to view or use the files outside of a web browser, you will need to use compression software and special GIS software, such as ESRI ArcGIS (shapefile) or Google Earth (KML or KMZ).
Performance counter for biometrics authentication
figshare.com
txt
Updated Oct 30, 2023
Cite
Cesar Andrade; Eduardo Souto; Hendrio Bragança (2023). Performance counter for biometrics authentication [Dataset]. http://doi.org/10.6084/m9.figshare.24461230.v3
Explore at:
Available download formats: txt
Unique identifier
https://doi.org/10.6084/m9.figshare.24461230.v3
Dataset updated
Oct 30, 2023
Dataset provided by
Figshare (http://figshare.com/)
Authors
Cesar Andrade; Eduardo Souto; Hendrio Bragança
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
In the quest for advancing the field of continuous user authentication, we have meticulously crafted two comprehensive datasets: COUNT-OS-I and COUNT-OS-II, each harboring unique characteristics while sharing a common ground in their utility and design principles. These datasets encompass performance counters extracted from the Windows operating system, offering an intricate tapestry of data vital for evaluating and refining authentication models in real-world scenarios.

Both datasets have been generated in real-world settings within public organizations in Brazil, ensuring their applicability and relevance to practical scenarios. Volunteers from diverse professional backgrounds participated in the data collection, contributing to the richness and variability of the data. Furthermore, both datasets were sampled every 5 seconds, providing a dense and detailed view of user interactions and system performance. The commitment to preserving user confidentiality is unwavering across both datasets, with pseudonymization applied meticulously to safeguard individual identities while maintaining data integrity and statistical robustness.

The COUNT-OS-I dataset was specifically generated in a real-world scenario to evaluate our work on continuous user authentication. This dataset consists of performance counters extracted from the Windows operating system of 26 computers, representing 26 individual users. The data were collected on the computers of the Information Technology Department of a public organization in Brazil.

The participants in this study were volunteers, aged between 20 and 45 years old, consisting of both males and females. The majority of the participants were systems analysts and software developers who performed their routine work activities. There were no specific restrictions imposed on the tasks that the participants were required to perform during the data collection process.

The participants used a variety of software applications as part of their regular work activities. This included web browsers such as Firefox, Chrome, and Edge, developer tools like Eclipse and SQL Developer, office programs such as Microsoft Office Word, Excel, and PowerPoint, as well as chat applications like WhatsApp. It's important to note that the list of applications mentioned is not exhaustive, and participants were not limited to using only these applications.

For the COUNT-OS-I dataset, the data collected is based on computers with different characteristics and configurations in terms of hardware, operating system versions, and installed software. This diversity ensures a representative sample of real-world scenarios and allows for a comprehensive evaluation of the authentication model.

During the data collection process, each sample was recorded every 5 seconds, capturing system data over a period of approximately 26 hours, on average, for each user. This duration provides sufficient data to analyze user behavior and system performance over an extended period. Each sample in the COUNT-OS-I dataset corresponds to a feature vector comprising 159 attributes.

The COUNT-OS-II dataset was utilized to evaluate our work in a real-world setting. This dataset comprises performance counters extracted from the Windows operating system installed on 37 computers. These computers possess identical hardware configurations (CPU, memory, network, disk), operating systems, and software installations. The data collection was conducted within various departments of a public organization in Brazil.

The participants in this study (37 users) were volunteer administrative assistants who performed various administrative tasks as part of their routine work activities. No restrictions were imposed on the specific tasks they were assigned. The participants commonly utilized programs such as the Chrome browser and office applications like Office Word, Excel, and PowerPoint, in addition to the WhatsApp chat application.

The data were collected over six days (approximately 48 hours), with samples collected at a 5-second interval. Each sample corresponds to a feature vector composed of 218 attributes. In this dataset, we also apply pseudonymization to hide users' sensitive information.
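The sampling figures above imply a rough number of samples per collection window. The short calculation below is only a back-of-the-envelope sketch: it assumes uninterrupted sampling for the stated durations, so the results are approximate upper estimates rather than the actual row counts in the files.

```python
SAMPLE_INTERVAL_S = 5  # both datasets are sampled every 5 seconds


def expected_samples(hours, interval_s=SAMPLE_INTERVAL_S):
    """Number of samples produced by uninterrupted collection for `hours`."""
    return int(hours * 3600 // interval_s)


count_os_1 = expected_samples(26)  # about 26 hours per user on average
count_os_2 = expected_samples(48)  # about 48 hours of collection
print(count_os_1, count_os_2)
```

Under these assumptions, a COUNT-OS-I user contributes on the order of 18,720 feature vectors of 159 attributes, and a 48-hour COUNT-OS-II window corresponds to roughly 34,560 vectors of 218 attributes.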
Impervious Cover 2023
catalog.data.gov
datahub.austintexas.gov
+2 more
Updated May 25, 2025
+ more versions
Cite
data.austintexas.gov (2025). Impervious Cover 2023 [Dataset]. https://catalog.data.gov/dataset/impervious-cover-2023
Explore at:
Dataset updated
May 25, 2025
Dataset provided by
data.austintexas.gov
Description
This dataset entails the delineation of impervious surfaces and artificial land cover types extracted from aerial imagery captured in 2023.

Utilization within the City of Austin

The dataset plays a pivotal role in several municipal functions, encompassing the computation of the Drainage Charge managed by the Watershed Protection Department, wildfire assessments, emergency operations planning, transportation asset monitoring, urban forest management, and more.

Data Updates

New aerial imagery and impervious cover data are acquired by the city every two years, resulting in distinct datasets for each capture. As of its initial capture in early 2023, there have been no subsequent updates to this dataset.

Downloading Instructions

Some users have reported issues downloading the data. Due to the large size of the dataset, downloading can take longer than expected. We recommend following these instructions to download the data:
- Click on 'Export'.
- Choose the desired export format from the dropdown menu.
- Initiate the download by clicking 'Download'.

It's essential to allow your browser ample time to process the download. Even if no immediate action appears to occur, please refrain from closing the browser window. Depending on your internet speed, the download process can take between 10 and 30 minutes due to the substantial size of the data. Once the download commences in your browser, allow it to finish entirely. Should you encounter any issues during the download process, kindly contact the dataset owner for assistance.
Persons participating in cultural or sport activities in the last 12 months...
data.europa.eu
service.tib.eu
+1 more
csv, html, tsv, xml
Updated May 30, 2025
+ more versions
Cite
Eurostat (2025). Persons participating in cultural or sport activities in the last 12 months by sex, age, educational attainment, activity type and frequency [Dataset]. https://data.europa.eu/data/datasets/sobu0hvbbun7o8l3vxrxfq?locale=en
Explore at:
Available download formats: csv (2103876), tsv (1438769), xml (1906860), xml (10665), html
Dataset updated
May 30, 2025
Dataset authored and provided by
Eurostat (https://ec.europa.eu/eurostat)
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Persons participating in cultural or sport activities in the last 12 months by sex, age, educational attainment, activity type and frequency
Example dataset input for IgIDivA
data.niaid.nih.gov
zenodo.org
Updated Jun 6, 2022
Cite
Zaragoza-Infante Laura (2022). Example dataset input for IgIDivA [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_6616045
Explore at:
Dataset updated
Jun 6, 2022
Dataset authored and provided by
Zaragoza-Infante Laura
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Example dataset input for the Immunoglobulin Intraclonal Diversification Analysis (IgIDivA) tool. (Publication of IgIDivA under revision)

The data was retrieved from ENA (https://www.ebi.ac.uk/ena/browser/view/PRJEB36589?show=reads) under the accession number PRJEB36589, and subsequently processed with IMGT/HighV-QUEST (https://www.imgt.org/HighV-QUEST/home.action) and tripr (https://bioconductor.org/packages/release/bioc/html/tripr.html).
Flash Eurobarometer 443 (e-Privacy) - Dataset - B2FIND
b2find.eudat.eu
Updated Apr 4, 2017
+ more versions
Cite
(2017). Flash Eurobarometer 443 (e-Privacy) - Dataset - B2FIND [Dataset]. https://b2find.eudat.eu/dataset/f7f6f0b7-8505-5aba-b25d-6ea802bd8ff9
Explore at:
Dataset updated
Apr 4, 2017
Description
Topics: measures taken to guarantee online privacy: use of software that blocks online adverts (anti-adware), use of software that prevents online activities from being monitored (anti-spyware), avoidance of certain websites to avoid being monitored, change of privacy settings in the internet browser; importance of selected aspects regarding online privacy: access to personal information only with permission, use of monitoring tools only with permission, guaranteed confidentiality of e-mails and online instant messaging; knowledge test on laws on online privacy: personal information on a computer, smartphone, or tablet may only be accessed with the user's permission, prohibition on storing external information (e.g. cookies) on a personal computer, smartphone, or tablet, confidentiality of instant messaging and online voice conversation; approval of the following statements: providers should give regular software updates to protect personal information, default browser settings should stop information from being shared, reception of too many unsolicited commercial calls, encryption of messages and calls should be possible for the user; acceptance of selected measures with regard to monitoring online activities: being monitored in exchange for unrestricted access to a certain website, sharing of personal user information between companies to provide users with new services, paying not to be monitored; desired point in time when visiting a website to be asked for permission to access or to store user information; preferred approach concerning commercial calls: general allowance, allowance only under the condition of displaying the phone number, phone numbers should have a special prefix.
Demography: frequency of using the following means of communication for selected purposes: fixed phone line, mobile phone to make calls or send text messages, internet to make phone or video calls, internet for instant messaging, e-mail, online social networks, internet to browse online; age; sex; nationality; age at end of education; occupation; professional position; region; type of community; ownership of a mobile phone and fixed (landline) phone; household composition and household size. Additionally coded: respondent ID; country; type of phone line (mobile or landline); nation group; weighting factor.
WebRTC-QoE: A Dataset of Quality of Experience in Audio-Video Communications...
zenodo.org
Updated Jul 8, 2025
+ more versions
Cite
Gulnaziye Bingol; Luigi Serreli; Simone Porcu; Alessandro Floris; Luigi Atzori (2025). WebRTC-QoE: A Dataset of Quality of Experience in Audio-Video Communications [Dataset]. http://doi.org/10.21227/mb47-hf44
Explore at:
Unique identifier
https://doi.org/10.21227/mb47-hf44
Dataset updated
Jul 8, 2025
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Gulnaziye Bingol; Luigi Serreli; Simone Porcu; Alessandro Floris; Luigi Atzori
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Time period covered
2021
Description
In the realm of real-time communications, WebRTC-based multimedia applications are increasingly prevalent because they can be smoothly integrated within Web browsing sessions. The browsing experience is then significantly improved compared with scenarios where browser add-ons and/or plug-ins are used; still, the end user's Quality of Experience (QoE) in WebRTC sessions may be affected by network impairments, such as delays and losses. Due to the variability in user perceptions under different communication scenarios, comprehending and enhancing the resulting service quality is a complex endeavour. To address this, we present a dataset that provides a comprehensive perspective on the conversational quality of a two-party WebRTC-based audiovisual telemeeting service. This dataset was gathered through subjective evaluations involving 20 subjects across 15 different test conditions (TCs). A specialized system was developed to induce controlled network disruptions such as delay, jitter, and packet loss, which adversely affected the communication between the parties. This methodology offered insight into user perceptions under various network impairments. The dataset encompasses a blend of objective and subjective data, including ACR (Absolute Category Rating) subjective scores, webrtc-internals parameters, facial expression features, and speech features. Consequently, it serves as a substantial contribution to the improvement of WebRTC-based video call systems, offering practical and real-world data that can drive the development of more robust and efficient multimedia communication systems, thereby enhancing the user’s experience.

In the following, we discuss the details of the provided datasets.

Subjective_results_dataset.csv: This dataset encompasses subjective evaluation results from 20 subjects (users) who assessed the quality of WebRTC-based video calls under 15 distinct test conditions (TCs), which included combinations of 3 network impairments (delay, jitter, packet loss) to disturb the communication. The single discrete Absolute Category Rating (ACR) scale with five category labels (1-Bad, 2-Poor, 3-Fair, 4-Good, and 5-Excellent) was used by the users to rate the perceived QoE. A total of 300 ACR scores were obtained (20 participants x 15 TCs). The size of this dataset is 4.00 KB.

The significance of each column is explained as follows:

Test Condition (TC): It enumerates the TC numbers, which span from 1 to 15.

Delay [ms]: It refers to the time it takes for a signal to travel from one point to another and is represented at three levels: 0 ms (no delay), 500 ms (moderate delay), and 1000 ms (significant delay).

Jitter [ms]: It refers to the variability in delay and is represented at two levels: 0 ms (no jitter) and 500 ms (moderate jitter).

Packet Loss Rate [%]: It refers to the loss of data packets during transmission and is represented at three levels: 0% (no packet loss), 15% (moderate packet loss), and 30% (significant packet loss).

User: The users, identified from 1 to 20, participated in 15 video calls and evaluated the quality of these calls on a scale from 1 to 5.

Webrtc_internals_dataset.zip: This dataset contains text files collected using the webrtc-internals tool during the video calls. The zip is organized into 20 distinct folders, each labeled from ‘User1’ to ‘User20’. Each user folder contains 15 text files (.txt), each named following the pattern ‘webrtc_internals_dump-TCx_y-z-t.txt’. In this naming convention, ‘x’ denotes the TC number, ranging from 1 to 15. The ‘y-z-t’ segment varies with each TC, where ‘y’ signifies the delay value (0, 500, or 1000), ‘z’ indicates the jitter value (0 or 500), and ‘t’ represents the packet loss rate (0, 15, or 30). Each text file includes application-level data concerning WebRTC sessions' statistics in a JSON format. A total of 300 webrtc-internals dump text files were obtained (20 participants x 15 TCs). The size of this zip dataset is 28.7 MB (396 MB uncompressed).
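The naming convention above can be decoded mechanically. The following is a minimal sketch, not part of the dataset's own tooling: the `TestCondition` type and the `parse_dump_name` function are hypothetical names, and the example file name is constructed from the pattern rather than taken from the archive.

```python
import re
from typing import NamedTuple


class TestCondition(NamedTuple):
    """Impairment settings encoded in a dump file name (hypothetical helper type)."""
    tc: int         # test condition number, 1 to 15
    delay_ms: int   # delay: 0, 500, or 1000
    jitter_ms: int  # jitter: 0 or 500
    loss_pct: int   # packet loss rate: 0, 15, or 30


# Pattern from the description: webrtc_internals_dump-TCx_y-z-t.txt
_DUMP_NAME = re.compile(r"^webrtc_internals_dump-TC(\d+)_(\d+)-(\d+)-(\d+)\.txt$")


def parse_dump_name(name: str) -> TestCondition:
    """Split a dump file name into TC number, delay, jitter, and packet loss."""
    match = _DUMP_NAME.match(name)
    if match is None:
        raise ValueError(f"not a webrtc-internals dump name: {name!r}")
    return TestCondition(*(int(group) for group in match.groups()))


# Illustrative file name only; the actual TC-to-impairment mapping is in the dataset.
example = parse_dump_name("webrtc_internals_dump-TC3_500-0-15.txt")
print(example)
```

The facial expression files follow the same 'TCx_y-z-t' convention without the prefix and with a .csv extension, so the same approach would apply with an adjusted pattern.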

Facial_expression_features_dataset.zip: This dataset contains facial expression features extracted from the recorded videos with face images using the OpenFace toolkit. The zip is organized into 20 distinct folders, each labelled from ‘User1’ to ‘User20’. Each user folder contains 15 .csv files, which are named following the pattern ‘TCx_y-z-t.csv’. In this naming convention, ‘x’ denotes the TC, ranging from 1 to 15. The ‘y-z-t’ segment varies with each TC, where ‘y’ signifies the delay value (0, 500, or 1000), ‘z’ indicates the jitter value (0 or 500), and ‘t’ represents the packet loss rate (0, 15, or 30). A total of 300 facial expression feature files in .csv format were obtained (20 participants x 15 TCs). For each face image of each TC, OpenFace outputs 6 gaze direction features, 280 eye region landmarks and 35 Action Units (AUs). The size of this zip dataset is 547 MB (2.14 GB uncompressed).

Each column carries a specific significance, which is elaborated as follows:

frame: the frame number in the context of sequences.

face_id: the identifier assigned to each face when multiple faces are present.

timestamp: the elapsed time in seconds during the processing of a video sequence.

confidence: the level of confidence the tracker has in the current landmark detection estimate.

success: indicates whether a face was detected in the frame and tracked successfully.

gaze_0_x, gaze_0_y, gaze_0_z: the normalized eye gaze direction vector in world coordinates for eye 0, which is the eye on the left in the image.

gaze_1_x, gaze_1_y, gaze_1_z: the normalized eye gaze direction vector in world coordinates for eye 1, which is the eye on the right in the image.

gaze_angle_x, gaze_angle_y: the eye gaze direction in radians, in world coordinates and averaged over both eyes; this format is easier to use than the gaze vectors.

eye_lmk_x_0, eye_lmk_x_1, ..., eye_lmk_x_55, eye_lmk_y_0, ..., eye_lmk_y_55: the pixel coordinates of 2D landmarks in the eye region.

eye_lmk_X_0, eye_lmk_X_1, ..., eye_lmk_X_55, eye_lmk_Y_0, ..., eye_lmk_Z_55: the position of landmarks in the eye region in 3D space, measured in millimeters.

17 AUr: detect the activation intensity (from 1 to 5) of a particular facial muscle. These are: AU01_r, AU02_r, AU04_r, AU05_r, AU06_r, AU07_r, AU09_r, AU10_r, AU12_r, AU14_r, AU15_r, AU17_r, AU20_r, AU23_r, AU25_r, AU26_r, AU45_r.

18 AUc: Identify the activation of a particular muscle and note its presence (0 for absent, 1 for present). These are: AU01_c, AU02_c, AU04_c, AU05_c, AU06_c, AU07_c, AU09_c, AU10_c, AU12_c, AU14_c, AU15_c, AU17_c, AU20_c, AU23_c, AU25_c, AU26_c, AU28_c, AU45_c.
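The feature counts quoted for OpenFace (280 eye region landmark values and 35 Action Units) can be cross-checked by generating the column names listed above. This is a sketch under one assumption: landmark indices run from 0 to 55 on every axis, which is what the 280 total implies (56 x 2 pixel coordinates plus 56 x 3 spatial coordinates). The function names are hypothetical.

```python
# AU numbers that have an intensity column (_r), as listed in the description.
AU_INTENSITY_IDS = ["01", "02", "04", "05", "06", "07", "09", "10", "12",
                    "14", "15", "17", "20", "23", "25", "26", "45"]
# Presence columns (_c) cover the same AUs plus AU28.
AU_PRESENCE_IDS = AU_INTENSITY_IDS[:-1] + ["28", "45"]


def eye_landmark_columns(n_points: int = 56) -> list[str]:
    """Column names for 2D (x, y) and 3D (X, Y, Z) eye region landmarks.

    Assumes indices 0..n_points-1 on every axis (an assumption, see lead-in).
    """
    cols_2d = [f"eye_lmk_{axis}_{i}" for axis in "xy" for i in range(n_points)]
    cols_3d = [f"eye_lmk_{axis}_{i}" for axis in "XYZ" for i in range(n_points)]
    return cols_2d + cols_3d


def action_unit_columns() -> list[str]:
    """Column names for AU intensity (_r) and AU presence (_c)."""
    intensity = [f"AU{au}_r" for au in AU_INTENSITY_IDS]
    presence = [f"AU{au}_c" for au in AU_PRESENCE_IDS]
    return intensity + presence


print(len(eye_landmark_columns()))  # 280
print(len(action_unit_columns()))   # 35
```

Such a generated list could be used to select feature columns after loading one of the per-TC .csv files, provided the header names in the files match this spelling exactly.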

Speech_features_dataset.csv: This dataset is a robust assembly of speech features extracted from the recorded audio files using the OpenSMILE toolkit. The dataset is further enhanced with the integration of Absolute Category Rating (ACR) scores, which were assigned by each subject for every TC. These scores are embedded within the speech features of each subject. The dataset is exhaustive, encompassing a total of 1,911,900 speech features (calculated as 15 TCs x 20 subjects x 6373 speech features per subject). The total size of this dataset is 18.9 MB.

Each column carries a specific significance, which is elaborated as follows:

file: It presents the audio files from which the speech features were extracted.

Users: The users, identified from 1 to 20, participated in 15 video calls and evaluated the quality of these calls on a scale from 1 to 5.

Test Condition (TC): It enumerates the Test Condition numbers, which span from 1 to 15.

Delay [ms]: It refers to the time it takes for a signal to travel from one point to another and is represented at three levels: 0 ms (no delay), 500 ms (moderate delay), and 1000 ms (significant delay).

Jitter [ms]: It refers to the variability in delay and is represented at two levels: 0 ms (no jitter) and 500 ms (moderate jitter).

Packet Loss Rate [%]: It refers to the loss of data packets during transmission and is represented at three levels: 0% (no packet loss), 15% (moderate packet loss), and 30% (significant packet loss).

OpenSmile Speech Features Columns: This presents a detailed view of the speech features. Specifically, from column G to column IKI, it enumerates 6373 distinct speech features for each user under each TC of each processed audio file in .wav.
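The spreadsheet column range and the totals quoted for this file are mutually consistent, which can be verified with a few lines of arithmetic. The sketch below converts spreadsheet column letters to 1-based indices; `column_index` is a hypothetical helper name.

```python
def column_index(letters: str) -> int:
    """Convert a spreadsheet column label (A, B, ..., Z, AA, ...) to a 1-based index."""
    index = 0
    for ch in letters.upper():
        if not "A" <= ch <= "Z":
            raise ValueError(f"invalid column label: {letters!r}")
        # Bijective base-26: each letter contributes a digit from 1 to 26.
        index = index * 26 + (ord(ch) - ord("A") + 1)
    return index


first = column_index("G")             # 7
last = column_index("IKI")            # 6379
features_per_row = last - first + 1   # 6373 columns from G to IKI inclusive
total_features = 15 * 20 * features_per_row  # 15 TCs x 20 subjects

print(features_per_row, total_features)  # 6373 1911900
```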

ACR Score: The Absolute Category Rating (ACR) scale ranges from 1, representing the lowest quality, to 5, indicating the highest quality evaluated by the subjects.

If you make use of this dataset, please consider citing the following publication:

Bingol, G., Porcu, S., Floris, A., & Atzori, L. (2024). WebRTC-QoE: A dataset of QoE assessment of subjective scores, network impairments, and facial & speech features. Computer Networks, 244, 110356. doi: 10.1016/j.comnet.2024.110356.

BibTeX format:

@article{bingol2024datasetwebrtc,
  title={WebRTC-QoE: A dataset of QoE assessment of subjective scores, network impairments, and facial & speech features},
  author={Bingol, Gulnaziye and Porcu, Simone and Floris, Alessandro and Atzori, Luigi},
  journal={Computer Networks},
  volume={244},
  pages={110356},
  year={2024},
  publisher={Elsevier},
  doi = {https://doi.org/10.1016/j.comnet.2024.110356}
}
webguard_test
huggingface.co
Updated Jul 27, 2025
+ more versions
Cite
Boyuan Zheng (2025). webguard_test [Dataset]. https://huggingface.co/datasets/boyuanzheng010/webguard_test
Explore at:
Dataset updated
Jul 27, 2025
Authors
Boyuan Zheng
Description
WebGuard Annotation Dataset

WebGuard Dataset

This dataset contains web safety annotations for browser interactions. Each entry represents an annotated action on a website with a risk level. Fields:

url: The URL where the action was performed
description: Description of the action (may be null)
tagHead: HTML tag type of the target element
Screenshot: Google Drive link to screenshot view
Annotation: Review classification (SAFE/UNSAFE/LOW/HIGH/BUG/Bug)
website: Website name/category…

See the full description on the dataset page: https://huggingface.co/datasets/boyuanzheng010/webguard_test.
Production in industry - total (excluding construction)
db.nomics.world
service.tib.eu
+3more
Updated Jul 26, 2025
+ more versions
Cite
DBnomics (2025). Production in industry - total (excluding construction) [Dataset]. https://db.nomics.world/Eurostat/teiis080
Explore at:
Dataset updated
Jul 26, 2025
Dataset provided by
Eurostat (https://ec.europa.eu/eurostat)
Authors
DBnomics
Description
The industrial production index shows the output and activity of the industry sector. It measures changes in the volume of output on a monthly basis. Data are compiled according to the Statistical classification of economic activities in the European Community, (NACE Rev. 2, Eurostat). Industrial production is compiled as a "fixed base year Laspeyres type volume-index". The current base year is 2021 (Index 2021 = 100). The index is presented in calendar and seasonally adjusted form. Growth rates with respect to the previous month (M/M-1) are calculated from calendar and seasonally adjusted figures while growth rates with respect to the same month of the previous year (M/M-12) are calculated from calendar adjusted figures.
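The two growth rates mentioned above are simple ratios of index values. The sketch below shows the arithmetic only; the `growth_rate` function and the sample index values are hypothetical and are not Eurostat figures. Per the description, the month-on-month rate would be computed from the calendar and seasonally adjusted series, and the year-on-year rate from the calendar adjusted series.

```python
from typing import Optional


def growth_rate(index: list[float], lag: int) -> list[Optional[float]]:
    """Percentage change of each index value versus the value `lag` periods earlier.

    Returns None for the first `lag` positions, where no earlier value exists.
    """
    rates: list[Optional[float]] = []
    for i, value in enumerate(index):
        if i < lag:
            rates.append(None)
        else:
            rates.append((value / index[i - lag] - 1.0) * 100.0)
    return rates


# Hypothetical monthly index values (base year = 100), 13 consecutive months.
sample_index = [100.0, 100.5, 101.0, 100.2, 99.8, 100.9, 101.5,
                102.0, 101.2, 100.7, 101.9, 102.4, 103.0]

month_on_month = growth_rate(sample_index, lag=1)   # M/M-1
year_on_year = growth_rate(sample_index, lag=12)    # M/M-12

print(month_on_month[1])   # change from month 1 to month 2, in percent
print(year_on_year[12])    # change versus the same month one year earlier
```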
UK House Price Index: data downloads June 2023
gov.uk
Updated Aug 16, 2023
+ more versions
Cite
HM Land Registry (2023). UK House Price Index: data downloads June 2023 [Dataset]. https://www.gov.uk/government/statistical-data-sets/uk-house-price-index-data-downloads-june-2023
Explore at:
Dataset updated
Aug 16, 2023
Dataset provided by
GOV.UK
Authors
HM Land Registry
Area covered
United Kingdom
Description
The UK House Price Index is a National Statistic.

Create your report

Download the full UK House Price Index data below, or use our tool to create your own bespoke reports (https://landregistry.data.gov.uk/app/ukhpi?utm_medium=GOV.UK&utm_source=datadownload&utm_campaign=tool&utm_term=9.30_16_08_23).

Download the data

Datasets are available as CSV files. Find out about republishing and making use of the data.

Google Chrome is blocking downloads of our UK HPI data files (Chrome 88 onwards). Please use another internet browser while we resolve this issue. We apologise for any inconvenience caused.

Full file

This file includes a derived back series for the new UK HPI. Under the UK HPI, data is available from 1995 for England and Wales, 2004 for Scotland and 2005 for Northern Ireland. A longer back series has been derived by using the historic path of the Office for National Statistics HPI to construct a series back to 1968.

Download the full UK HPI background file:

UK HPI full file (CSV, 59.4MB): http://publicdata.landregistry.gov.uk/market-trend-data/house-price-index-data/UK-HPI-full-file-2023-06.csv?utm_medium=GOV.UK&utm_source=datadownload&utm_campaign=full_fil&utm_term=9.30_16_08_23

Individual attributes files

If you are interested in a specific attribute, we have separated them into these CSV files:

Average price (CSV, 9.4MB): http://publicdata.landregistry.gov.uk/market-trend-data/house-price-index-data/Average-prices-2023-06.csv?utm_medium=GOV.UK&utm_source=datadownload&utm_campaign=average_price&utm_term=9.30_16_08_23

Average price by property type (CSV, 28MB): http://publicdata.landregistry.gov.uk/market-trend-data/house-price-index-data/Average-prices-Property-Type-2023-06.csv?utm_medium=GOV.UK&utm_source=datadownload&utm_campaign=average_price_property_price&utm_term=9.30_16_08_23

Sales (CSV, 4.9MB): http://publicdata.landregistry.gov.uk/market-trend-data/house-price-index-data/Sales-2023-06.csv?utm_medium=GOV.UK&utm_source=datadownload&utm_campaign=sales&utm_term=9.30_16_08_23

Cash mortgage sales (CSV, 7MB): http://publicdata.landregistry.gov.uk/market-trend-data/house-price-index-data/Cash-mortgage-sales-2023-06.csv?utm_medium=GOV.UK&utm_source=datadownload&utm_campaign=cash_mortgage-sales&utm_term=9.30_16_08_23

First time buyer and former owner occupier (CSV, 6.5MB): http://publicdata.landregistry.gov.uk/market-trend-data/house-price-index-data/First-Time-Buyer-Former-Owner-Occupied-2023-06.csv?utm_medium=GOV.UK&utm_source=datadownload&utm_campaign=FTNFOO&utm_term=9.30_16_08_23

New build and existing resold property (CSV, 17.1MB): http://publicdata.landregistry.gov.uk/market-trend-data/house-price-index-data/New-and-Old-2023-06.csv?utm_medium=GOV.UK&utm_source=datadownload&utm_campaign=new_build&utm_term=9.30_16_08_23

Index (CSV, 6MB): http://publicdata.landregistry.gov.uk/market-trend-data/house-price-index-data/Indices-2023-06.csv?utm_medium=GOV.UK&utm_source=datadownload&utm_campaign=index&utm_term=9.30_16_08_23

Index seasonally adjusted (CSV, 207KB): http://publicdata.landregistry.gov.uk/market-trend-data/house-price-index-data/Indices-seasonally-adjusted-2023-06.csv?utm_medium=GOV.UK&utm_source=datadownload&utm_campaign=index_season_adjusted&utm_term=9.30_16_08_23

Average price seasonally adjusted: http://publicdata.landregistry.gov.uk/market-trend-data/house-price-index-data/Average-price-seasonally-adjusted-2023-06.csv?utm_medium=GOV.UK&utm_source=datadownload&utm_campaign=average-price_season_adjusted&utm_term=9.30_16_08_23
SSA
catalog.data.gov
gimi9.com
Updated Jun 8, 2024
+ more versions
Cite
data.cityofchicago.org (2024). SSA [Dataset]. https://catalog.data.gov/dataset/ssa-655c3
Explore at:
Dataset updated
Jun 8, 2024
Dataset provided by
data.cityofchicago.org
Description
OUTDATED. See the current data at https://data.cityofchicago.org/d/kjav-iyuj - Special Service Areas (SSA) boundaries in Chicago. The Special Service Area program is a mechanism used to fund expanded services and programs through a localized property tax levy within contiguous industrial, commercial and residential areas. The enhanced services and programs are in addition to services and programs currently provided through the city. SSA-funded projects could include, but are not limited to, security services, area marketing and advertising assistance, promotional activities such as parades and festivals, or any variety of small scale capital improvements that could be supported through a modest property tax levy. The data can be viewed on the Chicago Data Portal with a web browser. However, to view or use the files outside of a web browser, you will need to use compression software and special GIS software, such as ESRI ArcGIS (shapefile) or Google Earth (KML or KMZ).
Vitis Organ ontology - Dataset - B2FIND
b2find.eudat.eu
Updated Feb 13, 2024
+ more versions
Cite
(2024). Vitis Organ ontology - Dataset - B2FIND [Dataset]. https://b2find.eudat.eu/dataset/1b1c0211-0806-53bb-a58c-154321705159
Explore at:
Dataset updated
Feb 13, 2024
Description
The file lists the terms that are recommended for describing the different parts of a grapevine plant. It also contains cross references to other plant ontology databases. This thesaurus was established in the framework of the European COST 858 "Integrape" action. The main objective was to establish a common vocabulary to make the data obtained in experiments on grapevines FAIR (findable, accessible, interoperable and reusable). Other ontologies used: http://browser.planteome.org/, https://www.ebi.ac.uk/ols/ontologies/po, https://www.ebi.ac.uk/ols/ontologies/bto
OpenSlopeMap - The slope slope map
data.europa.eu
jpeg
Cite
OpenSlopeMap - The slope slope map [Dataset]. https://data.europa.eu/88u/dataset/a2769cb0-a096-4743-bcd3-8493ba1cea63
Explore at:
Available download formats: jpeg
Description
OpenSlopeMap offers a slope map for Austria and South Tyrol that is intended to help make alpine activities a little safer. The map can be viewed online in the browser or downloaded for offline use (on Android, iOS, Linux, Windows, macOS).

BrowserART (ScaleAI/BrowserART)

26 scholarly articles cite this dataset.
