100+ datasets found

Open Data User Guide
californianature.ca.gov
data.ca.gov
+5 more
Updated Dec 28, 2021
+ more versions
Cite
CA Nature Organization (2021). Open Data User Guide [Dataset]. https://www.californianature.ca.gov/datasets/open-data-user-guide
Explore at:
Dataset updated
Dec 28, 2021
Dataset authored and provided by
CA Nature Organization
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This guide introduces the open data resources available on the CA Nature website and familiarizes you with the site's key features and capabilities. CA Nature is an online Geographic Information System (GIS) that brings together a suite of publicly accessible interactive digital mapping tools and data.
Data from: Internet users
ons.gov.uk
cy.ons.gov.uk
xlsx
Updated Apr 6, 2021
Cite
Office for National Statistics (2021). Internet users [Dataset]. https://www.ons.gov.uk/businessindustryandtrade/itandinternetindustry/datasets/internetusers
Explore at:
Available download formats: xlsx
Dataset updated
Apr 6, 2021
Dataset provided by
Office for National Statistics http://www.ons.gov.uk/
License
Open Government Licence 3.0 http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
Description
Internet use in the UK annual estimates by age, sex, disability, ethnic group, economic activity and geographical location, including confidence intervals.
MTA Open Data User Personas
res1catalogd-o-tdatad-o-tgov.vcapture.xyz
catalog.data.gov
Updated Sep 15, 2023
Cite
data.ny.gov (2023). MTA Open Data User Personas [Dataset]. https://res1catalogd-o-tdatad-o-tgov.vcapture.xyz/dataset/mta-open-data-user-personas
Explore at:
Dataset updated
Sep 15, 2023
Dataset provided by
data.ny.gov
Description
User personas are a human-centered design tool that helps open data program administrators design program offerings for the full community of open data users, for maximum reach and impact. User personas help keep real people in mind when designing program offerings and can identify user segments in the open data community that have the potential to use open data to help solve problems. The Metropolitan Transportation Authority (MTA) is excited to share our open data user personas, which were designed in collaboration with our existing open data community through multiple stakeholder workshops.
User trust in data use of mobile apps in China 2020, by app type
statista.com
Updated Jul 7, 2025
Cite
Statista (2025). User trust in data use of mobile apps in China 2020, by app type [Dataset]. https://www.statista.com/statistics/1111445/china-awareness-of-overused-user-permissions-required-by-mobile-apps-by-type/
Explore at:
Dataset updated
Jul 7, 2025
Dataset authored and provided by
Statista http://statista.com/
Area covered
China
Description
As of early 2020, around ** percent of surveyed respondents in China were aware that many online shopping and e-commerce mobile apps overused user permissions. Social media and messenger apps ranked second among app categories with low user trust in data security.
User data collection in select mobile iOS streaming apps worldwide 2021, by...
statista.com
Updated Apr 6, 2022
Cite
Statista (2022). User data collection in select mobile iOS streaming apps worldwide 2021, by type [Dataset]. https://www.statista.com/statistics/1305377/data-points-collected-streaming-apps/
Explore at:
Dataset updated
Apr 6, 2022
Dataset authored and provided by
Statista http://statista.com/
Time period covered
Mar 2021
Area covered
Worldwide
Description
As of March 2021, YouTube was the video and streaming app found to collect the largest amount of data from global iOS users. The app collected a total of ** data points from each of the examined data types, respectively. The mobile app of video streaming service Amazon Prime Video followed, with ** data points collected across all the examined data types.
RICO dataset
kaggle.com
Updated Dec 2, 2021
Cite
Onur Gunes (2021). RICO dataset [Dataset]. https://www.kaggle.com/datasets/onurgunes1993/rico-dataset
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Dec 2, 2021
Dataset provided by
Kaggle http://kaggle.com/
Authors
Onur Gunes
Description
Context

Data-driven models help mobile app designers understand best practices and trends, and can be used to make predictions about design performance and support the creation of adaptive UIs. This paper presents Rico, the largest repository of mobile app designs to date, created to support five classes of data-driven applications: design search, UI layout generation, UI code generation, user interaction modeling, and user perception prediction. To create Rico, we built a system that combines crowdsourcing and automation to scalably mine design and interaction data from Android apps at runtime. The Rico dataset contains design data from more than 9.3k Android apps spanning 27 categories. It exposes visual, textual, structural, and interactive design properties of more than 66k unique UI screens. To demonstrate the kinds of applications that Rico enables, we present results from training an autoencoder for UI layout similarity, which supports query-by-example search over UIs.

Content

Rico was built by mining Android apps at runtime via human-powered and programmatic exploration. Like its predecessor ERICA, Rico’s app mining infrastructure requires no access to — or modification of — an app’s source code. Apps are downloaded from the Google Play Store and served to crowd workers through a web interface. When crowd workers use an app, the system records a user interaction trace that captures the UIs visited and the interactions performed on them. Then, an automated agent replays the trace to warm up a new copy of the app and continues the exploration programmatically, leveraging a content-agnostic similarity heuristic to efficiently discover new UI states. By combining crowdsourcing and automation, Rico can achieve higher coverage over an app’s UI states than either crawling strategy alone. In total, 13 workers recruited on UpWork spent 2,450 hours using apps on the platform over five months, producing 10,811 user interaction traces. After collecting a user trace for an app, we ran the automated crawler on the app for one hour.

Acknowledgements

UNIVERSITY OF ILLINOIS AT URBANA-CHAMPAIGN https://interactionmining.org/rico

Inspiration

The Rico dataset is large enough to support deep learning applications. We trained an autoencoder to learn an embedding for UI layouts, and used it to annotate each UI with a 64-dimensional vector representation encoding visual layout. This vector representation can be used to compute structurally — and often semantically — similar UIs, supporting example-based search over the dataset. To create training inputs for the autoencoder that embed layout information, we constructed a new image for each UI capturing the bounding box regions of all leaf elements in its view hierarchy, differentiating between text and non-text elements. Rico’s view hierarchies obviate the need for noisy image processing or OCR techniques to create these inputs.
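
A minimal sketch of the query-by-example idea described above, assuming each UI already carries a 64-dimensional layout vector. The UI identifiers, the random stand-in vectors, and the `nearest_uis` helper are hypothetical and not part of the Rico release; Euclidean distance is one plausible similarity measure, and the original work may use a different one.

```python
import math
import random

DIM = 64  # dimensionality of the layout embedding described above


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_uis(query, embeddings, k=3):
    """Return the k UI ids whose vectors are closest to the query vector."""
    ranked = sorted(embeddings, key=lambda ui_id: euclidean(query, embeddings[ui_id]))
    return ranked[:k]


# Hypothetical stand-in data: random vectors instead of real Rico embeddings.
rng = random.Random(0)
embeddings = {f"ui_{i}": [rng.random() for _ in range(DIM)] for i in range(100)}

# Querying with an existing UI's own vector ranks that UI first (distance 0).
results = nearest_uis(embeddings["ui_7"], embeddings, k=3)
print(results)
```

A linear scan like this is fine for a small sample; for tens of thousands of screens an approximate nearest-neighbor index would likely be a better fit.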
iOS apps that declared collecting global users private data 2025
statista.com
Updated May 20, 2025
Cite
Statista (2025). iOS apps that declared collecting global users private data 2025 [Dataset]. https://www.statista.com/statistics/1322669/ios-apps-declaring-collecting-data/
Explore at:
Dataset updated
May 20, 2025
Dataset authored and provided by
Statista http://statista.com/
Time period covered
Jan 2025
Area covered
Worldwide
Description
As of January 2025, around 13.7 percent of paid iOS apps admitted collecting data from users engaging with their mobile products. In comparison, approximately 53 percent of free-to-download iOS apps reported that they collect private data from users worldwide, while approximately 86 percent of paid apps have not declared whether they collect users' private data.
User Profiles Data | Nonprofit & NGO Leaders | Verified Global Profiles from...
data.success.ai
Updated Oct 27, 2021
Cite
Success.ai (2021). User Profiles Data | Nonprofit & NGO Leaders | Verified Global Profiles from 700M+ LinkedIn Dataset | Best Price Guarantee [Dataset]. https://data.success.ai/products/user-profiles-data-nonprofit-ngo-leaders-verified-globa-success-ai-dae8
Explore at:
Dataset updated
Oct 27, 2021
Dataset provided by
Area covered
Guatemala, Tajikistan, Australia, Netherlands, Montenegro, Réunion, Norway, Chad, Falkland Islands (Malvinas), Barbados
Description
Find User Profiles Data with LinkedIn profiles for nonprofit and NGO executives, managers, and administrators worldwide. Includes verified contact details, organizational affiliations, and professional histories. Best price guaranteed.
User Profiles Data | Nonprofit & NGO Leaders | Verified Global Profiles from...
datarade.ai
Updated Oct 27, 2021
Cite
Success.ai (2021). User Profiles Data | Nonprofit & NGO Leaders | Verified Global Profiles from 700M+ LinkedIn Dataset | Best Price Guarantee [Dataset]. https://datarade.ai/data-products/user-profiles-data-nonprofit-ngo-leaders-verified-globa-success-ai-dae8
Explore at:
Available download formats: .bin, .json, .xml, .csv, .xls, .sql, .txt
Dataset updated
Oct 27, 2021
Dataset provided by
Area covered
Cabo Verde, Jersey, Brunei Darussalam, Kosovo, Guernsey, Benin, Mayotte, Saint Kitts and Nevis, New Caledonia, Nauru
Description
Success.ai’s User Profiles Data for Nonprofit and NGO Leaders provides businesses, organizations, and researchers with comprehensive access to global leaders in the nonprofit and NGO sectors. With data sourced from over 700 million verified LinkedIn profiles, this dataset includes actionable insights and contact details for executives, program managers, administrators, and decision-makers. Whether your goal is to partner with nonprofits, support global causes, or conduct research into social impact, Success.ai ensures your outreach is backed by accurate, enriched, and continuously updated data.

Why Choose Success.ai’s User Profiles Data for Nonprofit and NGO Leaders?

Comprehensive Professional Profiles
Access verified LinkedIn profiles of nonprofit leaders, NGO managers, program directors, grant writers, and administrative executives. AI-driven validation ensures 99% accuracy for efficient communication and minimized bounce rates.

Global Coverage Across Nonprofit Sectors
Includes profiles from nonprofits, humanitarian organizations, environmental groups, social enterprises, and advocacy organizations. Covers key markets across North America, Europe, APAC, South America, and Africa for global reach.

Continuously Updated Dataset
Reflects real-time professional updates, organizational changes, and emerging trends in the nonprofit landscape to keep your targeting relevant and effective.

Tailored for Nonprofit Insights
Enriched profiles include work histories, organizational affiliations, areas of expertise, and social impact projects for deeper engagement opportunities.

Data Highlights:
700M+ Verified LinkedIn Profiles: Access a vast network of nonprofit and NGO professionals worldwide.
100M+ Work Emails: Direct communication with executives, managers, and decision-makers in the nonprofit sector.
Enriched Organizational Data: Gain insights into leadership structures, mission focuses, and operational scales.
Industry-Specific Segmentation: Target nonprofits focused on healthcare, education, environmental sustainability, human rights, and more.

Key Features of the Dataset:

Nonprofit and NGO Leader Profiles
Identify and connect with executives, program managers, fundraisers, and policy directors in global nonprofit and NGO sectors. Engage with individuals who drive decision-making and operational strategies for impactful organizations.

Detailed Organizational Insights
Leverage firmographic data, including organizational size, mission, regional activity, and funding sources, to align with specific nonprofit goals.

Advanced Filters for Precision Targeting
Refine searches by region, mission type, role, or organizational focus for tailored outreach. Customize campaigns based on social impact priorities, such as climate action, gender equality, or economic development.

AI-Driven Enrichment
Enhanced datasets provide actionable insights into professional accomplishments, partnerships, and leadership achievements for targeted engagement.

Strategic Use Cases:

Partnership Development and Outreach
Identify nonprofits and NGOs for collaboration on social impact projects, sponsorships, or grant distribution. Build relationships with decision-makers driving advocacy, fundraising, and community initiatives.

Donor Engagement and Fundraising
Target nonprofit leaders responsible for managing fundraising campaigns and donor relationships. Tailor outreach efforts to align with specific causes and funding priorities.

Research and Analysis
Analyze leadership trends, mission focuses, and regional nonprofit activities to inform program design and funding strategies. Use insights to evaluate the effectiveness of social impact initiatives and partnerships.

Recruitment and Talent Acquisition
Target HR professionals and administrators seeking qualified staff, consultants, or volunteers for nonprofits and NGOs. Offer talent solutions for specialized roles in program management, advocacy, and administration.

Why Choose Success.ai?

Best Price Guarantee
Access industry-leading, verified User Profiles Data at unmatched pricing to ensure your campaigns are cost-effective and impactful.

Seamless Integration
Easily integrate verified nonprofit data into your CRM or marketing platforms with APIs or downloadable formats.

AI-Validated Accuracy
Rely on 99% accuracy to minimize wasted outreach efforts and maximize engagement outcomes.

Customizable Solutions
Tailor datasets to focus on specific nonprofit types, geographical regions, or areas of social impact to meet your strategic objectives.

Strategic APIs for Enhanced Campaigns:

Data Enrichment API
Update your internal records with verified nonprofit leader profiles to enhance targeting and engagement.

Lead Generation API
Automate lead generation for a consistent pipeline of nonprofit and NGO professionals, scaling your outreach efforts efficiently.

Success.ai’s User Profiles Data for Nonprofit and NGO Leader...
Resources for All Users
catalog.data.gov
datasets.ai
Updated Mar 14, 2025
Cite
data.wa.gov (2025). Resources for All Users [Dataset]. https://catalog.data.gov/dataset/resources-for-all-users
Explore at:
Dataset updated
Mar 14, 2025
Dataset provided by
data.wa.gov
Description
This page pulls together resources for various types of data.wa.gov users, including developers, publishers and data users.
BESDUI: A Benchmark for End-User Structured Data User Interfaces
dataverse.csuc.cat
application/gzip +4
Updated Mar 27, 2023
+ more versions
Cite
Roberto Garcia; Rosa Gil; Juan Manuel Gimeno; Eirik Bakke; David R. Karger (2023). BESDUI: A Benchmark for End-User Structured Data User Interfaces [Dataset]. http://doi.org/10.34810/data20
Explore at:
Available download formats: xls(38912), application/gzip(8949760), text/markdown(2280), text/markdown(12709), text/markdown(813), text/plain; charset=us-ascii(20132), text/markdown(1460), txt(6419), txt(880325), text/markdown(2652), text/markdown(2253), text/markdown(2092), text/markdown(11900), text/markdown(7962), text/markdown(11826), text/markdown(2958), text/markdown(1866), text/markdown(15613), text/markdown(3252), text/markdown(2059), text/markdown(8412), text/markdown(8620), text/markdown(1862)
Unique identifier
https://doi.org/10.34810/data20
Dataset updated
Mar 27, 2023
Dataset provided by
CORA.Repositori de Dades de Recerca
Authors
Roberto Garcia; Rosa Gil; Juan Manuel Gimeno; Eirik Bakke; David R. Karger
License
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Dataset funded by
Spanish Government
Description
Benchmark for End-User Structured Data User Interfaces (BESDUI), based on the Berlin SPARQL Benchmark (BSBM) but intended for benchmarking the user experience while exploring a structured dataset, not the performance of the query engine. BSBM is used only to provide the data to be explored. This is a low-cost user interface benchmark because it involves experts rather than end users; the experts measure how many interaction steps are required to complete each of the benchmark tasks, if the task can be completed at all. This also facilitates comparing different tools without the bias that different end-user profiles might introduce. The way to measure these interaction steps and convert them to an estimate of the time required to complete a task is based on the Keystroke-Level Model (KLM).
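
As a rough illustration of how counted interaction steps can be turned into a time estimate with KLM, the sketch below sums per-operator durations. The operator times are commonly cited approximate KLM values and are an assumption here; BESDUI may use different operators or timings, and the example task is hypothetical.

```python
# Commonly cited approximate KLM operator times in seconds.
# These are assumptions for illustration, not values taken from BESDUI.
KLM_SECONDS = {
    "K": 0.2,   # press a key
    "P": 1.1,   # point with the mouse at a target
    "B": 0.1,   # press or release a mouse button
    "H": 0.4,   # move hands between keyboard and mouse
    "M": 1.35,  # mental preparation
}


def estimate_task_time(steps):
    """Sum operator times for a string of KLM operators such as 'MPBB'."""
    return sum(KLM_SECONDS[op] for op in steps)


# Hypothetical task: think, point at a facet, click (press + release),
# think again, then type a four-character filter value.
task = "MPBB" + "M" + "K" * 4
print(f"{estimate_task_time(task):.2f} s")
```

Encoding each task as an operator string keeps the expert's step count and the timing table separate, so the same recorded steps can be re-scored with a different set of operator times.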
OctoTools-Gradio-Demo-User-Data
huggingface.co
Updated Mar 23, 2025
Cite
Johnson Thomas (2025). OctoTools-Gradio-Demo-User-Data [Dataset]. https://huggingface.co/datasets/Johnyquest7/OctoTools-Gradio-Demo-User-Data
Explore at:
Dataset updated
Mar 23, 2025
Authors
Johnson Thomas
Description
Johnyquest7/OctoTools-Gradio-Demo-User-Data dataset hosted on Hugging Face and contributed by the HF Datasets community
Restricted-Use NCHS-VA Linked Data Files
catalog.data.gov
data.va.gov
+2 more
Updated Aug 2, 2025
Cite
Department of Veterans Affairs (2025). Restricted-Use NCHS-VA Linked Data Files [Dataset]. https://catalog.data.gov/dataset/restricted-use-nchs-va-linked-data-files
Explore at:
Dataset updated
Aug 2, 2025
Dataset provided by
United States Department of Veterans Affairs http://va.gov/
Description
National Center for Health Statistics (NCHS) population health survey data have been linked to VA administrative data containing information on military service history and VA benefit program utilization. The linked data can provide information on the health status and access to health care for VA program beneficiaries. In addition, researchers can compare the health of Veterans within and outside the VA health care system and compare Veterans to non-Veterans in the civilian non-institutionalized U.S. population. Due to confidentiality requirements, the Restricted-use NCHS-VA Linked Data Files are accessible only through the NCHS Research Data Center (RDC) Network. All interested researchers must submit a research proposal to the RDC. Please see the NCHS RDC website (https://www.cdc.gov/rdc/index.htm) for instructions on submitting a proposal.
User Profile for Ads Project in Tableau twbx
kaggle.com
Updated Jul 4, 2024
Cite
Sanjana Murthy (2024). User Profile for Ads Project in Tableau twbx [Dataset]. https://www.kaggle.com/datasets/sanjanamurthy392/user-profile-for-ads/data
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Jul 4, 2024
Dataset provided by
Kaggle http://kaggle.com/
Authors
Sanjana Murthy
License
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Description
About Dataset:

Domain: Marketing
Project: User Profiling and Segmentation
Datasets: user_profile_for_ads
Dataset Type: Excel Data
Dataset Size: 16k+ records

KPIs:
1. Distribution of Key Demographic Variables: a. Count of Age b. Count of Gender c. Count of Education Level d. Count of Income Level e. Count of Device Usage
2. Understanding Online Behavior: a. Count of Time Spent Online (hrs/Weekday) b. Count of Time Spent Online (hrs/Weekend)
3. Ad Interaction Metrics: a. Count of Likes and Reactions b. Count of Click-Through Rates (CTR) c. Count of Conversion Rate d. Count of Ad Interaction Time (secs) e. Count of Ad Interaction Time by Top Interests

Process: 1. Understanding the problem 2. Data Collection 3. Exploring and analyzing the data 4. Interpreting the results

The workbook contains the following chart types and dashboard elements: bar chart, horizontal bars, circle, treemap, area chart, square, line chart, dashboard, slicers, navigation button.
Random People and User Behavior Data - Dataset - Datopian CKAN instance
demo.dev.datopian.com
Updated Apr 1, 2025
Cite
(2025). Random People and User Behavior Data - Dataset - Datopian CKAN instance [Dataset]. https://demo.dev.datopian.com/dataset/morabeza-organization--random-people-and-user-behavior-data
Explore at:
Dataset updated
Apr 1, 2025
Description
This dataset comprises two resources. The first resource contains a list of random people with their date and place of birth. This can be used for demographics and hypothetical scenario testing. The second resource includes user behavior data on various device models, detailing app usage, screen time, and other metrics, which is beneficial for analyzing mobile usage patterns.
US B2B Marketing Data | 148MM B2B Marketing Contacts: Email, Phone + Social...
datarade.ai
.json, .csv, .xls
Updated Oct 16, 2023
+ more versions
Cite
Salutary Data (2023). US B2B Marketing Data | 148MM B2B Marketing Contacts: Email, Phone + Social Media Marketing Data [Dataset]. https://datarade.ai/data-products/salutary-data-direct-marketing-data-62m-us-b2b-contacts-salutary-data
Explore at:
Available download formats: .json, .csv, .xls
Dataset updated
Oct 16, 2023
Dataset authored and provided by
Salutary Data
Area covered
United States
Description
Salutary Data is a boutique B2B contact and company data provider that's committed to delivering high-quality data for sales intelligence, lead generation, marketing, recruiting / HR, identity resolution, and ML / AI. Our database currently consists of 148MM+ highly curated B2B contacts (US only), along with 4M+ companies, and is updated regularly to ensure we have the most up-to-date information.

We can enrich your in-house data (CRM Enrichment, Lead Enrichment, etc.) and provide you with a custom dataset (such as a lead list) tailored to your target audience specifications and data use case. We also support large-scale data licensing to software providers and agencies that intend to redistribute our data to their customers and end users.

What makes Salutary unique?
- We offer our clients a one-stop aggregation of best-of-breed data sources. Our supplier network consists of numerous established, high-quality suppliers that are rigorously vetted.
- We leverage third-party verification vendors to ensure phone numbers and emails are accurate and connect to the right person. Additionally, we deploy automated and manual verification techniques to ensure we have the latest job information for contacts.
- We're reasonably priced and easy to work with.

Products:
- API Suite
- Web UI
- Full and Custom Data Feeds

Services:
- Data Enrichment: We assess the fill rate gaps and profile your customer file for the purpose of appending fields, updating information, and/or rendering net new “look alike” prospects for your campaigns.
- ABM Match & Append: Send us your domain or other company-related files, and we’ll match your Account Based Marketing targets and provide you with B2B contacts to campaign. Optionally include your suppression file to avoid any redundant records.
- Verification (“Cleaning/Hygiene”) Services: Address the 2% per month aging issue on contact records! We will identify duplicate records and contacts no longer at the company, remove your email hard bounces, and update or replace titles and phone numbers. This work leverages our existing internal and external processes and systems.
Data from: SMARTBUY dataset
researchdata.se
gimi9.com
Updated Jan 29, 2021
+ more versions
Cite
Karl Andersson; Damianos Gavalas (2021). SMARTBUY dataset [Dataset]. http://doi.org/10.5878/cg82-h783
Explore at:
Available download formats: (181405)
Unique identifier
https://doi.org/10.5878/cg82-h783
Dataset updated
Jan 29, 2021
Dataset provided by
Luleå University of Technology
Authors
Karl Andersson; Damianos Gavalas
Time period covered
Sep 1, 2018 - Dec 31, 2018
Area covered
Greece
Description
The dataset represents a compilation of user interaction data generated by users who participated in the project's pilot activities in Patras, Greece. Data was generated by users in the SMARTBUY app and includes information about users, stores, product categories, professions, and events.

The dataset comprises the following data:
- users: user account data for the Patras pilot users
- occupation: all possible occupations that the pilot users could choose from
- stores: stores which participated in the Patras pilot
- sel_products_cat: products uploaded to the SMARTBUY platform by retailers
- events: geo-stamped and time-stamped descriptions of a user interaction event (for instance, "user_id 67 rated product_id 722 with rating 4 at location x1 at datetime y1", or "user_id 91 denoted product_id 78 as favorite at location x2 at datetime y2")
- event_types: all possible event types captured by the SMARTBUY platform ('Product searches', 'Product views', 'Featured product', 'Products near you views', 'Product photos browsed', 'Product ratings', 'Clicks on Read More button to read product reviews', 'Clicks on Open map button', 'Clicks on Send this info by email button', 'Products denoted as Favorite')

Privacy-sensitive information such as user names, retailer owner names and store names and keywords searched are anonymized.
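
To make the shape of the events resource more concrete, here is a small sketch that models the two example events quoted in the description and tallies them by event type. The `Event` class and its field names are hypothetical; the actual column names in the SMARTBUY files are not given in the description.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """One interaction event; field names are hypothetical."""
    user_id: int
    product_id: int
    event_type: str
    location: str
    timestamp: str


# Two records modeled on the examples quoted in the description,
# with the placeholder locations and datetimes kept as-is.
events = [
    Event(67, 722, "Product ratings", "x1", "y1"),
    Event(91, 78, "Products denoted as Favorite", "x2", "y2"),
]

# Count how often each event type occurs.
counts = Counter(e.event_type for e in events)
print(counts.most_common())
```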
Uploading Data - User Guide
gimi9.com
Cite
Uploading Data - User Guide [Dataset]. https://gimi9.com/dataset/uk_uploading-data-user-guide
Explore at:
License
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
🇬🇧 United Kingdom
Data from: Investigating Online Art Search through Quantitative Behavioral...
data.niaid.nih.gov
zenodo.org
Updated Mar 16, 2023
Cite
Kouretsis, Alexandros (2023). Investigating Online Art Search through Quantitative Behavioral Data and Machine Learning Techniques - Dataset [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7741134
Explore at:
Dataset updated
Mar 16, 2023
Dataset provided by
Pergantis, Minas
Kouretsis, Alexandros
Giannakoulopoulos, Andreas
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset includes the detailed values and scripts used to study behavioral aspects of users searching online for Art and Culture by analyzing quantitative data collected by the Art Boulevard search engine using machine learning techniques. This dataset is part of the core methodology, results and discussion sections of the research paper entitled "Investigating Online Art Search through Quantitative Behavioral Data and Machine Learning Techniques"
AIT Log Data Set V1.1
data.niaid.nih.gov
explore.openaire.eu
+1 more
Updated Oct 18, 2023
+ more versions
Cite
Skopik, Florian (2023). AIT Log Data Set V1.1 [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_3723082
Explore at:
Dataset updated
Oct 18, 2023
Dataset provided by
Rauber, Andreas
Hotwagner, Wolfgang
Wurzenberger, Markus
Landauer, Max
Skopik, Florian
License
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Description
AIT Log Data Sets

This repository contains synthetic log data suitable for evaluation of intrusion detection systems. The logs were collected from four independent testbeds that were built at the Austrian Institute of Technology (AIT) following the approach by Landauer et al. (2020) [1]. Please refer to the paper for more detailed information on automatic testbed generation and cite it if the data is used for academic publications. In brief, each testbed simulates user accesses to a webserver that runs Horde Webmail and OkayCMS. The duration of the simulation is six days. On the fifth day (2020-03-04) two attacks are launched against each web server.

The archive AIT-LDS-v1_0.zip contains the directories "data" and "labels".

The data directory is structured as follows. Each web server directory (mail.cup.com, mail.spiral.com, mail.insect.com, or mail.onion.com) contains the logs of one web server. Each user host directory (user-0 through user-10) contains the logs of one user host machine, where one or more users are simulated. Each file log.log in the user host directories contains the activity logs of one particular user.

Setup details of the web servers:

OS: Debian Stretch 9.11.6

Services:

Apache2

PHP7

Exim 4.89

Horde 5.2.22

OkayCMS 2.3.4

Suricata

ClamAV

MariaDB

Setup details of user machines:

OS: Ubuntu Bionic

Services:

Chromium

Firefox

User host machines are assigned to web servers in the following way:

mail.cup.com is accessed by users from host machines user-{0, 1, 2, 6}

mail.spiral.com is accessed by users from host machines user-{3, 5, 8}

mail.insect.com is accessed by users from host machines user-{4, 9}

mail.onion.com is accessed by users from host machines user-{7, 10}

The following attacks are launched against the web servers (different starting times for each web server, please check the labels for exact attack times):

Attack 1: multi-step attack with sequential execution of the following attacks:

nmap scan

nikto scan

smtp-user-enum tool for account enumeration

hydra brute force login

webshell upload through Horde exploit (CVE-2019-9858)

privilege escalation through Exim exploit (CVE-2019-10149)

Attack 2: webshell injection through malicious cookie (CVE-2019-16885)

Attacks are launched from the following user host machines. In each of the corresponding user host directories, logs of the attack execution are found in the file attackLog.txt:

user-6 attacks mail.cup.com

user-5 attacks mail.spiral.com

user-4 attacks mail.insect.com

user-7 attacks mail.onion.com
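
The host assignments and attacker hosts listed above can be captured in two small lookup tables, which is handy when separating attacker traffic from ordinary user traffic. This is only a sketch: the dictionary names and the `other_hosts` helper are hypothetical, while the host numbers are copied from the lists above.

```python
# Host machines assigned to each web server, as listed above.
SERVER_USERS = {
    "mail.cup.com": {0, 1, 2, 6},
    "mail.spiral.com": {3, 5, 8},
    "mail.insect.com": {4, 9},
    "mail.onion.com": {7, 10},
}

# Attacker host per web server, as listed above.
ATTACKERS = {
    "mail.cup.com": 6,
    "mail.spiral.com": 5,
    "mail.insect.com": 4,
    "mail.onion.com": 7,
}


def other_hosts(server):
    """Host ids assigned to a server, excluding the host that attacks it."""
    return sorted(SERVER_USERS[server] - {ATTACKERS[server]})


# Each attacking host is also one of the hosts assigned to that server.
assert all(ATTACKERS[s] in SERVER_USERS[s] for s in SERVER_USERS)

for server in SERVER_USERS:
    print(server, "attacker: user-%d" % ATTACKERS[server], "others:", other_hosts(server))
```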

The log data collected from the web servers includes

Apache access and error logs

syscall logs collected with the Linux audit daemon

suricata logs

exim logs

auth logs

daemon logs

mail logs

syslogs

user logs

Note that due to their large size, the audit/audit.log files of each server were compressed in a .zip archive. If these logs are needed for analysis, they must first be unzipped.
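
A minimal sketch of the unzipping step using Python's standard `zipfile` module. The directory layout assumed by the glob pattern (one `audit/` folder with a `.zip` file under each server directory) and the `unzip_audit_logs` helper are assumptions; adjust the pattern to match the extracted archive.

```python
import zipfile
from pathlib import Path


def unzip_audit_logs(data_dir):
    """Extract every .zip found under <data_dir>/*/audit/ next to the archive.

    Returns the paths of the extracted files.
    """
    extracted = []
    for archive in sorted(Path(data_dir).glob("*/audit/*.zip")):
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(archive.parent)
            extracted.extend(archive.parent / name for name in zf.namelist())
    return extracted
```

Calling `unzip_audit_logs("data")` from the directory that holds the extracted data set would then leave each audit log beside its archive.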

Labels are organized in the same directory structure as logs. Each file contains two labels for each log line separated by a comma, the first one based on the occurrence time, the second one based on similarity and ordering. Note that this does not guarantee correct labeling for all lines and that no manual corrections were conducted.
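
The label layout described above can be read with a few lines of Python. The sketch assumes that line N of a label file corresponds to line N of the matching log file; the sample log lines and label values are hypothetical, since the description does not show the actual label vocabulary.

```python
def parse_label_line(line):
    """Split one label line into (time_based, similarity_based) labels."""
    time_based, similarity_based = line.rstrip("\n").split(",", 1)
    return time_based, similarity_based


def align(log_lines, label_lines):
    """Pair each log line with its two labels; both inputs must be equally long."""
    if len(log_lines) != len(label_lines):
        raise ValueError("log and label files differ in length")
    return [
        (log.rstrip("\n"), *parse_label_line(lab))
        for log, lab in zip(log_lines, label_lines)
    ]


# Hypothetical file contents for illustration only.
logs = ["GET /index.php 200\n", "GET /admin.php 404\n"]
labels = ["0,0\n", "nikto,nikto\n"]

for entry in align(logs, labels):
    print(entry)
```

Because the labels were generated automatically and not corrected by hand, comparing the two labels per line is a cheap way to flag lines where the time-based and similarity-based labeling disagree.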

Version history and related data sets:

AIT-LDS-v1.0: Four datasets, logs from single host, fine-granular audit logs, mail/CMS.

AIT-LDS-v1.1: Removed carriage return of line endings in audit.log files.

AIT-LDS-v2.0: Eight datasets, logs from all hosts, system logs and network traffic, mail/CMS/cloud/web.

Acknowledgements: Partially funded by the FFG projects INDICAETING (868306) and DECEPT (873980), and the EU project GUARD (833456).

If you use the dataset, please cite the following publication:

[1] M. Landauer, F. Skopik, M. Wurzenberger, W. Hotwagner and A. Rauber, "Have it Your Way: Generating Customized Log Datasets With a Model-Driven Simulation Testbed," in IEEE Transactions on Reliability, vol. 70, no. 1, pp. 402-415, March 2021, doi: 10.1109/TR.2020.3031317.
