36 datasets found
  1. A Dataset of over 500.000 commercial email newsletters, as collected by...

    • zenodo.org
    Updated Jun 13, 2022
    Max Maass; Stephan Schwär; Matthias Hollick (2022). A Dataset of over 500.000 commercial email newsletters, as collected by PrivacyMail.info [Dataset]. http://doi.org/10.5281/zenodo.6509751
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Max Maass; Stephan Schwär; Matthias Hollick
    Description

    This dataset contains the data from roughly two years of operating PrivacyMail.info, an Open Source Email privacy measurement platform. It contains slightly over 500.000 commercial newsletters, as crowdsourced by users of PrivacyMail.info. You can find the methodology discussed in our paper: Max Maass, Stephan Schwär, and Matthias Hollick. "Towards transparency in email tracking." Annual Privacy Forum, 2019. The source code can be found on github.com/privacymail/privacymail

    Please note that, due to its crowdsourced nature, this dataset is a sample of opportunity: it is not representative of all newsletters on the Internet, and likely contains biases based on how it was collected. Notably, German-language newsletters will likely be heavily over-represented.

    Dataset Structure
    The dataset is structured as follows: On the top level are folders describing the website the newsletter belongs to. Inside each website folder are subfolders for each identity that was registered for that website. Inside each of these subfolders is a series of .eml files that represent the received email messages.
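
The folder layout described above can be traversed with the Python standard library. The following is a minimal sketch, assuming the layout is exactly `<root>/<website>/<identity>/<file>.eml`; the function name `iter_newsletters` and the example paths are hypothetical and not part of the dataset.

```python
from email import message_from_bytes, policy
from pathlib import Path


def iter_newsletters(root):
    """Yield (website, identity, message) for every .eml file under root.

    Assumes the layout <root>/<website>/<identity>/<file>.eml.
    """
    for eml_path in sorted(Path(root).glob("*/*/*.eml")):
        # The two directory levels above the file name the website and identity.
        website, identity = eml_path.parts[-3], eml_path.parts[-2]
        message = message_from_bytes(eml_path.read_bytes(), policy=policy.default)
        yield website, identity, message
```

Each yielded message is an `email.message.EmailMessage`, so headers can be read with, for example, `message["Subject"]`.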

    Copyright and Licensing
    This dataset is set to non-public due to copyright concerns: The contents of the email messages are (presumably) protected by copyright in most jurisdictions. Most copyright doctrines contain exceptions for non-commercial research use - thus, we feel it is appropriate and acceptable to share the data on a case-by-case basis, the same way we did before shutting down PrivacyMail.info. When requesting access to the data, please briefly describe what research you want to conduct with it, and we will grant you access.

    We thus do not put any explicit license on this dataset. Please do not share the raw data publicly. We request that you cite the above-mentioned paper and this dataset in any publications that result from it.

  2. Enron Fraud Email Dataset

    • kaggle.com
    Updated Dec 28, 2023
    Advaith S Rao (2023). Enron Fraud Email Dataset [Dataset]. https://www.kaggle.com/datasets/advaithsrao/enron-fraud-email-dataset/versions/1
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Advaith S Rao
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    In 2000, Enron was one of the largest companies in the United States. By 2002, it had collapsed into bankruptcy due to widespread corporate fraud. The data has been made public and presents a diverse set of email information ranging from internal, marketing emails to spam and fraud attempts.

    In the early 2000s, Leslie Kaelbling at MIT purchased the dataset and noted that, though the dataset contained scam emails, it also had several integrity problems. The dataset was later updated, but preserving privacy in the data remains essential when it is used to train a deep neural network model.

    Though the Enron Email Dataset contains over 500K emails, one of its problems is the limited availability of labeled fraud emails. Label annotation is done so that an umbrella of fraud emails can be detected accurately. Since fraud emails fall into several types, such as Phishing, Financial, Romance, Subscription, and Nigerian Prince scams, multiple heuristics are needed to label all types of fraudulent emails effectively.

    To tackle this problem, heuristics have been used to label the Enron data corpus using email signals, and automated labeling has been performed using simple ML models on other smaller email datasets available online. These fraud annotation techniques are discussed in detail below.

    To perform fraud annotation on the Enron dataset and to provide more fraud examples for modeling, two more fraud data sources have been used:

    Phishing Email Dataset: https://www.kaggle.com/dsv/6090437

    Social Engineering Dataset: http://aclweb.org/aclwiki

    Label Annotation

    To label the Enron email dataset, two signals are used to filter suspicious emails and label them into fraud and non-fraud classes: automated ML labeling and email signals.

    Automated ML Labeling

    The following heuristics are used to annotate labels for the Enron email data using the other two data sources:

    Phishing Model Annotation: A high-precision SVM model trained on the Phishing mails dataset, which is used to annotate the Phishing Label on the Enron Dataset.

    Social Engineering Model Annotation: A high-precision SVM model trained on the Social Engineering mails dataset, which is used to annotate the Social Engineering Label on the Enron Dataset.

    The two ML Annotator models use Term Frequency Inverse Document Frequency (TF-IDF) to embed the input text and make use of SVM models with Gaussian Kernel.

    If either of the models predicted that an email was a fraud, the mail metadata was checked for several email signals. If these heuristics meet the requirements of a high-probability fraud email, we label it as a fraud email.

    Email Signals

    Email signal-based heuristics are used to filter and target suspicious emails specifically for fraud labeling. The signals used were:

    Person Of Interest: There is a publicly available list of email addresses of employees who were liable for the massive data leak at Enron. These user mailboxes have a higher chance of containing quality fraud emails.

    Suspicious Folders: The Enron data is dumped into several folders for every employee. Folders consist of inbox, deleted_items, junk, calendar, etc. A set of folders with a higher chance of containing fraud emails, such as Deleted Items and Junk, was treated as suspicious.

    Sender Type: The sender type was categorized as ‘Internal’ and ‘External’ based on their email address.

    Low Communication: A threshold of 4 emails based on the table below was used to define Low Communication. A user qualifies as a Low-Comm sender if the number of emails they sent is below this threshold. Mails sent from low-comm senders have been assigned a high probability of being fraud.

    Contains Replies and Forwards: If an email contains forwards or replies, a low probability was assigned for it to be a fraud email.
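
The description does not say exactly how the signals are combined, so the following is only an illustrative sketch. The scoring rule, the `min_score` cutoff, the `EmailSignals` fields, and the choice to count external senders toward fraud are all assumptions made for this example; only the threshold of 4 emails and the suspicious folder names come from the text above.

```python
from dataclasses import dataclass

LOW_COMM_THRESHOLD = 4  # from the description: below 4 emails = low-comm sender
SUSPICIOUS_FOLDERS = {"deleted_items", "junk"}  # folders named in the description


@dataclass
class EmailSignals:
    model_flagged: bool        # either SVM annotator predicted fraud
    in_poi_mailbox: bool       # mailbox belongs to a person of interest
    folder: str                # mailbox folder the email was found in
    sender_is_external: bool   # sender type: external vs internal
    sender_email_count: int    # number of emails sent by this sender
    is_reply_or_forward: bool  # email contains replies or forwards


def label_fraud(s: EmailSignals, min_score: int = 2) -> bool:
    """Hypothetical rule: signals are only checked for model-flagged emails."""
    if not s.model_flagged:
        return False
    score = 0
    score += s.in_poi_mailbox
    score += s.folder.lower() in SUSPICIOUS_FOLDERS
    score += s.sender_is_external  # assumption: external senders are more suspicious
    score += s.sender_email_count < LOW_COMM_THRESHOLD
    score -= s.is_reply_or_forward  # replies and forwards lower the fraud likelihood
    return score >= min_score
```

In this sketch, a model-flagged email from an external low-comm sender in a junk folder would be labeled fraud, while a flagged reply inside an internal thread would not.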

    Manual Inspection

    To ensure high-quality labels, the mismatch examples from ML Annotation have been manually inspected for Enron dataset relabeling.

    Dataset Breakdown

    Fraud: 2327
    Non-Fraud: 445090

    Citations

    Enron Dataset. Title: Enron Email Dataset; URL: https://www.cs.cmu.edu/~enron/; Publisher: MIT, CMU; Author: Leslie Kaelbling, William W. Cohen; Year: 2015

    Phishing Email Detection Dataset. Title: Phishing Email Detection; URL: https://www.kaggle.com/dsv/6090437; DOI: 10.34740/KAGGLE/DSV/6090437; Publisher: Kaggle; Author: Subhadeep Chakraborty; Year: 2023

    CLAIR Fraud Email Collection. Title: CLAIR collection of fraud email; URL: http://aclweb.org/aclwiki; Author: Radev, D.; Year: 2008

  3. Email Dataset for Automatic Response Suggestion within a University

    • figshare.com
    pdf
    Updated Feb 4, 2018
    Aditya Singh; Dibyendu Mishra; Sanchit Bansal; Vinayak Agarwal; Anjali Goyal; Ashish Sureka (2018). Email Dataset for Automatic Response Suggestion within a University [Dataset]. http://doi.org/10.6084/m9.figshare.5853057.v1
    Available download formats: pdf
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Aditya Singh; Dibyendu Mishra; Sanchit Bansal; Vinayak Agarwal; Anjali Goyal; Ashish Sureka
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    We have developed an application and solution approach (using this dataset) for automatically generating and suggesting short email responses to support queries in a university environment. Our proposed solution can be used as a one-tap or one-click solution for responding to various types of queries raised by faculty members and students in a university. The Office of Academic Affairs (OAA), Office of Student Life (OSL) and Information Technology Helpdesk (ITD) are support functions within a university which receive hundreds of email messages on a daily basis. Email is still the most frequently used mode of communication by these departments. A large percentage of the emails received by these departments are frequent, common queries or requests for information. Responding to every query by manually typing is a tedious and time-consuming task. Furthermore, a large percentage of emails and their responses consist of short messages. For example, the IT support department in our university receives several emails about Wi-Fi not working or about someone needing help with a projector, an HDMI cable or a remote slide changer. Another example is emails from students requesting the Office of Academic Affairs to add and drop courses, which they cannot do directly. The dataset consists of email messages which are generally received by ITD, OAA and OSL in Ashoka University. The dataset also contains intermediate results from machine learning experiments.

  4. Spam email classification

    • kaggle.com
    Updated Sep 21, 2023
    Yousef Mohamed (2023). Spam email classification [Dataset]. https://www.kaggle.com/datasets/yousefmohamed20/spam-email-detection
    Dataset provided by
    Kaggle
    Authors
    Yousef Mohamed
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    This is a CSV file containing 5157 randomly picked emails and their labels for spam or not-spam classification. The file has 5157 rows, one per email, and 2 columns: the first gives the email category (spam or ham), and the second gives the text of the email.
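
A two-column file of this shape can be read with the standard `csv` module. The sketch below uses a few made-up rows in place of the real file; the header names `Category` and `Message` and the helper name `load_labeled_emails` are assumptions, since the description does not give the actual column names.

```python
import csv
import io
from collections import Counter

# Made-up rows in the two-column layout described above; header names are assumed.
SAMPLE = """Category,Message
ham,"See you at lunch, ok?"
spam,WIN a FREE prize now
ham,Meeting moved to 3pm
"""


def load_labeled_emails(handle):
    """Return (label, text) pairs from a CSV file that starts with a header row."""
    reader = csv.reader(handle)
    next(reader)  # skip the header row
    # Ignore blank or malformed rows that do not have both columns.
    return [(row[0].strip().lower(), row[1]) for row in reader if len(row) >= 2]


rows = load_labeled_emails(io.StringIO(SAMPLE))
print(Counter(label for label, _ in rows))
```

For the real file, the `io.StringIO(SAMPLE)` handle would be replaced by `open(path, newline="", encoding="utf-8")`, with the encoding adjusted as needed.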

  5. US Consumer Marketing Data - 269M+ Consumer Records - 95% Email and Direct...

    • datarade.ai
    Updated Jun 1, 2022
    Giant Partners (2022). US Consumer Marketing Data - 269M+ Consumer Records - 95% Email and Direct Dials Accuracy [Dataset]. https://datarade.ai/data-products/consumer-business-data-postal-phone-email-demographics-giant-partners
    Dataset authored and provided by
    Giant Partners
    Area covered
    United States of America
    Description

    Premium B2C Consumer Database - 269+ Million US Records

    Supercharge your B2C marketing campaigns with a comprehensive consumer database, featuring over 269 million verified US consumer records. Our 20+ years of data expertise deliver higher quality and more extensive coverage than competitors.

    Core Database Statistics

    Consumer Records: Over 269 million

    Email Addresses: Over 160 million (verified and deliverable)

    Phone Numbers: Over 76 million (mobile and landline)

    Mailing Addresses: Over 116,000,000 (NCOA processed)

    Geographic Coverage: Complete US (all 50 states)

    Compliance Status: CCPA compliant with consent management

    Targeting Categories Available

    Demographics: Age ranges, education levels, occupation types, household composition, marital status, presence of children, income brackets, and gender (where legally permitted)

    Geographic: Nationwide, state-level, MSA (Metropolitan Statistical Area), zip code radius, city, county, and SCF range targeting options

    Property & Dwelling: Home ownership status, estimated home value, years in residence, property type (single-family, condo, apartment), and dwelling characteristics

    Financial Indicators: Income levels, investment activity, mortgage information, credit indicators, and wealth markers for premium audience targeting

    Lifestyle & Interests: Purchase history, donation patterns, political preferences, health interests, recreational activities, and hobby-based targeting

    Behavioral Data: Shopping preferences, brand affinities, online activity patterns, and purchase timing behaviors

    Multi-Channel Campaign Applications

    Deploy across all major marketing channels:

    Email marketing and automation

    Social media advertising

    Search and display advertising (Google, YouTube)

    Direct mail and print campaigns

    Telemarketing and SMS campaigns

    Programmatic advertising platforms

    Data Quality & Sources

    Our consumer data aggregates from multiple verified sources:

    Public records and government databases

    Opt-in subscription services and registrations

    Purchase transaction data from retail partners

    Survey participation and research studies

    Online behavioral data (privacy compliant)

    Technical Delivery Options

    File Formats: CSV, Excel, JSON, XML formats available

    Delivery Methods: Secure FTP, API integration, direct download

    Processing: Real-time NCOA, email validation, phone verification

    Custom Selections: 1,000+ selectable demographic and behavioral attributes

    Minimum Orders: Flexible based on targeting complexity

    Unique Value Propositions

    Dual Spouse Targeting: Reach both household decision-makers for maximum impact

    Cross-Platform Integration: Seamless deployment to major ad platforms

    Real-Time Updates: Monthly data refreshes ensure maximum accuracy

    Advanced Segmentation: Combine multiple targeting criteria for precision campaigns

    Compliance Management: Built-in opt-out and suppression list management

    Ideal Customer Profiles

    E-commerce retailers seeking customer acquisition

    Financial services companies targeting specific demographics

    Healthcare organizations with compliant marketing needs

    Automotive dealers and service providers

    Home improvement and real estate professionals

    Insurance companies and agents

    Subscription services and SaaS providers

    Performance Optimization Features

    Lookalike Modeling: Create audiences similar to your best customers

    Predictive Scoring: Identify high-value prospects using AI algorithms

    Campaign Attribution: Track performance across multiple touchpoints

    A/B Testing Support: Split audiences for campaign optimization

    Suppression Management: Automatic opt-out and DNC compliance

    Pricing & Volume Options

    Flexible pricing structures accommodate businesses of all sizes:

    Pay-per-record for small campaigns

    Volume discounts for large deployments

    Subscription models for ongoing campaigns

    Custom enterprise pricing for high-volume users

    Data Compliance & Privacy

    VIA.tools maintains industry-leading compliance standards:

    CCPA (California Consumer Privacy Act) compliant

    CAN-SPAM Act adherence for email marketing

    TCPA compliance for phone and SMS campaigns

    Regular privacy audits and data governance reviews

    Transparent opt-out and data deletion processes

    Getting Started

    Our data specialists work with you to:

    1. Define your target audience criteria

    2. Recommend optimal data selections

    3. Provide sample data for testing

    4. Configure delivery methods and formats

    5. Implement ongoing campaign optimization

    Why We Lead the Industry

    With over two decades of data industry experience, we combine extensive database coverage with advanced targeting capabilities. Our commitment to data quality, compliance, and customer success has made us the preferred choice for businesses seeking superior B2C marketing performance.

    Contact our team to discuss your specific targeting requirements and receive custom pricing for your marketing objectives.

  6. Spam Mail Prediction Dataset

    • opendatabay.com
    Updated Jun 6, 2025
    Datasimple (2025). Spam Mail Prediction Dataset [Dataset]. https://www.opendatabay.com/data/dataset/080d396c-0650-452b-9bef-d6bb3fa9366e
    Dataset authored and provided by
    Datasimple
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Area covered
    Fraud Detection & Risk Management
    Description

    The dataset consists of a collection of emails categorized into two major classes: spam and not spam. It is designed to facilitate the development and evaluation of spam detection or email filtering systems.

    The spam emails in the dataset are typically unsolicited and unwanted messages that aim to promote products or services, spread malware, or deceive recipients for various malicious purposes. These emails often contain misleading subject lines, excessive use of advertisements, unauthorized links, or attempts to collect personal information.

    The non-spam emails in the dataset are genuine and legitimate messages sent by individuals or organizations. They may include personal or professional communication, newsletters, transaction receipts, or any other non-malicious content.

    The dataset encompasses emails of varying lengths, languages, and writing styles, reflecting the inherent heterogeneity of email communication. This diversity aids in training algorithms that can generalize well to different types of emails, making them robust against different spammer tactics and variations in non-spam email content.

    Original Data Source: Spam Mail Prediction Dataset

  7. email-EU

    • zenodo.org
    • opendatalab.com
    • +1more
    json
    Updated Nov 19, 2023
    Nicholas Landry (2023). email-EU [Dataset]. http://doi.org/10.5281/zenodo.10155823
    Available download formats: json
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Nicholas Landry
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Overview

    This hypergraph dataset was generated using email data from a large European research institution for a period from October 2003 to May 2005 (18 months). Information about all incoming and outgoing emails between members of the research institution has been anonymized. The e-mails only represent communication between institution members (the core), and the dataset does not contain incoming messages from or outgoing messages to the rest of the world.

    This is a temporal hypergraph dataset, which here means a sequence of timestamped hyperedges where each hyperedge is a set of nodes. Timestamps are in ISO8601 format. In email communication, messages can be sent to multiple recipients. In this dataset, nodes are email addresses at a European research institution. The original data source only contains directed temporal edge tuples (sender, receiver, timestamp), where timestamps are recorded at 1-second resolution. The hyperedges are undirected and consist of a sender and all receivers grouped such that the email between the sender and each receiver has the same timestamp.
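
The grouping step described above can be sketched in a few lines of Python. This is an illustration of the stated rule (tuples sharing a sender and a timestamp form one hyperedge), not the dataset author's actual code; the function name `build_hyperedges` and the sample node IDs are hypothetical.

```python
from collections import defaultdict


def build_hyperedges(edges):
    """Group (sender, receiver, timestamp) tuples into timestamped hyperedges.

    Tuples sharing a sender and a timestamp are treated as one email; the
    hyperedge is the undirected node set containing the sender and all of
    that email's receivers.
    """
    receivers = defaultdict(set)
    for sender, receiver, timestamp in edges:
        receivers[(timestamp, sender)].add(receiver)
    return [
        (timestamp, frozenset({sender} | receivers[(timestamp, sender)]))
        for timestamp, sender in sorted(receivers)
    ]
```

For example, two tuples `(1, 2, t)` and `(1, 3, t)` with the same timestamp `t` collapse into the single hyperedge `{1, 2, 3}`.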

    Statistics

    Some basic statistics of this dataset are:

    • number of nodes: 1,005
    • number of timestamped hyperedges: 235,263
    • distribution of the connected components:

    Component Size, Number

    • 986, 1
    • 1, 19

    Source of original data

    Source: email-Eu dataset

    References

    If you use this dataset, please cite these references:

  8. Enron Email Time-Series Network

    • zenodo.org
    • explore.openaire.eu
    csv
    Updated Jan 24, 2020
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Volodymyr Miz; Benjamin Ricaud; Pierre Vandergheynst (2020). Enron Email Time-Series Network [Dataset]. http://doi.org/10.5281/zenodo.1342353
    Available download formats: csv
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Volodymyr Miz; Benjamin Ricaud; Pierre Vandergheynst
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    We use the Enron email dataset to build a network of email addresses. It contains 614586 emails sent over the period from 6 January 1998 until 4 February 2004. During pre-processing, we remove the periods of low activity and keep the emails from 1 January 1999 until 31 July 2002, which is 1448 days of email records in total. We also remove email addresses that sent fewer than three emails over that period. In total, the Enron email network contains 6 600 nodes and 50 897 edges.

    To build a graph G = (V, E), we use email addresses as nodes V. Every node v_i has an attribute which is a time-varying signal that corresponds to the number of emails sent from this address during a day. We draw an edge e_ij between two nodes i and j if there is at least one email exchange between the corresponding addresses.

    Column 'Count' in 'edges.csv' file is the number of 'From'->'To' email exchanges between the two addresses. This column can be used as an edge weight.

    The file 'nodes.csv' contains a dictionary that is a compressed representation of time-series. The format of the dictionary is Day->The Number Of Emails Sent By the Address During That Day. The total number of days is 1448.

    'id-email.csv' is a file containing the actual email addresses.
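
The edge counts and per-node daily signals described above can be derived from a list of emails as follows. This is a minimal sketch, not the authors' pipeline: the input format (one `(sender, recipient, date)` tuple per sent email), the function name `build_network`, and the sample addresses are assumptions for illustration.

```python
from collections import Counter, defaultdict
from datetime import date


def build_network(emails, start):
    """Build edge counts and sparse daily signals from (sender, recipient, date).

    edges   -- Counter mapping (sender, recipient) to the number of emails,
               analogous to the 'Count' column described above.
    signals -- per-address dict {day_index: emails sent that day}; days with
               no mail are omitted, which keeps the representation compact.
    """
    edges = Counter()
    signals = defaultdict(Counter)
    for sender, recipient, sent_on in emails:
        edges[(sender, recipient)] += 1
        # Day index is counted from the first day of the kept period.
        signals[sender][(sent_on - start).days] += 1
    return edges, {addr: dict(days) for addr, days in signals.items()}
```

Storing only the non-zero days mirrors the Day->count dictionary format, since most addresses are silent on most of the 1448 days.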

  9. Arabic Phishing and Legitimate emails - Samples

    • kaggle.com
    Updated Dec 5, 2024
    Rian Sh. Al-yozbaky (2024). Arabic Phishing and Legitimate emails - Samples [Dataset]. https://www.kaggle.com/datasets/rianshalyozbaky/arabic-phishing-and-legitimate-emails-samples
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Rian Sh. Al-yozbaky
    Description

    Dataset of phishing and legitimate emails. This dataset includes 1250 email messages, divided into two parts: 250 phishing emails and 1000 legitimate emails.

    This dataset was created by gathering more than 4,000 email messages from multiple international databases, then processing and analyzing them. Because some of the messages are not appropriate for testing, only the best examples for cybersecurity research, particularly for recognizing and preventing phishing messages, were chosen.

    Please be aware that the file contains the full dataset.

  10. How to Login Roadrunner Account? | A Complete Guide Dataset

    • paperswithcode.com
    Updated Jun 17, 2025
    (2025). How to Login Roadrunner Account? | A Complete Guide Dataset [Dataset]. https://paperswithcode.com/dataset/how-to-login-roadrunner-account-a-complete
    Description

    (Toll Free) Number +1-341-900-3252

    Email remains a vital communication tool for both personal and professional use. For those who have been using Time Warner Cable services, the Roadrunner email service is a familiar name. Now managed by Spectrum, the Roadrunner email platform is still active and accessible for users with existing accounts. However, to access all its features and ensure smooth communication, it's essential to understand how to set up, use, and manage your Roadrunner login account effectively.

    What Is a Roadrunner Login Account?

    A Roadrunner login account is the email account created through Time Warner Cable’s Roadrunner service, now handled by Spectrum. Although new Roadrunner accounts are no longer issued, existing users can continue to access their email using the credentials associated with their original account.

    The Roadrunner login account functions like any other email service, allowing users to send, receive, organize, and store emails. It's especially popular among long-time customers who prefer the simplicity and reliability of the interface.

    Setting Up a Roadrunner Login Account

    For users with an existing Roadrunner email address, setting up access on new devices or email clients is straightforward. While you cannot create a new Roadrunner login account, here’s how to set up your existing account on various platforms:

    On Web Browser

    Open your preferred browser.

    Navigate to the Spectrum or legacy Roadrunner email portal.

    Enter your Roadrunner email address and password.

    Click "Sign In" to access your inbox.

    On Email Clients (Outlook, Thunderbird, etc.)

    To configure your Roadrunner login account on email software, you need both incoming and outgoing server details:

    Incoming Server (IMAP or POP3): Server: mail.twc.com; Port: 993 (IMAP), 110 (POP3); Security: SSL/TLS

    Outgoing Server (SMTP): Server: mail.twc.com; Port: 587; Security: STARTTLS

    Make sure to enter your full email address and password when setting up.

    Benefits of Using a Roadrunner Login Account

    While Roadrunner email may seem old-school to some, it still offers various features that benefit users:

    Reliable Service: Users report that their Roadrunner login account remains stable and reliable for both sending and receiving emails.

    Simple Interface: Unlike many modern, cluttered email interfaces, Roadrunner email is known for its clean and user-friendly layout.

    Storage and Access: Roadrunner provides decent storage limits and access across various devices including desktops, laptops, and mobile phones.

    Spam Filtering: The spam detection system for Roadrunner login accounts helps keep your inbox clean and secure.

    Troubleshooting Roadrunner Login Issues

    If you're having trouble accessing your Roadrunner login account, you're not alone. Below are some of the most common issues and how to fix them:

    Forgot Password: If you forget your Roadrunner password, visit the Spectrum account recovery page. You’ll need to verify your identity and then reset your password.

    Incorrect Credentials: Double-check the spelling of your email address and password. Also, make sure Caps Lock isn’t turned on, which can cause login errors.

    Locked Account: Too many failed login attempts may result in your Roadrunner login account being temporarily locked. Waiting a few minutes or resetting the password usually resolves this.

    Server Settings: If your email client isn’t working, make sure you're using the correct IMAP/POP and SMTP settings as listed above.

    Managing Your Roadrunner Login Account

    Properly managing your Roadrunner login account ensures it stays secure and functional over time. Here are a few tips:

    Update Recovery Options: Make sure your account has a valid recovery email or phone number, so you can regain access if needed.

    Regular Password Changes: For security purposes, it’s advisable to change your password every few months.

    Organize Emails: Use folders and filters to keep your inbox organized. This will help you manage important messages more effectively.

    Delete Unnecessary Emails: Clearing old or unwanted messages can help you stay within storage limits and improve overall account performance.

    Keeping Your Roadrunner Login Account Secure

    With cybersecurity threats on the rise, protecting your Roadrunner login account is more important than ever:

    Use a strong and unique password combining letters, numbers, and symbols.

    Avoid using public Wi-Fi to access your email unless you're using a VPN.

    Enable two-step authentication if available through Spectrum.

    Never click suspicious links or download attachments from unknown senders.

    Accessing Roadrunner Email on Mobile Devices

    To use your Roadrunner login account on a smartphone or tablet:

    Go to your device’s email app and add a new account.

    Choose "Other" or "Manual Setup" if prompted.

    Enter your Roadrunner email address and password.

    Input the server settings manually as previously mentioned.

    Save and sync.

    Once configured, you can send and receive emails from your mobile device just like you would from a computer.

    Final Thoughts

    Though it may not be as modern as Gmail or Outlook, the Roadrunner login account continues to serve many long-time users with reliability and simplicity. Whether you’re checking email on your desktop or syncing it with your mobile device, understanding how to manage and secure your Roadrunner account is key to staying connected and protected.

  11. Email.cz image spam dataset v1

    • academictorrents.com
    bittorrent
    Updated Dec 30, 2019
    Vit Listik (2019). Email.cz image spam dataset v1 [Dataset]. https://academictorrents.com/details/06f2389082e9c034fa4a73aaee00131a27e388b6
    Available download formats: bittorrent (2660566545)
    Dataset authored and provided by
    Vit Listik
    License

    https://academictorrents.com/nolicensespecified

    Description

    Email image spam classification has been a known problem since 2005. There are several approaches to this task; recent ones use convolutional neural networks (CNN). We propose a novel approach to the image spam classification task. Our approach is based on CNN and transfer learning, namely Resnet v1 used for semantic feature extraction and a one-layer feedforward neural network for classification. We have shown that this approach can achieve state-of-the-art performance on publicly available datasets: a 99% F1-score on two datasets [dredze 2007, Princeton] and a 96% F1-score on the combination of these datasets. Due to the availability of GPUs, this approach may be used for just-in-time classification in anti-spam systems handling huge amounts of emails. We have also observed that the mentioned publicly available datasets are no longer representative. We overcame this limitation by using a much richer dataset from one week of real traffic of the freemail provider Email.cz.

  12. FinePersonas-Synthetic-Email-Conversations

    • huggingface.co
    Updated Sep 23, 2024
    Cite
    Argilla (2024). FinePersonas-Synthetic-Email-Conversations [Dataset]. https://huggingface.co/datasets/argilla/FinePersonas-Synthetic-Email-Conversations
    Explore at:
    Croissant: a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Sep 23, 2024
    Dataset authored and provided by
    Argilla
    License

    https://choosealicense.com/licenses/llama3.1/

    Description

    FinePersonas Synthetic Email Conversations

    FinePersonas Synthetic Email Conversations is a dataset containing around 115k conversations via email between two personas from the argilla/FinePersonas-v0.1. Conversations were generated using NousResearch/Hermes-3-Llama-3.1-70B.

      🗞️ News
    

    [10/16/2024] New subsets: added two new subsets unfriendly_email_conversations and unprofessional_email_conversations.

      How were the conversations generated?… See the full description on the dataset page: https://huggingface.co/datasets/argilla/FinePersonas-Synthetic-Email-Conversations.
    
  13. Best Healthcare Solutions Provider | Healthcare Data | Physician Data by...

    • datarade.ai
    Updated Jun 21, 2021
    + more versions
    Cite
    Infotanks Media (2021). Best Healthcare Solutions Provider | Healthcare Data | Physician Data by Infotanks Media [Dataset]. https://datarade.ai/data-products/best-healthcare-solutions-provider-healthcare-data-physic-infotanks-media
    Explore at:
    Dataset updated
    Jun 21, 2021
    Dataset authored and provided by
    Infotanks Media
    Area covered
    Mexico, Saint Helena, Wallis and Futuna, Sri Lanka, French Guiana, Ethiopia, Colombia, Malta, Latvia, Korea (Republic of)
    Description

    "Facilitate marketing campaigns with the healthcare email list from Infotanks Media that includes doctors, healthcare professionals, NPI numbers, physician specialties, and more. Buy targeted email lists of healthcare professionals and connect with doctors, specialists, and other healthcare professionals to promote your products and services. Hyper personalize campaigns to increase engagement for better chances of conversion. Reach out to our data experts today! Access 1.2 million physician contact database with 150+ specialities including chiropractors, cardiologists, psychiatrists, and radiologists among others. Get ready to integrate healthcare email lists from Infotanks Media to start email marketing campaigns through any CRM and ESP. Contact us right now! Ensure guaranteed lead generation with segmented email marketing strategies for specialists, departments, and more. Make the best use of target marketing to progress and move closer to your business goals with email listing services for healthcare professionals. Infotanks Media provides 100% verified healthcare email lists with the highest email deliverability guarantee of 95%. Get a custom quote today as per your requirements. Enhance your marketing campaigns with healthcare email lists from 170+ countries to build your global outreach. Request your free sample today! Personalize your business communication and interactions to maximize conversion rates with high quality contact data. Grow your business network in your target markets from anywhere in the world with a guaranteed 95% contact accuracy of the healthcare email lists from Infotanks Media. Contact data experts at Infotanks Media from the healthcare industry to get a quick sample for free. Write to us or call today!

    Hyper target within and outside your desired markets with GDPR and CAN-SPAM compliant healthcare email lists that get integrated into your CRM and ESPs. Balance out the sales and marketing efforts by aligning goals using email lists from the healthcare industry. Build strong business relationships with potential clients through personalized campaigns. Call Infotanks Media for a free consultation. Explore new geographies and target markets with a focused approach using healthcare email lists. Align your sales teams and marketing teams through personalized email marketing campaigns to ensure they accomplish business goals together. Add value and grow revenue to take your business to the next level of success. Double up your business and revenue growth with email lists of healthcare professionals. Send segmented campaigns to monitor behaviors and understand the purchasing habits of your potential clients. Send follow up nurturing email marketing campaigns to attract your potential clients to become converted customers. Close deals sooner with detailed information of your prospects using the healthcare email list from Infotanks Media. Reach healthcare professionals on their preferred platform of communication with the email list of healthcare professionals. Identify, capture, explore, and grow in your target markets anywhere in the world with a fully verified, validated, and compliant email database of healthcare professionals. Move beyond the traditional approach and automate sales cycles with buying triggers sent through email marketing campaigns. Use the healthcare email list from Infotanks Media to engage with your targeted potential clients and get them to respond. Increase email marketing campaign response rate to convert better! Reach out to Infotanks Media to customize your healthcare email lists. Call today!"

  14. 4367x PII Label-Specific Essays (by 7b Models)

    • kaggle.com
    Updated Feb 7, 2024
    Cite
    Valentin Werner (2024). 4367x PII Label-Specific Essays (by 7b Models) [Dataset]. https://www.kaggle.com/datasets/valentinwerner/pii-label-specific-data
    Explore at:
    Croissant: a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Feb 7, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Valentin Werner
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    Evaluation of my dataset with my .915 baseline:

    F5 score = .690 - Recall = .692, Precision = .639
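
    The reported figures can be cross-checked with a small sketch. It assumes that "F5" means the F-beta score with beta = 5 (the description does not define it); the function name `f_beta` is illustrative and not part of the dataset or its notebook.

```python
def f_beta(precision: float, recall: float, beta: float) -> float:
    """Weighted harmonic mean of precision and recall.

    beta > 1 weights recall more heavily than precision.
    """
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Values quoted in the description; beta = 5 is an assumption.
score = f_beta(precision=0.639, recall=0.692, beta=5)
print(round(score, 3))
```

    Under that assumption the result agrees with the quoted score to three decimal places, and it sits close to the recall value because beta = 5 makes recall dominate.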

    Distribution of data:

    • 843x Address (ca. 500 US)
    • 496x Names (Incl. Middle Names, Pronunciation or Nicknames)
    • 537x Userid
    • 704x Username (Incl. Name)
    • 531x Phone
    • 755x Email (Incl. Name)
    • 501x URL

    See linked notebook for generation.

    Remarks on labels:

    EMAIL:

    1. Email is always based on the name, but with random domains
    2. The prompt was to also write about their favourite book; they heavily favour “To Kill a Mockingbird”

    PHONE:

    1. Generated from multiple countries for diversity
    2. Labelling of phone numbers should only include the full number (not parts of it)

    ADDRESSES:

    1. From multiple countries for diversity
    2. For US addresses, state abbreviations are mapped to the full name, so these are labelled as well
    3. An address is only labelled as such if it starts with either of the first two words of the full address (e.g., if the house number is missing from a US address, it is still labelled)

    NAMES:

    1. Middle names are sometimes generated, separated with either " " or "-"
    2. Pronunciations and nicknames were generated and labelled
    3. However, “t’oma”, as in “my name Thomas is derived from the Aramaic word ‘t’oma’”, was not tagged. Let me know if this is wrong. These cases are relatively easy to identify in the names dataset by looking for “derived from”

    URL:

    1. Short domains, full websites and full URIs

    USERID:

    1. Mostly randomly generated string and number combinations, not modelled on other formats
    2. Can mostly be augmented easily by replacing the userid
    3. The userid is sometimes split into parts in the text; these splits are not labelled (not sure if this is right)

    USERNAMES:

    1. Generated either based on name OR animal+birthyear OR colour+fruit
  15. Dataset analysing the crossover between archivists, recordkeeping...

    • figshare.com
    xlsx
    Updated Aug 29, 2018
    Cite
    Rebecca Grant (2018). Dataset analysing the crossover between archivists, recordkeeping professionals and research data management using email list data [Dataset]. http://doi.org/10.6084/m9.figshare.7007903.v1
    Explore at:
    Available download formats: xlsx
    Dataset updated
    Aug 29, 2018
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Rebecca Grant
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    This dataset relates to research on the connections between archives professionals and research data management. It consists of a single Excel spreadsheet with four sheets, containing an analysis of emails sent to two email discussion lists: Archives-NRA (Archivists, conservators and records managers) and Research-Dataman. The coded dataset and a list of codes used for each mailing list are provided.

    The two datasets were downloaded from the JiscMail Email Discussion list archives on 27 July 2018. The Archives-NRA dataset was compiled by conducting a free text search for "research data" on the mailing list's archives, and the metadata for every search result was downloaded and coded (144 metadata records in total). The resulting coded dataset demonstrates how frequently archivists and records professionals discuss research data on the Archives-NRA list, the topics which are discussed, and an increase in these discussions over time.

    The Research-Dataman dataset was compiled by conducting a free text search for "archivist" on the mailing list's archives, and the metadata for every search result was downloaded and coded (197 emails in total). The resulting coded dataset demonstrates how frequently data management professionals seek the advice of archivists or advertise vacancies for archivists, and how often archivists email this mailing list.

    The names and email addresses of the mailing list participants have been redacted for privacy reasons, but the original full-text emails can be accessed by members of the respective mailing lists using the URLs provided in the dataset.

  16. City of Tempe 2023 Business Survey Data

    • catalog.data.gov
    • s.cnmilf.com
    • +10 more
    Updated Sep 20, 2024
    + more versions
    Cite
    City of Tempe (2024). City of Tempe 2023 Business Survey Data [Dataset]. https://catalog.data.gov/dataset/city-of-tempe-2023-business-survey-data
    Explore at:
    Dataset updated
    Sep 20, 2024
    Dataset provided by
    City of Tempe
    Area covered
    Tempe
    Description

    These data include the individual responses for the City of Tempe Annual Business Survey conducted by ETC Institute. These data help determine priorities for the community as part of the City's on-going strategic planning process. Averaged Business Survey results are used as indicators for city performance measures. The performance measures with indicators from the Business Survey include the following (as of 2023):

    1. Financial Stability and Vitality
    • 5.01 Quality of Business Services

    The location data in this dataset is generalized to the block level to protect privacy. This means that only the first two digits of an address are used to map the location. When the data are shared with the city, only the latitude/longitude of the block level address points are provided. This results in points that overlap. In order to better visualize the data, overlapping points were randomly dispersed to remove overlap. The result of these two adjustments ensures that the points are not related to a specific address, but are still close enough to allow insights about service delivery in different areas of the city.

    Additional Information
    Source: Business Survey
    Contact (author): Adam Samuels
    Contact E-Mail (author): Adam_Samuels@tempe.gov
    Contact (maintainer):
    Contact E-Mail (maintainer):
    Data Source Type: Excel table
    Preparation Method: Data received from vendor after report is completed
    Publish Frequency: Annual
    Publish Method: Manual
    Data Dictionary

    Methods:
    The survey is mailed to a random sample of businesses in the City of Tempe. Follow up emails and texts are also sent to encourage participation. A link to the survey is provided with each communication. To prevent people who do not live in Tempe or who were not selected as part of the random sample from completing the survey, everyone who completed the survey was required to provide their address. These addresses were then matched to those used for the random representative sample. If the respondent’s address did not match, the response was not used. To better understand how services are being delivered across the city, individual results were mapped to determine overall distribution across the city.

    Processing and Limitations:
    The location data in this dataset is generalized to the block level to protect privacy. This means that only the first two digits of an address are used to map the location. When the data are shared with the city, only the latitude/longitude of the block level address points are provided. This results in points that overlap. In order to better visualize the data, overlapping points were randomly dispersed to remove overlap. The result of these two adjustments ensures that the points are not related to a specific address, but are still close enough to allow insights about service delivery in different areas of the city.

    The data are used by the ETC Institute in the final published PDF report.

  17. Dataset of Grouped Commit Author IDs after Identity Resolution Dataset

    • paperswithcode.com
    • zenodo.org
    Updated May 5, 2021
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    (2021). Dataset of Grouped Commit Author IDs after Identity Resolution Dataset [Dataset]. https://paperswithcode.com/dataset/dataset-of-grouped-commit-author-ids-after
    Explore at:
    Dataset updated
    May 5, 2021
    Description

    This dataset contains the IDs of 5,427,024 commit authors who have created commits in the git version control system and have more than one ID in git. It is a compressed CSV file (separated by ; ) with 14,861,538 author IDs, where the first column is the group ID, which is the same as the first (randomly selected) author ID of the group, and the second column is the author ID that is part of the group. If an author was found to have two different IDs, I1 and I2, this is recorded in the file in two separate lines, I1;I1 and I1;I2; i.e., the first column is the group identifier, which is one of the IDs in a group, and the second column contains the different author IDs in separate lines. This dataset contains email addresses for various Git authors, but the '@' within each email address has been replaced with a '#'.
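
    The two-column layout described above can be read into groups with a short sketch. The sample rows and the function name `load_groups` are hypothetical, and the sketch assumes that no author ID itself contains a ';'. The real file is compressed and would need to be decompressed first.

```python
import csv
from collections import defaultdict


def load_groups(lines):
    """Map each group ID to the list of author IDs recorded for it.

    `lines` is any iterable of text lines in the form "group_id;author_id".
    """
    groups = defaultdict(list)
    # QUOTE_NONE: treat quote characters inside author IDs as plain text.
    reader = csv.reader(lines, delimiter=";", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) != 2:
            continue  # skip blank or malformed lines
        group_id, author_id = row
        groups[group_id].append(author_id)
    return dict(groups)


# Hypothetical rows following the I1;I1 / I1;I2 pattern from the description.
sample = ["I1;I1", "I1;I2", "J1;J1", "J1;J2", "J1;J3"]
groups = load_groups(sample)
print(groups)
```

    The sketch leaves the '#' substitution in email addresses untouched, since the description presents it as a deliberate property of the published data.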

  18. CommunitySurvey2023weighted

    • data.tempe.gov
    • data-academy.tempe.gov
    • +6 more
    Updated Jan 2, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    City of Tempe (2024). CommunitySurvey2023weighted [Dataset]. https://data.tempe.gov/datasets/tempegov::communitysurvey2023weighted
    Explore at:
    Dataset updated
    Jan 2, 2024
    Dataset authored and provided by
    City of Tempe
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Area covered
    Description

    These data include the individual responses for the City of Tempe Annual Community Survey conducted by ETC Institute. This dataset has two layers and includes both the weighted data and unweighted data. Weighting data is a statistical method in which datasets are adjusted through calculations in order to more accurately represent the population being studied. The weighted data are used in the final published PDF report.

    These data help determine priorities for the community as part of the City's on-going strategic planning process. Averaged Community Survey results are used as indicators for several city performance measures. The summary data for each performance measure is provided as an open dataset for that measure (separate from this dataset). The performance measures with indicators from the survey include the following (as of 2023):

    1. Safe and Secure Communities
    • 1.04 Fire Services Satisfaction
    • 1.06 Crime Reporting
    • 1.07 Police Services Satisfaction
    • 1.09 Victim of Crime
    • 1.10 Worry About Being a Victim
    • 1.11 Feeling Safe in City Facilities
    • 1.23 Feeling of Safety in Parks

    2. Strong Community Connections
    • 2.02 Customer Service Satisfaction
    • 2.04 City Website Satisfaction
    • 2.05 Online Services Satisfaction Rate
    • 2.15 Feeling Invited to Participate in City Decisions
    • 2.21 Satisfaction with Availability of City Information

    3. Quality of Life
    • 3.16 City Recreation, Arts, and Cultural Centers
    • 3.17 Community Services Programs
    • 3.19 Value of Special Events
    • 3.23 Right of Way Landscape Maintenance
    • 3.36 Quality of City Services

    4. Sustainable Growth & Development
    No Performance Measures in this category presently relate directly to the Community Survey

    5. Financial Stability & Vitality
    No Performance Measures in this category presently relate directly to the Community Survey

    Methods:
    The survey is mailed to a random sample of households in the City of Tempe. Follow up emails and texts are also sent to encourage participation. A link to the survey is provided with each communication. To prevent people who do not live in Tempe or who were not selected as part of the random sample from completing the survey, everyone who completed the survey was required to provide their address. These addresses were then matched to those used for the random representative sample. If the respondent’s address did not match, the response was not used. To better understand how services are being delivered across the city, individual results were mapped to determine overall distribution across the city. Additionally, demographic data were used to monitor the distribution of responses to ensure the responding population of each survey is representative of city population.

    Processing and Limitations:
    The location data in this dataset is generalized to the block level to protect privacy. This means that only the first two digits of an address are used to map the location. When the data are shared with the city, only the latitude/longitude of the block level address points are provided. This results in points that overlap. In order to better visualize the data, overlapping points were randomly dispersed to remove overlap. The result of these two adjustments ensures that the points are not related to a specific address, but are still close enough to allow insights about service delivery in different areas of the city.

    The weighted data are used by the ETC Institute in the final published PDF report.

    The 2023 Annual Community Survey report is available on data.tempe.gov or by visiting https://www.tempe.gov/government/strategic-management-and-innovation/signature-surveys-research-and-data

    The individual survey questions as well as the definition of the response scale (for example, 1 means “very dissatisfied” and 5 means “very satisfied”) are provided in the data dictionary.

    Additional Information
    Source: Community Attitude Survey
    Contact (author): Adam Samuels
    Contact E-Mail (author): Adam_Samuels@tempe.gov
    Contact (maintainer):
    Contact E-Mail (maintainer):
    Data Source Type: Excel table
    Preparation Method: Data received from vendor after report is completed
    Publish Frequency: Annual
    Publish Method: Manual
    Data Dictionary

  19. City of Tempe 2023 Community Survey Data

    • data.tempe.gov
    • data-academy.tempe.gov
    • +8 more
    Updated Jan 2, 2024
    Cite
    City of Tempe (2024). City of Tempe 2023 Community Survey Data [Dataset]. https://data.tempe.gov/maps/cacfb4bb56244552a6587fd2aa3fb06d
    Explore at:
    Dataset updated
    Jan 2, 2024
    Dataset authored and provided by
    City of Tempe
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Area covered
    Description

    These data include the individual responses for the City of Tempe Annual Community Survey conducted by ETC Institute. This dataset has two layers and includes both the weighted data and unweighted data. Weighting data is a statistical method in which datasets are adjusted through calculations in order to more accurately represent the population being studied. The weighted data are used in the final published PDF report.

    These data help determine priorities for the community as part of the City's on-going strategic planning process. Averaged Community Survey results are used as indicators for several city performance measures. The summary data for each performance measure is provided as an open dataset for that measure (separate from this dataset). The performance measures with indicators from the survey include the following (as of 2023):

    1. Safe and Secure Communities
    • 1.04 Fire Services Satisfaction
    • 1.06 Crime Reporting
    • 1.07 Police Services Satisfaction
    • 1.09 Victim of Crime
    • 1.10 Worry About Being a Victim
    • 1.11 Feeling Safe in City Facilities
    • 1.23 Feeling of Safety in Parks

    2. Strong Community Connections
    • 2.02 Customer Service Satisfaction
    • 2.04 City Website Satisfaction
    • 2.05 Online Services Satisfaction Rate
    • 2.15 Feeling Invited to Participate in City Decisions
    • 2.21 Satisfaction with Availability of City Information

    3. Quality of Life
    • 3.16 City Recreation, Arts, and Cultural Centers
    • 3.17 Community Services Programs
    • 3.19 Value of Special Events
    • 3.23 Right of Way Landscape Maintenance
    • 3.36 Quality of City Services

    4. Sustainable Growth & Development
    No Performance Measures in this category presently relate directly to the Community Survey

    5. Financial Stability & Vitality
    No Performance Measures in this category presently relate directly to the Community Survey

    Methods:
    The survey is mailed to a random sample of households in the City of Tempe. Follow up emails and texts are also sent to encourage participation. A link to the survey is provided with each communication. To prevent people who do not live in Tempe or who were not selected as part of the random sample from completing the survey, everyone who completed the survey was required to provide their address. These addresses were then matched to those used for the random representative sample. If the respondent’s address did not match, the response was not used. To better understand how services are being delivered across the city, individual results were mapped to determine overall distribution across the city. Additionally, demographic data were used to monitor the distribution of responses to ensure the responding population of each survey is representative of city population.

    Processing and Limitations:
    The location data in this dataset is generalized to the block level to protect privacy. This means that only the first two digits of an address are used to map the location. When the data are shared with the city, only the latitude/longitude of the block level address points are provided. This results in points that overlap. In order to better visualize the data, overlapping points were randomly dispersed to remove overlap. The result of these two adjustments ensures that the points are not related to a specific address, but are still close enough to allow insights about service delivery in different areas of the city.

    The weighted data are used by the ETC Institute in the final published PDF report.

    The 2023 Annual Community Survey report is available on data.tempe.gov or by visiting https://www.tempe.gov/government/strategic-management-and-innovation/signature-surveys-research-and-data

    The individual survey questions as well as the definition of the response scale (for example, 1 means “very dissatisfied” and 5 means “very satisfied”) are provided in the data dictionary.

    Additional Information
    Source: Community Attitude Survey
    Contact (author): Adam Samuels
    Contact E-Mail (author): Adam_Samuels@tempe.gov
    Contact (maintainer):
    Contact E-Mail (maintainer):
    Data Source Type: Excel table
    Preparation Method: Data received from vendor after report is completed
    Publish Frequency: Annual
    Publish Method: Manual
    Data Dictionary

  20. CommunitySurvey2023unweighted

    • catalog.data.gov
    • datasets.ai
    • +4 more
    Updated Sep 20, 2024
    + more versions
    Cite
    City of Tempe (2024). CommunitySurvey2023unweighted [Dataset]. https://catalog.data.gov/dataset/communitysurvey2023unweighted
    Explore at:
    Dataset updated
    Sep 20, 2024
    Dataset provided by
    City of Tempe
    Description

    These data include the individual responses for the City of Tempe Annual Community Survey conducted by ETC Institute. This dataset has two layers and includes both the weighted data and unweighted data. Weighting data is a statistical method in which datasets are adjusted through calculations in order to more accurately represent the population being studied. The weighted data are used in the final published PDF report.

    These data help determine priorities for the community as part of the City's on-going strategic planning process. Averaged Community Survey results are used as indicators for several city performance measures. The summary data for each performance measure is provided as an open dataset for that measure (separate from this dataset). The performance measures with indicators from the survey include the following (as of 2023):

    1. Safe and Secure Communities
    • 1.04 Fire Services Satisfaction
    • 1.06 Crime Reporting
    • 1.07 Police Services Satisfaction
    • 1.09 Victim of Crime
    • 1.10 Worry About Being a Victim
    • 1.11 Feeling Safe in City Facilities
    • 1.23 Feeling of Safety in Parks

    2. Strong Community Connections
    • 2.02 Customer Service Satisfaction
    • 2.04 City Website Satisfaction
    • 2.05 Online Services Satisfaction Rate
    • 2.15 Feeling Invited to Participate in City Decisions
    • 2.21 Satisfaction with Availability of City Information

    3. Quality of Life
    • 3.16 City Recreation, Arts, and Cultural Centers
    • 3.17 Community Services Programs
    • 3.19 Value of Special Events
    • 3.23 Right of Way Landscape Maintenance
    • 3.36 Quality of City Services

    4. Sustainable Growth & Development
    No Performance Measures in this category presently relate directly to the Community Survey

    5. Financial Stability & Vitality
    No Performance Measures in this category presently relate directly to the Community Survey

    Methods:
    The survey is mailed to a random sample of households in the City of Tempe. Follow up emails and texts are also sent to encourage participation. A link to the survey is provided with each communication. To prevent people who do not live in Tempe or who were not selected as part of the random sample from completing the survey, everyone who completed the survey was required to provide their address. These addresses were then matched to those used for the random representative sample. If the respondent’s address did not match, the response was not used. To better understand how services are being delivered across the city, individual results were mapped to determine overall distribution across the city. Additionally, demographic data were used to monitor the distribution of responses to ensure the responding population of each survey is representative of city population.

    Processing and Limitations:
    The location data in this dataset is generalized to the block level to protect privacy. This means that only the first two digits of an address are used to map the location. When the data are shared with the city, only the latitude/longitude of the block level address points are provided. This results in points that overlap. In order to better visualize the data, overlapping points were randomly dispersed to remove overlap. The result of these two adjustments ensures that the points are not related to a specific address, but are still close enough to allow insights about service delivery in different areas of the city.

    The weighted data are used by the ETC Institute in the final published PDF report.

    The 2023 Annual Community Survey report is available on data.tempe.gov or by visiting https://www.tempe.gov/government/strategic-management-and-innovation/signature-surveys-research-and-data

    The individual survey questions as well as the definition of the response scale (for example, 1 means “very dissatisfied” and 5 means “very satisfied”) are provided in the data dictionary.

    Additional Information
    Source: Community Attitude Survey
    Contact (author): Adam Samuels
    Contact E-Mail (author): Adam_Samuels@tempe.gov
    Contact (maintainer):
    Contact E-Mail (maintainer):
    Data Source Type: Excel table
    Preparation Method: Data received from vendor after report is completed
    Publish Frequency: Annual
    Publish Method: Manual
    Data Dictionary

Cite
Max Maass; Max Maass; Stephan Schwär; Stephan Schwär; Matthias Hollick; Matthias Hollick (2022). A Dataset of over 500.000 commercial email newsletters, as collected by PrivacyMail.info [Dataset]. http://doi.org/10.5281/zenodo.6509751

A Dataset of over 500.000 commercial email newsletters, as collected by PrivacyMail.info

Explore at:
Dataset updated
Jun 13, 2022
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Max Maass; Max Maass; Stephan Schwär; Stephan Schwär; Matthias Hollick; Matthias Hollick
Description

This dataset contains the data from roughly two years of operating PrivacyMail.info, an Open Source Email privacy measurement platform. It contains slightly over 500.000 commercial newsletters, as crowdsourced by users of PrivacyMail.info. You can find the methodology discussed in our paper: Max Maass, Stephan Schwär, and Matthias Hollick. "Towards transparency in email tracking." Annual Privacy Forum, 2019. The source code can be found on github.com/privacymail/privacymail

Please note that, due to its crowdsourced nature, this dataset is a sample of opportunity: it is not representative of all newsletters on the Internet, and likely contains biases based on how it was collected. Notably, German-language newsletters will likely be heavily over-represented.

Dataset Structure
The dataset is structured as follows: On the top level are folders describing the website the newsletter belongs to. Inside that folder are subfolders for each identity that was registered for that website. Inside each of these folders are a series of .eml files that represent the received email messages.
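
The folder layout described above (website, then identity, then .eml files) could be traversed roughly as in the sketch below. The function name `iter_messages`, the sample folder names, and the sample message are hypothetical; the sketch only assumes the three-level layout and uses Python's standard `email` package to read headers.

```python
import tempfile
from email import policy
from email.parser import BytesParser
from pathlib import Path


def iter_messages(root):
    """Yield (website, identity, subject) for every .eml file under root.

    Assumes the layout <root>/<website>/<identity>/<message>.eml.
    """
    for eml_path in sorted(Path(root).glob("*/*/*.eml")):
        website, identity = eml_path.parts[-3], eml_path.parts[-2]
        with eml_path.open("rb") as fh:
            # Only the headers are needed here, so skip parsing the body.
            msg = BytesParser(policy=policy.default).parse(fh, headersonly=True)
        yield website, identity, str(msg["Subject"])


# Build a tiny throwaway tree with one made-up message to exercise the walker.
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp) / "example.com" / "identity-1"
    folder.mkdir(parents=True)
    (folder / "0001.eml").write_bytes(
        b"From: news@example.com\r\nSubject: Hello\r\n\r\nBody\r\n"
    )
    rows = list(iter_messages(tmp))

print(rows)
```

Parsing headers only keeps the walk cheap on a collection of this size; dropping `headersonly=True` would give access to the message bodies as well.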

Copyright and Licensing
This dataset is set to non-public due to copyright concerns: The contents of the email messages are (presumably) protected by copyright in most jurisdictions. Most copyright doctrines contain exceptions for non-commercial research use - thus, we feel it is appropriate and acceptable to share the data on a case-by-case basis, the same way we did before shutting down PrivacyMail.info. When requesting access to the data, please briefly describe what research you want to conduct with it, and we will grant you access.

We thus do not put any explicit license on this dataset. Please do not share the raw data publicly. We request that you cite the above-mentioned paper and this dataset in any publications that result from it.
