🕵️ Advanced OSINT Public Profiles Dataset (Synthetic)
📄 Overview
This dataset contains 2,000 synthetic public profile records generated for open-source intelligence (OSINT) research, cybersecurity education, and red team simulation. It mimics realistic personal, professional, and breach-related information typically found through OSINT tools and techniques.
It is 100% synthetic — no real individuals or private data were used.
| Column Name | Description |
|---|---|
| Name | Full name of the synthetic individual |
| Username | Commonly used username |
| Email | Generated email address |
| Phone | Randomly formatted phone number |
| Twitter | Simulated Twitter profile link |
| LinkedIn | Simulated LinkedIn profile link |
| Domain | Domain name associated with the person |
| Location | City and country |
| Job_Title | Profession or role |
| Company | Employer or organization |
| IP_Address | Public IPv4 address |
| MAC_Address | Synthetic MAC address |
| Breached | Indicates whether their data was breached |
| Breach_Source | Known breach source (LinkedIn, Dropbox, etc.) |
| Breach_Year | Year of breach (if applicable) |
| Password_Strength | Simulated password strength: Weak, Moderate, or Strong |
| Public_Pastebin | Whether their data appeared on a pastebin (Yes/No) |
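As a rough sketch of how the columns above might be checked after download, the snippet below reads a CSV and verifies that every listed column is present. It assumes the dataset ships as a CSV file with these exact header names, which this page does not confirm. The helper names (`load_profiles`, `make_sample_csv`) and the sample row are hypothetical and exist only for illustration.

```python
import csv
import io

# Column names as listed in the table above.
EXPECTED_COLUMNS = [
    "Name", "Username", "Email", "Phone", "Twitter", "LinkedIn", "Domain",
    "Location", "Job_Title", "Company", "IP_Address", "MAC_Address",
    "Breached", "Breach_Source", "Breach_Year", "Password_Strength",
    "Public_Pastebin",
]


def load_profiles(handle):
    """Read all rows from an open CSV handle after checking the header."""
    reader = csv.DictReader(handle)
    present = reader.fieldnames or []
    missing = [name for name in EXPECTED_COLUMNS if name not in present]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    return list(reader)


def make_sample_csv():
    """Build an in-memory CSV with one invented row (not taken from the dataset)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(EXPECTED_COLUMNS)
    writer.writerow([
        "Jane Example", "jexample", "jane@example.com", "555-0100",
        "https://twitter.com/jexample", "https://linkedin.com/in/jexample",
        "example.com", "Springfield, USA", "Analyst", "Example Corp",
        "203.0.113.7", "00:00:5E:00:53:01", "Yes", "LinkedIn", "2016",
        "Weak", "Yes",
    ])
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    profiles = load_profiles(make_sample_csv())
    print(len(profiles), "row(s) loaded;", "first user:", profiles[0]["Username"])
```

To run this against the real file, pass an open file handle instead of the in-memory sample, for example `open(path, newline="", encoding="utf-8")`, where `path` is wherever you saved the download.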
🎯 Use Cases
You can use this dataset for:
✅ OSINT Reconnaissance Practice
✅ Identity Risk Scoring Systems
✅ Cybersecurity Education & Red Team Simulations
✅ NLP & Fuzzy Matching for Entity Resolution
✅ Network Graphs of Breached Users
✅ Training AI models for fake profile detection
✅ Demonstrating recon tools and dashboards
📌 License
This dataset is licensed under the Creative Commons CC0 1.0 Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/).
Feel free to use it in your academic projects, machine learning models, blogs, or demos — with or without attribution.
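To illustrate the identity risk scoring use case listed earlier, here is a minimal sketch that combines the Breached, Public_Pastebin, and Password_Strength columns into a single number. The weights are invented for this example and are not part of the dataset. The sketch also assumes that Breached uses Yes/No-style values like Public_Pastebin, which this page does not state.

```python
# Hypothetical weights, chosen only for illustration.
BREACHED_POINTS = 4
PASTEBIN_POINTS = 3
STRENGTH_POINTS = {"Weak": 3, "Moderate": 1, "Strong": 0}


def is_yes(value):
    """Treat common truthy spellings as 'yes' (the exact encoding is an assumption)."""
    return str(value).strip().lower() in {"yes", "true", "1"}


def risk_score(profile):
    """Return a score from 0 (lowest risk) to 10 (highest) for one profile row."""
    score = 0
    if is_yes(profile.get("Breached", "")):
        score += BREACHED_POINTS
    if is_yes(profile.get("Public_Pastebin", "")):
        score += PASTEBIN_POINTS
    strength = str(profile.get("Password_Strength", "")).strip()
    # Unrecognized or missing strength values contribute nothing.
    score += STRENGTH_POINTS.get(strength, 0)
    return score


if __name__ == "__main__":
    example = {"Breached": "Yes", "Public_Pastebin": "No", "Password_Strength": "Moderate"}
    print(risk_score(example))
```

An additive score keeps each column's contribution easy to inspect. A real scoring system would need weights justified by its own threat model.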