ODC Public Domain Dedication and Licence (PDDL) v1.0 http://www.opendatacommons.org/licenses/pddl/1.0/
License information was derived automatically
Database of IPv4 address networks with their respective geographical location.
Based on GeoLite2 Country Free Downloadable Databases as of Apr 21, 2015 http://dev.maxmind.com/geoip/geoip2/geolite2/...
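A database of this kind is typically queried by mapping an address to the network range that contains it. The following is a minimal sketch of that lookup, assuming the networks do not overlap. The sample rows use documentation-only address ranges and placeholder country codes; they are hypothetical and not taken from the database, and `build_index` and `lookup_country` are illustrative names.

```python
import bisect
import ipaddress

# Hypothetical sample rows: (CIDR network, placeholder country code).
# These are RFC 5737 documentation ranges, not rows from the real database.
SAMPLE_ROWS = [
    ("198.51.100.0/24", "BB"),
    ("192.0.2.0/24", "AA"),
    ("203.0.113.0/25", "CC"),
]

def build_index(rows):
    """Convert CIDR rows into sorted (start, end, country) integer ranges."""
    entries = []
    for cidr, country in rows:
        net = ipaddress.ip_network(cidr)
        entries.append((int(net.network_address), int(net.broadcast_address), country))
    entries.sort()
    starts = [entry[0] for entry in entries]
    return starts, entries

def lookup_country(index, address):
    """Binary-search the range whose start is <= address, then check its end.

    Assumes the networks do not overlap.
    """
    starts, entries = index
    ip = int(ipaddress.ip_address(address))
    pos = bisect.bisect_right(starts, ip) - 1
    if pos < 0:
        return None
    _start, end, country = entries[pos]
    return country if ip <= end else None

index = build_index(SAMPLE_ROWS)
print(lookup_country(index, "198.51.100.17"))
print(lookup_country(index, "203.0.113.200"))
```

Sorting the ranges once and using binary search keeps each lookup logarithmic in the number of networks, which matters when the table holds many thousands of rows.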
Open Database License (ODbL) v1.0 https://www.opendatacommons.org/licenses/odbl/1.0/
License information was derived automatically
These e-commerce website logs were created to help data analysts practice exploratory data analysis and data visualization. The dataset records when the website was accessed, the source IP address, the country, the language in which the website was accessed, and the sales amount attributed to that IP address.
Included columns:
Time and duration of access to the website
Country, language and platform in which it was accessed
Number of bytes used and IP address of the person accessing the website
Sales or return amount for that person
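As a starting point for exploratory analysis, the columns above can be aggregated per country. This is a minimal sketch using only the standard library; the column names (`country`, `sales`, `returned`, and the rest) and the sample rows are hypothetical, since the actual header names are not given here.

```python
import csv
import io
from collections import defaultdict

# Hypothetical header names and rows, invented for illustration only.
SAMPLE_LOG = """\
accessed_at,duration_s,country,language,platform,bytes,ip,sales,returned
2019-03-01 10:15:00,120,CA,English,Chrome,2048,192.0.2.10,100.50,0.00
2019-03-01 11:40:00,45,CA,French,Safari,1024,192.0.2.11,20.00,5.00
2019-03-02 09:05:00,300,IN,English,Android,4096,198.51.100.7,50.25,50.25
"""

def net_sales_by_country(csv_text):
    """Sum (sales - returned) per country from CSV text."""
    totals = defaultdict(float)
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row["country"]] += float(row["sales"]) - float(row["returned"])
    return dict(totals)

print(net_sales_by_country(SAMPLE_LOG))
```

The same grouping pattern applies to language or platform by changing the key used in `totals`.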
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Dataset Summary
- Timespan: 2019-01-01 to 2019-12-31
- Granularity: 1-hour disjoint time windows
- # of characteristics observed: 9
- Hosts observed: 65536
- Labels: included
- Unzipped volume: approx. 10 GB
Dataset Origins
The dataset was collected over the whole of 2019. The observation points for the collection of IP flows were located at the borders of the university campus network. The campus network has a /16 CIDR IPv4 range at its disposal and contains various network segments, from segments connecting dormitories, through server segments, to a segment containing the workstations of university administrative staff. A host in the dataset is identified by its source IPv4 address.
Variables
The dataset contains the following variables:
Aggregations - sums of the individual variables over a one-hour interval:
- # of flows - number of flows for a given source IP
- # of packets - number of packets for a given source IP
- # of bytes - number of bytes for a given source IP
- flow duration - average flow duration in seconds
Distinct Counts - count of distinct values for each variable over a one-hour window:
- # of peers - number of distinct communication peers for a given source IP
- # of ports - number of distinct destination ports for a given source IP
- # of protocols - number of distinct communication protocols for a given source IP
- # of AS numbers - number of distinct destination AS numbers for a given source IP
- # of countries - number of distinct destination countries for a given source IP
Dataset Structure
Dataset Files - each variable is contained in one comma-separated values (.csv) file:
- Row index - timestamp of the observation window (8760 rows)
- Columns index - anonymized IP addresses (65536 columns)
Label File - contains labels of the individual IP addresses from the Dataset Files:
- Row index - anonymized IP addresses (65536 rows)
- Columns index - labels for the IP addresses:
  * Subnet - ID of a subnet; hosts belonging to the same subnet have the same ID
  * Subnet_range - CIDR range of a subnet
  * Unit - ID of the administrative unit owning the network range
  * Sub-unit - ID of the administrative sub-unit owning the network range
  * Subnet_label - subnet label:
    Servers - selected subnets containing mostly servers (133.250.178.0/24, 133.250.163.0/24)
    Workstations - selected subnets containing mostly workstations (133.250.146.0/24, 133.250.157.128/25)
Further notes
N/A values:
- Variables - an N/A value means that the host did not communicate in the given observation window
- Labels - an N/A value means that no additional information on this IP is available
Dataset load
df = pd.read_csv(
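The figures and subnet labels in this description can be cross-checked with a short sketch. The labelled CIDR ranges are the ones listed above; the campus range `133.250.0.0/16` is an assumption inferred from those (anonymized) subnets and is not stated in the description, and `subnet_label` is a hypothetical helper name.

```python
import ipaddress

# Subnet labels and CIDR ranges as listed in the dataset description.
LABELLED_SUBNETS = {
    "Servers": ["133.250.178.0/24", "133.250.163.0/24"],
    "Workstations": ["133.250.146.0/24", "133.250.157.128/25"],
}

# Assumption: the /16 that contains the listed subnets is the campus range.
CAMPUS_RANGE = ipaddress.ip_network("133.250.0.0/16")

def subnet_label(address):
    """Return 'Servers', 'Workstations', or None for an unlabelled host."""
    ip = ipaddress.ip_address(address)
    for label, cidrs in LABELLED_SUBNETS.items():
        if any(ip in ipaddress.ip_network(cidr) for cidr in cidrs):
            return label
    return None

# A /16 range holds 2**16 addresses, matching the 65536 observed hosts.
print(CAMPUS_RANGE.num_addresses)
# One row per hour over a non-leap year, matching the 8760 rows.
print(365 * 24)
print(subnet_label("133.250.178.20"))
print(subnet_label("133.250.157.5"))
```

Note that `133.250.157.5` falls outside the `/25` workstation range (which starts at `.128`), so the helper returns `None` for it.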
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
These results include tables of contents in CSV format of patents, corresponding to the substance Policosanol and its six main extraction methods, in the world and in Ibero-American countries, prospected from Advanced Search and IP Business Intelligence of Orbit seekers (version 1.9.8)® (Orbit, 2013, 2015; Orbit Intelligence, 2021, 2022; Winson, 2013).
No results were found for the Ibero-American countries in technological route 5, "Molecular Distillation", up to October 1, 2024, which is why the corresponding PDF and CSV files are absent.
The prospecting reports contain the following information:
The original format of the report generated by the database was preserved.
So that readers can consult the results by individual or combined interest, the PDF files are provided separately and also combined into a single document (12,200 pages). In the combined file, the worldwide results come first, followed by those for the Ibero-American countries; both follow the numbering order of the prospected routes and are separated by their respective summaries of contents.
This places data product gives you direct access to real-time and historical geolocation signals, allowing you to analyze human movement around POIs and custom-defined areas of interest. The Irys Location API lets you query anonymized mobile GPS events using precise polygons, supporting both high-frequency live monitoring and deep historical backfill.
Each data point includes:
- Device ID (IDFA/GAID)
- GPS coordinates (lat/lon)
- Timestamps (epoch & date)
- Country codes
- Horizontal accuracy (85% fill rate)
- Optional IP addresses, carrier, and device metadata
Irys supports output in Parquet, CSV, or JSON, and offers hourly or daily delivery via API, AWS S3, or Google Cloud. Events are typically available with 95% coverage within a 3-day lag, and historical data goes back to September 2022.
Flexible Features:
- Query by polygon (up to 10,000 tiles per query)
- Credit-based pricing system, scalable from test to production
- Optional schema customization & delivery folder structure
- Fully GDPR and CCPA compliant
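To illustrate what filtering GPS events by polygon involves, here is a minimal local sketch of a point-in-polygon test using the ray-casting (even-odd) rule. It is not the Irys API, whose request format is not described here; the event field names and sample coordinates are hypothetical, and coordinates are treated as planar, which is only reasonable for small areas that do not cross the antimeridian.

```python
def point_in_polygon(lat, lon, polygon):
    """Even-odd ray casting; polygon is a list of (lat, lon) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        lat1, lon1 = polygon[i]
        lat2, lon2 = polygon[(i + 1) % n]
        # Does this edge straddle the line of constant latitude through the point?
        if (lat1 > lat) != (lat2 > lat):
            # Longitude at which the edge crosses that line.
            cross_lon = lon1 + (lat - lat1) * (lon2 - lon1) / (lat2 - lat1)
            if lon < cross_lon:
                inside = not inside
    return inside

# Hypothetical area of interest and events, invented for illustration.
AREA = [(0.0, 0.0), (0.0, 10.0), (10.0, 10.0), (10.0, 0.0)]
EVENTS = [
    {"device_id": "dev-1", "lat": 5.0, "lon": 5.0},
    {"device_id": "dev-2", "lat": 5.0, "lon": 15.0},
    {"device_id": "dev-3", "lat": 9.5, "lon": 0.5},
]

matching = [e["device_id"] for e in EVENTS
            if point_in_polygon(e["lat"], e["lon"], AREA)]
print(matching)
```

Points that lie exactly on an edge may be classified either way by this rule, so a production filter would need an explicit boundary policy.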
Access GPS-based location data across major countries in Africa and the Middle East using Irys’s API-based geolocation feed. Query real-time or historical movement patterns using flexible polygon filters to retrieve anonymized GPS events.
This data feed includes timestamps, coordinates, country code, device ID, and IP metadata. It supports custom delivery formats (CSV, JSON, Parquet) and endpoints (API, AWS, GCP). Events are delivered within 3 days of capture and are available dating back to September 2024.
Ideal for infrastructure modeling, humanitarian response, public security, and transport analysis, this feed enables teams to visualize and respond to population trends in rapidly changing regions.
Fully GDPR/CCPA compliant, and adaptable to your infrastructure.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
AMLNet - Synthetic Anti-Money Laundering Transaction Dataset
DESCRIPTION:
This dataset contains over 1 million synthetic financial transactions (1,048,575) generated using the AMLNet framework for anti-money laundering research.
CONTENTS:
- 1,047,028 legitimate transactions across 5 categories
- 1,547 labeled money laundering transactions (0.15%)
- 192-day simulation period
- AUSTRAC-compliant suspicious patterns
DATA FORMAT:
CSV file with 16 columns containing transaction details:
CORE TRANSACTION DATA:
- step: Sequential transaction step/ID
- type: Payment method (TRANSFER, OSKO, BPAY, EFTPOS, DEBIT, NPP)
- amount: Transaction amount in AUD (positive/negative values)
- category: Transaction category (Housing, Food, Transport, Recreation, Other)
- nameOrig: Originating customer ID (e.g., C3511)
- nameDest: Destination customer/merchant ID (e.g., C4945, M558)
- oldbalanceOrg: Account balance before transaction
- newbalanceOrig: Account balance after transaction
LABELS:
- isFraud: Binary fraud indicator (0=legitimate, 1=fraudulent)
- isMoneyLaundering: Binary AML label (0=normal, 1=suspicious)
- fraud_probability: Calculated fraud risk score
TEMPORAL FEATURES:
- hour: Hour of transaction (0-23)
- day_of_week: Day of week (1=Monday, 7=Sunday)
- day_of_month: Day of month (1-31)
- month: Month number (1-12)
METADATA:
- metadata: JSON object containing:
* timestamp: Exact transaction datetime
* location: City, state, country, postcode
* device_info: Device type, OS, IP address
* payment_method: Specific payment method used
* merchant_info: Merchant details (if applicable)
* risk_indicators: Comprehensive risk scoring metrics
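Because the metadata column holds a JSON object inside a CSV field, it needs a second parsing step after the row is read. The sketch below shows one way to do this with the standard library. The top-level column names come from the list above, but the nested layout of `location` and `device_info`, the sample values, and the helper names are assumptions made for illustration.

```python
import csv
import io
import json

def make_sample_csv():
    """Build a tiny CSV with a JSON metadata column (hypothetical values)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["step", "type", "amount", "isMoneyLaundering", "metadata"])
    writer.writerow([1, "OSKO", "-42.50", 0, json.dumps(
        {"location": {"city": "Brisbane"}, "device_info": {"type": "mobile"}})])
    writer.writerow([2, "TRANSFER", "-9500.00", 1, json.dumps(
        {"location": {"city": "Sydney"}, "device_info": {"type": "desktop"}})])
    return buffer.getvalue()

def load_transactions(csv_text):
    """Parse rows, converting numeric fields and decoding the JSON metadata."""
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        row["step"] = int(row["step"])
        row["amount"] = float(row["amount"])
        row["isMoneyLaundering"] = int(row["isMoneyLaundering"])
        row["metadata"] = json.loads(row["metadata"])
        rows.append(row)
    return rows

transactions = load_transactions(make_sample_csv())
flagged = [t for t in transactions if t["isMoneyLaundering"] == 1]
print(flagged[0]["metadata"]["location"]["city"])
```

Writing the sample with `csv.writer` ensures the embedded JSON is quoted correctly, which is the same quoting a reader has to undo when loading the real file.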
DATASET STATISTICS:
- Total transactions: 1,048,575 (1M+)
- Legitimate transactions: 1,047,028 (99.85%)
- Money laundering transactions: 1,547 (0.15%)
- CSV file rows: 1,048,576 (including header row)
- Payment types: 6 different methods
- Transaction categories: 5 main categories
- Time period: 192-day simulation
- Geographic coverage: Australian cities and postcodes
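The class proportions above follow directly from the transaction counts. A short check of the arithmetic, using only the figures stated in this description, also makes the degree of class imbalance explicit:

```python
# Counts as stated in the dataset statistics.
legitimate = 1_047_028
laundering = 1_547
total = legitimate + laundering

laundering_pct = round(100 * laundering / total, 2)
legitimate_pct = round(100 * legitimate / total, 2)
# Number of legitimate transactions per laundering transaction.
imbalance_ratio = legitimate / laundering

print(total, laundering_pct, legitimate_pct)
print(f"about {imbalance_ratio:.0f} legitimate transactions per laundering one")
```

With an imbalance of this size, plain accuracy is a poor evaluation metric, since labelling every transaction as legitimate would already score 99.85%.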
USAGE:
This dataset is designed for:
- Anti-money laundering research and algorithm development
- Financial fraud detection benchmarking
- Machine learning model training and validation
- Academic research in financial crime detection
- Commercial AML system development and testing
Licensed under CC BY 4.0. Free to use for any purpose with proper attribution.
See LICENSE.txt for full terms.
CITATION:
If you use this dataset, please cite:
Huda, S., Foo, E., Jadidi, Z., Newton, M.A.H., & Sattar, A. (2025).
AMLNet: A Knowledge-Based Multi-Agent Framework to Generate and Detect
Realistic Money Laundering Transactions. Expert Systems with Applications.
CONTACT:
s.huda@griffith.edu.au
VERSION: 1.0
DATE: July 2025