7 datasets found
  1. IPv4 geolocation

    • datahub.io
    Updated Sep 1, 2017
    Cite
    (2017). IPv4 geolocation [Dataset]. https://datahub.io/core/geoip2-ipv4
    Dataset updated
    Sep 1, 2017
    License

    ODC Public Domain Dedication and Licence (PDDL) v1.0 (http://www.opendatacommons.org/licenses/pddl/1.0/)
    License information was derived automatically

    Description

    Database of IPv4 address networks with their respective geographical location.

    Based on GeoLite2 Country Free Downloadable Databases as of Apr 21, 2015 http://dev.maxmind.com/geoip/geoip2/geolite2/...
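
A database of networks with locations is typically queried by finding the network that contains a given address. The following is a minimal sketch of that lookup using Python's standard `ipaddress` module. The sample rows, the two-letter placeholder codes, and the function names `build_index` and `lookup` are assumptions for illustration; the listing does not show the dataset's actual columns.

```python
import ipaddress

# Hypothetical sample rows (network, country code). These use documentation
# address ranges and placeholder codes, not values from the real dataset.
SAMPLE_NETWORKS = [
    ("192.0.2.0/24", "AA"),
    ("198.51.100.0/24", "BB"),
    ("203.0.113.0/25", "CC"),
]


def build_index(rows):
    """Parse each CIDR string once so lookups do not re-parse them."""
    return [(ipaddress.ip_network(cidr), country) for cidr, country in rows]


def lookup(index, address):
    """Return the country of the most specific matching network, or None."""
    ip = ipaddress.ip_address(address)
    matches = [(net, country) for net, country in index if ip in net]
    if not matches:
        return None
    # Prefer the longest prefix, i.e. the most specific network.
    return max(matches, key=lambda match: match[0].prefixlen)[1]


index = build_index(SAMPLE_NETWORKS)
print(lookup(index, "198.51.100.7"))
print(lookup(index, "203.0.113.200"))  # falls outside the /25 above
```

A linear scan is enough for a sketch; a full-size table would more likely use a sorted list with binary search or a prefix trie.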

  2. E-Commerce Website Logs

    • kaggle.com
    Updated Dec 15, 2023
    Cite
    KZ Data Lover (2023). E-Commerce Website Logs [Dataset]. https://www.kaggle.com/datasets/kzmontage/e-commerce-website-logs
    Dataset updated
    Dec 15, 2023
    Dataset provided by
    Kaggle
    Authors
    KZ Data Lover
    License

    Open Database License (ODbL) v1.0 (https://www.opendatacommons.org/licenses/odbl/1.0/)
    License information was derived automatically

    Description

    This is an e-commerce website log dataset created to help data analysts practice exploratory data analysis and data visualization. It records when the website was accessed, the source IP address, the country, the language in which the website was accessed, and the amount of sales made by that IP address.

    Included columns:

    Time and duration of website access
    Country, language, and platform from which it was accessed
    Number of bytes used and IP address of the visitor
    Sales or return amount for that visitor
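
As an example of the exploratory analysis this dataset is meant for, the sketch below totals sales per country using only the standard library. The column names, the sample rows, and the convention that a return appears as a negative amount are all assumptions; the listing does not give the actual header.

```python
import csv
import io
from collections import defaultdict

# Hypothetical header and rows; the real file's column names may differ.
SAMPLE_LOG = """accessed_at,duration_secs,country,language,platform,bytes,ip,sales
2023-01-01 10:00:00,120,IT,Italian,Chrome,2048,192.0.2.1,50.0
2023-01-01 10:05:00,300,IT,Italian,Safari,4096,192.0.2.2,-20.0
2023-01-01 11:00:00,60,US,English,Chrome,1024,198.51.100.3,75.5
"""


def net_sales_by_country(handle):
    """Sum the sales column per country (returns assumed to be negative)."""
    totals = defaultdict(float)
    for row in csv.DictReader(handle):
        totals[row["country"]] += float(row["sales"])
    return dict(totals)


totals = net_sales_by_country(io.StringIO(SAMPLE_LOG))
print(totals)
```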

  3. Host Network Traffic 2019

    • data.europa.eu
    • data.niaid.nih.gov
    Updated Jul 3, 2025
    Cite
    Zenodo (2025). Host Network Traffic 2019 [Dataset]. https://data.europa.eu/data/datasets/oai-zenodo-org-3799932?locale=da
    Dataset updated
    Jul 3, 2025
    Dataset authored and provided by
    Zenodo (http://zenodo.org/)
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Dataset Summary
    - Timespan: 2019-01-01 : 2019-12-31
    - Granularity: 1-hour disjoint time windows
    - # of characteristics observed: 9
    - Hosts observed: 65536
    - Labels: included
    - Unzipped volume: approx. 10 GB

    Dataset Origins
    The dataset was collected over the whole year 2019. The observation points for the collection of IP flows were located at the borders of the university campus network. The campus network has a /16 CIDR IPv4 range at its disposal and contains various network segments, ranging from segments connecting dormitories, through server segments, to a segment containing the workstations of university administrative staff. A host in the dataset is identified by its source IPv4 address.

    Variables
    The dataset contains the following variables:
    Aggregations - sums of the individual variables over a one-hour interval:
    - # of flows - number of flows for a given source IP
    - # of packets - number of packets for a given source IP
    - # of bytes - number of bytes for a given source IP
    - flow duration - average flow duration in seconds
    Distinct Counts - count of distinct values for each variable over a one-hour window:
    - # of peers - number of distinct communication peers for a given source IP
    - # of ports - number of distinct destination ports for a given source IP
    - # of protocols - number of distinct communication protocols for a given source IP
    - # of AS numbers - number of distinct destination AS numbers for a given source IP
    - # of countries - number of distinct destination countries for a given source IP

    Dataset Structure
    Dataset Files - each variable is contained in one comma-separated values (.csv) file:
    - Row index - timestamp of the observation window (8760 rows)
    - Columns index - anonymized IP addresses (65536 columns)
    Label File - contains labels of the individual IP addresses from the Dataset Files:
    - Row index - anonymized IP addresses (65536 rows)
    - Columns index - labels for the IP addresses:
      * Subnet - ID of a subnet; hosts belonging to the same subnet have the same ID
      * Subnet_range - CIDR range of a subnet
      * Unit - an ID of the administrative unit owning the network range
      * Sub-unit - an ID of the administrative sub-unit owning the network range
      * Subnet_label - subnet label:
        Servers - selected subnets containing mostly servers (133.250.178.0/24, 133.250.163.0/24)
        Workstations - selected subnets containing mostly workstations (133.250.146.0/24, 133.250.157.128/25)

    Further notes
    N/A values:
    - Variables - in a given observation window, the host did not communicate
    - Labels - no additional information on this IP is available

    Dataset load
    df = pd.read_csv(
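
The dimensions quoted in this description follow from simple arithmetic, and the labelled subnets can be tested with the standard `ipaddress` module. The sketch below is illustrative only: the helper `subnet_label` is a hypothetical name, and it checks only the four ranges quoted in the description, not the dataset's label file.

```python
import ipaddress

# 2019 is not a leap year, so one-hour windows give 365 * 24 rows.
HOURS_IN_2019 = 365 * 24

# A /16 network leaves 32 - 16 = 16 host bits.
CAMPUS_HOSTS = 2 ** (32 - 16)

# Ranges quoted in the description for the two subnet labels.
SERVER_SUBNETS = [
    ipaddress.ip_network(cidr)
    for cidr in ("133.250.178.0/24", "133.250.163.0/24")
]
WORKSTATION_SUBNETS = [
    ipaddress.ip_network(cidr)
    for cidr in ("133.250.146.0/24", "133.250.157.128/25")
]


def subnet_label(address):
    """Return 'Servers', 'Workstations', or None for an address."""
    ip = ipaddress.ip_address(address)
    if any(ip in net for net in SERVER_SUBNETS):
        return "Servers"
    if any(ip in net for net in WORKSTATION_SUBNETS):
        return "Workstations"
    return None


print(HOURS_IN_2019, CAMPUS_HOSTS)
print(subnet_label("133.250.157.200"))  # inside 133.250.157.128/25
print(subnet_label("133.250.157.5"))    # below .128, so not in the /25
```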

  4. Data from: Policosanol extraction patents in six extraction routes from IP...

    • zenodo.org
    • portaldelaciencia.uva.es
    Updated Oct 20, 2024
    Cite
    Rilton G. B. Primo; Díaz de los Ríos Manuel; Ricardo de Araújo Kalid (2024). Policosanol extraction patents in six extraction routes from IP Business Intelligence of Orbit seekers (version 1.9.8)®: World and Ibero-American countries (CSV Format). [Dataset]. http://doi.org/10.5281/zenodo.12594370
    Dataset updated
    Oct 20, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Rilton G. B. Primo; Díaz de los Ríos Manuel; Ricardo de Araújo Kalid
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    These results include tables of contents in CSV format of patents, corresponding to the substance Policosanol and its six main extraction methods, in the world and in Ibero-American countries, prospected from Advanced Search and IP Business Intelligence of Orbit seekers (version 1.9.8)® (Orbit, 2013, 2015; Orbit Intelligence, 2021, 2022; Winson, 2013).

    No results were found for the Ibero-American countries in technological route 5, "Molecular Distillation", up to October 1, 2024, which is why the corresponding PDF and CSV files are absent.

    The prospecting reports contain the following information:

    1. Priority numbers
    2. Application number
    3. Publication numbers
    4. Family numbers
    5. Priority dates
    6. Application dates
    7. Publication dates
    8. Grant date
    9. Expected expiry dates
    10. Titles
    11. Abstracts
    12. Assignees
    13. Inventors
    14. Representatives
    15. Key content
    16. Technical concepts
    17. Images
    18. Claims
    19. Description
    20. Keywords in context
    21. Technology domains
    22. Cooperative classification
    23. International classification
    24. US patent classification
    25. Japanese classifications
    26. Citing patents (forward citations)
    27. Citing patents - Raw information
    28. Cited patents (backward citations)
    29. Cited patents - Raw information
    30. Cited non-patent literature
    31. Legal status
    32. Legal actions
    33. Designated states
    34. Protected authorities
    35. Earliest countries
    36. Assignee country
    37. Representative country
    38. Inventor country
    39. Cited patents (backward citations) - Counts
    40. Citing patents (forward citations) - Counts
    41. Independent claims - Counts
    42. Dependent claims - Counts
    43. All claims - Counts
    44. Assignees - Counts
    45. Inventors - Counts
    46. Age - Counts
    47. Words in document - Questel semantic count
    48. Words in abstract - Questel semantic count
    49. Words in description - Questel semantic count
    50. WO and EP designated states - Counts
    51. Drawings - Counts
    52. US specific - Counts
    53. Licensing agreements - Counts
    54. Number of figures
    55. WO parent
    56. Number of priorities
    57. Licensing interest names
    58. Security interest names
    59. Referenced drug names
    60. WO parent
    61. EP parent
    62. Filing notes
    63. Filing details
    64. Filing language
    65. Litigation
    66. Standards
    67. Licence ID

    The original format of the report generated by the database was preserved.

    So that the material can be queried by specific or combined interests, the PDF files are provided separately and also gathered in a single document of 12,200 pages. The aggregated file presents the worldwide results first and the Ibero-American results second, each following the numbering order of the prospected routes and separated by their respective summaries of contents.

  5. Places Data | Global | Real-Time Geolocation API with Polygon Queries

    • datarade.ai
    Cite
    Irys, Places Data | Global | Real-Time Geolocation API with Polygon Queries [Dataset]. https://datarade.ai/data-products/irys-geospatial-data-insights-global-real-time-histor-irys
    Available download formats: .json, .csv, .xls, .sql
    Dataset authored and provided by
    Irys
    Area covered
    Trinidad and Tobago, Venezuela (Bolivarian Republic of), Jordan, Kazakhstan, Paraguay, Liberia, Finland, Micronesia (Federated States of), South Africa, Namibia
    Description

    This places data product gives you direct access to real-time and historical geolocation signals, allowing you to analyze human movement around POIs and custom-defined areas of interest. The Irys Location API lets you query anonymized mobile GPS events using precise polygons, supporting both high-frequency live monitoring and deep historical backfill.

    Each data point includes:

    - Device ID (IDFA/GAID)
    - GPS coordinates (lat/lon)
    - Timestamps (epoch & date)
    - Country codes
    - Horizontal accuracy (85% fill rate)
    - Optional IP addresses, carrier, and device metadata

    Irys supports output in Parquet, CSV, or JSON, and offers hourly or daily delivery via API, AWS S3, or Google Cloud. Events are typically available with 95% coverage within a 3-day lag, and historical data goes back to September 2022.

    Flexible Features:

    - Query by polygon (up to 10,000 tiles per query)
    - Credit-based pricing system, scalable from test to production
    - Optional schema customization & delivery folder structure
    - Fully GDPR and CCPA compliant
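
To make the idea of a polygon query concrete, the sketch below filters GPS events with a plain ray-casting point-in-polygon test. It is not the Irys API, which the listing does not document; the event field names, the sample values, and the function names are assumptions. The test treats coordinates as planar, so it is only a rough illustration for small areas away from the antimeridian.

```python
def point_in_polygon(lat, lon, polygon):
    """Ray-casting test; polygon is a list of (lat, lon) vertices."""
    inside = False
    count = len(polygon)
    for i in range(count):
        lat1, lon1 = polygon[i]
        lat2, lon2 = polygon[(i + 1) % count]
        # Only edges that straddle the point's latitude can be crossed.
        if (lat1 > lat) != (lat2 > lat):
            # Longitude at which this edge crosses the point's latitude.
            cross_lon = lon1 + (lat - lat1) * (lon2 - lon1) / (lat2 - lat1)
            if lon < cross_lon:
                inside = not inside
    return inside


def filter_events(events, polygon):
    """Keep the events whose coordinates fall inside the polygon."""
    return [e for e in events if point_in_polygon(e["lat"], e["lon"], polygon)]


# Hypothetical area of interest and events (field names are assumptions).
AREA = [(0.0, 0.0), (0.0, 10.0), (10.0, 10.0), (10.0, 0.0)]
EVENTS = [
    {"device_id": "a", "lat": 5.0, "lon": 5.0, "timestamp": 1700000000},
    {"device_id": "b", "lat": 5.0, "lon": 15.0, "timestamp": 1700000060},
]

selected = filter_events(EVENTS, AREA)
print([e["device_id"] for e in selected])
```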

  6. Location Data | Africa & Middle East | Real-Time API Geolocation Feed

    • datarade.ai
    Updated Aug 23, 2023
    Cite
    Irys (2023). Location Data | Africa & Middle East | Real-Time API Geolocation Feed [Dataset]. https://datarade.ai/data-products/location-data-africa-middle-east-real-time-api-geolocat-irys
    Available download formats: .json, .csv, .xls, .sql
    Dataset updated
    Aug 23, 2023
    Dataset authored and provided by
    Irys
    Area covered
    Africa
    Description

    Access GPS-based location data across major countries in Africa and the Middle East using Irys’s API-based geolocation feed. Query real-time or historical movement patterns using flexible polygon filters to retrieve anonymized GPS events.

    This data feed includes timestamps, coordinates, country code, device ID, and IP metadata. It supports custom delivery formats (CSV, JSON, Parquet) and endpoints (API, AWS, GCP). Events are delivered within 3 days of capture and are available dating back to September 2024.

    Ideal for infrastructure modeling, humanitarian response, public security, and transport analysis, this feed enables teams to visualize and respond to population trends in rapidly changing regions.

    Fully GDPR/CCPA compliant, and adaptable to your infrastructure.

  7. AMLNet - Synthetic Anti-Money Laundering Transaction Dataset

    • zenodo.org
    Updated Jul 27, 2025
    Cite
    Sabin Huda (2025). AMLNet - Synthetic Anti-Money Laundering Transaction Dataset [Dataset]. http://doi.org/10.5281/zenodo.16482144
    Dataset updated
    Jul 27, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Sabin Huda
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    AMLNet - Synthetic Anti-Money Laundering Transaction Dataset

    DESCRIPTION:
    This dataset contains over 1 million synthetic financial transactions (1,048,575) generated using the AMLNet framework for anti-money laundering research.

    CONTENTS:
    - 1,047,028 legitimate transactions across 5 categories
    - 1,547 labeled money laundering transactions (0.15%)
    - 192-day simulation period
    - AUSTRAC-compliant suspicious patterns

    DATA FORMAT:
    CSV file with 16 columns containing transaction details:

    CORE TRANSACTION DATA:
    - step: Sequential transaction step/ID
    - type: Payment method (TRANSFER, OSKO, BPAY, EFTPOS, DEBIT, NPP)
    - amount: Transaction amount in AUD (positive/negative values)
    - category: Transaction category (Housing, Food, Transport, Recreation, Other)
    - nameOrig: Originating customer ID (e.g., C3511)
    - nameDest: Destination customer/merchant ID (e.g., C4945, M558)
    - oldbalanceOrg: Account balance before transaction
    - newbalanceOrig: Account balance after transaction

    LABELS:
    - isFraud: Binary fraud indicator (0=legitimate, 1=fraudulent)
    - isMoneyLaundering: Binary AML label (0=normal, 1=suspicious)
    - fraud_probability: Calculated fraud risk score

    TEMPORAL FEATURES:
    - hour: Hour of transaction (0-23)
    - day_of_week: Day of week (1=Monday, 7=Sunday)
    - day_of_month: Day of month (1-31)
    - month: Month number (1-12)

    METADATA:
    - metadata: JSON object containing:
      * timestamp: Exact transaction datetime
      * location: City, state, country, postcode
      * device_info: Device type, OS, IP address
      * payment_method: Specific payment method used
      * merchant_info: Merchant details (if applicable)
      * risk_indicators: Comprehensive risk scoring metrics
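
Because the `metadata` column holds a JSON object inside a CSV field, reading the file takes two parsing steps. The sketch below shows one way to do that with the standard library. The 16 column names come from the description above, but the sample row, its values, and the nested key `ip_address` are assumptions, as the description does not show the exact JSON layout.

```python
import csv
import io
import json

# The 16 columns named in the description.
COLUMNS = [
    "step", "type", "amount", "category", "nameOrig", "nameDest",
    "oldbalanceOrg", "newbalanceOrig", "isFraud", "isMoneyLaundering",
    "fraud_probability", "hour", "day_of_week", "day_of_month", "month",
    "metadata",
]

# Hypothetical sample row; nested metadata keys are assumptions.
sample_metadata = {
    "timestamp": "2025-01-01T09:30:00",
    "device_info": {"ip_address": "192.0.2.10"},
}
sample_row = [
    1, "OSKO", "-120.50", "Food", "C3511", "M558", "1000.00", "879.50",
    0, 0, "0.01", 9, 3, 1, 1, json.dumps(sample_metadata),
]

# Write the sample through csv.writer so the JSON field is quoted correctly.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(COLUMNS)
writer.writerow(sample_row)
buffer.seek(0)


def read_transactions(handle):
    """Yield rows with numeric fields converted and metadata decoded."""
    for row in csv.DictReader(handle):
        row["amount"] = float(row["amount"])
        row["isMoneyLaundering"] = int(row["isMoneyLaundering"])
        row["metadata"] = json.loads(row["metadata"])
        yield row


transactions = list(read_transactions(buffer))
print(transactions[0]["metadata"]["device_info"]["ip_address"])
```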

    DATASET STATISTICS:
    - Total transactions: 1,048,575 (1M+)
    - Legitimate transactions: 1,047,028 (99.85%)
    - Money laundering transactions: 1,547 (0.15%)
    - CSV file rows: 1,048,576 (including header row)
    - Payment types: 6 different methods
    - Transaction categories: 5 main categories
    - Time period: 192-day simulation
    - Geographic coverage: Australian cities and postcodes
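
The counts and percentages above can be cross-checked with a few lines of arithmetic. This sketch uses only the figures quoted in the description; the variable names are illustrative.

```python
# Figures quoted in the dataset description.
total = 1_048_575
legitimate = 1_047_028
laundering = 1_547

# The two classes should add up to the stated total.
classes_sum = legitimate + laundering

# Shares in percent, rounded to two decimals as in the description.
laundering_pct = round(laundering / total * 100, 2)
legitimate_pct = round(legitimate / total * 100, 2)

# One extra row in the CSV file for the header.
csv_rows = total + 1

print(classes_sum, laundering_pct, legitimate_pct, csv_rows)
```

A positive class this rare usually calls for stratified splits and metrics such as precision and recall rather than plain accuracy.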

    USAGE:
    This dataset is designed for:
    - Anti-money laundering research and algorithm development
    - Financial fraud detection benchmarking
    - Machine learning model training and validation
    - Academic research in financial crime detection
    - Commercial AML system development and testing

    Licensed under CC BY 4.0. Free to use for any purpose with proper attribution.
    See LICENSE.txt for full terms.

    CITATION:
    If you use this dataset, please cite:
    Huda, S., Foo, E., Jadidi, Z., Newton, M.A.H., & Sattar, A. (2025).
    AMLNet: A Knowledge-Based Multi-Agent Framework to Generate and Detect
    Realistic Money Laundering Transactions. Expert Systems with Applications.

    CONTACT:
    s.huda@griffith.edu.au

    VERSION: 1.0
    DATE: July 2025


IPv4 geolocation: 15 scholarly articles cite this dataset (Google Scholar).