70 datasets found
  1. Bitcoin Wallet Classification Public Dataset

    • kaggle.com
    zip
    Updated Mar 15, 2025
    Cite
    Gregory Wilder (2025). Bitcoin Wallet Classification Public Dataset [Dataset]. https://www.kaggle.com/datasets/gregorywilder/bitcoin-wallet-classification-public-dataset
    Explore at:
    Available download formats: zip (2471936960 bytes)
    Dataset updated
    Mar 15, 2025
    Authors
    Gregory Wilder
    License

    Open Database License (ODbL) v1.0 https://www.opendatacommons.org/licenses/odbl/1.0/
    License information was derived automatically

    Description

    Classification of 43,621,232 Bitcoin Wallet addresses into 14 categories (following the classifications defined by BABD, cited below): (1) Blackmail, (2) Cyber-Security Service, (3) Darknet Market, (4) Centralized Exchange, (5) P2P Financial Infrastructure Service, (6) P2P Financial Service, (7) Gambling, (8) Government Criminal Blocklist, (9) Money Laundering, (10) Ponzi Scheme, (11) Mining Pool, (12) Tumbler, (13) Individual Wallets, and (14) Unknown Service.

    This database was made by combining information from four separate sources. The first is Aleš Janda, who operates WalletExplorer.com and generously offers an API providing the wallet classifications and identifications he was able to determine. Mr. Janda determined the owners of wallets by registering for certain services (like SatoshiDice) and learning the addresses those services used, and then backed into other wallet addresses by determining which wallets were merged together (similar to the shadow wallet address analysis done by others). However, WalletExplorer states that it has not updated much data since 2016, so its data is largely confined to the first 425,000 blocks (block 425,000 was mined on August 13, 2016). This classification data constitutes, by far, the largest portion of this dataset, with over 43 million wallet addresses classified. Preference was always given to WalletExplorer.com's classifications, treated as “strong addresses.” Aleš Janda. Wallet explorer. https://www.walletexplorer.com/info.
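
The "merged together" step above can be illustrated with a small union-find sketch. It assumes the common-input heuristic (addresses spent together as inputs of one transaction share an owner); the description does not state that WalletExplorer works exactly this way, and the function names and sample addresses below are hypothetical.

```python
from collections import defaultdict


class UnionFind:
    """Disjoint-set structure with path compression."""

    def __init__(self):
        self.parent = {}

    def find(self, item):
        # Register unseen items as their own root.
        self.parent.setdefault(item, item)
        root = item
        while self.parent[root] != root:
            root = self.parent[root]
        # Path compression: point every visited node at the root.
        while self.parent[item] != root:
            self.parent[item], item = root, self.parent[item]
        return root

    def union(self, a, b):
        root_a, root_b = self.find(a), self.find(b)
        if root_a != root_b:
            self.parent[root_b] = root_a


def cluster_addresses(transactions):
    """Group addresses that appear together as inputs of one transaction.

    `transactions` is a list of input-address lists (a hypothetical layout).
    """
    uf = UnionFind()
    for inputs in transactions:
        for address in inputs:
            uf.find(address)  # register every address, even lone inputs
        for other in inputs[1:]:
            uf.union(inputs[0], other)
    clusters = defaultdict(set)
    for address in list(uf.parent):
        clusters[uf.find(address)].add(address)
    return list(clusters.values())


def propagate_labels(clusters, known_labels):
    """Copy a known label to every address in its cluster.

    Clusters with no label, or with conflicting labels, are skipped.
    """
    labels = {}
    for cluster in clusters:
        found = {known_labels[a] for a in cluster if a in known_labels}
        if len(found) == 1:
            label = found.pop()
            labels.update({a: label for a in cluster})
    return labels


# Hypothetical data: addr1 is known (e.g. learned by registering with a service).
example_transactions = [["addr1", "addr2"], ["addr2", "addr3"], ["addr4"]]
example_labels = propagate_labels(
    cluster_addresses(example_transactions), {"addr1": "Gambling"}
)
```

Here `addr2` and `addr3` inherit the label of `addr1` because they are linked through shared transactions, while `addr4` stays unlabeled.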

    Additionally, several datasets (in spreadsheets) identify and classify wallets:

    The largest of these spreadsheets available on Kaggle was made by Xiang et al. They classified 544,462 wallet addresses between blocks 585,000 (July 12, 2019) and 685,000 (May 26, 2021) into 13 classifications: (1) Blackmail, (2) Cyber-Security Service, (3) Darknet Market, (4) Centralized Exchange, (5) P2P Financial Infrastructure Service, (6) P2P Financial Service, (7) Gambling, (8) Government Criminal Blocklist, (9) Money Laundering, (10) Ponzi Scheme, (11) Mining Pool, (12) Tumbler, and (13) Individual Wallets. They used the WalletExplorer database and other governmental blocklist databases as their “strong addresses” and information from BitcoinAbuse.com (which now appears to be ChainAbuse.com) as “weak addresses” to train their AI models. On their experimental sets, the processes they used to identify and classify the wallets achieved a minimum F1 score of 92.97%, accuracy of 93.24%, precision of 92.80%, and recall of 93.24%. Their framework consists of two parts: a statistical indicator (SI) and a local structural indicator (LSI). The SI covers four indicator types, Pure Amount Indicator (PAI), Pure Degree Indicator (PDI), Pure Time Indicator (PTI), and Combination Indicator (CI), and uses 132 features to predict the classification. For the LSI, they generated k-hop subgraphs with an algorithm and then used various graph metrics and other features to predict the classification. Yuexin Xiang, Yuchen Lei, Ding Bao, Tiantian Li, Qingqing Yang, Wenmao Liu, Wei Ren, and Kim-Kwang Raymond Choo. BABD: A bitcoin address behavior dataset for pattern analysis. IEEE Transactions on Information Forensics and Security, 19:2171–2185, 2024. https://www.kaggle.com/datasets/lemonx/babd13
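
The k-hop subgraph idea mentioned for the LSI can be sketched with a breadth-first search. This is a minimal illustration, not the BABD authors' algorithm: it assumes a directed transaction graph stored as an adjacency dictionary and follows outgoing edges only; the function name and the toy graph are hypothetical.

```python
from collections import deque


def k_hop_subgraph(adjacency, start, k):
    """Return the nodes within k hops of `start` and the edges among them."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distance[node] == k:
            continue  # do not expand beyond k hops
        for neighbor in adjacency.get(node, ()):
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)
    nodes = set(distance)
    # Keep only edges whose two endpoints are both inside the subgraph.
    edges = {
        (u, v)
        for u in nodes
        for v in adjacency.get(u, ())
        if v in nodes
    }
    return nodes, edges


# Toy chain A -> B -> C -> D; the 2-hop neighborhood of A stops at C.
toy_graph = {"A": ["B"], "B": ["C"], "C": ["D"], "D": []}
toy_nodes, toy_edges = k_hop_subgraph(toy_graph, "A", 2)
```

Graph metrics (degree, edge count, and so on) could then be computed on `toy_nodes` and `toy_edges` as features for one address.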

    Michalski et al. identified 8,808 addresses and classified them using machine learning techniques considering 149 features, into categories including (1) mining pools, (2) miners, (3) coinjoins, (4) gambling, (5) exchange, and (6) services. They analyzed the blocks between 520,850 and 520,950. They obtained training data from WalletExplorer.com and then evaluated supervised learning algorithms. They determined that Random Forest classification was the best method of classifying the wallets, and stated they obtained an F-score of 95%. Due to the small size of this dataset and the fact that it covers only 100 blocks, I considered these classifications to be “weak addresses” for this work. The wallets they classified as “services” were added to my database as an “Unknown Service.” Radoslaw Michalski, Daria Dziubałtowska, and Piotr Macek. Bitcoin addresses and their categories. https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/KEWU0N

    The US Office of Foreign Assets Control (OFAC) maintains a list of ‘sanctioned’ Bitcoin and other digital currency assets. Several GitHub contributors maintain a tool that extracts the Bitcoin addresses from this database. This added 390 wallet addresses to the dataset. U.S. Treasury. Specially designated nationals list of the U.S. Office of Foreign Assets Control. https://www.treasury.gov/ofac/downloads/sanctions/1.0/sdn_advanced.xml. 0xB10C, Michael Neale, and Yahiheb. https://github.com/0xB10C/ofac-sanctioned-digital-currency-addresses?tab=readme-ov-file
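
Combining the four sources with a preference order can be sketched as a priority merge. The description only states that WalletExplorer classifications always win and that “services” become “Unknown Service”; the ordering of the remaining sources, the function name, and the sample addresses below are assumptions made for illustration.

```python
# Renaming stated in the description: "services" -> "Unknown Service".
CATEGORY_ALIASES = {"services": "Unknown Service"}


def merge_classifications(sources):
    """Merge address -> category maps; earlier sources win on conflict.

    `sources` is a list of dicts ordered from most to least trusted.
    """
    merged = {}
    for source in sources:
        for address, category in source.items():
            category = CATEGORY_ALIASES.get(category, category)
            # setdefault keeps the label from the higher-priority source.
            merged.setdefault(address, category)
    return merged


# Hypothetical sample data: one "strong" source and one "weak" source.
strong_source = {"addrA": "Gambling"}
weak_source = {"addrA": "Mining Pool", "addrB": "services"}
merged_labels = merge_classifications([strong_source, weak_source])
```

In this toy run `addrA` keeps the label from the strong source, and `addrB` is added from the weak source under the renamed category.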

  2. Inspire-WFS SL Traffic Networks OKSTRA – Road Use Types – OGC API Features |...

    • gimi9.com
    + more versions
    Cite
    Inspire-WFS SL Traffic Networks OKSTRA – Road Use Types – OGC API Features | gimi9.com [Dataset]. https://gimi9.com/dataset/eu_9579b1e3-3bad-530e-3c0b-4bdcddad3ce7
    Explore at:
    License

    CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    This service provides data implemented for the INSPIRE topic of transport networks from the OKSTRA data model: a classification based on the physical characteristics of the road section.

  3. Transportation System Plan (TSP) Classifications

    • hub.arcgis.com
    • gis-pdx.opendata.arcgis.com
    Updated Aug 31, 2023
    Cite
    City of Portland, Oregon (2023). Transportation System Plan (TSP) Classifications [Dataset]. https://hub.arcgis.com/datasets/PDX::transportation-system-plan-tsp-classifications/api
    Explore at:
    Dataset updated
    Aug 31, 2023
    Dataset authored and provided by
    City of Portland, Oregon
    Area covered
    Description

    TSP classifications are part of a group of layers that make up the Transportation System Plan, which is the 20-year plan for transportation improvements in the City of Portland. The goal of the TSP is to provide transportation choices for residents, employees, visitors and firms doing business in Portland by describing what the system should look like and what purpose it fulfills. This linear feature class contains the street classifications of the TSP. Attribution for classifications under Traffic, Transit, Bicycle, Pedestrian, Freight, Emergency Response and Street Design designates the type of movement and planning that should be emphasized on each street. Classification descriptions are used to describe how streets should function for each mode of travel, not necessarily how they are functioning at present.

    Additional Information: Category: Planning. Purpose: For mapping related to the City's Transportation System Plan. Update Frequency: Irregular.

    Metadata Link: https://www.portlandmaps.com/metadata/index.cfm?&action=DisplayLayer&LayerID=52497

  4. Gas Station Location Data Europe | 140k+ Stations with 400+ Attributes | 25+...

    • datarade.ai
    .json, .xml
    Updated Jul 20, 2021
    + more versions
    Cite
    xavvy (2021). Gas Station Location Data Europe | 140k+ Stations with 400+ Attributes | 25+ Fuel Types and 60+ Services | weekly updates | API & Datasets [Dataset]. https://datarade.ai/data-products/xavvy-s-gas-station-poi-data-of-each-country-in-europe-140k-xavvy
    Explore at:
    Available download formats: .json, .xml
    Dataset updated
    Jul 20, 2021
    Dataset authored and provided by
    xavvy
    Area covered
    France, Germany, United Kingdom
    Description

    Base data • Name/Brand • Address • Geocoordinates • Opening Hours • Phone •...

    25+ Fuel Types like • Super E5 • Super 98 • Diesel • AdBlue • LPG • CNG •...

    60+ Services and characteristics like • Carwash • Shop • Restaurant • Toilet • ATM • Toll •...

    300+ Payment options • Cash • Visa • MasterCard • Fueling Cards •...

    We are the leading source for Gas Station Location Data and Petrol Price Data worldwide and specialized in data quality and enrichment. We provide high quality POI Data of gas stations for all European countries.

    The gas station location data is delivered country by country, and the level of information to be provided is highly customizable. One-time or regular data delivery, push or pull services, and any data format – we adjust to our customer’s needs.

    Total number of stations per country or region, distribution of market shares among competitors or the perfect location for new gas stations, charging stations or hydrogen dispensers - our data provides answers to various questions and offers the perfect foundation for in-depth analyses and statistics. In this way, our gas station location data and petrol price data help customers from various industries to gain more valuable insights into the fuel market and its development, providing an unparalleled basis for strategic decisions such as business development, competitive approach or expansion.

    In addition, our data can contribute to the consistency and quality of an existing dataset. Simply map data to check for accuracy and correct erroneous data.

    200+ sources including governments, petroleum companies, fuel card providers and crowd sourcing enable xavvy to provide various information. Next to base information like name/brand, address, geo-coordinates or opening hours, there is also detailed information about available fuel types, accessibility, special services, or payment options for each station.

    Especially if you want to display information about gas stations on a map or in an application, high data quality is crucial for an excellent customer experience. Therefore, processing procedures are continuously improved to increase data quality:

    • Regular quality controls (e.g. via monitoring dashboards)
    • Geocoding systems correct and specify geocoordinates
    • Data sets are cleaned and standardized
    • Current developments and mergers are taken into account
    • The number of data sources is constantly expanded to map different data sources against each other

    Check out our other Data Offerings available and gain more valuable market insights on gas stations directly from the experts!

  5. MDOT SHA Roadway Functional Classification

    • data-maryland.opendata.arcgis.com
    • data.imap.maryland.gov
    • +1 more
    Updated Sep 4, 2020
    Cite
    ArcGIS Online for Maryland (2020). MDOT SHA Roadway Functional Classification [Dataset]. https://data-maryland.opendata.arcgis.com/datasets/mdot-sha-roadway-functional-classification/api
    Explore at:
    Dataset updated
    Sep 4, 2020
    Dataset authored and provided by
    ArcGIS Online for Maryland
    Area covered
    Description

    Esri ArcGIS Online (AGOL) Hosted Feature Layer which provides access to the MDOT SHA Roadway Functional Classification data product.

    MDOT SHA Roadway Functional Classification data consists of linear geometric features which showcase the functional classification of roadways throughout the State of Maryland. Roadway Functional Classification is defined as the role each roadway plays in moving vehicles throughout a network of highways. MDOT SHA Roadway Functional Classification data is primarily used for general planning purposes, and for Federal Highway Administration (FHWA) Highway Performance Monitoring System (HPMS) annual submission & coordination. The Maryland Department of Transportation State Highway Administration (MDOT SHA) currently reports this data only on the inventory direction (generally North or East) side of the roadway. MDOT SHA Roadway Functional Classification data is not a complete representation of all roadway geometry.

    The State of Maryland's roadway system is a vast network that connects places and people within and across county borders. Planners and engineers have developed elements of this network with particular travel objectives in mind. These objectives range from serving long-distance passenger and freight needs to serving neighborhood travel from residential developments to nearby shopping centers. The functional classification of roadways defines the role each element of the roadway network plays in serving these travel needs.

    Over the years, functional classification has come to assume additional significance beyond its purpose as a framework for identifying the particular role of a roadway in moving vehicles through a network of highways. Functional classification carries with it expectations about roadway design, including its speed, capacity and relationship to existing and future land use development.

    Federal legislation continues to use functional classification in determining eligibility for funding under the Federal-aid program. Transportation agencies describe roadway system performance, benchmarks and targets by functional classification. As agencies continue to move towards a more performance-based management approach, functional classification will be an increasingly important consideration in setting expectations and measuring outcomes for preservation, mobility and safety.

    MDOT SHA Roadway Functional Classification data is developed as part of the Highway Performance Monitoring System (HPMS) which maintains and reports transportation related information to the Federal Highway Administration (FHWA) on an annual basis. HPMS is maintained by the Maryland Department of Transportation State Highway Administration (MDOT SHA), under the Office of Planning & Preliminary Engineering (OPPE) Data Services Division (DSD). This data is used by various business units throughout MDOT, as well as many other Federal, State and local government agencies. Roadway Functional Classification data is key to understanding the role each roadway plays in moving vehicles throughout the State of Maryland's network of highways.

    MDOT SHA Roadway Functional Classification data is owned & maintained by the MDOT SHA Office of Planning & Preliminary Engineering (OPPE). This data product is updated & published on an annual basis for the prior year. This data product is for the year 2024.

    For more information related to the data, contact MDOT SHA OPPE Data Services Division (DSD): Email: DSD@mdot.maryland.gov

    For more information, contact MDOT SHA OIT Enterprise Information Services: Email: GIS@mdot.maryland.gov

  6. Malware Benign API Call Argument Feature Vector

    • kaggle.com
    zip
    Updated Apr 28, 2025
    Cite
    BISHWAJIT PRASAD GOND (2025). Malware Benign API Call Argument Feature Vector [Dataset]. https://www.kaggle.com/datasets/bishwajitprasadgond/malware-benign-api-call-argument-feature-vector
    Explore at:
    Available download formats: zip (36681722 bytes)
    Dataset updated
    Apr 28, 2025
    Authors
    BISHWAJIT PRASAD GOND
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    Disclaimer: To process and perform analysis with this dataset, it is strongly recommended that your system has at least 128 GB of RAM. Attempting to work with this dataset on systems with lower memory may result in crashes, incomplete processing, or significant performance issues.

    The process involves acquiring malware data, performing behavioral analysis, and preparing features for deep learning models.

    Step 1: Data Acquisition

    • Source Malware Hashes: Obtain malware hashes from VirusShare.
    • Query VirusTotal: Use the hashes to query VirusTotal and download JSON files containing scan results from over 70 antivirus engines.
    • Class Determination: Analyze the scan results to classify the malware into distinct categories.

    Step 2: Malware Download

    • Download Malware Samples: Based on the classification, download malware samples for each category.

    Step 3: Dynamic Analysis with Cuckoo Sandbox

    • Environment Setup: Conduct dynamic analysis in a controlled environment using Cuckoo Sandbox.
    • Behavioral Report: Generate a JSON behavioral report for each malware sample, focusing on Portable Executable (PE) files.
    • API Call Sequence Extraction: Extract API call sequence reports in JSON format, including:
      • API Name
      • API Argument
      • API Return
      • API Category

    Step 4: Data Preprocessing

    • JSON Report Segmentation: Split the JSON report into four text files:

      • api_name.txt
      • api_argument.txt
      • api_return.txt
      • api_category.txt
    • Unigram Generation:

      • Combine API names with their corresponding arguments using underscores (e.g., LdrLoadDll_urlmon.dll).
      • Generate unigrams for each malware category.

      Example unigram: - LdrLoadDll_urlmon_urlmon.dll

    • Output: Create a CSV file containing unigrams for each malware category.
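
Step 4 can be sketched as follows. The record layout (`api_name`, `api_arguments`), the function names, and the CSV columns are hypothetical, since the description does not specify them; the unigram form follows the `LdrLoadDll_urlmon.dll` example given in the unigram-generation bullet.

```python
import csv
import io
from collections import Counter


def make_unigrams(calls):
    """Join each API name with each of its argument values using underscores."""
    return [
        f"{call['api_name']}_{argument}"
        for call in calls
        for argument in call["api_arguments"]
    ]


def unigrams_to_csv(calls_by_category):
    """Write one row per (category, unigram) pair with its occurrence count."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["category", "unigram", "count"])
    for category in sorted(calls_by_category):
        counts = Counter(make_unigrams(calls_by_category[category]))
        for unigram in sorted(counts):
            writer.writerow([category, unigram, counts[unigram]])
    return buffer.getvalue()


# Hypothetical API call records for a single category.
sample_calls = [
    {"api_name": "LdrLoadDll", "api_arguments": ["urlmon.dll"]},
    {"api_name": "LdrLoadDll", "api_arguments": ["urlmon.dll"]},
    {"api_name": "NtClose", "api_arguments": ["handle"]},
]
sample_csv = unigrams_to_csv({"trojan": sample_calls})
```

Writing to an in-memory buffer keeps the sketch self-contained; a real pipeline would write to a file per category or to one combined file.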

    Step 5: Feature Extraction and Vectorization

    • API Elements Extraction:

      • Extract key elements: API Name and API Argument
    • Unique Unigrams:

      • Identify unique unigrams from the JSON reports.
    • Term Frequency (TF) Calculation:

      • Tokenize the text and compute TF weights for each unigram, reflecting their importance in the dataset.
      • Optionally apply L2 normalization to ensure consistent feature vector lengths.
    • Feature Refinement:

      • Filter unnecessary features from the unigram CSV files to create a refined feature set.
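
The TF weighting and optional L2 normalization in Step 5 can be sketched in plain Python. This assumes TF means the count of a unigram divided by the total number of unigrams in the sample, which is one common definition and may differ from the authors' choice; the function names and sample tokens are hypothetical.

```python
import math
from collections import Counter


def term_frequencies(tokens):
    """Relative frequency of each token within one sample."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {term: count / total for term, count in counts.items()}


def vectorize(tokens, vocabulary, normalize=True):
    """Project TF weights onto a fixed vocabulary, optionally L2-normalized."""
    tf = term_frequencies(tokens)
    vector = [tf.get(term, 0.0) for term in vocabulary]
    if normalize:
        norm = math.sqrt(sum(value * value for value in vector))
        if norm > 0:
            # After scaling, the vector has Euclidean length 1.
            vector = [value / norm for value in vector]
    return vector


# Hypothetical unigrams from one sample and a three-term vocabulary.
sample_tokens = ["LdrLoadDll_urlmon.dll", "LdrLoadDll_urlmon.dll", "NtClose_handle"]
sample_vocabulary = ["LdrLoadDll_urlmon.dll", "NtClose_handle", "NtOpenFile_path"]
raw_vector = vectorize(sample_tokens, sample_vocabulary, normalize=False)
unit_vector = vectorize(sample_tokens, sample_vocabulary)
```

Normalizing after projecting onto the vocabulary means every non-empty sample yields a unit-length vector even when feature refinement has removed some of its unigrams.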

    Output

    • Prepared Dataset:
      • A refined CSV file containing unigrams and TF-weighted features for each malware category.

    Citation

    • B. P. Gond, M. Shahnawaz, Rajneekant and D. P. Mohapatra, "NLP-Driven Malware Classification: A Jaccard Similarity Approach," 2024 IEEE International Conference on Information Technology, Electronics and Intelligent Communication Systems (ICITEICS), Bangalore, India, 2024, pp. 1-8, DOI: https://doi.org/10.1109/ICITEICS61368.2024.10624953
    • B. P. Gond, A. K. Singh and D. P. Mohapatra, "A Deep Learning Framework for Malware Classification using NLP Techniques," 2024 15th International Conference on Computing Communication and Networking Technologies (ICCCNT), Kamand, India, 2024, pp. 1-8, DOI: https://doi.org/10.1109/ICCCNT61001.2024.10725427
    • Gond, B. P., & Mohapatra, D. P. (2025). Deep Learning-Driven Malware Classification with API Call Sequence Analysis and Concept Drift Handling. ArXiv. https://arxiv.org/abs/2502.08679
  7. Twitter mining using semi-supervised classification for relevance filtering...

    • plos.figshare.com
    txt
    Updated Jun 1, 2023
    Cite
    Oduwa Edo-Osagie; Gillian Smith; Iain Lake; Obaghe Edeghere; Beatriz De La Iglesia (2023). Twitter mining using semi-supervised classification for relevance filtering in syndromic surveillance [Dataset]. http://doi.org/10.1371/journal.pone.0210689
    Explore at:
    Available download formats: txt
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    PLOS http://plos.org/
    Authors
    Oduwa Edo-Osagie; Gillian Smith; Iain Lake; Obaghe Edeghere; Beatriz De La Iglesia
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    We investigate the use of Twitter data to deliver signals for syndromic surveillance in order to assess its ability to augment existing syndromic surveillance efforts and give a better understanding of symptomatic people who do not seek healthcare advice directly. We focus on a specific syndrome: asthma/difficulty breathing. We outline data collection using the Twitter streaming API as well as analysis and pre-processing of the collected data. Even with keyword-based data collection, many of the tweets collected are not relevant because they represent chatter, or talk of awareness instead of an individual suffering a particular condition. In light of this, we set out to identify relevant tweets to collect a strong and reliable signal. For this, we investigate text classification techniques, and in particular we focus on semi-supervised classification techniques since they enable us to use more of the Twitter data collected while only doing very minimal labelling. In this paper, we propose a semi-supervised approach to symptomatic tweet classification and relevance filtering. We also propose alternative techniques to popular deep learning approaches. Additionally, we highlight the use of emojis and other special features capturing the tweet’s tone to improve the classification performance. Our results show that negative emojis and those that denote laughter provide the best classification performance in conjunction with a simple word-level n-gram approach. We obtain good performance in classifying symptomatic tweets with both supervised and semi-supervised algorithms and found that the proposed semi-supervised algorithms preserve more of the relevant tweets and may be advantageous in the context of a weak signal. Finally, we found some correlation (r = 0.414, p = 0.0004) between the Twitter signal generated with the semi-supervised system and data from consultations for related health conditions.
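
The combination of word-level n-grams with emoji indicators can be sketched as a small feature extractor. This is only an illustration of the feature idea, not the authors' pipeline: the emoji sets, feature names, and tokenization rule below are hypothetical choices.

```python
import re
from collections import Counter

# Hypothetical emoji groups, chosen only for illustration.
NEGATIVE_EMOJIS = {"😷", "😫", "😢"}
LAUGHTER_EMOJIS = {"😂", "🤣"}


def word_ngrams(text, n_max=2):
    """Return all word-level n-grams of length 1..n_max, lowercased."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    grams = []
    for n in range(1, n_max + 1):
        grams.extend(
            " ".join(words[i:i + n]) for i in range(len(words) - n + 1)
        )
    return grams


def tweet_features(text, n_max=2):
    """Count n-grams and add binary flags for the two emoji groups."""
    features = Counter(word_ngrams(text, n_max))
    features["has_negative_emoji"] = int(any(e in text for e in NEGATIVE_EMOJIS))
    features["has_laughter_emoji"] = int(any(e in text for e in LAUGHTER_EMOJIS))
    return features


sample_features = tweet_features("Can't breathe today 😷")
```

Feature dictionaries like `sample_features` could then be fed to any supervised or semi-supervised classifier that accepts sparse count features.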

  8. Firmographic Data API | Detailed Insights on 70M+ Companies | Strategic...

    • datarade.ai
    Updated Oct 27, 2021
    + more versions
    Cite
    Success.ai (2021). Firmographic Data API | Detailed Insights on 70M+ Companies | Strategic Decision Making | Best Price Guaranteed [Dataset]. https://datarade.ai/data-products/firmographic-data-api-detailed-insights-on-70m-companies-success-ai
    Explore at:
    Available download formats: .bin, .json, .xml, .csv, .xls, .sql, .txt
    Dataset updated
    Oct 27, 2021
    Dataset provided by
    Area covered
    Montserrat, Colombia, Grenada, Zimbabwe, Malawi, Peru, Bosnia and Herzegovina, Holy See, Lithuania, Aruba
    Description

    Success.ai’s Firmographic Data API empowers organizations to make data-driven decisions with on-demand access to detailed insights on over 70 million companies worldwide. Covering key firmographic attributes like industry classifications, revenue size, and employee count, this API ensures your market analysis, strategic planning, and competitive benchmarking efforts are backed by continuously updated, AI-validated information.

    Whether you’re exploring new markets, refining your product offerings, or optimizing partner relationships, Success.ai’s Firmographic Data API delivers the intelligence you need. Supported by our Best Price Guarantee, this solution helps you confidently navigate the global business landscape.

    Why Choose Success.ai’s Firmographic Data API?

    1. Detailed, Verified Firmographic Data

      • Access comprehensive company attributes including industries, revenue ranges, and headcount.
      • AI-driven validation ensures 99% accuracy, minimizing errors and fostering informed decision-making.
    2. Extensive Global Coverage

      • Includes profiles of companies from North America, Europe, Asia-Pacific, and beyond.
      • Scale your strategies by tapping into emerging markets, niche sectors, and diverse geographies.
    3. Continuous Data Updates

      • Receive real-time updates to keep pace with changing organizational structures, market expansions, and acquisitions.
      • Always rely on current data to guide product roadmaps, growth plans, and strategic partnerships.
    4. Ethical and Compliant

      • Fully adheres to GDPR, CCPA, and other global privacy regulations, ensuring responsible, lawful data usage for every query.

    Data Highlights:

    • 70M+ Verified Company Profiles: Gain clarity on businesses spanning all major industries and regions.
    • Industry Classifications: Filter companies by their sector focus, from manufacturing to technology.
    • Revenue and Employee Counts: Understand company sizes, growth potential, and market reach.
    • Global Market Insights: Use firmographic data to inform product launches, expansions, and strategic alliances.

    Key Features of the Firmographic Data API:

    1. Real-Time Company Enrichment

      • Enhance your CRM or analytics platforms with verified firmographic data, eliminating guesswork and manual data imports.
      • Update records automatically as companies grow, diversify, or shift their market focus.
    2. Advanced Filtering and Query Capabilities

      • Query the API for specific parameters like industry vertical, company size, or geographic location.
      • Zero in on opportunities aligned with your business goals, improving efficiency and outcomes.
    3. Scalability and Flexibility

      • Seamlessly integrate the API into existing workflows, CRM systems, or marketing automation tools.
      • Adjust parameters as markets evolve, ensuring that you always have the intelligence needed to adapt and thrive.
    4. AI-Validated Accuracy and Reliability

      • Rely on an AI-powered validation process that continually verifies data integrity.
      • Increase confidence in strategic decisions backed by accurate, current, and relevant information.

    Strategic Use Cases:

    1. Market Analysis and Competitive Benchmarking

      • Identify industries poised for growth, evaluate emerging markets, and benchmark against competitor profiles.
      • Refine go-to-market strategies and product launches based on solid data rather than assumptions.
    2. Strategic Partnering and M&A Efforts

      • Explore potential partners, suppliers, or acquisition targets that match your criteria, from revenue tiers to geographic presence.
      • Shorten due diligence cycles with reliable, on-demand firmographic insights.
    3. Sales and Account-Based Marketing

      • Segregate target accounts by industry, size, and region to tailor outreach and messaging.
      • Personalize campaigns, improve lead quality, and increase win rates through better audience alignment.
    4. Product Roadmapping and Portfolio Management

      • Inform product development by identifying high-growth verticals or underpenetrated regions.
      • Allocate resources effectively and prioritize product enhancements based on firmographic-driven insights.

    Why Choose Success.ai?

    1. Best Price Guarantee

      • Access premium-quality firmographic data at competitive prices, ensuring optimal ROI for your research and strategic planning.
    2. Seamless Integration

      • Easily incorporate the API into existing workflows, eliminating data silos and manual data management tasks.
    3. Data Accuracy with AI Validation

      • Depend on 99% accuracy to guide data-driven decisions, refine targeting, and boost strategic outcomes.
    4. Customizable and Scalable Solutions

      • Tailor the dataset to specific industries, regions, or revenue segments as your business evolves and market conditions shift.

    Additional APIs for Enhanced Functionality:

    1. Data Enrichment API...
  9. Time Series International Trade: Monthly U.S. Imports by Standard...

    • catalog.data.gov
    Updated Sep 30, 2025
    + more versions
    Cite
    U.S. Census Bureau (2025). Time Series International Trade: Monthly U.S. Imports by Standard International Trade Classification (SITC) Code [Dataset]. https://catalog.data.gov/dataset/time-series-international-trade-monthly-u-s-imports-by-standard-international-trade-classi
    Explore at:
    Dataset updated
    Sep 30, 2025
    Dataset provided by
    United States Census Bureau http://census.gov/
    Area covered
    United States
    Description

    The Census data API provides access to the most comprehensive set of data on current month and cumulative year-to-date imports using the Standard International Trade Classification (SITC) system. The SITC endpoint in the Census data API also provides value, shipping weight, and method of transportation totals at the district level for all U.S. trading partners. The Census data API will help users research new markets for their products, establish pricing structures for potential export markets, and conduct economic planning. If you have any questions regarding U.S. international trade data, please call us at 1(800)549-0595 option #4 or email us at eid.international.trade.data@census.gov.

  10. B2B Marketing Data API | Access 70M+ Business Profiles | Empower Your...

    • datarade.ai
    Updated Oct 27, 2021
    + more versions
    Cite
    Success.ai (2021). B2B Marketing Data API | Access 70M+ Business Profiles | Empower Your Marketing Strategy | Best Price Guaranteed [Dataset]. https://datarade.ai/data-products/b2b-marketing-data-api-access-70m-business-profiles-empo-success-ai
    Explore at:
    Available download formats: .bin, .json, .xml, .csv, .xls, .sql, .txt
    Dataset updated
    Oct 27, 2021
    Dataset provided by
    Success.ai
    Area covered
    Gambia, Fiji, Christmas Island, Congo, Dominican Republic, Rwanda, Saint Helena, Ecuador, Madagascar, Singapore
    Description

    Success.ai’s B2B Marketing Data API empowers marketing and sales teams to execute highly targeted and effective outreach campaigns. By providing on-demand access to over 70 million detailed business profiles worldwide, this API ensures your strategies are always guided by accurate, up-to-date information. From industry classifications and employee counts to firmographic and demographic insights, Success.ai’s B2B Marketing Data API enables you to zero in on the right businesses and decision-makers.

    With robust filtering capabilities, continuously updated datasets, and AI-validated accuracy, you can confidently refine segments, tailor messaging, and drive higher engagement rates. Backed by our Best Price Guarantee, this solution is essential for achieving meaningful ROI in a competitive global marketplace.

    Why Choose Success.ai’s B2B Marketing Data API?

    1. Extensive Global Coverage

      • Access over 70 million business profiles spanning various industries, geographies, and company sizes.
      • Expand into new markets, discover niche segments, and tap into emerging opportunities with ease.
    2. AI-Validated Accuracy

      • Depend on 99% accurate data validated by AI, ensuring your outreach hits the mark and minimizes wasted efforts.
      • Rely on continuously updated information, eliminating concerns about stale, irrelevant records.
    3. Robust Filtering Capabilities

      • Query the API based on key parameters like industry vertical, company size, revenue range, or geographic region.
      • Hone in on ideal customer profiles (ICPs), ensuring your campaigns resonate and drive tangible results.
    4. Ethical and Compliant

      • Fully adheres to GDPR, CCPA, and other global data privacy standards, ensuring responsible and lawful data usage.

    Data Highlights:

    • 70M+ Verified Business Profiles: Engage with a diverse range of companies worldwide.
    • Detailed Firmographics: Gain insights into industry classifications, employee counts, and revenue ranges.
    • Continuous Updates: Always work with the latest data, reflecting market changes, business expansions, and new entrants.
    • Best Price Guarantee: Get maximum value and ROI for your marketing investments at the most competitive prices.

    Key Features of the B2B Marketing Data API:

    1. On-Demand Data Enrichment

      • Enhance your CRM or marketing automation platforms with enriched contact and firmographic data in real-time.
      • Remove manual data imports and outdated lists, streamlining workflows and saving resources.
    2. Flexible Integration Options

      • Seamlessly integrate the API into existing marketing systems, analytics platforms, or sales dashboards.
      • Tailor datasets to align perfectly with your unique campaign goals, ICP criteria, or vertical interests.
    3. Granular Segmentation and Targeting

      • Filter records by industry, revenue, workforce size, or region to ensure every campaign focuses on receptive, high-potential prospects.
      • Improve personalization, relevance, and message resonance, increasing open, click-through, and conversion rates.
    4. Real-Time Validation and Reliability

      • Benefit from AI-driven data validation and continuous updates to maintain data integrity.
      • Build trust in your outreach efforts, knowing the data you rely on is current, accurate, and ready to inform critical decisions.

    Strategic Use Cases:

    1. Account-Based Marketing (ABM)

      • Fine-tune your ABM campaigns by targeting specific accounts that match your ideal criteria.
      • Deliver personalized messaging and content, improving engagement and deal closure rates.
    2. Market Expansion and Product Launches

      • Identify new industries or geographies that align with your product offerings.
      • Enter fresh markets confidently, supported by data-driven insights and targeted prospect lists.
    3. Partnership Development and Channel Sales

      • Discover complementary businesses, suppliers, or distributors that can amplify your reach and value proposition.
      • Accelerate partnerships and alliances, forming strategic relationships for sustainable growth.
    4. Competitive Benchmarking and Market Research

      • Analyze industry trends, growth patterns, and emerging opportunities to inform product roadmaps and marketing strategies.
      • Stay ahead of market shifts by continually monitoring changes and adjusting approaches dynamically.

    Why Choose Success.ai?

    1. Best Price Guarantee

      • Access premium-quality marketing data at unbeatable prices, ensuring exceptional ROI on your marketing spend.
    2. Seamless Integration

      • Incorporate the API into your workflow effortlessly, eliminating manual data handling and siloed tools.
    3. Data Accuracy with AI Validation

      • Trust in 99% accuracy to shape data-driven decisions, refine targeting, and enhance conversion rates across campaigns.
    4. Customizable and Scalable Solutions

      • Adapt the dataset as your ...
  11. Time Series International Trade: Monthly U.S. Exports by North American...

    • catalog.data.gov
    • datasets.ai
    Updated Sep 30, 2025
    + more versions
    Cite
    U.S. Census Bureau (2025). Time Series International Trade: Monthly U.S. Exports by North American Industry Classification System (NAICS) Code [Dataset]. https://catalog.data.gov/dataset/time-series-international-trade-monthly-u-s-exports-by-north-american-industry-classificat
    Dataset updated
    Sep 30, 2025
    Dataset provided by
    United States Census Bureau (http://census.gov/)
    Area covered
    United States
    Description

    The Census data API provides access to the most comprehensive set of data on current month and cumulative year-to-date exports using the North American Industry Classification System (NAICS). The NAICS endpoint in the Census data API also provides value, shipping weight, and method of transportation totals at the district level for all U.S. trading partners. The Census data API will help users research new markets for their products, establish pricing structures for potential export markets, and conduct economic planning. If you have any questions regarding U.S. international trade data, please call us at 1(800)549-0595 option #4 or email us at eid.international.trade.data@census.gov.
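A minimal sketch of how a request to the NAICS exports endpoint described above might be assembled and its response reshaped. The endpoint path and the variable names (`NAICS`, `CTY_CODE`, `ALL_VAL_MO`) are assumptions not stated in this listing and should be checked against the Census API documentation; no network call is made here.

```python
from urllib.parse import urlencode

# Assumed endpoint path; verify against the Census API documentation.
BASE_URL = "https://api.census.gov/data/timeseries/intltrade/exports/naics"


def build_exports_url(month, variables=("NAICS", "CTY_CODE", "ALL_VAL_MO"), api_key=None):
    """Build a query URL for one month (YYYY-MM) of NAICS export data.

    The default variable names are assumptions used for illustration.
    """
    params = {"get": ",".join(variables), "time": month}
    if api_key:
        params["key"] = api_key
    # Keep commas unescaped so the variable list stays readable.
    return f"{BASE_URL}?{urlencode(params, safe=',')}"


def rows_to_dicts(payload):
    """Convert a list-of-rows payload (first row is the header) into dicts."""
    header, *rows = payload
    return [dict(zip(header, row)) for row in rows]
```

The URL can be passed to any HTTP client; `rows_to_dicts` assumes the response is a JSON array of arrays whose first row holds the column names.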

  12. Maryland Highway Performance Monitoring System - Roadway Functional...

    • data.imap.maryland.gov
    • dev-maryland.opendata.arcgis.com
    • +2more
    Updated Oct 22, 2018
    + more versions
    Cite
    ArcGIS Online for Maryland (2018). Maryland Highway Performance Monitoring System - Roadway Functional Classification [Dataset]. https://data.imap.maryland.gov/datasets/maryland::maryland-highway-performance-monitoring-system-roadway-functional-classification/about
    Dataset updated
    Oct 22, 2018
    Dataset authored and provided by
    ArcGIS Online for Maryland
    Description

    Roadway Functional Classification consists of linear features which specifically show the functional classification of public roadways in the State of Maryland. Roadway Functional Classification is defined as the role each roadway plays in moving vehicles throughout a network of highways. Roadway Functional Classification is primarily used for general planning purposes, and for Federal Highway Administration (FHWA) Highway Performance Monitoring System (HPMS) annual submission & coordination. The Maryland Department of Transportation State Highway Administration (MDOT SHA) currently reports this data only on the inventory direction (generally North or East) side of the roadway. Roadway Functional Classification data is not a complete representation of all roadway geometry.

    Maryland's roadway system is a vast network that connects places and people within and across county borders. Planners and engineers have developed elements of this network with particular travel objectives in mind. These objectives range from serving long-distance passenger and freight needs to serving neighborhood travel from residential developments to nearby shopping centers. The functional classification of roadways defines the role each element of the roadway network plays in serving these travel needs.

    Over the years, functional classification has come to assume additional significance beyond its purpose as a framework for identifying the particular role of a roadway in moving vehicles through a network of highways. Functional classification carries with it expectations about roadway design, including its speed, capacity and relationship to existing and future land use development. Federal legislation continues to use functional classification in determining eligibility for funding under the Federal-aid program. Transportation agencies describe roadway system performance, benchmarks and targets by functional classification. As agencies continue to move towards a more performance-based management approach, functional classification will be an increasingly important consideration in setting expectations and measuring outcomes for preservation, mobility and safety.

    Roadway Functional Classification data is developed as part of the Highway Performance Monitoring System (HPMS) which maintains and reports transportation related information to the Federal Highway Administration (FHWA) on an annual basis. HPMS is maintained by the Maryland Department of Transportation State Highway Administration (MDOT SHA), under the Office of Planning and Preliminary Engineering (OPPE) Data Services Division (DSD). This data is used by various business units throughout MDOT, as well as many other Federal, State and local government agencies. Roadway Functional Classification data is key to understanding the role each roadway plays in moving vehicles throughout Maryland's network of highways.

    Roadway Functional Classification data is updated and published on an annual basis for the prior year. This data is for the year 2017. View the most current Roadway Functional Classification data in the MDOT SHA Roadway Functional Classes Application.

    For additional information, contact the MDOT SHA Geospatial Technologies. Email: GIS@mdot.state.md.us

    For additional information related to the Maryland Department of Transportation (MDOT): https://www.mdot.maryland.gov/

    For additional information related to the Maryland Department of Transportation State Highway Administration (MDOT SHA): https://roads.maryland.gov/Home.aspx

    MDOT SHA Geospatial Data Legal Disclaimer: The Maryland Department of Transportation State Highway Administration (MDOT SHA) makes no warranty, expressed or implied, as to the use or appropriateness of geospatial data, and there are no warranties of merchantability or fitness for a particular purpose or use. The information contained in geospatial data is from publicly available sources, but no representation is made as to the accuracy or completeness of geospatial data. MDOT SHA shall not be subject to liability for human error, error due to software conversion, defect, or failure of machines, or any material used in the connection with the machines, including tapes, disks, CD-ROMs or DVD-ROMs and energy. MDOT SHA shall not be liable for any lost profits, consequential damages, or claims against MDOT SHA by third parties.

    This is a MD iMAP hosted service layer. Find more information at https://imap.maryland.gov.

    Map Service Link: https://mdgeodata.md.gov/imap/rest/services/Transportation/MD_HighwayPerformanceMonitoringSystem/MapServer/2
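The Map Service Link above points at an ArcGIS REST map service layer. As a rough sketch, a layer of this kind is usually queried by appending `/query` with `where`, `outFields`, `returnGeometry` and `f` parameters. The field name `FUNCTIONAL_CLASS` in the usage note is hypothetical, since this listing does not give the layer's schema, and no request is sent here.

```python
from urllib.parse import urlencode

# Layer URL as given in the dataset description.
LAYER_URL = (
    "https://mdgeodata.md.gov/imap/rest/services/Transportation/"
    "MD_HighwayPerformanceMonitoringSystem/MapServer/2"
)


def build_query_url(where="1=1", out_fields="*"):
    """Build an ArcGIS REST query URL that returns attributes only, as JSON."""
    params = {
        "where": where,            # SQL-style filter; "1=1" matches every record
        "outFields": out_fields,   # "*" requests all attribute fields
        "returnGeometry": "false",
        "f": "json",
    }
    return f"{LAYER_URL}/query?{urlencode(params)}"


def feature_attributes(response):
    """Pull the attribute dicts out of a parsed query response."""
    return [feature.get("attributes", {}) for feature in response.get("features", [])]
```

For example, `feature_attributes({"features": [{"attributes": {"FUNCTIONAL_CLASS": 1}}]})` yields a one-element list of attribute dictionaries; the real field names would have to be read from the layer's metadata.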

  13. HÜK200 - HÜK200: Classification of the Upper Aquifer into Aquifer Types -...

    • gimi9.com
    Updated Oct 15, 2024
    + more versions
    Cite
    (2024). HÜK200 - HÜK200: Classification of the Upper Aquifer into Aquifer Types - OGC API Features | gimi9.com [Dataset]. https://gimi9.com/dataset/eu_47d396d3-3dc3-bcba-7590-d32485414a0c
    Dataset updated
    Oct 15, 2024
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    In the context of the implementation of the EU Water Framework Directive, the initial and further description of the groundwater bodies of Rhineland-Palatinate represents an inventory of the subsoil of the river basins in Rhineland-Palatinate, with the aim of recording those groundwater bodies for which there is a risk of not achieving the environmental objectives under Article 4 of the EU Water Framework Directive. The description is based on the Hydrogeological Overview Map of Germany (HÜK 200). It was established in 2001 by the State Geological Services of Germany (SGD) and the Federal Institute for Geosciences and Natural Resources (BGR) on a scale of 1:200,000 in the sheet cuts of the TK 200. The contents correspond to HÜK 200 of the BGR. The classification of the upper aquifer into aquifer types was based on the cavity type and the geochemical nature of the flowing aquifer (combination of attributes). Geochemical conditions in the leachate zone are not taken into account. The contents correspond to HÜK 200 of the BGR.

  14. Land Planning – Space Categories – OGC API Features | gimi9.com

    • gimi9.com
    Updated Nov 18, 2023
    + more versions
    Cite
    (2023). Land Planning – Space Categories – OGC API Features | gimi9.com [Dataset]. https://gimi9.com/dataset/eu_e6b395d6-6db9-f8bf-ed0c-1a3e08b3dc13
    Dataset updated
    Nov 18, 2023
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    The Map Service (WFS Group) provides the map bases of the Land Development Plan Environment (2004) and Settlement (2006) of the Saarland. Strongly generalised representation of the space categories (core zone of the compaction space, edge zone of the compaction area, and rural space) within the framework of the LEP Settlement 2006.

  15. Operating system resource types and API calls.

    • plos.figshare.com
    xls
    Updated Jun 27, 2024
    Cite
    Jian Zhang; Shengquan Liu; Zhihua Liu (2024). Operating system resource types and API calls. [Dataset]. http://doi.org/10.1371/journal.pone.0304066.t002
    Explore at:
    Available download formats: xls
    Dataset updated
    Jun 27, 2024
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Jian Zhang; Shengquan Liu; Zhihua Liu
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    In recent years, with the development of the Internet, the attribution classification of APT malware remains an important issue in society. Existing methods have yet to consider the DLL link library and hidden file address during the execution process, and there are shortcomings in capturing the local and global correlation of event behaviors. Compared to the structural features of binary code, opcode features reflect the runtime instructions and do not consider the issue of multiple reuse of local operation behaviors within the same APT organization. Obfuscation techniques more easily influence attribution classification based on single features. To address the above issues, (1) an event behavior graph based on API instructions and related operations is constructed to capture the execution traces on the host using the GNNs model. (2) ImageCNTM captures the local spatial correlation and continuous long-term dependency of opcode images. (3) The word frequency and behavior features are concatenated and fused, proposing a multi-feature, multi-input deep learning model. We collected a publicly available dataset of APT malware to evaluate our method. The attribution classification results of the model based on a single feature reached 89.24% and 91.91%. Finally, compared to single-feature classifiers, the multi-feature fusion model achieves better classification performance.

  16. Reference Data as a Service (RDaaS) API

    • open.canada.ca
    • gimi9.com
    json
    Updated Feb 6, 2025
    Cite
    Statistics Canada (2025). Reference Data as a Service (RDaaS) API [Dataset]. https://open.canada.ca/data/dataset/71fad0cb-bc36-4682-815f-0984e9d9a3bb
    Explore at:
    Available download formats: json
    Dataset updated
    Feb 6, 2025
    Dataset provided by
    Statistics Canada
    License

    Open Government Licence - Canada 2.0 (https://open.canada.ca/en/open-government-licence-canada)
    License information was derived automatically

    Description

    The Reference Data as a Service (RDaaS) API provides a list of codesets, classifications, and concordances that are used within Statistics Canada. These resources are shared to help harmonize data, enabling better interdepartmental data integration and analysis. This dataset provides an updated version of the StatCan RDaaS API specification, originally part of the Government of Canada’s GC API Store, which permanently closed on September 29th, 2023. The archived version of the original API specification can be accessed via the Wayback Machine. The specification has been updated to the OpenAPI 3.0 (Swagger 3) standard, enabling use of current tools and features for API exploration and integration. Key interactive features of the updated specification include:

    • Try-It-Out Functionality: Allows a user to interact with API endpoints directly from the documentation in their browser, submitting test requests and viewing live responses.
    • Interactive Parameter Input: Simplifies experimentation with filters and parameters to explore API behavior.
    • Schema Visualization: Provides clear representations of request and response structures.

  17. Biotope – Geschuetzte_Biotope_f – OGC API Features

    • datasets.ai
    0
    Updated Feb 24, 2025
    + more versions
    Cite
    GDI-DE (2025). Biotope – Geschuetzte_Biotope_f – OGC API Features [Dataset]. https://datasets.ai/datasets/573ed966-6c2e-c3d9-bfc3-a4b8a406e5e9
    Explore at:
    0Available download formats
    Dataset updated
    Feb 24, 2025
    Dataset authored and provided by
    GDI-DE
    Description

    The Map Service (WFS Group) presents data from the biotope cadastre of the Saarland: protected biotopes of the Saarland in terms of area. In this object class, areas that are protected in accordance with § 22 SNG in conjunction with § 30 BNatSchG are recorded and represented. Several biotope types can be grouped together in one GB area, provided that they form a meaningful functional unit, e.g. lime-half dry grasses and heat-loving bushes, used wet meadows and wet meadows, or mesotraphent meadows and large harrow meadows. Viewing object in the GDZ; Export the area-based feature class GDZ2010.A_nggbt and the business table with the factual data (GDZ2010.nggbt) to the FileGDB. In addition to numerous internal database attributes, the following user-relevant attributes are available: IDENTIFIER: OSIRIS identifier; NAME; PROJ_URSPRUNG: Project origin; TYPE OF USE; Date of acquisition in OSIRIS; RECORDING TYPE; FLAECHENANZAHL; OFFIZIEL_FL: Area in ha (official); GEOGENAU: Geometric accuracy; GKRW: Legal value; GKHW: High value; INSDATE: Date of takeover in DGZ; BEMERKNG;
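Since this entry is published as an OGC API Features service, the listed attributes would normally be read from the `properties` of GeoJSON features returned by a `/collections/{collectionId}/items` request. The sketch below assumes that layout; the base URL, the collection identifier and the sample values in the usage note are hypothetical, and only the standard `limit` and `bbox` parameters are used.

```python
from urllib.parse import urlencode


def items_url(base_url, collection_id, limit=10, bbox=None):
    """Build an OGC API Features items URL for one collection.

    base_url and collection_id are placeholders; the listing does not give them.
    bbox, if supplied, is (min_x, min_y, max_x, max_y).
    """
    params = {"limit": limit}
    if bbox is not None:
        params["bbox"] = ",".join(str(value) for value in bbox)
    query = urlencode(params, safe=",")
    return f"{base_url.rstrip('/')}/collections/{collection_id}/items?{query}"


def attribute_rows(feature_collection, fields):
    """Extract the chosen attributes from a GeoJSON FeatureCollection."""
    rows = []
    for feature in feature_collection.get("features", []):
        properties = feature.get("properties") or {}  # GeoJSON allows null properties
        rows.append({name: properties.get(name) for name in fields})
    return rows
```

With a made-up response such as `{"features": [{"properties": {"IDENTIFIER": "X1", "OFFIZIEL_FL": 0.4}}]}`, calling `attribute_rows(response, ["IDENTIFIER", "OFFIZIEL_FL"])` returns one row holding those two values.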

  18. MDOT SHA Roadway Administrative Classifications

    • hub.arcgis.com
    • data.imap.maryland.gov
    Updated Oct 21, 2020
    + more versions
    Cite
    ArcGIS Online for Maryland (2020). MDOT SHA Roadway Administrative Classifications [Dataset]. https://hub.arcgis.com/maps/maryland::mdot-sha-roadway-administrative-classifications-1
    Dataset updated
    Oct 21, 2020
    Dataset authored and provided by
    ArcGIS Online for Maryland
    Description

    Esri ArcGIS Online (AGOL) Hosted Feature Layer for accessing the MDOT SHA Roadway Administrative Classifications (State Classifications) data product. MDOT SHA Roadway Administrative Classifications (State Classifications) data consists of linear geometric features which specifically show State-maintained roadways included in the State Primary & State Secondary systems throughout the State of Maryland. Roadway Administrative Classifications data is primarily used for general planning & funding purposes by showcasing the State Primary vs. State Secondary highway systems. The Maryland Department of Transportation State Highway Administration (MDOT SHA) currently reports this data only on the inventory direction (generally North or East) side of the roadway. Roadway Administrative Classification is not a complete representation of all roadway geometry.

    MDOT SHA Roadway Administrative Classifications data is maintained & updated by the MDOT SHA Office of Planning & Preliminary Engineering (OPPE) Data Services Division (DSD). Roadway Administrative Classifications data is used by various business units throughout MDOT, as well as many other Federal, State and local government agencies. Roadway Administrative Classification data is key to understanding which State-maintained roadways are included in the State Primary & State Secondary systems throughout Maryland.

    MDOT SHA Roadway Administrative Classifications data is updated & published on an annual basis for the prior year. This data is for the year 2023.

    For more information related to the data, contact MDOT SHA OPPE Data Services Division (DSD): Email: DSD@mdot.maryland.gov

    For more information, contact MDOT SHA OIT Enterprise Information Services: Email: GIS@mdot.maryland.gov

  19. Schools Groß-Gerau County - ft2:School Locations - OGC API Features

    • data.europa.eu
    Cite
    Schools Groß-Gerau County - ft2:School Locations - OGC API Features [Dataset]. https://data.europa.eu/88u/dataset/047edfcb-bcce-fdc5-7a87-b1423d9919aa
    Explore at:
    Available download formats: inspire download service
    Description

    Schools including school types, contact details, further factual information and, if applicable, school districts (see specifications at https://www.gdi-suedhessen.de/fachthemen/pflichtenhefte/). Provided via the platform www.gdi-inspireumsetzer.de - A service of the GDI South Hesse.

  20. Biotope Cataster – Habitat Types fl – OGC API Features | gimi9.com

    • gimi9.com
    Updated Nov 7, 2023
    + more versions
    Cite
    (2023). Biotope Cataster – Habitat Types fl – OGC API Features | gimi9.com [Dataset]. https://gimi9.com/dataset/eu_e3b15c94-4af3-9c0d-70ce-fe6912b9601c/
    Dataset updated
    Nov 7, 2023
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    The Map Service (WFS Group) presents data from the Saarland biotope cadastre: habitat types of the Saarland in terms of area. This is the basic unit of object detection in biotope mapping. The areas are uniform in terms of terrain shape, use and vegetation equipment. The acquisition is selective, i.e. this object class is used exclusively for the detection and evaluation of FFH habitat types. Each area of different type and assessment has its own demarcation. Viewing object in the GDZ; Export the area-based feature class GDZ2010.A_ngbt and the business table with the factual data (GDZ2010.ngbt) to the FileGDB. In addition to numerous internal database attributes, the following user-relevant attributes are available: IDENTIFIER: OSIRIS Usage Type INFORMATION DATE: Date of acquisition in OSIRIS Inclusion type FLAECHENANZAHL OFFIZIEL_FL: Area in ha (official) GEOGENAU: Geometric Accuracy GKRW: Legal value GKHW: High value InsDate: Date of takeover in DGZ

Cite
Gregory Wilder (2025). Bitcoin Wallet Classification Public Dataset [Dataset]. https://www.kaggle.com/datasets/gregorywilder/bitcoin-wallet-classification-public-dataset

Bitcoin Wallet Classification Public Dataset

Explore at:
Available download formats: zip (2471936960 bytes)
Dataset updated
Mar 15, 2025
Authors
Gregory Wilder
License

Open Database License (ODbL) v1.0 (https://www.opendatacommons.org/licenses/odbl/1.0/)
License information was derived automatically

Description

Classification of 43,621,232 Bitcoin Wallet addresses into 14 categories (following the classifications defined by BABD, cited below): (1) Blackmail, (2) Cyber-Security Service, (3) Darknet Market, (4) Centralized Exchange, (5) P2P Financial Infrastructure Service, (6) P2P Financial Service, (7) Gambling, (8) Government Criminal Blocklist, (9) Money Laundering, (10) Ponzi Scheme, (11) Mining Pool, (12) Tumbler, (13) Individual Wallets, and (14) Unknown Service.

This database was made by combining info from four separate sources: Aleš Janda, who operates WalletExplorer.com, generously offers an API which provides significant wallet classification and identification that he was able to determine. Mr. Janda determined the owners of wallets by registering for certain services and learning the addresses used by those services (like SatoshiDice) and then backed in to other wallet addresses by determining which wallets were merged together (similar to a shadow wallet address analysis done by others). However, WalletExplorer states it has not updated much data since 2016. Thus, the data available from WalletExplorer is largely contained in the first 425,000 blocks (block 425,000 was mined on August 13, 2016). This classification data constituted, by far, the largest portion of classification data in this dataset, with over 43 million wallet addresses being classified. Preference was always given to WalletExplorer.com's classifications, as “Strong Addresses.” Aleš Janda. Wallet explorer. https://www.walletexplorer.com/info.

Additionally, several datasets (in spreadsheets) identify and classify wallets:

The largest of these spreadsheets available on Kaggle was made by Xiang, et al. They classified 544,462 wallet addresses between blocks 585,000 (July 12, 2019) and 685,000 (May 26, 2021) into 13 classifications: (1) Blackmail, (2) Cyber-Security Service, (3) Darknet Market, (4) Centralized Exchange, (5) P2P Financial Infrastructure Service, (6) P2P Financial Service, (7) Gambling, (8) Government Criminal Blocklist, (9) Money Laundering, (10) Ponzi Scheme, (11) Mining Pool, (12) Tumbler, and (13) Individual Wallets. They used the WalletExplorer database and other governmental blocklist databases as their “strong addresses” and information from BitcoinAbuse.com (which now appears to be ChainAbuse.com) as “weak addresses” to train their AI models. The processes they used to identify and classify the wallets on their experimental sets resulted in a minimum F1 score of 92.97%, accuracy of 93.24%, precision of 92.80%, and recall of 93.24%. They accomplished this by using a framework consisting of two parts: a statistical indicator (SI) and a local structural indicator (LSI). The SI considered four indicator types: Pure Amount Indicator (PAI), Pure Degree Indicator (PDI), Pure Time Indicator (PTI), and Combination Indicator (CI). The SI considered 132 features to predict the classification. For the LSI, they generated k-hop subgraphs with an algorithm, and then used various graph metrics and other features to predict the classification. Yuexin Xiang, Yuchen Lei, Ding Bao, Tiantian Li, Qingqing Yang, Wenmao Liu, Wei Ren, and Kim-Kwang Raymond Choo. BABD: A bitcoin address behavior dataset for pattern analysis. IEEE Transactions on Information Forensics and Security, 19:2171–2185, 2024. https://www.kaggle.com/datasets/lemonx/babd13
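To make the k-hop subgraph idea concrete, here is a minimal sketch using a plain breadth-first search over an edge list. It is a generic illustration, not the algorithm from the BABD paper, which this description does not spell out; the node names are placeholders and edge direction is ignored when measuring hop distance.

```python
from collections import deque


def k_hop_subgraph(edges, start, k):
    """Return (nodes, edges) of the subgraph within k hops of start.

    edges is a list of (source, target) pairs. Hop distance treats every
    edge as undirected; the returned edges keep their original direction.
    """
    neighbors = {}
    for source, target in edges:
        neighbors.setdefault(source, set()).add(target)
        neighbors.setdefault(target, set()).add(source)

    depth = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if depth[node] == k:
            continue  # do not expand beyond k hops
        for nxt in neighbors.get(node, ()):
            if nxt not in depth:
                depth[nxt] = depth[node] + 1
                queue.append(nxt)

    kept_edges = [(s, t) for s, t in edges if s in depth and t in depth]
    return set(depth), kept_edges
```

Graph metrics such as node degree or edge count can then be computed on the returned node and edge sets and used as features.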

Michalski, et al., identified 8,808 addresses and classified them, using machine learning techniques considering 149 features, into categories including (1) mining pools, (2) miners, (3) coinjoins, (4) gambling, (5) exchange, and (6) services. They analyzed the blocks between 520,850 and 520,950. They obtained training data from WalletExplorer.com and then evaluated several supervised learning algorithms. They determined that the Random Forest classification was the best method of classifying the wallets, and stated they obtained an F-score of 95%. Due to the small size of this dataset and the fact that only 100 blocks were covered by this dataset, I considered these classifications to be “weak addresses” for this work. The wallets they classified as “services” were added to my database as an “Unknown Service.” Radoslaw Michalski, Daria Dziubałtowska, and Piotr Macek. Bitcoin addresses and their categories. https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/KEWU0N

The US Office of Foreign Assets Control (OFAC) maintains a list of ‘sanctioned’ Bitcoin and other digital currency assets. Several GitHub contributors maintain a tool that extracts the Bitcoin addresses from this database. This added 390 wallet addresses to the dataset. U.S. Treasury. Specially designated nationals list of the U.S. Office of Foreign Assets Control. https://www.treasury.gov/ofac/downloads/sanctions/1.0/sdn_advanced.xml. 0xB10C, Michael Neale, and Yahiheb. https://github.com/0xB10C/ofac-sanctioned-digital-currency-addresses?tab=readme-ov-file
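The way the sources are combined (strong classifications take precedence over weak ones, and “services” is recorded as “Unknown Service”) can be sketched as a small priority merge. This is an illustration under assumptions, not the author's actual pipeline: the strength constants, the function name and the placeholder addresses in the usage note are all hypothetical.

```python
# Hypothetical strength levels; a higher value wins when sources disagree.
STRONG, WEAK = 2, 1

# Label renames described in the text ("services" becomes "Unknown Service").
RENAMES = {"services": "Unknown Service"}


def merge_labels(sources):
    """Merge address labels from several sources, preferring stronger ones.

    sources is an iterable of (strength, {address: category}) pairs.
    When two sources have equal strength, the one seen first is kept.
    """
    merged = {}
    for strength, labels in sources:
        for address, category in labels.items():
            category = RENAMES.get(category, category)
            current = merged.get(address)
            if current is None or strength > current[0]:
                merged[address] = (strength, category)
    return {address: category for address, (_, category) in merged.items()}
```

For example, merging `(STRONG, {"addr1": "Gambling"})` with `(WEAK, {"addr1": "services", "addr2": "services"})` keeps “Gambling” for `addr1` and records `addr2` as “Unknown Service”, regardless of the order in which the two sources are supplied.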
