4 datasets found
  1. Pagerank Dataset for Bitcoin Blockchain - Part 1 of 2

    • zenodo.org
    • data.niaid.nih.gov
    bz2, txt
    Updated Dec 19, 2022
    Baran Kılıç; Can Özturan; Alper Şen (2022). Pagerank Dataset for Bitcoin Blockchain - Part 1 of 2 [Dataset]. http://doi.org/10.5281/zenodo.6052811
    Available download formats: bz2, txt
    Dataset updated
    Dec 19, 2022
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Baran Kılıç; Can Özturan; Alper Şen
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset contains the Pagerank values and rankings of Bitcoin addresses and transaction IDs (TXIDs). It contains a total of 1,608,748,675 addresses and TXIDs.

    Part 2 is available at https://zenodo.org/deposit/6077428

    File format

    The dataset is compressed with bzip2 and can be decompressed with the bunzip2 command. Because of its size, it is split into multiple files. Each file is space-delimited plain text with the following five fields:

    Label: An alphanumeric Bitcoin address (e.g. 1DzTCMmWABEDM1rYFL1RgdLyE59jXMzEHV) or a 64-character hexadecimal transaction ID (e.g. 000000000fdf0c619cd8e0d512c7e2c0da5a5808e60f12f1e0d01522d2986a51). Type: String

    Label type: 0 if the label is a transaction ID, 1 if the label is a Bitcoin address. Type: Integer

    Rank: Unique Pagerank rank, where ties (addresses having the same Pagerank value) are resolved by sorting the addresses. Type: Integer

    Rank with ties: Pagerank rank where ties (addresses having the same Pagerank value) share the same rank. Type: Integer

    Pagerank value: Pagerank of the address or transaction ID, calculated using the Pagerank algorithm. Type: Floating-point number
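
The difference between the two rank fields can be illustrated with a small sketch. The description does not say whether rank 1 is the highest value, in which order tied labels are sorted, or whether tied ranks are dense (1, 2, 2, 3) or competition style (1, 2, 2, 4). The choices below (highest value first, ascending label order, dense ties) are assumptions made only for illustration, and `assign_ranks` is a hypothetical helper.

```python
from itertools import groupby


def assign_ranks(scores):
    """Return {label: (unique_rank, tied_rank)} for a {label: score} mapping.

    Assumptions (not confirmed by the dataset description):
    - rank 1 goes to the highest Pagerank value;
    - ties in the unique rank are broken by ascending label order;
    - tied ranks are dense (1, 2, 2, 3).
    """
    # Sort by descending score, then by label so the order is deterministic.
    ordered = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    ranks = {}
    unique_rank = 0
    tied_rank = 0
    # Consecutive entries with the same score form one tie group.
    for _, group in groupby(ordered, key=lambda item: item[1]):
        tied_rank += 1
        for label, _score in group:
            unique_rank += 1
            ranks[label] = (unique_rank, tied_rank)
    return ranks


# Toy labels and values, invented for this example.
toy_scores = {"addr_b": 0.5, "addr_a": 0.5, "tx_1": 0.9, "addr_c": 0.1}
toy_ranks = assign_ranks(toy_scores)
```

Here `addr_a` and `addr_b` receive different unique ranks but the same tied rank.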

    Sample lines:

    000000000fdf0c619cd8e0d512c7e2c0da5a5808e60f12f1e0d01522d2986a51 0 427225664 266976712 0.979246
    1DzTCMmWABEDM1rYFL1RgdLyE59jXMzEHV 1 1114666798 508037940 0.877961
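
As a minimal sketch of how lines in this format might be read without decompressing the files to disk first, the following uses Python's standard `bz2` module. The names `PagerankRecord`, `parse_line` and `read_records` are hypothetical, and the file path passed to `read_records` is up to the reader, since the description does not list the file names.

```python
import bz2
from typing import Iterator, NamedTuple


class PagerankRecord(NamedTuple):
    label: str            # Bitcoin address or 64-character TXID
    is_address: bool      # True when the label type field is 1
    rank: int             # unique rank
    rank_with_ties: int   # rank shared by tied labels
    pagerank: float


def parse_line(line: str) -> PagerankRecord:
    """Split one space-delimited line into its five typed fields."""
    label, label_type, rank, tied_rank, value = line.split()
    return PagerankRecord(label, label_type == "1", int(rank), int(tied_rank), float(value))


def read_records(path: str) -> Iterator[PagerankRecord]:
    """Stream records from a bzip2-compressed file, one line at a time."""
    # bz2.open in text mode decompresses on the fly.
    with bz2.open(path, "rt", encoding="ascii") as handle:
        for line in handle:
            if line.strip():
                yield parse_line(line)


# One of the sample lines from the description.
sample = "1DzTCMmWABEDM1rYFL1RgdLyE59jXMzEHV 1 1114666798 508037940 0.877961"
record = parse_line(sample)
```

Streaming line by line keeps memory use flat, which matters for files of this size.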

    "head.txt" contains the first 10 lines of each file. "tail.txt" contains the last 10 lines of each file.

    Dataset Generation

    The Bitcoin transactions in blocks 0 (mined on 3 January 2009) through 713,999 (mined on 13 December 2021) were extracted. A transaction graph was constructed in which Bitcoin addresses and transaction IDs are the nodes and transaction inputs and outputs are the edges. Pagerank was then applied to this transaction graph. The computation was performed using the system presented in the paper 'Parallel analysis of Ethereum blockchain transaction data using cluster computing'.
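
The idea can be sketched on a toy graph. This is not the authors' cluster-computing system: it is a plain power-iteration Pagerank in pure Python, and the edge direction (input address to transaction, transaction to output address), the damping factor of 0.85, the handling of nodes without outgoing edges, and the normalization to a sum of 1 are all assumptions made for illustration.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration Pagerank over a list of (source, target) edges."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [] for node in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    n = len(nodes)
    scores = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Score held by nodes without outgoing edges is spread over all nodes.
        dangling = sum(scores[node] for node in nodes if not out_links[node])
        base = (1.0 - damping) / n + damping * dangling / n
        new_scores = {node: base for node in nodes}
        for src in nodes:
            targets = out_links[src]
            for dst in targets:
                new_scores[dst] += damping * scores[src] / len(targets)
        scores = new_scores
    return scores


# Toy graph with invented labels: an input is an edge address -> transaction,
# an output is an edge transaction -> address (assumed direction).
toy_edges = [
    ("addr_A", "tx_1"),
    ("tx_1", "addr_B"),
    ("tx_1", "addr_C"),
    ("addr_B", "tx_2"),
    ("tx_2", "addr_C"),
]
toy_pagerank = pagerank(toy_edges)
```

In this toy graph `addr_C` ends up with the highest score because it receives outputs from both transactions, while `addr_A`, which has no incoming edges, ends up with the lowest.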

    Note

    If you use our dataset in your research, please cite our paper: https://link.springer.com/article/10.1007/s10586-021-03511-0

    @article{kilic2022parallel, 
     title={Parallel Analysis of Ethereum Blockchain Transaction Data using Cluster Computing}, 
     journal={Cluster Computing},
     author={K{\i}l{\i}{\c{c}}, Baran and {\"O}zturan, Can and {\c{S}}en, Alper},
     year={2022},
     month={Jan} 
    }

    Other Datasets

    If you are interested, please also check out our Pagerank Dataset for Ethereum Blockchain.

  2. Prices of top cryptocurrencies

    • kaggle.com
    Updated Jan 2, 2022
    Kuntal Maity (2022). Prices of top cryptocurrencies [Dataset]. https://www.kaggle.com/datasets/kuntalmaity/prices-of-top-cryptocurrencies/code
    Dataset updated
    Jan 2, 2022
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Kuntal Maity
    Description

    Context

    Terms like blockchain, Bitcoin, Bitcoin Cash, Ethereum, and Ripple keep coming up in the news articles I read, so I wanted to understand them better, and this post helped me get started. Once I had the basics, the data scientist in me started asking questions like:

    How many cryptocurrencies are there, and what are their prices and valuations? Why has interest surged so suddenly in recent days?

    So what next? With the price data in hand, I wanted to dig further into the factors affecting coin prices. I started with Bitcoin, whose price is affected by quite a few parameters. Thanks to Blockchain Info, I was able to collect many of these parameters at two-day intervals.

    This should help in understanding the other factors related to the Bitcoin price, and in making better predictions than the historical price alone allows.

    Content

    The dataset has one CSV file for each currency. Price history is available on a daily basis from April 28, 2013. The dataset contains historical price information for some of the top cryptocurrencies by market capitalization.

    Date: date of observation (1 Jan 2014 to 1 Jan 2022)
    Open: opening price on the given day
    High: highest price on the given day
    Low: lowest price on the given day
    Close: closing price on the given day
    Volume: volume of transactions on the given day
    Market cap: market capitalization of the coin
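
A minimal sketch of reading one such file and deriving close-to-close daily returns is shown below. The exact header spellings (for example `Market Cap`), the date format, and the numeric formatting are assumptions, since the description does not show a raw file. The sample rows are made up for illustration and are not taken from the dataset. `load_prices` and `daily_returns` are hypothetical helpers.

```python
import csv
import io

# Made-up rows for illustration only; header names and date format are assumed.
SAMPLE_CSV = """Date,Open,High,Low,Close,Volume,Market Cap
2021-12-31,100.0,110.0,95.0,105.0,1000,50000
2022-01-01,105.0,120.0,100.0,118.0,1500,56000
"""


def load_prices(text_stream):
    """Read daily price rows from a CSV stream into a list of dicts."""
    rows = []
    for row in csv.DictReader(text_stream):
        rows.append({
            "date": row["Date"],
            "open": float(row["Open"]),
            "high": float(row["High"]),
            "low": float(row["Low"]),
            "close": float(row["Close"]),
            "volume": float(row["Volume"]),
            "market_cap": float(row["Market Cap"]),
        })
    return rows


def daily_returns(rows):
    """Close-to-close return for each day after the first."""
    return [
        (cur["date"], cur["close"] / prev["close"] - 1.0)
        for prev, cur in zip(rows, rows[1:])
    ]


prices = load_prices(io.StringIO(SAMPLE_CSV))
returns = daily_returns(prices)
```

For a real file, an `open(path, newline="")` handle can be passed to `load_prices` in place of the `io.StringIO` object.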

  3. Verified Smart Contract Code Comments

    • figshare.com
    zip
    Updated Aug 21, 2023
    André Storhaug (2023). Verified Smart Contract Code Comments [Dataset]. http://doi.org/10.6084/m9.figshare.20780878.v2
    Available download formats: zip
    Dataset updated
    Aug 21, 2023
    Dataset provided by
    figshare
    Authors
    André Storhaug
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Artifact Description

    Verified Smart Contracts Code Comments is a dataset of real Ethereum smart contract functions, containing "code, comment" pairs from both Solidity and Vyper source code. The dataset is based on every Ethereum smart contract deployed as of 1 April 2022 that has been verified on Etherscan and has at least one transaction. A total of 1,541,370 smart contract functions are provided, parsed from 186,397 unique smart contracts, filtered down from 2,217,692 smart contracts.

    The dataset contains three folders: "train", "validation" and "test". Each folder contains several enumerated files in the Apache Parquet data file format.

    Environment Setup

    The Pandas library for Python is required to load the dataset. Both Unix-based and Windows systems are supported.

    Getting Started

    The following code snippet demonstrates how to load the dataset into a Pandas DataFrame.

    >>> import pandas as pd
    >>> df = pd.read_parquet("path/to/data")

    License

    All smart contracts in the dataset are publicly available, obtained by using Etherscan APIs, and subject to their own original licenses.

  4. Monthly size of crypto theft 2020-2022

    • statista.com
    Updated Jul 23, 2025
    Statista (2025). Monthly size of crypto theft 2020-2022 [Dataset]. https://www.statista.com/statistics/1285057/crypto-theft-size/
    Dataset updated
    Jul 23, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Feb 3, 2022
    Area covered
    Worldwide
    Description

    The value of crypto lost to security threats grew over **** times between 2020 and 2021, with *** incident in August 2021 accounting for *** million U.S. dollars stolen. During this incident, claimed to be *** of the biggest cryptocurrency heists of all time, an individual targeted the Ethereum-based DeFi application Poly Network by exploiting a flaw in the network's code. After Poly Network pleaded with the hacker, the anonymous hacker handed back about half of the money (*** million U.S. dollars), claiming he did the hack "for fun".
