4 datasets found

Data from: AdvSCanner: Generating Adversarial Smart Contracts to Exploit Reentrancy Vulnerabilities Using LLM and Static Analysis
figshare.com
Updated Sep 13, 2024
Cite
yin wu (2024). AdvSCanner: Generating Adversarial Smart Contracts to Exploit Reentrancy Vulnerabilities Using LLM and Static Analysis [Dataset]. http://doi.org/10.6084/m9.figshare.26014876.v4
Available download formats
text/x-script.python
Unique identifier
https://doi.org/10.6084/m9.figshare.26014876.v4
Dataset updated
Sep 13, 2024
Dataset provided by
Figshare (http://figshare.com/)
Authors
yin wu
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
AGEStatic

AGEStatic is a project aimed at enhancing the security of Ethereum smart contracts by automatically generating exploit smart contracts. It uses large language models (LLMs) and static analysis to generate adversarial smart contracts (ASCs) designed to exploit reentrancy vulnerabilities in victim contracts, which are among the most critical security issues in smart contracts.

Dataset

We collected and integrated smart contracts with reentrancy vulnerabilities from various sources. To obtain more representative samples, we filtered out ineligible and duplicate smart contracts according to the standards listed below, resulting in a total of 78 unique smart contracts (14 are duplicates).

Size: The dataset includes 78 smart contracts (14 duplicates), each verified for relevance and uniqueness, from sources such as ERAP, ESC, Smartbugs, RSD, ATR, and SSE.

Standards for Dataset Collection:

Solidity Smart Contract: The AGEStatic tool targets Solidity smart contracts, with Solidity versions ranging from 0.4.0 to 0.8.25.

Open-source and Peer-reviewed Dataset: The reentrancy vulnerability datasets are collected from widely used or peer-reviewed open-source datasets that have gained general public acceptance and are applied in relevant research.

Marked as Reentrancy Vulnerability: The most vital standard requires the existence of a reentrancy vulnerability, which falls into one of two types: manually injected vulnerability (MI) and real-world vulnerability (RW).

Detection by Static Analysis Tool: The contracts in the dataset should be identified as containing a reentrancy vulnerability by traditional static analysis tools that output a reentrancy report for each contract.

Fully Functional Characteristics: Smart contracts with only partial functions cannot support attack verification experiments; therefore, the contracts must be logically complete and fully functional.

Physical Experiment

This section describes the environment and code used for running the static analysis experiments and generating exploit contracts.

Static Analysis: The static analysis experiments, obtained from GitHub, are run on an Ubuntu 22.04 system with the following hardware specifications:

Operating System: Ubuntu 22.04

CPU: Intel(R) Core(TM) i7-9750H @ 2.60GHz (2 cores and 2 threads)

Cache Size: 12288 KB

Memory Size: 6085248 KB

Exploit Contract Generation: We use the APIs of gpt-3.5-turbo, gpt-4, or gpt-4o from Python. The environment specifications are as follows:

Required Packages:

python==3.10.0

openai==0.28.0

py-solc-x==2.0.2

Experiment Results

The experimental results include RQ1, RQ2, RQ3, and RQ4.
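The Solidity version standard stated in the description (0.4.0 to 0.8.25) can be illustrated with a small filter. This is only a sketch, not the authors' tooling: the function names and the regular expression are hypothetical, and it assumes the first version number in a contract's pragma line is the one that counts.

```python
import re

# Version bounds taken from the dataset description (0.4.0 to 0.8.25, inclusive).
MIN_VERSION = (0, 4, 0)
MAX_VERSION = (0, 8, 25)

# Hypothetical pattern: captures the first concrete version in a pragma line,
# e.g. "pragma solidity ^0.4.24;" gives ("0", "4", "24").
PRAGMA_RE = re.compile(r"pragma\s+solidity\s+[^;]*?(\d+)\.(\d+)\.(\d+)")


def pragma_version(source: str):
    """Return the first pragma version as a tuple of ints, or None if absent."""
    match = PRAGMA_RE.search(source)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def in_supported_range(source: str) -> bool:
    """True when the declared version lies within the stated bounds."""
    version = pragma_version(source)
    return version is not None and MIN_VERSION <= version <= MAX_VERSION
```

Comparing versions as integer tuples avoids the string-ordering pitfall where "0.8.9" would sort after "0.8.25".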
Job Dataset
kaggle.com
zip
Updated Sep 17, 2023
Cite
Ravender Singh Rana (2023). Job Dataset [Dataset]. https://www.kaggle.com/datasets/ravindrasinghrana/job-description-dataset
Available download formats
zip (479575920 bytes)
Dataset updated
Sep 17, 2023
Authors
Ravender Singh Rana
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
Job Dataset

This dataset provides a comprehensive collection of synthetic job postings to facilitate research and analysis in the field of job market trends, natural language processing (NLP), and machine learning. Created for educational and research purposes, this dataset offers a diverse set of job listings across various industries and job types.

Descriptions for each of the columns in the dataset:

Job Id: A unique identifier for each job posting.

Experience: The required or preferred years of experience for the job.

Qualifications: The educational qualifications needed for the job.

Salary Range: The range of salaries or compensation offered for the position.

Location: The city or area where the job is located.

Country: The country where the job is located.

Latitude: The latitude coordinate of the job location.

Longitude: The longitude coordinate of the job location.

Work Type: The type of employment (e.g., full-time, part-time, contract).

Company Size: The approximate size or scale of the hiring company.

Job Posting Date: The date when the job posting was made public.

Preference: Special preferences or requirements for applicants (e.g., Only Male, Only Female, or Both).

Contact Person: The name of the contact person or recruiter for the job.

Contact: Contact information for job inquiries.

Job Title: The job title or position being advertised.

Role: The role or category of the job (e.g., software developer, marketing manager).

Job Portal: The platform or website where the job was posted.

Job Description: A detailed description of the job responsibilities and requirements.

Benefits: Information about benefits offered with the job (e.g., health insurance, retirement plans).

Skills: The skills or qualifications required for the job.

Responsibilities: Specific responsibilities and duties associated with the job.

Company Name: The name of the hiring company.

Company Profile: A brief overview of the company's background and mission.
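As a rough illustration of how the column list above might be used, the sketch below checks a CSV header against those names using only the standard library. The header spellings are copied from the description and are an assumption; the downloaded file may use different capitalization or names, and the two-column sample is hypothetical.

```python
import csv
import io

# Column names copied from the description; the exact header spelling
# in the downloaded CSV is an assumption and may differ.
EXPECTED_COLUMNS = [
    "Job Id", "Experience", "Qualifications", "Salary Range", "Location",
    "Country", "Latitude", "Longitude", "Work Type", "Company Size",
    "Job Posting Date", "Preference", "Contact Person", "Contact",
    "Job Title", "Role", "Job Portal", "Job Description", "Benefits",
    "Skills", "Responsibilities", "Company Name", "Company Profile",
]


def missing_columns(csv_file) -> list:
    """Return the expected columns that are absent from the CSV header."""
    reader = csv.DictReader(csv_file)
    present = set(reader.fieldnames or [])
    return [name for name in EXPECTED_COLUMNS if name not in present]


# Hypothetical two-column file, used only to exercise the check.
sample = io.StringIO("Job Id,Job Title\n1,Data Analyst\n")
missing = missing_columns(sample)
```

Running the same check on the real file before any analysis would surface header mismatches early, which is cheaper than debugging a failed column lookup later.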

Potential Use Cases:

Building predictive models to forecast job market trends.

Enhancing job recommendation systems for job seekers.

Developing NLP models for resume parsing and job matching.

Analyzing regional job market disparities and opportunities.

Exploring salary prediction models for various job roles.

Acknowledgements:

We would like to express our gratitude to the Python Faker library for its invaluable contribution to the dataset generation process. Additionally, we appreciate the guidance provided by ChatGPT in fine-tuning the dataset, ensuring its quality, and adhering to ethical standards.

Note:

Please note that the examples provided are fictional and for illustrative purposes. You can tailor the descriptions and examples to match the specifics of your dataset. It is not suitable for real-world applications and should only be used within the scope of research and experimentation. You can also reach me via email at: rrana157@gmail.com
TSMC Stock Daily Updated
kaggle.com
zip
Updated Nov 22, 2025
Cite
The Hidden Layer (2025). TSMC Stock Daily Updated [Dataset]. https://www.kaggle.com/datasets/isaaclopgu/tsmc-stock-daily-updated
Available download formats
zip (297189 bytes)
Dataset updated
Nov 22, 2025
Authors
The Hidden Layer
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
About this Dataset

This dataset offers a comprehensive, up-to-date look at the historical stock performance of Taiwan Semiconductor Manufacturing Company (TSMC), the world's largest contract chip manufacturer. The data is provided in a clean, daily format, making it an excellent resource for financial analysis, machine learning, and time series modeling.

About the Company

Taiwan Semiconductor Manufacturing Company, Ltd. (TSMC) is a Taiwanese multinational semiconductor contract manufacturing and design company. Founded in 1987 and headquartered in Hsinchu, Taiwan, it is a key player in the global technology supply chain, producing chips for many of the world's leading tech companies, including Apple, NVIDIA, and AMD. TSMC's stock performance is a significant indicator of the health of the semiconductor industry and global demand for advanced electronics.

Key Features

Daily OHLCV Data: The dataset contains essential Open, High, Low, Close, and Volume metrics for each trading day.

Comprehensive History: Includes data from TSMC's early trading history to the present, offering a long-term perspective.

Regular Updates: The dataset is designed for regular, automated updates to ensure data freshness for time-sensitive projects.

Data Dictionary

Date: The date of the trading session in YYYY-MM-DD format.

ticker: The standard ticker symbol for Taiwan Semiconductor Manufacturing Company Ltd. on the NYSE: 'TSM'.

name: The full name of the company: 'Taiwan Semiconductor Manufacturing Company Ltd.'.

Open: The stock price in USD at the start of the trading session.

High: The highest price reached during the trading day in USD.

Low: The lowest price recorded during the trading day in USD.

Close: The final stock price at market close in USD.

Volume: The total number of shares traded on that day.
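To show how the fields in the data dictionary fit together, here is a minimal sketch that computes simple daily returns from consecutive Close values with the standard library. It assumes rows are sorted by Date in ascending order and that the header matches the names listed above; the function name is hypothetical and the sample prices are made up, not real TSM quotes.

```python
import csv
import io


def daily_returns(csv_file):
    """Yield (date, simple return) pairs from consecutive Close values."""
    previous_close = None
    for row in csv.DictReader(csv_file):
        close = float(row["Close"])
        if previous_close is not None:
            # Simple return: today's close relative to the previous close.
            yield row["Date"], close / previous_close - 1.0
        previous_close = close


# Made-up prices for illustration only; these are not real TSM quotes.
sample = io.StringIO(
    "Date,ticker,name,Open,High,Low,Close,Volume\n"
    "2000-01-03,TSM,Example,10.0,10.5,9.8,10.0,1000\n"
    "2000-01-04,TSM,Example,10.0,11.2,10.0,11.0,1200\n"
    "2000-01-05,TSM,Example,11.0,11.0,9.7,9.9,900\n"
)
returns = list(daily_returns(sample))
```

The first trading day has no previous close, so the result holds one fewer entry than the input has rows.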

Data Collection

The data for this dataset is collected using the yfinance Python library, which pulls information directly from the Yahoo Finance API.

Potential Use Cases

Financial Analysis: Analyze historical price trends, volatility, and trading volume of TSMC stock.

Machine Learning: Develop and test models for stock price prediction and time series forecasting.

Educational Projects: A perfect real-world dataset for students and data enthusiasts to practice data cleaning, visualization, and modeling.
Augmented Texas 7000-bus synthetic grid
search.dataone.org
Updated Oct 29, 2025
Cite
Aravena, Ignacio; Jiyu Wang (2025). Augmented Texas 7000-bus synthetic grid [Dataset]. http://doi.org/10.7910/DVN/AKUDJT
Unique identifier
https://doi.org/10.7910/DVN/AKUDJT
Dataset updated
Oct 29, 2025
Dataset provided by
Harvard Dataverse
Authors
Aravena, Ignacio; Jiyu Wang
Description
Augmented Texas 7000-bus synthetic grid

Augmented version of the synthetic Texas 7k dataset published by Texas A&M University. The system has been populated with high-resolution distributed photovoltaic (PV) generation, comprising 4,499 PV plants of varying sizes with associated time series for 1 year of operation. This high-resolution dataset was produced from publicly available data and is free of CEII. Details on the procedure followed to generate the PV dataset can be found in the Open COG Grid Project Year 1 Report (Chapter 6). The technical data of the system is provided using the (open) CTM specification for easy accessibility from Python without additional packages (data can be loaded as a dictionary). The time series for demand and PV production are provided as an HDF5 file, also loadable with standard open-source tools. We additionally provide example scripts for parsing the data in Python. Prepared by LLNL under Contract DE-AC52-07NA27344. LLNL control number: LLNL-DATA-2001833.