Table of Links
Abstract and 1. Introduction
-
Background
2.1 Ethereum Primer
2.2 Whitelisted Address Verification
2.3 Taint Analysis on Smart Contracts and 2.4 Threat Model
-
Motivating Example and Challenges
3.1 Motivating Example
3.2 Challenges
3.3 Limitations of Existing Tools
-
Design of AVVERIFIER and 4.1 Overview
4.2 Notations
4.3 Component#1: Code Grapher
4.4 Component#2: EVM Simulator
4.5 Component#3: Vulnerability Detector
-
Evaluation
5.1 Experimental Setup & Research Questions
5.2 RQ1: Effectiveness & Efficiency
5.3 RQ2: Characteristics of Real-world Vulnerable Contracts
5.4 RQ3: Real-time Detection
-
Discussion
6.1 Threats to Validity and 6.2 Limitations
6.3 Ethical Consideration
-
Related Work
-
Conclusion, Availability, and References
5 Evaluation
5.1 Experimental Setup & Research Questions
Baselines. To the best of our knowledge, no existing smart contract analysis framework supports the detection of address verification vulnerability. However, to illustrate the effectiveness of AVVERIFIER, we have extended Mythril [70] (commit: f5e2784), Ethainter [14], Jackal [39] (commit: 3993e5c), and ETHBMC [33] (commit: e887f33), four popular contract analyzers, as baselines. Specifically, Mythril is a well-known and widely adopted symbolic executor that is specifically designed to detect vulnerabilities in Ethereum smart contracts. Thus, we firstly integrate the same taint propagation rules into Mythril. Then, we employ the same three-stage detection logic as we stated in §4.5. Consequently, Mythril is modified as a static symbolic execution based taint analyzer. As for Ethainter, it is a source-code level reputable, extensively utilized, and scalable taint analyzer based on Datalog. After being given a source code, Ethainter will first perform a complete analysis and extract the control flow and data flow dependencies in the contract. Therefore, we also implement a three-stage detector in Datalog to filter out the ones that comply with the rules mentioned in §4.5 from all generated states. Regarding Jackal, it decompiles Ethereum smart contracts’ bytecode to an intermediate representation for constructing control flow graphs. We integrate the same taint propagation rules in Jackal, and employ the three-stage detection logic outlined in §4.5. This modification enhances Jackal’s capability to focus on identifying address verification vulnerabilities in smart contracts. Last, ETHBMC utilizes bounded model checking and symbolic execution. We also manually integrate the three-stage detection logic into it to sharpen its focus on identifying address verification vulnerabilities. Because detectors in all these tools follow the same set of principles on semantic level, this can reflect the distinction among them in terms of effectiveness and efficiency.
Implementation of AVVERIFIER. AVVERIFIER is written in Python and consists of 1.3K lines of code. As shown in Fig. 1, it is composed of three main modules:
Code Grapher. It is responsible for disassembling the given bytecode into opcodes, and constructing the CFG according to the statical function call opcodes as we mentioned in §4.3. Furthermore, in the function selector, the Grapher selects suspicious functions according to the heuristic in §4.3, and obtains the corresponding entry basic block.
EVM Simulator. The body of the Simulator is basically a two-layer nested loop, the outer one iterates all suspicious functions collected from the Grapher, and the inner one iterates all opcodes. According to the taint propagation rules defined in §4.4, a state, which is composed of data structures like EVM stack, memory, storage, and taint information, is updated. When JUMPI is encountered, AVVERIFIER employs the heuristic-based path selection approach. The final state of each path will be sent to the Detector. Once a function is labelled as vulnerable, the inner loop will break to the next function to improve the analysis efficiency.
Vulnerability Detector. Leveraging the state yielded by the Simulator, all states undergo a three-phase check as introduced in §4.5. If any state can pass all three phases, i.e., vulnerable, it will be returned to Simulator immediately.
Experimental Setup. The experiments are conducted on a 48 core server equipped with two Intel Xeon 6248R processors, accompanied by 256GB RAM, while its time limitation for each contract is 10 minutes.
We answer the following research questions (RQs):
RQ1 Is AVVERIFIER efficient and effective in identifying the address verification vulnerability?
RQ2 How many smart contracts are vulnerable in the wild and what are their characteristics?
RQ3 Can AVVERIFIER be deployed as a real-time detection system?
To answer RQ1, we launch AVVERIFIER and four baseline tools on a well-constructed benchmark and real-world contracts in the wild, ranging from the genesis block to the one with the height of 17,421,000, created at Jun. 6th, 2023, around 59 million contracts in total. Among these contracts, we regard the ones that are involved in at least one piece of transaction as the worth being analyzed one. Thus, 5,158,101 smart contracts remain as candidates. To answer RQ2, we characterize the vulnerable contracts from different perspectives, e.g., number of transaction and tokens involved. To answer RQ3, we have deployed AVVERIFIER on both Ethereum and BSC [20], a well-known EVM-like blockchain platform with the market cap as $37.7 billion [24]. We comprehensively its usability and scalability of being a real-time detector.
Authors:
(1) Tianle Sun, Huazhong University of Science and Technology;
(2) Ningyu He, Peking University;
(3) Jiang Xiao, Huazhong University of Science and Technology;
(4) Yinliang Yue, Zhongguancun Laboratory;
(5) Xiapu Luo, The Hong Kong Polytechnic University;
(6) Haoyu Wang, Huazhong University of Science and Technology.