Table of Links
Abstract and 1. Introduction
-
Background
2.1 Ethereum Primer
2.2 Whitelisted Address Verification
2.3 Taint Analysis on Smart Contracts and 2.4 Threat Model
-
Motivating Example and Challenges
3.1 Motivating Example
3.2 Challenges
3.3 Limitations of Existing Tools
-
Design of AVVERIFIER and 4.1 Overview
4.2 Notations
4.3 Component#1: Code Grapher
4.4 Component#2: EVM Simulator
4.5 Component#3: Vulnerability Detector
-
Evaluation
5.1 Experimental Setup & Research Questions
5.2 RQ1: Effectiveness & Efficiency
5.3 RQ2: Characteristics of Real-world Vulnerable Contracts
5.4 RQ3: Real-time Detection
-
Discussion
6.1 Threats to Validity and 6.2 Limitations
6.3 Ethical Consideration
-
Related Work
-
Conclusion, Availability, and References
3 Motivating Example and Challenges
3.1 Motivating Example
Listing 3 shows a smart contract owned by Visor Finance that is vulnerable to the address verification vulnerability. It was attacked on Dec. 21st, 2021 [11], causing $8.2 million financial losses. As we can see, the deposit function takes three arguments, namely, the number of tokens to be deposited (visrDeposit), the payer (from), and the beneficiary (to). From L6 to L8, it performs sanity checks, i.e., a valid amount of deposit, and valid addresses for both payer and payee. After that, it translates the deposit into shares according to totalSupply() (L11 to L14), performs the corresponding token transfer from the from address to itself (L16 to L22), and mints some vvisr tokens to the to address (L24). The vulnerability is hidden in the if code block
at L16. Specifically, it allows the from address as a contract, and examines if its owner function returns the address of the transaction initiator (L17). If the assertion passes, then it invokes the delegatedTransferERC20 function defined in from. Recalling the threat model mentioned in §2.4, attackers are able to deploy contracts and initiate transactions arbitrarily. More specific, if the from is actually provided by some malicious ones, they can control the behaviors of L17 and L18. To this end, the control flow will be successfully directed to L24, where vvisr finally issues tokens to to, which is also controlled by attackers, without receiving any tokens from from that is expected by developers of Visor Finance.
Through this example, we can summarize three principles related to the address verification vulnerability:
P1 The vulnerable function takes an address as a parameter, and performs insufficient authorization examination on that address. Through the address, attackers can pass self-deployed and unauthorized contracts.
P2 The address in P1 is taken as the target of an external call. Through the external call, the control flow is transferred to attackers. Thus, they can totally control the behavior of the external call, including the return value.
P3 On-chain states that are control-flow dependent on the return value mentioned in P2 are updated. To this end, through an unauthorized control flow, adversaries can get profits by indirectly manipulating on-chain states, like initiating an external call or updating balance.
3.2 Challenges
In response to the address verification vulnerability, as outlined in the summarized principles and the motivating example in §3.1, we identify the following challenges.
C1: Lack of semantics. It is challenging to precisely identify if the address mentioned in P1 is sufficiently verified due to the lack of semantics in bytecode. According to the statistics [59], more than 99% of Ethereum contracts have not released their source code. The bytecode format is quite unreadable and contains little semantic information. Moreover, there is no debug information to assist in recovering the semantics. Traditional bytecode-based analysis tools usually require some methods to overcome this challenge, like symbolic execution [70].
C2: Inter-procedural analysis on control flow and data flow. Detecting this vulnerability requires accurately extracting the control flow and data flow dependencies inter-procedurally. Specifically, in P2, there is an external call to an address, which is passed via the argument. Between the external call and the function entry, it will be propagated several times due to the authorization verification. Thus, we have to precisely identify if the callee address is one of the arguments through parsing data flow. Moreover, in P3, the on-chain state update depends on the return value of the external call in P2 in terms of control flow, which requires us to identify control flow dependencies among variables. In addition, from Listing 2 we can conclude that some authorizations are verified in other functions, which requires inter-procedural analysis.
Authors:
(1) Tianle Sun, Huazhong University of Science and Technology;
(2) Ningyu He, Peking University;
(3) Jiang Xiao, Huazhong University of Science and Technology;
(4) Yinliang Yue, Zhongguancun Laboratory;
(5) Xiapu Luo, The Hong Kong Polytechnic University;
(6) Haoyu Wang, Huazhong University of Science and Technology.