Table of Links
Abstract and 1 Introduction
2 Background and Related Work and 2.1 From Bitcoin to Blockchains
2.2 Open and Permissionless Blockchains
2.3 Interoperability Between Blockchains
3 Cross-Chain Query Language and 3.1 Integrated Data Model
3.2 Grammar and Query Processing Architecture
4 Evaluation of Implementation Feasibility and 4.1 Software and Hardware Configuration
4.2 Query Processing
4.3 Discussion
5 Conclusion and Outlook, Acknowledgment, and References
3 Cross-Chain Query Language
The following two subsections detail the data model (3.1) and the grammar with a concrete syntax and a corresponding processing architecture (3.2). Query statements are processed according to the architecture described in subsection 3.2, yielding instances of data model classes from data sourced from the APIs of local blockchain nodes.
3.1 Integrated Data Model
The design of the language is predicated on a data model that integrates the principal data structures and attributes of the OPB discussed in Section 2.2. Building on prior work and existing tools addressed in Section 2.2, classes and attributes of the five OPB have been identified, generalized, and incorporated into a unified data model. Figure 1 presents the comprehensive data model as a UML class diagram. Table 2 enumerates the main model classes, categorized into four packages that represent the chain, block, account, and transaction concepts of the OPB. The concrete syntax for formulating queries is introduced in subsection 3.2. Statements are expressed in terms of the data model, specifying the source data by class and attribute names.
The concepts of the OPB are shown in the table and data model, encapsulated by the classes of the following packages. Classes of the chain package embody one main network and blockchain for Bitcoin, Ethereum, Cardano, and Solana, as represented by the classes Chain, Network, and ChainDescriptor of the data model. Additional test networks with their distinct blockchains, such as Ropsten and Görli in Ethereum, are represented by the Network and ChainDescriptor classes. In Avalanche, the Network class encompasses one primary network, the first of potentially numerous ’subnets’, with separate ChainDescriptor instances for the three P/X/C blockchains.
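The chain package described above can be illustrated with a minimal Python sketch. Only the class names Chain, Network, and ChainDescriptor come from the data model; the attribute names (name, is_main, chains, networks, chain_id), the identifier strings, and the helper chain_ids are assumptions made for illustration and are not taken from Figure 1.

```python
from dataclasses import dataclass, field


@dataclass
class ChainDescriptor:
    chain_id: str  # hypothetical attribute identifying one blockchain


@dataclass
class Network:
    name: str       # hypothetical attribute
    is_main: bool   # hypothetical flag: main network vs. test network
    chains: list = field(default_factory=list)  # ChainDescriptor objects


@dataclass
class Chain:
    name: str
    networks: list = field(default_factory=list)  # Network objects


# Ethereum: one main network plus a test network, each with its own blockchain.
ethereum = Chain("Ethereum", [
    Network("main", True, [ChainDescriptor("ethereum-main")]),
    Network("Ropsten", False, [ChainDescriptor("ethereum-ropsten")]),
])

# Avalanche: one primary network with three blockchains (P, X, C).
avalanche = Chain("Avalanche", [
    Network("primary", True,
            [ChainDescriptor("P"), ChainDescriptor("X"), ChainDescriptor("C")]),
])


def chain_ids(chain):
    """Collect the ids of all blockchains across all networks of a chain."""
    return [d.chain_id for n in chain.networks for d in n.chains]
```

Under these assumptions, a test network is simply another Network instance with its own ChainDescriptor, and Avalanche differs only in holding several ChainDescriptor instances within one Network.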
The Block and BlockDescriptor classes represent blocks, with discrete classes Status for the block’s status, ValidationDescriptor for validation via the consensus protocol, and ValidatorDescriptor for the involved validators. Conceptually, blocks across all blockchains are identified by a hash value, supplemented with metadata like timestamps and a height value denoting the block number, assuming no changes to non-final blocks. For instance, in Bitcoin and other blockchains following the original “chain of blocks” concept by Nakamoto [22], a block is linked to its predecessor by a hash value, which is used for validation. This is represented by a Block object with (a) a reference to a BlockDescriptor object, e.g. containing metadata such as the timestamp, (b) a reference to the previous BlockDescriptor object in the linkedBlockDescriptor attribute, and (c) a reference to a ValidationDescriptor object containing the hash value. Regarding non-final blocks in Bitcoin, multiple blocks might be discovered as successors to a given block; however, only one block gets included in the chain, while others are dismissed with an ’orphan’ status. In contrast, Ethereum handles similar cases by retaining one block in the main chain while preserving other blocks at the same level with an ’ommer’ status. Blocks in Proof-of-Work chains are not explicitly finalized, permitting the assignment of ’orphan’ or ’ommer’ status to blocks found in parallel to preceding blocks of the chain. Nonetheless, the likelihood of existing blocks being superseded in this manner diminishes over time, as multiple consecutive parallel blocks with greater cumulative work are required. Explicit block finalization, forestalling the emergence of multiple successors, can be observed in more recent Proof-of-Stake blockchains such as Solana.
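To make the references (a) to (c) more concrete, the following Python sketch models a short backward-linked chain with one dismissed parallel block. The class names and the attributes linkedBlockDescriptor and hashValue follow the text; the remaining attribute names, the string values used for the status, the placeholder hashes and timestamps, and the helper main_chain are assumptions for illustration. The Status class is simplified to a plain string here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BlockDescriptor:
    height: int      # block number (attribute name assumed)
    timestamp: int   # metadata (attribute name assumed)


@dataclass(frozen=True)
class ValidationDescriptor:
    hashValue: str   # hash identifying the block


@dataclass
class Block:
    descriptor: BlockDescriptor                              # reference (a)
    validation: ValidationDescriptor                         # reference (c)
    linkedBlockDescriptor: Optional[BlockDescriptor] = None  # reference (b)
    status: str = "main"  # assumed values: "main", "orphan", "ommer"


# Placeholder values, not real chain data.
genesis = Block(BlockDescriptor(0, 1000), ValidationDescriptor("hash-0"))
block_1 = Block(BlockDescriptor(1, 1600), ValidationDescriptor("hash-1"),
                genesis.descriptor)
# A second successor of the same predecessor, dismissed as an orphan.
block_1_alt = Block(BlockDescriptor(1, 1610), ValidationDescriptor("hash-1b"),
                    genesis.descriptor, status="orphan")


def main_chain(blocks):
    """Return the blocks retained in the chain, ordered by height."""
    kept = [b for b in blocks if b.status == "main"]
    return sorted(kept, key=lambda b: b.descriptor.height)
```

Both block_1 and block_1_alt link to the same predecessor descriptor; only the status distinguishes the block that remains in the chain from the one that is dismissed.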
Concerning the data structure, blocks are connected to one or more existing blocks via the linkedBlockDescriptor attribute of the Block class. This connection can establish either a series of backward-linked blocks, as mentioned for Bitcoin, or a graph structure, such as a Directed Acyclic Graph (DAG) in the Avalanche C chain. The linkedBlockDescriptor relates to the preceding block or, for DAG structures, to any number of previous vertices linked through directed edges. DAG blockchains are indicated by the dagSupport attribute in the BlockDescriptor class, which is set to ’true’ accordingly.
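The difference between a linear chain and a DAG can be sketched by letting linkedBlockDescriptor hold any number of predecessors. In the Python sketch below, the names Block, BlockDescriptor, linkedBlockDescriptor, and dagSupport follow the text, while block_id, the dictionary-based lookup, and the function ancestors are assumptions introduced for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class BlockDescriptor:
    block_id: str             # hypothetical identifier
    dagSupport: bool = False  # True for DAG-based blockchains


@dataclass
class Block:
    descriptor: BlockDescriptor
    # One predecessor for a linear chain, any number for a DAG.
    linkedBlockDescriptor: list = field(default_factory=list)


def ancestors(block_id, blocks):
    """Return the sorted ids of all blocks reachable via backward links."""
    seen = set()
    stack = [block_id]
    while stack:
        current = stack.pop()
        for linked in blocks[current].linkedBlockDescriptor:
            if linked.block_id not in seen:
                seen.add(linked.block_id)
                stack.append(linked.block_id)
    return sorted(seen)


# A small DAG: d links to b and c, which both link to a.
a = Block(BlockDescriptor("a", dagSupport=True))
b = Block(BlockDescriptor("b", dagSupport=True), [a.descriptor])
c = Block(BlockDescriptor("c", dagSupport=True), [a.descriptor])
d = Block(BlockDescriptor("d", dagSupport=True), [b.descriptor, c.descriptor])
dag = {blk.descriptor.block_id: blk for blk in (a, b, c, d)}
```

With a single entry in linkedBlockDescriptor per block, the same traversal walks a Bitcoin-style chain back to its first block, so one representation covers both structures.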
The representation of blocks further depends on the consensus type. To unify the representation for Proof-of-Work and Proof-of-Stake, the ValidationDescriptor class contains generic attributes for storing a hash value, the condition for validation, such as the target parameter in Bitcoin, and the input, e.g. the nonce in Bitcoin, as well as attributes for Proof-of-Stake validation. In Proof-of-Stake, blocks are proposed, created, and verified by attestation involving one or more validators. For example, a Block in Ethereum is proposed and created by a validator represented as a ValidationDescriptor object and is subsequently verified by attestations. Attestations follow from multiple committees of validators that are represented through the attestationCommittee attribute in ValidationDescriptor, e.g. with multiple addresses and votes. Blocks either contain transactions directly or are grouped into time-based slots and epochs for Proof-of-Stake validation purposes. When a block is appended, each block or slot undergoes validation, which requires the involvement of validators. As per the ValidationDescriptor class, the creator of a Bitcoin or Ethereum block validates a linked block using the hashValue attribute. Conversely, for other Proof-of-Stake blockchains, block proposers are recorded in the corresponding attributes with attestations, which refer to the ValidatorDescriptor class. Each instance refers to any number of assigned validators who perform attestations of blocks through the committee mentioned above, with votes and signatures. In this way, the concepts for multiple groups of validators are represented for Ethereum and other Proof-of-Stake blockchains.
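A rough Python sketch may help to show how one class can carry both Proof-of-Work and Proof-of-Stake attributes. ValidationDescriptor, ValidatorDescriptor, hashValue, and attestationCommittee are names from the text; target, nonce, proposer, address, vote, and the three methods are assumptions chosen for illustration, and the example values are placeholders.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ValidatorDescriptor:
    address: str  # hypothetical attribute
    vote: bool    # hypothetical attribute: attestation in favor or not


@dataclass
class ValidationDescriptor:
    hashValue: str                    # hash value as a hexadecimal string
    target: Optional[int] = None      # Proof-of-Work condition (assumed name)
    nonce: Optional[int] = None       # Proof-of-Work input (assumed name)
    proposer: Optional[str] = None    # Proof-of-Stake proposer (assumed name)
    attestationCommittee: list = field(default_factory=list)

    def consensus_type(self):
        """Infer the consensus type from the populated attributes."""
        return "proof-of-work" if self.target is not None else "proof-of-stake"

    def meets_target(self):
        """Proof-of-Work check: the hash must not exceed the target."""
        return self.target is not None and int(self.hashValue, 16) <= self.target

    def votes_in_favor(self):
        """Count the attesting validators that voted in favor."""
        return sum(1 for v in self.attestationCommittee if v.vote)


pow_validation = ValidationDescriptor("00ff", target=0x0FFF, nonce=42)
pos_validation = ValidationDescriptor(
    "ab12",
    proposer="validator-1",
    attestationCommittee=[
        ValidatorDescriptor("validator-2", True),
        ValidatorDescriptor("validator-3", True),
        ValidatorDescriptor("validator-4", False),
    ],
)
```

The Proof-of-Work attributes stay empty for a Proof-of-Stake block and vice versa, which is one possible reading of the generic attributes described above.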
Accounts, a concept prevalent in Ethereum, Solana, and Avalanche, are embedded in blocks to store assets, tokens, or data that are used for smart contracts. For a generic representation of accounts, the data model represents each Account object with an AccountDescriptor object containing the address and an indication whether the account represents a smart contract or an externally owned account of an individual. Concerning account-related data such as assets or tokens, it is important to note that data might represent assets or tokens natively, as seen in Cardano or Solana, or indirectly through data stored within an account. Each account is defined by an ID, with the concept of an address being common to all blockchains. Account storage of assets or tokens can refer to any custom asset or token represented by data in general. For tokens, token standards such as Ethereum’s ERC-20 or ERC-1155 are represented by the Token class’s attributes. Data storage utilizes binary large objects or key-value stores, which are employed in hash-based mapping data structures.
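The account concepts can be sketched in Python as follows. Account, AccountDescriptor, and Token are class names from the data model; all attribute names, the placeholder addresses, and the helper contract_addresses are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class AccountDescriptor:
    address: str       # address common to all blockchains
    is_contract: bool  # assumed flag: smart contract vs. externally owned


@dataclass
class Token:
    standard: str  # e.g. "ERC-20" or "ERC-1155" (attribute name assumed)
    symbol: str    # hypothetical attribute
    amount: int    # hypothetical attribute


@dataclass
class Account:
    descriptor: AccountDescriptor
    tokens: list = field(default_factory=list)   # Token objects
    storage: dict = field(default_factory=dict)  # key-value data storage


# Placeholder addresses and values, not real chain data.
owner = Account(AccountDescriptor("0xaaa", False),
                tokens=[Token("ERC-20", "TKN", 100)])
contract = Account(AccountDescriptor("0xbbb", True),
                   storage={"totalSupply": 1000})


def contract_addresses(accounts):
    """Return the addresses of all accounts that represent smart contracts."""
    return [a.descriptor.address for a in accounts if a.descriptor.is_contract]
```

In this sketch, token holdings and arbitrary key-value data are kept apart, mirroring the distinction between token standards and general data storage.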
The concepts of transactions in Bitcoin and Cardano are distinctive due to these blockchains’ lack of account structures. Consequently, transactions hold references to unspent transaction outputs (UTXOs) from previous transactions. In this model, a UTXO is included alongside the transferred value and a script that outlines locking conditions or holds data. While data inclusion is implied in Bitcoin, Cardano explicitly accommodates data in transactions and its storage associated with an address for smart contract functionality.
On the other hand, in the case of Ethereum, Solana, and the Avalanche C chain, transactions are stored for the transfer of values, data, assets, or tokens between accounts. In the Avalanche X chain, the transfer of native assets is facilitated through the UTXO model. In the data model, the attributes of Transaction and TransactionDescriptor accommodate transfers between addresses by employing the attributes corresponding to the aforementioned concepts.
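The two transaction concepts, UTXO-based and account-based, can be contrasted in a small Python sketch. Transaction and TransactionDescriptor are class names from the data model; the Utxo class, all attribute names, the placeholder values, and the function transferred_value are hypothetical and serve only to illustrate how one class could accommodate both concepts.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Utxo:
    tx_id: str   # transaction that created the output (assumed name)
    index: int   # position of the output in that transaction (assumed name)
    value: int   # transferred value
    script: str  # locking conditions or data


@dataclass
class TransactionDescriptor:
    tx_id: str
    from_address: Optional[str] = None  # used for account-based transfers
    to_address: Optional[str] = None
    value: int = 0


@dataclass
class Transaction:
    descriptor: TransactionDescriptor
    inputs: list = field(default_factory=list)   # Utxo objects being spent
    outputs: list = field(default_factory=list)  # Utxo objects being created


def transferred_value(tx):
    """Sum the output values for UTXO transactions, else use the descriptor."""
    if tx.outputs:
        return sum(o.value for o in tx.outputs)
    return tx.descriptor.value


# UTXO-style transaction: spends one output of 50 and creates two outputs.
utxo_tx = Transaction(
    TransactionDescriptor("tx-2"),
    inputs=[Utxo("tx-1", 0, 50, "lock-a")],
    outputs=[Utxo("tx-2", 0, 30, "lock-b"), Utxo("tx-2", 1, 19, "lock-a")],
)

# Account-style transaction: a direct transfer between two addresses.
account_tx = Transaction(TransactionDescriptor("tx-3", "0xaaa", "0xbbb", 7))
```

In the UTXO example, the difference between the spent input (50) and the created outputs (49) would correspond to a fee, which is not modeled further here.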
Author:
(1) Felix Härer (ORCID: 0000-0002-2768-2342), Digitalization and Information Systems Group, University of Fribourg, Switzerland ([email protected]).