For an Algorithm f, two mutually untrusted participants, Alice and Bob, can establish trust in the following way:
For the above 2, 3, and 4, let x be Layer2 transactions and initial state, f be Layer2 Consensus program, y be the final state of the transaction, which corresponds to the Layer2 scaling solution of the blockchain.
Table 1: Methods to Build Trust
Also, it is important to note:
Currently, thanks to the Turing Complete Smart Contract of Solidity, fraud proof and validity proof technologies have been widely used in the Layer2 expansion of Ethereum. However, within the BTC framework, due to the limited functionality of BTC’s opcodes, a stack element count of only 1000, and other restrictions, the application of these technologies is still in the exploratory stage. This article summarizes the limitations and breakthrough technologies under the BTC framework, studies the validity proof and fraud proof technologies, and outlines the unique script segmentation technology under the BTC framework.
There are many limitations within the framework of Bitcoin, but these limitations can be overcome through various clever methods or technologies. For example, the use of Bit commitments can break through the statelessness limitation of UTXO, Taproot can solve the limitation of script space, connector outputs can overcome the limitation of UTXO spending methods, and Smart Contracts can overcome the limitation of pre-signing.
Bitcoin adopts the UTXO model, where each UTXO is locked in a locking script that defines the conditions required to spend the UTXO. There are the following limitations to BTC scripts:
Currently, the BTC script is completely stateless. When executing the BTC script, the execution environment of each script will be reset upon completion. This makes it impossible for the BTC script to natively support scripts 1 and 2 sharing the same x value. However, this problem can be solved through some methods, the core idea is to sign a value in some way. If a signature can be generated for a value, it is possible to achieve a stateful BTC script. In other words, by checking the signatures of the x values in scripts 1 and 2, it can be ensured that the same x value is used in these two scripts. If there are conflicting signatures, that is, two different values are signed for the same variable x, penalties can be imposed. This method is called Bit commitment.
The principle of Bit commitment is relatively simple. Each Bit corresponds to two different hash values, hash0 and hash1. If the Bit value to be signed is 0, the pre-image of hash0 is revealed; if the Bit value is 1, the pre-image of hash1 is revealed.
Taking a single Bit message m ∈ {0,1} as an example, the corresponding Bit commitment unlocking script is composed of some preimages: if the Bit value is 0, the unlocking script is preimage0 - ‘0xfa7fa5b1dea37d71a0b841967f6a3b119dbea140’; if the Bit value is 1, the unlocking script is preimage1 - ‘0x47c31e611a3bd2f3a7a42207613046703fa27496’. Therefore, through Bit commitment, the stateless limitation of UTXO can be overcome, enabling stateful BTC scripts and making various interesting new features possible.
OP_HASH160
OP_DUP
0xf592e757267b7f307324f1e78b34472f8b6f46f3> // This is hash1
OP_EQUAL
OP_DUP
OP_ROT
0x100b9f19ebd537fdc371fa1367d7ccc802dc2524 > // This is hash0
OP_EQUAL
OP_BOOLOR
OP_VERIFY
// The value promised by Bit is on the stack. It can be “0” or “1”.
Currently, Bit promises two implementation methods:
Currently, the BitVM2 library implements Winternitz signatures based on the Blake3 hash function. The length of a single Bit signature is approximately 26 bytes. Therefore, it can be seen that introducing state through Bit commitments has a cost. Therefore, in the implementation of BitVM2, the message is first hashed to obtain a 256-bit hash value, and then a Bit commitment is made to this hash value to save costs, instead of committing to each Bit of the original longer message directly.
The Taproot soft fork upgrade of Bitcoin (BTC) was activated in November 2021, including three proposals: Schnorr signatures (BIP 340), Taproot (BIP 341), and TapScript (BIP 342). This upgrade introduces a new transaction type - Pay-to-Taproot (P2TR) transactions. By combining the advantages of Taproot, MAST (Merkelized Abstract Syntax Trees), and Schnorr signatures, P2TR transactions can create more private, flexible, and scalable transaction formats.
P2TR supports two spending methods: spending via Secret Key path or script path. According to the Taproot (BIP 341) specification, when spending via the script path, the corresponding Merkle path length cannot exceed 128. This means that the number of tapleaves in the taptree cannot exceed 2^128. Since the SegWit upgrade in 2017, the BTC network measures block size in weight units, with a maximum support of 4 million weight units (about 4MB). When a P2TR output is spent via the script path, only a single tapleaf script needs to be revealed, which means that the block contains only one tapleaf script. Therefore, for P2TR transactions, the maximum script size for each tapleaf can reach about 4MB. However, according to BTC’s default policy, many nodes only forward transactions smaller than 400K; larger transactions require cooperation with miners for packaging.
The script space premium that Taproot brings makes it even more valuable to simulate cryptographic operations such as multiplication and hashing using existing opcodes. When building verifiable computations on top of P2TR, the script size is no longer limited by the 4MB limit, but can be split into multiple sub-functions distributed across multiple tapleaves, thus breaking the 4MB script space limit. In fact, the Groth16 validator algorithm currently implemented in BitVM2 is up to 2GB in size. However, it can be split and distributed across approximately 1000 tapleaves, and by combining with the Bit commitment, constrains the consistency between the inputs and outputs of each sub-function, thus ensuring the integrity and correctness of the entire computation.
Currently, BTC provides two native spending methods for a single Unspent Transaction Output (UTXO): spending through script or Public Key. Therefore, as long as the correct Public Key signature or script conditions are met, the UTXO can be spent. Two UTXOs can be spent independently, and restrictions cannot be added to constrain these two UTXOs, which means additional conditions must be met to spend them.
However, the founder of ARK protocol, Burak, cleverly utilized the SIGHASH flag to achieve connector output. Specifically, Alice can create a signature to send her BTC to Bob. Since the signature can commit to multiple inputs, Alice can set her signature as a condition: it is only valid when the second input is spent in the Taketx transaction. The second input is called the connector, which connects UTXO A and UTXO B. In other words, the Taketx transaction is only valid when UTXO B is not spent by the Challengetx. Therefore, by destroying the connector output, the validity of the Taketx transaction can be prevented.
Figure 1: Connector output schematic diagram
In the BitVM2 protocol, the connector output acts as an if…else function. Once the connector output is spent by a transaction, it cannot be spent again by other transactions, ensuring its exclusivity. In practical deployment, additional UTXOs with time locks are introduced to allow for challenge-response cycles. In addition, the connector output can be set with different spending strategies as needed, such as allowing either party to spend in the case of a challenge transaction, and only allowing the operator to spend in the case of a response transaction or allowing anyone to spend after a timeout.
Currently, BTC scripts primarily restrict unlocking conditions, but do not restrict how unspent transaction outputs (UTXOs) are further spent. This is because BTC scripts cannot read the content of the transaction itself and cannot achieve introspection of the transaction. If BTC scripts could examine any content of the transaction, including outputs, then contract functionality could be achieved.
The current contract implementation can be divided into two categories:
validity proof and fraud proof can both be used for BTC’s Layer2 expansion, with the main differences seen in Table 2.
Table 2: validity proof and fraud proof
BTC-based fraud proofs can be built based on Bit commitments, Taproot, pre-signed, and connector outputs. At the same time, by introducing contract opcodes (such as OP_CAT), it is also possible to build BTC validity proofs based on Taproot. In addition, fraud proof can also be divided into licensed fraud proof and permissionless fraud proof according to whether Bob has a get on board system. In permissioned fraud proof, only a specific group of people can challenge as Bob, whereas in permissionless fraud proof, any third party can challenge as Bob. Permissionless fraud proof is more secure than permissioned fraud proof because it drops the risk of collusion between participants.
Based on the challenge-response interaction count between Alice and Bob, fraud proof can be divided into single-round fraud proof and multi-round fraud proof, as shown in Figure 2.
Figure 2: Single-round fraud proof vs. multi-round fraud proof
As shown in Table 3, fraud proof can be implemented through different interaction models, including single-round interaction model and multi-round interaction model.
Table 3: Single-turn Interaction and Multi-turn Interaction
In the Layer 2 expansion paradigm of BTC, BitVM1 adopts a multi-round fraud proof mechanism, while BitVM2 adopts a single-round fraud proof mechanism. Bitcoin-circle stark utilizes validity proof. Among these mechanisms, BitVM1 and BitVM2 can be implemented without modifying the BTC protocol, while bitcoin-circle stark requires the introduction of a new Operation Code OP_CAT.
For most computational tasks, BTC’s signet, testnet, and mainnet cannot fully support 4MB scripts, so script splitting technology is needed, that is, splitting the complete computational script into blocks smaller than 4MB and distributing them in different Tapleafs.
As shown in Table 3, multi-round fraud proof is suitable for those who want to reduce on-chain arbitration calculations, or cannot locate the problematic calculation segment in one step. As the name suggests, multi-round fraud proof requires multiple rounds of interaction between the prover and validators to identify the problematic calculation segment, and then arbitrate based on these segments.
Robin Linus’s early BitVM White Paper (usually referred to as BitVM1) adopts multiple rounds of fraud proof. Assuming each challenge period lasts for one week and uses binary search to locate the problematic computation segment, the on-chain challenge response period of the Groth16 verifier may be extended to 30 weeks, resulting in poor timeliness. Although there are teams currently researching more efficient n-ary search methods than binary search, their timeliness is still significantly lower than the 2-week cycle of single-round fraud proof.
Currently, multi-round fraud proof in BTC uses permissioned challenges, while single-round fraud proof implements a permissionless challenge method, thereby dropping the risk of collusion between participants and enhancing security. To achieve this, Robin Linus fully utilized the advantages of Taproot to optimize BitVM1, reducing the number of interaction rounds to one and expanding the challenge method to a permissionless manner, despite increasing the cost of on-chain arbitration calculation.
In this model, the verification of fraud proof only requires one interaction between the prover and validators. Validators only need to initiate one challenge, while the prover only needs to respond once. In the response, the prover must provide evidence to prove that their calculation is correct. If the validators find inconsistencies in the evidence, the challenge is successful; otherwise, the challenge fails. The characteristics of single-round interaction fraud proof are shown in Table 3.
Figure 3: Single Round Fraud Proof
On August 15, 2024, Robin Linus released the technical White Paper ‘BitVM2: Connecting BTC to Layer 2’, which implements a cross-chain bridge using a single-round fraud proof method similar to that shown in Figure 3.
OP_CAT is part of the BTC scripting language when it was released, but it was disabled in 2010 due to security vulnerabilities. Nevertheless, the BTC community has been discussing the possibility of re-enabling OP_CAT for years. Currently, OP_CAT has been assigned the number 347 and has been enabled on BTC’s signet network.
The main function of OP_CAT is to merge the two elements in the stack and push the result back to the stack. This function provides new possibilities for contracts on BTC and STARK verifiers:
Despite the much lower computational load required for proof of validation after SNARK/STARK proof, the script size required for implementing the validator Algorithm in BTC script is still huge. Currently, based on existing BTC opcodes, the optimized sizes of Groth16 and Fflonk validator scripts both exceed 2GB, while the size of a single BTC Block is only 4MB, so it is impossible to run the entire validator script within a single Block. However, since the Taproot upgrade, BTC supports script execution through tapleaf, allowing the validator script to be split into multiple blocks, each serving as a tapleaf to construct the taptree. Consistency between blocks can be ensured through bit commitment.
Under the OP_CAT contract, the STARK verifier can be split into multiple standard transactions that are less than 400KB, so that the entire STARK validity proof verification can be completed without the need to cooperate with the Miner.
This section will focus on the splitting technique of BTC scripts under existing conditions, without introducing or activating any new opcodes.
When performing script splitting, it is necessary to balance the information of the following aspects:
Currently, the methods of script splitting can be divided into the following three categories:
For example, after multiple rounds of optimization, the script size of Groth16 verifier has dropped from about 7GB to approximately 1.26GB. In addition to overall computational optimization, each block can also be individually optimized to maximize stack space utilization. For instance, by introducing more efficient lookup table algorithms and dynamically loading and unloading lookup tables, the script size of each block can be further reduced.
Due to the significant differences in computing costs and runtime environments between web2 programming languages and BTC scripts, it is not feasible to simply convert existing algorithm implementations into BTC scripts. Therefore, specific optimizations for BTC scenarios need to be considered:
This article first discusses the limitations of BTC scripts and discusses several solutions: using BTC commitments to overcome the stateless restrictions of UTXO, using Taproot to break the limitations of script space, bypassing the limitations of UTXO spending methods by connecting outputs, and using contracts to solve the limitations of pre-signed transactions. Next, the article provides a comprehensive overview of the characteristics of fraud proof and validity proof, including the characteristics of licensed and unlicensed fraud proof, the difference between single-round and multi-round fraud proof, and the relevant content of BTC script splitting technology.