Kay Kurokawa on Cryptocurrencies: Scripting in Bitcoin: Part 2

Scripting In Bitcoin: Part 2

This is a work in progress, and the contents of these articles may be edited for corrections and clarifications.

Part 1

1.Introduction

2.Basics of the Scripting Language.

3.Basics of Bitcoin Transactions

Part 2

4.Pay to Pub Key Hash

5.Multi-signature Transactions

6.Pay To Script Hash

4.Pay to Pub Key Hash

Previously, we covered the basics of transactions. We learned that a transaction input describes where the Bitcoins come from and proves that those Bitcoins can be redeemed. We also learned that a transaction output describes where the Bitcoins will be spent. The mechanism that makes this possible is the scripting language. In a transaction, both the input and the output contain a script. The input holds one half of the script, and the output holds the other half. When a Bitcoin node receives a transaction, it combines the input script in that transaction with the output script of the previous transaction that the input refers to.

Figure B

   Referring to Figure B above, the input script is actually the part that “contains Bob's signature” in Input 2. The output script is the part that “contains Bob's public key” in Output 1 and also “contains Chris's public key” in Output 2. A Bitcoin node combines the input script in Input 2 with the output script from Output 1 and executes the result to evaluate whether the transaction is valid. A transaction is valid if and only if the topmost item on the data stack is “True” when execution is complete.

   The first script we will cover is the one used in Bitcoin's standard transaction. This is the default type of transaction that occurs whenever you send Bitcoins to someone through any of the popular wallets like Bitcoin-Qt or Multibit, and it is therefore also the most common. It utilizes what developers call “Pay to Pubkey Hash”. Below are the contents of the input and output scripts of a standard transaction.

Input Script (ScriptSig):
   <signature><pubKey>

Output Script (ScriptPubKey):
   OP_DUP OP_HASH160 <pubKeyHash> OP_EQUALVERIFY OP_CHECKSIG

   The input script is known as the ScriptSig because it is the script containing the signature. <pubKey> is the unhashed public key and <signature> is an ECDSA signature produced with the corresponding private key. The output script is known as the ScriptPubKey because it is the script containing the public key (here, in hashed form) to which the Bitcoins are being sent. <pubKeyHash> is the hash (SHA256 and then RIPEMD-160) of the public key, which is the value that a Bitcoin address encodes (if this is confusing to you, please read the Bitcoin wiki on Bitcoin Addresses and Elliptic Curve Digital Signature Algorithm before proceeding).
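
To make the link between <pubKeyHash> and a Bitcoin address more concrete, here is a minimal Python sketch of Base58Check encoding, the encoding used for classic Bitcoin addresses. It assumes the 20-byte hash is already available. The all-zero hash below is only a placeholder, not a real key, and the function name base58check_encode is chosen here for illustration.

```python
import hashlib

BASE58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def base58check_encode(version: bytes, payload: bytes) -> str:
    """Encode version + payload + 4-byte checksum in Base58."""
    data = version + payload
    # The checksum is the first 4 bytes of a double SHA256 of the data.
    checksum = hashlib.sha256(hashlib.sha256(data).digest()).digest()[:4]
    raw = data + checksum

    # Convert the bytes to one big integer and repeatedly divide by 58.
    number = int.from_bytes(raw, "big")
    encoded = ""
    while number > 0:
        number, remainder = divmod(number, 58)
        encoded = BASE58_ALPHABET[remainder] + encoded

    # Each leading zero byte is written as the character "1".
    leading_zeros = len(raw) - len(raw.lstrip(b"\x00"))
    return "1" * leading_zeros + encoded


# Placeholder 20-byte hash (all zeros); a real one would come from OP_HASH160.
pub_key_hash = bytes(20)
address = base58check_encode(b"\x00", pub_key_hash)
print(address)
```

The version byte 0x00 marks a main-network “Pay to Pubkey Hash” address, which is why such addresses start with “1”. The same function with version byte 0x05 yields the “3...” addresses used for “Pay to Script Hash”.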

   Let's refer again to the scenario in Figure B where Bob sends a Bitcoin to Chris. When a Bitcoin node receives transaction 2 created by Bob, the node validates it by combining the output script from transaction 1 and the input script from transaction 2. The output script from transaction 1 contains Bob's Bitcoin address in <pubKeyHash>. The input script from transaction 2 contains Bob's public key in <pubKey> and Bob's signature in <signature>. Below, we walk through the entire script.

Step	Data Stack	Instruction Stack	Explanation
1.	Empty	<signature> <pubKey> OP_DUP OP_HASH160 <pubKey Hash> OP_EQUALVERIFY OP_CHECKSIG	The entire script is combined in the instruction stack. The output script is put on the bottom of the instruction stack and the input script is put on top.
2.	<pubKey> <signature>	OP_DUP OP_HASH160 <pubKey Hash> OP_EQUALVERIFY OP_CHECKSIG	<signature> and <pubKey> are both constants and are sequentially moved to the top of the data stack.
3.	<pubKey> <pubKey> <signature>	OP_HASH160 <pubKey Hash> OP_EQUALVERIFY OP_CHECKSIG	OP_DUP creates a duplicate of whatever is on the top of the data stack. In this case <pubKey> is duplicated.
4.	<pubKey Hash> <pubKey> <signature>	<pubKey Hash> OP_EQUALVERIFY OP_CHECKSIG	OP_HASH160 hashes (SHA256 and then RIPEMD-160) the public key on top of the data stack.
5.	<pubKey Hash> <pubKey Hash> <pubKey> <signature>	OP_EQUALVERIFY OP_CHECKSIG	<pubKey Hash> from the output script is moved to the data stack.
6.	<pubKey> <signature>	OP_CHECKSIG	OP_EQUALVERIFY takes the two topmost items off the data stack and verifies that they are equal. In this case we verified that <pubKey Hash> in the output script matches the hash of <pubKey> in the input script.
7.	True	Empty	OP_CHECKSIG checks that the <signature> belongs to the <pubKey>. If so, it puts “True” on the data stack.

Figure C

   We can see that the script has performed two crucial tasks to validate a transaction. First, in steps 4 through 6, it has verified that if <pubKey> in the input is hashed, it matches <pubKey Hash> in the output. Second, in step 7, it has verified that the <signature> is derived from the private key associated with <pubKey>.
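
The walkthrough in Figure C can be sketched as a tiny stack machine in Python. This is an illustration under loud assumptions, not how a real node is implemented: toy_sign and toy_checksig are hypothetical stand-ins for ECDSA (they are not secure), the public key is a placeholder, and hash160 falls back to a truncated SHA256 if the local Python build lacks RIPEMD-160. Note that the Python list keeps the top of the stack at the end, whereas the table writes the top item first.

```python
import hashlib


def hash160(data: bytes) -> bytes:
    """SHA256 followed by RIPEMD-160, as OP_HASH160 does."""
    sha = hashlib.sha256(data).digest()
    try:
        return hashlib.new("ripemd160", sha).digest()
    except ValueError:
        # Assumption: some OpenSSL builds ship without RIPEMD-160. Fall back
        # to a truncated SHA256 stand-in so the sketch still runs.
        return hashlib.sha256(sha).digest()[:20]


def toy_sign(pubkey: bytes) -> bytes:
    """Toy stand-in for an ECDSA signature. NOT secure, illustration only."""
    return hashlib.sha256(b"toy-signature:" + pubkey).digest()


def toy_checksig(signature: bytes, pubkey: bytes) -> bool:
    """Toy stand-in for the signature check done by OP_CHECKSIG."""
    return signature == toy_sign(pubkey)


def run_script(script) -> bool:
    """Run a combined input + output script; valid if True ends up on top."""
    stack = []
    for item in script:
        if isinstance(item, bytes):          # constants are pushed as-is
            stack.append(item)
        elif item == "OP_DUP":
            stack.append(stack[-1])
        elif item == "OP_HASH160":
            stack.append(hash160(stack.pop()))
        elif item == "OP_EQUALVERIFY":
            if stack.pop() != stack.pop():   # mismatch fails the script
                return False
        elif item == "OP_CHECKSIG":
            pubkey = stack.pop()
            signature = stack.pop()
            stack.append(toy_checksig(signature, pubkey))
        else:
            raise ValueError(f"unsupported instruction: {item!r}")
    return bool(stack) and stack[-1] is True


# Placeholder 65-byte public key for Bob (not a real key).
bob_pubkey = b"\x04" + b"\x11" * 64
script_sig = [toy_sign(bob_pubkey), bob_pubkey]
script_pubkey = ["OP_DUP", "OP_HASH160", hash160(bob_pubkey),
                 "OP_EQUALVERIFY", "OP_CHECKSIG"]
print(run_script(script_sig + script_pubkey))
```

Swapping in a different public key makes the script fail at OP_EQUALVERIFY, while keeping Bob's key but supplying a bad signature makes it fail at OP_CHECKSIG, which mirrors the two tasks described above.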

5.Multi-signature Transactions

   We now discuss how multi-signature transactions are possible using Bitcoin's scripting language. Unlike a standard transaction, which requires only one public/private key pair, a multi-signature transaction uses multiple public/private key pairs. This allows people to create addresses that are controlled by multiple people, each with their own unique key. An M of N multi-signature transaction is one where you need signatures from M of the N listed public keys in order to redeem a transaction output. There are many uses for multi-signature transactions, such as adding more security to Bitcoin storage and creating escrow accounts. A good overview of its uses can be found in this article by Vitalik Buterin in BitcoinMagazine.com.

   The basic concept of a Bitcoin transaction still applies to multi-signature transactions. The only things that change are the input script and the output script. Below we present what they look like for a general m of n multi-signature transaction.

Input Script (ScriptSig):
   <signature_1><signature_2>...<signature_m>

Output Script (ScriptPubKey):
   OP_m <pubKey_1><pubKey_2>...<pubKey_n> OP_n OP_CHECKMULTISIG

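So if we want a 2 of 3 multi-signature transaction, it would contain two signatures in the input script and three public keys in the output script. OP_m would be OP_2 and OP_n would be OP_3. (On the real Bitcoin network the input script also starts with one extra dummy value, usually OP_0, because the multi-signature opcode removes one more item from the stack than it needs. That detail is left out here for simplicity.) We walk through this example below.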

Step	Data Stack	Instruction Stack	Explanation
1.	Empty	<signature1> <signature2> OP_2 <pubKey1> <pubKey2> <pubKey3> OP_3 OP_CHECKMULTISIG	The entire script is combined in the instruction stack. The output script is put on the bottom of the instruction stack and the input script is put on top.
2.	3 <pubKey3> <pubKey2> <pubKey1> 2 <signature2> <signature1>	OP_CHECKMULTISIG	All signatures and public keys are pushed onto the data stack. OP_2 puts the number “2” and OP_3 puts the number “3” onto the data stack.
3.	True	Empty	OP_CHECKMULTISIG takes all the constants on the data stack as input. It checks whether the 2 signatures belong to 2 of the 3 public keys. If they do, it returns True.

Figure D
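
The multi-signature check in Figure D can be sketched in Python as follows. This is a simplified illustration with assumptions stated up front: toy_sign and toy_verify are hypothetical stand-ins for ECDSA (not secure), the keys are placeholders, and the sketch follows the simplified form in the figure, so it ignores the extra dummy element that real Bitcoin nodes also remove from the stack. It does reflect one rule of the real network: signatures must appear in the same order as the public keys they belong to.

```python
import hashlib


def toy_sign(pubkey: bytes) -> bytes:
    """Toy stand-in for an ECDSA signature. NOT secure, illustration only."""
    return hashlib.sha256(b"toy-signature:" + pubkey).digest()


def toy_verify(signature: bytes, pubkey: bytes) -> bool:
    """Toy stand-in for checking one signature against one public key."""
    return signature == toy_sign(pubkey)


def match_signatures(signatures, pubkeys, verify) -> bool:
    """Every signature must match a key, scanning the keys in order."""
    key_index = 0
    for signature in signatures:
        # Skip keys until one matches this signature.
        while key_index < len(pubkeys) and not verify(signature, pubkeys[key_index]):
            key_index += 1
        if key_index == len(pubkeys):
            return False      # ran out of keys for this signature
        key_index += 1        # a key can be used only once
    return True


def op_checkmultisig(stack, verify) -> None:
    """Pop n, the n keys, m, and the m signatures; push True or False."""
    n = stack.pop()
    pubkeys = [stack.pop() for _ in range(n)][::-1]
    m = stack.pop()
    signatures = [stack.pop() for _ in range(m)][::-1]
    stack.append(match_signatures(signatures, pubkeys, verify))


# Placeholder 65-byte public keys (not real keys).
keys = [b"\x04" + bytes([i]) * 64 for i in (1, 2, 3)]

# Data stack as in step 2 of the figure; the end of the list is the top.
stack = [toy_sign(keys[0]), toy_sign(keys[2]), 2, *keys, 3]
op_checkmultisig(stack, toy_verify)
print(stack)
```

Because the keys are scanned only once, the check stays cheap: each signature is compared against the remaining keys and never re-tested against earlier ones.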

6.Pay To Script Hash

   We've now covered standard Bitcoin transactions, also known as “Pay to Pub Key Hash”, and multi-signature transactions. As you saw in both cases, the creator (or sender) of the transaction defines how the Bitcoins can be spent by defining the script in the output. The receiver of the transaction merely provides a signature in the input script. Bitcoin developers realized that this could be rather limiting when implementing more advanced financial transactions in the block chain.

   Let's look at the multi-signature transaction we covered previously for an example. Let's say company A has set up a multi-signature address consisting of 16 private/public key pairs. If a customer wants to send Bitcoins to that address, the customer needs to specify all 16 public keys in the output script. This is far more cumbersome than a standard transaction, where the transaction sender only needs to know 1 Bitcoin address. In addition, the customer needs to pay a very high transaction fee for sending to company A's multi-signature address. This is because the transaction fee in Bitcoin is determined by the size (as in how many bytes it takes up) of the transaction, and a transaction with an output script containing 16 public keys is rather large (since each one takes up 65 bytes, that's 1040 bytes total).
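
The size difference can be checked with a few lines of Python arithmetic. The sketch below assumes 65-byte (uncompressed) public keys as in the text, one length byte in front of every pushed constant, and one byte per opcode. The function names are made up for this example.

```python
PUBKEY_BYTES = 65    # uncompressed public key, as in the text
HASH160_BYTES = 20   # output size of RIPEMD-160


def bare_multisig_script_size(n_keys: int) -> int:
    """Bytes in: OP_m <pubKey_1>...<pubKey_n> OP_n OP_CHECKMULTISIG."""
    # OP_m + n * (1-byte push length + key) + OP_n + the multisig opcode
    return 1 + n_keys * (1 + PUBKEY_BYTES) + 1 + 1


def p2sh_script_size() -> int:
    """Bytes in: OP_HASH160 <scriptHash> OP_EQUAL."""
    # OP_HASH160 + 1-byte push length + 20-byte hash + OP_EQUAL
    return 1 + 1 + HASH160_BYTES + 1


print("public keys alone:", 16 * PUBKEY_BYTES)
print("full 16-key output script:", bare_multisig_script_size(16))
print("pay-to-script-hash output script:", p2sh_script_size())
```

Under these assumptions the sender's output script shrinks from over a thousand bytes to a couple of dozen, and it stays the same size no matter how many keys the receiver uses.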

   The Bitcoin developers needed a solution where the transaction creator does not need to know the full details of how the transaction receiver will redeem his coins. The solution is “Pay to Script Hash”, as specified in Bitcoin Improvement Proposal 16 (BIP 16), and it was implemented in the Bitcoin core source code in March 2012. The goal of “Pay to Script Hash”, as you can tell from its name, is to allow the transaction creator to send Bitcoins to the hash of a script. Essentially, the hash of a complex Bitcoin script can be used as an address that the transaction creator sends Bitcoins to.

The mechanism that makes this possible is to allow for a script within a script. When using “Pay to Script Hash”, the transaction creator uses the output script below. The output script contains <scriptHash> which is a hash (SHA256 and then RIPEMD-160) of a script.

Output Script:
   OP_HASH160 <scriptHash> OP_EQUAL

   The transaction receiver can then redeem the output by using the input script below. The input script contains one or more signatures in <signatures...>. There is also a script within the input script in <serialized script> which is initially treated as a constant.

Input Script:
   <signatures...><serialized script>

So for example, for a 1 of 2 multi-signature transaction, the input script would look like the one below. We use this example to walk through the full "Pay To Script Hash" process in Figure E.

Input Script:
   <signatures...> = <signature_1>
   <serialized script> = OP_1 <pubKey_1><pubKey_2> OP_2 OP_CHECKMULTISIG

Step	Data Stack	Instruction Stack	Explanation
1.	Empty	<signature_1> <serialized script> OP_HASH160 <script hash> OP_EQUAL	The entire script is combined in the instruction stack. The pay to script hash method is recognized. <serialized script> contains OP_1 <pubKey_1> <pubKey_2> OP_2 OP_CHECKMULTISIG but is initially treated as a constant.
2.	<serialized script> <signature_1>	OP_HASH160 <script hash> OP_EQUAL	Constants are pushed onto the data stack sequentially.
3.	<script hash> <serialized script> <signature_1>	<script hash> OP_EQUAL	OP_HASH160 hashes (SHA256 and then RIPEMD-160) a copy of the serialized script and puts the result on the data stack.
4.	<script hash> <script hash> <serialized script> <signature_1>	OP_EQUAL	<script hash> from the output script is pushed onto the data stack.
5.	True <serialized script> <signature_1>	Empty	OP_EQUAL compares the two <script hash> values and verifies that they are equal. We have now passed the first step of validation.
6.	Empty	<signature_1> OP_1 <pubKey_1> <pubKey_2> OP_2 OP_CHECKMULTISIG	Because step 1 recognized this as a “Pay to Script Hash” transaction, the True from step 5 is discarded, and <serialized script> and <signature_1> are moved off the data stack and back onto the instruction stack. The serialized script is de-serialized.
7.	2 <pubKey_2> <pubKey_1> 1 <signature_1>	OP_CHECKMULTISIG	Constants are moved onto the data stack.
8.	True	Empty	OP_CHECKMULTISIG checks whether the 1 signature belongs to 1 of the 2 public keys. If it does, it returns True. The script has now passed the second step of validation.

Figure E

Now it should be clear how the script within a script works to make “Pay to Script Hash” possible. Notice that there is a two-step validation process. The first step validates that the <script hash> in the output matches the hash of <serialized script> in the input. The second step validates the script contained within <serialized script>. The Bitcoin node uses this two-step validation process only when the output script is recognized as a "Pay to Script Hash".
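
The two-step validation can be sketched in Python for the 1 of 2 example from Figure E. As before, this is a hedged illustration and not a real node's code: toy_sign is a hypothetical, insecure stand-in for ECDSA, the keys are placeholders, hash160 falls back to a truncated SHA256 if RIPEMD-160 is unavailable locally, and the serializer handles only the three opcodes and the short data pushes that this example needs.

```python
import hashlib

OPCODES = {"OP_1": 0x51, "OP_2": 0x52, "OP_CHECKMULTISIG": 0xAE}
OPCODE_NAMES = {byte: name for name, byte in OPCODES.items()}


def hash160(data: bytes) -> bytes:
    sha = hashlib.sha256(data).digest()
    try:
        return hashlib.new("ripemd160", sha).digest()
    except ValueError:
        # Assumption: RIPEMD-160 may be missing locally; use a stand-in.
        return hashlib.sha256(sha).digest()[:20]


def toy_sign(pubkey: bytes) -> bytes:
    """Toy stand-in for an ECDSA signature. NOT secure, illustration only."""
    return hashlib.sha256(b"toy-signature:" + pubkey).digest()


def serialize(script) -> bytes:
    out = bytearray()
    for item in script:
        if isinstance(item, bytes):
            if len(item) >= 0x4C:
                raise ValueError("sketch only handles short data pushes")
            out.append(len(item))        # length byte, then the data
            out += item
        else:
            out.append(OPCODES[item])
    return bytes(out)


def deserialize(raw: bytes):
    script, i = [], 0
    while i < len(raw):
        byte = raw[i]
        i += 1
        if byte < 0x4C:                  # a length byte: read that much data
            script.append(raw[i:i + byte])
            i += byte
        else:
            script.append(OPCODE_NAMES[byte])
    return script


def run_redeem_script(script, stack) -> bool:
    for item in script:
        if isinstance(item, bytes):
            stack.append(item)
        elif item in ("OP_1", "OP_2"):
            stack.append(int(item[3:]))
        elif item == "OP_CHECKMULTISIG":
            n = stack.pop()
            pubkeys = [stack.pop() for _ in range(n)][::-1]
            m = stack.pop()
            signatures = [stack.pop() for _ in range(m)][::-1]
            # A shared iterator keeps the key scan moving forward only.
            remaining = iter(pubkeys)
            stack.append(all(
                any(sig == toy_sign(key) for key in remaining)
                for sig in signatures
            ))
    return bool(stack) and stack[-1] is True


def validate_p2sh(script_sig, script_hash: bytes) -> bool:
    stack = list(script_sig)             # the input script holds only constants
    serialized = stack[-1]
    # Step one: OP_HASH160 <script hash> OP_EQUAL
    if hash160(serialized) != script_hash:
        return False
    # Step two: run the de-serialized script against the remaining stack.
    stack.pop()
    return run_redeem_script(deserialize(serialized), stack)


key_1 = b"\x04" + b"\x01" * 64           # placeholder keys, not real ones
key_2 = b"\x04" + b"\x02" * 64
redeem_script = serialize(["OP_1", key_1, key_2, "OP_2", "OP_CHECKMULTISIG"])
script_hash = hash160(redeem_script)
print(validate_p2sh([toy_sign(key_2), redeem_script], script_hash))
```

The sender only ever needs script_hash. The full list of keys appears on the network only when the receiver spends the output and reveals the serialized script.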


Wednesday, July 30, 2014
