In order to understand this, we will need to understand some core concepts first.
How do transactions in Bitcoin work?
Before we continue, a huge shout out to Professor Donald J Patterson and his Youtube channel “djp3” for the explanation.
Suppose Alice wants to send a certain number of bitcoins to Bob. How does the transaction system in Bitcoin work? Bitcoin transactions are very different from fiat wallet transactions. If Alice were to give $2 to Bob, she would physically take 2 dollars from her wallet and hand them to Bob. However, things don’t work like that in Bitcoin. You don’t physically own any bitcoins; what you have is the proof that you own them.
There are two more things that you need to know:
- The miners validate your transactions by putting the data inside the blocks that they have mined. In return for this service, they charge a transaction fee.
- When it comes to fiat currency, you don’t really keep track of how and where you got a specific note from. E.g., open your wallet right now and take out all the notes and coins in it. Can you tell exactly where you got each note and coin from? Chances are that you can’t. However, in Bitcoin, the history of every single transaction is recorded.
Ok, so now let’s do a deep dive into how a bitcoin transaction between Alice and Bob takes place. There are two sides to a transaction, the Input, and the Output. This entire Transaction will have a name that we will figure out in the end. For now, let’s look at the dynamics.
In order to make this transaction happen, Alice needs to gather bitcoins that she has received in various previous transactions. Remember, like we said before, in Bitcoin every coin is accounted for via a transaction history.
So, suppose Alice needs to pull bitcoins from the following transactions which we shall name TX(0), TX(1) and TX(2). These three transactions will be added together and that will give you the input transaction which we shall call TX(Input).
Diagrammatically, it will look like this:
So, that is it from the input side, let’s check out what the output side will look like.
The output basically consists of the number of bitcoins that Bob will possess after the transaction and any remaining change, which is sent back to Alice. This change can then be used as an input in her future transactions.
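The input and output sides described above can be sketched in a few lines of Python. This is only an illustration: the function name `build_transaction`, the dictionary layout, and the values assigned to TX(0), TX(1) and TX(2) are hypothetical, and amounts are expressed in satoshis (1 BTC = 100,000,000 satoshis) to avoid rounding issues.

```python
def build_transaction(previous_outputs, amount, fee):
    """Pick previous outputs until they cover amount + fee, then compute the change.

    previous_outputs is a list of (name, value) pairs; all values are in satoshis.
    """
    selected = []
    total_in = 0
    for name, value in previous_outputs:
        selected.append(name)
        total_in += value
        if total_in >= amount + fee:
            break
    if total_in < amount + fee:
        raise ValueError("insufficient funds")
    change = total_in - amount - fee
    return {
        "inputs": selected,
        "tx_input_total": total_in,
        "to_bob": amount,
        "change_to_alice": change,
        "fee": fee,
    }


# Hypothetical values for Alice's previous transactions.
alice_outputs = [("TX(0)", 40_000), ("TX(1)", 70_000), ("TX(2)", 50_000)]
tx = build_transaction(alice_outputs, amount=150_000, fee=2_000)
print(tx)
```

With these made-up numbers all three previous transactions are needed, and whatever is left after paying Bob and the fee goes back to Alice as change.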
A pictorial representation of the output side looks like this:
Now, this is a very simple transaction that has just one output (apart from the change); transactions with multiple outputs are also possible.
This is what the basic layout of the transaction looks like. For this entire thing to go through, however, certain conditions must be met.
Conditions of a transaction
- TX(Input) > TX(Output). The total input always has to be greater than the total output. In any transaction, the difference between the input and the output (output + change) is the transaction fee that miners collect. So: Transaction fee = TX(Input) - (TX(Output) + Change).
- On the input side: TX(0) + TX(1) + TX(2) = TX(Input). If Alice doesn’t have the funds necessary to carry out the transaction, the miners will simply reject it.
- Bob will have to show that he can provide the proof needed to get the bitcoins. Alice will lock the outputs to Bob’s public address. He will need to sign with his private key to unlock them and gain access to his funds.
- Alice also needs to prove that she has the right to send the bitcoins in the first place. She does that by signing the transaction with her private key, which produces her digital signature. Anyone can check this signature against her public key and verify that it was indeed Alice who sent the data. This proof is called “signature data”. Remember this because it will be very important later on.
So, what is going to be the name of this entire transaction?
The input (including the signature data) and the output data are added together and hashed using the SHA-256 hashing algorithm (which Bitcoin applies twice). The output hash is the name that is given to this transaction.
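As a rough sketch of that naming step: Bitcoin hashes the serialized transaction with SHA-256 twice and conventionally displays the result with its bytes reversed. The function name `transaction_id` and the placeholder bytes below are assumptions for illustration; they are not a real serialized transaction.

```python
import hashlib


def transaction_id(serialized_tx: bytes) -> str:
    """Double SHA-256 of the serialized transaction, shown in reversed byte order."""
    first_pass = hashlib.sha256(serialized_tx).digest()
    second_pass = hashlib.sha256(first_pass).digest()
    return second_pass[::-1].hex()


# Placeholder bytes standing in for "input + signature data + output".
raw_tx = b"inputs|signature-data|outputs"
print(transaction_id(raw_tx))
```

Because the signature data is part of what gets hashed, anything that alters the signature bytes also alters the name of the transaction.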
The transaction details code
This is what the transaction looks like in code form. Suppose Alice wants to send 0.0015 BTC to Bob and, in order to do so, she uses inputs worth 0.0015770 BTC. This is what the transaction detail looks like:
Image courtesy: djp3 youtube channel.
The first thing that you see is the name of the transaction, aka the hash of the input and output values.
vin_sz is the number of inputs. Since Alice is sending the bitcoins using only one of her previous transactions, it is 1.
vout_sz is 2 because the only outputs are Bob’s payment and the change.
This is the Input data:
See the input data? Alice is only using one input transaction (in the example that we gave above, this will be TX(0)), which is why vin_sz was 1.
Below the input data is her signature data.
Underneath all this is the output data:
The first part of the data signifies that Bob is getting 0.0015 BTC.
The second part signifies that 0.00005120 BTC is what Alice is getting back as change.
Now, remember that our input data was 0.0015770 BTC? This is greater than (0.0015 + 0.00005120) BTC. The difference between these two values, 0.0000258 BTC, is the transaction fee that the miners collect.
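The arithmetic for this example can be checked with a short script. It converts the BTC amounts quoted above to satoshis so that the subtraction is exact; the helper name `to_satoshis` is just an illustrative choice.

```python
from decimal import Decimal

SATOSHIS_PER_BTC = 100_000_000


def to_satoshis(btc: str) -> int:
    """Convert a BTC amount given as a string into an integer number of satoshis."""
    return int(Decimal(btc) * SATOSHIS_PER_BTC)


input_value = to_satoshis("0.0015770")   # what Alice puts in
to_bob = to_satoshis("0.0015")           # first output
change = to_satoshis("0.00005120")       # second output, back to Alice

fee = input_value - (to_bob + change)
print(fee)  # 2580 satoshis, i.e. 0.0000258 BTC
```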
So, this is the anatomy of a simple transaction.
Before we continue though, let’s discuss a special kind of transaction called the coinbase transaction. It is basically the first transaction in a block, and it carries the mining reward that miners get for mining the block. As of right now, the reward is 12.5 BTC. These transactions have no input data; they only have output data. Remember this because it will become important later on.
What is the scalability problem?
Now, remember, all the transactions that happen in the blockchain carry through because miners actually mine these blocks and put the transactions in the blocks to validate them. But, there are only so many transactions that you can put in the block. When Bitcoin was first conceived there was no block limit.
However, Satoshi Nakamoto (the founder(s) of Bitcoin) was forced to add the limit because they foresaw a possible DoS attack (denial of service attack) that hackers and trolls could inflict on the blockchain. They could stuff the blocks with spam transactions, or mine unnecessarily big blocks in order to clog up the system. As a result, blocks were given a 1 MB size limit.
This was workable in the beginning, but as Bitcoin’s popularity kept growing, the number of transactions started adding up. This graph shows the number of transactions that are happening per month:
As you can see, the number of monthly transactions is only increasing, and with the current 1 MB block size limit, bitcoin can only handle 4.4 transactions per second. One of the biggest reasons why transactions are bulky and take up so much space is the signature data that is in them (we did tell you to keep this in mind). In fact, 65% of the space that a transaction uses is taken up by the signature data.
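To see where a figure of that order comes from, here is a back-of-the-envelope estimate. Two inputs are assumptions rather than numbers from this article: a block is found roughly every 10 minutes (Bitcoin's target interval), and the average transaction is taken to be about 380 bytes, a value picked only because it lands near the 4.4 figure quoted above.

```python
def transactions_per_second(block_size_bytes, avg_tx_size_bytes, block_interval_seconds=600):
    """Estimate throughput as transactions per block divided by seconds per block."""
    txs_per_block = block_size_bytes / avg_tx_size_bytes
    return txs_per_block / block_interval_seconds


# 1 MB block limit; 380 bytes per transaction is an assumed average.
tps = transactions_per_second(1_000_000, avg_tx_size_bytes=380)
print(round(tps, 1))
```

The estimate is very sensitive to the assumed transaction size, which is why shrinking the space taken up by each transaction matters so much for capacity.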
As the number of transactions increased by leaps and bounds, the rate at which the blocks filled up increased as well. More often than not, people had to wait until new blocks were created for their transactions to go through. This created a backlog of transactions. In fact, the only way to get your transaction prioritized was to pay a transaction fee high enough to incentivize the miners to pick it up first.
This led to the “replace-by-fee” system. Basically, this is how it works. Suppose Alice is sending 5 bitcoins to Bob, but the transaction is not going through because of a backlog. She can’t “delete” the transaction once it has been broadcast. However, she can send another transaction of 5 bitcoins to Bob that spends the same inputs, this time with a transaction fee high enough to incentivize the miners. Once the miners put this new transaction in a block, the previous transaction becomes null and void because its inputs have already been spent.
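A toy model of that replacement logic is sketched below. It is heavily simplified and assumes a pool of waiting transactions keyed by the inputs they spend; the names `submit` and `pending` are hypothetical, and real node policies apply more rules than a simple fee comparison.

```python
def submit(pending: dict, tx: dict) -> bool:
    """Add tx to the pending pool, replacing a conflicting one only if tx pays a higher fee."""
    key = tuple(sorted(tx["inputs"]))
    existing = pending.get(key)
    if existing is not None and tx["fee"] <= existing["fee"]:
        return False  # the waiting transaction already pays at least as much
    pending[key] = tx
    return True


pending = {}
# Fees are in satoshis; both transactions spend the same hypothetical input.
first = {"inputs": ["TX(0)"], "to": "Bob", "amount_btc": 5, "fee": 1_000}
bumped = {"inputs": ["TX(0)"], "to": "Bob", "amount_btc": 5, "fee": 20_000}

print(submit(pending, first))   # True: nothing to conflict with
print(submit(pending, bumped))  # True: higher fee replaces the stuck transaction
print(len(pending))             # 1: only the replacement is left waiting
```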
While the “replace-by-fee” system is profitable for the miners, it is pretty inconvenient for users who may not be that well to do. In fact, here is a graph of the waiting time that a user will have to go through if they paid the minimum possible transaction fees:
Image courtesy: Business Insider.
If you pay the lowest possible transaction fee, you will have to wait a median time of 13 minutes for your transaction to go through.
One possible solution put forward to speed up transactions was the Lightning Network.
What is the lightning network?
The lightning network is an off-chain micropayment system which is designed to make transactions work faster in the blockchain. It was conceptualized by Joseph Poon and Tadge Dryja in their white paper, which aimed to solve the block size limit and the transaction delay issues. It operates on top of Bitcoin and is often referred to as “Layer 2”.
As Jimmy Song notes in his medium article:
“The Lightning Network works by creating a double-signed transaction. That is, we have a new check that requires both parties to sign for it to be valid. The check specifies how much is being sent from one party to another. As new micro-payments are made from one party to the other, the amount on the check is changed and both parties sign the result.”
The network will enable Alice and Bob to transact with each other without being held captive by a third party, aka the miner. In order to activate this, the transaction needs to be signed off by both Alice and Bob before it is broadcast to the network. This double signing is critical for the transaction to go through.
However, here is where we face another problem.
Since the double-signed transaction relies heavily upon the transaction identifier, if for some reason the identifier is changed, this will cause an error in the system and the Lightning Network will not activate. In case you are wondering what the transaction identifier is, it is the transaction name, aka the hash of the input and output data. In the example we have given before:
This is the transaction identifier.
Now, you might be wondering, what would cause the transaction identifier to change? This brings us to an interesting bug in the bitcoin system called “Transaction Malleability”.
What is transaction malleability?
Before we understand what transaction malleability is, it is important to recap one of the most important functions in the cryptoeconomics model: hashing. We have written an article before which covers hashing in detail. Just to give you a brief overview, a hashing function can take in an input of any length, but the output it gives is always of a fixed length.
However, there is one more important property of hashing that you need to know to understand the “transaction malleability bug”, as it is called. Any small change in the input data will drastically change the output hash.
E.g., check out this test that we did with SHA-256, aka the hashing algorithm used in bitcoin:
We just changed “T” from uppercase to lowercase, and look at what it did to the output!
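You can reproduce this kind of test yourself with Python's standard `hashlib` module. The two strings below are stand-ins chosen for illustration; they are not necessarily the exact inputs used in the test above.

```python
import hashlib


def sha256_hex(message: str) -> str:
    """Return the SHA-256 digest of a text message as 64 hexadecimal characters."""
    return hashlib.sha256(message.encode("utf-8")).hexdigest()


upper = sha256_hex("This is a test")
lower = sha256_hex("this is a test")

print(upper)
print(lower)

# Count how many of the 64 hex characters differ between the two digests.
differing = sum(1 for a, b in zip(upper, lower) if a != b)
print(differing, "of", len(upper), "hex characters differ")
```

Both outputs have the same fixed length, yet changing a single letter's case produces a digest that looks completely unrelated.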
One more thing that you need to understand about the blockchain is that it is immutable, meaning that once data has been inserted in a block, it can never be changed. While this provides a safety net against corruption, there was one weakness that nobody saw coming.
What if the data was tampered with before it even entered the block? Even if people found out about it later on, there would be nothing that anyone could do about it, because data once entered in a block can never be taken out! That, in essence, is why malleability of transactions is such a problem.
Now, why does transaction malleability happen?
Turns out that the signature that goes along with the input data can be manipulated, which in turn can change the transaction ID. In fact, it can make it seem like the transaction didn’t even happen in the first place. Let’s see this in an example.
Suppose Bob wants Alice to send him 3 BTC. Alice initiates a 3 BTC transaction to Bob’s public address and then sends it over to the miners for approval. While the transaction is waiting in the queue, Bob uses transaction malleability to alter Alice’s signature and change the transaction ID.
Now there is a chance that this tampered transaction will be approved before Alice’s original, in which case Alice’s transaction is rejected because its inputs have already been spent. When Bob gets his 3 BTC, he can simply tell Alice that he didn’t get the 3 BTC that she owed him. Alice will then see that her transaction didn’t go through and will then resend it. As a result, Bob will end up with 6 BTC instead of 3 BTC.
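The core of the trick can be shown with a toy model. The sketch below assumes a made-up serialization in which the transaction ID is the double SHA-256 of inputs, signature, and outputs concatenated; the byte strings are placeholders, not real signatures, and `legacy_txid` is a hypothetical helper name.

```python
import hashlib


def legacy_txid(inputs: bytes, signature: bytes, outputs: bytes) -> str:
    """Toy pre-segwit ID: the signature bytes are part of the hashed data."""
    payload = inputs + signature + outputs
    return hashlib.sha256(hashlib.sha256(payload).digest()).hexdigest()


inputs = b"spend:TX(0)"
outputs = b"pay:Bob:3BTC"

alice_signature = b"signature-bytes"
# Placeholder for an altered encoding of the same signature.
altered_signature = b"signature-bytes-reencoded"

original_id = legacy_txid(inputs, alice_signature, outputs)
malleated_id = legacy_txid(inputs, altered_signature, outputs)

print(original_id)
print(malleated_id)
print(original_id == malleated_id)  # False: same payment, different transaction ID
```

The payment itself (who pays whom, and how much) is identical in both versions, which is why Alice, looking for her original ID, can be fooled into thinking nothing was sent.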
That is how transaction malleability can work and this is a serious problem. Check this out:
Image courtesy: Bitcoin Magazine.
These are statistics from the 2015 malleability attack on Bitcoin. The red lines roughly represent malleated transactions on the network.
Now, remember what we said in the beginning? Transaction malleability was happening because the signature data can be tampered with. So, not only was the signature data eating up block space, it was also posing a serious threat through transaction malleability.
The solution and the fears of a hard fork
Way back in 2012 people were exploring the idea of taking signature data away from the transactions. People like Russell O’Connor, Gregory Maxwell, Luke Dashjr and Dr. Adam Back were working on a way to make this work, but they all were hitting a wall. They realized that the only way that this could go forward was to do a hard fork, and nobody wanted to do that.
But then, in 2015, Blockstream’s Dr. Pieter Wuille came up with a possible solution.
Sidechains and Segwit
The sidechain as a concept has been around in bitcoin circles for quite some time now. The idea is very straightforward: you have a parallel chain which runs along with the main chain. The sidechain is attached to the main chain via a two-way peg.
This is what Blockstream’s initial idea of the Bitcoin blockchain and a sidechain looked like:
Image courtesy: Bitcoin Magazine
What Dr. Wuille thought of was simple: why not add a feature to this sidechain? This feature would include the signature data of all transactions, separating it from the main chain in the process. This feature would be called Segregated Witness, aka Segwit.
This is what a block would look like once it implements segwit:
So by removing the signature data from the transactions, segwit killed two birds with one stone: block space was freed up and transactions were no longer malleable. There was one more thing that needed to be worked on, however. Segwit activation was possible only via a hard fork, which is what everyone wanted to avoid. The developers wanted to look at soft fork alternatives. That was when Luke Dashjr struck gold.
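Going back to the malleability half of that claim, a toy model shows why separating the signature data helps. The sketch assumes a made-up serialization with two identifiers: `txid`, which hashes only inputs and outputs, and `wtxid`, which also covers the segregated signature (witness) data. The field names and byte strings are placeholders for illustration.

```python
import hashlib


def double_sha256_hex(data: bytes) -> str:
    return hashlib.sha256(hashlib.sha256(data).digest()).hexdigest()


def txid(tx: dict) -> str:
    """Identifier that ignores the segregated signature (witness) data."""
    return double_sha256_hex(tx["inputs"] + tx["outputs"])


def wtxid(tx: dict) -> str:
    """Identifier that also covers the witness data."""
    return double_sha256_hex(tx["inputs"] + tx["outputs"] + tx["witness"])


original = {"inputs": b"spend:TX(0)", "outputs": b"pay:Bob:3BTC", "witness": b"signature-A"}
# Same payment with the signature bytes altered by a third party.
malleated = dict(original, witness=b"signature-A-reencoded")

print(txid(original) == txid(malleated))    # True: the transaction ID is stable
print(wtxid(original) == wtxid(malleated))  # False: only the witness-aware hash changes
```

Since systems such as the Lightning Network refer to transactions by their ID, keeping the signature out of that hash is what makes the ID dependable.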
Segwit as a soft fork
To utilize segwit as a soft fork the developers had to come up with 2 ingenious innovations. They are as follows:
- Arrange the signature data in the side chains in the form of a Merkle Tree.
- Keep a part of the signature data in a new part of the block.
Before we continue, let’s do a brief refresher on Merkle trees.
What is a Merkle Tree?
Image Courtesy: Wikipedia
The above diagram shows what a Merkle tree looks like. In a Merkle tree, each non-leaf node is the hash of the values of its child nodes.
Leaf Node: The leaf nodes are the nodes in the lowest tier of the tree. So with respect to the diagram above, the leaf nodes are L1, L2, L3 and L4.
Child Nodes: For a given node, the nodes in the tier below that feed into it are its child nodes. With respect to the diagram, the nodes labeled “Hash 0-0” and “Hash 0-1” are the child nodes of the node labeled “Hash 0”.
Root Node: The single node on the highest tier labeled “Top Hash” is the root node aka the Merkle root.
All the transactions inside a block are arranged in the form of a Merkle tree, and the Merkle root of all that data is kept inside the block. The transactions can all be accessed by traversing through the Merkle root.
(If you want a detailed explanation of Merkle Trees and their application in Blockchain then checkout our article on “Hashing”).
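For a quick feel of how a Merkle root is built, here is a minimal sketch using the four leaves L1 to L4 from the diagram. It assumes double SHA-256 as the hash and, for an odd number of nodes in a tier, duplicates the last one, which is the convention Bitcoin uses; the function names are illustrative.

```python
import hashlib


def double_sha256(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_root(leaves):
    """Hash each leaf, then repeatedly hash pairs of nodes until one root remains."""
    if not leaves:
        raise ValueError("need at least one leaf")
    level = [double_sha256(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # odd tier: pair the last node with itself
        level = [double_sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


leaves = [b"L1", b"L2", b"L3", b"L4"]
root = merkle_root(leaves)
print(root.hex())

# Changing any single leaf changes the root.
tampered_root = merkle_root([b"L1", b"L2", b"L3", b"L4-tampered"])
print(root != tampered_root)  # True
```

This is why storing only the root in a block is enough to commit to every item underneath it.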
So, what the segwit developers suggested was, why not run another Merkle tree, but only with the signature data? That was the first innovation.
The second innovation was knowing where exactly to put the Merkle root of the signature data. The developers knew that in order to activate the segwit soft fork, the signature root needed to be placed in the block. The spot that they chose was the coinbase transaction. Now remember, we talked about this before: the coinbase transaction is the first transaction in a block. This is basically the transaction that gives miners their reward, and it has no input value whatsoever.
What the developers didn’t realize was by doing so they were unwittingly stumbling on something that would have far wider repercussions.
By putting the signature Merkle root in a new place in the block, they were increasing the block size…without actually increasing the block size limit in the first place! So basically what segwit achieved was an increase in block size AND a transition that was backwards compatible, aka a soft fork! This was a major breakthrough which gave the bitcoin network a temporary fix for its scaling issues.
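One way to make this concrete is the "block weight" rule from the segwit specification (BIP 141), which is not spelled out in this article: weight = 3 × base size + total size, capped at 4,000,000, where the base size excludes signature (witness) data. The sketch below uses hypothetical byte counts to show a block that is larger than 1 MB in total while still looking like an 800 KB block to nodes that never see the witness data.

```python
MAX_BLOCK_WEIGHT = 4_000_000       # weight limit from BIP 141
LEGACY_MAX_BLOCK_SIZE = 1_000_000  # the 1 MB limit that old nodes enforce


def block_weight(base_size: int, witness_size: int) -> int:
    """Weight = 3 * base size + total size, where total = base + witness data."""
    total_size = base_size + witness_size
    return base_size * 3 + total_size


def fits(base_size: int, witness_size: int) -> bool:
    old_nodes_ok = base_size <= LEGACY_MAX_BLOCK_SIZE
    new_nodes_ok = block_weight(base_size, witness_size) <= MAX_BLOCK_WEIGHT
    return old_nodes_ok and new_nodes_ok


# Hypothetical block: 800 KB of non-witness data plus 600 KB of signatures.
base, witness = 800_000, 600_000
print(base + witness)               # 1400000 bytes in total
print(block_weight(base, witness))  # 3800000
print(fits(base, witness))          # True
```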
The Hong Kong Scalability Convention and segwit detractors
At the 2015 Hong Kong convention, Dr. Wuille introduced the Segwit proposal, which was largely very well received. This was supposed to be the answer that everyone was looking for. It was hoped that everyone would jump on board. However, it didn’t work out that way. Some of the miners had a big problem with Segwit.
When the developers built SegWit they added a special clause to it. It can only be activated when it has 95% approval from the miners. After all, it is a huge change in the system and they figured that getting a super majority was the way to go. However, this caused a disruption in the system. Some of the miners didn’t want segwit to activate. They were afraid that since the available block space will increase, there will be more space available for transactions and that will reduce the waiting time.
This, in turn, would reduce the transaction fees and kill off the “replace-by-fee” system, which together are their main sources of income (apart from the block reward). So as a result, the implementation of segwit was stalled. This, in turn, infuriated the users. In the context of a blockchain, users are people who run nodes in the blockchain network. They realized that something needed to be done to encourage the miners to mine segwit activated blocks.
Along with the miners, there were some developers who weren’t happy with the segwit solution. In their eyes, a temporary solution wasn’t good enough; something more permanent, like a block size increase, was needed. One of the bitcoin clients offering a block size increase, named “Bitcoin Unlimited”, was gaining a lot of support. Barry Silbert, the CEO of DCG, believed that the bitcoin community was in a lot of turmoil which, if not addressed, could lead to a lot of tension in the future. He called everyone in for a truce meeting in New York. The outcome of this meeting is what is known as “The New York Agreement”.
The New York Agreement
On May 21, 2017, prominent members of the Bitcoin community met in New York for the convention. After a lot of deliberation, a compromise was reached between the pro-segwit and the pro-blocksize increase camps. The outcome of the meeting is often called “The New York Agreement” or Segwit2x. It is basically a two-stage agreement.
- Stage 1: Segwit gets up and running. The percentage of miners who need to consent to get this up and running goes down from 95% to 80%. After the soft fork, any blocks that are not segwit friendly will automatically be rejected from the blockchain. Miners who supported this started including the letters “NYA” in their blocks.
- Stage 2: 6 months after segwit activation, the blockchain will undergo a hard fork and the block size will be increased from 1 MB to 2 MB.
Image courtesy: DCG article in Medium.
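The effect of lowering the approval threshold from 95% to 80% can be illustrated with a tiny sketch. It assumes approval is measured as the share of recent blocks whose miners signal support; the window of 100 blocks and the 85 supporting blocks are made-up numbers, and `threshold_reached` is a hypothetical helper.

```python
def threshold_reached(block_signals, threshold):
    """Return True if the share of signalling blocks meets the required threshold."""
    if not block_signals:
        return False
    share = sum(block_signals) / len(block_signals)
    return share >= threshold


# Hypothetical window: 85 of the last 100 blocks signal support.
signals = [True] * 85 + [False] * 15

print(threshold_reached(signals, 0.95))  # False: short of the original 95% requirement
print(threshold_reached(signals, 0.80))  # True: enough under the 80% requirement
```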
Aftermath of the New York Agreement
There were some very vocal detractors of Segwit2x. In fact, this led to the series of events which eventually gave birth to Bitcoin Cash. However, a lot of the members of the community decided that this was the best path forward for bitcoin. Everyone was very excited about the upcoming segwit activation, which was expected around mid-July. But then, because of a lot of complications, the miners missed the window!
Segwit was not activated when it should have been, and that caused widespread panic because it was felt that this would split the Bitcoin Core community even more. This dropped BTC’s price from $2500 all the way down to $1900, the lowest it had been in over a month. This drop in price startled the mining community and spurred them into action. By July 20th, the first stage of segwit activation, BIP 91, was locked in. By August 8th the point of no return was reached and finally, on August 24th, Segwit got activated. Let’s see what Segwit had to say about that:
Image courtesy: segwit.co
The pros and cons of Segwit
Pros of segwit:
- Increases the number of transactions that a block can take.
- Decreases transaction fees.
- Reduces the size of each individual transaction.
- Transactions can now be confirmed faster because the waiting time will decrease.
- Helps in the scalability of bitcoin.
- Since the number of transactions in each block will increase, it may increase the total overall fees that a miner may collect.
- Removes transaction malleability.
- Aids in the activation of lightning protocol.
- Removes the quadratic hashing problem: Quadratic hashing is an issue that comes along with block size increase. The problem is that in certain transactions, signature hashing scales quadratically:
Image courtesy: Bitcoincore.org
Basically, doubling the size of a transaction can double both the number of signature operations and the amount of data that has to be hashed to verify each of those signatures. The total hashing work can therefore grow roughly four-fold, which increases the verification time by a huge amount. This opens the gates for malicious parties who may want to spam the blockchain.
Segwit resolves this by changing the calculation of the signature hash, making the whole process more efficient as a result.
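A rough model makes the difference visible. The sketch below assumes that, before segwit, verifying each input's signature requires hashing roughly the whole transaction, whereas afterwards each signature hashes a roughly fixed amount of data. The byte counts (150 bytes per input, 50 bytes of overhead, 200 bytes per signature hash) are illustrative assumptions, not measured values.

```python
def legacy_bytes_hashed(num_inputs, bytes_per_input=150, overhead=50):
    """Each of the num_inputs signatures hashes about the whole transaction."""
    tx_size = overhead + num_inputs * bytes_per_input
    return num_inputs * tx_size


def segwit_bytes_hashed(num_inputs, bytes_per_signature_hash=200):
    """Each signature hashes a roughly constant amount of data."""
    return num_inputs * bytes_per_signature_hash


for n in (100, 200, 400):
    print(n, legacy_bytes_hashed(n), segwit_bytes_hashed(n))
```

In this model, doubling the number of inputs roughly quadruples the data hashed in the old scheme but only doubles it in the new one.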
Cons of segwit:
- Miners will now get lower transaction fees for each individual transaction.
- The implementation is complex and all the wallets will need to implement segwit themselves. There is a big chance that they may not get it right the first time.
- It will significantly increase resource usage, since capacity, transaction volume and bandwidth will all go up.
- As the creation of Bitcoin Cash shows, it did ultimately split up the Bitcoin Core community.
- Another problem with Segwit is the maintenance. The sidechain containing the signature data will need to be maintained by miners as well. However, unlike with the main blockchain, the miners have no financial incentive to do so. It will need to be done pro bono, or some reward scheme will need to be devised to incentivize the miners.
The following few months could be the most important and exciting times in Bitcoin history since Satoshi Nakamoto first published the Bitcoin white paper. Let’s see what the future potentially holds for various parties.
BTC has been growing from strength to strength post-Segwit activation:
Image Courtesy: Coindesk
On September 2, 2017, BTC hit a record high of $5000 before quickly readjusting to $4690. BTC finally scaled the $5000 mountain and there is no reason why that can’t become the new norm.
Bitcoin Cash provides a very interesting case study and a very strong option for anyone who is looking to diversify their crypto portfolio. No one can say what will happen in its future but one thing’s for sure, it has the potential to be a long term BTC alternative.