A technical introduction to Bitcoin for non-technical people
This post is something I’ve wanted to exist for a long time, but have never really found to my liking: an introductory but real explanation of how Bitcoin actually works, meant for a smart but non-technical audience. The original Bitcoin white paper is actually quite readable, and if you’ve ever thought about reading it you absolutely should. This post is meant for anyone who might be a bit intimidated by the technical stuff it talks about, but wants to learn anyway. Have no fear: although it’s a complicated system, there are no individual parts to it that are terribly difficult to understand. This post is going to walk through all of the basic things you need to know, and how they fit together, with the following sections:
Cryptography: How do we use math to prove ownership and identity?
Proof of Work: How do we use cryptography to create scarcity and objectively difficult tasks?
Peer to Peer Currency: Can we use cryptography to create a digitally native, open source Internet currency?
Double Spending & Consensus: What kinds of problems do we need to defend against?
The Byzantine Generals Problem: How can anonymous, untrustworthy strangers come to consensus on the internet?
Collaboration versus Competition: Instead of asking people to collaborate to find consensus, what if we had them compete?
The Blockchain: How can we use cryptography to make consensus unalterable once achieved?
The Race Against Time: How does proof of work make sure that no individual can ever trick the group into believing the wrong thing?
Bitcoin is like a Clock: How does proof of work turn continuous time in the real world into discrete chunks of time on the blockchain?
A Circular Incentive System: How do we trust that this circular network of incentives will perpetuate?
Some time later, I intend to follow this up with a second piece, A financial introduction to Bitcoin for non-financial people to cover the other side of the token: a deeper look at the incentives, economics, and politics of Bitcoin and how it meets the real world. This post is more meant to be a strictly technical introduction: how we use cryptography to create something we keep calling “trust”, which is a word that I actually really don’t like (I’m going to use it as few times as possible for the remainder of this post). I’ll spend a little bit of time on why it is that some people want this to exist, but for the most part I’m going to stick to how it works.
Cryptography
First of all, we need to understand what the “Crypto” in cryptocurrency stands for: it stands for cryptography, which is the foundation on top of which all of this rests. Cryptography underpins digital security: it helps software architects build and wield a kind of asymmetry: constructing digital structures and operations that are expensive to attack, and cheap to defend. This asymmetry is crucial for robust systems that need to withstand attacks and unknown threats.
The easiest path to understanding cryptography and what it’s useful for is the concept of “one-way math problems”. Suppose I were to ask you a math problem: 2,763,389 is the product of what two prime numbers? If you’re armed only with a calculator, it’s going to take you a long time to find the answer. You’re going to have to try every single prime number sequentially until you find an answer. However, the reverse math problem is much easier: What is 1,571 x 1,759? The answer is 2,763,389; if you have a calculator you can do that math problem in seconds. This math problem has a useful property for us: it is hard to do, but easy to verify. Now suppose I told you in advance to remember the secret number “1,571”. If I post the number 2,763,389 as a public prime number challenge, it’s trivially easy for you to find the other prime number (1,759) while it will be quite hard for somebody else who doesn’t know the secret information. This is a simple example of the kind of “mathematical asymmetry” that cryptography exploits.
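If you'd like to see that asymmetry for yourself, here is a minimal Python sketch. The function name smallest_factor is my own invention for illustration, and it tries every whole number rather than only primes, which is slower but simpler to read.

```python
def smallest_factor(n: int) -> int:
    """Return the smallest factor of n greater than 1, found by trial division."""
    candidate = 2
    while candidate * candidate <= n:
        if n % candidate == 0:
            return candidate
        candidate += 1
    return n  # no factor found, so n itself is prime


PUBLIC_CHALLENGE = 2_763_389

# Hard direction: with no hints, we have to search for a factor.
p = smallest_factor(PUBLIC_CHALLENGE)
q = PUBLIC_CHALLENGE // p

# Easy direction: checking a claimed answer is one multiplication.
assert p * q == PUBLIC_CHALLENGE

# With the secret number in hand, the other factor is one division.
SECRET = 1_571
other = PUBLIC_CHALLENGE // SECRET
print(p, q, other)
```

The search loop runs over a thousand times before it finds anything, while the check at the end is a single step. Real cryptography uses numbers hundreds of digits long, where the gap between the two directions becomes astronomically larger.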
Hash Functions: One generally useful kind of asymmetric function is something called a “Hash Function”. A hash function, at its simplest, goes “for any input, we’re going to process it with a certain rule, and generate a deterministic output which doesn’t reveal what the input number was.” So let’s say the rule is “the sum of the digits in the number” and I give the input number 126631. The answer is 19; finding the answer is easy. But reverse engineering the input is not. If I give you the number 19 and I ask you to find my original number, even if you know the sum-of-digits rule, it’s not going to be easy for you to guess my original number 126631. (This kind of rule is quite simple, and it has one problem in particular, which is that there are many different input numbers that could give you the same output: 991, 8722, 3333331…)
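Here is the sum-of-digits rule as a few lines of Python, purely as a toy. The name digit_sum_hash is made up for this sketch; nobody would use this rule as a real hash function.

```python
def digit_sum_hash(number: int) -> int:
    """Toy hash rule from the text: add up the digits of the input."""
    return sum(int(digit) for digit in str(number))


# Going forward is easy.
print(digit_sum_hash(126631))

# The other inputs mentioned above all land on the same output.
collisions = [n for n in (991, 8722, 3333331) if digit_sum_hash(n) == 19]
print(collisions)

# Going backward is ambiguous: many inputs below 100,000 also produce 19.
preimages = [n for n in range(100_000) if digit_sum_hash(n) == 19]
print(len(preimages), "candidate inputs below 100,000")
```

Knowing the output and the rule still leaves you with a long list of candidates, which is the one-way property. The long list is also the weakness: a good hash function makes such collisions practically impossible to find.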
Good hash algorithms, like one we’ll see later on called SHA-256 (“Secure Hash Algorithm”), have a useful set of properties in common. For any given input of a string of characters: 1) the outputs will be unique, to a sufficiently high degree of probability, 2) they’ll all be the same length, so they can be easily stored and compared, and 3) small changes in the input will lead to huge changes in the output. So the phrase “helloworld” might return “1672dc3a77bbc3c1” while “helloworlf” which is only one character off will return “f7b79870d3e3e98a”.
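You can try this yourself with Python's built-in hashlib module. The helper name sha256_hex is my own. Note that the sample outputs in the paragraph above are illustrative: a real SHA-256 output is 64 hexadecimal characters long.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Return the SHA-256 hash of a piece of text as a hexadecimal string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


a = sha256_hex("helloworld")
b = sha256_hex("helloworlf")

print(a)
print(b)

# Same length no matter the input, so outputs are easy to store and compare.
print(len(a), len(b))

# Count the positions where the two outputs disagree.
differing = sum(1 for x, y in zip(a, b) if x != y)
print(differing, "of", len(a), "characters differ")
```

Changing a single letter of the input scrambles the output beyond recognition, and running the same input twice always gives the same output.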
Where are hash functions useful? One way we use them is when storing passwords. Suppose you’re a website that stores user login and password information. A simple way for you to verify logins is for you to compare a provided password with one you’ve stored; if they match, let the user in. But what happens if you get hacked? If you’re storing passwords in plaintext as-is, then you’ve just exposed all your users’ passwords. That’s bad. Fortunately, there is a much better way for you to do things. Instead of storing the user’s password, store the hash of their password only. When the user logs in, they provide you with a password; hash it, then compare the hashes and see if they match. Why do this? Well, it’s equally secure as a way of verifying identity. But if you get hacked, you’ve given up nothing of value. The hackers only steal a bunch of hash outputs, which have no meaning anywhere else, cannot be used as login information anywhere (even here, because re-hashing that hashed value will return something meaningless), and give you no hints about the original password. This is one-way math being put to effective use.
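A minimal sketch of that login flow might look like the following. The names stored_hashes, register and login are hypothetical, and I'm using plain SHA-256 only to mirror the explanation above; real websites add a random "salt" to each password and use a deliberately slow hashing scheme, which this sketch leaves out.

```python
import hashlib

# What the website keeps on disk: usernames and hashes, never passwords.
stored_hashes = {}


def hash_password(password: str) -> str:
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


def register(username: str, password: str) -> None:
    stored_hashes[username] = hash_password(password)


def login(username: str, password: str) -> bool:
    """Hash whatever was typed in and compare it with the stored hash."""
    expected = stored_hashes.get(username)
    return expected is not None and hash_password(password) == expected


register("julia", "correct horse")

print(login("julia", "correct horse"))         # the right password
print(login("julia", "wrong guess"))           # a wrong password
print(login("julia", stored_hashes["julia"]))  # a stolen hash typed in as the password
```

The last line is the interesting one: a thief who types the stolen hash into the login box gets it hashed a second time, and the result no longer matches what's stored.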
Public Key Cryptography: We can use math like this to formalize many different kinds of security challenges. One common theme you may have heard of is something called “public key cryptography”, which we use to encrypt messages. Encryption, which we do today with tools like PGP (which stands for “Pretty Good Privacy”), is another important technique. Suppose I want anyone to be able to send me a message on a public channel in a way that’s encrypted and secure, but for only me to be able to decrypt and read it. The way we do this is: everyone who wants to be able to send and receive encrypted messages uses PGP to generate a pair of cryptographic keys: one is the “Public Key” which you’ll share with the broader world, and the other is your “Private Key”, which you’ll guard for only yourself. If Julia wants to send a private message to Howie, she takes her message, “Hi Howie”, and encrypts it with PGP, specifying Howie’s public key as the recipient. The message will get encrypted into something like “735ab5b98c91cc3c”. In order to decrypt this message back into “Hi Howie”, you need to have Howie’s private key, which hopefully only Howie possesses. If you don’t have Howie’s private key (if you’re trying to spy on them, for instance) there is no known practical way to break the encryption other than brute force, trying every possible key; the difficulty level of modern encryption services is sufficiently high as to make unauthorized decryption effectively impossible. When you use a service like Signal or Telegram to send a secure message, this is broadly what it’s doing in the background.
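To make the public key / private key idea concrete, here is a toy sketch using RSA, one well-known public key scheme, built from the same two primes as the earlier math problem. This is not how PGP actually works internally: real keys are hundreds of digits long, real tools add padding and other safeguards, and encrypting one character at a time like this is not secure. The function names are my own, and the modular inverse call needs Python 3.8 or newer.

```python
# Toy key pair built from the two primes used earlier in this post.
P, Q = 1_571, 1_759
N = P * Q                  # 2,763,389: shared publicly
PHI = (P - 1) * (Q - 1)    # only computable if you know P and Q
E = 65_537                 # public exponent, shared publicly
D = pow(E, -1, PHI)        # private exponent, kept secret

HOWIE_PUBLIC_KEY = (E, N)
HOWIE_PRIVATE_KEY = (D, N)


def encrypt(message, public_key):
    """Turn each character into a number and scramble it with the public key."""
    e, n = public_key
    return [pow(ord(ch), e, n) for ch in message]


def decrypt(ciphertext, private_key):
    """Unscramble each number with the private key and turn it back into text."""
    d, n = private_key
    return "".join(chr(pow(c, d, n)) for c in ciphertext)


ciphertext = encrypt("Hi Howie", HOWIE_PUBLIC_KEY)
print(ciphertext)
print(decrypt(ciphertext, HOWIE_PRIVATE_KEY))
```

Julia only ever needs the public pair (E, N) to write to Howie. Recovering D requires knowing P and Q, which brings us right back to the hard factoring problem from before.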
In general, cryptography is useful whenever we want to make a technical system robust against attack at little cost to the defender. We’ve become very good at it; an ordinary person with no resources other than a laptop can generate encrypted communications that are so strong that even the NSA couldn’t break them if they wanted to. (We think, anyway.) Problems of proving identity and proving ownership, in cryptographic terms, are basically solved problems at this point. That’s good news for us: identity and ownership are important aspects of money and payments!
However, where cryptography doesn’t work as easily is when we have to deal with proving uniqueness, or when dealing with transfer of ownership. It’s easy for me to prove I possess a certain key, but it’s much harder for me to prove that I am the only person who possesses this key. Or, if I transfer ownership of the key to someone else, it’s hard for me to prove that I no longer possess the key. This is where we run into problems for things like digital money: numbers can just be copied. If I want to create some sort of “provably scarce” item, like a currency for instance, the regular tricks in the cryptographic bag aren’t going to work; this isn’t the kind of problem they’re naturally good at solving. We’ll need something else.
Proof of Work
However, we do have a way to create one kind of digital scarcity that’s quite effective. It’s called Proof of Work.
Imagine it’s the early days of the internet, we’ve just invented email, and then shortly after some genius has a great idea that junk mail could now be sent over the internet. Unlike in the physical mail, with email there’s effectively zero cost to sending out the one millionth message. So you have a huge spam problem that quickly emerges. Suppose you run an email service, and you accept that there’s no easy way to automatically distinguish between real email and spam email; we accept that the spammers are clever. But you still want to discourage spam. What will you do? Well, you can impose a cost on sending email. You can say, “for every email that you send, I want to see you do a small amount of work so that I know you’re serious about wanting to send this email.” If you’re sending one, or five, or even a hundred emails, this should pose no problem. But we want sending a million emails to impose a real cost. How will we do this?
We can take advantage of cryptography to create a kind of task that serves no useful purpose aside from being deliberately computationally expensive, and impossible to cheat. Here’s how to do it: you say, “Ok, for every email you want to send, we’re creating a hoop for you to jump through first.” Take this random starting sequence I’ll give you, and run it through a hash algorithm like we talked about before. The output will be some string of characters like ‘12d36ab35ffc51a1’. Now, if your output starts with the exact sequence of characters ‘000’, then you’re cleared to send the email. (Odds are this will not be the case.) Otherwise, add one additional character to the end of the sequence, which we call the “nonce”, and which initially will just be the integer ‘0’; then repeat. If that doesn’t work, try a nonce value of ‘1’, then ‘2’, and so on. After a few hundred or thousand tries, eventually just out of sheer probability you’ll find a nonce that successfully works: your starting input + the lucky nonce successfully hashes to some output that starts with ‘000…’. I can then easily check your work (remember, these kinds of problems are hard to do, but easy to verify), and if you’ve indeed successfully found an answer, your email gets released.
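Here is a small sketch of that hoop in Python. The names find_nonce and verify are my own, I'm assuming SHA-256 as the hash algorithm, and for simplicity the sketch always appends a nonce (starting from 0) rather than first trying the bare starting sequence.

```python
import hashlib
from itertools import count


def hash_with_nonce(starting_sequence: str, nonce: int) -> str:
    return hashlib.sha256(f"{starting_sequence}{nonce}".encode("utf-8")).hexdigest()


def find_nonce(starting_sequence: str, prefix: str = "000"):
    """The sender's job: try nonces 0, 1, 2, ... until the hash starts with the prefix."""
    for nonce in count():
        digest = hash_with_nonce(starting_sequence, nonce)
        if digest.startswith(prefix):
            return nonce, digest


def verify(starting_sequence: str, nonce: int, prefix: str = "000") -> bool:
    """The email service's job: one hash and one comparison."""
    return hash_with_nonce(starting_sequence, nonce).startswith(prefix)


nonce, digest = find_nonce("email to howie")
print("lucky nonce:", nonce)
print("hash:", digest)
print("checks out:", verify("email to howie", nonce))
```

Finding the nonce takes however many tries it takes; checking it takes exactly one.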
This “proof of work” puzzle has a couple of desirable properties for us. First of all, although the amount of computational work you’ll have to do for any given email will vary widely depending on chance (you might find the nonce very quickly; or it might take a while. It’s all up to probability), on average the difficulty level of the math problem will be quite predictable.
Second, I can easily “tune” this difficulty level to however hard I want by simply adding more zeros onto the required threshold. For someone with a fixed amount of computing power, finding an answer that starts with ‘000’ might take four seconds of computing work, ‘0000’ might take a minute, and ‘00000’ might take fifteen minutes. Third, there is no way to cheat; you can only find the answer through brute force. This brute force will cost you real resources, in the form of computing power and electricity. And while it is expensive for you to find the answer, it is cheap for me to verify your answer. (Asymmetry is at work for us.) When used correctly, proof of work can be used to impose a genuine kind of scarcity and skin in the game: “You’re serious about this? Prove it by showing me how much work you’re willing to do for it.”
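The tuning knob is easy to quantify if we assume the hash output is written in hexadecimal (sixteen possible characters per position), as in the examples above. Each extra required zero then makes the puzzle sixteen times harder on average. The function names and the four-second baseline below are just the illustrative figures from this section, not measurements.

```python
def expected_attempts(zeros: int) -> int:
    """Average number of tries before a hex hash starts with this many zeros."""
    return 16 ** zeros


BASELINE_ZEROS = 3
BASELINE_SECONDS = 4.0  # the illustrative figure for '000' used above


def estimated_seconds(zeros: int) -> float:
    """Scale the baseline time by how much harder the puzzle has become."""
    return BASELINE_SECONDS * expected_attempts(zeros) / expected_attempts(BASELINE_ZEROS)


for zeros in (3, 4, 5):
    print(zeros, "zeros:", expected_attempts(zeros), "tries on average,",
          estimated_seconds(zeros), "seconds")
```

Four seconds becomes about a minute, and then about seventeen minutes, which is roughly the progression described above.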
Peer to Peer Currency
Beyond preventing spam emails, we can imagine how Proof of Work might be successfully employed by someone looking to design a scarce digital currency: “Do the work in order to show me that you’re serious” will be a useful property when it comes to creating, validating and maintaining digital scarcity.
Ever since the early days of the internet, people have wanted to create an internet-native, digital version of money. Unlike the US dollar, which has value because the US government backs it and lets you pay taxes with it, and whose integrity and protection against forgery and cheating is maintained by banks, regulators and law enforcement who are continuously on the lookout for problematic behaviour, this internet money would much more closely resemble gold, or maybe even ‘collectibles’ in a general sense. Their value derives dually from 1) the fact that your peers also assign it value, and 2) that it has some inherent scarcity which prevents inflationary devaluation. Gold has done this ‘job’ for people for a very long time, and so do other modern collectibles like baseball cards: if we all agree that Ken Griffey Junior rookie cards have a certain value, and if we have a good way of defending against forgery, then they can absolutely work as a kind of “store of value”. You could even do transactions with them, if enough people are willing to accept Ken Griffey Junior rookie cards as currency. Crucially, we do not need a central authority in charge: gold does not need a central bank nor a legal or financial system in order to hold its value. But it’s poorly suited for the internet: you can’t send or store it electronically.
The challenge in front of us is: how do we do this digitally? The problem with digital bits is that they’re pretty bad at those two properties we care about: there’s no inherent cost to making them, and there’s no obvious barrier to replicating them. Creating a successful digital currency means figuring out a way to create “un-forgeable, costly bits” where transactions between parties cannot be faked, nor can they be stopped or censored by anyone. So what kind of tools do we have in our cryptographic toolbox to create such a thing?
The first tool we’d need in our toolbox is a way to prove ownership over a particular digital bit or “coin”, which we might as well call it at this point. Here, we can use something we’ve already talked about: cryptography. We can define a little digital object called a coin that says, “I am a coin. I live inside a wallet. There’s only one thing you can do with me if I am in your wallet, which is specify a new wallet to send me to. If you so choose to do that, I will no longer be your wallet’s coin, I’ll become the new wallet’s coin.” That’s the basic job that currency has to do.
Cryptographically, the way we can make this work is to make the coin programmable in a specific way: the only person who has the ability to “reprogram” the coin to belong to a new wallet is whoever can provide the private key that accompanies the coin’s current wallet’s public key. When you do so, you can specify the new wallet to which it’s going by providing its public key address. (You do not have to know the new wallet’s private key to do this! It’s like sending an encrypted message.) So when Ben sends a coin to Jeremy, he announces the following: “I am Ben. I can prove I’m Ben by showing this digital signature. Currently, this coin is set so that only Ben has the right to tell it where to go next. As I have proven that I am Ben, I am now resetting the coin so that Jeremy now has the right to spend the coin. If you can prove you are Jeremy, then you now own the coin.” That’s the basic mechanic of a transaction. We can see here how cryptography is pretty good at making sure that people can’t spend other people’s coins; not without knowing their private keys. The security challenges we’re going to need to deal with have more to do with scams people can run around their own coins.
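Here is a rough sketch of Ben's announcement in Python. Everything in it is a stand-in: the Coin class and the helper names are hypothetical, and the signatures use toy RSA keys built from tiny primes, whereas Bitcoin itself uses a different signature scheme (ECDSA) with vastly larger numbers. The point is only the shape of the rule: a coin moves when the current owner's private key signs the order.

```python
import hashlib


def make_keys(p, q, e=65_537):
    """Build a toy RSA key pair from two small primes (far too small for real use)."""
    n = p * q
    d = pow(e, -1, (p - 1) * (q - 1))  # modular inverse, Python 3.8+
    return (e, n), (d, n)              # (public key, private key)


def digest_as_int(message, n):
    raw = hashlib.sha256(message.encode("utf-8")).digest()
    return int.from_bytes(raw, "big") % n


def sign(message, private_key):
    d, n = private_key
    return pow(digest_as_int(message, n), d, n)


def verify(message, signature, public_key):
    e, n = public_key
    return pow(signature, e, n) == digest_as_int(message, n)


class Coin:
    def __init__(self, owner_public_key):
        self.owner = owner_public_key

    def transfer(self, new_owner_public_key, signature):
        """Move the coin only if the order was signed by the current owner."""
        order = f"send this coin to {new_owner_public_key}"
        if not verify(order, signature, self.owner):
            raise PermissionError("signature does not match the current owner")
        self.owner = new_owner_public_key


ben_public, ben_private = make_keys(1_571, 1_759)
jeremy_public, jeremy_private = make_keys(1_009, 1_013)

coin = Coin(ben_public)
order = f"send this coin to {jeremy_public}"
coin.transfer(jeremy_public, sign(order, ben_private))
print(coin.owner == jeremy_public)
```

Notice that Ben only needed Jeremy's public key to send the coin, and that anyone can run verify using nothing but public information.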
The second thing that we’d need in our toolbox is a way for information about transactions and public statements to propagate among all of the members of a network quickly and effectively. As it turns out, this is something we’ve figured out how to do quite well. If you’ve ever pirated a movie over a P2P service like Limewire or BitTorrent, then you have firsthand experience with these kinds of ‘peer to peer networks’. On these networks, many different people run servers called “nodes” that continuously take in information from members of the group, and then share that information with other nodes, such that the whole group collectively behaves like one big centralized server, but knocking out any one node will not affect the integrity of the group. This peer-to-peer element is an important part of being un-censorable: if somebody powerful, like a national government, decides they don’t want you around anymore, it’s not so easy for them to take you out – they might be able to get to some of your members, but likely not everybody. If you can successfully create unforgeable costly bits, then a peer to peer network would be a good way for those bits to get recognized, traded and stored. This doesn’t solve our forgery or our skin-in-the-game problem, but it does give us a way to share and transact that currency should we figure it out.
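A tiny simulation shows why knocking out one node doesn't silence the network. The five-node map and the gossip function below are made up for illustration; real peer to peer networks have thousands of nodes and much messier connections.

```python
from collections import deque

# Who talks to whom in a small, hypothetical network.
PEERS = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}


def gossip(peers, origin, offline=frozenset()):
    """Return every node that eventually hears a message announced at origin."""
    heard = {origin}
    queue = deque([origin])
    while queue:
        node = queue.popleft()
        for neighbour in peers[node]:
            if neighbour not in heard and neighbour not in offline:
                heard.add(neighbour)      # this neighbour now knows...
                queue.append(neighbour)   # ...and will pass it on in turn
    return heard


print(sorted(gossip(PEERS, "A")))
print(sorted(gossip(PEERS, "A", offline={"B"})))
```

With node B taken offline, the message still reaches everyone else by travelling through C instead.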
The third tool in our toolbox we’d need is also something we’ve already talked about: we need to somehow make these coins costly, and proof of work is a great way to do that. You could imagine a setup where digitally scarce coins can only be generated through a necessarily costly proof of work process. Why is this helpful? Well, it helps give those coins “value”, because someone put in real effort and expended real resources in order to make them. This isn’t value in the way that a business has value or a commodity has value; it’s more like value in the sense that gold has value. I’m more likely to trust in the durable value of a digital baseball card if it was expensive to make that baseball card, and if other people recognize that expensiveness.
Unfortunately, we’re still stuck with the problem that digital coins as we’ve defined them so far are still easily copyable. While cryptography gives us ample tools to prove ownership of coins, it comes up short when it comes to proving uniqueness of those coins. What’s to stop Ben from sending that same coin to fifteen different people? What’s to stop Ben from just inventing new coins? Unless there’s some sort of reference list that Jeremy can check to make sure he’s in fact the only recipient of this coin, and that the coins can all be traced back to legitimate origin stories, then Jeremy can’t place all that much faith in Ben’s transaction.
The safest way to know for sure that people aren’t cheating is to have a historical record of all transactions that have ever taken place, so that the identity and integrity of every single coin can be traced back to its source. This is what banks do; the money in our checking accounts is really just a ledger of credits and debits that the bank continuously monitors in conjunction with other banks in order to make sure that everything is accounted for. If you trust banks to do this, this solution actually works quite well. Well-functioning banks in stable, democratic governments can do this because they know who people are, and fraudulent transactions can usually be reversed or unwound through the legal system. But with internet money that has to be able to function anywhere in the world, we don’t want to trust banks; in fact we don’t want to trust anyone. We don’t want to have to trust that any communication is secure; we don’t want to have to trust that any individual is who we think they are; we don’t want to have to trust that any privately held ledger has integrity. Instead, we’re going to try a new idea: put the ledger of transactions in public, out in the open, on the peer to peer network where anyone can read it and consult it.
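As a rough sketch, a ledger like this only needs to answer one question: given everything that has happened so far, who owns this coin right now? The Ledger class below is hypothetical and heavily simplified. It lives in one program's memory and says nothing about who gets to maintain it, which is the hard part.

```python
class Ledger:
    def __init__(self):
        # Every entry ever recorded, in order: (coin_id, sender, recipient).
        self.entries = []

    def owner_of(self, coin_id):
        """Replay the whole history to find the coin's current owner."""
        owner = None
        for entry_coin, _sender, recipient in self.entries:
            if entry_coin == coin_id:
                owner = recipient
        return owner

    def issue(self, coin_id, recipient):
        """Record the coin's origin. A coin can only be created once."""
        if self.owner_of(coin_id) is not None:
            raise ValueError("coin already exists")
        self.entries.append((coin_id, None, recipient))

    def transfer(self, coin_id, sender, recipient):
        """Accept a transfer only if the sender owns the coin according to the record."""
        if self.owner_of(coin_id) != sender:
            return False
        self.entries.append((coin_id, sender, recipient))
        return True


ledger = Ledger()
ledger.issue("coin-1", "Ben")

print(ledger.transfer("coin-1", "Ben", "Jeremy"))   # a legitimate spend
print(ledger.transfer("coin-1", "Ben", "Alice"))    # Ben tries to spend it again
print(ledger.transfer("coin-99", "Ben", "Jeremy"))  # Ben invents a coin
```

Once the history is there to consult, both of Ben's tricks fail for the same reason: the record doesn't back him up.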
Double Spending and Consensus
So we need some sort of ledger book where a full and complete transaction history can be verified, and we don’t want to trust a third party to maintain the book. If we did that, this would make the whole system vulnerable to tampering or censorship from that third party, or from anyone who has coercive leverage of any kind over that third party. On the contrary, if the book is a matter of public record, and is maintained by the group collectively rather than by any one individual, censorship is harder. That’s what we want, at least in theory. So how do we do it?
Making the integrity of the book the public’s problem, rather than a specific third party’s, means trading our censorship problem for a different kind of problem: a consensus problem. It’s easy to hand-wave “the book will be maintained by… everybody!” Okay, but how? If I try to spend the same coin twice, there needs to be a rigorous way by which the group decides which transaction is the valid one and which one will be ignored. It’s relatively straightforward for the group to spot inconsistencies: the Peer to Peer network design will mean that many different people will have the transaction record, so two conflicting versions of a transaction (e.g. if someone tries to spend the same coin twice) will be eventually noticed by the group for sure. But that doesn’t mean the group knows how to resolve the conflict. The group needs a way to systematically select for one version of events over another in a way that is foolproof. This is an important point that’s worth repeating: the group should not need to be in the business of adjudicating which transaction is the “real” one versus which is not. That’s actually not necessary. What the group does have to do is be able to swiftly, reliably and permanently select one over the other in a way that is completely clear to everybody.
Let’s take a minute to specify what exactly is the fraudulent scenario we’re worried about. If you’re a bad actor, and you were trying to scam somebody’s money from them by modifying or cheating the public record, how would you do it? Well, there are two main ways you could do it.
The first would be by going back into an older page of the ledger book, and then erasing or rewriting a particular transaction to your advantage. So let’s say I bought a car from you in exchange for 100 coins. If I go back later on and rewrite the transaction to say “actually send these 100 coins to another wallet that’s controlled by me”, or nullify the transaction in some other way, then I now have both the car and the 100 coins. I can abscond with both of them, and you may be none the wiser unless you go look at your account and see that your coins have vanished, as if they’d never been sent to you in the first place. That’s scenario one.
Scenario two is subtler, but has the same effect: it’s for me to agree with you to exchange 100 coins for your car, and then broadcast two simultaneous signals to the network. The first says “Send these 100 coins to my car salesman friend here”, and the second says “Actually send those 100 coins to this other wallet (which is controlled by me).” If I can convince you that the first transaction went out successfully, but then convince the group that the second transaction is actually the real one that they should include in the public record, then I will have accomplished the same crime: absconding with both the coins and the car. We call this problem the “double spend problem”.
Why can’t we simply set a rule that says “whichever transaction happened first is the real one?” Well, it’s hard to establish “first” in a foolproof way on a peer to peer network. If you had an incredibly accurate clock and could get the entire P2P network synchronized down to the nanosecond together, you could maybe use the clock to differentiate between the transactions. But the reality of building a distributed network makes that impossible to build in a way that’s tamperproof, so long as we treat time as something linear and continuous. (That last sentence probably sounded weird. Don’t worry about it for now. But remember it for later.) Could you designate some central authority to be Time Cop, and be the arbiter of what happened when? Yes, but now we’ve centralized a key function. We can’t have that either.
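To see why “whichever came first” falls apart, imagine two nodes that each follow that rule faithfully but happen to hear my two broadcasts in opposite orders. The function and the labels below are hypothetical, just enough to show the disagreement.

```python
def first_seen_wins(arrivals):
    """A node's local rule: for each coin, keep whichever transaction arrived first."""
    accepted = {}
    for coin_id, recipient in arrivals:
        accepted.setdefault(coin_id, recipient)  # later conflicting spends are ignored
    return accepted


TX_TO_SELLER = ("coin-100", "car seller")
TX_TO_SELF = ("coin-100", "buyer's second wallet")

# The same two messages, arriving in a different order at each node.
node_a = first_seen_wins([TX_TO_SELLER, TX_TO_SELF])
node_b = first_seen_wins([TX_TO_SELF, TX_TO_SELLER])

print(node_a)
print(node_b)
print("nodes agree:", node_a == node_b)
```

Both nodes behaved honestly and both are sure they're right, yet they now hold different versions of the book. Neither can prove to the other which message was really “first”.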
What we need is a way to guard against both of these fraudulent actions that’s more robust than anything we’ve seen before. Again: it’s not important for the group to adjudicate on which spend is “correct” versus which one is false. That isn’t necessary; nor is it really possible. It’s not important for the group to weigh in on whether it thinks the payment to the car salesman or the payment to my wallet is the “correct” transaction. But the group needs to be 100% unambiguous about which one it selects as the real one, in a way where the group’s choice cannot be gamed or faked by anybody after the fact. What we ideally want to do is to be able to use cryptography to our advantage here: we need to be able to structure the book of transactions and the incentive structure of book maintenance in such a way where 1) the network is always completely clear about what transaction it has chosen as the “real” one, and 2) once it’s chosen which version of events is real, there’s no way to go back on it. Repeat this to yourself one more time to let it sink in: the challenge isn’t figuring out how to choose one transaction over another based on which one is “correct” or more deserving. The challenge is getting everyone to agree on an order of transactions. If we can do that in a foolproof way, then the legitimacy problem solves itself.
The Byzantine Generals Problem
The simplest way to approach this problem of which came first is a voting system of some kind: “If more than half of us are in agreement, we commit and move on.” Okay, but how do we know what the majority wants? Voting is full of problems, the most significant of which being bad actors. How do we know that the voting system isn’t being tampered with? There will always be a smarter hacker; there will always be some vulnerability to exploit. What if somebody is successfully able to impersonate another voter? What if someone is able to intercept a voting communication and replace it with their own vote? What if voters, with little skin in the game, could be bribed to change their votes? In general, the problem is that it’s hard to hold an election, or to determine if majority consensus has been achieved in any way, if you cannot completely trust any of the information that is being communicated, nor the motivations of the voters.
This problem has traditionally been called the “Byzantine Generals Problem”, after the following thought experiment. Imagine a group of generals, each commanding their own army, surrounding a walled city. They need to come to a consensus as to whether or not they should all attack at dawn, or all retreat. Either option is fine, but they need to reach complete consensus on which one they choose. For security reasons, the generals do not leave their tents within their army compounds; they have to communicate using messengers. They can send as many messages as they want. But the messengers themselves are subject to tampering: they might be bribed; they might be traitors; they might simply be forgetful and tell the wrong information to their recipient. Furthermore, the generals themselves might be bad actors: they might tell one thing to some generals and another thing to others, in order to sow misinformation. The challenge is: can you come up with a system that will allow the generals to know, with absolute certainty, whether they’ve reached real consensus – even if any given message is susceptible to tampering or any general could be a bad actor?
There’s a related concept to the Byzantine Generals Problem called “Byzantine Fault Tolerance”, which is the property we will ultimately need our system to have. Byzantine Fault Tolerance generally means: “suppose you have a system that contains a bunch of components, and the components need to communicate together in order to avoid some sort of failure. How do you engineer the system to avoid failure even if the components themselves are individually unreliable?” Systems that need to avoid catastrophic failure at all costs, like airplanes, need to be designed in such a way where even if one or more crucial elements of the system were to malfunction in a completely unprecedented way, the system itself would not be compromised. That is what we need in our digital peer to peer currency setup: the book needs to be maintained in such a way where it is always 100% clear what the historical record says, and where tampering, hacking or error aren’t just difficult; they’re irrelevant.
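Here is about the smallest version of the problem I can construct, as a sketch: two loyal generals who disagree, and one traitor who tells each of them what they want to hear. The tally function is hypothetical and simply takes a majority of the votes a general has seen.

```python
from collections import Counter


def tally(own_vote, received_votes):
    """A general's decision: go with the majority of all the votes they know about."""
    votes = Counter([own_vote, *received_votes])
    return votes.most_common(1)[0][0]


# Loyal general A wants to attack; loyal general B wants to retreat.
# The traitor sends "attack" to A, and "retreat" to B.
decision_a = tally("attack", ["retreat", "attack"])   # hears from B and the traitor
decision_b = tally("retreat", ["attack", "retreat"])  # hears from A and the traitor

print("A decides to", decision_a)
print("B decides to", decision_b)
```

Each loyal general followed the majority rule honestly and each sees a clear two to one result, yet A attacks alone while B retreats. A single dishonest voice was enough to break a simple voting scheme.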
Collaboration versus Competition
We’re now ready to start talking about Bitcoin.
The first critical breakthrough idea in the Bitcoin protocol is the following: Asking people to collaborate to maintain our transaction ledger seems like a losing battle. Collaboration is too susceptible to tampering; it’s too susceptible to stalemate; it’s too vulnerable to bad actors, no matter how you structure it. Specifically, there’s no way to fully safeguard against double spending. So what if instead of getting people to collaborate to maintain the book, we had them compete for the opportunity to maintain the book?
It’s worth repeating that idea one more time: instead of asking people to collaborate, make them compete. Specifically, make them compete on some task where 1) it’s impossible to cheat, 2) it costs them real resources so that they have skin in the game, and 3) the outcome of the task turns writing in the book into a kind of one-way function: it’s effectively impossible to undo once done. This should ring multiple bells from earlier. We already know about a kind of task like this: proof of work. In fact, we’ve already seen proof of work in a digital currency context before, as a way of making bits deliberately costly.
Bitcoin takes the problem of “who gets to write the pages of the book” and turns the problem of collaboration on its head. First, we’re going to treat the stream of incoming transactions taking place on the network like batches of transactions that are going to get added to the transaction record in discrete, large chunks. Every time we’re ready to add a new page of transactions to the ledger book, everyone is going to first compete in a proof of work competition. Because it’s proof of work, it can only be accomplished by brute force: there’s no way to cheat, and your work can be easily checked by the group. Now here’s the weird part: if you win the proof of work competition, then you get to write the page of the book. At first glance this seems downright bizarre. This isn’t a democracy at all! It’s more like a “randomly assigned dictatorship.” The nature of the proof of work competition means that if everyone is competing with an equal amount of computing power, the winner will be someone… kinda random. And then they, and they alone, get to write the page and put it in the book. So right away, we can see one very real advantage here: there’s no argument or discussion or collaboration around voting for a transaction history. Whoever wins the proof of work competition is the authority on which transactions happened, period.
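As a loose sketch of that competition, picture three equally powerful competitors taking turns trying one hash each until somebody gets lucky. Everything here is a simplification with made-up names: the page is just a string, the puzzle only asks for two leading zeros so it finishes instantly, and taking turns in a fixed order gives a tiny edge to whoever goes first in a round.

```python
import hashlib


def attempt(page, miner, nonce):
    return hashlib.sha256(f"{page}|{miner}|{nonce}".encode("utf-8")).hexdigest()


def race(page, miners, prefix="00"):
    """Everyone tries one nonce per round; the first hash with the prefix wins."""
    nonce = 0
    while True:
        for miner in miners:
            digest = attempt(page, miner, nonce)
            if digest.startswith(prefix):
                return miner, nonce, digest
        nonce += 1


# The batch of transactions waiting to go into the next page.
pending = ["Ben pays Jeremy 1 coin", "Julia pays Howie 2 coins"]
page = "; ".join(pending)

winner, nonce, digest = race(page, ["miner-1", "miner-2", "miner-3"])

# The winner, and only the winner, writes the page into the book.
ledger = [{"written_by": winner, "transactions": pending, "nonce": nonce}]
print(winner, "wins with nonce", nonce)
print(digest)
```

Nobody voted and nobody negotiated. Anyone can recompute the winning hash to confirm the result, and then the page is simply whatever the winner wrote.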
Two big questions leap out, though: first, why would anyone go through the trouble of doing this, and second, how does this stop cheating? We’ll get to the cheating part in a second; first we’ll talk about the incentive. If you’re going to expend real computing power and incur a real cost in order to win the proof of work competition, then you need to get something in return that’s sufficiently attractive. With Bitcoin, this reward comes in two forms: first, as a small “tax” that the winner of the competition gets to levy off of everyone whose transactions were processed in that batch. Second, the winner of the competition gets to assign themselves some newly created Bitcoin as a reward specifically for them. If you want to increase your odds of winning the competition, you can throw more computing horsepower at the problem: on average, this will make you more likely to win the competition, at least until others do the same. You’re in a computing arms race with them.
In the first “phase” of Bitcoin (which we’re still in), a certain number of new coins are issued with every successful proof of work challenge, entering them into circulation under ownership of whoever won the challenge. In the long run, these newly issued coins will eventually taper off to zero, as we approach a total circulating coin volume of 21 million Bitcoin. At that point, the reward will only be a transaction tax. If you’re curious about the incentives and economics of this, I’ll talk more about this in a companion post, A financial introduction to Bitcoin for non-financial people. For the time being, just remember that 1) everybody who’s engaging in this proof of work competition is pursuing a real reward, and 2) the total number of Bitcoin in circulation will ultimately be restricted, to prevent runaway inflation.
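Here is a rough sketch of how that tapering can add up to a hard cap. The text above only says the reward shrinks to zero; the specific schedule in the code (50 coins per block to start, cut in half every 210,000 blocks, with each coin divisible into 100,000,000 smaller units) is my understanding of how the Bitcoin software does it, and the function names are just mine for illustration.

```python
# Assumed parameters (my understanding of Bitcoin's schedule, not from the text above).
UNITS_PER_COIN = 100_000_000          # each coin splits into this many smallest units
INITIAL_REWARD = 50 * UNITS_PER_COIN  # reward for each of the earliest blocks
BLOCKS_PER_HALVING = 210_000          # the reward is cut in half this often


def block_reward(height: int) -> int:
    """Newly issued units for the block at a given height."""
    halvings = height // BLOCKS_PER_HALVING
    # Integer halving: once the reward would drop below one unit, it becomes zero.
    return INITIAL_REWARD >> halvings


def total_supply() -> int:
    """Add up every reward that will ever be issued under this schedule."""
    total = 0
    reward = INITIAL_REWARD
    while reward > 0:
        total += reward * BLOCKS_PER_HALVING
        reward >>= 1
    return total


print(block_reward(0) / UNITS_PER_COIN)        # reward at the very start
print(block_reward(210_000) / UNITS_PER_COIN)  # reward after the first halving
print(total_supply() / UNITS_PER_COIN)         # lands just under 21 million
```

Because each halving era issues half as many coins as the one before it, the running total creeps toward a ceiling instead of growing forever, which is where the 21 million figure comes from.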
To recap so far: instead of asking the group to collaborate to maintain the book, we have everybody individually compete to solve a proof of work problem, on which it’s impossible to gain any advantage other than by throwing more computing power at the problem. It is a very fair race. The winner of the competition is rewarded in the form of a Bitcoin fee, and is then granted the power to write the most recent page of the book for the permanent record. But there’s a huge question remaining: why does this discourage cheating? To answer this question, we’ll need to understand the structure of the book, and how pages relate to one another.
The key to how Bitcoin works lies in the structure of the book; specifically, what is written on each page, and how each page relates to previous and future pages. This structure, and the leap of genius behind it, is the technical innovation of the “blockchain”.
Suppose that I, one of the participants in the network, would like to write one of the pages in the ledger book. I am motivated to do so by the Bitcoin reward that comes with being the successful winner. What I do is the following. First, I gather together all of the transactions that recently took place and which need to go onto this new page. These transactions will all look like: “I, Ben, can prove I have the right to spend this coin by signing with my private key (which only I have). I’m now “spending” the coin by specifying its new owner, Jeremy. To spend this coin, prove that you are Jeremy.” This is important to remember, because we can understand how someone who’s writing in the ledger book cannot fake transactions by other people and other coins that they don’t have; not without knowing the private keys. What we’re concerned about is bad behaviour concerning people’s own coins.
I’ve now gathered all of these transactions from the P2P network. Now, at this point, transactions have been stated in public, but they are not yet considered validated or finalized. Everyone should have the same list at this point. I take all these transactions, and then I add one more transaction right at the top: a transaction assigning me the Bitcoin reward. (Unlike the rest of the transaction record, which everyone has in common, this will be unique to me. Everyone else will be adding a different transaction, which is assigning themselves the reward.)
So I write all these transactions in the page, to become a part of the permanent record. From now on we’re going to call a “page” by its real name in Bitcoin terminology, which is a block. Then, I get ready to do some proof of work, in a very particular way. First, I take every single transaction in the block, and I hash each one into an output that looks like “d3e3e98af7b79870”. Then I assemble all of these hash products into something that looks a bit like an NCAA bracket. Let’s imagine this is a bracket of four transactions. (In reality it’s much more than that.) I take Hash 1 and Hash 2, combine them together, and then hash that; same with Hash 3 and 4. Then finally, I take Hash 1/2, combine it with Hash 3/4, and hash it one last time into Hash 1/2/3/4. What we’re doing here is we’re creating a “cryptographic summary” of everything in the block that can be faithfully recreated every time. (Note here that everyone’s cryptographic summary will be different, because the Reward Transaction means everyone will be starting with different inputs.)
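If it helps to see the bracket as code, here is a minimal sketch. It’s a simplification under my own assumptions: transactions are plain strings, “combining” two hashes just means gluing their text together, and an odd hash out gets paired with itself. (Real Bitcoin works on binary data and, as far as I know, applies SHA-256 twice at each step.) The names `bracket_summary` and `sha256_hex` are made up for this example.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Hash some bytes and return the result as a hex string."""
    return hashlib.sha256(data).hexdigest()


def bracket_summary(transactions: list[str]) -> str:
    """Hash each transaction, then pair up and re-hash until one hash remains."""
    if not transactions:
        raise ValueError("a block needs at least one transaction")
    level = [sha256_hex(tx.encode("utf-8")) for tx in transactions]
    while len(level) > 1:
        if len(level) % 2 == 1:
            # Assumption: with an odd count, the last hash is paired with itself.
            level.append(level[-1])
        # Combine neighbours (Hash 1 with Hash 2, Hash 3 with Hash 4, ...) and hash again.
        level = [
            sha256_hex((level[i] + level[i + 1]).encode("utf-8"))
            for i in range(0, len(level), 2)
        ]
    return level[0]


transactions = [
    "reward: new coins to Ben",  # the reward transaction, unique to each miner
    "Ben pays Jeremy 1 coin",
    "Jeremy pays Jake 2 coins",
    "Jake pays Ben 3 coins",
]
summary = bracket_summary(transactions)
print(summary)  # Hash 1/2/3/4, the summary of the whole block
```

Run it twice and you get the same summary both times; change a single character in any transaction and the final hash comes out completely different.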
Once I’ve done this work and have successfully found the hash product of the whole bracket of transactions, I’m ready to create something called the header of my block. This block header is going to contain the critical security mechanism that prevents cheating. My header is going to contain three things. First, the hash product of my bracket that we talked about. Second, the hash product of the previous block. Third, we’re going to put in our random little number we learned about a while back: the nonce. The nonce will initially just start with the number 0. Now we are all set to get started on our proof of work competition.
I take my block header, and I hash it, and I see if it starts with ‘00000…’ or whatever the required number of zeros is. Odds are, it won’t. So I adjust my nonce from ‘0’ to ‘1’, and try again. Still doesn’t work? Adjust my nonce from ‘1’ to ‘2’ and keep trying, and trying. You should recognize how this proof of work competition will have the same properties as we talked about before: there’s no way for me to cheat, or to game the system in any way aside from just throwing more computing power at the problem. In Bitcoin-speak, this is called mining; this is the computational work that Bitcoin miners are doing. Now let’s say I win! I successfully find a hash product of my header that starts with ‘00000’, which happens to correspond to a nonce value of ‘162534’, before anyone else does. What I’ll do is broadcast my solution to the group: I’ll publicly post my whole block, including the header and whatever the correct nonce happens to be. Then, the group will check my work; they can do this pretty quickly. (Hard to do, easy to verify!) If it all checks out, then the block gets posted permanently and I have earned a Bitcoin reward.
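The guess-and-check loop is short enough to write out. This is a toy sketch, not how real mining software is built: the header is just the three pieces from above joined into a string, the placeholder input values are invented, and I only ask for four leading zeros so it finishes in a moment on a laptop. The function names are mine.

```python
import hashlib


def hash_header(bracket_hash: str, previous_block_hash: str, nonce: int) -> str:
    """Hash a simplified header made of the three pieces described above."""
    header = f"{bracket_hash}|{previous_block_hash}|{nonce}"
    return hashlib.sha256(header.encode("utf-8")).hexdigest()


def mine(bracket_hash: str, previous_block_hash: str, required_zeros: int):
    """Try nonce 0, 1, 2, ... until the header hash starts with enough zeros."""
    prefix = "0" * required_zeros
    nonce = 0
    while True:
        digest = hash_header(bracket_hash, previous_block_hash, nonce)
        if digest.startswith(prefix):
            return nonce, digest
        nonce += 1


def check_work(bracket_hash: str, previous_block_hash: str,
               nonce: int, required_zeros: int) -> bool:
    """What everyone else does: a single hash to confirm the claimed nonce."""
    digest = hash_header(bracket_hash, previous_block_hash, nonce)
    return digest.startswith("0" * required_zeros)


# Hypothetical placeholder values, standing in for real hashes.
my_bracket = "summary-of-my-transactions"
previous = "hash-of-the-previous-block"

nonce, digest = mine(my_bracket, previous, required_zeros=4)
print(nonce, digest)
print(check_work(my_bracket, previous, nonce, required_zeros=4))
```

Notice the lopsidedness: `mine` may need tens of thousands of attempts, while `check_work` needs exactly one. Each extra required zero multiplies the expected number of attempts by sixteen in this hex-digit version.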
You’re probably asking, though: why doesn’t the winner of the race just write whatever they feel like in the block in order to benefit themselves?
The Race Against Time
First of all, let’s remind ourselves about the benefit of having it be one single person who writes the block, rather than it being a group effort: it forces the miner of the block to unambiguously state each and every transaction in the most recent batch, in sequence, for review by the others. If any transactions are not valid – either because the private key signature doesn’t work, or because a coin is getting double spent, or anything – then the cryptographic summary of the block will not work. The group will see this, and they will simply reject the block. Then the group goes back to competing to successfully mine a block whose transactions will all check out with the group.
But what if the miner wanted to go back and modify a past block? Specifically, what if the miner wanted to go back and modify a transaction, back in the past, where they sent a coin to someone else? If you wanted to cheat, you could modify that transaction so that instead it sent the coin to another wallet you control, or to one of your friends. If this were possible, then consensus would never be truly achievable. (To repeat that one more time: true consensus means not only are we all 100% in agreement today, but we have to be certain that no one in the future can retrospectively go back and change their answer, in any way, no matter how talented of a hacker they might be or how much coercion they can apply to anybody.) It’s not sufficient to have other miners trying to “watch” for misbehaviour in past blocks: we need to make it futile to even try.
Here is where the proof of work competition is going to do some heavy lifting for us. Recall how hash functions work: if you make even a small change to the input, you’re going to have a completely different output. Also recall the self-referential structure of the blockchain: each block header contains a small hash summary of the previous page, and will be accounted for in turn on the next page. So block 100 contains a hash summary of block 99, which contains a hash summary of block 98, which contains a hash summary of block 97. Now let’s say I wanted to go back and modify a transaction that took place in block 97. If I did that, then the hash product of that whole block would change. If that happened, then the hash product of block 98’s header would change in turn, since block 97 is summarized in the block header. Then that would mean the product of block 99’s header would change too, as would block 100, all the way back up to the present.
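To watch that ripple effect happen, here is a small sketch. I’ve left the nonce and the bracket out to keep it short, so each block’s hash here is just the hash of the previous block’s hash plus that block’s transactions; the block contents and function names are invented for the example.

```python
import hashlib


def block_hash(previous_hash: str, transactions: str) -> str:
    """Simplified block hash: depends on the previous block's hash and this block's contents."""
    return hashlib.sha256(f"{previous_hash}|{transactions}".encode("utf-8")).hexdigest()


def build_chain(blocks: list[str]) -> list[str]:
    """Return the hash of every block, each one built on top of the one before."""
    hashes = []
    previous = "0" * 64  # placeholder for whatever came before our first block
    for transactions in blocks:
        previous = block_hash(previous, transactions)
        hashes.append(previous)
    return hashes


blocks = [
    "block 97: Ben pays Jeremy 1 coin",
    "block 98: Jeremy pays Jake 1 coin",
    "block 99: Jake pays Ben 2 coins",
    "block 100: Ben pays Jake 1 coin",
]
honest = build_chain(blocks)

# Rewrite history: in block 97, Ben quietly pays himself instead.
tampered_blocks = ["block 97: Ben pays Ben 1 coin"] + blocks[1:]
tampered = build_chain(tampered_blocks)

changed = [h != t for h, t in zip(honest, tampered)]
print(changed)  # [True, True, True, True]
```

Only block 97’s contents were touched, yet every hash from 97 through 100 is different. Blocks 98, 99 and 100 say exactly what they said before, but they no longer match, because each one carries the fingerprint of the block beneath it.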
But remember, in order to mine a block, it’s not sufficient to hash together all of the different transactions: you also have to find the nonce that results in a hash product that starts with “00000…”. If I change any element of the transaction history, I need to find a new nonce. And that’s going to take a while. If I do successfully find the nonce for block 97, my work’s not done: I then have to do the same thing for block 98, then block 99, then block 100, just to catch up to the present. It’s possible I might get very lucky and find the nonces quickly, but on average, this will take a long time.
Except, what’s been happening all the while? All of the other miners have been hard at work mining new blocks! In the time it took you to successfully go re-mine block 97, the group (who collectively has much more computing power than you do individually) may have mined ten blocks, or twenty. If you forge on ahead and re-mine block 98, the group could now be on block 150 already. Here’s where the unforgiving math of proof of work is going to lock in an unforgiving reality: the group will find solutions faster than any individual will. Any one individual might get lucky once, but they won’t get lucky twice in a row, or five times in a row.
The farther back a transaction is “embedded” in the blockchain, the more mathematically unlikely it is that someone could successfully go back, change it, and then re-mine the blocks to ‘catch back up’ with the group collectively. If a transaction is one block deep, or even two, it’s conceivable that someone could get away with changing it if they had a ton of computing power. Once a transaction gets five or six blocks deep, though, forget it. (This is why the rule of thumb on the Bitcoin network is to wait for five or six blocks to get mined on top of your transaction before you hand over your car, or whatever. Then you can be very sure that it’ll be permanently inscribed.)
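We can put rough numbers on “forget it”. The sketch below uses the simple estimate given in the Bitcoin white paper: if an attacker controls a share q of the computing power and the honest group has the rest, p = 1 − q, then the chance the attacker ever catches up from z blocks behind is (q/p) raised to the power z (or a certainty if q is at least as big as p). It assumes those shares stay fixed, and the function name and the sample 10% share are mine.

```python
def catch_up_probability(attacker_share: float, blocks_behind: int) -> float:
    """Chance an attacker ever catches up, per the white paper's simple estimate."""
    if not 0.0 <= attacker_share <= 1.0:
        raise ValueError("attacker_share must be between 0 and 1")
    if blocks_behind < 0:
        raise ValueError("blocks_behind cannot be negative")
    honest_share = 1.0 - attacker_share
    if attacker_share >= honest_share:
        # With half or more of the computing power, catching up is only a matter of time.
        return 1.0
    return (attacker_share / honest_share) ** blocks_behind


# An attacker with 10% of the network's computing power (an arbitrary example).
for depth in range(1, 7):
    print(depth, catch_up_probability(0.10, depth))
```

With a 10% share, the odds start at one in nine for a single block and shrink by another factor of nine with every block stacked on top; by six blocks deep they are about one in half a million.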
What if you simply created a ‘side version’ of the transaction record, where you accepted that you fell behind the group but still tried to scam other people with it? What’s to prevent Ben from doing exactly this: re-mining block 97 to his advantage, and then convincing Jake that ‘this is my real bitcoin balance; please transact with me’? Well, Jake has a very easy way of checking whether that’s true: he can simply check to see if Ben’s version of the transaction history is the longest chain of blocks. If it’s not, then Jake can reject Ben’s offer: not only is it suspicious looking, but Jake can also count on the fact that other people will probably not want to join in on this version of events either, for exactly the same reason. So Ben will never get to spend those coins: any Bitcoin he might have in this particular version of events is effectively worthless. The only coin balances that really matter, as far as anyone is concerned, are the ones in the longest chain.
What if (there are still lots of what-ifs to consider!) you, as the bad actor who is trying to modify past blocks, convince other miners to mine on top of your new version, in your effort to catch back up to the group? Well, if the other miners are in criminal cahoots with you, then sure, they might. But if they’re acting strictly in their self interest, they won’t help you. Now why won’t they help you? Well, for the same reason that Jake won’t want to transact with Ben’s little side chain: miners are also incentivized to only mine on the longest chain. Mining has a real cost, and it’s only worth it if you have a chance at gaining a bitcoin reward for your troubles.
Suppose you go back to change block 97, and you convince me to help you mine block 98. If I help you, and successfully mine block 98 by doing all that proof of work, I will be rewarded in the fact that I added that extra transaction to the block that assigned me a mining reward. But that transaction will only exist in our little fake history. None of the other network participants out there in the world will have it, or will have ever heard of it. So as far as the Bitcoin network is collectively concerned, I never got any reward. But I DID expend real energy. My electricity bill doesn’t care whose chain I was mining on; it costs me real resources, in the real world. So there is no point in me helping you unless I have reason to believe that my effort will be rewarded, and that my reward will be collectively recognized by the Bitcoin network. And that’s something I can only believe if I’m mining on the longest chain. This is why we sometimes say that miners are “voting” on which chain they believe is the longest: they’re betting on which version of events is “real”, with a fair amount of skin in the game – because mining is costly to them.
Now there is one scenario where things can go wrong, and that’s if someone got a LOT of computing power together; so much so that they have a majority of the computing power available on the network. What would this mean? Well, if we think of it in terms of “voting”, then it’s as if somebody accumulated the majority vote. It basically becomes their network. That’s not far off from what would happen here. If you had a majority of the computing power (which we call “hash power” on the bitcoin network, since what you’re doing with your computing power is just hashing blocks over and over again until you find the right nonce), what could you do? You could go back to a previous block, mine a new one (in a way that benefits you, perhaps) and then put your army of computing power to work mining blocks on top of your new chain. If you have a genuine majority of the hash power, then you will actually catch up to the leading chain and overtake it as the new longest one.
This is called a “51% attack” in the Bitcoin community, as someone would effectively be able to “attack” the established Bitcoin consensus and replace it with their own consensus should they have 51% of the hash power on the network. If you are the majority, you get to enforce majority rules. The key to Bitcoin’s long-term success lies in part in making sure that such an attack never happens, by incentivizing new miners to join and make sure that no one person can ever win the computing arms race against the group collectively.
What happens as more people come online, and the total amount of hash power on the network increases? Again, here’s the beauty of the proof of work problem: we can set the challenge to be as arbitrarily difficult as necessary, simply by adjusting the number of required zeros at the beginning of the hash. The Bitcoin network does this in a way so as to keep the total amount of time it takes the group to find the right nonce to be roughly constant on average: about ten minutes. It will vary from block to block, with random chance. But on average, if we know how much computing power the group is bringing to bear on the challenge, we know how difficult to set the proof of work challenge. It all automatically self-adjusts.
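Here is a sketch of what that self-adjustment can look like. Instead of counting zeros, it treats the challenge as “your hash, read as a number, must come in below a target”, which is a finer-grained way of saying the same thing: a smaller target means a harder puzzle. The constants are Bitcoin’s parameters as I understand them (a ten-minute goal per block, re-tuned every 2016 blocks, with any single adjustment limited to a factor of four); the function name and the toy target of 1000 are mine.

```python
TARGET_SECONDS_PER_BLOCK = 600   # the ten-minute goal (assumed Bitcoin parameter)
BLOCKS_PER_ADJUSTMENT = 2016     # how often the difficulty is re-tuned (assumed)
EXPECTED_SECONDS = TARGET_SECONDS_PER_BLOCK * BLOCKS_PER_ADJUSTMENT  # about two weeks


def retarget(old_target: int, actual_seconds: int) -> int:
    """Scale the target by how fast or slow the last batch of blocks was mined.

    A smaller target is harder to hit, so blocks that arrived too quickly
    shrink the target and blocks that arrived too slowly enlarge it.
    """
    # Limit any single adjustment to a factor of four in either direction.
    clamped = min(max(actual_seconds, EXPECTED_SECONDS // 4), EXPECTED_SECONDS * 4)
    return old_target * clamped // EXPECTED_SECONDS


old_target = 1000  # toy number; real targets are enormous
print(retarget(old_target, EXPECTED_SECONDS))       # on schedule: unchanged
print(retarget(old_target, EXPECTED_SECONDS // 2))  # twice as fast: half the target, twice as hard
print(retarget(old_target, EXPECTED_SECONDS * 2))   # twice as slow: double the target, half as hard
```

Nobody has to measure the group’s computing power directly. The time the last batch of blocks took to arrive is the measurement, and the puzzle is rescaled to match.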
Bitcoin is like a Clock
Remember back when I said that we could theoretically make a Byzantine consensus payment network if we had an incredibly fast and accurate clock that perfectly synchronized across the P2P network? If we knew with certainty that everyone knew at exactly what time transactions were made and there were no way for hackers to go back and mess with timestamps, then we’d have a much easier time. In the real world, we have no such option. But what we need, in terms of consensus from the group, can actually be phrased in terms of time in an unusual way. We need a) the present to be perfectly synchronized among all members of the group in order to prevent double spending along with other misbehaviour like spoofing and front-running; and b) the past needs to be completely immutable, so that we can be confident that today’s consensus can never be undone.
That’s kind of what Bitcoin has accomplished, in a strange way. Instead of treating time as continuous, it treats time as something that starts and stops, synchronizing with each block. Each time someone successfully solves the proof of work challenge and mines a block, it’s like time freezes; we get to all look together to make sure the block is valid; once we agree and commit to begin the next block, time starts back up again. As a bad actor, you can’t play games with time because a) all of the transactions in a block are handled all at once, and b) any block buried in the past will be out of reach, because the nature of the proof of work challenge establishes a “time gap” between the blocks. It makes the record of the past (which has to exist in the present) actually behave like the past, in that it’ll be effectively out of reach to anyone in the present; we can observe it, but we can’t change it.
Another analogy here is thinking about a fly trapped in amber: each layer of amber progressively makes the fly more “preserved”. Any given layer of amber is like a discrete “block” of time that gets progressively added, in a step by step way, to the passage of time that entombs the fly. And the gap between each layer of amber is like the time required to solve the proof of work challenge. Bitcoin is like the clock that steadily ticks, with each tick adding a new layer of solved proof-of-work.
A Circular Incentive System
We’re ready to put everything together now to appreciate how this whole system functions. It really is a beautiful and complete idea, expressed in terms of cryptography, incentives and skin in the game, around the concept of “unforgeable, costly bits”. If I believe these baseball cards have value, and so do you, and so does the corner store where we buy groceries, then we could actually use them to store and transact our wealth if we wanted to. The trick is figuring out how to make digital baseball cards that are unforgeable, costly, can be sent through wires, and whose transaction record is foolproof.
Bitcoins, like digital baseball cards, do not have any intrinsic value. They do not generate any kind of earnings; they are not productive in the way that equity in a business is productive, or a unit of a commodity like oil is productive. But nonetheless, they might have value in the same way that gold or silver has value, if there is enough durable belief in a few key properties: that the network will be secure and transactions will have integrity; that the supply of Bitcoins will be capped or restricted in a way that prevents runaway inflation; and (most of all) that people around the world will need somewhere to store their wealth in challenging times, the way we’ve done with gold for centuries.
In this case, how are those incentives organized? They’re organized in a circle: 1) Bitcoin has a value because other people assign it value, and because we believe the network is secure. 2) Miners are clearly assigning the network value, because they’re expending real resources (computing power and electricity) to keep the network secure by perpetually elongating the longest chain. 3) Miners are incentivized to mine on the longest chain, because only the longest chain will be recognized as “legitimate” by others, and mining on any other chain will simply be a waste of objectively scarce resources. 4) No one miner has a majority of hash power (we hope, anyway) because too many other people around the world are throwing in computing power of their own in an effort to earn their own Bitcoin by mining. 5) They want to earn Bitcoin by mining it because other people also assign Bitcoin value, and if enough people assign it value, then we can all transact with it securely.
Now, does this actually work in real life? What if the cost of mining starts to outweigh the value of the mining reward? What about the fact that mining consumes huge amounts of electricity, and that’s bad for the environment? Why should any of this have value, since it has no intrinsic utility in the way a business or a commodity does? There are lots and lots of questions that we haven’t answered here, and that I’m going to address in another post, A financial introduction to Bitcoin for non-financial people. (I haven’t written this post yet but I will, hopefully some time soon.) The point of this post is just to get you to appreciate the technical structure of Bitcoin: how cryptography works, how proof of work creates scarcity and skin in the game, where P2P networks are effective and where they come up short, and then finally, how this intricate proof of work competition creates immutable, unforgeable, costly bits.
Thanks for reading! I hope this has been helpful.