In the first article of this series, we looked at the definition of Nakamoto consensus and the voting mechanism, under which evidence of work or a share of ownership allows “choosing a leader” to add a block to the common chain. Then we went over to the algorithms of the Byzantine generals and the ways by which the security analysis of such algorithms can be applied to blockchains using the Nakamoto agreement. Today we will complete a formal analysis of the blockchains and protocols of the Byzantine agreement, as well as idealized modeling of the blockchain system.
From the work on the Avalanche protocol from Team Rocket (with the participation of Emin Gün Sayrera and co-authors) it became clear that all the methods used for the formal analysis of blockchains are the same. If you carefully study the work on the basic Bitcoin protocol, Algorand or dfinity_network, you will find that in the end, all the authors create a kind of medium. Of course, academic literature can be controversial. The authors of each work argue that their method of analyzing idealized blockchains is better than all the others, but do not forget that the security analysis of blockchains depends on the specific model. Instead of focusing on the similarity of these methods, we can create a new protocol that can take the best from all the above protocols. I suggest that readers consider the following question: is it possible to combine media (as is the case with boosting in machine learning) and their modeling to create the “best” set of possible axioms? This question arose in connection with the use of Team Rocket's metastability to change the type of consensus mechanism depending on the state of the environment, which I consider as the first step towards boosting security and performance of blockchains.
In order to make a breakthrough in the understanding of media, it is first necessary to understand what are the security requirements and how to create opponents in the Byzantine protocols. This is followed by a generalized reasoning with the goal of combining the security assurances and opponents of all the above protocols. Sit back!
Most of the academic literature on security of blockchains focuses on proving the properties or axioms that a secure protocol must possess. In the Nakamoto consensus literature, two necessary (but not necessarily mandatory) properties are mentioned to ensure the security of blockchains:
- Constancy As soon as a transaction is accepted on the blockchain by a large number of miners, the probability of its moving to a possible fork decreases exponentially.
- Viability. Any transaction created by a reliable node will be once accepted by all reliable miners.
Team Rocket were able to create a hybrid of Byzantine agreement algorithms (such as Algorand) and Nakamoto consensus algorithms, replacing the constancy axiom, which is needed to ensure the security of other protocols. The advantages of an algorithm such as Avalanche are as follows:
- Such algorithms are much easier to implement. It is easier to argue in their favor than in favor of the Byzantine agreement or the hybrid of the Byzantine agreement and the Nakamoto consensus;
- Such an algorithm is an order of magnitude better than Algorand in terms of transaction latency (the time it takes for the network to confirm the transaction) and throughput — the number of transactions per second;
- From a probabilistic point of view, there is a simple phase transition in the protocol parameters in the description. This transition refers to the points at which probabilities that decay exponentially (like those that have constancy and viability) begin non-exponential decay.
Avalanche achieves this by replacing the property of constancy with a weaker axiom called security, which states:
Axiom 1. Security. Two correct nodes will never accept conflicting transactions.
The appearance of a phase transition that is qualitatively different from the Snow White protocol or Bitcoin suggests that the choice of axioms can lead to completely different and possibly non-universal behavior [0]. In addition, it is obvious that constancy is a stronger axiom, but it is not clear why one axiom should be preferred over another. I have no answer to this question, and I suspect that most academics and practitioners in this field are asking the same question. This is due to the fact that the axioms according to which a distributed system is necessary are often not related to the results that really guarantee security. The security of distributed and financial systems is guaranteed due to many factors. I will give a few examples of ideal goals:
- The honest behavior of the majority of participants leads to a Nash equilibrium close to 𝝴.
- Auction theory researchers often seek to prove that an auction depends on a dominant strategy-incentive compliant (DSIC). This means that you cannot increase your usefulness by behaving in a dishonest and / or unexpected way [1].
In practice, in a distributed system, these properties are very difficult to prove [2], we have to act according to the following scheme for ensuring demonstrable safety:
- Define protocol algorithms.
- Create an adversary model that seeks to break the protocol.
- Create a simulation that includes the game between honest participants and opponents.
- Show that the desired axioms (for example, consistency, viability, security, agreement, etc.) are highly likely to remain in most modeling examples.
- Prove that the axioms chosen lead to an approximate Nash equilibrium and / or are not a Nash equilibrium (and this is normal!) [3].
In the last paragraph, we are no longer talking about the formal security of the protocol (for example, evidence that the protocol complies with certain axioms), but about a better agreement that states that these axioms are “good enough”. [4] In the remainder of this article, I will ignore the fifth paragraph and focus on clarifying paragraphs 2 through 4. We will return to the fifth paragraph in the following publications, and now we will try to find the answer to the following questions:
- How are opponents created? What properties should they have?
- How to create a game in this setting?
- How to "simulate" an example protocol? For example, what would Bitcoin look like if the primary unit were different? Is it necessary to average all primary blocks?
In the first publication of this series, we found similarities between the Byzantine agreement and the Nakamoto consensus. However, in order to understand the process of creating opponents, we need to understand their differences. The main differences are as follows:
- The Byzantine agreement can allow dishonest behavior of only 1/3 of all nodes, while the Nakamoto agreement can withstand dishonest behavior of up to half of all nodes.
- Since the Byzantine agreement implies a clear choice of a leader with a fixed number of participants, it becomes clear how to create models of opponents. In particular, we can assume that a certain percentage of voters is controlled or distorted by one adversary. [five]
- In the Nakamoto agreement with censor-resistant conditions and an unknown number of participants, it is more difficult to determine the enemy, because then you have to figure out how to adapt to it over time.
Before we start discussing the solution to this problem using Bitcoin analysis and other distributed registries, I suggest delving into the concept of an adversary a bit, because this is a fundamental factor for understanding blockchain models.
In theoretical computer science and cryptography, opponents are modeled as objects that must perform at least the following functions:
- Copy known security hacking mechanisms / for your own purposes.
- Influencing actions that honest members of the network can take.
- Have independent random number generators.
The protected object (the one being attacked by the enemy) is the opponent. For example, consider an adversary who wants to acquire property in standard public key cryptography. The plaintext in the encryption system is a valid unencrypted input value, and the ciphertext is an encrypted output value. Informally, a system with a public key of size λ is considered resistant to a simple attack based on a chosen ciphertext (CCA - chosen-ciphertext attack) [6], if it is impossible to select a “small” open text group, query the associated ciphertext groups and use them to “invert” random ciphertext with a “significant degree of probability” [7]. This means that if I intercept a number of encrypted texts and the associated public key, I will not receive any non-trivial information about the secret key or the original text from the statistical properties of a part of these encrypted texts. Strictly speaking, the opponent’s goal is to take two plaintext, P⁰ and P¹, and Ciphertext C, for which C = Encrypt (P⁰) or C = Encrypt (P¹), and then guess with a degree of probability significantly greater than 50%, whether C is an encrypted version of P⁰ or P¹.
Then we need to create a game involving the opponent and his opponent. In the description of the CCA attack, the “request for a related set of ciphertexts” was mentioned above. We will formulate this definition more accurately using a simple three-step game (based on the work of Craig Gentry on homomorphic encryption):
- Step request. The adversary creates a list of fixed texts of fixed size n = O (Poly (λ)) and requests the ciphertext from the opponent for each of these open texts;
- Step call The opponent sends the opponent two plaintext, P⁰ and P¹. The opponent randomly selects P⁰ or P¹, encrypts it as a ciphertext C, and sends it back to the opponent;
- Guessing step. The opponent tries to guess which of the texts, P⁰ or P¹, the opponent has encrypted, and tells him his choice. The opponent wins, if it correctly guesses the plaintext.
An adversary can use the information gathered in the request step to create two open texts in the call step and take advantage of the last step. The idea is that if an adversary wins in more than 50% of cases, he can use an algebraic number of examples to obtain nontrivial information about an exponentially large space (the number of ciphertexts in the exponential dependence with λ). The problem with opponents and games is interesting because we clearly see compromises and where the random elements of the system are located. For example, in the case of a game resistant to CCA attacks, the element of chance is present in the call step, and also the enemy implicitly uses the mechanism of chance to select two open texts. Nevertheless, the example of a game resistant to CCA attacks is much simpler than the Byzantine system or the censorship-resistant model of Nakamoto, because we have clearly defined opponents and opponents, their personality is established and does not change. How to use the scheme with opponents in this case?
In blockchain systems, the number of participants and opponents varies. According to the famous task of the Byzantine generals, often an honest network participant cannot find out whether another participant is an adversary and is acting to the detriment of the system (for example, it increases the delay time of a large-scale network or creates problems with hardware). We create one generalized adversary, one challenge for the environment that controls the state of all users, including opponents. The code for this environment describes how we can model our distributed system, as well as the random variable that is responsible for all possible implementations of the distributed system. To be a little more specific - the environment is a kind of system that has access to the following:
- To the status of all participants (local copy of the blockchain, network latency);
- Access to a probabilistic oracle, which is a universal hash function for creating a blockchain;
- Contains all the random elements needed for the system (for example, all the coins that determine the share / proof of each participant’s work, any randomness that the system needs). It also includes any static and general randomness necessary for all participants, for example, the choice of the primary unit;
- Contains the condition required by the enemy. In this case, the adversary uses a function that can control any participant and change its state — change its blockchain, send transactions with double spending, and so on. However, the enemy can only affect a certain percentage (𝛾) of network participants. The environment allows the enemy to "take control" of some of the network before it starts to develop.
Considering all the above, the environment performs a specific cycle:
- Allows an adversary to change state and / or actions up to 𝛾n participants. Participants whose condition and / or activity is affected by an adversary are called dishonest participants;
- Chooses a subgroup of participants (honest or dishonest) in a potentially non-deterministic way and forces them to perform a valid action - create a block, transaction, perform a certain number of universal hash function requests and / or publish / send a block;
- Update the status of all participants under the influence of this action. For example, if in the previous step a new block was connected, the medium will send it to the rest of the network so that the participants add it to their copies of the blockchain [8].
It is natural to doubt that such a cycle can represent complex dependencies hidden in the blockchain protocol. The following questions may arise:
- What will happen if two blocks are created at the same time that provoke the appearance of a fork?
- What if our protocol creates a directed acyclic graph (like, for example, Avalanche), and the relative distribution of the two blocks is incorrect from a topological point of view?
However, such a cycle is a discrete-event simulation, creating the evolution of the environment, as new blocks and new participants appear. This cycle is more or less similar to the cycle of sampling from a directed graphical model using Monte Carlo methods with Markov chains (MCMC) [9]. In the case of a graphic model, the following happens:
- Create the original state.
- We make a sample of the distribution in the current state, and then decide whether to go to the next state or repeat this step.
- We pass through the graphical model until a failure occurs and / or until we consider that we are already close enough to the stationary distribution of the model.
A graphical model can contain very complex dependencies between seemingly unrelated states, but we can still make a sample from it using the local single-loop algorithm (MCMC). These dependencies are built into our mechanism and allow the enemy to change the state, provide a methodology for selecting participants, as well as the work of the block propagation mechanism. You may need a lot of loop iterations before the change happens, you have to wait until the dependency is satisfied. The only difference between graphical models and blockchain environments and modeling is that the dependencies are directly written into the distribution code from which we sample the protocol. Thus, the proof of the similarity of the results with the results of the previous section is similar to the proof of the properties of directional graphical models, although here there is a difficulty in studying internal dependencies. As soon as we write a loop for a specific protocol, we can easily analyze the algorithm as randomized [10]. Instead of delving into the analysis of the cycles and mechanisms of a particular protocol cycle, I propose to consider several examples of such cycles. In the next articles of this series we will analyze the cycles for different protocols with environments and modeling.
In the first example, we define the task environment of the Byzantine generals:
- Initial conditions: the initial state YES / NO of all generals, the choice of the first leader;
- Status: network graphs of generals, all sent messages;
- Oracle: no;
- Opponent: a deterministic list of dishonest generals;
- Cycle: sending voting messages to all neighboring participants of the network graph until a consensus is reached.
The example of the Byzantine generals is a relatively simple environment, as it uses a pure deterministic consensus mechanism. The complexity of the environment increases as we add randomness, free behavior, and synchronization primitives to it. First, take a look at Avalanche, since its environment is much simpler than the Algorand environment or the Bitcoin protocol. The main algorithm of Snowflake, on which Avalanche is built, has the following environment loop:
Snowflake is a very simple sampling mechanism from the Polya box, where red and blue balls (for example, R and B in the cycle above) are distributed among a certain number of participants. The cycle does the following:
- Makes a sample of one honest user whose color (for example, voting YES / NO) will be changed.
- Sends information about the selected user and a group of honest users to the enemy, who can now update the colors of the Byzantine users.
- Receives a sample of “votes” from k users (who may be fair or dishonest) and counts the number of colors.
- Changes colors / voices based on a set of protocol parameters.
An analysis of this environment, which Sayrer calls “global planning,” is surprisingly simple, since it leads to the standard result of the processes of queuing theory. We will look at Snowflake and Avalanche in more detail in the next article. The basic Bitcoin protocol uses a more complex environment loop:
This is a much more complex environment, since here we are dealing with a large number of states and the calculation of the proof of the work done is carried out in each iteration. At a high level, this cycle does the following:
- For each user, we select the longest chain we receive.
- Determine what unit to use to verify the proof of work performed.
- We check the proof of the work performed.
- If the block is lined, then we distribute it throughout the remaining network.
In this cycle, the adversary hides in the “proof of work” function, since the goal of the protocol is to show that an adversary with t% hashing capacity (where t <0.5) cannot fake a transaction. In this protocol, the R, I, and V functions have some control on how we select users who can mine blocks, and indirectly establish an adversary. We will look at this process in more detail in the fourth article in this series. Algorand is a bit more complicated to use in this form, so we will use it in a single cycle in the fifth article.
As you have already understood, the formal creation of probabilistic modeling of distributed systems with opponents can be difficult, have their own subtleties and contradictory changing components. The choice of the necessary axioms directly affects the environment, and if done correctly, the whole analysis can be greatly simplified. Researchers often face the problem of chicken and eggs when they try to figure out what to do first, choose axioms, or create an environment. The environments and models mentioned above show that there are different opinions on what the “good” model consists of. Perhaps one day we will be able to combine all of these options in a protocol that will provide security, flexibility and performance, taking full advantage of Algorand, Bitcoin Backbone, DFINITY and Avalanche.
Thanks to Abe Stenvey and Uthsavu Chitra for helping to write this article.
[0] The very name of the document on Avalanche - Snowflake to Avalanche: A Novel Metastable Consensus Protocol Family for Cryptocurrencies - hints at the differences between this protocol and Bitcoin. In this protocol, a state is a binary vector (for example, yes or no) of a particular block, and a quasistable state is a state where the violation of even one voice will cause the algorithm to be directed towards the altered voice. In Avalache, quasi-stable states serve as the boundary between the use of the Byzantine agreement or the consensus of Nakamoto. In statistical physics, these equiprobability points are distinguished into a special kind of surfaces (committer surfaces). They are very important for finding effective paths to local minima. You can read more about this in the article by E. Vanden-Ainden here.
[1] The easiest example is the Vickrey auction, where all participants submit to the quasi-linear / multi-variable logistic benefit function u (x) = max (v-x, 0), where v is the personal evaluation of the item by the auction participant. In this case, the personal evaluation has the DSIC property, since the second bid is won.
[2] The Nash equilibria have their own complexity class called PPAD. For more details on this can be found in the famous work Papadimitriou.
[3] It may seem that Nash's ambiguous equilibria have a negative effect on the protocol, but in reality this resembles neutral networks or a non-negative matrix decomposition when the existence of multiple minima allows the simpler algorithm to find a “reasonably good” solution faster. More details about the ambiguous Nash and Bitcoin equilibria can be found in this paper from the authors of Ouroboros.
[4] In many ways, this is similar to the choice of axioms of set theory (do we use ZF or ZFC?), So there is often no clear resolution of the problem.
[5] Distributed registries that use the Byzantine agreement, for example, Algorand, supplement the standard Byzantine agreement with the property of being free from restrictions, randomly choosing a certain number of voters who jointly decide the validity of a transaction. In this case, it is necessary to prove that both the Byzantine agreement methodology and the mechanism of joint choice satisfy the necessary safety requirements.
[6] Technically, this is a CCA1 attack, in which the opponent needs to request all pairs before the attack, during which the statistical advantage of the pairs is used. In the CCA2 world, an adversary adapts and may try to increase his advantages depending on how well the pairs he requested are working. This question is well explained in the second chapter of Craig Gentry’s work on homomorphic encryption.
[7] The term “with a significant degree” of probability refers to a polynomial increase in key size — in this case, the advantage should be Ω (1 / Poly (λ)).
[8] Note that in a synchronized world — without network delay — the environment reports the actions of the participants in the rest of the network instantly (for example, all users perform some kind of action). In the asynchronous world, each iteration of the cycles is a “round” in which participants and / or actions are not selected. At some point in time, “nothing happens,” while the packets pass through the network of miners.
[9] Consider this question in more detail. Usually we try to create discrete-time modeling, but we don’t need to be tied to the frequency of events. Theoretically, in one turn of a synchronized environment, many blocks can be created that instantly spread throughout the network. If we assume that there is a minimum time scale (for example, the fastest possible system frequency), then we can set the corresponding time step. It is possible to draw an analogy with the choice of the time step in the molecular dynamics modeling, where the fastest mode of oscillations in the system is used (for example, the period of hydrogen oscillations) to determine the time step (usually in femtoseconds). Nevertheless, all this can be considered as a sample of a directed graphical model, albeit with long-term temporal states. Here you can recall the Monte Carlo Hamiltonian with hierarchical potential.
[10] Do not worry about the correctness of the environment as a simulation at the level of non-dermal Turing machines with access to random oracles. This paper provides a theoretical justification for such a system. Read the introduction to the Bitcoin protocol to learn more about it.
[11] In the work on Algorand, the algorithm is not presented as a single cycle. He has a sequence of coroutines that prove properties. In the fifth article, I will turn these subroutines into one cycle and talk about how this cycle is similar to the other algorithms considered in this series.
First part: On blockchain security - part 1