PoS consensus problem: promiscuous voting

A danger to the consensus formed by validators in the absence of proof of work (PoW) is validators voting in many incompatible chains.

In PoW a miner votes on a chain by assigning its scarce resources to build on top of that chain. If the miner wants to vote to two chains, it will have to split its resources. In POS, validators just have to sign, and signing is free, so they can vote in many incompatible chains at the same time. $1/3$rd byzantine nodes can vote on two incompatible chains, producing consensus failure. More on this in Background on consensus. Voting on incompatible chains is an equivocation.

Need for a negative incentive

We want a strong incentive that discourages validators from equivocating. One such incentive could be that their collateral is seized whenever they equivocate. In the absence of equivocation consensus failure requires half of the nodes to be malicious.

Leaking private key by reusing one-time signing nonce

Take a Schnorr signature $s = k - ea$ where $a$ is the private key, $k$ is a single use private scalar (nonce), $e = h(m, R)$, $m$ is the message to be signed, and $R = kG$, where $G$ is the group generator and $h$ a hashing function.

A Schnorr signature $s$ involves one equation and two unknowns, $k$ and $a$. But if $k$ is reused in a new signature $s'$ with another commitment $e'$, then we have 2 equations and two unknows, and thus we can solve for $a$. More explicitly,

$$s' - s = (k - k) -a(e' - e) = -a(e'-e)$$

Applying key leaking for enforcing single protocol execution on a multiparty protocol execution

To coerce validators to never equivocate, i.e. execute the protocol only once, that is, maintain a linear messaging history, the above property of key-leaking of Schnorr signatures can be applied. Thus forcing validators to reuse $k$ upon equivocation, and thus leaking $a$. The existence of 2 messages signed by the same validator for the same height is a proof this validator is equivocating (i.e., exhibiting the most dangerous byzantine behavior). Now knowledge of $a$ must be enough to sign any message that validator could produce in that protocol.

Deterministic one-time nonce

Thus, for this to be useful, all parties participating in the protocol have to share with each other and agree on a parent pubkey, from which all the child-keys (nonces) must be derived.

The Schnorr signature in this case is not the tuple $(R, s)$, because the one-time signing public nonces were preagreed, and all party can compute them from the parent-pubkey independently, thus signature is just $s$.

We label the collection of consensus messages from a given validator $v_n$ as $M_n$. We retrieve the index of a message (with respect to its linear history) via

$$i: M \to \mathbb{N}$$

That is, for the genesis message, $i(m) = 0$ and for a message with height $j$ with respect to the genesis message, $i(m)=j$. If a validator has equivocated, it means that

$$\exists m, m' \in M. m \neq m', i(m) = i(m')$$

We want an algorithm to make the choice of $k$ deterministic — that is, force a validator to use a one-time signing nonce for a message $m_i$ via an algorithm $k(i): \mathbb{N} \to \mathbb{Z}_G$.

For message $m_i$ from a given validator, if validators would have to commit to the public value $R$ for a message at a given height, i.e. $R_i = k(i)G$, then the only way for them to produce two messages that are seen as independently valid for the same height $i$ (and therefore able to induce an equivocation) is to use the same nonce $k(i)$ for them.

Application to mimblewimble

In the context of mimblewimble, a sensible choice for the signing key $a$ that must be “put on the line” in these signatures is the private blinding factor $a_n$ for the collateral that validators must lock on the base layer to become validator nodes. Once a validator leaks its key, anyone can sign on his behalf, and therefore the misbehaving validator is out of the validation game. Future and past messages signed by this validator loses its meaning as anyone can forge any signature at any height. The equivocator’s deposit must be spent by the other validators using the equivocator’s key (with the aim of discouraging his behavior, and rewarding honest nodes for correctly enforcing the rules). After the equivocator’s key is leaked, everyone could sign any message, thus that key carries no value.

An interesting property is that outside observers could also get access to the leaked keys.

As an example, assume that we have a 3-of-3 multisig that has locked the collateral for the validators, so we have a shared commitment

$$C_m(v_{total}, \sum_{i=1}^3a_iG) = v_{total}H + (a_1 + a_2 + a_3)G,\\v_{total}=v_1+v_2+v_3$$

For messages, we use Schnorr signatures $s = k(i) - ea_n$ where $a_n$ is the blinding factor that we used for the multisig and $i$ is the message height used to determine the nonce.

Validator 3 has equivocated at message height $j$ by signing two messages $m_j$ and $m'_j$ with $s$ and $s'$ with the same nonce $k(j)$ and therefore leaked their private blinding factor $a_3$:

$$s' - s = (k(j) - k(j)) -a_3(e' - e) = -a_3(e'-e)$$

Then creating a transaction spending the funds in the shared commitment

$$C_m(v_{total}, \sum_{i=1}^3a_iG) = v_{total}H + (a_1 + a_2 + a_3)G$$

will now only require consent from validator 1 and 2 to spend since $a_3$ is now publicly known.

For instance, if they wanted to unlock their funds, they could create and sign a transaction

$$\sum_{i=1}^2 C'_i(v_i+\frac{v_{3}-\texttt{fee}}{2}, a'_i)+C_m(v_{total}, \sum_{i=1}^3a_iG) + \texttt{fee} = (\mathbf{0})$$

that would split the locked funds equally between the remaining validators (assuming they locked equal amounts initially, they’d profit 50% of their locked funds in the process).

Another interesting property is that the cost of the attack is well known. Say a seller is selling a car for 10k. If 55% of the stake signed the block that includes the payment to the seller, then seller knows the lower bound of how much nodes must burn in order to reorg that block away. If the full stake is 100k, 55% - 50% = 5% or at least 5k must be lost for a 10k item. Probably the seller should wait for more signatures before letting the buyer drive away.

Attacks not covered in this proposal

1. Censoring transactions as that cannot be easily distinguished from a disrupted network in which a message never arrives to a VN, thus hard to prove malicious intent. Say a briber bribes a subset of validators to censor tx_1 spending utxo_1 and not let tx_1 be finalized. At the same time, briber promotes through its bribed VNs a double spend of the utxo_1 at tx_2 to get finalized. This is where validator rotation (aka churning) becomes relevant.
2. Validators can only be caught if they equivocate. This means that validators can still mismanage the ledger content, as far as they don’t fork history. Other validators cannot report each other for that, thus this scheme is not sufficient per se to be used as a consensus protocol. Therefore there is the need to enforce by other means that the ledger is updated correctly. This is where settling in the PoW chain becomes relevant.

Background on Consensus

Simplified voting scheme

Let’s assume we have a malicious set of actors m with $1/3$rd of the voting power in a very simplified scheme where blocks are finalized within one round of voting, as in the figure below. Say the remaining validators' votes are equally divided into partions n and n' between two competing block proposals B and B' for the same block height, and, in this very simplified algorithm, they require a supermajority of $2/3$rds to finalize. Then the malicious $1/3$rd can submit its vote for B to partition n and for B' to partition n', and both sub-networks will respectively believe that either B or B' is finalized.

In pBFT, nodes all message one another at every stage of the commitment process as illustrated in Fig 1. of Practical Byzantine Fault Tolerance. A primary (or leader, in modern usage) is responsible for the collection, ordering, and dissemination of client instructions, but the normal operation messaging rounds induce a performance cost of $O(n^2)$ (replacement of the primary is $O(n^3)$).

In modern BFT systems like HotStuff, the scalability of the BFT scheme is improved by reducing the normal operation messaging complexity to $O(n)$ ($O(f n)$ for leader failures) by, among other changes, making the primary/leader not only responsible for collection and dissemination of client requests, but also the messages at each round of the commitment process of the State Replication Machine. In the dissemination process of each round, the leader includes a $k=2f+1$ threshold signature to prove that at least $2f+1$ nodes voted for the same state change and broadcasts this to the $n-1$ other nodes - thus, the leader cannot forge consensus if it does not exist.

Network topologies with partitioning such as described in the prior section, while possible, are unlikely unless network operators collude with the attackers. In a leader-based BFT, with respect to the round protocol, the round leader is the only relevant network operator. As such, if the round leader is honest, it will be impossible to induce incompatible finalized states in separate network partitions since both arms of an equivocation would have to pass through the round leader. On the other hand, if the round leader is part of a malicious $m > f$ set of attackers, the leader can collect two votes from each accomplice: one for B and the other for B', and broadcast these separately to each respective half of the remaining $n < f + 1$ honest nodes.

Since it requires three voting rounds to finalize a block in HotStuff and its derivates, with leader-rotation on every round, the threshold for such an attack to be successful is increased somewhat: the attackers require at least three of their members to be subsequent leaders in the protocol to pull off the attack for a single block height to have two distinct finalized blocks, but for every further round beyond the third that they are in control for, they gain another block on each fork. This is particularly problematic since round leader rotation, in the absence of networking failures and consequent NextView replacements, is deterministic.

As such, the leader-based BFT funneling through a single node at each round improves the scalability of the system assuming $n \geq 3f +1$, but if this assumption does not hold, it makes the attack cheaper to execute since the attackers have total control of the protocol messaging by virtue of the leader-based rounds.