About the send-rate-limit (and proof upgrading)

I started writing a GitHub issue about this, thinking we might need to extend the send-rate-limit, but halfway through I realized we don’t really need to, so I am posting a write-up about it here instead.


What’s this send-rate-limit about?

The send-rate-limit prevents a node from sending more than 2 transactions per block. It is in place only until block 25000, which should occur mid-year 2025, about 5.6 months after the genesis block assuming 10-minute block intervals. Think of it as training wheels for neptune-core.

The limit was put in place due to concerns about the possibility of a transaction sending amplification attack against the entire network that I/we feared might result in a complete denial-of-service for legitimate users, thus rendering the network unfit for purpose.

spoiler: that doesn’t appear to be entirely the case.

The rate-limit is set to expire at block 25000 because it was assumed we would have more information and mitigations by then, and could possibly extend it if need be.


What’s this about an amplification attack?

Each neptune transaction requires a very computationally expensive proof called a SingleProof. Typically when a transaction is initiated, the sending node does NOT compute the SingleProof, but rather provides a much cheaper proof called a ProofCollection. This is then broadcast to other peers in the network to store in their mempools. Any of these peers can then voluntarily compute the SingleProof, thereby upgrading the transaction so it can be included in a block. In so doing, the node that provides the proof collects a fee. (Some details omitted.)

Ok, so already we see there is an amplification occurring here. It takes relatively little work for a sender to initiate a transaction, but a lot of work for a peer to upgrade it. The optimal scenario would be that for each transaction initiated, exactly one peer performs work to upgrade that transaction (leaving all other peers available to upgrade remaining transactions). Even in this optimal case, though, since it takes minutes for a peer to upgrade a transaction and only seconds for the sender to initiate one, the sender can easily flood the network. The mempools of the proving nodes will fill up with non-upgraded transactions.
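To put rough numbers on the amplification (both timings below are assumptions for illustration; real figures depend on hardware and proof parameters):

```python
# Back-of-the-envelope amplification factor. Both timings are
# ASSUMPTIONS for illustration, not measured neptune-core figures.
SECONDS_TO_INITIATE = 5        # sender builds a ProofCollection (assumed)
SECONDS_TO_UPGRADE = 10 * 60   # peer builds a SingleProof (assumed)

# Even with perfectly distributed work, one sender initiating
# back-to-back transactions keeps this many upgraders busy:
amplification = SECONDS_TO_UPGRADE / SECONDS_TO_INITIATE
print(f"one sender can keep ~{amplification:.0f} upgraders busy")  # → 120
```

With those assumed timings, a single sender can saturate roughly 120 upgrading peers.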

we don’t have the optimal scenario

Presently neptune-core does not provide any mechanism for peers to coordinate proof upgrading (to avoid duplicating work). This means that multiple peers may select the same proof to work on. And they probably will, as the selection algorithm is deterministic, not random. Thus needless duplication of work will almost certainly be occurring at some points in time, and possibly most of the time.

A simple improvement would be to add a message so that a prover can tell other peers “hey I’m working on Transaction X”, so they can ignore transaction X for some time, perhaps something like 10 minutes or 2 more blocks. After that if the transaction is still in their mempool (has not been included in a block) they might consider it again.
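As a rough sketch of what that message handling could look like (the class and method names here are hypothetical, not existing neptune-core APIs, and the cooldown is the 10 minutes suggested above):

```python
import time

# Hypothetical sketch of handling the proposed "hey I'm working on
# Transaction X" message. All names are made up for illustration.
COOLDOWN_SECS = 10 * 60  # "10 minutes or 2 more blocks" from the text

class UpgradeBackoff:
    def __init__(self):
        self._claims = {}  # tx_id -> unix time at which the claim expires

    def on_peer_claim(self, tx_id, now=None):
        """A peer announced it is upgrading tx_id; ignore it for a while."""
        now = time.time() if now is None else now
        self._claims[tx_id] = now + COOLDOWN_SECS

    def should_skip(self, tx_id, now=None):
        """True while some peer's claim on tx_id is still fresh."""
        now = time.time() if now is None else now
        expiry = self._claims.get(tx_id)
        if expiry is None or now >= expiry:
            self._claims.pop(tx_id, None)  # claim expired; eligible again
            return False
        return True
```

If the transaction is still in the mempool when the claim expires, `should_skip` returns False and the node can consider it again.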

Is this attack a serious problem, or not?

short answer: maybe, but not a complete denial of service.

Such an attack will make transaction fees rise for all users, but honest users should still be able to get transactions confirmed, provided they are willing to pay a higher transaction fee than the attacker’s highest fee(s).

Why?

In the flood attack scenario, mempools of all nodes will begin to fill and may well become completely full, at which point the lowest fee-per-byte transactions start being ejected.

Proof-upgrading nodes can only upgrade one transaction at a time, and the remaining transactions just wait in the mempool. When an upgrading node finishes upgrading, it will look in the mempool for the transaction with the highest fee to upgrade next.

For an attacker Mallory to successfully deny service to honest participants, Mallory must always have a transaction in the mempool with the highest fee. But then Mallory’s transactions keep getting upgraded, which costs Mallory more money.

It’s important to note, though, that Mallory only needs one highest-fee transaction in the mempool at any given time. All the rest of her transactions could carry very low, or even zero, fees. In practice, Mallory would likely try to keep a number of high-fee transactions in the mempool, perhaps 5 or 10, since proof upgraders finish work at different times and will not all be selecting the same proof at once.

Thus, if Mallory is well funded she can make the network cost-prohibitive for honest users to use. It is not a true denial of service because an honest user can always pay a higher fee to get their transaction upgraded and included in a block.
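The fee dynamics can be illustrated with a toy model (the highest-fee-first policy matches the description above; the fee values and names are made up):

```python
# Toy model: upgraders always pick the pending tx with the highest fee.
def next_upgrade(mempool):
    """Return the tx an upgrader would select next (highest fee first)."""
    return max(mempool, key=lambda tx: tx["fee"])

mempool = [
    {"id": "mallory-hi", "fee": 100},                        # her one high-fee tx
    *[{"id": f"mallory-{i}", "fee": 0} for i in range(50)],  # zero-fee filler
    {"id": "honest", "fee": 101},                            # outbids Mallory's max
]
print(next_upgrade(mempool)["id"])  # → honest
```

As soon as an honest user outbids Mallory's highest fee, their transaction is the next one selected, which is why the attack raises costs but does not fully deny service.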

how can we mitigate or improve this?

  1. A good first step will be to reduce or eliminate duplicate work between upgrade nodes. This will increase the network’s overall capacity, so it can process more transactions in parallel. That then requires Mallory to keep more high-fee transactions in the mempool at any given time, making the attack more expensive.

  2. Each transaction sender can generate the SingleProof themselves rather than relying on an upgrade node to do it. When a transaction is broadcast with a SingleProof, it can immediately be included in the next block and effectively bypasses the upgrade bottleneck. Recently, new APIs have been added that enable neptune-core RPC clients to generate a proof outside neptune-core, perhaps even on another device.

  3. Proof generation times will come down over time. Hardware is continuously getting faster and more powerful. Further, there is a path towards generating proofs with GPUs and perhaps eventually with dedicated devices, i.e. ASICs.

  4. Anyone, perhaps a community member reading this, could create a mempool viewer website (or app) that makes it easy to see what the highest-fee Tx in the mempool presently is. This can help users get a Tx upgraded and confirmed faster when time is of the essence. (This could also be integrated into wallet apps, such as neptune-dashboard.)

  5. ??? Ideas welcome!

should we extend the send-rate-limit?

short answer: probably not.

This limit can be considered problematic for services such as exchanges that need to perform higher volumes of payments.

Note however that a transaction can include many outputs to different recipients, so such services could batch all outgoing payments into one or two transactions per block.

Most importantly, the attack does not appear to enable the attacker to perform a complete denial of service to honest participants, and is thus not an existential threat to the network.

what do you think?

Let’s hear your thoughts, q’s, ideas, corrections, etc.

Could we apply a limit per proof type, so that only 2 ProofCollection txes are allowed in a block, but with a higher limit on SingleProof txes?

That’s a good idea, and I don’t see any problem with such a change.

Whether we do or not I think will come down to timing of the next release(s). If we make another public release before block 25000, then it would make sense to have that change in there, provided no issues are raised about it.

If not, and the send-rate-limit is allowed to sunset anyway, then there is no need.

thanks for the suggestion!

Related. Here is an idea Thorkil and I came up with that allows proof-upgrading-as-a-service providers to coordinate the division of labor between themselves, without any interaction. The idea is to use the command line to specify a set of residue classes modulo some integer. Upgrade jobs are accepted when the hash of the resulting transaction id is congruent to one of those residues. For instance, --puaas-dol {3,4,5}%6 divides all upgrade tasks into 6 roughly equal-size buckets and instructs the node to take buckets 3, 4, and 5. In combination with other nodes who run --puaas-dol 0%6 and --puaas-dol {1,2}%6, the work is cleanly divided up.

Of course you don’t know which command line arguments other people are running their nodes with so you can’t coordinate with them. But whenever one person or a set of closely coordinating persons need to divide labor they can do it with this scheme.

It is even a good idea to integrate some random salt before hashing. E.g., --puaas-salt 0xffdd08b1 prepends the given value to the transaction kernel id before hashing it. As long as the salt remains secret, external nodes cannot anticipate the concrete division of labor and use that knowledge to benefit their attack.
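A minimal sketch of this scheme (the flag names are quoted from the proposal above; the hash choice and encoding here are my assumptions, not neptune-core internals):

```python
import hashlib

# Sketch of residue-class division of labor. SHA-256 stands in for
# whatever hash neptune-core would actually use (an assumption).
def accepts_job(tx_kernel_id: bytes, residues: set, modulus: int,
                salt: bytes = b"") -> bool:
    """Accept an upgrade job iff hash(salt || id) mod modulus is in residues."""
    digest = hashlib.sha256(salt + tx_kernel_id).digest()
    return int.from_bytes(digest, "big") % modulus in residues

# Three closely coordinating nodes covering all 6 buckets between them:
node_a = lambda txid: accepts_job(txid, {3, 4, 5}, 6)  # --puaas-dol {3,4,5}%6
node_b = lambda txid: accepts_job(txid, {0}, 6)        # --puaas-dol 0%6
node_c = lambda txid: accepts_job(txid, {1, 2}, 6)     # --puaas-dol {1,2}%6

# Every job lands in exactly one node's bucket set:
for i in range(100):
    txid = i.to_bytes(4, "big")
    assert node_a(txid) + node_b(txid) + node_c(txid) == 1
```

A secret salt would simply be passed as the `salt` argument, shifting which transactions fall into which buckets without changing the partition property.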

It’s an interesting idea, and zero-coordination is a nice property, but problems immediately spring to my mind, which is why I haven’t suggested something like that:

  1. There seems to be a real possibility for some buckets to never be processed by any service provider. Especially for a small network.
  2. There is still potentially a lot of duplication of work between service providers because (a) providers can’t coordinate buckets, and (b) bucket sizes do not take into account the number of providers which is also dynamic. This means the network will still not be optimally processing upgrades, which means the attack is lower-cost than it could be.

So I would turn this around and ask: what is wrong with the simple coordination approach I proposed above, which seemingly should result in near-optimal coordination? I quote it here:

A simple improvement would be to add a message so that a prover can tell other peers “hey I’m working on Transaction X”, so they can ignore transaction X for some time, perhaps something like 10 minutes or 2 more blocks. After that if the transaction is still in their mempool (has not been included in a block) they might consider it again.

To answer my own question: there is a possibility that two or more providers begin working on an upgrade before receiving the peer’s “I’m working on it” message. So if a given node is already working on an upgrade and receives notice that a peer is also working on it, the protocol would ideally have a way for those peers to further coordinate such that one of them abandons the upgrade and the other does not. However, in practice this may not be a common occurrence, so I would say the situation could simply be logged initially, and then we could collect some data to see if it happens enough to be worth mitigating or not. And/or simulations could inform. Anyway, I believe there are various papers written about solving this class of problem.

Another zero-coordination approach would be for upgrade providers to select a tx (to upgrade) from the mempool at random, instead of choosing the tx with the highest fee-per-byte.

This has the following properties:

  1. When the mempool is full, there should be relatively little duplication of work, except in a truly huge network.
  2. When the mempool is nearly empty, there will be a lot of duplication of work. But that seems OK, since clearly Tx are being processed. (Perhaps not OK from a total energy-usage perspective.)
  3. In an attack scenario, or any congestion scenario, honest participants will NOT be able to pay a higher fee to get their tx upgraded sooner.

(3) appears to be a showstopper for this approach, which is why I haven’t proposed it before. But I’m just adding it to the mix; maybe there’s a way to improve it.
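For completeness, here is a toy sketch of the random-selection policy and why property (1) holds on a full mempool (all numbers are illustrative):

```python
import random

# Toy model: upgraders pick uniformly at random instead of by fee-per-byte.
def pick_random(mempool, rng):
    return rng.choice(mempool)

mempool = [{"id": i, "fee": i} for i in range(1000)]  # a "full" mempool
rng_a, rng_b = random.Random(1), random.Random(2)     # two independent upgraders

picks_a = {pick_random(mempool, rng_a)["id"] for _ in range(10)}
picks_b = {pick_random(mempool, rng_b)["id"] for _ in range(10)}

# With 1000 txs and 10 picks each, the expected overlap is only ~0.1 picks,
# so duplicated work is rare -- but note that fee played no role at all,
# which is exactly showstopper (3).
print("duplicated picks:", len(picks_a & picks_b))
```

Shrink the mempool to a handful of transactions and the overlap becomes large, which is property (2).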