Thoughts on the Mempool Issue and Upgrading

Getting a transaction confirmed in Neptune Cash right now is hard. Broadcasting a ProofCollection-backed transaction is easy enough, and you will probably be able to see it land in the mempool. If you are lucky, someone else will Upgrade the ProofCollection into a SingleProof; but if not, the transaction is destined to be ignored by the next block and all its successors. Even if someone does Upgrade the proof, there is still a high chance that the transaction does not make it into the next block. If that happens, it will most likely be kicked out of the mempool. In this case, your best bet is to try initiating the transaction again.

From a superficial reading, the difficulty of getting transactions confirmed is related to anomalous behavior exhibited by the mempool. After all, the mempool should be a sort of staging area for transactions to be confirmed in the next block. One does not expect transactions to suddenly disappear from it for no good reason. However, putting the blame for this issue on the mempool risks misdiagnosing the root cause.

The heart of the mempool issue is an incentive problem. Upgrading transactions requires resources and nodes rationally prefer to devote those resources to tasks where the return is larger. Therefore, out-of-date transactions represent a burden and nodes clear their mempools of them. When it is time to compose a block, composers find their mempools empty.

There is insufficient bookkeeping. A node that upgraded a transaction in the past would have collected a fee for that job. That node therefore has an incentive to keep the transaction up to date so that it can eventually be absorbed into a block. Presently, there is no code for keeping track of fees bound to the node’s own wallet for mempool transactions. Adding this mechanism would go a long way towards addressing the problem. And while I support adding that mechanism, I think it treats a symptom and not the cause. Keeping track of collected fees does not present a compelling answer to the question of why it is worth collecting those fees in the first place – particularly given the other potential applications of the requisite proving power.
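As a sketch of what such bookkeeping might look like – hypothetical types only, not actual neptune-core wallet code:

```rust
use std::collections::HashMap;

/// Hypothetical transaction identifier; stands in for whatever
/// neptune-core uses internally.
type TxId = [u8; 32];

/// Tracks, per mempool transaction, the fee this node's wallet gobbled
/// when it upgraded that transaction. Illustrative sketch only.
#[derive(Default)]
struct GobbledFeeLedger {
    fees: HashMap<TxId, u64>, // fee in smallest units
}

impl GobbledFeeLedger {
    fn record(&mut self, txid: TxId, fee: u64) {
        self.fees.insert(txid, fee);
    }

    /// A transaction is worth keeping up to date (e.g. re-syncing it to a
    /// new mutator set) whenever its confirmation pays this node something.
    fn worth_updating(&self, txid: &TxId) -> bool {
        self.fees.get(txid).copied().unwrap_or(0) > 0
    }
}

fn main() {
    let mut ledger = GobbledFeeLedger::default();
    let txid = [0u8; 32];
    ledger.record(txid, 250);
    assert!(ledger.worth_updating(&txid));
}
```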

From a more profound point of view, the issue stems from resource contention. It requires proving power to Upgrade transactions and it requires proving power to produce block proposals. The same proving power cannot be applied to both tasks simultaneously, so node operators have to choose. Presently it is more profitable to produce block proposals, so that’s what provers end up doing.

In principle, higher fees could solve this issue. However, the present code architecture, the configuration options, and the understanding on the part of users are inadequate to regulate this market. Users cannot specialize in one of the two proof-heavy tasks (Composing and Upgrading). Those tasks aren’t even cleanly separated in the code. So users end up having to do both, and rationally prefer to sacrifice the quality of whichever task is cheaper to sacrifice in order to maximize profit off the other.

The protocol stipulates that the Composer must perform at least one Merge (in order to set the merge bit); this dynamic incentivizes him to confirm some transaction as opposed to none, which is good. But the Composer has no incentive to perform more than one Merge: the resources spent on a second Merge step are better spent outbidding competing Composers.

I propose the following perspective, and matching architectural and configuration changes.

  • Block production consists of three steps, not two. The three steps are upgrading, composing, and guessing. Going forward, we should characterize upgrading as the first step in a three-step mining process.
  • The Upgrader upgrades transactions (including performing Merges) and gobbles transaction fees in the process. His task is to provide Composers with ready-to-mine transactions. Subject to meeting a minimum proving speed, the Upgrader benefits from a larger proof throughput: the more transactions are merged, the more fees are collected. Any fees not claimed by the Upgrader are left to the remaining parties of the block production pipeline. The Upgrader is paid from transaction fees, but if there is enough competition he must cede a larger portion of that fee to improve the chances of his transactions being selected by a Composer.
  • The Composer merges a transaction that has a coinbase with a transaction prepared by the Upgrader. The Composer proceeds to use this merge output as the basis for a block proposal. Any fees not claimed by the Composer are left to the Guesser. Note that the fees now include the block subsidy. In contrast to the Upgrader, the Composer stands to benefit from faster proving and cares little about proof throughput: the faster the block proposal is produced, the sooner guessing can start, or the faster a competing block proposal can be outbid. The Composer is paid from both the block subsidy and the transaction fee (the proportion that was not claimed by the Upgrader), but if there is enough competition then he must cede a larger portion of that reward to the Guesser in order to incentivize Guessers to adopt his proposal.
  • The Guesser contributes a difficult-to-find nonce to a block proposal (thus turning it into a block) and collects all fees not already claimed earlier in the pipeline.
  • Just as Composers compete against each other with more profitable block proposals, so too do Upgraders compete against each other with more profitable aggregated transactions. The objective of the Upgrader is to ensure that at any point in time, the most profitable (for the next stages of block production) synced transaction in circulation is one he collects fees from.
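To make the fee cascade through this pipeline concrete, here is a minimal sketch – all numbers and fractions are made up for illustration and reflect nothing in the actual protocol:

```rust
/// Illustration of how the reward cascades through the three-step
/// pipeline. All values are hypothetical.
fn main() {
    let tx_fee = 10.0; // total fee attached to the aggregated transaction
    let block_subsidy = 64.0; // coinbase reward (illustrative value)

    // The Upgrader gobbles a fraction of the transaction fee; competition
    // pushes this fraction down so that Composers prefer his transaction.
    let upgrader_cut = 0.30 * tx_fee;
    let fee_left_after_upgrade = tx_fee - upgrader_cut;

    // The Composer claims part of the subsidy plus the remaining fee;
    // competition pushes his cut down so that Guessers prefer his proposal.
    let composer_pool = block_subsidy + fee_left_after_upgrade;
    let composer_cut = 0.40 * composer_pool;

    // Whatever is left flows to the Guesser who finds the nonce.
    let guesser_cut = composer_pool - composer_cut;

    println!("upgrader: {upgrader_cut}, composer: {composer_cut}, guesser: {guesser_cut}");
}
```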

Concrete changes proposed:

  • CLI arguments:
    • --upgrade configures the node to upgrade transactions (by default this mode should be disabled).
    • --min-upgrade-fee, renamed from --min-gobbling-fee.
    • --max-fee-left-unclaimed regulates how large the portion of the fee not claimed by the Upgrader can get before performing upgrades becomes uneconomical – or, phrased differently: it defines what “economical” means.
  • Separate the mine loop into a compose loop and a guess loop, and add an upgrade loop. Only one loop can be active at any one point. (Want to run more than one? Do it on multiple machines.)
  • Upgrade loop:
    • Upgrade any transaction for which doing so makes economic sense (this includes performing Merges), and broadcast the result. There is a discussion to be had about how to select Upgrade jobs. Optimizing purely for profit is a poor strategy because competing Upgraders might do the same. It is probably a good idea to strike a balance between profitability and randomness.
    • Record fees gobbled for upgraded transactions. Transactions with greater benefit to the node’s own wallet should be prioritized over others.
    • Track the most profitable (from the perspective of the rest of the block production pipeline) synced transaction in circulation and, if possible, outbid it.
  • When initiating transactions, default to ProofCollection plus fee.
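To fix ideas, here is how the proposed flags might look in a clap-based CLI. The struct and default values are a sketch of my own, not the actual neptune-core configuration:

```rust
use clap::Parser;

/// Hypothetical CLI surface for the proposed upgrade mode. Flag names
/// mirror the proposal above; everything else is illustrative.
#[derive(Parser, Debug)]
struct MiningArgs {
    /// Configure the node to upgrade transactions; disabled by default.
    #[arg(long)]
    upgrade: bool,

    /// Minimum fee an upgrade must gobble to be worth performing.
    /// Renamed from --min-gobbling-fee.
    #[arg(long, default_value_t = 0.01)]
    min_upgrade_fee: f64,

    /// Largest portion of the fee that may be left unclaimed before
    /// performing the upgrade is considered uneconomical.
    #[arg(long, default_value_t = 0.5)]
    max_fee_left_unclaimed: f64,
}

fn main() {
    let args = MiningArgs::parse();
    println!("{args:?}");
}
```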

How do these changes address the issue(s)?

In the abstract, we no longer depend on Composers to be altruistic. Instead, we depend on both Composers and Upgraders to be self-serving in a marketplace whose dynamics ultimately serve the user.

Concretely, the present state of the code has two limitations.

  1. There is no mechanism for merging transactions, except as a step in the Compose loop. Rational Composers will skip this step. For all we know, they already do. As a result, the effective throughput capacity of the blockchain is one (base) transaction per block. With this proposal, it becomes economical to run Upgraders on machines that cannot Compose competitively. The Upgraders collect fees by merging transactions, optimizing for throughput (because throughput, not time, maximizes fees). Ultimately the throughput capacity is limited by the block size and the number of Upgraders on the network – and independent of the target block interval. (Let that sink in. A back-of-the-envelope sketch follows this list.)
  2. There is no mechanism for keeping mempool transactions updated across blocks that do not confirm them (or conflict with them), unless those transactions were initiated by the node itself. Specifically, even nodes that upgrade the transactions of others – for example by transforming a ProofCollection into a SingleProof and taking a fee for that task – fail to record that they benefit from that transaction’s confirmation. Consequently, if a new block is found that does not confirm such a transaction (or conflict with it), then the induced Update task is (incorrectly) determined to be unprofitable and the transaction in question is kicked out of the mempool. Nodes broadcasting a ProofCollection and charging the network with upgrading it therefore have a small and random time window for success. Too early, and they don’t know the state of the mutator set to sync the transaction to. Too late, and a new block is found before the transaction’s upgrade and dissemination are complete. And even if they hit the success window, they run into problem (1). According to the proposed perspective, Upgraders obviously need to keep track of which mempool transactions they already collected a fee from. Failure to track this information is no longer a detail whose absence can be ignored; it is a jarring deficiency, because collecting fees from transactions is the quintessential feature of Upgraders. With the shift in perspective comes the mandate to keep track of already-collected fees. With that deficiency remedied, Upgraders will Update, Merge, and rebroadcast fee-paying transactions that were Upgraded from ProofCollection to SingleProof.
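To illustrate the throughput claim in point (1), a back-of-the-envelope sketch – the proving time and upgrader count are made up; only the shape of the formula matters:

```rust
/// Back-of-the-envelope throughput estimate. All numbers are hypothetical.
fn main() {
    let merge_proof_secs = 120.0; // time for one Merge proof on one machine
    let num_upgraders = 10.0;

    // Upgraders merging disjoint transactions work in parallel, so the
    // network folds transactions into aggregates at this rate – regardless
    // of how often blocks are found:
    let txs_per_sec = num_upgraders / merge_proof_secs;

    // The other bound is the block size limit, not the block interval.
    println!("~{txs_per_sec:.3} tx/s until the block size limit binds");
}
```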

I think this is difficult to implement in practice. How do we ensure that the composer’s mempool has transactions that need to be updated? Especially in the early stage of the network, when there are very few transactions.

@aszepieniec thx for the writeup and proposal. I’m glad to see this is becoming a top priority.

I believe the dynamics/incentives of the system are such that only a very small number (perhaps even 1) of composers will typically be dominating block composition. More so if they dynamically adjust the guesser fraction to be close to the average over the last N blocks.

I don’t see that as a problem at all, because there is a market: even the composer with the fastest hardware must keep the guesser fraction high or risk losing the block reward to another composer.

What it means though is that other network participants with reasonably beefy hardware should realize it is more profitable to upgrade transactions.

But critically, this is only true if they are not performing wasteful work duplicating the work of other nodes. So, as I’ve been saying since day 1, it is important for the network’s health that a mechanism is found to help proof-upgraders to coordinate so that we get close to a situation where only one upgrader works on each transaction that needs to be upgraded.

It has been suggested previously that we can simply leave that to the market and treat it as an out-of-band issue for neptune-core. I think that may be true in the long run, but who knows how long that might take. As such, I think it is in the best interest of the project to create a “best effort” solution now that may not be theoretically perfect but is “good enough” to make it profitable for nodes to dedicate resources to upgrading.

I have suggested such a simple mechanism for this a couple times in the past, before mainnet even. The latest is here. To date, I haven’t received any meaningful feedback about the suggestion. Note that this does not require any change to the mining logic, and no hard or soft fork.

Right. The fundamental issue I still see is that neptune-core is wasting available resources by duplicating upgrade work across nodes, and that is both an unprofitable strategy and a very inefficient one for getting transactions upgraded and confirmed. For this reason, I suggested either adding a coordination mechanism before mainnet launch (my preference) or adding “send-rate” training wheels; the latter course was chosen, and those training wheels are, by the way, coming off pretty soon.

We need to get things working at a decent level of efficiency before considering higher fees to be a solution; i.e., there’s low-hanging fruit here.

Ok cool. It’s great to have a writeup of the problem but even better to have concrete proposal(s) to fix it.

The first thing I notice about this proposal is that it changes the mining dynamics and would thus appear to require a fork. I don’t see that as a big problem at this early stage, but I would like to have a discussion of the tradeoffs and merits of this proposal vs the much simpler scheme I proposed before that does not require any changes to mining.

The biggest question of course is: what problem(s) does this solve that the earlier proposal does not?

I 200% agree the mining loop should be refactored. I think I would need more convincing that a node should only be allowed to do one of the 3. If it proves unpopular, miners will simply run a fork of neptune-core (or custom software) that behaves how miners want it to. So I think that limitation only makes sense if it clearly aligns with incentives and feedback, and I don’t think that case has been made.

I might’ve missed it, but I don’t remember seeing anything in this proposal about upgraders coordinating to avoid duplicating each other’s work. If we have 100 tx in the mempool that need upgrading, and we have 50 proof-upgrading nodes available, it’s both wasteful and unprofitable if any 2 or more of those nodes start upgrading the same transaction. So they are not competing with each other unless and until there are more proof upgraders available than transactions in the mempool. And until then, and perhaps after, coordination is the most profitable strategy for them individually and collectively.

I’m going to beat the same drum again. Selecting which upgrade job to work on is the core of the problem. It doesn’t make any sense to select a tx to upgrade that another node is already upgrading, and the only way to know that is to coordinate about it. I think that without solving that, we haven’t really solved anything.

I think this proposal is a step in the right direction, but it needs to include coordination (somehow) before it can be said to fully address the issue(s).

That statement is incomplete. Consider the two extreme cases:

  1. all proof upgraders upgrade each tx
  2. only one proof upgrader ever performs work for a given tx.

The throughput capacity of the network will be very different for these two scenarios.

(2) is clearly more optimal, and any design should attempt to achieve it, or get near it, at least while Tx outnumber upgraders.

If the mempool is cleared and no transactions are initiated for a while, then Upgraders do not have anything they can keep up to date. So how can they ensure that Composers have at least one transaction in their mempools?

Upgraders could produce a dummy transaction and transmit that, but certainly do not have any incentive to do so – unless they are deliberately co-operating with a Composer.

Alternatively, we could drop the requirement that Upgraders have any responsibility in this regime. Let the Composers figure out what to do. Maybe they need to send a payment to an Upgrader. Or maybe they produce the dummy transaction on their own – and eat the penalty of producing a slower block proposal.

Either way, the problems that arise when no transactions are being initiated are quite opposite to the thing this thread is trying to solve :wink:

The suggestion @aszepieniec makes is not a change to the protocol, and hardly even one to the application.

Competitive composers already follow the model that @aszepieniec codifies above: they avoid spending time upgrading transactions and only do the minimal work required to make a block proposal as fast as possible in order to collect as many fees as possible.

Composers can already avoid doing much of the transaction proof upgrading by setting certain CLI flags: --tx-proof-upgrade-interval=0 and --max-mempool-num-tx=1. The only difference between the current application running with these parameters and @aszepieniec’s proposed three-step mining is that the composer might run the “update” branch of the SingleProof program once when a new block is received (if there is a transaction in the composer’s mempool). With @aszepieniec’s suggestion, that would not happen.

Also: @aszepieniec’s claim that proof upgrading does not do transaction merging is incorrect, as can be seen here:

That being said, I wholeheartedly agree with @aszepieniec’s codification of what was otherwise only tacit knowledge present in those who are composing competitively. I do, however, think that the vision can be implemented with much smaller changes to the codebase than suggested. Specifically: I don’t see the need for an upgrade loop – upgrading currently happens as a scheduled task initiated by the main loop, and I don’t see a reason to change that.


It helps to list the proofs that each party would be doing under 3-step mining:

Transaction-proof upgrader (runs each arbitrarily many times)

  • ProofCollection → SingleProof (“proofcollection” branch)
  • SingleProof → SingleProof (“update” branch)
  • (SingleProof, SingleProof) → SingleProof (“merge” branch)

Composer (runs each only once)

  • ProofCollection → SingleProof on coinbase transaction
  • (SingleProof, SingleProof) → SingleProof, merging coinbase with the one transaction in the mempool, thus setting the merge bit
  • SingleProof → BlockProof by running the block program.

Guesser
No proof generation, only hashing.
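One way to encode this division of labor – illustrative only; these are not the actual neptune-core types:

```rust
/// The three branches of the SingleProof program listed above.
/// Names here are made up for illustration.
enum SingleProofBranch {
    ProofCollection, // ProofCollection -> SingleProof
    Update,          // SingleProof -> SingleProof (new mutator set)
    Merge,           // (SingleProof, SingleProof) -> SingleProof
}

enum Role {
    Upgrader, // runs any branch, arbitrarily many times
    Composer, // runs its steps once per proposal (plus the block program)
    Guesser,  // no proving, only hashing
}

/// Under three-step mining, which branches does each role run?
fn runs(role: &Role, branch: &SingleProofBranch) -> bool {
    match role {
        Role::Upgrader => true,
        Role::Composer => matches!(
            branch,
            SingleProofBranch::ProofCollection | SingleProofBranch::Merge
        ),
        Role::Guesser => false,
    }
}

fn main() {
    assert!(runs(&Role::Upgrader, &SingleProofBranch::Update));
    assert!(!runs(&Role::Composer, &SingleProofBranch::Update)); // no "update" step for composers
    assert!(!runs(&Role::Guesser, &SingleProofBranch::Merge));
}
```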

We are observing transactions that are broadcast but do not make it into blocks, even when no other transactions are broadcast at the same time. To a first approximation, the logical explanation would seem to be not enough Upgrade work, not too much Upgrade work.

While I agree that work duplication is a problem worthy of attention, I don’t see how it is an adequate explainer for the observed problem.

I do not think it requires a fork. How did you come to this conclusion?

As for your suggestion, let me respond to this:

  1. It solves a different problem, one we do not have presently.

The problem addressed by this note is the very high likelihood of a transaction not being confirmed in the next block, even when no other transactions are being broadcast at the same time.

The problem addressed by the cited proposal is the reasonable likelihood of an initiated transaction being ignored when broadcast more or less simultaneously with others.

  2. It involves modifying the P2P message format, which is possible but fraught with difficulties. The last time we tried to add a new message type, nodes instantly closed the connection after receiving such a message. Did we solve that issue yet?
  3. Upgraders are in adversarial relationships with each other. If I know which transactions you are working on, I can deprive you of fees by upgrading the same ones.

If you are going to do more than one, you are bound to do two or more of them poorly. I think it is beneficial to the node operator and to the network to specialize. But that’s just an intuition; I’m not willing to die on this hill.

Even if there are two Upgraders and 100 transactions in the mempool, the Upgraders can be in competition with each other. The fees they can collect are not necessarily the same for every transaction. So conceivably all transactions have a fee of 0.01 NPT except for one whose fee is 10 NPT. Obviously the two Upgraders will compete to Upgrade the one high-fee transaction because if they cede that task to the counterparty they will have less profit.

If instead they coordinate to divide the fees more or less equally, then whoever doesn’t get the high fee paying transaction takes on the task of upgrading 99 low fee paying transactions. So in this case there is still an imbalance of workload.

This is true from a collective perspective, but false from an individual perspective. An individual Upgrader can rationally decide to Upgrade the same transaction they know others are Upgrading so that they can benefit from the fee instead.

The statement is false in case (1). Assume the system is running at full capacity. Then double the number of Upgraders and double the number of transactions. The workload of every Upgrader has doubled and they can’t keep up with the block time. Only half of the transactions can make it through.

The statement is true in case (2). Assume the system is running at full capacity. Then double the number of Upgraders and double the number of transactions. The workload of every individual Upgrader is the same. (There is an overhead due to one extra merge step being required but this one extra step is asymptotically negligible.)

So clearly the system I am describing resembles (2) more than it resembles (1).
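A tiny numerical check of this scaling argument – the numbers are made up and even division is assumed:

```rust
/// Per-upgrader workload in the two extreme cases. Illustrative only.
fn per_upgrader_jobs(num_upgraders: u64, num_txs: u64, duplicate_all_work: bool) -> u64 {
    if duplicate_all_work {
        num_txs // case (1): every upgrader touches every transaction
    } else {
        num_txs / num_upgraders // case (2): work is partitioned
    }
}

fn main() {
    // Double both the upgraders and the transactions:
    assert_eq!(per_upgrader_jobs(5, 100, true), 100);
    assert_eq!(per_upgrader_jobs(10, 200, true), 200); // case (1): workload doubles
    assert_eq!(per_upgrader_jobs(5, 100, false), 20);
    assert_eq!(per_upgrader_jobs(10, 200, false), 20); // case (2): workload unchanged
}
```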

However, the question is not: what is the optimal distribution of work, in the abstract? That answer is obvious. The question is, how best to approximate the optimal distribution of work in an adversarial environment? I think you will be hard-pressed to find a solution that works better than markets and prices.

So how does the present proposal approximate the optimal distribution of work in an adversarial environment? In a (probably way too abstract) nutshell, Upgraders adopt a Nash-optimal strategy, which most probably balances profit with randomness. How to find this Nash equilibrium? We can discuss strategies but ultimately the best way is to leverage arbitrage: let Upgraders chase profits and over time they will converge to a Nash equilibrium.
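To sketch what such a profit-with-randomness strategy could look like in practice – purely illustrative, using the rand crate; none of this is neptune-core code:

```rust
use rand::Rng;

/// A candidate upgrade job and the fee the node could gobble by
/// performing it. Purely illustrative.
struct UpgradeJob {
    gobbled_fee: f64,
}

/// Pick a job with probability proportional to its fee: profitable jobs
/// are favored, but the randomness makes it harder for competing
/// upgraders to predict (and duplicate) the choice.
fn pick_job(jobs: &[UpgradeJob], rng: &mut impl Rng) -> Option<usize> {
    let total: f64 = jobs.iter().map(|j| j.gobbled_fee).sum();
    if jobs.is_empty() || total <= 0.0 {
        return None;
    }
    let mut threshold = rng.gen_range(0.0..total);
    for (i, job) in jobs.iter().enumerate() {
        if threshold < job.gobbled_fee {
            return Some(i);
        }
        threshold -= job.gobbled_fee;
    }
    Some(jobs.len() - 1) // guard against floating-point fall-through
}

fn main() {
    let jobs = vec![
        UpgradeJob { gobbled_fee: 0.01 },
        UpgradeJob { gobbled_fee: 10.0 },
    ];
    let mut rng = rand::thread_rng();
    println!("selected job index: {:?}", pick_job(&jobs, &mut rng));
}
```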

Let this transformation henceforth be called “raising”.

Instead of allowing all upgrader nodes to race for incoming transactions, what if we introduce a coordination mechanism over the P2P network? Upgrader nodes can optionally reveal themselves as available, and the node that first receives a transaction (the “relayer”) is responsible for delegating it to a selected subset of these known upgrader candidates.
This delegation can be based on criteria such as availability (and workload), performance, or fairness.
In addition, the original relaying node (that propagated the transaction) can receive a small reward, incentivizing the operation of public nodes.