On Monday, May 19, the Stacks network experienced an extended stall. Block 1213722 was timestamped at 18:43:55 GMT and the next valid block, 1213723, at 23:42:53 GMT, so no valid blocks were produced for roughly 4 hours and 59 minutes. At the beginning of that period, users may have seen blocks 1213723 through 1213804 mined and broadcast to the network. Those 82 blocks were then invalidated by a Bitcoin fork, a reorg of Bitcoin block 897442. After this Bitcoin reorg was resolved, Stacks nodes that had followed the original fork, including the >70% of signers that had signed off on those blocks, were unable to invalidate them. This prevented any new miner from reaching the 70% signer threshold required for block production.
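As a quick sanity check on the figures above, the stall duration and the number of invalidated blocks can be recomputed from the quoted timestamps and heights. This is only an arithmetic sketch; the variable names are illustrative and no date is needed because both timestamps fall on the same day.

```python
from datetime import datetime

# Block timestamps quoted in the report (both GMT, same day).
last_block_before_stall = datetime.strptime("18:43:55", "%H:%M:%S")
first_block_after_stall = datetime.strptime("23:42:53", "%H:%M:%S")

# Length of the gap between the two valid blocks.
gap = first_block_after_stall - last_block_before_stall
hours, remainder = divmod(int(gap.total_seconds()), 3600)
minutes, seconds = divmod(remainder, 60)

# Inclusive count of the blocks that were mined and later invalidated.
invalidated_blocks = 1213804 - 1213723 + 1

print(f"stall: {hours}h {minutes}m {seconds}s, invalidated blocks: {invalidated_blocks}")
```

The gap works out to 4 hours, 58 minutes and 58 seconds, which rounds to the 4 hours and 59 minutes quoted above, and the inclusive block range contains 82 blocks.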
Technical Details
When Bitcoin forked, the new miner on block 897442a (we’ll use 897442a and 897442b to refer to the two competing blocks in the Bitcoin fork) attempted to mine blocks but was unable to reach 70% acceptance from the signers. The miner from Bitcoin block 897441 was on the other fork and was able to extend its tenure with the burn view of block 897442b and mine blocks 1213723 through 1213804. This was allowed because >70% of the signers were following that same Bitcoin fork (897442b) and this block had no valid miner, in which case the previous miner is permitted to extend its tenure and continue producing blocks with the new burn view. Eventually, Bitcoin block 897443 arrived and confirmed 897442a, which should have invalidated all Stacks blocks built on the 897442b fork. If these blocks had been mined by a new miner in block 897442b, they would have been properly invalidated, but due to a bug in the handling of a tenure extension into a reorged block, they were not. As a result, the node endpoint /v3/tenures/tip/{consensus_hash}, which signers use to retrieve the header of the highest known block in the given tenure, continued to return the header of the now-invalid block 1213804. These signers were then forced to reject attempts to mine a new block 1213723, because they were expecting a block at height 1213805. This problem is being tracked in issue #6126.
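The intended behavior can be illustrated with a toy model. This is not the stacks-core implementation: the `StacksBlock` type, the `tenure_tip` function, and the tenure label are hypothetical, and placing block 1213722 in the same tenure with burn view 897441 is an assumption made for illustration. The sketch only shows that a tip lookup which ignores blocks whose burn view is off the canonical Bitcoin fork falls back to the last valid block once the reorg resolves.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StacksBlock:
    height: int
    tenure: str     # hypothetical label standing in for a consensus hash
    burn_view: str  # Bitcoin block this Stacks block was built against


def tenure_tip(blocks, tenure, canonical_burn_blocks):
    """Highest block in `tenure` whose burn view is on the canonical Bitcoin fork."""
    valid = [
        b for b in blocks
        if b.tenure == tenure and b.burn_view in canonical_burn_blocks
    ]
    return max(valid, key=lambda b: b.height, default=None)


TENURE = "tenure-of-897441-miner"  # hypothetical identifier

# Last valid block, then the 82 blocks mined via tenure extension on 897442b.
blocks = [StacksBlock(1213722, TENURE, "897441")]
blocks += [StacksBlock(h, TENURE, "897442b") for h in range(1213723, 1213805)]

# While the node follows the 897442b fork, the extended blocks are the tip.
tip_before_reorg = tenure_tip(blocks, TENURE, {"897441", "897442b"})

# After 897443 confirms 897442a, blocks built on 897442b must be ignored.
tip_after_reorg = tenure_tip(blocks, TENURE, {"897441", "897442a", "897443"})

print(tip_before_reorg.height, tip_after_reorg.height)
```

In this model the tip is 1213804 before the reorg resolves and 1213722 afterwards, so the next acceptable block is at height 1213723. The bug described above corresponds to the lookup still returning 1213804 after the reorg, which is why signers expected height 1213805.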
At the same time, some signers got stuck attempting to push the invalidated block 1213723 to their Stacks node, which rejected the block because of its invalid burn view. The signer code was written to keep resending the block to its node until it succeeded, so these signers were stuck in an infinite loop. This problem is being tracked in issue #6127 and fixed in pull request #6129.
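One common way to avoid this class of loop is to bound the number of retries and to stop immediately when the rejection is one that retrying cannot fix. The sketch below illustrates that general pattern only; it does not describe the actual change in pull request #6129, and `PermanentRejection`, `push_block`, and the `submit` callable are hypothetical names.

```python
import time


class PermanentRejection(Exception):
    """Hypothetical error: the node rejected the block for a reason that
    retrying cannot fix (for example, an invalid burn view)."""


def push_block(submit, block, max_attempts=5, delay_seconds=0.0):
    """Try to submit `block` via `submit`; return True on success.

    Gives up immediately on a permanent rejection and after
    `max_attempts` transient failures, instead of retrying forever.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            submit(block)
            return True
        except PermanentRejection:
            return False  # retrying cannot help, so stop now
        except ConnectionError:
            if attempt < max_attempts:
                time.sleep(delay_seconds)  # transient failure: wait, then retry
    return False


# Usage: a stand-in node that always rejects the block as invalid.
attempts = []


def always_rejects(block):
    attempts.append(block)
    raise PermanentRejection("invalid burn view")


accepted = push_block(always_rejects, "block-1213723")
print(accepted, len(attempts))
```

Here the stand-in node is contacted exactly once and the signer moves on, whereas an unconditional retry loop would never exit.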
To resolve these problems, Stacks nodes that had been on the reorged Bitcoin fork needed to have their chainstate reset to a point before the reorg happened, and then be allowed to catch back up to the chain tip. For signers stuck in the loop described above, a simple restart was enough to solve that issue. Because the Stacks signers are decentralized, this type of action is difficult to coordinate, and it therefore took some time for enough signers to recover before the network could safely resume.
Lessons and Remediation
In addition to resolving the bugs identified above, the following actions are being taken to prevent similar events and improve response times for future incidents:
Testing Improvements
Enhanced testing remains a primary focus of our ongoing strategy. The core development team is conducting an audit of the Stacks node integration tests to identify and cover any missing scenarios, particularly those involving tenure extensions and Bitcoin forks. This effort complements existing work already in progress to improve these tests and enable synthesized command sequences, which should help catch scenarios that we were unable to come up with manually (see, e.g., PR #6007).
Signer Preparedness
Alongside these technical improvements, we will also prioritize the preparedness of signers to handle future incidents swiftly. We recommend that signers implement effective monitoring and alerting solutions, ensuring timely awareness and action when issues arise. One option here may be to set up a community-wide alerts channel to which interested parties can subscribe (details to follow). Additionally, establishing a routine process for regularly capturing chainstate snapshots will allow for rapid recovery and minimize downtime during unforeseen events.
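As one illustration of what such monitoring might check, a signer operator could alert when the chain tip has not advanced for longer than some threshold. The function name and the 30-minute default below are assumptions chosen for the example, not a recommendation from the core team, and how the tip timestamp is obtained from a node is left out.

```python
def is_stalled(tip_timestamp, now, threshold_seconds=1800):
    """Return True if the latest known block is older than the threshold.

    `tip_timestamp` and `now` are Unix timestamps in seconds. The default
    threshold of 30 minutes is an arbitrary value for illustration.
    """
    return (now - tip_timestamp) > threshold_seconds


# Usage with made-up timestamps: a tip that is 10 minutes old, and one that
# is 4h 58m 58s old (17938 seconds, the gap seen in this incident).
tip = 1_000_000
fresh = is_stalled(tip, tip + 600)
stale = is_stalled(tip, tip + 17938)
print(fresh, stale)
```

A check like this would stay quiet for the 10-minute-old tip and fire for a gap of the size seen in this incident; the right threshold depends on how quickly an operator wants to be paged.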
Alerts for Bitcoin Reorgs
We will introduce automated alerts to promptly notify core developers whenever a Bitcoin reorg occurs. These events are rare enough that it is always a good idea to keep an eye on the chain when one happens, to ensure that some new corner case hasn’t been exposed due to untested reorg timing.
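A minimal sketch of how such a detector could work: remember the block hash seen at each Bitcoin height and flag a reorg when a different hash later appears at a height that was already recorded. The `ReorgDetector` class is hypothetical, the hash strings are placeholders, and the code that would feed it (for example, polling a Bitcoin node) is not shown.

```python
class ReorgDetector:
    """Tracks the block hash observed at each height and reports changes."""

    def __init__(self):
        self.seen = {}  # height -> most recently observed block hash

    def observe(self, height, block_hash):
        """Record an observation; return True if it replaces a different hash."""
        previous = self.seen.get(height)
        self.seen[height] = block_hash
        return previous is not None and previous != block_hash


# Usage with placeholder hashes standing in for the two competing blocks.
detector = ReorgDetector()
events = [
    detector.observe(897441, "hash-897441"),
    detector.observe(897442, "hash-897442b"),
    detector.observe(897442, "hash-897442a"),  # competing block at same height
]
print(events)
```

Only the third observation is flagged, since it replaces a previously seen block at height 897442. That is the point at which an alert would be sent to core developers.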
Logging Enhancements
Finally, we’ve made minor but meaningful improvements to Stacks node logging. These small changes would have made reviewing the logs in search of this problem somewhat easier. A pull request has already been merged with an initial set of such changes (#6125).
Summary
Collectively, these improvements in testing, signer readiness, alerting, and logging represent a shared commitment across the developer and signer community to strengthen the resilience of the Stacks network. While coordination in a decentralized ecosystem poses unique challenges, ongoing collaboration and proactive tooling can help the network respond more effectively to rare but challenging events like this one.