At 3:00 PM PST on December 9, 2020, an internal fullnode lost connection to the XRP network and fell behind chain-head. Due to a spike in overall XRP network transaction traffic and resource constraints, the node was unable to fully catch up.
(All times stated here are Pacific Standard Time)
12/9/2020, 3:00 PM Initial XRP fullnode network disruption.
12/9/2020, 4:26 PM BitGo status page was updated to announce the initial incident.
12/10/2020, 10:41 AM XRP fullnode arrives at chain-head.
12/10/2020, 10:47 AM BitGo status page was updated to announce incident resolution.
The outage impacted our ability to index and process new XRP network transactions, but did not result in any data loss or corruption. This outage did not interact with or impact any systems that handle funds or currency.
An initial networking disruption prevented an XRP fullnode from communicating with the network.
Once reconnected, the fullnode had fallen behind chain-head.
Due to the initial network disruption, a spike in overall XRP network transaction traffic, insufficient network throughput, and disk I/O latency, the fullnode was unable to index the XRP ledger fast enough to catch up with chain-head.
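A node's progress toward chain-head during a catch-up period like this can be observed through rippled's `server_info` RPC method, whose `server_state` field reports the node's sync status. Below is a minimal sketch of how such a check might classify that field; the helper function and the three-way classification are our own illustration, while the state names themselves come from rippled's documented server states.

```python
def sync_status(server_info: dict) -> str:
    """Classify a rippled node's sync state from a server_info response.

    `server_info` is the parsed JSON-RPC response from rippled's
    server_info method. The state names below are rippled's documented
    server states; the grouping into three buckets is our own.
    """
    info = server_info["result"]["info"]
    state = info.get("server_state", "unknown")
    # "full", "proposing", and "validating" indicate the node has a
    # complete view of the ledger, i.e. it is at chain-head.
    if state in ("full", "proposing", "validating"):
        return "at chain-head"
    # "connected", "syncing", and "tracking" indicate the node can reach
    # peers but is still working toward a complete ledger view.
    if state in ("connected", "syncing", "tracking"):
        return "catching up"
    return "disconnected"
```

A monitor built on a check like this could alert as soon as the node drops out of the "at chain-head" bucket, rather than waiting for downstream indexing to stall.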
We approached the solution from two directions. First, we reached out to Ripple engineering for guidance on anything we could do to help mitigate the current network load. Second, we began implementing an improved hardware setup with more than enough capacity for the resources the node was demanding. Once the new stack was operational, we were able to resume operations for XRP.
We have completed a capacity-planning exercise and created new nodes with an improved architecture, capable of handling current and projected future XRP network behavior.