Senko Digital

History

Jan 2026 - Mar 2026

Incident

Node Outage

Affected services:

🇩🇪 FC [9575F] zerda

We are pleased to confirm that the elevated CPU steal time issue on the fra-9575f-zerda cluster node has been fully resolved and normal operation has been restored.

After thorough analysis, we determined that the root cause was related to virtual machines resuming simultaneously after the initial restart, which caused excessive resource contention. To address this, we performed a full power-down and cold start of the node and changed our virtualization configuration: VMs now start gradually, with defined CPU quotas, to prevent resource saturation during boot sequences.
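For readers curious what a gradual start can look like, here is a minimal sketch of a staggered start plan. It is an illustration only and not our actual configuration: the function name `staggered_start_plan`, the batch size, and the gap between batches are all hypothetical, and CPU quotas are not modeled here.

```python
from typing import List, Tuple


def staggered_start_plan(
    vm_ids: List[str], batch_size: int, gap_seconds: float
) -> List[Tuple[float, List[str]]]:
    """Split VMs into batches and give each batch a start offset in seconds.

    Hypothetical helper: batch_size and gap_seconds are illustrative
    parameters, not values used on any real node.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    if gap_seconds < 0:
        raise ValueError("gap_seconds must not be negative")

    plan = []
    for i in range(0, len(vm_ids), batch_size):
        # Batch 0 starts immediately, batch 1 after one gap, and so on.
        offset = (i // batch_size) * gap_seconds
        plan.append((offset, vm_ids[i:i + batch_size]))
    return plan


# Example: five VMs, two per batch, 30 seconds between batches.
plan = staggered_start_plan(["vm-a", "vm-b", "vm-c", "vm-d", "vm-e"], 2, 30.0)
```

Spacing the batches out means that only a few VMs compete for CPU during boot at any one time, instead of all of them at once.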

All virtual machines on the node are back online and operating normally. We will continue monitoring the node closely over the coming hours to ensure sustained stability.

Regarding compensation: we will be providing three additional days of service to all impacted customers, on top of the two days already credited for yesterday's initial outage, for a total of five extra days. We recognize this goes well beyond the actual downtime experienced and exceeds our SLA obligations, but we believe it's the right thing to do here.
Our customers put their trust in us, and we understand the disruption this incident caused. This is not the first time we've gone above and beyond our policy in situations like this, and it won't be the last.

We sincerely apologize for the inconvenience and thank you for your patience throughout this incident.

No further action is required from customers. If you experience any remaining issues with your VM, please don't hesitate to reach out to our support team.

Feb 23, 5:19 PM

We are aware of the elevated CPU steal time currently occurring on the fra-9575f-zerda node.
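For customers who would like to check steal time from inside their own Linux VM, the sketch below estimates it from two readings of the aggregate `cpu` line in `/proc/stat`. It assumes the standard field order on current Linux kernels (user, nice, system, idle, iowait, irq, softirq, steal); the function names `parse_cpu_line` and `steal_percent` are hypothetical, and the sample values are made up for illustration.

```python
from typing import List


def parse_cpu_line(line: str) -> List[int]:
    """Return the first eight counters (user .. steal) from a 'cpu' line."""
    parts = line.split()
    if len(parts) < 9 or parts[0] != "cpu":
        raise ValueError("expected an aggregate 'cpu' line with a steal field")
    # guest and guest_nice (fields 9 and 10) are already counted in
    # user and nice, so they are left out of the total.
    return [int(value) for value in parts[1:9]]


def steal_percent(before: str, after: str) -> float:
    """Share of CPU time reported as steal between two samples, in percent."""
    first = parse_cpu_line(before)
    second = parse_cpu_line(after)
    deltas = [b - a for a, b in zip(first, second)]
    total = sum(deltas)
    if total <= 0:
        return 0.0
    return 100.0 * deltas[7] / total  # index 7 is the steal counter


# Made-up sample readings taken a few seconds apart.
sample_before = "cpu  100 0 50 800 10 0 0 40 0 0"
sample_after = "cpu  150 0 70 1520 20 0 0 240 0 0"
steal = steal_percent(sample_before, sample_after)
```

On a real VM, the two strings would be the first line of `/proc/stat` read a few seconds apart. The `st` column in `top` reports the same figure.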

Our team is actively monitoring and analyzing the situation. Prior to the power interruption and subsequent reboot, no abnormal steal time behavior was ever observed. Based on this, the issue is most likely related to a software-level condition that manifested after the restart.

In parallel with our investigation, we are preparing a new cluster node so that we can migrate a portion of the affected virtual machines later today. This will allow us to further distribute the load and mitigate the impact while we continue deeper analysis.

Feb 23, 9:24 AM

Approximately one hour ago, the fra-9575f-zerda cluster node experienced an unexpected outage due to a rack power feed issue within the data center. The interruption affected the entire cluster node.

To safely restore stability, we performed a controlled power-down of the server, re-established stable power to the rack feed, and then brought the node back online.

All affected virtual machines are now booting back up and should be operational shortly. Approximately one third of the VMs hosted on the node are already up. No further impact is expected at this time.

We will continue monitoring the node to ensure stability.

This is an unplanned incident. No action is required from customers at this time. We will automatically provide two extra days of service to all customers impacted by this incident. We apologize for any inconvenience caused.

Feb 22, 6:58 PM