Network disruption

Incident Report for SuperOffice

Postmortem

The incident in the datacenter was triggered by an overload of monitoring resources on a specific switch model (N9K-C93180YC-FX3). This switch was configured with multiple types of SPAN (Switched Port Analyzer) sessions, which are used to monitor network traffic. The switch reached its hardware limit for these sessions, which led to a failure.

The fault (F3849) indicated that the SPAN limit had been exceeded, and a known bug in the system then caused the switch to reboot. The bug was triggered when we set up a combination of monitoring sessions and then removed some of them. This action caused the bug to affect all similar switches in the network.

In simpler terms, the switch was asked to do more monitoring than it could handle, which led to a failure and reboot. The specific bug in the system was triggered by the way we configured and then changed the monitoring settings, causing widespread issues across all similar switches at the same time.
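
As an illustration of the kind of limit described above, a pre-change check could count the monitoring sessions planned for each switch and flag any switch that would exceed a configured maximum. The sketch below is hypothetical and is not the tooling used in this incident: the names `SpanSession` and `switches_over_limit` are invented for the example, and `ASSUMED_SESSION_LIMIT` is a placeholder value, since the real limit depends on the switch model and software release.

```python
from dataclasses import dataclass

# Placeholder only: NOT the actual hardware limit of any switch model.
ASSUMED_SESSION_LIMIT = 4


@dataclass(frozen=True)
class SpanSession:
    """One planned monitoring session (hypothetical structure)."""
    switch: str  # name of the switch the session would be configured on
    kind: str    # illustrative label for the session type


def switches_over_limit(sessions, limit=ASSUMED_SESSION_LIMIT):
    """Return the sorted names of switches whose session count exceeds limit."""
    counts = {}
    for session in sessions:
        counts[session.switch] = counts.get(session.switch, 0) + 1
    return sorted(name for name, n in counts.items() if n > limit)


if __name__ == "__main__":
    planned = (
        [SpanSession("switch-a", "type-1")] * 3
        + [SpanSession("switch-b", "type-1")] * 5
    )
    # With the placeholder limit of 4, only switch-b is flagged.
    print(switches_over_limit(planned))
```

A check like this only guards against exceeding a session count. It would not by itself detect the software bug that was triggered when sessions were removed.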

Posted Mar 18, 2025 - 10:40 CET

Resolved

This incident has been resolved.
Posted Mar 14, 2025 - 23:12 CET

Monitoring

A fix has been implemented and we are monitoring the results.
Posted Mar 14, 2025 - 15:51 CET

Investigating

We are currently investigating a network disruption that is causing intermittent availability issues with SuperOffice CRM Cloud.
Posted Mar 14, 2025 - 15:27 CET

This incident affected: SuperOffice CRM Cloud (Login services, Sales & Marketing Client, Service Client, WebTools, API and Apps).