
Incident Christian Schild

Incident: gitlab.git.nrw outage

Service: gitlab.git.nrw pilot
Started: May 9, 2025 01:30 +0200
Resolved: May 9, 2025 07:15 +0200
Duration: 5h 45m
Affected Components: database, gitlab

Impact

Service was completely unavailable during the outage window.

Root Cause

Database replication had silently stopped. Overnight, the disk of the remaining database node filled up with write-ahead log (WAL) files, which took the service down.

Resolution

Disk capacity was increased and replication was restored. Monitoring of WAL size and replication health will be added (see Follow-up Actions).

Follow-up Actions

  • Implement WAL size monitoring and alerting

  • Add replication health dashboard

  • Document incident response procedures
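As an illustration of the first follow-up action, a WAL size check could look roughly like the sketch below. It is not the monitoring that will actually be deployed. It assumes a PostgreSQL-style WAL directory that the check can read directly; the function names, the directory path and the thresholds are hypothetical.

```python
import os
import shutil


def wal_dir_bytes(wal_dir):
    """Sum the sizes of the regular files directly inside wal_dir."""
    total = 0
    with os.scandir(wal_dir) as entries:
        for entry in entries:
            # Subdirectories are skipped; only the files at the top level count.
            if entry.is_file(follow_symlinks=False):
                total += entry.stat(follow_symlinks=False).st_size
    return total


def check_wal(wal_dir, max_wal_bytes, min_free_ratio):
    """Return a list of alert messages; an empty list means no threshold was crossed."""
    alerts = []
    wal_bytes = wal_dir_bytes(wal_dir)
    usage = shutil.disk_usage(wal_dir)  # usage of the filesystem holding wal_dir
    free_ratio = usage.free / usage.total
    if wal_bytes > max_wal_bytes:
        alerts.append(f"WAL size {wal_bytes} bytes exceeds limit {max_wal_bytes} bytes")
    if free_ratio < min_free_ratio:
        alerts.append(f"free disk ratio {free_ratio:.2f} is below {min_free_ratio:.2f}")
    return alerts
```

A scheduled job could call, for example, `check_wal("/var/lib/postgresql/data/pg_wal", 20 * 1024**3, 0.2)` (path and limits are placeholders) and forward any returned messages to the alerting system. Checking both the absolute WAL size and the free-space ratio means the alert fires well before the disk is full.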

Additional Details

During the night of May 9, 2025, gitlab.git.nrw was unavailable from approximately 01:30 to 07:15 CEST.

As a knock-on effect of Wednesday's incident (May 7, 2025), database replication had silently stopped. Overnight, the remaining database node’s disk filled up with WAL transaction logs, which took the service down.

Disk capacity was increased and replication restored; the service has been fully available again since 07:15. As we are still in pilot, we continue to harden the platform and improve monitoring.
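Replication stopped silently, which is the failure mode a health check has to catch. The sketch below shows one possible approach, assuming the database is PostgreSQL 10 or later and that the rows of the `pg_stat_replication` view are fetched on the primary by some client that is not shown. The function name, the standby names and the lag limit are hypothetical.

```python
# Query to run on the primary. Fetching the rows as dictionaries is left to the
# database client in use, which is not shown here.
REPLICATION_QUERY = """
SELECT application_name,
       state,
       pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) AS lag_bytes
FROM pg_stat_replication
"""


def replication_alerts(rows, expected_standbys, max_lag_bytes):
    """Compare the reported standbys with the expected ones and return alert messages."""
    alerts = []
    streaming = {r["application_name"] for r in rows if r["state"] == "streaming"}
    # A standby that has stopped replicating no longer appears as streaming,
    # so a missing row must be treated as an error, not as "nothing to report".
    for name in sorted(set(expected_standbys) - streaming):
        alerts.append(f"standby {name} is not streaming")
    for r in rows:
        lag = r["lag_bytes"]
        if r["state"] == "streaming" and lag is not None and lag > max_lag_bytes:
            alerts.append(
                f"standby {r['application_name']} lag {lag} bytes exceeds {max_lag_bytes} bytes"
            )
    return alerts
```

The key design choice is comparing against an explicit list of expected standbys. A check that only inspects the rows that are present would have reported nothing during this incident, because a stopped replica produces no row at all.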
