Incident Report - CUBE-DS-1477 system drive failure
Posted on 26 Apr 2016
System drive failure on CUBE-DS-1477.
Incident Overview
Location: London City Data Centre
Start Date & Time (GMT): 17 April 2016, 2245
End Date & Time (GMT): 19 April 2016, 0035
Description of Work
During routine maintenance, filesystem inconsistencies caused the server's system drive to fail.
As a result, a new server was built and the web services were restored from backup.
Services Affected
All services on CUBE-DS-1477.
Cloud email was not affected.
Impact
- No data was lost
- Some mail sent during working hours on 18 April 2016 may have been returned to sender as undeliverable. No cases of this have been confirmed
- All services were offline for the duration of the incident, a total of 8 working hours
- A number of websites were unavailable from 0035 on 19 April due to database permissions issues
- 95% of system passwords (control panel, mailbox, database, etc.) required a manual reset because of an undocumented feature of the control panel security policy that applies when a system is restored from backup
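
The "8 working hours" figure can be reconciled with the start and end times in the overview with a short calculation. The sketch below is illustrative only: the 09:00 to 17:00, Monday to Friday working day is an assumption (the report does not define its working hours), and the function name `working_hours_between` is hypothetical.

```python
from datetime import datetime, time, timedelta

# Incident window (GMT) as stated in the Incident Overview.
START = datetime(2016, 4, 17, 22, 45)
END = datetime(2016, 4, 19, 0, 35)

# ASSUMPTION: the report does not define "working hours";
# 09:00 to 17:00 on weekdays is used here for illustration.
WORK_START = time(9, 0)
WORK_END = time(17, 0)


def working_hours_between(start, end):
    """Sum the overlap of [start, end] with each weekday's working window."""
    total = timedelta()
    day = start.date()
    while day <= end.date():
        if day.weekday() < 5:  # Monday (0) to Friday (4)
            window_start = datetime.combine(day, WORK_START)
            window_end = datetime.combine(day, WORK_END)
            overlap_start = max(start, window_start)
            overlap_end = min(end, window_end)
            if overlap_end > overlap_start:
                total += overlap_end - overlap_start
        day += timedelta(days=1)
    return total


elapsed = END - START
worked = working_hours_between(START, END)
print(f"Elapsed time: {elapsed}")
print(f"Working hours affected: {worked.total_seconds() / 3600:.0f}")
```

Under these assumptions the incident spans 25 hours 50 minutes of elapsed time, of which only the working day of Monday 18 April falls inside the window, which matches the 8 working hours quoted above.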
Full Incident Report
The full incident report is available to download: Incident Report - System drive failure on CUBE-DS-1477 (135kb PDF)