System Performance
Incident Report for ServiceChannel
Postmortem

Date of Incident: 05/14/2019

Time/Date Incident Started: 05/14/2019, 12:23 pm EDT

Time/Date Stability Restored: 05/14/2019, 2:02 pm EDT

Time/Date Incident Resolved: 05/14/2019, 4:03 pm EDT

Users Impacted: Active users

Frequency: Intermittent

Impact: Major

Incident description:

Performance degraded across all systems: database performance issues caused a drastic increase in system response times.

Root Cause Analysis:

A recently enabled feature for work order (WO) reports consumed more database server resources than expected. As requests queued behind it, performance degraded across the application.

Additional research determined that the code returning data for these requests was not optimized, causing excessive database blocking and waits.

Actions Taken:

Temporarily disabled certain customer-specific integrations to reduce the excessive database load caused by a long API request queue.

Identified and disabled a poorly performing database stored procedure.

Mitigation Measures:

Engineering teams investigated and refactored the code responsible for this feature.

Enabled throttling of these requests to prevent recurrence.
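
This report does not describe how the throttling was implemented. As an illustration only, one common way to cap the rate of expensive report requests is a token bucket. The sketch below is a minimal example under that assumption; the class name, parameters, and limits are hypothetical and are not taken from the ServiceChannel codebase.

```python
import time


class TokenBucket:
    """Hypothetical rate limiter: allows bursts up to `capacity`,
    then refills at `rate` tokens per second."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate            # tokens added per second
        self.capacity = capacity    # maximum burst size
        self.clock = clock          # injectable clock, useful for testing
        self.tokens = float(capacity)
        self.last = clock()

    def allow(self):
        """Return True if a request may proceed, False if it should be throttled."""
        now = self.clock()
        # Refill in proportion to elapsed time, never exceeding capacity.
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


# Example: at most 2 report requests in a burst, refilling 1 per second.
bucket = TokenBucket(rate=1.0, capacity=2)
if bucket.allow():
    pass  # run the report query
else:
    pass  # reject or defer the request, e.g. with an HTTP 429 response
```

A limiter of this kind bounds how many heavy queries reach the database at once, so a spike in report requests is rejected or deferred rather than queued behind blocked queries.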

Posted Oct 09, 2019 - 14:27 EDT

Resolved
We are currently investigating degraded system performance. We will provide an update shortly. Thank you for your patience.

...

Our engineering team has identified the issue and services are returning to normal with the exception of some Workforce reports. Work order reports are currently not displaying the technician's information associated with the work order. Tech assigned and Tech accepted fields are not available. We are continuing to monitor and will restore the reports shortly. Thank you for your patience.

...

Workforce reporting has been re-enabled. After 90 minutes of monitoring, all systems are functioning normally. We consider this incident to be resolved.
Posted May 14, 2019 - 12:24 EDT