System Performance Degradation
Incident Report for ServiceChannel
Postmortem

System Performance Degradation Incident Report

Date of Incident: 06/13/2019

Time/Date Incident Started: 06/13/2019, 10:35 am EDT

Time/Date Stability Restored: 06/13/2019, 1:12 pm EDT

Time/Date Incident Resolved: 06/13/2019, 1:18 pm EDT

Users Impacted: Active users

Frequency: Intermittent

Impact: Major

Incident description:

System performance was degraded: users were unable to log in to the system or experienced errors during the login process.

Root Cause Analysis:

We identified an issue in login session management in the Classic ASP code. This issue caused a number of cascading failures, which in turn created timeouts throughout the ServiceChannel platform.

Actions Taken:

Reverted code from previous release

Restarted Redis Cluster

Mitigation Measures:

Added monitoring to notify the SRE team when Redis cache hits exceed defined thresholds.

Implemented temporary manual stopgap measures. We are currently working on a permanent solution.
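
As an illustration of the cache-hit monitoring measure, the sketch below shows one way a threshold check on Redis cache hits could work. It is not the monitoring ServiceChannel deployed. It assumes the metric is the cumulative `keyspace_hits` counter that Redis reports in `INFO stats`, sampled periodically. The names `Sample`, `hits_per_second`, and `alert_level`, and the threshold values in the usage note, are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Sample:
    """One reading of the cumulative Redis keyspace_hits counter (assumed metric)."""
    timestamp: float    # seconds, from any monotonically increasing clock
    keyspace_hits: int  # cumulative hit count at that time


def hits_per_second(prev: Sample, curr: Sample) -> float:
    """Convert two cumulative readings into a hit rate."""
    elapsed = curr.timestamp - prev.timestamp
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    delta = curr.keyspace_hits - prev.keyspace_hits
    if delta < 0:
        # The counter went backwards, which happens after a Redis restart.
        # Treat the current value as the number of hits since the restart.
        delta = curr.keyspace_hits
    return delta / elapsed


def alert_level(rate: float, warn: float, critical: float) -> Optional[str]:
    """Return the alert level for a hit rate, or None when it is within limits."""
    if rate > critical:
        return "critical"
    if rate > warn:
        return "warning"
    return None
```

For example, with hypothetical thresholds of 5,000 and 10,000 hits per second, two samples taken 10 seconds apart that differ by 60,000 hits give a rate of 6,000 hits per second, which `alert_level` reports as "warning". The negative-delta branch matters in an incident like this one, because restarting the Redis cluster resets the counter.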

Posted Jun 17, 2019 - 16:37 EDT

Resolved
All services are confirmed running as expected. We consider this incident to be resolved.
Posted Jun 13, 2019 - 15:50 EDT
Update
We are continuing to investigate this issue. Thank you for your patience.
Posted Jun 13, 2019 - 11:44 EDT
Investigating
We are currently investigating degraded system performance. We will provide an update shortly. Thank you for your patience.
Posted Jun 13, 2019 - 10:35 EDT
This incident affected: Call Center, Fixxbook, IVR, Service Automation, and Email Processing.