Analytics Access
Incident Report for ServiceChannel
Postmortem

Analytics Access Incident Report

Date of Incident: 07/22/2020

Time/Date Incident Started: 07/22/2020, 6:26 AM EDT

Time/Date Stability Restored: 07/22/2020, 9:01 AM EDT

Time/Date Incident Resolved: 07/22/2020, 9:19 AM EDT

Users Impacted: SC Analytics Users

Frequency: Sustained

Impact: Minor 

Incident description:

On Wednesday morning, 07/22/2020, at 8:21 AM EDT, support teams received reports of a database connection error from users connected to the ServiceChannel analytics platform.

Root Cause Analysis:

During triage, ServiceChannel engineers reached out to our analytics partner and identified a networking-related issue. After some additional review, the ServiceChannel SRE team confirmed an issue with the ServiceChannel end of the tunnel to the analytics vendor.

ServiceChannel SRE engineers performed an administrative restart of the VPN endpoint, which restored the tunnel. Once the tunnel came back up, the Analytics platform resumed normal functionality.

Actions Taken: 

  1. The SRE team reviewed internal logs and documentation to identify which system was responsible for communications with the analytics partner.
  2. Confirmed that our hosting provider had placed the system in an unhealthy state, then restarted it to restore the SSH tunnels to the analytics partner.
  3. Reached out to our hosting provider to further investigate why they placed the system in an unhealthy state.

Mitigation Measures:

  1. Improve monitoring of the infrastructure responsible for connectivity between ServiceChannel and the analytics provider so that the SRE team is notified if the connection becomes unhealthy again.
  2. Explore adding a redundant connection to the analytics partner.
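
The monitoring measure above could be approached along the following lines. This is a minimal sketch, not a description of the monitoring ServiceChannel actually deployed: it assumes the tunnel's health can be approximated by a TCP connection to a service reachable only through the tunnel, and the names `tcp_probe` and `TunnelMonitor`, the threshold of three failed checks, and the notification callback are all hypothetical.

```python
import socket
from dataclasses import dataclass
from typing import Callable


def tcp_probe(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Assumption: the target is only reachable through the tunnel, so a
    failed connection suggests the tunnel is down.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


@dataclass
class TunnelMonitor:
    """Alert once after several consecutive failed probes, and once on recovery."""

    probe: Callable[[], bool]          # returns True when the tunnel looks healthy
    notify: Callable[[str], None]      # hypothetical hook: page or message the SRE team
    failure_threshold: int = 3         # assumed value; tune to the probe interval
    consecutive_failures: int = 0
    alerted: bool = False

    def check_once(self) -> bool:
        """Run one probe, update state, and notify on state changes."""
        if self.probe():
            if self.alerted:
                self.notify("analytics tunnel recovered")
            self.consecutive_failures = 0
            self.alerted = False
            return True

        self.consecutive_failures += 1
        if self.consecutive_failures >= self.failure_threshold and not self.alerted:
            self.notify(
                f"analytics tunnel unhealthy after "
                f"{self.consecutive_failures} failed checks"
            )
            self.alerted = True
        return False
```

In use, a scheduler would call `check_once()` at a fixed interval, with `probe` bound to something like `lambda: tcp_probe(host, port)` for a placeholder host and port behind the tunnel. Requiring several consecutive failures before notifying trades a slightly slower alert for fewer false alarms from a single dropped probe.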
Posted Jan 12, 2021 - 22:44 EST

Resolved
Connectivity has been restored to analytics. We consider this incident to be resolved.

As always, thank you for your patience.
Posted Jul 22, 2020 - 09:19 EDT
Monitoring
A fix has been implemented to restore connectivity. We are monitoring.
Posted Jul 22, 2020 - 09:05 EDT
Identified
The ServiceChannel SRE team has identified the issue and is implementing a fix to restore connectivity.
Posted Jul 22, 2020 - 09:01 EDT
Investigating
Platform users may experience database connectivity errors when using Analytics (New). We are investigating.
Posted Jul 22, 2020 - 08:57 EDT
This incident affected: Analytics (Analytics Dashboard).