Post Incident Summary
Availability of Flexpa is critically important to our customers. As part of our normal incident response, we have conducted a post-incident summary as to the source of this incident.
On June, we deployed new logging mechanisms in order to more effectively troubleshoot customer issues. The deployment was completed successfully and internal testing showed systems were operationally normal.
On June 6, Flexpa experienced three brief API outages cumulatively lasting 6 minutes. During these outages, Flexpa was completely unavailable as services restarted
At 6:31 PM EST, it was observed that the MyFlexpa application was behaving erratically.
At 7:22 PM EST, a crash of Flexpa applications was observed and noted internally.
At 7:23 PM EST, Flexpa's applications had automatically restarted and recovered.
At 8:05 PM EST, a root cause had been identified and a fix was deployed to update logging to prevent future crashes. We have been operationally normal since this time.
At 8:26 PM EST, an update was applied to ensure future alerting and logging was routed to the appropriate location.
Moving forward from this incident we are:
- Improving testing environment for logging updates to detect issues in advance of production deployment
- Continue investigation into logging libraries Flexpa depends on in order to understand failure more deeply and prevent similar issues in the future.
- Re-affirming our commitment to operational excellence
We have resolved the issue with our API.
We are currently investing an issue with our API.