api.flexpa.com - Operational
link.flexpa.com - Operational
my.flexpa.com - Operational
flexpa.com/docs - Operational
portal.flexpa.com - Operational
Notice history
May 2026
- Postmortem
Summary
On May 7, between approximately 7:51 PM and 8:08 PM ET, Flexpa customers experienced elevated response times and request timeouts on hot-path API endpoints, including the Flexpa OAuth Authorization URL. The root cause was an AWS infrastructure failure in us-east-1, which AWS has acknowledged on its AWS Health Dashboard (AWS event).
Timeline (ET)
7:20 PM - An AWS data center experienced a thermal event in us-east-1 Availability Zone use1-az4, causing instance impairments due to loss of power.
7:36 PM - Internal monitoring detected increased Redis command latency on background job workers.
7:51 PM - Customer-facing API requests began hitting timeouts as the impaired Redis primary became progressively unresponsive. Status page updated to "Monitoring."
8:06 PM - Amazon ElastiCache's automatic Multi-AZ failover completed, promoting a replica in an unaffected Availability Zone to primary.
8:08 PM - Full service recovery confirmed; latency and error rates returned to baseline.
8:25 PM - AWS publicly acknowledged the AZ-level impairment.
What happened
Flexpa runs Redis as a Multi-AZ ElastiCache replication group, with a primary node and replicas distributed across multiple Availability Zones to withstand exactly this kind of failure. Our Redis primary was hosted in use1-az4 when AWS lost power there, and it progressively degraded over ~15 minutes before AWS's automatic failover triggered. During that window, requests that depended on Redis (rate limiting, session state, queue operations) stalled and eventually timed out at the request layer. Once the failover completed and traffic shifted to a replica in a healthy AZ, the service recovered immediately.
Customer impact
Window of degraded service: approximately 17 minutes (7:51 PM – 8:08 PM ET).
Affected: hot-path API endpoints, OAuth authorization redirect, and queued background jobs.
No data was lost. Failed requests are safe to retry.
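Because failed requests are safe to retry, an API client can recover from a window like this one on its own. The following is a minimal sketch of one common approach, capped exponential backoff with full jitter. It is not part of any Flexpa SDK: `TransientError`, `retry_with_backoff`, and all default values are hypothetical names and numbers chosen for illustration, and `call` stands in for whatever request function the client already uses.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical marker for a timeout or 5xx response that is safe to retry."""


def retry_with_backoff(
    call: Callable[[], T],
    max_attempts: int = 5,      # illustrative default, not a recommended value
    base_delay: float = 0.5,    # seconds before the first retry (upper bound)
    max_delay: float = 8.0,     # cap on any single wait
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `call`, retrying on TransientError with capped exponential backoff and full jitter."""
    for attempt in range(max_attempts):
        try:
            return call()
        except TransientError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error to the caller
            # Cap the exponential delay, then pick a random point below it (full jitter)
            delay = random.uniform(0, min(max_delay, base_delay * 2 ** attempt))
            sleep(delay)
    raise ValueError("max_attempts must be at least 1")
```

The jitter spreads retries from many clients over time, so they do not all hit the service at the same instant once it recovers.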
What we're doing
We are reviewing our Redis client retry behavior to shorten the recovery window in future Multi-AZ failovers from ~90 seconds to under 30 seconds.
We are reviewing direct alerting under this scenario so that paging occurs sooner.
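One way to reason about a target such as "under 30 seconds" is to compute the worst-case time a retry policy can spend before its final attempt gives up. The sketch below does that arithmetic for a generic policy with a per-attempt timeout and capped exponential backoff. The function name and every number are hypothetical and do not describe Flexpa's actual Redis client configuration.

```python
def worst_case_window(
    attempt_timeout: float,
    base_delay: float,
    max_delay: float,
    attempts: int,
) -> float:
    """Upper bound, in seconds, on the time spent before the final attempt gives up."""
    total = 0.0
    for attempt in range(attempts):
        total += attempt_timeout  # each attempt may block for its full timeout
        if attempt < attempts - 1:
            # wait before the next attempt, capped at max_delay
            total += min(max_delay, base_delay * 2 ** attempt)
    return total


# Hypothetical policy: 2 s timeout per attempt, waits of 0.5, 1, 2, 4, 4 s between 6 attempts
print(worst_case_window(2.0, 0.5, 4.0, 6))  # 23.5
```

Under these assumed values the timeouts contribute 12 seconds and the waits 11.5 seconds, so the policy stays within a 30-second budget. A longer per-attempt timeout or a higher delay cap would push it past that.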
- Resolved
This incident has been resolved.
- Monitoring
We observed degraded response times and a Redis failover at 00:06 UTC against several hot-path API endpoints including the Flexpa OAuth Authorization URL.
The service is currently stable and we are investigating.
Apr 2026
No notices reported this month
Mar 2026
No notices reported this month

