Resolved
APIs Outage

Started
September 02, 2022 at 1:40 PM
Status
Resolved after 30 minutes

Impact

Major outage
Affected
API
  • Resolved
    September 02, 2022 at 2:10 PM

    We scaled down some post-transaction batch jobs (status sync consumers), which freed up a chunk of DB connections, brought connection usage back to normal, and reduced contention on the tables. The dependency on the legacy EC service was turned off so that it would stop holding up further connections due to upstream latencies.

  • Monitoring
    September 02, 2022 at 1:45 PM

    We have initiated a restart of all application pods across the major services that were affected the most (txn, customer, wallets, and saved payment methods). We are continuously monitoring the key metrics.

  • Identified
    September 02, 2022 at 1:41 PM

    We are seeing an increase in 504s across all the production k8s clusters.

  • Investigating
    September 02, 2022 at 1:40 PM

    We are seeing outages across multiple APIs and are investigating the root cause.
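
As a rough illustration of why the mitigation in the Resolved update frees DB connections, the sketch below applies Little's law (average connections in use = request rate x time each request holds a connection). Every number and name here (POOL_LIMIT, the request rate, the hold times, the batch consumer count) is a hypothetical value chosen for the arithmetic; none of them comes from the incident itself.

```python
# Hypothetical sketch: how upstream latency and batch consumers
# compete for a fixed budget of DB connections.

POOL_LIMIT = 500  # assumed total DB connections available (illustrative)


def avg_connections_in_use(requests_per_second: int, hold_ms: int) -> float:
    """Little's law: average concurrency = arrival rate * holding time."""
    return requests_per_second * hold_ms / 1000


def is_exhausted(api_connections: float, batch_connections: float,
                 limit: int = POOL_LIMIT) -> bool:
    """True when the combined demand exceeds the connection budget."""
    return api_connections + batch_connections > limit


# Normal case: each API request holds a connection for 50 ms.
api_normal = avg_connections_in_use(200, 50)

# Degraded case: a slow upstream call keeps the connection held for 2 s.
api_slow = avg_connections_in_use(200, 2000)

# Assumed connections held by batch consumers before and after scaling down.
batch_before = 150
batch_after = 50

print(api_normal, api_slow)
print(is_exhausted(api_slow, batch_before))   # slow upstream plus full batch load
print(is_exhausted(api_slow, batch_after))    # after scaling down batch consumers
print(is_exhausted(api_normal, batch_after))  # after also removing the slow dependency
```

Under these assumed numbers, the same request rate needs 40 times more connections once hold time grows from 50 ms to 2 s, which is why both reducing batch consumers and removing the slow dependency reduce pressure on the connection budget.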