The root cause was a change to the API, released this morning, that used a database index incorrectly. This slowly drained the database's I/O burst quota until it reached zero at 13:10 PST (21:10 UTC). Request times increased after the deployment, but our alert threshold was not sensitive enough to trigger. We are now tuning our existing alarms and adding new ones so that we can detect these conditions early. We are also going to increase the base I/O throughput of our database and look into performance tests to catch issues like this before they reach production.
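As a rough illustration of the kind of early-warning check described above, the sketch below estimates how long a draining burst balance will last and raises a warning well before it reaches zero. It is a minimal sketch under stated assumptions, not the alerting actually deployed: the function names, the sample format (minutes elapsed, remaining burst balance in percent), and the 120-minute warning window are all hypothetical.

```python
from typing import Optional, Sequence, Tuple

# Hypothetical sample format: (minutes_since_first_sample, burst_balance_percent),
# ordered oldest first. Real metric names and units depend on the database provider.
Sample = Tuple[float, float]


def minutes_until_exhausted(samples: Sequence[Sample]) -> Optional[float]:
    """Estimate minutes from the latest sample until the balance reaches zero.

    Uses a straight line between the oldest and newest sample.
    Returns None when there is too little data or the balance is not draining.
    """
    if len(samples) < 2:
        return None
    (t0, b0), (t1, b1) = samples[0], samples[-1]
    if t1 <= t0:
        return None
    slope = (b1 - b0) / (t1 - t0)  # percent per minute; negative while draining
    if slope >= 0:
        return None
    return b1 / -slope


def should_alert(samples: Sequence[Sample], warn_within_minutes: float = 120.0) -> bool:
    """Alert when the projected time to exhaustion falls inside the warning window."""
    eta = minutes_until_exhausted(samples)
    return eta is not None and eta <= warn_within_minutes
```

For example, a balance that falls from 100% to 80% over an hour projects to roughly 240 more minutes before exhaustion, so `should_alert([(0, 100), (60, 80)])` stays quiet with the default window and fires once the window is widened past that. Alerting on the projected time to zero, rather than on a fixed latency threshold, is one way to catch a slow drain that never crosses a static limit until it is too late.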
Posted Nov 10, 2021 - 23:04 UTC
Monitoring
The fix has been rolled out to our services. Further details will be shared once we are confident that our services are at their normal levels of availability and performance. We sincerely apologize for any inconvenience this outage caused.
Posted Nov 10, 2021 - 22:12 UTC
Identified
We've identified the problem and are currently rolling out a fix. We apologize for any inconvenience.
Posted Nov 10, 2021 - 22:00 UTC
Investigating
We're currently experiencing a major outage of our property API and are actively investigating the issue. We apologize for any inconvenience and will post updates as they become available.