Degraded

API is degraded

Aug 14 at 10:33am PDT
Affected services
api.firecrawl.dev

Resolved
Aug 14 at 02:26pm PDT

We are back. Job timeout metrics have recovered to pre-incident levels.

The issue came down to misconfigured pipeline queue limits on Dragonfly. We had set them to high values in anticipation of heavy production load, but they turned out to be far too high, so Dragonfly's backpressure mechanisms only kicked in once the instance was already practically unsalvageable. The configuration has been tuned, and we will continue to monitor.
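For illustration only, here is a minimal client-side sketch of the same principle seen from the other direction: keeping pipeline queues small so backpressure is felt early rather than only after the instance is overwhelmed. The key name, batch cap, and connection details are hypothetical and this is not Firecrawl's worker code; the actual fix described above was tuning Dragonfly's server-side limits.

```python
import redis

MAX_PIPELINE = 256  # hypothetical cap on queued commands per flush

# Dragonfly speaks the Redis protocol, so a standard redis-py client works.
r = redis.Redis(host="localhost", port=6379)

def enqueue_jobs(payloads):
    """Push scrape jobs onto a queue, flushing the pipeline in small batches."""
    pipe = r.pipeline(transaction=False)
    pending = 0
    for payload in payloads:
        pipe.lpush("scrape:queue", payload)  # illustrative key name
        pending += 1
        if pending >= MAX_PIPELINE:
            pipe.execute()  # flush now, so backpressure surfaces early
            pending = 0
    if pending:
        pipe.execute()
```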

Updated
Aug 14 at 01:48pm PDT

The issue has regressed. We are working on a fix.

Updated
Aug 14 at 01:03pm PDT

This issue was caused by an elevated number of connections overloading the Dragonfly instance that operates the scrape queue. We've applied a fix to reduce the number of extraneous connections, and we are doing further architectural work to keep connection counts down going forward. Apologies for the disruption.
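As a rough sketch of the kind of change this implies (hypothetical values, not the actual Firecrawl fix): sharing one bounded connection pool per worker process, instead of letting each task open its own connection, puts a hard ceiling on how many sockets the scrape-queue instance has to serve.

```python
import redis

# One shared pool per process. max_connections=50 is a hypothetical value;
# the point is that the pool caps how many sockets this worker can open
# against the queue instance, no matter how many tasks are running.
pool = redis.ConnectionPool(host="localhost", port=6379, max_connections=50)

def get_client():
    # Clients built this way reuse sockets from the shared pool rather
    # than opening extraneous connections of their own.
    return redis.Redis(connection_pool=pool)
```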

Updated
Aug 14 at 10:54am PDT

The issue is now resolved.

Updated
Aug 14 at 10:44am PDT

The issue has been mitigated. We are now slowly allowing more workers to come online.

Created
Aug 14 at 10:33am PDT

We are investigating this issue.