API is degraded

Degraded

Aug 13 at 04:10pm PDT

Affected services

api.firecrawl.dev

Resolved
Aug 13 at 04:37pm PDT

The API has recovered.

We restarted the workers to get them unstuck again, then revisited the underlying issue. Many jobs were finishing at the same moment, hitting the same Redis sorted set simultaneously, and running into the same race conditions. To break up these clumps of requests, workers now wait for a random length of time before retrying whenever the anti-race mechanism activates. This spreads the workers' retries out over time instead of letting them collide again.
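
As an illustration of the random-delay idea, here is a minimal TypeScript sketch. It is not our production code: the names `sleep`, `randomDelayMs`, and `retryWithJitter`, the 10-100 ms delay window, and the cap of 20 attempts are all assumptions made for the example.

```typescript
// Illustrative sketch only; every name and default below is hypothetical.

// Resolve after `ms` milliseconds. Awaiting this hands control back to the
// event loop, so other work can run while the worker waits.
const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Pick a uniformly random delay in [minMs, maxMs).
// `rng` is injectable so the function can be exercised deterministically.
function randomDelayMs(
  minMs: number,
  maxMs: number,
  rng: () => number = Math.random,
): number {
  return minMs + rng() * (maxMs - minMs);
}

// Call `attempt` until it reports success. After each failed attempt, wait a
// random amount of time so that workers that collided once are unlikely to
// retry at the same instant. The attempt cap (an assumption, not something
// stated in this report) keeps the loop from running forever.
async function retryWithJitter(
  attempt: () => Promise<boolean>,
  minMs = 10,
  maxMs = 100,
  maxAttempts = 20,
): Promise<boolean> {
  for (let i = 0; i < maxAttempts; i++) {
    if (await attempt()) return true;
    await sleep(randomDelayMs(minMs, maxMs));
  }
  return false;
}
```

In this sketch, `attempt` stands in for whatever operation lost the race (for example, an update to a shared sorted set) and returns `true` once it succeeds. Because each worker draws its own delay, a clump of simultaneous retries is spread across the delay window.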

After applying the fix and monitoring the API, we consider the issue resolved.

Updated
Aug 13 at 04:30pm PDT

The issue has regressed. We are continuing to work towards a solution.

Updated
Aug 13 at 04:26pm PDT

The API has recovered.

The concurrency limit queue was used more heavily than we had anticipated, so race conditions occurred far more often than expected. The anti-race mechanism ended up in a de facto infinite loop, which stopped workers from handling new scrape jobs. We restarted the workers and patched the code that was exhausting the event loop.

Created
Aug 13 at 04:10pm PDT

We are investigating.