CrawlerLimiter

com.jamesward.zio_http_guard.CrawlerLimiter
See the CrawlerLimiter companion object
final case class CrawlerLimiter[K](active: ConcurrentMap[String, Slot[K]])

Per-crawler "one active resource at a time" limiter.

Each known crawler User-Agent gets a single slot. While a slot is held, requests from that crawler are allowed if they target the same resource key (which refreshes the slot) or a path with no key (homepage, static assets, etc.). Requests for a different resource key are denied with 429 Too Many Requests until the slot's last access is older than hold.

The "resource key" is whatever you compute from the request — typically a coarse grouping that maps to an expensive backend operation. Examples:

  • a Maven groupId/artifactId/version triple, when each new triple triggers a fresh archive download + extraction
  • a tenant ID, when each new tenant warms an isolated cache
  • a date bucket, when each bucket spans a separate database shard
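As a rough illustration of the first example, a key function for Maven coordinates might look like the sketch below. The ArtifactKey type, the resourceKey name, and the assumed URL layout (/groupId/artifactId/version/...) are all hypothetical and not part of this library.

```scala
// Hypothetical key type: one key per Maven coordinate triple.
final case class ArtifactKey(groupId: String, artifactId: String, version: String)

// Assumed URL layout: /<groupId>/<artifactId>/<version>/<anything else>.
// Paths with fewer than three segments (homepage, top-level files) yield None,
// meaning the request carries no resource key. A real application would also
// need to exclude deeper non-artifact paths such as asset directories.
def resourceKey(path: String): Option[ArtifactKey] =
  path.split('/').iterator.filter(_.nonEmpty).take(3).toList match
    case g :: a :: v :: Nil => Some(ArtifactKey(g, a, v))
    case _                  => None
```

Every page under the same triple maps to the same key, so a crawler browsing one artifact's pages keeps refreshing a single slot.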

Crawlers that don't legitimately need to walk the whole keyspace in parallel (Googlebot et al.) will simply move on to a different page from the same key while their slot is held; well-behaved crawlers absorb the limit invisibly.
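The per-request decision described above can be sketched as a small pure function. This is an illustrative model only: Decision and decide are hypothetical names, the claim parameter stands in for a call to tryClaim with the hold already applied, and letting requests with no matched crawler token bypass the limiter is an assumption.

```scala
// Hypothetical outcome type for this sketch.
enum Decision:
  case Allow, TooManyRequests

// crawler: the matched crawler token, or None for ordinary visitors
//          (assumed here to bypass the limiter entirely).
// key:     the resource key, or None for keyless paths such as the homepage.
// claim:   stand-in for tryClaim; returns true when the slot is claimed or refreshed.
def decide[K](crawler: Option[String], key: Option[K])(claim: (String, K) => Boolean): Decision =
  (crawler, key) match
    case (Some(c), Some(k)) =>
      if claim(c, k) then Decision.Allow else Decision.TooManyRequests
    case _ =>
      Decision.Allow // no crawler match or no key: nothing to limit
```

Only the combination of a known crawler and a keyed path ever consults the slot; everything else passes through untouched.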

Value parameters

active

map from crawler User-Agent token to its currently held slot.

Attributes

Companion
object
Supertypes
trait Serializable
trait Product
trait Equals
class Object
trait Matchable
class Any

Members list

Value members

Concrete methods

def tryClaim(crawler: String, key: K, hold: Duration)(using CanEqual[K, K]): UIO[Boolean]

Try to claim or refresh crawler's slot for key.

Returns true (allow the request) if:

  • the crawler has no slot yet, or
  • the existing slot is for the same key (refreshes its lastAccess), or
  • the existing slot for a different key has been idle for at least hold (steals the slot).

Returns false (deny) if the crawler currently holds a slot for a different key and that slot is still within its hold window.
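The rules above can be modelled in plain Scala as follows. This is a sketch of the documented behaviour, not the library's implementation: ModelSlot and ModelLimiter are hypothetical names, times are plain millisecond values passed in by the caller rather than a ZIO Duration and clock, and it assumes that a denied request leaves the existing slot's last access unchanged.

```scala
import java.util.concurrent.ConcurrentHashMap

// Hypothetical stand-in for the library's Slot: the held key plus its last-access time.
final case class ModelSlot[K](key: K, lastAccessMillis: Long)

final class ModelLimiter[K]:
  private val active = new ConcurrentHashMap[String, ModelSlot[K]]()

  // nowMillis is passed explicitly so the rules can be exercised without a real clock.
  def tryClaim(crawler: String, key: K, holdMillis: Long, nowMillis: Long): Boolean =
    val fresh = ModelSlot(key, nowMillis)
    // compute runs atomically per map entry, so two concurrent claims
    // for the same crawler cannot both take the slot.
    val stored = active.compute(crawler, (_: String, current: ModelSlot[K]) => {
      val canTake =
        current == null ||                                   // no slot yet
        current.key == key ||                                // same key: refresh
        nowMillis - current.lastAccessMillis >= holdMillis   // idle long enough: steal
      if canTake then fresh else current                     // otherwise keep the old slot
    })
    // The request is allowed exactly when our freshly built slot was stored.
    stored eq fresh
```

For example, with a hold of 1000 ms: a claim for key "a" at t=0 succeeds, a second claim for "a" at t=500 refreshes the slot, a claim for "b" at t=1200 is denied (the slot has been idle for only 700 ms), and a claim for "b" at t=1500 steals the slot.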

Inherited methods

def productElementNames: Iterator[String]

An iterator over the names of all the elements of this product.

Attributes

Inherited from:
Product

def productIterator: Iterator[Any]

An iterator over all the elements of this product.

Attributes

Returns

in the default implementation, an Iterator[Any]

Inherited from:
Product