CrawlerLimiter

com.jamesward.zio_http_guard.CrawlerLimiter

See the CrawlerLimiter companion object

final case class CrawlerLimiter[K](active: ConcurrentMap[String, Slot[K]])

Per-crawler "one active resource at a time" limiter.

Each known crawler User-Agent gets a single slot. While a slot is held, two kinds of requests from that crawler are allowed: requests for the same resource key, which refresh the slot, and requests for paths with no key (homepage, static assets, etc.). Requests for a different resource key are denied with 429 Too Many Requests until the slot's last access is older than hold.

The "resource key" is whatever you compute from the request — typically a coarse grouping that maps to an expensive backend operation. Examples:

a Maven groupId/artifactId/version triple, when each new triple triggers a fresh archive download + extraction
a tenant ID, when each new tenant warms an isolated cache
a date bucket, when each bucket spans a separate database shard
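
As a rough illustration of the first example, a key function might look like the sketch below. It assumes Scala 3 and a URL layout of /groupId/artifactId/version/...; the names MavenKey and resourceKey are hypothetical and not part of this library.

```scala
// Hypothetical key type; the library only requires some key type K.
final case class MavenKey(groupId: String, artifactId: String, version: String)

// Assumed path layout: /groupId/artifactId/version/...
// Returns None for paths with fewer than three segments, which are then
// treated as having no key (for example the homepage).
def resourceKey(path: String): Option[MavenKey] =
  path.split('/').filter(_.nonEmpty).toList match {
    case g :: a :: v :: _ => Some(MavenKey(g, a, v))
    case _                => None
  }
```

Under this assumed layout, every page below the same triple maps to the same key, so a crawler can keep walking one artifact's pages without changing slots. A real application would also need to exclude non-artifact paths that happen to have three or more segments.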

Crawlers that don't legitimately need to walk the whole keyspace in parallel (Googlebot et al.) will simply move on to a different page from the same key while their slot is held; well-behaved crawlers absorb the limit invisibly.

Value parameters

active: map from crawler User-Agent token to its currently held slot.

Attributes

Companion: object
Supertypes: trait Serializable, trait Product, trait Equals, class Object, trait Matchable, class Any

Members list

Value members

Concrete methods

Try to claim or refresh crawler's slot for key.

Returns true (allow the request) if:

the crawler has no slot yet, or
the existing slot is for the same key (refreshes its lastAccess), or
the existing slot for a different key has been idle for at least hold (steals the slot).

Returns false (deny) if the crawler currently holds a slot for a different key and that slot is still within its hold window.
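
The rules above can be sketched as a single atomic update on the map. This is a minimal illustration, not the library's implementation: SketchSlot, tryClaim, and the millisecond parameters are hypothetical names, and the real Slot type and method signature may differ. The comparison follows the "idle for at least hold" wording.

```scala
import java.util.concurrent.ConcurrentHashMap

// Hypothetical slot shape: the held key and the time of its last access.
final case class SketchSlot[K](key: K, lastAccessMillis: Long)

// Hypothetical stand-in for the claim method described above.
def tryClaim[K](
    active: ConcurrentHashMap[String, SketchSlot[K]],
    crawler: String,
    key: K,
    nowMillis: Long,
    holdMillis: Long
): Boolean = {
  var allowed = false
  // compute runs atomically for the given map key, so the check and the
  // update cannot interleave with another request from the same crawler.
  active.compute(crawler, (_, existing) => {
    val free     = existing == null
    val sameKey  = !free && existing.key == key
    val idleLong = !free && nowMillis - existing.lastAccessMillis >= holdMillis
    if (free || sameKey || idleLong) {
      allowed = true
      SketchSlot(key, nowMillis) // claim, refresh, or steal the slot
    } else {
      existing // different key still within its hold window: deny
    }
  })
  allowed
}
```

With a hold of 1000 ms, a crawler that claims key "a" at time 0 is denied key "b" at time 500, may refresh "a" at time 900, and can take over the slot for "b" at time 1900, once the slot has been idle for the full hold.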

Inherited methods

def productElementNames: Iterator[String]

An iterator over the names of all the elements of this product.

Attributes

Inherited from: Product

def productIterator: Iterator[Any]

An iterator over all the elements of this product.

Attributes

Returns: in the default implementation, an Iterator[Any]
Inherited from: Product
