Reference

Limits and quotas

Consolidated overview of every limit on the knowmind platform. Values as of 11 June 2026 — changes are tracked in the changelog.

Scope	Limit	Source
Memories, Private plan	2,500	Plan matrix
Memories, Pro and above	unlimited	Plan matrix
Memory soft warning	from 80% of the plan limit (warnAt80 in stats and store responses)	Quota gate
API rate limit, Private plan	30 requests per minute per token	Plan matrix
API rate limit, Pro plan	60 requests per minute per token	Plan matrix
API rate limit, Team plan	120 requests per minute per token	Plan matrix
API rate limit, Business plan	120 requests per minute per token	Plan matrix
API rate limit, Enterprise	600 requests per minute per token	Plan matrix
Maximum document size (upload)	10 MB	Memory service
Maximum memory size (store_memory)	100 KB	Memory service
Recall — number of hits (k)	1 to 25	MCP tool schema
Recall — graph hops	0 to 3	MCP tool schema
Chunk size on upload	about 500 tokens with 50 tokens overlap	Recall pipeline
Embedding dimension	1024 (multilingual-e5-large)	Recall pipeline
Webhook body size	1 MB	Webhook worker
Webhook timeout	15 seconds	Webhook worker
Webhook retries	6 attempts (immediate, 1 min, 5 min, 30 min, 2 h, 12 h)	Webhook worker
Dead-letter threshold	Subscription disabled after 20 consecutive failures	Webhook worker
Audit log retention, Private	30 days	Plan matrix
Audit log retention, Pro	90 days	Plan matrix
Audit log retention, Team	12 months (365 days)	Plan matrix
Audit log retention, Business	24 months (730 days)	Plan matrix
Audit log retention, Enterprise	5 years (1825 days)	Plan matrix
OAuth code validity	10 minutes	OAuth server
Magic link validity	24 hours	Auth service
API token format	kmt_ + 43 base64url characters (256 bits of entropy)	Token module
Maximum tokens per workspace	50 active (more on request for Enterprise)	Token module
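
Several of the limits above can be checked on the client before a request is sent, which avoids a round trip that would end in a rejection. The following Python sketch is illustrative only and not part of any knowmind SDK: the helper names are hypothetical, and it assumes that sizes are measured in UTF-8 bytes and that 1 KB means 1024 bytes, neither of which the table states.

```python
import re

# Limits copied from the table above. Assumption: 1 KB = 1024 bytes and
# 1 MB = 1024 * 1024 bytes; the table does not say which convention applies.
MAX_MEMORY_BYTES = 100 * 1024           # store_memory
MAX_DOCUMENT_BYTES = 10 * 1024 * 1024   # upload

# "kmt_" followed by exactly 43 base64url characters (A-Z, a-z, 0-9, "-", "_").
TOKEN_PATTERN = re.compile(r"kmt_[A-Za-z0-9_-]{43}")


def looks_like_api_token(token: str) -> bool:
    """Check the documented token shape only; this does not prove the token is valid."""
    return TOKEN_PATTERN.fullmatch(token) is not None


def fits_memory_limit(text: str) -> bool:
    """Assumption: the 100 KB limit applies to the UTF-8 encoded size of the text."""
    return len(text.encode("utf-8")) <= MAX_MEMORY_BYTES


def fits_document_limit(payload: bytes) -> bool:
    """Check a raw upload payload against the 10 MB document limit."""
    return len(payload) <= MAX_DOCUMENT_BYTES


def valid_recall_params(k: int, hops: int) -> bool:
    """Recall accepts 1 to 25 hits and 0 to 3 graph hops."""
    return 1 <= k <= 25 and 0 <= hops <= 3
```

A passing check here only means the request is not obviously out of bounds; the server remains the authority on what it accepts.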

Behaviour on exceedance

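Rate limit (429): the response carries a Retry-After header giving the wait time in seconds. Clients should wait at least that long before retrying, ideally with some added jitter so that parallel clients do not retry at the same moment.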
Memory limit — soft warning from 80%: knowmind_stats and the success responses of store_memory/upload_document carry the fields memoryLimit, memoriesUsed and warnAt80 plus a plain-text hint. The count covers the stored documents/memories of the workspace — NOT the auto-extracted graph nodes.
Memory limit reached (402): the write call is rejected; the response carries plan_upgrade_required and indicates the required plan (REST: HTTP 402; MCP: a JSON-RPC error with the same message). Existing content and all read paths (recall, export) stay fully available.
Document too large (413): the call is rejected. Split the content and ingest in multiple calls.
Webhook dead letter: a delivery that still fails after the last retry attempt is marked dead. After 20 consecutive failures the whole subscription is disabled and must be reactivated manually in the dashboard.
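
As a rough illustration of the 429 handling described above, the following Python sketch computes how long a client might wait before retrying. The function name, the one-second fallback and the jitter range are assumptions made for this example, not documented platform behaviour. Only 429 is treated as retryable, since 402 and 413 need a plan change or smaller content rather than a repeat of the same call.

```python
import random
from typing import Mapping, Optional


def retry_delay_seconds(
    status: int,
    headers: Mapping[str, str],
    max_jitter: float = 1.0,
) -> Optional[float]:
    """Return the wait before a retry, or None if retrying will not help.

    `headers` is a plain mapping here; most HTTP libraries expose a
    case-insensitive one, which is what a real client should pass in.
    """
    if status != 429:
        # 402 (plan limit) and 413 (too large) fail again if repeated unchanged.
        return None
    try:
        base = int(headers.get("Retry-After", ""))
    except ValueError:
        base = 1  # hypothetical fallback when the header is missing or malformed
    base = max(base, 0)
    # Random jitter spreads out clients that were throttled at the same time.
    return base + random.uniform(0.0, max_jitter)
```

A caller would sleep for the returned number of seconds and then repeat the request, giving up after a bounded number of attempts.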

Related