AI Search - Control AI Search similarity cache freshness

AI Search now gives you more control over similarity cache freshness. Similarity cache helps reduce latency and inference cost by reusing responses for semantically similar queries. With these updates, you can choose how long responses are eligible for reuse and clear cached responses when they may be stale.

Cache duration now defaults to 48 hours

Previously, AI Search cached responses for a fixed duration of 30 days. Cached responses now use the instance's cache_ttl setting, and the default is 48 hours.

You can set cache_ttl when creating or updating an instance to choose a cache duration from 10 minutes to 6 days. Use a shorter TTL when your source content changes frequently and freshness is more important. Use a longer TTL when your content is stable and you want more cache reuse.

For example, set cache_ttl to 518400 to retain cached responses for 6 days:

    { "cache_ttl": 518400 }

Purge cached responses

You can also purge all cached responses for an instance on demand. Purging cached responses does not delete indexed content or source files. It prevents AI Search from reusing previous cached responses, so subsequent similar queries generate fresh answers and repopulate the cache.

    curl -X POST "$ACCOUNT_ID/ai-search/instances/$INSTANCE_NAME/purge_cache" \
      -H "Authorization: Bearer $CLOUDFLARE_API_TOKEN"

You can also purge cached responses from the instance settings page in the Cloudflare dashboard.

Refer to similarity cache for the full list of supported cache_ttl values and more details about cache behavior.
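
The 518400 figure above is 6 days expressed in seconds. As a minimal sketch of that arithmetic, the Python helper below converts a duration into a cache_ttl value and builds the JSON body shown earlier. The function name cache_ttl_payload and the constants are hypothetical and exist only for illustration. The range check assumes any value between 10 minutes and 6 days is accepted; the service may in practice accept only a fixed list of values, so check the similarity cache documentation before relying on it.

```python
import json

# Bounds taken from the text above, expressed in seconds.
MIN_TTL_SECONDS = 10 * 60            # 10 minutes
MAX_TTL_SECONDS = 6 * 24 * 60 * 60   # 6 days
DEFAULT_TTL_SECONDS = 48 * 60 * 60   # 48 hours (the new default)


def cache_ttl_payload(days=0, hours=0, minutes=0):
    """Return a JSON body with cache_ttl in seconds (hypothetical helper).

    Assumption: every value within the 10-minute to 6-day range is valid.
    The real API may restrict cache_ttl to a discrete set of values.
    """
    ttl = days * 86400 + hours * 3600 + minutes * 60
    if not MIN_TTL_SECONDS <= ttl <= MAX_TTL_SECONDS:
        raise ValueError(
            f"cache_ttl {ttl}s is outside {MIN_TTL_SECONDS}..{MAX_TTL_SECONDS}s"
        )
    return json.dumps({"cache_ttl": ttl})


if __name__ == "__main__":
    # 6 days of cache retention, matching the example in the text.
    print(cache_ttl_payload(days=6))
```

Keeping the bounds as named constants makes it easy to swap the range check for a lookup against the documented list of supported values if the API turns out to be stricter.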
