AI Search - Control AI Search similarity cache freshness

11d ago

Source: Cloudflare (cloudflare.com)

AI Search now gives you more control over similarity cache freshness. Similarity cache helps reduce latency and inference cost by reusing responses for semantically similar queries. With these updates, you can choose how long responses are eligible for reuse and clear cached responses when they may be stale.

Cache duration now defaults to 48 hours

Previously, AI Search cached responses for a fixed duration of 30 days. Cached responses now use the instance's cache_ttl setting, and the default is 48 hours. You can set cache_ttl when creating or updating an instance to choose a cache duration from 10 minutes to 6 days.

Use a shorter TTL when your source content changes frequently and freshness is more important. Use a longer TTL when your content is stable and you want more cache reuse.

For example, set cache_ttl to 518400 to retain cached responses for 6 days:

    { "cache_ttl": 518400 }

Purge cached responses

You can also purge all cached responses for an instance on demand. Purging cached responses does not delete indexed content or source files. It prevents AI Search from reusing previous cached responses, so subsequent similar queries generate fresh answers and repopulate the cache.

    curl -X POST "$ACCOUNT_ID/ai-search/instances/$INSTANCE_NAME/purge_cache" \
      -H "Authorization: Bearer $CLOUDFLARE_API_TOKEN"

You can also purge cached responses from the instance settings page in the Cloudflare dashboard.

Refer to similarity cache for the full list of supported cache_ttl values and more details about cache behavior.
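
The TTL arithmetic and the purge call described above can be sketched in Python. This is a minimal illustration under stated assumptions, not Cloudflare's client code. The helper names cache_ttl_body and build_purge_request are hypothetical. The bounds (10 minutes, 6 days, 48-hour default) come from the announcement, but only the range is checked here; the set of supported values may be narrower. The purge URL in the snippet appears without its host, so api_prefix is a hypothetical parameter the caller must supply. The request is built but never sent.

```python
import json
import urllib.request

# Bounds taken from the announcement: 10 minutes to 6 days, default 48 hours.
MIN_TTL_SECONDS = 10 * 60            # 600
MAX_TTL_SECONDS = 6 * 24 * 60 * 60   # 518400
DEFAULT_TTL_SECONDS = 48 * 60 * 60   # 172800


def cache_ttl_body(ttl_seconds: int = DEFAULT_TTL_SECONDS) -> str:
    """Return a JSON body that sets cache_ttl, after a simple range check.

    Assumption: only the announced range is enforced. The similarity cache
    documentation lists the exact supported values, which may be a smaller
    set than every integer in this range.
    """
    if not MIN_TTL_SECONDS <= ttl_seconds <= MAX_TTL_SECONDS:
        raise ValueError(
            f"cache_ttl must be between {MIN_TTL_SECONDS} and "
            f"{MAX_TTL_SECONDS} seconds, got {ttl_seconds}"
        )
    return json.dumps({"cache_ttl": ttl_seconds})


def build_purge_request(
    api_prefix: str, account_id: str, instance_name: str, api_token: str
) -> urllib.request.Request:
    """Build (but do not send) the POST request that purges cached responses.

    Assumption: api_prefix is whatever precedes the account ID in the real
    endpoint; the announcement snippet does not show it, so it is not
    guessed here.
    """
    url = (
        f"{api_prefix}{account_id}/ai-search/instances/"
        f"{instance_name}/purge_cache"
    )
    return urllib.request.Request(
        url,
        method="POST",
        headers={"Authorization": f"Bearer {api_token}"},
    )
```

Calling cache_ttl_body(518400) produces the same JSON as the 6-day example in the announcement, and passing a value such as 60 raises ValueError before any request is made. To actually purge, the Request returned by build_purge_request would be passed to urllib.request.urlopen with a real prefix and token.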
