R2 - R2 Data Catalog now supports automatic snapshot expiration
6mo ago
Source
Cloudflare (cloudflare.com)

R2 Data Catalog now supports automatic snapshot expiration for Apache Iceberg tables.

In Apache Iceberg, a snapshot is metadata that represents the state of a table at a given point in time. Every mutation creates a new snapshot. Snapshots enable powerful features like time travel queries and rollback capabilities, but they accumulate over time. Without regular cleanup, these accumulated snapshots can lead to:

- Metadata overhead
- Slower table operations
- Increased storage costs

Snapshot expiration in R2 Data Catalog automatically removes old table snapshots based on your configured retention policy, improving performance and reducing storage costs.

    # Enable catalog-level snapshot expiration
    # Expire snapshots older than 7 days, always retain at least 10 recent snapshots
    npx wrangler r2 bucket catalog snapshot-expiration enable my-bucket \
      --older-than-days 7 \
      --retain-last 10

Snapshot expiration uses two parameters to determine which snapshots to remove:

- --older-than-days: age threshold in days
- --retain-last: minimum snapshot count to retain

Both conditions must be met before a snapshot is expired: it must be older than the age threshold and it must fall outside the most recent snapshots counted by --retain-last. This ensures you always retain recent snapshots even if they exceed the age threshold.

This feature complements automatic compaction, which optimizes query performance by combining small data files into larger ones. Together, these automatic maintenance operations keep your Iceberg tables performant and cost-efficient without manual intervention.

To learn more about snapshot expiration and how to configure it, visit our table maintenance documentation or see how to manage catalogs.
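As a rough illustration of how the two parameters interact, the sketch below models the selection rule described above in Python. It is not Cloudflare's implementation: the `Snapshot` type and the `select_expired` function are hypothetical names invented for this example, and it assumes that "retain last" means the N most recently committed snapshots. It also ignores anything else a real catalog might take into account.

```python
from datetime import datetime, timedelta, timezone
from typing import NamedTuple, Optional


class Snapshot(NamedTuple):
    """Hypothetical, simplified stand-in for an Iceberg snapshot entry."""
    snapshot_id: int
    committed_at: datetime


def select_expired(
    snapshots: list[Snapshot],
    older_than_days: int,
    retain_last: int,
    now: Optional[datetime] = None,
) -> list[Snapshot]:
    """Return snapshots that satisfy BOTH expiration conditions.

    A snapshot is selected only if it is older than the age threshold
    AND it is not among the `retain_last` most recent snapshots.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=older_than_days)

    # Sort newest first so the first `retain_last` entries are protected.
    newest_first = sorted(snapshots, key=lambda s: s.committed_at, reverse=True)
    protected = {s.snapshot_id for s in newest_first[:retain_last]}

    return [
        s for s in newest_first
        if s.committed_at < cutoff and s.snapshot_id not in protected
    ]


# Example: one snapshot per day for the last 12 days (id 1 is the newest).
now = datetime(2025, 1, 20, tzinfo=timezone.utc)
history = [Snapshot(i, now - timedelta(days=i)) for i in range(1, 13)]

expired = select_expired(history, older_than_days=7, retain_last=10, now=now)
print([s.snapshot_id for s in expired])  # prints [11, 12]
```

In this example, snapshots 8 through 12 are older than 7 days, but snapshots 8, 9, and 10 are still among the 10 most recent, so only 11 and 12 are selected. This mirrors the statement that recent snapshots are retained even when they exceed the age threshold.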