R2 - R2 Data Catalog snapshot expiration now removes unreferenced data files
2mo ago
Source
Cloudflare · cloudflare.com

R2 Data Catalog, a managed Apache Iceberg catalog built into R2, now removes unreferenced data files during automatic snapshot expiration. This improvement reduces storage costs and eliminates the need to run manual maintenance jobs to reclaim space from deleted data.

Previously, snapshot expiration only cleaned up Iceberg metadata files such as manifests and manifest lists. Data files that were no longer referenced by active snapshots remained in R2 storage until you manually ran remove_orphan_files or expire_snapshots through an engine like Spark. This required extra operational overhead and left stale data files consuming storage.

Snapshot expiration now handles both metadata and data file cleanup automatically. When a snapshot is expired, any data files that are no longer referenced by retained snapshots are removed from R2 storage.

    # Enable catalog-level snapshot expiration
    npx wrangler r2 bucket catalog snapshot-expiration enable my-bucket \
      --older-than-days 7 \
      --retain-last 10

To learn more about snapshot expiration and other automatic maintenance operations, refer to the table maintenance documentation.
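The cleanup rule described above (a data file is removed only when no retained snapshot still references it) can be illustrated with a small model. This is a minimal sketch, not Cloudflare's implementation: the `Snapshot` class and `plan_expiration` function are hypothetical names, and the way the two settings interact (a snapshot expires only if it is older than the cutoff and is not among the most recent `retain_last` snapshots) is an assumption modelled on how Iceberg's expire_snapshots procedure treats its age and retain-last options.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical, simplified stand-in for an Iceberg snapshot."""
    snapshot_id: int
    committed_at: datetime
    data_files: frozenset[str]  # paths of the data files this snapshot references


def plan_expiration(snapshots, now, older_than_days=7, retain_last=10):
    """Return (expired_snapshot_ids, removable_data_files).

    Assumption: a snapshot expires only if it is older than the cutoff AND
    it is not among the `retain_last` most recent snapshots.
    """
    cutoff = now - timedelta(days=older_than_days)
    newest_first = sorted(snapshots, key=lambda s: s.committed_at, reverse=True)

    retained, expired = [], []
    for position, snap in enumerate(newest_first):
        if position < retain_last or snap.committed_at >= cutoff:
            retained.append(snap)
        else:
            expired.append(snap)

    # Files referenced by any retained snapshot must stay in storage.
    still_referenced = set().union(*[s.data_files for s in retained])
    # Files referenced only by expired snapshots are safe to remove.
    candidates = set().union(*[s.data_files for s in expired])
    removable = candidates - still_referenced

    return sorted(s.snapshot_id for s in expired), removable


# Illustrative data only: three snapshots that share some files.
now = datetime(2025, 1, 31, tzinfo=timezone.utc)
snapshots = [
    Snapshot(1, now - timedelta(days=30), frozenset({"a.parquet", "b.parquet"})),
    Snapshot(2, now - timedelta(days=20), frozenset({"b.parquet", "c.parquet"})),
    Snapshot(3, now - timedelta(days=1), frozenset({"c.parquet", "d.parquet"})),
]

expired, removable = plan_expiration(snapshots, now, older_than_days=7, retain_last=1)
print(expired)            # [1, 2]
print(sorted(removable))  # ['a.parquet', 'b.parquet']
```

In this example `c.parquet` survives even though expired snapshot 2 referenced it, because retained snapshot 3 still does. With the values from the command above (`--older-than-days 7`, `--retain-last 10`), the same three snapshots would all be retained and nothing would be removed.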