Pipelines - Pipelines and R2 Data Catalog now supported in Terraform
2mo ago
Source: Cloudflare, cloudflare.com

Cloudflare Pipelines ingests streaming data via Workers or HTTP endpoints, transforms it with SQL, and writes it to R2 as Apache Iceberg tables. R2 Data Catalog manages those Iceberg tables, compaction, and compatibility with query engines like R2 SQL, Spark, and DuckDB.

You can now create and manage both products using Terraform, supported in the Cloudflare Terraform provider v5.19.0. This adds four new resources that let you define your entire data pipeline as infrastructure-as-code: a data catalog, a stream for ingestion, a sink that writes to R2 Data Catalog or R2, and a pipeline that connects them with SQL.

The new Terraform resources are:

- cloudflare_r2_data_catalog: enable the data catalog on an R2 bucket
- cloudflare_pipeline_stream: create a stream that receives events via HTTP or Worker bindings
- cloudflare_pipeline_sink: create a sink that writes to R2 Data Catalog or R2
- cloudflare_pipeline: create a pipeline with SQL connecting a stream to a sink

Here is a minimal example that creates a stream, an R2 Data Catalog sink, and a pipeline:

    resource "cloudflare_pipeline_stream" "my_stream" {
      account_id = var.cloudflare_account_id
      name       = "my_stream"
      format     = { type = "json" }
      schema = {
        fields = [
          {
            name     = "value"
            type     = "json"
            required = true
          }
        ]
      }
      http = {
        enabled        = true
        authentication = false
        cors           = {}
      }
      worker_binding = { enabled = false }
    }

    resource "cloudflare_pipeline_sink" "my_sink" {
      account_id = var.cloudflare_account_id
      name       = "my_sink"
      type       = "r2_data_catalog"
      format     = { type = "parquet" }
      schema     = { fields = [] }
      config = {
        account_id = var.cloudflare_account_id
        bucket     = "my-pipeline-bucket"
        table_name = "my_table"
        token      = var.catalog_token
      }
    }

    resource "cloudflare_pipeline" "my_pipeline" {
      account_id = var.cloudflare_account_id
      name       = "my_pipeline"
      sql        = "INSERT INTO ${cloudflare_pipeline_sink.my_sink.name} SELECT * FROM ${cloudflare_pipeline_stream.my_stream.name}"
    }

For a full end-to-end example that includes R2 bucket creation, data catalog setup, and scoped API token provisioning, refer to the Pipelines Terraform documentation.
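The minimal example refers to var.cloudflare_account_id and var.catalog_token without declaring them, and it only works on a provider release that includes the new resources. The supporting declarations might look like the sketch below. The post does not show these blocks, so the variable descriptions, the choice to mark the token as sensitive, and the ">= 5.19.0" constraint (derived from the version named in the announcement) are assumptions, not Cloudflare's published configuration.

```hcl
# Assumed supporting declarations for the minimal example above.
terraform {
  required_providers {
    cloudflare = {
      source = "cloudflare/cloudflare"
      # The announcement names v5.19.0 as the release adding these resources;
      # the open-ended constraint is an assumption and can be tightened.
      version = ">= 5.19.0"
    }
  }
}

# Account that owns the stream, sink, and pipeline.
variable "cloudflare_account_id" {
  description = "Cloudflare account ID used by the pipeline resources"
  type        = string
}

# Token the sink uses to write to R2 Data Catalog.
# Marked sensitive so Terraform redacts it in plan and apply output.
variable "catalog_token" {
  description = "API token for the R2 Data Catalog sink"
  type        = string
  sensitive   = true
}
```

Values for both variables can be supplied through a .tfvars file or TF_VAR_-prefixed environment variables, which keeps the token out of version-controlled configuration. Note that sensitive only redacts CLI output; the value is still written to the Terraform state.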