
Pipelines - Pipelines and R2 Data Catalog now supported in Terraform

2mo ago

Source: Cloudflare (cloudflare.com)
Cloudflare Pipelines ingests streaming data via Workers or HTTP endpoints, transforms it with SQL, and writes it to R2 as Apache Iceberg tables. R2 Data Catalog manages those Iceberg tables, compaction, and compatibility with query engines like R2 SQL, Spark, and DuckDB.

You can now create and manage both products using Terraform, supported in the Cloudflare Terraform provider v5.19.0. This adds four new resources that let you define your entire data pipeline as infrastructure-as-code: a data catalog, a stream for ingestion, a sink that writes to R2 Data Catalog or R2, and a pipeline that connects them with SQL.

The new Terraform resources are:

- cloudflare_r2_data_catalog: enable the data catalog on an R2 bucket
- cloudflare_pipeline_stream: create a stream that receives events via HTTP or Worker bindings
- cloudflare_pipeline_sink: create a sink that writes to R2 Data Catalog or R2
- cloudflare_pipeline: create a pipeline with SQL connecting a stream to a sink

Here is a minimal example that creates a stream, an R2 Data Catalog sink, and a pipeline:

```hcl
resource "cloudflare_pipeline_stream" "my_stream" {
  account_id = var.cloudflare_account_id
  name       = "my_stream"

  format = {
    type = "json"
  }

  schema = {
    fields = [
      {
        name     = "value"
        type     = "json"
        required = true
      }
    ]
  }

  http = {
    enabled        = true
    authentication = false
    cors           = {}
  }

  worker_binding = {
    enabled = false
  }
}

resource "cloudflare_pipeline_sink" "my_sink" {
  account_id = var.cloudflare_account_id
  name       = "my_sink"
  type       = "r2_data_catalog"

  format = {
    type = "parquet"
  }

  schema = {
    fields = []
  }

  config = {
    account_id = var.cloudflare_account_id
    bucket     = "my-pipeline-bucket"
    table_name = "my_table"
    token      = var.catalog_token
  }
}

resource "cloudflare_pipeline" "my_pipeline" {
  account_id = var.cloudflare_account_id
  name       = "my_pipeline"
  sql        = "INSERT INTO ${cloudflare_pipeline_sink.my_sink.name} SELECT * FROM ${cloudflare_pipeline_stream.my_stream.name}"
}
```

For a full end-to-end example that includes R2 bucket creation, data catalog setup, and scoped API token provisioning, refer to the Pipelines Terraform documentation.
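The example above references var.cloudflare_account_id and var.catalog_token without declaring them, and it relies on provider v5.19.0 or later. A minimal sketch of the supporting configuration might look like the following. The variable descriptions and the exact version constraint are assumptions for illustration, not taken from the Cloudflare documentation, and provider authentication is left out.

```hcl
terraform {
  required_providers {
    cloudflare = {
      source = "cloudflare/cloudflare"
      # Assumed constraint: any release from 5.19.0 onward,
      # the version the announcement names as adding these resources.
      version = ">= 5.19.0"
    }
  }
}

# Account that owns the stream, sink, and pipeline.
variable "cloudflare_account_id" {
  type        = string
  description = "Cloudflare account ID (illustrative description)"
}

# Token the sink uses to write to R2 Data Catalog.
# Marked sensitive so Terraform redacts it in plan and apply output.
variable "catalog_token" {
  type        = string
  sensitive   = true
  description = "API token for R2 Data Catalog access (illustrative description)"
}
```

Values for both variables can be supplied through a .tfvars file or TF_VAR_-prefixed environment variables. Marking the token as sensitive hides it from CLI output but does not keep it out of the state file, so the state should be stored accordingly.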
