All Topics
All Topics
Technology
Technology
AI
AI
Business
Business
Entertainment
Entertainment
News
News
Programming
Programming
Security
Security
Science
Science
Design
Design
Environment
Environment
Finance
Finance
Crypto
Crypto
Politics
Politics
Sports
Sports
Education
Education
Gaming
Gaming
Art
Art
Music
Music
Health
Health
Books
Books
Food
Food
Travel
Travel
Personal
Personal
Bluesky
Twitter

R2 SQL - R2 SQL now supports approximate aggregation functions

4mo ago

Source

CloudflareR2 SQL - R2 SQL now supports approximate aggregation functionscloudflare.com
Snippet from the RSS feed
R2 SQL now supports five approximate aggregation functions for fast analysis of large datasets. These functions trade minor precision for improved performance on high-cardinality data. New functions APPROX_PERCENTILE_CONT(column, percentile) — Returns the approximate value at a given percentile (0.0 to 1.0). Works on integer and decimal columns. APPROX_PERCENTILE_CONT_WITH_WEIGHT(column, weight, percentile) — Weighted percentile calculation where each row contributes proportionally to its weight column value. APPROX_MEDIAN(column) — Returns the approximate median. Equivalent to APPROX_PERCENTILE_CONT(column, 0.5) . APPROX_DISTINCT(column) — Returns the approximate number of distinct values. Works on any column type. APPROX_TOP_K(column, k) — Returns the k most frequent values with their counts as a JSON array. All functions support WHERE filters. All except APPROX_TOP_K support GROUP BY . Examples -- Percentile analysis on revenue data SELECT approx_percentile_cont (total_amount, 0 . 25 ), approx_percentile_cont (total_amount, 0 . 5 ), approx_percentile_cont (total_amount, 0 . 75 ) FROM my_namespace.sales_data -- Median per department SELECT department, approx_median(total_amount) FROM my_namespace.sales_data GROUP BY department -- Approximate distinct customers by region SELECT region, approx_distinct(customer_id) FROM my_namespace.sales_data GROUP BY region -- Top 5 most frequent departments SELECT approx_top_k(department, 5 ) FROM my_namespace.sales_data -- Combine approximate and standard aggregations SELECT COUNT ( * ), AVG (total_amount), approx_percentile_cont (total_amount, 0 . 5 ), approx_distinct(customer_id) FROM my_namespace.sales_data WHERE region = 'North' For the full syntax and additional examples, refer to the SQL reference .

You might also wanna read

Comments

Sign in to join the conversation.

No comments yet. Be the first.