DuckDB Extension for Filtered HNSW Vector Search with ACORN-1 and Binary Quantization
By cigrainger
Summary
This article describes duckdb-hnsw-acorn, a DuckDB extension for vector similarity search and a fork of duckdb/duckdb-vss. It addresses two limitations of the upstream extension. First, filtered search is broken: WHERE clauses are applied after the HNSW index returns its results, so a filtered query with LIMIT 10 often returns fewer than 10 rows. Second, there is no vector compression: every vector is stored as full F32, so index memory scales linearly with dimensions. The fork implements filtered HNSW search with ACORN-1 pre-filtering and adds RaBitQ binary quantization, which reduces index memory for high-dimensional vectors.
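The difference between filtering after and before the nearest-neighbor step can be seen in a minimal Python sketch. It is an illustration only: a brute-force scan stands in for the HNSW index, the table, column names, and data are hypothetical, and it does not reproduce ACORN-1's graph traversal or the extension's code.

```python
import random

def top_k(query, rows, k):
    """Return the k rows closest to query by squared Euclidean distance."""
    def dist(row):
        return sum((a - b) ** 2 for a, b in zip(query, row["vec"]))
    return sorted(rows, key=dist)[:k]

random.seed(0)
# Hypothetical table: 1000 rows, exactly 100 of them in category "X".
rows = [
    {
        "id": i,
        "category": "X" if i % 10 == 0 else "Y",
        "vec": [random.random() for _ in range(8)],
    }
    for i in range(1000)
]
query = [0.5] * 8
k = 10

# Post-filtering (the upstream behavior described above):
# take the k nearest rows overall, then apply the WHERE clause.
post = [r for r in top_k(query, rows, k) if r["category"] == "X"]

# Pre-filtering: restrict candidates to matching rows first,
# then take the k nearest among them.
pre = top_k(query, [r for r in rows if r["category"] == "X"], k)

print(len(post), len(pre))
```

Post-filtering can only keep whichever of the k overall nearest rows happen to match the filter, so it may return fewer than k rows, while pre-filtering returns k rows whenever at least k rows match.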
Key quotes
The upstream duckdb-vss extension has two limitations: Filtered search is broken. WHERE clauses are applied after the HNSW index returns results, so SELECT ... WHERE category = 'X' ORDER BY distance LIMIT 10 often returns fewer than 10 rows.
No vector compression. Every vector is stored as full F32, so the index memory scales linearly with dimensions.
This fork fixes both: implements proper filtered HNSW search with ACORN-1 pre-filtering and adds RaBitQ binary quantization for vector compression.
A DuckDB extension for vector similarity search with filtered HNSW (ACORN-1) and RaBitQ binary quantization.
Fork of duckdb/duckdb-vss.
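The memory argument behind the compression quote can be worked through with a short Python sketch. It assumes a plain one-bit-per-dimension sign code compared by Hamming distance, which is only the basic idea of binary quantization. RaBitQ itself does more than this and the sketch omits it; the 768-dimension figure and the helper names are hypothetical and not taken from the extension.

```python
def f32_bytes(dims):
    """Bytes needed to store one vector as 32-bit floats."""
    return dims * 4

def binary_bytes(dims):
    """Bytes needed to store one bit per dimension, rounded up."""
    return (dims + 7) // 8

def sign_bits(vec):
    """Pack one sign bit per dimension into an int (bit set if component > 0)."""
    code = 0
    for i, x in enumerate(vec):
        if x > 0:
            code |= 1 << i
    return code

def hamming(a, b):
    """Number of bit positions in which two codes differ."""
    return bin(a ^ b).count("1")

dims = 768  # hypothetical embedding size
print(f32_bytes(dims), binary_bytes(dims))  # 3072 96

a = sign_bits([0.3, -1.2, 0.8, -0.1])
b = sign_bits([0.5, 0.9, 0.7, -0.4])
print(hamming(a, b))  # 1
```

Under these assumptions the raw codes are 32 times smaller than F32 storage, at the cost of comparing vectors by an approximate distance.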
