Supercharge Your Vector Search in AlloyDB: Introducing the acceleration of ScaNN with Columnar Engine

We are excited to announce the Public Preview of ScaNN Vector Search Acceleration powered by AlloyDB’s Columnar Engine. This industry-first feature leverages the columnar engine to accelerate index scans, significantly boosting vector search performance. As modern database systems continue to evolve their capabilities for handling complex data, this innovation seamlessly merges two crucial advancements: the Columnar Engine and Vector Indexes.

The Power Duo: Columnar Engine and Vector Indexes

Before delving into this powerful synergy, let’s briefly revisit these two technologies:

  • Columnar Engine: This engine accelerates analytical queries by storing subsets of tables and columns in a column-major format. This approach minimizes data access, optimizes compression, and enables vectorized execution, leading to substantial performance gains.

  • Vector Index: Essential for semantic search, a Vector Index allows your database to swiftly locate data points that are conceptually similar to your query. Typically structured as a multi-level tree, it organizes vectors in n-dimensional space, linking them through centroid vectors. The ScaNN index supports two popular quantization methods: SQ8 (8-bit scalar quantization) and AH (asymmetric hashing), with AH offering 4x greater compression than SQ8.

Together, these technologies lay the groundwork for intelligent data retrieval.
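
To make the quantization trade-off above more concrete, here is a minimal Python sketch of 8-bit scalar quantization and the resulting storage arithmetic. It is a generic per-vector min/max quantizer written for illustration, not AlloyDB's or ScaNN's actual implementation. The function names and the sample vector are hypothetical, and the AH size is simply derived from the 4x figure quoted above rather than measured.

```python
from array import array

def sq8_encode(vec):
    """Map each float to an 8-bit code using per-vector min/max scaling.

    Illustrative only: real indexes may choose ranges differently.
    """
    lo, hi = min(vec), max(vec)
    scale = (hi - lo) / 255 or 1.0  # fall back to 1.0 for constant vectors
    codes = array("B", (min(255, round((x - lo) / scale)) for x in vec))
    return codes, lo, scale

def sq8_decode(codes, lo, scale):
    """Reconstruct approximate floats from the 8-bit codes."""
    return [lo + c * scale for c in codes]

vec = [0.5, -1.25, 3.0, 0.0]  # hypothetical 4-dimensional embedding
codes, lo, scale = sq8_encode(vec)
approx = sq8_decode(codes, lo, scale)

float32_bytes = array("f", vec).itemsize * len(vec)  # 4 bytes per dimension
sq8_bytes = codes.itemsize * len(codes)              # 1 byte per dimension
ah_bytes = sq8_bytes / 4  # assumption: applies the 4x AH-vs-SQ8 figure from the text

print(float32_bytes, sq8_bytes, ah_bytes)
print(approx)
```

The reconstruction error of each component is bounded by half a quantization step, which is the usual price paid for the smaller footprint.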

A New Paradigm in Vector Search

Accelerating ScaNN with the Columnar Engine marks a significant step forward for vector search in AlloyDB. The combination makes semantic search more efficient and scalable, so you can run even complex similarity queries with high speed and accuracy. Building it meant solving hard engineering problems, particularly around handling Data Definition Language (DDL) and Data Manipulation Language (DML) operations.

Seamless Integration and Management

Accelerating your vector indexes with the columnar engine is straightforward and uses familiar SQL function calls.

To populate and cache the ScaNN index in the columnar engine:

SELECT google_columnar_engine_add_index('scann_index_name');

This command populates the specified index in the columnar engine, allowing subsequent vector index scans to leverage the Columnar Engine for enhanced performance.

To verify columnar engine usage:

Run the query with EXPLAIN ANALYZE and enable the SCANN and COLUMNAR_ENGINE options:

EXPLAIN (ANALYZE TRUE, SCANN TRUE, COSTS FALSE, TIMING FALSE, SUMMARY FALSE, VERBOSE FALSE, COLUMNAR_ENGINE TRUE)
SELECT * FROM t ORDER BY val <-> '[0.5,0.5,0.5,0.5]' LIMIT 100;

                                 QUERY PLAN
-----------------------------------------------------------------------------
 Limit (actual rows=100 loops=1)
   ->  Index Scan using t_ix3 on t (actual rows=100 loops=1)
         Order By: (val <-> '[0.5,0.5,0.5,0.5]'::vector)
         Limit: 100
         ScaNN Info: (num leaves searched=1 reordering=0 columnar engine nodes hit=2 columnar engine reordering tuples hit=0 used pca=false multi-thread=false)
         Columnar Engine ScaNN Info: (index found=true)
         Columnar Check: table is not in the columnar store
(7 rows)


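To confirm columnar engine usage, look for two things in the output: the columnar engine nodes hit counter in the ScaNN Info line (2 in this example) and the separate Columnar Engine ScaNN Info: (index found=true) line. The Columnar Check line reports on the table itself, not on the index.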
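
If you want to run this check from a script rather than by eye, the two markers can be pulled out of the plan text with a couple of regular expressions. The following Python sketch assumes the plan arrives as plain text in the format shown above. The helper name columnar_scann_status is hypothetical, and treating "index found=true plus a non-zero nodes hit count" as the signal for acceleration is an assumption made for this example, not a documented rule.

```python
import re

def columnar_scann_status(plan_text):
    """Extract columnar-engine evidence from EXPLAIN text output."""
    nodes = re.search(r"columnar engine nodes hit=(\d+)", plan_text)
    found = re.search(
        r"Columnar Engine ScaNN Info: \(index found=(true|false)\)", plan_text
    )
    nodes_hit = int(nodes.group(1)) if nodes else 0
    index_found = found is not None and found.group(1) == "true"
    return {
        "nodes_hit": nodes_hit,
        "index_found": index_found,
        # Assumption: both markers together indicate an accelerated scan.
        "accelerated": index_found and nodes_hit > 0,
    }

# Plan text copied from the example output above.
plan = """\
 Limit (actual rows=100 loops=1)
   ->  Index Scan using t_ix3 on t (actual rows=100 loops=1)
         Order By: (val <-> '[0.5,0.5,0.5,0.5]'::vector)
         Limit: 100
         ScaNN Info: (num leaves searched=1 reordering=0 columnar engine nodes hit=2 columnar engine reordering tuples hit=0 used pca=false multi-thread=false)
         Columnar Engine ScaNN Info: (index found=true)
         Columnar Check: table is not in the columnar store
"""

status = columnar_scann_status(plan)
print(status)
```

A plan that lacks either marker yields accelerated=False, which makes the helper convenient as an assertion in a regression test.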

To remove an index from the Columnar Engine:

SELECT google_columnar_engine_drop_index('scann_index_name');

Tangible Benefits: Performance Gains

This acceleration delivers significant advantages:

  • Increased Transaction Throughput: Reduced query latencies directly contribute to a higher transaction throughput for your database system.

  • Fast Response Time: Users will experience remarkably faster response times for their vector search queries.

Accelerating the ScaNN index with the Columnar Engine delivers up to an 80% improvement in performance, a substantial enhancement to ScaNN’s already impressive capabilities. We would love to hear what you discover with your data!

Get Started Today!

The Public Preview of ScaNN acceleration using Columnar Engine is now available. Experience the dramatic acceleration in vector search performance and unlock new possibilities for your intelligent applications.
