Introduction to USearch Indexing Strategies

Welcome back, intrepid learner! In our previous chapters, you’ve grasped the fundamentals of vector embeddings, understood what USearch is, and even set up your first basic vector search. That’s fantastic progress! But as you scale your applications and deal with ever-growing datasets, simply throwing vectors into an index isn’t enough. You need strategy.

This chapter is your deep dive into the brain of USearch: its indexing strategies. We’ll uncover how USearch organizes your high-dimensional vectors to enable lightning-fast similarity searches. We’ll focus heavily on the Hierarchical Navigable Small Worlds (HNSW) algorithm, which is the secret sauce behind USearch’s impressive performance. Understanding these strategies is paramount because they directly influence the speed of your searches, the accuracy of your results (known as recall), and the memory footprint of your application.

By the end of this chapter, you’ll not only understand the core concepts but also gain practical experience in configuring USearch indices for optimal performance, whether you’re using USearch standalone or leveraging its capabilities through ScyllaDB’s integrated vector search. Ready to optimize your vector search game? Let’s go!

Core Concepts of Vector Indexing

Before we get our hands dirty with code, let’s build a solid conceptual foundation. Why do we even need “indexing strategies” for vectors?

The Problem: Brute-Force is Too Slow

Imagine you have a library with a billion books, and you want to find all books “similar” to one specific book. If you had to read every single book in the library and compare it to your chosen book, it would take forever, right?

The same principle applies to vector search. When you have millions or billions of high-dimensional vectors (like those representing text, images, or audio), comparing an incoming query vector to every single vector in your database is called brute-force search. It’s guaranteed to find the absolute nearest neighbors, but it’s computationally expensive and incredibly slow for large datasets.
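To make that cost concrete, here is a minimal brute-force nearest-neighbor search in plain NumPy (an illustrative sketch; the function and variable names are our own). Every query scans every stored vector, which is exactly the O(n) work that indexing avoids:

```python
import numpy as np

def brute_force_search(database: np.ndarray, query: np.ndarray, k: int) -> np.ndarray:
    """Exact k-NN by squared Euclidean distance: O(n * d) work per query."""
    distances = np.sum((database - query) ** 2, axis=1)  # distance to every stored vector
    return np.argsort(distances)[:k]                     # indices of the k closest

# Tiny demo: five 2-D vectors; the query sits closest to index 2, then index 4.
db = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0], [9.0, 9.0], [2.0, 2.0]], dtype=np.float32)
print(brute_force_search(db, np.array([5.1, 5.0], dtype=np.float32), k=2))  # → [2 4]
```

This is the "read every book in the library" approach: correct, but the per-query work grows linearly with the dataset.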

This is where vector indexing comes in.

Instead of brute-force, we use clever data structures and algorithms to perform Approximate Nearest Neighbor (ANN) search. The “approximate” part is key: we don’t guarantee finding the absolute closest vectors every single time, but we get very, very close, very, very fast. The trade-off is usually worth it for real-time applications.

USearch, and many other modern vector databases, achieve this approximation using algorithms that build a special graph-like structure from your vectors. This structure allows for highly efficient “hops” through the data to quickly narrow down the search space to the most promising candidates.

Hierarchical Navigable Small Worlds (HNSW): USearch’s Engine

USearch primarily leverages the Hierarchical Navigable Small Worlds (HNSW) algorithm. It’s a fantastic algorithm that balances search speed, recall, and memory usage. Let’s break down what HNSW means in simple terms.

Think of HNSW as a multi-layered highway system for your vectors:

  • Top Layers (Highways): These layers have fewer nodes, but each node has long-range connections. They act like expressways, allowing you to quickly jump large distances across your vector space to get into the general “neighborhood” of your target vector.
  • Bottom Layer (Local Roads): This layer contains all your vectors, and each node has many short-range connections. Once you’ve used the highways to get to the right neighborhood, you switch to these local roads to meticulously find the closest neighbors.

This hierarchical structure allows USearch to perform searches very efficiently. It starts at the top layer, quickly navigates towards the target’s general area, and then descends through the layers, refining its search until it reaches the bottom layer to find the nearest neighbors.

Here’s a simplified visual representation of the HNSW concept:

graph TD
    subgraph Layer_2["Top Layer"]
        N2_0[Node A]
    end
    subgraph Layer_1["Middle Layer"]
        N1_0[Node B] --- N1_1[Node C]
        N1_1 --- N1_2[Node D]
    end
    subgraph Layer_0["Bottom Layer"]
        N0_0[Node E] --- N0_1[Node F]
        N0_1 --- N0_2[Node G]
        N0_2 --- N0_3[Node H]
        N0_3 --- N0_0
        N0_0 --- N0_2
        N0_1 --- N0_3
    end
    N2_0 -->|Fast Jump| N1_0
    N1_0 -->|Refine Search| N0_0
    N1_1 -->|Refine Search| N0_1
    N1_2 -->|Refine Search| N0_3

In this diagram, imagine starting a search at Node A in the top layer. It quickly points you to Node B in the middle layer, which then guides you to Node E in the bottom layer, where the detailed search for nearest neighbors begins. This multi-layer approach avoids checking every single vector, making it incredibly fast.
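The descent described above can be sketched in a few lines of Python. This is a deliberately tiny pedagogical model, not USearch's implementation: the node names and coordinates are made up to match the diagram, each "vector" is a single number, and real HNSW uses a beam of ef candidates rather than a purely greedy walk:

```python
# Toy HNSW-style descent over the diagram's three layers.
# Each node carries a made-up 1-D coordinate standing in for its vector.
coords = {"A": 5.0, "B": 4.0, "C": 6.0, "D": 8.0,
          "E": 3.5, "F": 5.5, "G": 6.5, "H": 7.5}

# Layers from top (sparse highways) to bottom (dense local roads).
layers = [
    {"A": []},
    {"B": ["C"], "C": ["B", "D"], "D": ["C"]},
    {"E": ["F", "H", "G"], "F": ["E", "G", "H"],
     "G": ["F", "H", "E"], "H": ["G", "E", "F"]},
]
# Cross-layer links (the arrows in the diagram).
down = {"A": "B", "B": "E", "C": "F", "D": "H"}

def greedy_descent(query: float, entry: str = "A") -> str:
    """Walk each layer greedily toward the query, then drop a level."""
    current = entry
    for depth, layer in enumerate(layers):
        improved = True
        while improved:  # move to any neighbor closer to the query, until stuck
            improved = False
            for neighbor in layer[current]:
                if abs(coords[neighbor] - query) < abs(coords[current] - query):
                    current, improved = neighbor, True
        if depth < len(layers) - 1:
            current = down[current]  # descend via the cross-layer link
    return current

print(greedy_descent(7.6))  # → H (closest bottom-layer node to 7.6)
```

Note how a query near 7.6 never visits Node E or Node F: the highway layers steer it straight into the right neighborhood.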

Key HNSW Parameters in USearch

To fine-tune USearch’s HNSW index, you’ll interact with several crucial parameters that dictate its behavior during construction and search. Understanding these is vital for balancing performance and accuracy.

1. connectivity (often denoted as M)

  • What it is: This parameter controls the maximum number of connections (edges) each node will have in the HNSW graph. It essentially defines the “density” of the graph.
  • Why it’s important:
    • Higher connectivity (e.g., M=32): Creates a denser graph with more connections per node. This generally leads to higher recall (more accurate results) because the search algorithm has more paths to explore. However, it also means a slower index build time, more memory usage, and potentially slower search times due to more graph traversals.
    • Lower connectivity (e.g., M=8): Creates a sparser graph. This can lead to faster index build times, less memory, and faster searches, but often at the cost of lower recall.
  • General guidance: A good starting point is often 16 or 32.

2. expansion_add (often denoted as efConstruction)

  • What it is: This parameter controls the size of the dynamic list of “nearest neighbors” candidates that USearch maintains during index construction. When adding a new vector, whether during the initial bulk build or when inserting into an already existing index, USearch explores expansion_add candidates to find the vector’s best connections.
  • Why it’s important:
    • Higher expansion_add (e.g., efConstruction=200): Leads to a higher quality index. By considering more candidates during construction, the algorithm can find better connections, resulting in a more optimized graph and higher potential recall during searches. The trade-off is significantly slower index build times.
    • Lower expansion_add (e.g., efConstruction=50): Results in faster index build times but can lead to a lower quality index and potentially lower recall.
  • General guidance: Start with 100 and increase if recall is critical and build time permits. Keep it at least as large as connectivity.

3. expansion_search (often denoted as efSearch)

  • What it is: This parameter controls the size of the dynamic list of “nearest neighbors” candidates that USearch maintains during the search phase. When you query the index, USearch explores efSearch candidates to find the actual nearest neighbors.
  • Why it’s important:
    • Higher expansion_search (e.g., efSearch=100): Leads to higher recall during query time. The search algorithm explores more paths and candidates, increasing the likelihood of finding the true nearest neighbors. The trade-off is slower search times.
    • Lower expansion_search (e.g., efSearch=10): Results in faster search times but can significantly reduce recall.
  • General guidance: Unlike the construction parameters, expansion_search only affects queries, so it can be tuned without rebuilding the index. Use a lower efSearch for quick, less critical searches and a higher efSearch for more thorough, critical ones.

4. metric (Distance Function)

  • What it is: This defines how the “distance” or “similarity” between two vectors is calculated. USearch supports various metrics.
  • Why it’s important: The choice of metric is crucial and depends entirely on how your embeddings were generated and what “similarity” means in your context.
    • MetricKind.IP (Inner Product / Dot Product): Often used for embeddings where the magnitude matters, or for transformer models. Higher value usually means more similar.
    • MetricKind.Cos (Cosine Similarity): Measures the cosine of the angle between two vectors. It’s often used when the direction of the vectors matters more than their magnitude (e.g., for text embeddings). A value close to 1 means similar.
    • MetricKind.L2sq (Squared Euclidean Distance): A common distance metric. Smaller value means more similar.
    • MetricKind.Haversine, MetricKind.Hamming, and others: USearch also supports specialized metrics; see its documentation for the full list.
  • General guidance: Always match the metric to how your embeddings were trained. If your embedding model outputs normalized vectors, Cosine Similarity or Inner Product are often good choices.
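That last point about normalized vectors is easy to verify with plain NumPy: once both sides are L2-normalized, the inner product is the cosine similarity, so the two metrics produce identical rankings. A quick illustrative check (random data, names are our own):

```python
import numpy as np

rng = np.random.default_rng(42)
database = rng.normal(size=(20, 16))  # 20 random 16-dimensional vectors
query = rng.normal(size=16)

# Cosine similarity computed directly on the raw vectors.
cos_scores = (database @ query) / (
    np.linalg.norm(database, axis=1) * np.linalg.norm(query)
)

# Inner product after L2-normalizing both sides.
database_n = database / np.linalg.norm(database, axis=1, keepdims=True)
query_n = query / np.linalg.norm(query)
ip_scores = database_n @ query_n

# Same ranking: normalization folds the magnitudes out of the inner product.
print(np.array_equal(np.argsort(-cos_scores), np.argsort(-ip_scores)))  # → True
```

This is why the choice between Cos and IP matters far less for normalized embeddings than picking either of them over L2sq for un-normalized ones.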

ScyllaDB’s Role

ScyllaDB, as of its general availability of Vector Search in January 2026, leverages USearch (or a similar high-performance ANN library) under the hood to power its VECTOR data type and ANN OF query capabilities. This means that while you interact with ScyllaDB using CQL (Cassandra Query Language), the underlying indexing principles and the trade-offs we’re discussing for USearch directly apply to how ScyllaDB manages and searches your vector data. ScyllaDB handles the complex distributed aspects, but the core ANN algorithm for individual shards benefits from these optimizations.

Step-by-Step Implementation: Configuring USearch Indices

Let’s put these concepts into practice. We’ll use USearch’s Python bindings to create an index and experiment with its parameters.

First, ensure you have USearch installed. The Python bindings, along with NumPy, can be installed via pip:

pip install usearch numpy

Now, let’s write some Python code to create and query a USearch index, paying close attention to our indexing parameters.

Step 1: Initialize a Basic Index

We’ll start with a simple index and then modify its parameters. Open your Python environment (e.g., a .py file or a Jupyter notebook).

# Part 1: basic_index.py
import time

import numpy as np
import usearch
from usearch.index import Index, MetricKind

print(f"USearch version: {usearch.__version__}")

# Define vector dimensions
dimensions = 128
num_vectors = 1000

# Generate some random vectors for our database.
# Each vector is a 128-dimensional float32 array.
database_vectors = np.random.rand(num_vectors, dimensions).astype(np.float32)
# Generate a query vector
query_vector = np.random.rand(dimensions).astype(np.float32)

print(f"Generated {num_vectors} database vectors with {dimensions} dimensions.")

# 1. Create a USearch index, setting the HNSW parameters explicitly.
#    MetricKind.Cos is often a good choice for text embeddings.
index = Index(
    ndim=dimensions,            # dimensionality of the vectors
    metric=MetricKind.Cos,      # cosine similarity
    connectivity=16,            # M: max connections per node
    expansion_add=64,           # efConstruction: candidates explored when adding
    expansion_search=32,        # efSearch: candidates explored when searching
)

print(f"\nIndex created: ndim={dimensions}, connectivity={index.connectivity}")

# 2. Add vectors to the index
start_time = time.time()
for i, vector in enumerate(database_vectors):
    index.add(i, vector)  # add the vector under the integer key 'i'
build_time = time.time() - start_time
print(f"Added {index.size} vectors to the index in {build_time:.4f} seconds.")

# 3. Perform a search
k = 5  # number of nearest neighbors to retrieve
start_time = time.time()
matches = index.search(query_vector, k)
search_time = time.time() - start_time

print(f"\nSearch for {k} nearest neighbors took {search_time:.4f} seconds.")
print(f"Found {len(matches.keys)} matches:")
for i, key in enumerate(matches.keys):
    print(f"  - Match {i + 1}: ID={key}, Distance={matches.distances[i]:.4f}")

Explanation of the code:

  1. from usearch.index import Index, MetricKind: USearch’s Python bindings expose the index class and the metric enumeration under the usearch.index module. numpy handles numerical arrays (our vectors), and time helps us measure performance.
  2. dimensions = 128: This sets the dimensionality of our vectors. Common embedding models produce vectors of 128, 384, 768, or even 1536 dimensions.
  3. database_vectors = np.random.rand(...): We create num_vectors random vectors. In a real application, these would come from an embedding model. astype(np.float32) is important for USearch, which typically expects float32 for performance.
  4. index = Index(...): This is where we create our USearch index object.
    • ndim: Must match the dimensionality of your vectors.
    • metric: We chose MetricKind.Cos for cosine similarity. If your embeddings are not normalized, you might consider MetricKind.IP or MetricKind.L2sq.
    • connectivity, expansion_add, expansion_search: These are the HNSW parameters discussed earlier, set to reasonable starting values.
  5. index.add(i, vector): We iterate through our database_vectors and add each one to the index. The first argument i is a unique identifier (key) for the vector.
  6. index.search(query_vector, k): We perform a search.
    • query_vector: The vector we want to find similar items to.
    • k: How many nearest neighbors we want back.
    • Search quality is governed by the expansion_search value we set at construction (32 here, our efSearch); raise it for more thorough searches.

Run this script. You’ll see the index build time and search time for our small dataset. For 1000 vectors, it will be very fast.

Step 2: Experimenting with connectivity and expansion_add

Now, let’s see how changing connectivity (M) and expansion_add (efConstruction) affects index build time and, conceptually, index quality. We’ll create two indices: one with low parameters and one with high.

# Part 2: parameter_experiment.py
import time

import numpy as np
from usearch.index import Index, MetricKind

dimensions = 128
num_vectors = 10000  # more vectors make the differences easier to notice

database_vectors = np.random.rand(num_vectors, dimensions).astype(np.float32)
query_vector = np.random.rand(dimensions).astype(np.float32)

print(f"Generated {num_vectors} database vectors for parameter experiment.")

ef_search_value = 64  # use a consistent efSearch for a fair comparison

# --- Index 1: Lower Parameters (Faster Build, Potentially Lower Recall) ---
print("\n--- Creating Index with Lower HNSW Parameters ---")
index_low_params = Index(
    ndim=dimensions,
    metric=MetricKind.Cos,
    connectivity=8,                    # lower M
    expansion_add=32,                  # lower efConstruction
    expansion_search=ef_search_value,
)

start_time_low = time.time()
for i, vector in enumerate(database_vectors):
    index_low_params.add(i, vector)
build_time_low = time.time() - start_time_low
print(f"Index (Low Params) built in {build_time_low:.4f} seconds with {index_low_params.size} vectors.")

# --- Index 2: Higher Parameters (Slower Build, Potentially Higher Recall) ---
print("\n--- Creating Index with Higher HNSW Parameters ---")
index_high_params = Index(
    ndim=dimensions,
    metric=MetricKind.Cos,
    connectivity=32,                   # higher M
    expansion_add=128,                 # higher efConstruction
    expansion_search=ef_search_value,
)

start_time_high = time.time()
for i, vector in enumerate(database_vectors):
    index_high_params.add(i, vector)
build_time_high = time.time() - start_time_high
print(f"Index (High Params) built in {build_time_high:.4f} seconds with {index_high_params.size} vectors.")

# --- Perform searches and compare ---
k = 10  # retrieve 10 nearest neighbors

print(f"\n--- Comparing Search Performance with efSearch={ef_search_value} ---")

# Search on Index 1
start_time_search_low = time.time()
matches_low = index_low_params.search(query_vector, k)
search_time_low = time.time() - start_time_search_low
print(f"Search (Low Params Index) took {search_time_low:.4f} seconds.")

# Search on Index 2
start_time_search_high = time.time()
matches_high = index_high_params.search(query_vector, k)
search_time_high = time.time() - start_time_search_high
print(f"Search (High Params Index) took {search_time_high:.4f} seconds.")

# What to observe:
print("\nObservation:")
print(f"  - Build time: Low params={build_time_low:.4f}s, High params={build_time_high:.4f}s. (High params should be slower)")
print(f"  - Search time: Low params={search_time_low:.4f}s, High params={search_time_high:.4f}s.")
print("  - For recall comparison, you'd need ground truth (brute-force results) to compare the found IDs.")
print(f"  - Low params matches (first 3 IDs): {list(matches_low.keys[:3])}")
print(f"  - High params matches (first 3 IDs): {list(matches_high.keys[:3])}")
print("Notice how the IDs might differ, indicating different recall quality for the same query.")

Run this second script. You should observe:

  • The index with higher connectivity and expansion_add (Index 2) takes noticeably longer to build. This is the cost of creating a higher-quality graph structure.
  • The search times might be similar at this size, but for much larger datasets the higher-connectivity index can be slightly slower to search if efSearch is also very high, because it has more paths to explore. The payoff of the higher-quality index shows up in recall.

How this relates to ScyllaDB

When you define a VECTOR column and create a vector index in ScyllaDB, you’ll specify similar parameters. For example, ScyllaDB’s documentation for vector search will guide you on setting similarity_function (analogous to metric in USearch) and an indexing_options map that encapsulates the connectivity and expansion settings for the underlying HNSW index.

-- Example CQL for ScyllaDB (conceptual, actual syntax may vary slightly with exact version)
-- Assuming ScyllaDB's Vector Search is generally available as of January 2026.

CREATE TABLE my_vectors (
    id UUID PRIMARY KEY,
    embedding VECTOR<FLOAT, 128>
);

-- Create a vector index, conceptually similar to USearch's HNSW parameters
-- The exact syntax for HNSW parameters will be specified in ScyllaDB's documentation.
CREATE CUSTOM INDEX my_vector_index ON my_vectors (embedding)
USING 'vector_index' -- Placeholder: consult ScyllaDB's documentation for the actual index class
WITH OPTIONS = {
    'similarity_function': 'COSINE',
    'indexing_options': '{
        "type": "HNSW",
        "parameters": {
            "max_connections": "16",       -- Corresponds to USearch 'connectivity' (M)
            "construction_ef": "128",      -- Corresponds to USearch 'expansion' (efConstruction)
            "search_ef": "64"              -- Corresponds to USearch 'expansion_search' (efSearch)
        }
    }'
};

-- Then, query for nearest neighbors
SELECT id, embedding FROM my_vectors
ORDER BY embedding ANN OF [0.1, 0.2, ..., 0.128] -- Your query vector
LIMIT 5;

This CQL example is illustrative. The crucial takeaway is that the max_connections, construction_ef, and search_ef parameters you’d configure in ScyllaDB conceptually map to the USearch connectivity, expansion_add, and expansion_search parameters we just explored in Python. ScyllaDB provides the distributed, persistent layer, while USearch (or a similar engine) provides the high-performance ANN indexing.

Mini-Challenge: Recall vs. Speed

It’s your turn to be the scientist!

Challenge: Modify the parameter_experiment.py script to include a ground truth comparison.

  1. For the query_vector, perform a brute-force search on the database_vectors to find the true k nearest neighbors. You can do this by calculating the cosine similarity (or whatever metric you chose) between the query_vector and every single database_vector and then sorting the results.
  2. Compare the IDs returned by index_low_params.search(...) and index_high_params.search(...) against the IDs from your brute-force search.
  3. Calculate the recall for both approximate searches. Recall is the percentage of true nearest neighbors that were correctly identified by the ANN search.
    • Recall = (Number of true nearest neighbors found by ANN) / (Total number of true nearest neighbors)

Hint:

  • scipy.spatial.distance.cosine can help with cosine similarity between two vectors. For many vectors, numpy.dot for inner product after normalization can be faster.
  • To calculate recall, convert the matches.keys (from USearch) and your brute-force top k IDs into sets and find their intersection.
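To get you started without giving the whole game away, here is a sketch of the two helper functions the challenge needs. The names are just suggestions, and the recall formula is exactly the one defined in step 3:

```python
import numpy as np

def recall_at_k(ann_ids, true_ids) -> float:
    """Fraction of the true top-k IDs that the ANN search also returned."""
    return len(set(ann_ids) & set(true_ids)) / len(set(true_ids))

def brute_force_topk_cosine(database: np.ndarray, query: np.ndarray, k: int) -> np.ndarray:
    """Ground-truth top-k by cosine similarity (higher score = more similar)."""
    scores = (database @ query) / (
        np.linalg.norm(database, axis=1) * np.linalg.norm(query)
    )
    return np.argsort(-scores)[:k]  # indices sorted by descending similarity

# Tiny sanity check: a perfect ANN result has recall 1.0, a half-right one 0.5.
print(recall_at_k([1, 2, 3, 4], [1, 2, 3, 4]))  # → 1.0
print(recall_at_k([1, 2, 9, 8], [1, 2, 3, 4]))  # → 0.5
```

Plug your database_vectors and query_vector into brute_force_topk_cosine, then feed its output and each index's matches.keys into recall_at_k.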

What to observe/learn: You should see that index_high_params (with higher connectivity and expansion_add) generally achieves higher recall, but at the cost of longer build times. The efSearch value will also significantly impact the recall of both indices. This exercise will solidify your understanding of the fundamental trade-off in ANN search: accuracy vs. speed.

Common Pitfalls & Troubleshooting

Working with vector indexes, especially with HNSW, can introduce a few common headaches. Here’s how to avoid and troubleshoot them:

  1. Poor Recall Despite High Parameters:

    • Pitfall: You’ve cranked up connectivity and expansion_add, but your search results still feel off.
    • Troubleshooting:
      • Mismatching Metric: Are you using MetricKind.Cos when your embeddings are not normalized, or MetricKind.L2sq when cosine similarity is appropriate? Always ensure your metric parameter matches how your embeddings were generated and what “similarity” means for your data. This is the most common culprit!
      • Too Low efSearch: Remember efSearch (the expansion_search parameter) is critical for query-time recall. If it’s too low, even a perfectly built index will perform poorly. Try increasing it.
      • Poor Embeddings: The index can only be as good as the data it’s given. If your embeddings themselves aren’t semantically meaningful, no amount of indexing magic will help.
  2. Excessive Build Times or Memory Usage:

    • Pitfall: Your index takes forever to build, or your application runs out of memory.
    • Troubleshooting:
      • Too High connectivity (M) or expansion_add (efConstruction): These parameters directly influence memory consumption and build time. Start with moderate values (e.g., M=16, efConstruction=100) and only increase them if recall absolutely demands it.
      • Hardware Limitations: For very large datasets (millions or billions of vectors), you’ll need substantial RAM. USearch tries to keep the graph in memory for maximum speed. Consider machines with ample RAM or explore distributed solutions like ScyllaDB’s Vector Search which handles partitioning and distribution.
      • Data Type: Ensure your vectors are np.float32. Using float64 doubles memory usage without significant benefits for most vector search tasks.
  3. ScyllaDB Vector Index Creation Fails or Performs Poorly:

    • Pitfall: Your CREATE CUSTOM INDEX statement for vectors in ScyllaDB doesn’t work, or queries are slow.
    • Troubleshooting:
      • ScyllaDB Version: Ensure you’re running a ScyllaDB version (generally available as of early 2026) that fully supports Vector Search. Older versions won’t have the VECTOR type or ANN OF query.
      • Syntax Errors: Double-check the exact CQL syntax for creating the vector index and its options in the official ScyllaDB documentation. It can be quite specific.
      • Parameter Mapping: Verify that the HNSW parameters you’re passing in the indexing_options (e.g., max_connections, construction_ef, search_ef) are correctly mapped and within acceptable ranges as defined by ScyllaDB.
      • Driver Compatibility: Ensure your ScyllaDB client driver (e.g., Python’s cassandra-driver) is updated to a version that supports the latest ScyllaDB features, including vector search.

Always remember the golden rule of performance tuning: measure, don’t guess! Use time (as we did) or more sophisticated profiling tools to understand where bottlenecks truly lie.
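In that spirit, even the data-type pitfall above is directly measurable. NumPy's nbytes attribute shows that float64 vectors cost exactly twice the memory of float32 before the index adds its own graph overhead (the array shape below is just an example):

```python
import numpy as np

# 10,000 vectors of 128 dimensions each.
vectors32 = np.zeros((10_000, 128), dtype=np.float32)
vectors64 = vectors32.astype(np.float64)

print(vectors32.nbytes)  # 10,000 * 128 * 4 bytes → 5120000
print(vectors64.nbytes)  # 10,000 * 128 * 8 bytes → 10240000
print(vectors64.nbytes // vectors32.nbytes)  # → 2
```

Five megabytes versus ten for a small toy dataset; at hundreds of millions of vectors that factor of two is the difference between fitting in RAM and not.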

Summary

Phew! You’ve navigated the intricate world of USearch indexing strategies, and that’s a significant accomplishment! Here’s a quick recap of our key takeaways:

  • Brute-force search is impractical for large vector datasets, making Approximate Nearest Neighbor (ANN) algorithms essential for efficient similarity search.
  • USearch leverages HNSW (Hierarchical Navigable Small Worlds), a powerful graph-based algorithm that balances search speed and recall by organizing vectors into multiple layers.
  • Key HNSW parameters give you control over index behavior:
    • connectivity (M): Controls graph density, impacting memory, build time, and recall.
    • expansion_add (efConstruction): Controls index quality during construction and incremental additions, impacting build time and potential recall.
    • expansion_search (efSearch): Controls search quality at query time, directly impacting search speed and recall.
  • The metric (distance function) is critical and must match how your embeddings were generated (e.g., Cosine, Inner Product, L2 Squared).
  • ScyllaDB’s integrated Vector Search feature (generally available as of January 2026) utilizes similar underlying ANN indexing principles, allowing you to configure parameters that translate to USearch’s HNSW settings.
  • There’s always a trade-off between recall (accuracy), speed, and memory usage. Fine-tuning these parameters is an iterative process.

You now have a solid understanding of how vector indexes work and how to configure them in USearch for optimal performance. This knowledge is invaluable for building scalable and efficient vector search applications.

What’s Next?

In the next chapter, we’ll dive deeper into real-world use cases for USearch, particularly focusing on how it integrates with ScyllaDB to build robust, scalable AI applications like recommendation engines, semantic search, and anomaly detection. We’ll explore practical examples and discuss architectural considerations for deploying these systems. Get ready to see these concepts come alive in practical scenarios!

References

This page is AI-assisted and reviewed. It references official documentation and recognized resources where relevant.