Architecture Guide

Deep dive into Pomai Search internals.

System Overview

flowchart TB
    Client --> Engine[SearchEngine]
    Engine --> Impl[SearchEngine::Impl]
    Impl --> Shard1[Shard 1]
    Impl --> Shard2[Shard 2]
    Impl --> ShardN[Shard N]
    Impl --> QPool[Query Thread Pool]
    Impl --> IPool[Ingest Thread Pool]
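The Engine-to-Impl edge in the diagram suggests the pimpl idiom. The following is a minimal sketch of that ownership layout, not the actual Pomai Search source: only the names `SearchEngine` and `SearchEngine::Impl` come from the diagram, while `Shard`, `shard_count`, and the constructor argument are hypothetical, and the two thread pools are left out.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for a shard; a real shard would own an index.
struct Shard {
    std::size_t id;
};

// Public facade: clients see only this class (pimpl idiom).
class SearchEngine {
public:
    explicit SearchEngine(std::size_t num_shards);  // assumed signature
    ~SearchEngine();
    std::size_t shard_count() const;                // hypothetical accessor

private:
    struct Impl;                  // defined below, hidden from clients
    std::unique_ptr<Impl> impl_;
};

// Private implementation: owns the shards (thread pools omitted here).
struct SearchEngine::Impl {
    std::vector<Shard> shards;
};

SearchEngine::SearchEngine(std::size_t num_shards) : impl_(new Impl()) {
    impl_->shards.reserve(num_shards);
    for (std::size_t i = 0; i < num_shards; ++i) {
        impl_->shards.push_back(Shard{i});
    }
}

// Defined here, where Impl is a complete type, so unique_ptr can delete it.
SearchEngine::~SearchEngine() = default;

std::size_t SearchEngine::shard_count() const {
    return impl_->shards.size();
}
```

Keeping the shards and pools behind `Impl` means the public header would not need to change when the internals do.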

Shard Structure

Each shard is an independent search unit containing:

Indexing Implementations

1. FlatIndex

Exact search in O(N) time per query: a brute-force scan over every stored vector, accelerated with SIMD. Best for small datasets or when 100% recall is required.
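A brute-force scan can be sketched as follows. This is an illustrative example rather than the FlatIndex implementation: the function names `l2_sq` and `flat_search`, the row-major layout, and the squared L2 metric are all assumptions, and the distance loop is plain scalar code where a production index would use SIMD.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper: squared L2 distance between two dim-length vectors.
// Scalar loop; this is the part a real index would vectorize with SIMD.
inline float l2_sq(const float* a, const float* b, std::size_t dim) {
    float sum = 0.0f;
    for (std::size_t i = 0; i < dim; ++i) {
        const float d = a[i] - b[i];
        sum += d * d;
    }
    return sum;
}

// Exact k-nearest-neighbor search over row-major data (n * dim floats).
// Returns (distance, id) pairs sorted by ascending distance.
inline std::vector<std::pair<float, std::size_t>> flat_search(
    const std::vector<float>& data, std::size_t dim,
    const std::vector<float>& query, std::size_t k) {
    assert(dim > 0 && data.size() % dim == 0 && query.size() == dim);
    const std::size_t n = data.size() / dim;

    // Score every stored vector: this is the O(N) part.
    std::vector<std::pair<float, std::size_t>> hits;
    hits.reserve(n);
    for (std::size_t id = 0; id < n; ++id) {
        hits.emplace_back(l2_sq(&data[id * dim], query.data(), dim), id);
    }

    // Keep only the k best, sorted by distance.
    k = std::min(k, n);
    std::partial_sort(hits.begin(),
                      hits.begin() + static_cast<std::ptrdiff_t>(k),
                      hits.end());
    hits.resize(k);
    return hits;
}
```

Because every vector is scored, the result is exact, which is why recall is 100% at the cost of linear work per query.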

2. HnswIndex

Graph-based approximate search with roughly O(log N) query time. Builds a multi-layer graph in which the sparse upper layers serve as expressways to the target neighborhood. Tunable via M (connectivity, the maximum number of links per node) and ef (search beam width).
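The role of ef is easiest to see in the best-first search that HNSW-style indexes run on a single layer. The sketch below follows the general shape of that search and is not taken from HnswIndex: the adjacency-list representation, the `dist` callback, and the name `search_layer` are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

using Node = std::size_t;
using Scored = std::pair<float, Node>;  // (distance to query, node id)

// Beam search on one graph layer. dist(n) is the distance from the query
// to node n. Returns up to ef of the closest nodes found, nearest first.
inline std::vector<Scored> search_layer(
    const std::vector<std::vector<Node>>& neighbors,
    const std::function<float(Node)>& dist,
    Node entry, std::size_t ef) {
    assert(ef > 0 && entry < neighbors.size());
    std::vector<bool> visited(neighbors.size(), false);

    // Min-heap of nodes still to expand; max-heap of the best ef results.
    std::priority_queue<Scored, std::vector<Scored>, std::greater<Scored>>
        candidates;
    std::priority_queue<Scored> results;

    const float d0 = dist(entry);
    visited[entry] = true;
    candidates.emplace(d0, entry);
    results.emplace(d0, entry);

    while (!candidates.empty()) {
        const Scored cur = candidates.top();
        candidates.pop();
        // Stop once the nearest unexpanded candidate is farther than the
        // worst result in a full beam.
        if (results.size() >= ef && cur.first > results.top().first) break;

        for (Node nb : neighbors[cur.second]) {
            if (visited[nb]) continue;
            visited[nb] = true;
            const float d = dist(nb);
            if (results.size() < ef || d < results.top().first) {
                candidates.emplace(d, nb);
                results.emplace(d, nb);
                if (results.size() > ef) results.pop();  // drop the worst
            }
        }
    }

    // Drain the max-heap (farthest first), then flip to nearest first.
    std::vector<Scored> out;
    out.reserve(results.size());
    while (!results.empty()) {
        out.push_back(results.top());
        results.pop();
    }
    std::reverse(out.begin(), out.end());
    return out;
}
```

A larger ef keeps more candidates alive, which generally raises recall and increases the number of distance computations. In a multi-layer graph, a search of this kind would typically run with a small beam on the upper layers and the full ef on the bottom layer.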

3. IvfSq8Index

Inverted file index with 8-bit scalar quantization. Compresses each vector component from float32 to uint8 (a 4x memory saving). Uses k-means clustering to partition vectors into buckets and routes each query to the most relevant ones.
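The float32-to-uint8 step can be illustrated with a simple linear scalar quantizer. This is a sketch under stated assumptions, not the IvfSq8Index codec: the `Sq8Codec` type and the single `[lo, hi]` range shared by all dimensions are hypothetical (a real index might train a range per dimension), and the k-means routing step is not shown.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical SQ8 codec: maps floats in [lo, hi] linearly onto 0..255.
struct Sq8Codec {
    float lo;
    float hi;

    std::uint8_t encode_one(float x) const {
        assert(hi > lo);
        // Clamp out-of-range values, then scale to [0, 1].
        const float clamped = std::min(std::max(x, lo), hi);
        const float t = (clamped - lo) / (hi - lo);
        return static_cast<std::uint8_t>(std::lround(t * 255.0f));
    }

    float decode_one(std::uint8_t code) const {
        return lo + (static_cast<float>(code) / 255.0f) * (hi - lo);
    }

    // One byte per component instead of four: the 4x saving.
    std::vector<std::uint8_t> encode(const std::vector<float>& v) const {
        std::vector<std::uint8_t> codes(v.size());
        for (std::size_t i = 0; i < v.size(); ++i) codes[i] = encode_one(v[i]);
        return codes;
    }

    std::vector<float> decode(const std::vector<std::uint8_t>& codes) const {
        std::vector<float> v(codes.size());
        for (std::size_t i = 0; i < codes.size(); ++i) v[i] = decode_one(codes[i]);
        return v;
    }
};
```

The compression is lossy: with this scheme the reconstruction error for an in-range value is at most half a quantization step, (hi - lo) / 510, so distances computed on decoded vectors are approximations.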

Concurrency Model

Design Principles