1. Getting Started
Complete tutorial for integrating Pomai Search into your C++ application.
Installation
# Using CMake FetchContent
include(FetchContent)
FetchContent_Declare(
  pomaisearch
  GIT_REPOSITORY https://github.com/yourusername/pomaisearch.git
  GIT_TAG        main
)
FetchContent_MakeAvailable(pomaisearch)
target_link_libraries(your_app PRIVATE pomai_search_core)
Your First Engine
#include <iostream>
#include <vector>

#include "pomai_search/search_engine.h"

using namespace pomai_search;

int main() {
    // 1. Configure
    SearchEngineConfig cfg;
    cfg.dim = 128;
    cfg.index_type = SearchEngineConfig::IndexType::Hnsw;

    // 2. Open (.value() assumes success; check the result before unwrapping in real code)
    auto engine = SearchEngine::Open(cfg).value();

    // 3. Insert a 128-dimensional vector with one metadata field
    std::vector<float> vec(128, 0.0f);
    vec[0] = 1.0f;
    engine->Upsert("doc1", VectorView{vec.data(), 128}, {{"category", "tech"}});

    // 4. Search with the same vector and print the top hit
    auto results = engine->Search(VectorView{vec.data(), 128});
    std::cout << "Found: " << results.value()[0].key << "\n";
    return 0;
}
2. Performance Tuning
Quick Wins
- Enable AVX2: `cfg.enable_avx2 = true;` (4-8x speedup).
- Use Multiple Shards: Set `cfg.num_shards` to `std::thread::hardware_concurrency()`.
- Pre-normalize: Normalize vectors and use `Similarity::Dot` instead of `Cosine`.
- Snapshots: Use `SnapshotWriter` to save startup time (O(N) load vs O(N log N) build).
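
The pre-normalization tip works because, for unit-length vectors, the dot product equals the cosine similarity, so the per-query norm computation can be skipped. The following is a minimal, library-independent sketch of that idea. The helper names `NormalizeL2`, `Dot`, and `Cosine` are hypothetical and are not part of the Pomai Search API.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Scale v to unit L2 length in place. Zero vectors are left unchanged.
void NormalizeL2(std::vector<float>& v) {
    double sum_sq = 0.0;
    for (float x : v) sum_sq += static_cast<double>(x) * x;
    if (sum_sq == 0.0) return;
    const float inv_norm = static_cast<float>(1.0 / std::sqrt(sum_sq));
    for (float& x : v) x *= inv_norm;
}

// Plain dot product; assumes a.size() == b.size().
float Dot(const std::vector<float>& a, const std::vector<float>& b) {
    float acc = 0.0f;
    for (std::size_t i = 0; i < a.size(); ++i) acc += a[i] * b[i];
    return acc;
}

// Cosine similarity computed the long way, for comparison.
// Returns 0 if either vector has zero length.
float Cosine(const std::vector<float>& a, const std::vector<float>& b) {
    const float denom = std::sqrt(Dot(a, a)) * std::sqrt(Dot(b, b));
    return denom == 0.0f ? 0.0f : Dot(a, b) / denom;
}
```

Normalizing each vector once before `Upsert`, and each query before `Search`, lets the engine rank by dot product and still return the same ordering that cosine similarity would give.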
Index Selection Guide
| Requirement | Index Type | Pros |
|---|---|---|
| 100% Recall | Flat | Exact results, simple. |
| < 1M Vectors | HNSW | Best balance of speed/recall. |
| > 1M Vectors | IVF-SQ8 | 4x less memory, scalable. |
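
The 4x memory figure for SQ8 follows from storing each dimension as one 8-bit code instead of a 32-bit float. Below is a rough sketch of 8-bit scalar quantization to illustrate the trade-off. It assumes a per-vector min/max range for simplicity; how Pomai Search actually trains and stores its SQ8 codes is not described here, and the names `Sq8Vector`, `EncodeSq8`, and `DecodeSq8` are hypothetical.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// One vector stored as 8-bit codes plus the range needed to decode it.
struct Sq8Vector {
    float min = 0.0f;
    float scale = 0.0f;  // (max - min) / 255
    std::vector<std::uint8_t> codes;
};

// Map each float linearly onto 0..255 using the vector's own min and max.
Sq8Vector EncodeSq8(const std::vector<float>& v) {
    Sq8Vector out;
    if (v.empty()) return out;
    const auto mm = std::minmax_element(v.begin(), v.end());
    out.min = *mm.first;
    out.scale = (*mm.second - *mm.first) / 255.0f;
    out.codes.reserve(v.size());
    for (float x : v) {
        float q = out.scale > 0.0f ? (x - out.min) / out.scale : 0.0f;
        q = std::max(0.0f, std::min(255.0f, q));  // guard against rounding drift
        out.codes.push_back(static_cast<std::uint8_t>(std::lround(q)));
    }
    return out;
}

// Reconstruct approximate floats; the error per dimension is about scale / 2.
std::vector<float> DecodeSq8(const Sq8Vector& q) {
    std::vector<float> out;
    out.reserve(q.codes.size());
    for (std::uint8_t c : q.codes) out.push_back(q.min + q.scale * static_cast<float>(c));
    return out;
}
```

The reconstruction error is what costs recall relative to Flat or HNSW on full-precision vectors, which is why the table reserves this index type for collections where memory is the binding constraint.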
HNSW Tuning
- High Recall (99.8%+): `M=32`, `ef_construction=400`, `ef_search=100`.
- Fast Queries: `M=16`, `ef_construction=200`, `ef_search=30`.
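
To choose between these presets, it helps to measure recall on your own data rather than rely on the quoted figures. One common approach is to run the same queries against a Flat index (exact results) and the HNSW index, then compare the returned keys. The sketch below computes recall@k from two key lists; `RecallAtK` is a hypothetical helper, not a Pomai Search function.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Fraction of the exact top-k keys that also appear in the approximate top-k.
// Both lists are expected to be ordered best-first.
double RecallAtK(const std::vector<std::string>& exact,
                 const std::vector<std::string>& approx,
                 std::size_t k) {
    const std::size_t n_exact = std::min(k, exact.size());
    if (n_exact == 0) return 0.0;
    const std::size_t n_approx = std::min(k, approx.size());
    const auto approx_end = approx.begin() + static_cast<std::ptrdiff_t>(n_approx);

    std::size_t hits = 0;
    for (std::size_t i = 0; i < n_exact; ++i) {
        if (std::find(approx.begin(), approx_end, exact[i]) != approx_end) ++hits;
    }
    return static_cast<double>(hits) / static_cast<double>(n_exact);
}
```

Averaging this value over a few hundred representative queries while sweeping `ef_search` gives a recall/latency curve for your dataset.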
3. Troubleshooting
- High Query Latency (P99 > 100ms)
  - Reduce `ef_search` or `nprobe`. Check CPU usage/throttling.
- Low Recall
  - Increase `ef_search` (HNSW) or `nprobe` (IVF). Ensure data is normalized if using Dot product.
- High Memory Usage
  - Switch to `IndexType::IvfSq8` for 4x reduction. Reduce `hnsw_m`.
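
Before changing `ef_search` or `nprobe` for a latency problem, it is worth confirming the P99 figure with a direct measurement. The following is a minimal sketch that times repeated calls and reports a nearest-rank percentile. `Percentile` and `MeasureLatenciesMs` are hypothetical helpers; the callable passed in would wrap your own `engine->Search(...)` call.

```cpp
#include <algorithm>
#include <chrono>
#include <cmath>
#include <cstddef>
#include <vector>

// Nearest-rank percentile for p in (0, 100]. Takes a copy because it sorts.
double Percentile(std::vector<double> samples, double p) {
    if (samples.empty()) return 0.0;
    std::sort(samples.begin(), samples.end());
    const double raw = std::ceil(p * static_cast<double>(samples.size()) / 100.0);
    std::size_t rank = raw < 1.0 ? 1 : static_cast<std::size_t>(raw);
    rank = std::min(rank, samples.size());
    return samples[rank - 1];
}

// Call query() `runs` times and return each call's wall-clock latency in ms.
template <typename Query>
std::vector<double> MeasureLatenciesMs(Query&& query, std::size_t runs) {
    std::vector<double> latencies;
    latencies.reserve(runs);
    for (std::size_t i = 0; i < runs; ++i) {
        const auto start = std::chrono::steady_clock::now();
        query();
        const auto stop = std::chrono::steady_clock::now();
        latencies.push_back(
            std::chrono::duration<double, std::milli>(stop - start).count());
    }
    return latencies;
}
```

For example, `Percentile(MeasureLatenciesMs(run_query, 1000), 99.0)` would give the P99 in milliseconds for a callable `run_query`. Use enough runs (at least a few hundred) that the 99th percentile is not determined by a single outlier.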