Sorex - Overview
Sorex is an attempt to bring database-class search to the browser, with a formal verification twist.
Most client-side search libraries make tradeoffs that hurt relevance. They tokenize aggressively, losing substring matches. They skip fuzzy matching for speed. They rank by term frequency alone, ignoring document structure. The result: users search for "auth" and don't find "authentication."
The project started from a personal frustration: searching "auth" and not finding "authentication", or making a typo and getting zero results. As a non-native English speaker, I wanted search that tolerates the mistakes I actually make.
Database search engines like Elasticsearch and Meilisearch solve these problems with suffix arrays, inverted indexes, and sophisticated ranking. But they require servers. Sorex asks: what if we brought those techniques to a 153KB WASM binary that runs entirely in the browser?
The formal verification twist: search ranking is notoriously hard to get right. Sorex encodes its ranking invariants in Lean 4 and proves them mathematically correct. When we say "title matches rank above content matches," that's not just tested. It's proven.
Reading Paths
New to Sorex? Start here:
- Quick Start - Get search running in 5 minutes
- Integration - Framework examples (React, vanilla JS)
- Troubleshooting - When things go wrong
Building an API?
- TypeScript API - Browser WASM bindings
- Rust API - Library for index building
- CLI Reference - Command-line tools
Understanding the internals?
- Runtime - Browser execution model
- Architecture - System design
- Binary Format - .sorex file specification
- Algorithms - Suffix arrays, Levenshtein DFA
Evaluating performance?
- Benchmarks - Comparisons with other libraries
Contributing?
- Verification - Formal verification rules
- Contributing - Development workflow
Documentation
Getting Started
| Guide | Description |
|---|---|
| Quick Start | Get search running in 5 minutes |
| Integration | Framework examples for React, Svelte, vanilla JS |
| Troubleshooting | Solutions to common issues |
API Reference
| Reference | Description |
|---|---|
| TypeScript API | Browser WASM bindings: loadSorex, SorexSearcher |
| Rust API | Library API: build_index, verification types |
| CLI Reference | Build with sorex index, inspect with sorex inspect |
Internals
| Guide | Description |
|---|---|
| Runtime | Streaming compilation, threading, progressive search |
| Architecture | System design, three-tier search, formal verification |
| Binary Format | .sorex v12 wire format specification |
| Algorithms | Suffix arrays, Levenshtein automata, Block PFOR |
Evidence & Contributing
| Guide | Description |
|---|---|
| Benchmarks | Performance comparisons with other search libraries |
| Verification | How Lean 4 proofs guarantee ranking correctness |
| Contributing | Development workflow, verification checklist |
Quick Start
1. Install the CLI
cargo install sorex2. Build an index
sorex index --input ./docs --output ./search3. Search in the browser
import { loadSorex } from './sorex.js';
const searcher = await loadSorex('./index.sorex');
searcher.search('query', 10, {
onUpdate: (results) => console.log(results), // Progressive updates
onFinish: (results) => console.log(results) // Final results
});Project Structure
sorex/
├── src/
│ ├── lib.rs # Library entry point
│ ├── main.rs # CLI entry point
│ ├── types.rs # Core data structures
│ ├── binary/ # .sorex format encoding/decoding
│ ├── build/ # Index construction pipeline
│ ├── cli/ # CLI display and output
│ ├── fuzzy/ # Levenshtein DFA, edit distance
│ ├── index/ # Suffix arrays, inverted index
│ ├── runtime/ # WASM bindings, Deno runtime
│ ├── scoring/ # Ranking (Lean-verified)
│ ├── search/ # Three-tier search
│ ├── util/ # SIMD, compression
│ └── verify/ # Runtime contracts
├── lean/ # Lean 4 formal specifications
│ └── SearchVerified/ # Proofs for ranking, binary search
├── data/
│ ├── datasets/ # Benchmark datasets (CUTLASS, PyTorch)
│ └── e2e/ # End-to-end tests
├── benches/ # Criterion benchmarks
├── tests/ # Integration and property tests
├── fuzz/ # Fuzz testing targets
└── docs/ # This documentation