Development
Prerequisites
- Rust 1.85+ (edition 2024; 1.88+ recommended for the latest dependencies)
- A C toolchain (gcc, clang, or MSVC)
- On Linux:
pkg-configandmimalloc’s usual build deps git,curl, andcargo(the toolchain)
Build
git clone https://github.com/muhammad-fiaz/AccelerateSearch
cd AccelerateSearch
cargo build
The binary lands at target/debug/accelerate.
Run
cargo run -- --config config/default.toml
Logs are emitted to stdout and logs/accelerate-YYYY-MM-DD.log.
Test
cargo test --workspace --no-fail-fast
- Unit tests live next to the code they cover (
#[cfg(test)] mod tests). - Property-based tests use
proptestfor the filter parser (filters::evaluator) and the tokenizer (indexing::analyzer). - Every crate that exposes a public type is documented with
cargo doc --no-deps --workspace; the CI fails on any rustdoc warning.
Lint and format
cargo fmt --all -- --check
cargo clippy --workspace --all-targets -- -D warnings
The CI workflow runs both on every push; failing either blocks the build.
Benchmarks
benchmark/ is a standalone project that spins up a 1M-document
collection and measures indexing and search throughput.
cd benchmark
cargo run --release
For micro-benchmarks, use cargo bench in the crate of interest
(currently only filters).
Project layout
crates/ # library crates
api/ # HTTP handlers, DTOs, OpenAPI
auth/ # master key, API keys, tenant tokens
cache/ # LRU + TTL cache
cluster/ # cluster skeleton (TODO)
collections/ # collection metadata service
config/ # TOML config, validation
documents/ # document service (add, update, delete, get, list)
errors/ # AppError and From impls
facets/ # facet distribution
filters/ # filter parser & evaluator
hybrid/ # hybrid query fusion (RRF)
highlighting/ # <em> highlight
indexing/ # tokenization + inverted index + FST term dict
metrics/ # Prometheus exporter
models/ # shared data types
replication/ # replication skeleton (TODO)
scheduler/ # cron + interval jobs
search/ # BM25, ranking, query parser
security/ # rate limit, CORS, audit
server/ # HTTP lifecycle, banner
sharding/ # sharding skeleton (TODO)
snapshots/ # tar+zstd snapshots
storage/ # StorageBackend trait + redb
synonyms/ # synonym map storage and lookup
tasks/ # async task queue
telemetry/ # tracing-subscriber setup
typo/ # Damerau-Levenshtein
utils/ # helpers (hash, random, time)
validation/ # input validation + sanitization
vector/ # embedding types + quantization
config/ # default.toml
docs/ # mdbook user guide (this site)
benchmark/ # standalone benchmark project
.github/ # CI + release + docs workflows
Adding a new feature
- Decide the layer. Filters belong in
filters/, ranking tweaks insearch/, persistence instorage/, etc. The crate boundaries exist to keep compile times low; honour them. - Define the data type in
models. Public types live there so they can be shared across crates without circular deps. - Write a failing unit test first. Tests are colocated with code.
- Implement the feature. No
unwrapoutside tests; nounsafe. - Add an integration test if the feature is exposed via HTTP.
- Update
docs/api.mdif a new route was added, anddocs/configuration.mdif a new config key was added. - Run
cargo fmt,cargo clippy, andcargo testbefore opening a PR.
Adding a new crate
mkdir crates/<name> && cd crates/<name>- Copy a minimal
Cargo.tomlfrom a sibling crate; pin the workspace deps and inherit thelintstable. - Add it to
[workspace.dependencies]in the workspaceCargo.tomlif other crates need to depend on it, otherwise just to the[workspace] memberslist (default). - Update
docs/architecture.mdto show the new crate in the diagram.
Releasing
- Bump versions:
cargo set-version --workspace 0.X.0(manual edit acceptable). - Update
CHANGELOG.md(chronological, newest first). - Tag the commit:
git tag v0.X.0. - Push the tag;
.github/workflows/release.ymlcross-compiles for six targets and publishes a draft release.
Common pitfalls
unwrapin non-test code will fail clippy. Use?,.expect("…")with a justification, or convert the error viaFromintoAppError.- Forgetting to invalidate the search cache after a document write.
See
crates/api/src/v1/documents.rsfor the pattern. - Using
println!for logging. Usetracing::{info, warn, error}so the structured-logging pipeline picks it up. - Adding a new env var that doesn’t go through
crates/config. All configuration must be TOML-driven; env vars are an escape hatch only.