Getting Started#

Install#

Prebuilt wheels are published on PyPI for CPython 3.10–3.13 on Linux x86-64, macOS arm64 (Apple Silicon), and Windows x86-64:

pip install seqtree

Note

There are no prebuilt wheels for Intel (x86-64) macOS. On Intel Macs, build from source as described below; this only needs a C++17 compiler and CMake.

To build the C++17/20 core and the pybind11 binding from a clone of the repository instead:

bash setup.sh            # repo-local .venv + editable install
bash setup.sh --tests    # also install pytest
bash setup.sh --bench    # also install benchmark deps
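
To choose between the two install routes, the wheel matrix listed above can be checked from Python. This is a minimal sketch using only the standard library; the helper name has_prebuilt_wheel is hypothetical and not part of seqtree, and the platform strings are the values that platform.system() and platform.machine() report on those targets.

```python
import platform
import sys

# Wheel matrix as listed in the install notes: (OS, CPU architecture).
WHEEL_PLATFORMS = {
    ("Linux", "x86_64"),    # Linux x86-64
    ("Darwin", "arm64"),    # macOS on Apple Silicon
    ("Windows", "AMD64"),   # Windows x86-64
}
WHEEL_PYTHONS = {(3, 10), (3, 11), (3, 12), (3, 13)}


def has_prebuilt_wheel(py_version, system, machine, implementation="CPython"):
    """Return True if this interpreter/platform is in the published wheel matrix.

    Hypothetical helper for illustration only; it is not part of seqtree.
    """
    return (
        implementation == "CPython"
        and tuple(py_version[:2]) in WHEEL_PYTHONS
        and (system, machine) in WHEEL_PLATFORMS
    )


ok = has_prebuilt_wheel(
    sys.version_info,
    platform.system(),
    platform.machine(),
    platform.python_implementation(),
)
print("pip install seqtree" if ok else "bash setup.sh  (build from source)")
```

On an Intel Mac the check returns False, which matches the note above: that platform goes through setup.sh.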

Batches in parallel#

search_batch releases the GIL and runs a C++ thread pool over the queries:

results = idx.search_batch(queries, p, threads=0)   # 0 = all cores
# results[i] is the hit list for queries[i]
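
The ordering contract (results[i] belongs to queries[i]) can be illustrated with a pure-Python stand-in. This is only a sketch under stated assumptions: toy_search and toy_search_batch are hypothetical names, the metric is plain Hamming distance, and a Python thread pool does not give the parallel speed-up of the GIL-free C++ pool. It shows only how a thread pool can return results in input order.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def hamming(a, b):
    """Mismatch count for equal-length strings; None if the lengths differ."""
    if len(a) != len(b):
        return None
    return sum(x != y for x, y in zip(a, b))


def toy_search(refs, query, max_dist):
    """Hypothetical single-query search: (ref_id, distance) pairs within max_dist."""
    hits = []
    for ref_id, ref in enumerate(refs):
        d = hamming(ref, query)
        if d is not None and d <= max_dist:
            hits.append((ref_id, d))
    return hits


def toy_search_batch(refs, queries, max_dist, threads=0):
    """Search every query in a thread pool; result i belongs to queries[i]."""
    workers = threads or os.cpu_count() or 1  # 0 means "use all cores"
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map yields results in input order, not completion order.
        return list(pool.map(lambda q: toy_search(refs, q, max_dist), queries))


refs = ["CASSLAPGATNEKLFF", "CASSLELGATNEKLFF", "CASSQDRGYEQYF"]
queries = ["CASSLAPGATNEKLFF", "CASSQDRGYEQYF", "AAAA"]
results = toy_search_batch(refs, queries, max_dist=2)
for query, hits in zip(queries, results):
    print(query, hits)
```

A query with no hits still occupies its slot as an empty list, so zip(queries, results) stays aligned.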

Top hits, matrices, alignment#

# k best hits
top = idx.search_top("CASSLAPGATNEKLFF", p, k=5)

# BLOSUM62-weighted penalty budget (seqtrie engine)
pm = seqtree.SearchParams(matrix="BLOSUM62", max_penalty=12, engine="seqtrie", gap_open=8)
hits = idx.search("CASSLAPGATNEKLFF", pm)

# alignment on demand (never computed during search)
aln = idx.align(0, "CASSLELGATNEKLFF", p)
print(aln.aligned_query, aln.aligned_ref, aln.ops)   # ops: M/S/I/D per column
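
To make the per-column ops string concrete, here is a minimal sketch of a unit-cost global alignment with traceback. It is not seqtree's aligner: align_ops is a hypothetical helper, costs are plain edit distance rather than matrix-weighted, and the reading of the letters is an assumption (M = match, S = substitution, I = residue present only in the query, D = residue present only in the reference).

```python
def align_ops(query, ref):
    """Unit-cost global alignment; returns (aligned_query, aligned_ref, ops).

    Hypothetical illustration only. Assumed op letters: M match, S substitution,
    I residue only in the query, D residue only in the reference.
    """
    n, m = len(query), len(ref)
    # dist[i][j] = edit distance between query[:i] and ref[:j]
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dist[i][0] = i
    for j in range(1, m + 1):
        dist[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = dist[i - 1][j - 1] + (query[i - 1] != ref[j - 1])
            dist[i][j] = min(sub, dist[i - 1][j] + 1, dist[i][j - 1] + 1)

    # Trace back from the bottom-right corner, emitting one op per column.
    aq, ar, ops = [], [], []
    i, j = n, m
    while i > 0 or j > 0:
        diag_ok = (
            i > 0 and j > 0
            and dist[i][j] == dist[i - 1][j - 1] + (query[i - 1] != ref[j - 1])
        )
        if diag_ok:
            aq.append(query[i - 1])
            ar.append(ref[j - 1])
            ops.append("M" if query[i - 1] == ref[j - 1] else "S")
            i, j = i - 1, j - 1
        elif i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            aq.append(query[i - 1])
            ar.append("-")
            ops.append("I")
            i -= 1
        else:
            aq.append("-")
            ar.append(ref[j - 1])
            ops.append("D")
            j -= 1
    return "".join(reversed(aq)), "".join(reversed(ar)), "".join(reversed(ops))


aligned_query, aligned_ref, ops = align_ops("CASSLELGATNEKLFF", "CASSLAPGATNEKLFF")
print(aligned_query)
print(aligned_ref)
print(ops)
```

All three returned strings have the same length, one character per alignment column, which is what makes the ops string easy to print under the two aligned sequences.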

Batch-vs-batch#

To compare two sets, pairwise_batch automatically indexes the larger set and streams the smaller one through it. Results are always a-major, that is, grouped by the first argument regardless of which set was indexed:

pairs = seqtree.pairwise_batch(query_set, db_set, p, alphabet="aa")
# pairs[i] are hits for query_set[i]; Hit.ref_id indexes db_set
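
The "index the larger set, stream the smaller, report by the first argument" idea can be sketched in pure Python. Everything here is an assumption made for illustration: toy_pairwise_batch and build_index are hypothetical names, the index is a simple length bucket, and the metric is Hamming distance rather than seqtree's search.

```python
from collections import defaultdict


def build_index(seqs):
    """Toy index: bucket sequence ids by length (Hamming needs equal lengths)."""
    index = defaultdict(list)
    for seq_id, seq in enumerate(seqs):
        index[len(seq)].append(seq_id)
    return index


def toy_pairwise_batch(a, b, max_dist):
    """Return hits where hits[i] lists (b_id, distance) for a[i].

    Hypothetical illustration: the larger set is indexed and the smaller one
    streamed, but the output is always laid out by the first argument.
    """
    a_is_indexed = len(a) > len(b)
    indexed, streamed = (a, b) if a_is_indexed else (b, a)
    index = build_index(indexed)

    hits = [[] for _ in a]
    for s_id, seq in enumerate(streamed):
        for r_id in index.get(len(seq), []):
            d = sum(x != y for x, y in zip(seq, indexed[r_id]))
            if d <= max_dist:
                # Map (indexed id, streamed id) back to (a id, b id).
                a_id, b_id = (r_id, s_id) if a_is_indexed else (s_id, r_id)
                hits[a_id].append((b_id, d))
    for row in hits:
        row.sort()
    return hits


query_set = ["CASSLAPGATNEKLFF", "CASSQDRGYEQYF"]
db_set = ["CASSLELGATNEKLFF", "CASSQDRGYEQYF", "CASSLAPGATNEKLFF"]
pairs = toy_pairwise_batch(query_set, db_set, max_dist=2)
for query, row in zip(query_set, pairs):
    print(query, row)
```

Swapping the arguments changes which set is indexed internally, but the result still has one row per element of the first argument, so callers never need to know which side was indexed.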