Installation ============ ``arda`` uses a dedicated conda environment for the MMseqs2 binary and the C++ toolchain; the package itself installs with pip and builds a small C++ extension. Bootstrap --------- .. code-block:: bash bash setup.sh conda activate arda ``setup.sh`` flags: * ``--no-conda`` — use the already-active environment instead of creating ``arda``. * ``--build-db`` — rebuild the reference database after install (needs IgBLAST). * ``--tests`` — run the fast unit + synthetic suites. What gets installed ------------------- * The ``arda`` conda env (Python, ``mmseqs2``, a C++ compiler, perl). * The latest IgBLAST release into ``bin/`` (gitignored) — only needed to rebuild references, not at annotation time. * The ``arda`` package + the ``arda._markup`` C++ extension (editable install). MMseqs2 without conda --------------------- If you install with plain ``pip`` (no conda env) and ``mmseqs`` is not on ``PATH``, arda **auto-fetches** a static MMseqs2 binary into ``bin/mmseqs`` on first use — no manual install needed. Controls: * ``$ARDA_MMSEQS`` — use a specific mmseqs binary (highest priority). * ``$ARDA_MMSEQS_ASSET`` — override the release asset (e.g. ``mmseqs-linux-sse41.tar.gz`` on pre-AVX2 CPUs). * ``$ARDA_NO_AUTO_FETCH`` — disable auto-fetch (then install mmseqs yourself). Fetch eagerly with ``python scripts/fetch_mmseqs.py`` (``setup.sh --no-conda`` does this for you). The committed ``database/vdj//`` references — including **precompiled MMseqs2 indexes** under ``mmseqs/`` — mean most users do not need to build anything. The shipped indexes are used when the local MMseqs2 version matches; otherwise arda rebuilds a private cache on first run. ``arda build-index`` (re)builds the shipped indexes for your MMseqs2 version.