Bio Sea Pearl

Sequence analysis and bioinformatics utilities in Python and Perl.

Bio Sea Pearl is a dual-language bioinformatics toolkit that integrates mature Perl implementations with a modern Python interface. It provides pairwise sequence alignment, Markov chain simulation, sequence distance metrics, k-mer analysis, and full-text indexing via the Burrows–Wheeler Transform. All of these are accessible through a unified CLI, a REST API, or direct Python imports.


Core capabilities

| Subsystem | What it does | Implementation |
|---|---|---|
| Alignment | Global, local, and LCS pairwise alignment with affine gap penalties | Python + Perl (Gotoh algorithm) |
| Markov Chains | Train transition models on sequences and sample random walks | Perl (first-order and higher-order) |
| Sequence Tools | Hamming distance, Levenshtein distance, k-mer counting, pattern search | Python (native) + Perl (Inline C) |
| BWT & FM-index | Suffix arrays, Burrows–Wheeler Transform, FM-index substring search | Pure Python |
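
To make the Sequence Tools row concrete, here is a minimal sketch of the three simplest operations it lists. The function names `hamming`, `levenshtein`, and `kmer_counts` are illustrative assumptions and are not taken from the Bio Sea Pearl source; the toolkit's own implementations may differ.

```python
from collections import Counter


def hamming(a: str, b: str) -> int:
    """Count the positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance requires equal-length sequences")
    return sum(x != y for x, y in zip(a, b))


def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions, or substitutions."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,             # delete x from a
                curr[j - 1] + 1,         # insert y into a
                prev[j - 1] + (x != y),  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def kmer_counts(seq: str, k: int) -> Counter:
    """Count every overlapping substring of length k."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


print(hamming("ACGTACGT", "ACGTACGA"))   # 1
print(levenshtein("kitten", "sitting"))  # 3
print(kmer_counts("ACGTACGT", 3)["CGT"]) # 2
```

The Levenshtein sketch keeps only two rows of the dynamic-programming table, so memory use grows with the length of the second sequence rather than with the product of both lengths.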

Three ways to use it

CLI:

biosea seqtools hamming ACGTACGT ACGTACGA
biosea bwt search --sequence ACGTACGT --pattern CGT
biosea align seq1.fa seq2.fa --mode global

REST API:

curl -X POST http://localhost:8000/distance \
    -H "Content-Type: application/json" \
    -d '{"seq1": "kitten", "seq2": "sitting", "metric": "levenshtein"}'

Python:

from bio_sea_pearl.api import hamming_distance, build_fm_index, search_fm_index

print(hamming_distance("ACGT", "AGGT"))  # 1

idx = build_fm_index("ACGTACGT")
print(search_fm_index(idx, "CGT"))       # [1, 5]
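
For readers curious how an FM-index search can arrive at `[1, 5]`, the following is a rough, self-contained sketch of backward search over a Burrows–Wheeler Transform. It is not the toolkit's implementation: the names `build_fm_index_sketch` and `search_fm_index_sketch`, the `$` sentinel, and the naive suffix sort are all assumptions chosen for brevity.

```python
def build_fm_index_sketch(text: str, sentinel: str = "$"):
    """Build a suffix array, first-column offsets, and occurrence table."""
    text += sentinel  # the sentinel must sort before every other character
    # Naive suffix array: fine for short inputs, quadratic or worse for long ones.
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    # BWT: the character preceding each sorted suffix (wraps around at i == 0).
    bwt = "".join(text[i - 1] for i in sa)

    # first[c]: number of characters in the text strictly smaller than c.
    counts = {}
    for ch in text:
        counts[ch] = counts.get(ch, 0) + 1
    first, total = {}, 0
    for ch in sorted(counts):
        first[ch] = total
        total += counts[ch]

    # occ[c][i]: number of occurrences of c in bwt[:i].
    occ = {ch: [0] * (len(bwt) + 1) for ch in counts}
    for i, ch in enumerate(bwt):
        for c in occ:
            occ[c][i + 1] = occ[c][i] + (c == ch)
    return sa, first, occ


def search_fm_index_sketch(index, pattern: str) -> list:
    """Return sorted start positions of pattern, found by backward search."""
    sa, first, occ = index
    if not pattern:
        return []
    lo, hi = 0, len(sa)  # half-open range of matching suffix-array rows
    for ch in reversed(pattern):
        if ch not in first:
            return []
        lo = first[ch] + occ[ch][lo]
        hi = first[ch] + occ[ch][hi]
        if lo >= hi:
            return []
    return sorted(sa[lo:hi])


idx = build_fm_index_sketch("ACGTACGT")
print(search_fm_index_sketch(idx, "CGT"))  # [1, 5]
```

Each pattern character narrows the suffix-array range with two table lookups, so the search cost depends on the pattern length rather than on the length of the indexed text.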

Architecture at a glance

              ┌──────────┐    ┌──────────────┐
              │  biosea  │    │ FastAPI       │
              │   CLI    │    │ REST server   │
              └────┬─────┘    └──────┬────────┘
                   │                 │
                   └────────┬────────┘
                   ┌────────▼────────┐
                   │   API layer     │
                   │ (bio_sea_pearl) │
                   └───┬─────────┬───┘
                       │         │
              ┌────────▼──┐  ┌───▼──────────┐
              │  Wrappers │  │ Pure Python   │
              │ (Perl via │  │ (BWT, seqtools│
              │ subprocess)│  │  distances)   │
              └────┬──────┘  └──────────────┘
          ┌────────▼────────┐
          │  Perl scripts   │
          │  & modules      │
          │ (alignment,     │
          │  markov,        │
          │  seqtools)      │
          └─────────────────┘

The Python API layer is the single dispatch point. Whether a request arrives via the CLI, the REST API, or a direct import, it flows through the same API functions in src/bio_sea_pearl/api/. See Architecture for the full breakdown.
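
The single-dispatch idea can be illustrated with a small standard-library sketch in which two front ends delegate to one shared function. Everything here is hypothetical: `cli_main` and `rest_handler` are stand-ins for the real `biosea` CLI and FastAPI server, and this local `hamming_distance` merely plays the role of an API-layer function.

```python
import argparse
import json


def hamming_distance(seq1: str, seq2: str) -> int:
    """Stand-in for an API-layer function; every front end calls this."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must have equal length")
    return sum(a != b for a, b in zip(seq1, seq2))


def cli_main(argv: list) -> str:
    """CLI-style front end: parse arguments, delegate, format as text."""
    parser = argparse.ArgumentParser(prog="demo")
    parser.add_argument("seq1")
    parser.add_argument("seq2")
    args = parser.parse_args(argv)
    return str(hamming_distance(args.seq1, args.seq2))


def rest_handler(body: str) -> str:
    """REST-style front end: decode JSON, delegate, encode JSON."""
    payload = json.loads(body)
    result = hamming_distance(payload["seq1"], payload["seq2"])
    return json.dumps({"distance": result})


print(cli_main(["ACGTACGT", "ACGTACGA"]))               # 1
print(rest_handler('{"seq1": "ACGT", "seq2": "AGGT"}'))  # {"distance": 1}
```

Because validation and computation live only in the shared function, a fix or behaviour change made there reaches every entry point at once.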


Quick start

pip install bio-sea-pearl
biosea --help

Installation guide · Quickstart tutorial


Documentation map

| Section | Description |
|---|---|
| Installation | Install from PyPI, source, or Docker |
| Quickstart | First commands across all subsystems |
| CLI Reference | Complete biosea command documentation |
| REST API | Endpoint schemas and interactive docs |
| Architecture | Layers, dispatch, data flow, repo layout |
| Alignment | Gotoh DP, scoring matrices, parallelisation |
| Markov Chains | Transition models, sampling, random walks |
| Sequence Tools | Distances, k-mers, Boyer–Moore search |
| BWT & FM-index | Suffix arrays, BWT, backward search |
| Extending | Adding modules, porting Perl to Python |
| Troubleshooting | Common errors and failure modes |