SequenceAligner

C++17 CMake Doxygen DOI

SequenceAligner is a high-performance C++ tool for biological sequence alignment. It combines classical dynamic programming with modern indexing techniques (FM-index) and parallel computation (MPI + OpenMP).

It supports:

  • Longest Common Subsequence (LCS)
  • Global alignment (Needleman–Wunsch with affine gaps)
  • Local alignment (Smith–Waterman with affine gaps)
  • Seed-and-extend alignment using FM-index
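
As an illustration of the simplest mode listed above, the following is a minimal sketch of an LCS length computation with two rolling rows. The function name `lcs_length` is hypothetical and is not taken from the SequenceAligner API; the project's own implementation may be organized differently.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Length of the longest common subsequence of a and b.
// Hypothetical helper for illustration only (not the project's API).
// Two rolling rows keep memory at O(|b|) instead of O(|a| * |b|).
std::size_t lcs_length(const std::string& a, const std::string& b) {
    std::vector<std::size_t> prev(b.size() + 1, 0);
    std::vector<std::size_t> curr(b.size() + 1, 0);
    for (std::size_t i = 1; i <= a.size(); ++i) {
        for (std::size_t j = 1; j <= b.size(); ++j) {
            if (a[i - 1] == b[j - 1]) {
                curr[j] = prev[j - 1] + 1;               // extend a common subsequence
            } else {
                curr[j] = std::max(prev[j], curr[j - 1]); // skip a character in a or b
            }
        }
        std::swap(prev, curr);  // the row just filled becomes the previous row
    }
    const std::size_t result = prev[b.size()];
    assert(result <= std::min(a.size(), b.size()));
    return result;
}
```

For example, `lcs_length("AGGTAB", "GXTXAYB")` evaluates to 4 (the subsequence `GTAB`).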

The tool is optimized for large-scale sequence comparison, not just textbook examples.


What makes this different

This is not a naive DP implementation.

From the code:

  • FM-index enables fast substring search and seed generation
  • Suffix arrays are constructed in \(O(n \log n)\) time
  • Alignment uses affine gap penalties (GAP_OPEN, GAP_EXTEND)
  • SIMD intrinsics (immintrin.h) and MPI parallelism support large-scale execution
  • Optional binary output for DP matrices
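
To make the affine gap model concrete, here is a minimal sketch of a global alignment score using the standard three-state (Gotoh) recurrence. The names `GAP_OPEN` and `GAP_EXTEND` come from the list above, but their values, the `MATCH`/`MISMATCH` scores, the function name `nw_affine_score`, and the convention that a gap of length L scores `GAP_OPEN + (L - 1) * GAP_EXTEND` are all assumptions made for this example. It computes the score only and omits traceback, SIMD, and parallelism.

```cpp
#include <algorithm>
#include <cassert>
#include <climits>
#include <string>
#include <vector>

// Illustrative scoring values (assumptions, not taken from the project).
constexpr int MATCH = 2;
constexpr int MISMATCH = -1;
constexpr int GAP_OPEN = -5;          // score of the first position of a gap
constexpr int GAP_EXTEND = -1;        // score of each further gap position
constexpr int NEG_INF = INT_MIN / 2;  // marks unreachable states without overflow

// Global alignment score with affine gaps (hypothetical helper).
// M: a[i] aligned to b[j]; X: a[i] aligned to a gap; Y: b[j] aligned to a gap.
int nw_affine_score(const std::string& a, const std::string& b) {
    const std::size_t n = a.size(), m = b.size();
    assert(n < 100000 && m < 100000);  // sketch stores full matrices; keep inputs modest
    using Matrix = std::vector<std::vector<int>>;
    Matrix M(n + 1, std::vector<int>(m + 1, NEG_INF));
    Matrix X = M, Y = M;

    M[0][0] = 0;
    for (std::size_t i = 1; i <= n; ++i)
        X[i][0] = GAP_OPEN + static_cast<int>(i - 1) * GAP_EXTEND;
    for (std::size_t j = 1; j <= m; ++j)
        Y[0][j] = GAP_OPEN + static_cast<int>(j - 1) * GAP_EXTEND;

    for (std::size_t i = 1; i <= n; ++i) {
        for (std::size_t j = 1; j <= m; ++j) {
            const int s = (a[i - 1] == b[j - 1]) ? MATCH : MISMATCH;
            M[i][j] = s + std::max({M[i - 1][j - 1], X[i - 1][j - 1], Y[i - 1][j - 1]});
            // Opening a gap pays GAP_OPEN; continuing the same gap pays GAP_EXTEND.
            X[i][j] = std::max({M[i - 1][j] + GAP_OPEN,
                                X[i - 1][j] + GAP_EXTEND,
                                Y[i - 1][j] + GAP_OPEN});
            Y[i][j] = std::max({M[i][j - 1] + GAP_OPEN,
                                Y[i][j - 1] + GAP_EXTEND,
                                X[i][j - 1] + GAP_OPEN});
        }
    }
    return std::max({M[n][m], X[n][m], Y[n][m]});
}
```

With these example values, aligning `ACGTT` against `ACG` scores 0: three matches (+6) plus one gap of length two (-5 - 1). Two separate single-position gaps would cost -10, which is why affine penalties favor one longer gap over several short ones.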

Workflow

  1. Parse FASTA input
  2. Build FM-index on target
  3. Generate k-mer seeds
  4. Chain seeds into candidate regions
  5. Run DP alignment (global/local)
  6. Output formatted alignment + optional matrices
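
Steps 2 and 3 of the workflow can be sketched as follows. This is a toy FM-index under stated assumptions, not the project's implementation: the suffix array is built with a naive comparison sort rather than an \(O(n \log n)\) algorithm, the occurrence table is stored uncompressed, the alphabet is restricted to `ACGT`, and the names `FmIndex`, `locate`, `Seed`, and `find_seeds` are hypothetical.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <numeric>
#include <string>
#include <vector>

// Map a character to a small code; '$' is the sentinel. Returns -1 if unsupported.
inline int code(char ch) {
    switch (ch) {
        case '$': return 0; case 'A': return 1; case 'C': return 2;
        case 'G': return 3; case 'T': return 4; default: return -1;
    }
}

// Toy FM-index over a DNA text (hypothetical type for illustration).
struct FmIndex {
    std::vector<int> sa;                  // suffix array of text + '$'
    std::array<int, 5> c{};               // c[k] = number of characters with code < k
    std::vector<std::array<int, 5>> occ;  // occ[i][k] = count of code k in bwt[0..i)

    explicit FmIndex(std::string text) {
        for (char ch : text) assert(code(ch) > 0);  // only A, C, G, T allowed
        text.push_back('$');
        const int n = static_cast<int>(text.size());
        sa.resize(n);
        std::iota(sa.begin(), sa.end(), 0);
        // Naive construction: sort suffix start positions by comparing suffixes.
        std::sort(sa.begin(), sa.end(), [&](int x, int y) {
            return text.compare(x, std::string::npos, text, y, std::string::npos) < 0;
        });
        std::array<int, 5> counts{};
        for (char ch : text) ++counts[code(ch)];
        for (int k = 1; k < 5; ++k) c[k] = c[k - 1] + counts[k - 1];
        occ.assign(n + 1, std::array<int, 5>{});
        for (int i = 0; i < n; ++i) {
            const char bwt_char = text[(sa[i] + n - 1) % n];  // char preceding suffix sa[i]
            occ[i + 1] = occ[i];
            ++occ[i + 1][code(bwt_char)];
        }
    }

    // Backward search: all start positions of pattern in the text, sorted ascending.
    std::vector<int> locate(const std::string& pattern) const {
        if (pattern.empty()) return {};
        int lo = 0, hi = static_cast<int>(sa.size());
        for (auto it = pattern.rbegin(); it != pattern.rend(); ++it) {
            const int k = code(*it);
            if (k <= 0) return {};
            lo = c[k] + occ[lo][k];
            hi = c[k] + occ[hi][k];
            if (lo >= hi) return {};
        }
        std::vector<int> hits(sa.begin() + lo, sa.begin() + hi);
        std::sort(hits.begin(), hits.end());
        return hits;
    }
};

struct Seed { int query_pos; int target_pos; };

// Exact k-mer seeds: every k-mer of the query that occurs in the indexed target.
std::vector<Seed> find_seeds(const FmIndex& index, const std::string& query, std::size_t k) {
    assert(k > 0);
    std::vector<Seed> seeds;
    for (std::size_t q = 0; q + k <= query.size(); ++q)
        for (int t : index.locate(query.substr(q, k)))
            seeds.push_back({static_cast<int>(q), t});
    return seeds;
}
```

For a target `ACGTACGT`, `FmIndex("ACGTACGT").locate("ACG")` yields positions 0 and 4. Seeds lying on the same diagonal (equal `target_pos - query_pos`) are natural candidates for the chaining in step 4.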

For details on classes and functions, see the API Documentation.