Quick Start — Running with .bin + FASTA files

This page walks you through the most common workflow: rendering a dotplot directly from a raw binary DP matrix and two FASTA sequence files.


1. Prepare your data directory

Put your files in data/ at the repository root:

fastdpplot/
└── data/
    ├── dp_matrix.bin     ← raw binary DP/scoring matrix
    ├── query.fasta       ← query sequence (rows of the matrix)
    ├── subject.fasta     ← subject sequence (columns of the matrix)
    └── dp_path.txt       ← DP traceback path (two-column integer file)

File naming

The filenames above are just examples. You can use any names — the paths are passed as CLI arguments.

What is a .txt DP path file?

A DP path file contains the optimal alignment traceback as a sequence of (sequence A position, sequence B position) coordinate pairs. Each line holds two whitespace-separated non-negative integers, and the file has no header line. Column 1 maps to the X-axis (sequence A); column 2 maps to the Y-axis (sequence B). The path is rendered as a connected blue line on top of the scatter points in all output modes.
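To make the layout concrete, here is a minimal sketch that writes a path file in the format described above and reads it back. The helpers `write_dp_path` and `read_dp_path` are hypothetical names used only for illustration; they are not part of fastdpplot, whose own loader is `load_dp_path`. Skipping blank lines is an assumption of this sketch, not documented behaviour.

```python
from pathlib import Path
import tempfile


def write_dp_path(path, pairs):
    """Write (a_pos, b_pos) pairs, one pair per line, with no header."""
    lines = [f"{a} {b}" for a, b in pairs]
    Path(path).write_text("\n".join(lines) + "\n")


def read_dp_path(path):
    """Parse a two-column integer file into a list of (a_pos, b_pos) tuples."""
    pairs = []
    text = Path(path).read_text()
    for line_no, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # assumption: blank lines are ignored
        fields = line.split()  # any run of whitespace separates the columns
        if len(fields) != 2:
            raise ValueError(f"line {line_no}: expected 2 columns, got {len(fields)}")
        a, b = int(fields[0]), int(fields[1])
        if a < 0 or b < 0:
            raise ValueError(f"line {line_no}: coordinates must be non-negative")
        pairs.append((a, b))
    return pairs


# A toy traceback: two diagonal steps, one gap step, one more diagonal step.
example = [(0, 0), (1, 1), (2, 1), (3, 2)]
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "dp_path.txt"
    write_dp_path(target, example)
    round_trip = read_dp_path(target)
```

A quick check like this can catch header lines or stray third columns before the file is passed to `--dp-path`.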

What is a .bin file?

A .bin file is a flat, little-endian, row-major binary array. Each cell stores a score or identity value for the alignment of position i (query) with position j (subject). See Data Formats for a full specification and dtype table.
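As an illustration of "flat, little-endian, row-major", the sketch below packs a toy 2 × 3 matrix of `i16` scores into bytes and reads one cell back by computing its byte offset. It uses only the standard `struct` module; the helper `cell` and the toy values are made up for this example, and the assumption that the file contains nothing but the cells (no header) follows from the word "flat" above. Data Formats remains the authoritative specification.

```python
import struct

n_rows, n_cols = 2, 3  # toy sizes: 2 query positions x 3 subject positions
matrix = [
    [5, -2, 0],
    [-1, 7, 3],
]

# Row-major: all of row 0, then all of row 1.
flat = [value for row in matrix for value in row]

# "<" selects little-endian byte order, "h" is a signed 16-bit integer (i16).
blob = struct.pack(f"<{len(flat)}h", *flat)


def cell(buf, i, j, n_cols, fmt="<h"):
    """Read the score at row i (query) and column j (subject)."""
    size = struct.calcsize(fmt)
    offset = (i * n_cols + j) * size  # row-major index times element width
    return struct.unpack_from(fmt, buf, offset)[0]


value = cell(blob, 1, 1, n_cols)  # second row, second column of the toy matrix
```

Writing `blob` to disk with `open(path, "wb")` would produce a file in this layout, which can be handy for building small test inputs.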


2. Build the Rust extension (first time only)

cd fastdpplot
maturin develop

3. Render a static PNG

fastdpplot \
    --matrix   data/dp_matrix.bin \
    --fasta-a  data/query.fasta \
    --fasta-b  data/subject.fasta \
    --dp-path  data/dp_path.txt \
    --output   dotplot.png

This will:

  1. Parse both FASTA files (only the first record of each is used).
  2. Infer the element dtype from the file size (--dtype auto).
  3. Load the raw DP scores (sign preserved — negative penalties remain negative).
  4. Render the dot cloud as a scatter plot, colour-coded by score using a diverging red → gray → green colormap anchored at zero.
  5. Save the PNG to dotplot.png in the current directory.
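The colour mapping in step 4 can be sketched as a piecewise-linear blend between three anchor colours, with zero always landing on gray. The RGB values, the function names, and the choice to scale the negative and positive sides independently are all assumptions made for this illustration; the colours and interpolation fastdpplot actually uses may differ.

```python
# Hypothetical anchor colours for the sketch (not taken from fastdpplot).
RED = (214, 39, 40)
GRAY = (200, 200, 200)
GREEN = (44, 160, 44)


def lerp(c0, c1, t):
    """Linearly blend two RGB colours; t=0 gives c0, t=1 gives c1."""
    return tuple(round(a + (b - a) * t) for a, b in zip(c0, c1))


def score_to_rgb(score, lo, hi):
    """Map a score to RGB on a diverging scale anchored at zero.

    lo is the most negative score in the data, hi the most positive.
    Each side is scaled on its own, so zero maps to gray even when
    abs(lo) and hi differ.
    """
    if score < 0 and lo < 0:
        t = min(score / lo, 1.0)  # 0 at zero, 1 at the most negative score
        return lerp(GRAY, RED, t)
    if score > 0 and hi > 0:
        t = min(score / hi, 1.0)  # 0 at zero, 1 at the most positive score
        return lerp(GRAY, GREEN, t)
    return GRAY


colours = [score_to_rgb(s, lo=-10, hi=5) for s in (-10, -5, 0, 2.5, 5)]
```

Anchoring at zero is what keeps penalties and rewards visually distinct: a mildly negative cell can never be drawn green, however skewed the score range is.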

Custom output size and DPI

fastdpplot \
    --matrix  data/dp_matrix.bin \
    --fasta-a data/query.fasta \
    --fasta-b data/subject.fasta \
    --dp-path data/dp_path.txt \
    --output  dotplot.png \
    --width   6000 --height 6000 \
    --dpi     150

4. Launch an interactive Panel server

fastdpplot \
    --matrix   data/dp_matrix.bin \
    --fasta-a  data/query.fasta \
    --fasta-b  data/subject.fasta \
    --dp-path  data/dp_path.txt \
    --serve \
    --port     5006

Open your browser at http://localhost:5006.
Pan and zoom are backed by the Rust binning engine — every viewport change re-bins only the visible region, keeping the browser responsive even for matrices with 10⁸+ points.
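The idea of re-binning only the visible region can be sketched in a few lines of Python. This is an illustration of the concept, not the Rust engine's algorithm: `bin_viewport` is a hypothetical helper, it counts points rather than aggregating scores, and it scans every point linearly, which a production engine would presumably avoid.

```python
def bin_viewport(points, x_range, y_range, n_bins_x, n_bins_y):
    """Count points per cell of an n_bins_y x n_bins_x grid over the viewport."""
    x0, x1 = x_range
    y0, y1 = y_range
    grid = [[0] * n_bins_x for _ in range(n_bins_y)]
    for x, y in points:
        # Points outside the half-open viewport [x0, x1) x [y0, y1) are skipped.
        if not (x0 <= x < x1 and y0 <= y < y1):
            continue
        col = int((x - x0) / (x1 - x0) * n_bins_x)
        row = int((y - y0) / (y1 - y0) * n_bins_y)
        # Clamp to guard against floating-point rounding at the upper edge.
        col = min(col, n_bins_x - 1)
        row = min(row, n_bins_y - 1)
        grid[row][col] += 1
    return grid


points = [(1, 1), (6, 2), (7, 8), (50, 50)]
grid = bin_viewport(points, x_range=(0, 10), y_range=(0, 10), n_bins_x=2, n_bins_y=2)
```

The output grid has a fixed size regardless of how many points fall inside the viewport, which is why the amount of data sent to the browser stays bounded as the matrix grows.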

Port conflicts

If port 5006 is already in use, change it with --port 8080 (or any free port).
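If you launch the server from a script, you can look for a free port before passing it to --port. The sketch below uses only the standard `socket` module; `port_is_free` and `first_free_port` are hypothetical helpers, not part of fastdpplot. The check is inherently racy, since another process may claim the port between the check and the server start.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if a TCP socket can bind to host:port right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False  # typically "address already in use"
    return True


def first_free_port(start=5006, attempts=20):
    """Return the first free port in [start, start + attempts)."""
    for port in range(start, start + attempts):
        if port_is_free(port):
            return port
    raise RuntimeError(f"no free port in {start}..{start + attempts - 1}")
```

For example, `first_free_port()` returns 5006 when nothing is listening there, and otherwise the next port in the range that can be bound.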


5. Specify the element dtype explicitly

By default (--dtype auto) fastdpplot infers the dtype from the file size. If you know the dtype, pass it explicitly to skip inference and catch mismatches early:

Flag value   C type     Bytes per element
u8           uint8_t    1
i16          int16_t    2
i32          int32_t    4
f32          float      4
f64          double     8

fastdpplot \
    --matrix  data/dp_matrix.bin \
    --fasta-a data/query.fasta \
    --fasta-b data/subject.fasta \
    --dp-path data/dp_path.txt \
    --dtype   u8 \
    --output  dotplot.png

Dtype mismatch

If the explicit dtype does not match the actual file size, the tool raises FastDpError::SizeMismatch and reports the expected and actual byte counts.
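The arithmetic behind this check can be sketched as follows. The sketch assumes the matrix holds exactly rows × columns cells (query length × subject length) with no header; that shape, and the helper names `expected_bytes` and `candidate_dtypes`, are assumptions for illustration rather than the tool's implementation. The byte widths come from the table above.

```python
BYTES_PER_ELEMENT = {"u8": 1, "i16": 2, "i32": 4, "f32": 4, "f64": 8}


def expected_bytes(n_rows, n_cols, dtype):
    """File size implied by a matrix shape and an explicit dtype."""
    return n_rows * n_cols * BYTES_PER_ELEMENT[dtype]


def candidate_dtypes(file_size, n_rows, n_cols):
    """Dtypes whose element width is consistent with the file size."""
    cells = n_rows * n_cols
    if cells == 0 or file_size % cells != 0:
        return []  # size is not a whole number of bytes per cell
    width = file_size // cells
    return [name for name, size in BYTES_PER_ELEMENT.items() if size == width]


# Toy case: a 1000 x 1200 matrix.
two_byte = candidate_dtypes(2_400_000, 1000, 1200)
four_byte = candidate_dtypes(4_800_000, 1000, 1200)
mismatch = expected_bytes(1000, 1200, "u8") != 2_400_000
```

Here `two_byte` contains only `i16`, while `four_byte` contains both `i32` and `f32`: file size alone cannot tell the two 4-byte types apart. How --dtype auto resolves that case is not covered on this page, which is one more reason to pass --dtype explicitly when you know it.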


6. Python API equivalent

The same pipeline is available programmatically:

from fastdpplot.io import load_bin, load_dp_path
from fastdpplot.plot import render_static

result = load_bin(
    "data/dp_matrix.bin",
    "data/query.fasta",
    "data/subject.fasta",
)

dp_path = load_dp_path("data/dp_path.txt")

render_static(
    result["df"],
    output_path="dotplot.png",
    width=4000,
    height=4000,
    x_label=result["x_label"],
    y_label=result["y_label"],
    x_range=result["x_range"],
    y_range=result["y_range"],
    dp_path=dp_path,
)

7. Jupyter notebook

from fastdpplot.io import load_bin
from fastdpplot.plot import show

result = load_bin("data/dp_matrix.bin", "data/query.fasta", "data/subject.fasta")
show(
    result["df"],
    x_label=result["x_label"],
    y_label=result["y_label"],
    x_range=result["x_range"],
    y_range=result["y_range"],
)

Note

show() returns a HoloViews object that Jupyter renders inline. No server is started.