fasta — FASTA Parser
Source: crates/fastdpplot-core/src/fasta.rs
FastaRecord
A single parsed FASTA record.
pub struct FastaRecord {
    pub id: String,          // everything after '>' up to first whitespace
    pub description: String, // everything after first whitespace on the header line
    pub sequence: String,    // full sequence, newlines stripped, uppercased
    pub length: usize,       // sequence.len(), cached
}
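The id/description split described in the field comments can be sketched as follows. This is a minimal illustration, not the crate's code: `split_header` is a hypothetical helper name, and trimming the description is an assumption.

```rust
/// Hypothetical helper: split a FASTA header line into (id, description).
/// Assumption: surrounding whitespace in the description is trimmed.
fn split_header(line: &str) -> (String, String) {
    // Drop the leading '>' marker if present, plus trailing whitespace.
    let body = line.strip_prefix('>').unwrap_or(line).trim_end();
    // Split at the first whitespace character.
    match body.split_once(char::is_whitespace) {
        // id is everything before the first whitespace; the rest is the description.
        Some((id, rest)) => (id.to_string(), rest.trim().to_string()),
        // No whitespace at all: the whole header is the id.
        None => (body.to_string(), String::new()),
    }
}
```

With this sketch, `split_header(">seq1 Homo sapiens chr1")` yields the id `seq1` and the description `Homo sapiens chr1`, while `split_header(">seq2")` yields an empty description.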
parse_fasta
Parse all records from a FASTA file via a streaming BufReader.
Rules
- Lines starting with > begin a new record.
- Lines starting with ; are skipped (FASTA comment syntax).
- All other non-empty lines are appended to the current record's sequence. Each line is trimmed and uppercased before concatenation, so newlines and leading/trailing whitespace are removed, but internal whitespace within a line is preserved.
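The three rules above can be sketched as a loop over a buffered reader. This is an illustration under stated assumptions, not the crate's implementation: `SketchRecord` and `parse_lines` are hypothetical names, the header is kept unsplit, sequence lines that appear before the first header are ignored, and the NoRecords and EmptySequence checks are left out.

```rust
use std::io::{self, BufRead};

/// Hypothetical stand-in for FastaRecord; the header is kept unsplit here.
#[derive(Debug, PartialEq)]
struct SketchRecord {
    header: String,
    sequence: String,
}

/// Apply the three line rules to any buffered reader.
fn parse_lines<R: BufRead>(reader: R) -> io::Result<Vec<SketchRecord>> {
    let mut records: Vec<SketchRecord> = Vec::new();
    for line in reader.lines() {
        let line = line?;
        // Rule 1: a line starting with '>' begins a new record.
        if let Some(header) = line.strip_prefix('>') {
            records.push(SketchRecord {
                header: header.trim_end().to_string(),
                sequence: String::new(),
            });
            continue;
        }
        // Rule 2: a line starting with ';' is a comment and is skipped.
        if line.starts_with(';') {
            continue;
        }
        // Rule 3: any other non-empty line is trimmed, uppercased, and appended.
        let trimmed = line.trim();
        if trimmed.is_empty() {
            continue;
        }
        // Assumption: sequence lines before the first header are ignored.
        if let Some(current) = records.last_mut() {
            current.sequence.push_str(&trimmed.to_uppercase());
        }
    }
    Ok(records)
}
```

Because a byte slice implements `BufRead`, the sketch can be exercised on in-memory text, for example `parse_lines(">seq1\nacgt\n".as_bytes())`. Note that a line such as `NN NN` is appended with its internal space intact, matching the third rule.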
Errors
| Error | Condition |
|---|---|
| FastDpError::Io | File cannot be opened. |
| FastDpError::NoRecords | The file is empty or contains no > headers. |
| FastDpError::EmptySequence | A record has a header but zero sequence bases. |
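The two content-related conditions in the table could be checked along the following lines. `SketchError` and `validate` are hypothetical names used only for illustration; the real FastDpError variants may carry payloads that this page does not show, and the Io case is omitted.

```rust
/// Hypothetical mirror of two FastDpError variants (payloads, if any, omitted).
#[derive(Debug, PartialEq)]
enum SketchError {
    NoRecords,
    EmptySequence,
}

/// Check parsed (id, sequence) pairs against the two content conditions.
fn validate(records: &[(String, String)]) -> Result<(), SketchError> {
    // No headers were found at all.
    if records.is_empty() {
        return Err(SketchError::NoRecords);
    }
    // At least one header was followed by zero sequence bases.
    if records.iter().any(|(_, seq)| seq.is_empty()) {
        return Err(SketchError::EmptySequence);
    }
    Ok(())
}
```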
Example (Rust)
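use fastdpplot_core::fasta::parse_fasta;

// `?` requires an enclosing function whose Result can carry the parser's error.
let records = parse_fasta("data/query.fasta")?;
// Per the Errors table, a file with no records is reported as NoRecords, so records[0] exists on success.
println!("First record: {} ({} bp)", records[0].id, records[0].length);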
Multi-record FASTA files
All records are parsed and returned. The high-level pipeline (dp_input)
uses only the first record from each file.
format_axis_label
Format a FASTA record as an axis label, truncated to max_len characters.
- Returns "{id} | {description}" when the description is non-empty.
- Returns just "{id}" when the description is empty.
- Appends … when the resulting string exceeds max_len.
The default max_len used in the pipeline is 80 characters.
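The labelling rules can be sketched as below. This is not the crate's implementation: `axis_label_sketch` is a hypothetical name that takes the id and description directly rather than a FastaRecord, and it assumes that max_len counts characters and that the ellipsis occupies one of them. The source does not say whether the ellipsis is included in the limit.

```rust
/// Sketch of the labelling rule; not the crate's implementation.
/// Assumption: max_len counts characters and the ellipsis uses one of them.
fn axis_label_sketch(id: &str, description: &str, max_len: usize) -> String {
    // Build the untruncated label first.
    let full = if description.is_empty() {
        id.to_string()
    } else {
        format!("{} | {}", id, description)
    };
    // Short enough: return unchanged.
    if full.chars().count() <= max_len {
        return full;
    }
    // Keep max_len - 1 characters, then append the ellipsis.
    let mut out: String = full.chars().take(max_len.saturating_sub(1)).collect();
    out.push('…');
    out
}
```

Counting with `chars()` rather than byte length keeps the cut on a character boundary, which matters for headers that contain non-ASCII text. Under these assumptions, `axis_label_sketch("seq1", "Homo sapiens", 10)` returns `seq1 | Ho…`.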