Metrics and Distance Measures
This page documents the metrics used in PhosKinTime to evaluate model fit quality.
Fréchet Distance (frechet/distance.py)
What it is
The discrete Fréchet distance is a measure of similarity between two ordered sequences of points (curves). Informally, it is the minimum "leash length" required for a person walking along one curve and a dog walking along the other — both moving forward only, never backward — to stay connected.
For two curves $P = (p_0, p_1, \dots, p_n)$ and $Q = (q_0, q_1, \dots, q_m)$, the discrete Fréchet distance is:
$$ d_F(P, Q) = \min_{\alpha, \beta} \max_{i} \, d(p_{\alpha(i)}, q_{\beta(i)}) $$
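where $\alpha$ and $\beta$ range over all monotone traversals of the two curves: each starts at the first point of its curve, ends at the last, and visits every point in order without skipping or going backward. $d$ is the pointwise distance between two points.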
Implementation
frechet/distance.py provides a Numba JIT-compiled function:
from common.frechet import frechet_distance
score = frechet_distance(true_coords, pred_coords) # returns float
- Inputs: 2D arrays of shape (T, D), where T is the number of time points and D is the number of dimensions (observables per time point).
- Returns: a single float64 scalar, the discrete Fréchet distance.
- Uses @njit(parallel=True) with Numba for JIT compilation and parallelized pairwise distance computation.
The dynamic programming recurrence, where dist[i, j] is the distance between point i of the first curve and point j of the second:
cost[0, 0] = dist[0, 0]
cost[i, 0] = max(cost[i-1, 0], dist[i, 0])
cost[0, j] = max(cost[0, j-1], dist[0, j])
cost[i, j] = max(min(cost[i-1,j], cost[i,j-1], cost[i-1,j-1]), dist[i,j])
The final result is cost[-1, -1].
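The recurrence above can be sketched in plain Python for reference. This is not the Numba implementation in frechet/distance.py; the function name `discrete_frechet` and the choice of Euclidean pointwise distance are assumptions made for illustration.

```python
import math


def discrete_frechet(p, q):
    """Discrete Fréchet distance between two sequences of points.

    p, q: non-empty sequences of points with the same dimension,
    e.g. lists of tuples. The Euclidean pointwise distance used
    here is an assumption, not taken from the project code.
    """
    n, m = len(p), len(q)
    if n == 0 or m == 0:
        raise ValueError("both curves need at least one point")

    # dist[i][j]: distance between point i of p and point j of q
    dist = [[math.dist(pi, qj) for qj in q] for pi in p]

    cost = [[0.0] * m for _ in range(n)]
    cost[0][0] = dist[0][0]
    for i in range(1, n):  # first column: only p advances
        cost[i][0] = max(cost[i - 1][0], dist[i][0])
    for j in range(1, m):  # first row: only q advances
        cost[0][j] = max(cost[0][j - 1], dist[0][j])
    for i in range(1, n):
        for j in range(1, m):
            best_prev = min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
            cost[i][j] = max(best_prev, dist[i][j])
    return cost[-1][-1]


# Two parallel horizontal segments one unit apart
lower = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
upper = [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
print(discrete_frechet(lower, upper))
```

For the two parallel segments the best traversal walks both curves in lockstep, so the result equals the vertical gap between them.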
Where it is used
- networkmodel/runner.py: After optimization, Fréchet scores are computed between observed and predicted time-series trajectories for each gene. These scores are stored in the dashboard bundle (frechet_scores field) and displayed in the Streamlit dashboard.
- scripts/curve_similarity.py: Stand-alone script that computes per-row Fréchet distances between observed and estimated columns in tfopt_results.xlsx / kinopt_results.xlsx.
Fréchet vs RMSE
| Property | RMSE | Discrete Fréchet |
|---|---|---|
| Sensitive to point-to-point order | No | Yes |
| Penalizes shape mismatch (timing shift) | Poorly | Well |
| Scale-dependent | Yes | Yes |
| Normalized | No (raw units) | No (raw units) |
| Interpretable threshold | No | No |
| Computational cost | O(T) | O(T²) |
Use Fréchet distance when trajectory shape and timing matter (e.g., detecting phase lags or peak shifts). Use RMSE when you only care about average point-wise deviation.
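A small comparison can make the difference concrete. The sketch below builds two made-up predictions with the same RMSE against one observed series: one has its peak shifted by a time step, the other has the right shape with a constant offset. It assumes each curve point is a (time, value) pair and uses Euclidean distance; PhosKinTime may build its coordinate arrays differently, and all names and data here are hypothetical.

```python
import math


def discrete_frechet(p, q):
    """Discrete Fréchet distance with Euclidean pointwise distance (assumed)."""
    n, m = len(p), len(q)
    cost = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = math.dist(p[i], q[j])
            if i == 0 and j == 0:
                cost[i][j] = d
            elif i == 0:
                cost[i][j] = max(cost[0][j - 1], d)
            elif j == 0:
                cost[i][j] = max(cost[i - 1][0], d)
            else:
                prev = min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
                cost[i][j] = max(prev, d)
    return cost[-1][-1]


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length value lists."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true))


# Made-up data for illustration only
times = [0.0, 1.0, 2.0, 3.0, 4.0]
observed = [0.0, 0.0, 1.0, 0.0, 0.0]
shifted = [0.0, 0.0, 0.0, 1.0, 0.0]        # same peak, one time step late
offset = rmse(observed, shifted)           # chosen so both RMSE values match
lifted = [y + offset for y in observed]    # same shape, constant offset


def as_curve(values):
    """Pair each value with its time point: assumed (time, value) layout."""
    return list(zip(times, values))


rmse_shifted = rmse(observed, shifted)
rmse_lifted = rmse(observed, lifted)
frechet_shifted = discrete_frechet(as_curve(observed), as_curve(shifted))
frechet_lifted = discrete_frechet(as_curve(observed), as_curve(lifted))

print(f"RMSE:    shifted={rmse_shifted:.3f}  lifted={rmse_lifted:.3f}")
print(f"Fréchet: shifted={frechet_shifted:.3f}  lifted={frechet_lifted:.3f}")
```

With this data both predictions have an RMSE of about 0.63, while the Fréchet distance is 1.0 for the shifted peak and about 0.63 for the constant offset. Because time is one of the coordinates in this sketch, the Fréchet value for the shifted peak depends on the time units used.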
Normalization and Thresholds
The Fréchet distance values in PhosKinTime are not normalized; they are in the same units as the input data. Use them for ranking fits (lower = more similar trajectory), not as absolute quality judgments. To put an individual score in context, compare it against a baseline such as the median score across all proteins.
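One way to contextualize raw scores is to express each one relative to the median and to rank the fits. The sketch below is only an illustration of that idea: the helper name `relative_to_median`, the protein labels, and the score values are all hypothetical and do not come from PhosKinTime.

```python
import statistics


def relative_to_median(scores):
    """Divide each raw score by the median of all scores.

    scores: dict mapping a label to a raw Fréchet distance.
    A ratio above 1 means a worse-than-typical fit.
    """
    baseline = statistics.median(scores.values())
    if baseline == 0:
        raise ValueError("median score is zero; ratios are undefined")
    return {name: value / baseline for name, value in scores.items()}


# Hypothetical scores, made up for illustration only
scores = {"protein_a": 0.2, "protein_b": 0.4, "protein_c": 1.6}

relative = relative_to_median(scores)
ranked = sorted(scores, key=scores.get)  # lower score = more similar trajectory

for name in ranked:
    print(f"{name}: raw={scores[name]:.2f}  relative={relative[name]:.2f}")
```

The ratios are unitless, but they are still only meaningful within one dataset, since the median itself depends on the scale of the input data.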
Performance
- Numba JIT compilation occurs on the first call, so that call is slower. Subsequent calls with the same argument types reuse the compiled code.
- The parallel=True flag enables Numba's parallel loop (prange) for the pairwise distance computation.
- For large T (many time points), the O(T²) dynamic programming step dominates.
- Typical phosphoproteomics time series have T ≤ 15 time points, making the function fast.
Loss Functions (Optimization)
Both local and global optimizers support configurable loss functions, but the integer codes
differ between the two. See the Configuration Reference for how to set
loss_type (local) or loss (global).
Local optimizer loss codes ([tfopt] / [kinopt] → loss_type)
Used by tfopt and kinopt via SciPy least squares or an evolutionary optimizer:
| Code | Name | Use case |
|---|---|---|
| 0 | MSE | Standard squared error |
| 1 | MAE | Mean absolute error |
| 2 | Soft L1 | Smooth transition MSE↔MAE |
| 3 | Cauchy | Heavy-tail robust |
| 4 | Arctan | Bounded outlier penalty |
| 5 | Elastic Net | Sparsity + smoothness (default for tfopt) |
| 6 | Tikhonov | L2 regularization |
Global model loss codes ([networkmodel] → loss)
Used by networkmodel/lossfn.py:
| Code | Name | Description |
|---|---|---|
| 0 | Squared Error (MSE) | Standard squared error |
| 1 | Huber | Smooth L1/L2 transition |
| 2 | Pseudo-Huber | Differentiable Huber approximation |
| 3 | Log-Cosh | Smooth approximation of MAE; grad-friendly near 0 |
| 4 | Cauchy | Heavy-tail robust loss |
| 5 | Poisson-scaled MSE | MSE scaled by predicted value |
| 6 | Geman-McClure | Soft-saturating robust loss |
| -1 | Charbonnier | Differentiable L1 (√(x² + ε²)) |
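For orientation, the textbook per-residual forms of most of these losses can be sketched as follows. These are common definitions, not a copy of networkmodel/lossfn.py, which may use different scaling constants. The parameter names `delta`, `c`, and `eps`, their defaults, and the `LOSSES` and `mean_loss` helpers are assumptions. Code 5 (Poisson-scaled MSE) is left out because it also needs the predicted value and its exact scaling is not specified here.

```python
import math


def squared(r):
    return r * r


def huber(r, delta=1.0):
    # Quadratic near zero, linear beyond |r| = delta
    a = abs(r)
    return 0.5 * r * r if a <= delta else delta * (a - 0.5 * delta)


def pseudo_huber(r, delta=1.0):
    # Smooth everywhere; behaves like Huber for small and large |r|
    return delta * delta * (math.sqrt(1.0 + (r / delta) ** 2) - 1.0)


def log_cosh(r):
    # log(cosh(r)) rewritten so that large |r| does not overflow
    a = abs(r)
    return a + math.log1p(math.exp(-2.0 * a)) - math.log(2.0)


def cauchy(r, c=1.0):
    return 0.5 * c * c * math.log1p((r / c) ** 2)


def geman_mcclure(r, c=1.0):
    # One common form; saturates toward 1 for large |r|
    return r * r / (r * r + c * c)


def charbonnier(r, eps=1e-3):
    return math.sqrt(r * r + eps * eps)


# Keys follow the code table above; the mapping itself is illustrative
LOSSES = {0: squared, 1: huber, 2: pseudo_huber, 3: log_cosh,
          4: cauchy, 6: geman_mcclure, -1: charbonnier}


def mean_loss(residuals, code=0):
    """Average the selected per-residual loss over all residuals."""
    fn = LOSSES[code]
    return sum(fn(r) for r in residuals) / len(residuals)


residuals = [0.1, -0.2, 5.0]  # made-up values with one outlier
for code in LOSSES:
    print(code, round(mean_loss(residuals, code), 4))
```

Running it on residuals with one outlier shows how much each loss lets that single point dominate the average.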
Sensitivity Metrics (networkmodel/sensitivity.py)
After global optimization, trajectory-based sensitivity is computed. The sensitivity metric aggregates the model output into a scalar before computing Morris elementary effects:
| Metric | Description |
|---|---|
| total_signal | Sum of all state values at all time points |
| mean | Mean state value across time |
| variance | Variance of state values across time |
| l2_norm | L2 norm of the trajectory vector |
Set via sensitivity_metric in config.toml under [networkmodel].
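The aggregation step can be sketched as below, under stated assumptions. The trajectory is taken to be a list of time points, each a list of state values, and every metric is computed over the flattened values; networkmodel/sensitivity.py may reduce along different axes. The function names, the toy model, and the use of population variance are all hypothetical.

```python
import math
import statistics


def aggregate(trajectory, metric="total_signal"):
    """Collapse a trajectory (time points x states) into one scalar."""
    values = [v for row in trajectory for v in row]  # flatten
    if metric == "total_signal":
        return math.fsum(values)
    if metric == "mean":
        return statistics.fmean(values)
    if metric == "variance":
        return statistics.pvariance(values)  # population variance (assumed)
    if metric == "l2_norm":
        return math.sqrt(math.fsum(v * v for v in values))
    raise ValueError(f"unknown sensitivity metric: {metric!r}")


def elementary_effect(simulate, params, index, delta, metric="total_signal"):
    """Morris elementary effect of one parameter on the aggregated output."""
    base = aggregate(simulate(params), metric)
    shifted = list(params)
    shifted[index] += delta
    return (aggregate(simulate(shifted), metric) - base) / delta


def toy_model(params):
    # Hypothetical stand-in for the real model: state k grows linearly in time
    return [[k * t for k in params] for t in range(3)]


effect = elementary_effect(toy_model, [1.0, 2.0], index=0, delta=0.1)
print(effect)
```

The choice of metric changes what "sensitive" means: total_signal and mean respond to overall level, while variance responds to how much the states change over the trajectory.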