Architecture and Data Flow
This page describes the high-level architecture of PhosKinTime and how data flows through the pipeline from raw experimental inputs to final outputs.
Two Modeling Stacks
PhosKinTime provides two complementary but distinct modeling workflows:
Local (per-protein) workflow
Fits individual protein ODE models to per-site phosphoproteomics time series. Each protein/site is handled independently. Optimization uses SciPy (local) or Pymoo evolutionary algorithms (evol).
Packages involved: processing, kinopt, tfopt, models, paramest, sensitivity, steady,
plotting.
Global (network-scale) workflow
Solves a coupled ODE system spanning all genes, proteins, and kinases simultaneously. Calibrated against multi-omics time series (protein, mRNA, phospho). Uses Optuna + Pymoo for hyperparameter scanning and multi-objective evolutionary optimization.
Packages involved: processing, networkmodel, frechet, knockout.
Data Flow Diagram
flowchart TD
A["Raw Input Data\n(MS_Gaussian, CollecTRI,\nRout_LimmaTable, input2)"]
subgraph prep ["Preprocessing (processing/)"]
B["cleanup.py\n• data cleaning\n• proteomics transform\n• error propagation"]
C["map.py\n• TF-mRNA mapping\n• kinase-phospho mapping\n• Cytoscape export"]
end
subgraph local ["Local Per-Protein Workflow"]
D["tfopt/\n• TF→mRNA constrained opt\n• local (SciPy) or evol (Pymoo)"]
E["kinopt/\n• kinase-phospho opt\n• local (SciPy) or evol (Pymoo)\n• optimality/ post-processing"]
F["models/\n• ODE models\n• paramest/normest.py\n• sensitivity/analysis.py (Morris)"]
G["plotting/\n• fits, PCA, t-SNE,\n• sensitivity indices, reports"]
end
subgraph global_wf ["Global Network Workflow"]
H["networkmodel/runner.py\n• Optuna + Pymoo optimization\n• scan.py (hyperparameter scan)\n• refine.py (iterative refinement)"]
I["networkmodel/simulate.py\n• coupled ODE integration\n• jacspeedup.py (Numba JIT)"]
J["networkmodel/sensitivity.py\n• trajectory-based perturbation\n• Morris elementary effects"]
K["networkmodel/export.py\n• Pareto front, Excel, plots\n• convergence video"]
L["dashboard_app.py\n• Streamlit visualization\n(run via run_dashboard.py)"]
end
subgraph scripts_box ["Post-Processing Scripts (scripts/)"]
M["analyze_tf_kin_counts\ncurve_similarity\nexport_subnetworks\nmechanistic_insights\ntemporal_sensitivity\n..."]
end
A --> prep
prep --> D
prep --> E
E --> F
D --> F
F --> G
prep --> H
E --> H
D --> H
H --> I
I --> J
I --> K
K --> L
K --> M
G --> M
Note: The
allCLI command runsprep → tfopt → kinopt → modelonly. It does not invokenetworkmodel. Runphoskintime-global(orpython -m networkmodel.runner) separately after the local pipeline completes.
Key Package Roles
| Package | Role |
|---|---|
processing/ |
Raw data cleaning and network mapping |
tfopt/ |
TF→mRNA optimization (local or evolutionary) |
kinopt/ |
Kinase→phospho optimization (local or evolutionary) |
models/ |
Per-protein ODE definitions and solvers |
paramest/ |
Parameter estimation (normest.py); bootstrap confidence intervals |
sensitivity/ |
Morris sensitivity analysis (local ODE models) |
steady/ |
Steady-state initial condition computation |
plotting/ |
Static publication-ready plots and HTML reports |
networkmodel/ |
Network-scale coupled ODE simulation and calibration |
frechet/ |
Discrete Fréchet distance metric (Numba JIT) |
knockout/ |
Network knockout / perturbation analysis |
config/ |
CLI, constants, logging, shared configuration helpers |
config_loader.py |
Root-level config.toml loader (LRU-cached) |
scripts/ |
Stand-alone post-processing and analysis utilities |
Local vs Global: When to Use Which
| Goal | Use |
|---|---|
| Fit individual protein phosphorylation dynamics | Local workflow (kinopt + models) |
| Fit TF → mRNA regulation | tfopt |
| System-level network simulation | networkmodel (requires kinopt/tfopt outputs) |
| Sensitivity of individual ODE parameters | sensitivity/analysis.py |
| Sensitivity of global coupled model parameters | networkmodel/sensitivity.py |
| Interactive exploration of results | run_dashboard.py (Streamlit) |
Directory Structure Summary
phoskintime/
├── config/ CLI, constants, logging setup
├── config_loader.py Shared config.toml loader
├── config.toml Project configuration
├── processing/ Data preprocessing and mapping
├── tfopt/ TF optimization (local + evol)
├── kinopt/ Kinase optimization (local + evol)
├── models/ Per-protein ODE models
├── paramest/ Parameter estimation routines
├── sensitivity/ Morris sensitivity analysis
├── steady/ Steady-state solvers
├── plotting/ Visualization and reports
├── networkmodel/ Global coupled ODE pipeline
├── frechet/ Fréchet distance metric
├── knockout/ Knockout analysis
├── utils/ Shared helpers
├── bin/ CLI entry points
├── scripts/ Stand-alone analysis utilities
├── background/ Background tasks and helpers
├── app/ Web application assets
├── run_dashboard.py Dashboard launcher
└── Dockerfile Container definition
Technical Debt Notes
- Logging: Five near-identical
logconf.pyfiles exist acrossconfig/,kinopt/local/config/,kinopt/evol/config/,tfopt/local/config/, andtfopt/evol/config/. They cannot be trivially consolidated because each importsLOG_DIRandformat_durationfrom its own subpackage, and the top-level version adds amp_file_loggingparameter. Unification is tracked as a TODO in each file. - Configuration overlap:
config_loader.pyandnetworkmodel/config.pyboth load sections ofconfig.toml. The latter should eventually import from the former to avoid duplication. kinopt/andtfopt/mirroring: Both subpackages expose identicallocal/andevol/sub-interfaces. A shared strategy abstraction would reduce duplication but is deferred.