Contributing¶

Contributions to CETSAx–NADPH are welcome, but they need to meet a clear standard. This project prioritizes correctness, clarity, and biological relevance over feature volume.

Philosophy¶

This is not a generic software project. It is a scientific tool.

Contributions should:

improve correctness of the model or analysis
increase interpretability
enhance reproducibility
extend functionality in a biologically meaningful way

Avoid adding features that increase complexity without clear scientific value.

What You Can Contribute¶

1. Bug fixes¶

Incorrect results or edge-case failures
Numerical instability in fitting or scoring
Data handling inconsistencies

2. Performance improvements¶

Faster curve fitting
Reduced memory usage in sequence models
Better parallelization

3. New analysis modules¶

Examples:

alternative scoring metrics
improved network inference
new clustering or latent methods

These should integrate cleanly into the existing pipeline.

4. Sequence modeling improvements¶

better training strategies
improved explainability methods
integration of structural features

5. Documentation¶

clearer explanations
missing usage examples
better interpretation guidance

What Not to Contribute¶

UI layers or dashboards without analytical value
loosely tested experimental features
large refactors without justification
redundant implementations of existing functionality

Development Setup¶

Clone the repository and install in editable mode:

git clone https://github.com/bibymaths/cetsax.git
cd cetsax

uv venv
source .venv/bin/activate
uv pip install -r requirements.txt
uv pip install -e .

Coding Guidelines¶

General¶

Keep functions small and focused
Prefer explicit logic over abstraction
Avoid hidden side effects

Data handling¶

Use pandas.DataFrame consistently
Preserve column naming conventions (id, condition, metrics)
Do not silently modify input data
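
As one way to follow the data-handling rules above, the sketch below adds a derived column while leaving the caller's table untouched. The function name `add_scaled_metric` and the column names `metric` and `metric_scaled` are hypothetical and are not taken from the CETSAx–NADPH code base.

```python
import pandas as pd


def add_scaled_metric(df: pd.DataFrame, column: str = "metric") -> pd.DataFrame:
    """Return a copy of `df` with a min-max scaled version of `column`.

    Hypothetical helper for illustration; the input frame is never modified.
    """
    if column not in df.columns:
        raise KeyError(f"expected column {column!r} in input data")

    out = df.copy()  # work on a copy so the caller's data stays untouched
    lo, hi = out[column].min(), out[column].max()
    span = hi - lo
    # Constant column: scaling is undefined, so use zeros instead of dividing by zero.
    out[f"{column}_scaled"] = 0.0 if span == 0 else (out[column] - lo) / span
    return out


raw = pd.DataFrame(
    {"id": ["P1", "P2", "P3"], "condition": ["NADPH"] * 3, "metric": [2.0, 4.0, 6.0]}
)
scaled = add_scaled_metric(raw)
```

Here `raw` keeps its original three columns, and the new column exists only on `scaled`. Returning a new frame makes the transformation visible at the call site, which is the point of the rule.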

Numerical code¶

Avoid numerically unstable transformations (e.g. exponentiating large values, taking the log of values at or near zero)
Document assumptions (e.g. scaling, bounds)
Prefer reproducible, deterministic behavior (e.g. fixed random seeds)
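
A minimal sketch of what these points can look like in practice, using only the standard library. Both `stable_sigmoid` and `bootstrap_means` are hypothetical names chosen for illustration; they are not functions from this project.

```python
import math
import random


def stable_sigmoid(x: float) -> float:
    """Logistic function that does not overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    # For negative x, exp(-x) can overflow; rewrite using exp(x), which is below 1.
    z = math.exp(x)
    return z / (1.0 + z)


def bootstrap_means(values: list[float], n_resamples: int, seed: int) -> list[float]:
    """Means of bootstrap resamples, driven by an explicit seed.

    Assumption: the caller passes the seed, so the same inputs always
    give the same resamples.
    """
    rng = random.Random(seed)  # local generator, no hidden global state
    n = len(values)
    return [sum(rng.choices(values, k=n)) / n for _ in range(n_resamples)]
```

A naive `1 / (1 + math.exp(-x))` raises `OverflowError` for strongly negative `x`, while the branch above stays finite. Passing the seed as an argument keeps the randomness explicit and repeatable.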

Deep learning¶

Keep training and inference clearly separated
Avoid unnecessary GPU memory usage
Document all hyperparameters
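
One possible way to document hyperparameters is to collect them in a single, immutable record that can be saved next to the model outputs. The sketch below covers only that last point. The class `TrainConfig`, its fields, and its default values are illustrative assumptions, not the project's actual settings.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Hypothetical hyperparameter record; field names and defaults are illustrative."""

    learning_rate: float = 1e-3  # optimizer step size
    batch_size: int = 32         # sequences per gradient step
    max_epochs: int = 50         # upper bound on training epochs
    dropout: float = 0.1         # dropout probability
    seed: int = 0                # random seed for initialization and shuffling


def dump_config(config: TrainConfig) -> str:
    """Serialize the configuration so it can be stored alongside results."""
    return json.dumps(asdict(config), indent=2, sort_keys=True)


config = TrainConfig(learning_rate=3e-4)
config_json = dump_config(config)
```

Because the dataclass is frozen, the values cannot be changed by accident once a run has started, and the JSON string records exactly what was used.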

Testing¶

Before submitting:

Run the pipeline on a small dataset
Verify that outputs are consistent across repeated runs and, where no change is intended, with previous results
Check edge cases (missing data, low variance, etc.)

If possible:

add unit tests
include reproducible examples
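
As an illustration of the edge cases listed above, the sketch below pairs a small scoring helper with tests for missing values, zero variance, and fully missing input. The helper `zscore` is hypothetical and stands in for whatever function your change touches.

```python
import math


def zscore(values: list[float]) -> list[float]:
    """Hypothetical scoring helper: z-scores that tolerate NaN and zero variance."""
    observed = [v for v in values if not math.isnan(v)]
    if not observed:
        return [math.nan] * len(values)
    mean = sum(observed) / len(observed)
    var = sum((v - mean) ** 2 for v in observed) / len(observed)
    if var == 0.0:
        # No spread: report 0.0 for observed values instead of dividing by zero.
        return [math.nan if math.isnan(v) else 0.0 for v in values]
    sd = math.sqrt(var)
    return [math.nan if math.isnan(v) else (v - mean) / sd for v in values]


def test_missing_values_stay_missing():
    result = zscore([1.0, math.nan, 3.0])
    assert math.isnan(result[1])
    assert result[0] == -1.0 and result[2] == 1.0


def test_constant_input_gives_zeros():
    assert zscore([5.0, 5.0, 5.0]) == [0.0, 0.0, 0.0]


def test_all_missing_input():
    assert all(math.isnan(v) for v in zscore([math.nan, math.nan]))
```

Functions named `test_*` in a file named `test_*.py` are picked up by pytest's default discovery, so a file like this can be run with a plain `pytest` call.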

Submitting Changes¶

1. Create a branch¶

git checkout -b feature/your-feature-name

2. Make focused commits¶

One logical change per commit
Clear commit messages

3. Open a Pull Request¶

Include:

what the change does
why it is needed
how it was tested

If relevant, include before/after results.

Review Process¶

Pull requests are evaluated based on:

correctness
clarity
consistency with existing design
biological relevance

Changes may be rejected if they:

complicate the system unnecessarily
introduce ambiguity in interpretation
lack sufficient justification

Style Expectations¶

Write code as if it will be read, not just executed
Avoid unnecessary cleverness
Prefer clarity over brevity

Communication¶

If you plan a larger change:

open an issue first
describe the idea clearly
wait for feedback before implementing

This avoids wasted effort.

Summary¶

Contribute if you can:

make the model more accurate
make the outputs more interpretable
make the system more robust

Do not contribute just to add features.

The goal is a tool that produces results you would trust in a paper.