Contributing¶
Contributions to CETSAx–NADPH are welcome, but they need to meet a clear standard. This project prioritizes correctness, clarity, and biological relevance over feature volume.
Philosophy¶
This is not a generic software project. It is a scientific tool.
Contributions should:
- improve correctness of the model or analysis
- increase interpretability
- enhance reproducibility
- extend functionality in a biologically meaningful way
Avoid adding features that increase complexity without clear scientific value.
What You Can Contribute¶
1. Bug fixes¶
- Incorrect results or edge-case failures
- Numerical instability in fitting or scoring
- Data handling inconsistencies
2. Performance improvements¶
- Faster curve fitting
- Reduced memory usage in sequence models
- Better parallelization
3. New analysis modules¶
Examples:
- alternative scoring metrics
- improved network inference
- new clustering or latent methods
These should integrate cleanly into the existing pipeline.
4. Sequence modeling improvements¶
- better training strategies
- improved explainability methods
- integration of structural features
5. Documentation¶
- clearer explanations
- missing usage examples
- better interpretation guidance
What Not to Contribute¶
- UI layers or dashboards without analytical value
- loosely tested experimental features
- large refactors without justification
- redundant implementations of existing functionality
Development Setup¶
Clone the repository and install in editable mode:
git clone https://github.com/bibymaths/cetsax.git
cd cetsax
uv venv
source .venv/bin/activate
uv pip install -r requirements.txt
uv pip install -e .
Coding Guidelines¶
General¶
- Keep functions small and focused
- Prefer explicit logic over abstraction
- Avoid hidden side effects
Data handling¶
- Use pandas.DataFrame consistently
- Preserve column naming conventions (id, condition, metrics)
- Do not silently modify input data
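The data-handling rules above can be illustrated with a short sketch. The function name `add_scaled_metric`, the `value` column, and the required-column set are hypothetical and not part of the project's API; only the `id` and `condition` column names come from the conventions listed above.

```python
import pandas as pd

# Hypothetical set of required columns, based on the naming conventions above.
REQUIRED_COLUMNS = {"id", "condition"}


def add_scaled_metric(df: pd.DataFrame, column: str = "value") -> pd.DataFrame:
    """Return a copy of `df` with a min-max scaled version of `column`.

    The input frame is never modified; callers get a new DataFrame back.
    """
    missing = (REQUIRED_COLUMNS | {column}) - set(df.columns)
    if missing:
        # Fail loudly instead of guessing or renaming columns.
        raise KeyError(f"missing required columns: {sorted(missing)}")

    out = df.copy()  # work on a copy so the caller's data stays untouched
    lo, hi = out[column].min(), out[column].max()
    span = hi - lo
    # A constant column has zero span; map it to 0.0 rather than dividing by zero.
    out[f"{column}_scaled"] = 0.0 if span == 0 else (out[column] - lo) / span
    return out


raw = pd.DataFrame(
    {"id": ["P1", "P2", "P3"], "condition": ["NADPH"] * 3, "value": [1.0, 2.0, 3.0]}
)
scaled = add_scaled_metric(raw)
```

Returning a new frame keeps the transformation explicit at the call site: `raw` still has exactly the columns it started with, and the derived column exists only on `scaled`.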
Numerical code¶
- Avoid numerically unstable transformations (e.g. taking the log of zero or exponentiating large values)
- Document assumptions (e.g. scaling, bounds)
- Prefer reproducible, deterministic behavior (e.g. fixed random seeds)
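As one illustration of these points, a sigmoid-shaped function can be written so that it does not overflow for extreme inputs, with its assumptions stated in the docstring and its randomness seeded explicitly. This is a generic sketch using only the standard library, not the fitting model used by the project; `stable_sigmoid` and `bootstrap_means` are hypothetical names.

```python
import math
import random


def stable_sigmoid(x: float) -> float:
    """Logistic function 1 / (1 + exp(-x)) without overflow.

    Assumption: x is a finite float. The naive form calls exp(-x), which
    raises OverflowError for large negative x (e.g. x = -1000).
    """
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)  # x < 0, so exp(x) lies in [0, 1) and cannot overflow
    return z / (1.0 + z)


def bootstrap_means(values: list[float], n_resamples: int, seed: int) -> list[float]:
    """Bootstrap the mean with an explicit seed so reruns give identical output."""
    rng = random.Random(seed)  # local generator: no hidden global state
    n = len(values)
    return [
        sum(rng.choice(values) for _ in range(n)) / n
        for _ in range(n_resamples)
    ]
```

Passing the seed as a required argument, rather than seeding a global generator, makes the source of randomness visible to every caller.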
Deep learning¶
- Keep training and inference clearly separated
- Avoid unnecessary GPU memory usage
- Document all hyperparameters
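One lightweight way to document hyperparameters is to collect them in a single typed, immutable configuration object that is saved alongside the model outputs. The sketch below uses only the standard library; the class name, field names, and default values are illustrative assumptions and not the project's actual settings.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class TrainingConfig:
    """Hypothetical hyperparameters for a sequence model (illustrative values)."""

    learning_rate: float = 1e-3  # optimizer step size
    batch_size: int = 32         # sequences per gradient step
    max_epochs: int = 50         # upper bound on training epochs
    dropout: float = 0.1         # dropout probability, must be in [0, 1)
    seed: int = 0                # random seed for reproducibility

    def __post_init__(self) -> None:
        # Validate documented bounds so a bad value fails early.
        if not 0.0 <= self.dropout < 1.0:
            raise ValueError("dropout must be in [0, 1)")
        if self.learning_rate <= 0:
            raise ValueError("learning_rate must be positive")


config = TrainingConfig(learning_rate=5e-4)
# Serialize the full configuration so a run can be reproduced later.
config_json = json.dumps(asdict(config), sort_keys=True)
```

Because the dataclass is frozen, the same object can be handed to both training and inference code without either side changing it by accident.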
Testing¶
Before submitting:
- Run the pipeline on a small dataset
- Verify that outputs are consistent across repeated runs and with results from before your change
- Check edge cases (missing data, low variance, etc.)
If possible:
- add unit tests
- include reproducible examples
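A minimal sketch of what such unit tests might look like for the edge cases named above (missing data, low variance). The `zscore` helper is hypothetical and stands in for whatever function a contribution touches; the test functions follow the naming convention that a runner such as pytest discovers, but they can also be called directly.

```python
import math
import statistics


def zscore(values: list[float]) -> list[float]:
    """Hypothetical scoring helper: z-scores that skip NaN and tolerate zero variance."""
    observed = [v for v in values if not math.isnan(v)]
    if len(observed) < 2:
        return [math.nan] * len(values)  # not enough data to estimate spread
    mean = statistics.fmean(observed)
    sd = statistics.stdev(observed)
    if sd == 0.0:
        # All observed values identical: return 0.0 instead of dividing by zero.
        return [math.nan if math.isnan(v) else 0.0 for v in values]
    return [math.nan if math.isnan(v) else (v - mean) / sd for v in values]


def test_missing_values_stay_missing():
    result = zscore([1.0, math.nan, 3.0])
    assert math.isnan(result[1])
    assert result[0] == -result[2]


def test_low_variance_does_not_divide_by_zero():
    assert zscore([2.0, 2.0, 2.0]) == [0.0, 0.0, 0.0]


def test_too_few_observations_returns_nan():
    assert all(math.isnan(v) for v in zscore([1.0, math.nan]))
```

Each test targets one edge case and states the expected behavior in its name, which makes a failure easy to interpret during review.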
Submitting Changes¶
1. Create a branch¶
git checkout -b feature/your-feature-name
2. Make focused commits¶
- One logical change per commit
- Clear commit messages
3. Open a Pull Request¶
Include:
- what the change does
- why it is needed
- how it was tested
If relevant, include before/after results.
Review Process¶
Pull requests are evaluated based on:
- correctness
- clarity
- consistency with existing design
- biological relevance
Changes may be rejected if they:
- complicate the system unnecessarily
- introduce ambiguity in interpretation
- lack sufficient justification
Style Expectations¶
- Write code as if it will be read, not just executed
- Avoid unnecessary cleverness
- Prefer clarity over brevity
Communication¶
If you plan a larger change:
- open an issue first
- describe the idea clearly
- wait for feedback before implementing
This avoids wasted effort.
Summary¶
Contribute if you can:
- make the model more accurate
- make the outputs more interpretable
- make the system more robust
Do not contribute just to add features.
The goal is a tool that produces results you would trust in a paper.