
Contributing

Contributions to CETSAx–NADPH are welcome, but they need to meet a clear standard. This project prioritizes correctness, clarity, and biological relevance over feature volume.


Philosophy

This is not a generic software project. It is a scientific tool.

Contributions should:

  • improve correctness of the model or analysis
  • increase interpretability
  • enhance reproducibility
  • extend functionality in a biologically meaningful way

Avoid adding features that increase complexity without clear scientific value.


What You Can Contribute

1. Bug fixes

  • Incorrect results or edge-case failures
  • Numerical instability in fitting or scoring
  • Data handling inconsistencies

2. Performance improvements

  • Faster curve fitting
  • Reduced memory usage in sequence models
  • Better parallelization

3. New analysis modules

Examples:

  • alternative scoring metrics
  • improved network inference
  • new clustering or latent methods

These should integrate cleanly into the existing pipeline.


4. Sequence modeling improvements

  • better training strategies
  • improved explainability methods
  • integration of structural features

5. Documentation

  • clearer explanations
  • missing usage examples
  • better interpretation guidance

What Not to Contribute

  • UI layers or dashboards without analytical value
  • loosely tested experimental features
  • large refactors without justification
  • redundant implementations of existing functionality

Development Setup

Clone the repository and install in editable mode:

git clone https://github.com/bibymaths/cetsax.git
cd cetsax

uv venv
source .venv/bin/activate
uv pip install -r requirements.txt
uv pip install -e .

Coding Guidelines

General

  • Keep functions small and focused
  • Prefer explicit logic over abstraction
  • Avoid hidden side effects

Data handling

  • Use pandas.DataFrame consistently
  • Preserve column naming conventions (id, condition, metrics)
  • Do not silently modify input data
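The data-handling rules above can be illustrated with a minimal sketch. The function name `add_centered_metric`, the `score` column used in the usage note, and the choice of per-condition centering are hypothetical and not part of the project's API; only the `id` and `condition` column names come from the conventions listed above.

```python
import pandas as pd

# Column names assumed from the naming conventions listed above.
REQUIRED_COLUMNS = ("id", "condition")


def add_centered_metric(df: pd.DataFrame, metric: str) -> pd.DataFrame:
    """Return a new frame with `<metric>_centered` added.

    Hypothetical helper for illustration: the caller's DataFrame is
    never modified, and missing columns raise instead of being guessed.
    """
    missing = [c for c in (*REQUIRED_COLUMNS, metric) if c not in df.columns]
    if missing:
        raise KeyError(f"missing required columns: {missing}")

    out = df.copy()  # work on a copy so the input stays untouched
    group_mean = out.groupby("condition")[metric].transform("mean")
    out[f"{metric}_centered"] = out[metric] - group_mean
    return out
```

Calling `add_centered_metric(df, "score")` returns a new frame with a `score_centered` column, while `df` itself keeps its original columns and values.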

Numerical code

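  • Avoid numerically unstable transformations (e.g. exponentials that can overflow, logarithms of zero, division by near-zero variance)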
  • Document assumptions (e.g. scaling, bounds)
  • Prefer reproducible, deterministic behavior (e.g. fixed and documented random seeds)
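As one illustration of a stable transformation with documented assumptions, a logistic function can be written so that it never exponentiates a large positive number. The name `stable_sigmoid` is hypothetical; this is a sketch of the general technique, not the project's fitting code.

```python
import math


def stable_sigmoid(x: float) -> float:
    """Logistic function 1 / (1 + exp(-x)) without overflow.

    Assumption: x is a finite float. The naive form raises
    OverflowError for large negative x because exp(-x) overflows.
    Branching on the sign keeps the exponent non-positive.
    """
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)  # x < 0, so z lies in [0, 1) and cannot overflow
    return z / (1.0 + z)
```

For extreme inputs the result saturates at the bounds instead of raising, so a fitting loop that wanders into an extreme parameter region can still evaluate the curve.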

Deep learning

  • Keep training and inference clearly separated
  • Avoid unnecessary GPU memory usage
  • Document all hyperparameters
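One possible way to document hyperparameters is to collect them in a single immutable object that can be saved next to the results. The class `TrainConfig`, its field names, and its default values are hypothetical and chosen only for illustration; they do not describe the project's actual training settings.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Hypothetical training configuration; all values are illustrative."""

    learning_rate: float = 1e-3  # optimizer step size, must be > 0
    batch_size: int = 32         # sequences per gradient step
    epochs: int = 20             # full passes over the training set
    dropout: float = 0.1         # dropout probability, must lie in [0, 1)
    seed: int = 0                # random seed, recorded for reproducibility

    def __post_init__(self) -> None:
        # Fail early on values that would make training meaningless.
        if self.learning_rate <= 0:
            raise ValueError("learning_rate must be positive")
        if not 0.0 <= self.dropout < 1.0:
            raise ValueError("dropout must lie in [0, 1)")


def config_to_json(config: TrainConfig) -> str:
    """Serialize the configuration so it can be stored with the outputs."""
    return json.dumps(asdict(config), indent=2, sort_keys=True)
```

Because the dataclass is frozen, a configuration cannot be changed after a run has started, and the JSON string gives reviewers an exact record of the settings behind a reported result.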

Testing

Before submitting:

  • Run the pipeline on a small dataset
  • Verify that outputs are consistent across repeated runs and with results from before your change
  • Check edge cases (missing data, low variance, etc.)

If possible:

  • add unit tests
  • include reproducible examples
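A unit test for the edge cases mentioned above (missing data, low variance) might look like the following sketch. The helper `safe_zscores` is hypothetical and defined inline so the example is self-contained; in a real contribution the tests would target the function being changed.

```python
import math
import statistics


def safe_zscores(values, min_sd=1e-12):
    """Hypothetical helper: z-score a list of floats.

    NaN entries stay NaN. A (near-)constant input yields 0.0 rather
    than dividing by a vanishing standard deviation.
    """
    finite = [v for v in values if not math.isnan(v)]
    if len(finite) < 2:
        # Too few observations to estimate a spread.
        return [math.nan] * len(values)
    mean = statistics.fmean(finite)
    sd = statistics.stdev(finite)
    if sd < min_sd:
        return [math.nan if math.isnan(v) else 0.0 for v in values]
    return [math.nan if math.isnan(v) else (v - mean) / sd for v in values]


def test_missing_values_stay_missing():
    result = safe_zscores([1.0, math.nan, 3.0])
    assert math.isnan(result[1])
    assert result[0] == -result[2]


def test_constant_input_does_not_divide_by_zero():
    assert safe_zscores([2.0, 2.0, 2.0]) == [0.0, 0.0, 0.0]


def test_too_few_values_returns_nan():
    assert all(math.isnan(v) for v in safe_zscores([5.0, math.nan]))
```

The functions follow the `test_*` naming convention, so a runner such as pytest would collect them automatically, assuming the project uses one.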

Submitting Changes

1. Create a branch

git checkout -b feature/your-feature-name


2. Make focused commits

  • One logical change per commit
  • Clear commit messages

3. Open a Pull Request

Include:

  • what the change does
  • why it is needed
  • how it was tested

If relevant, include before/after results.


Review Process

Pull requests are evaluated based on:

  • correctness
  • clarity
  • consistency with existing design
  • biological relevance

Changes may be rejected if they:

  • complicate the system unnecessarily
  • introduce ambiguity in interpretation
  • lack sufficient justification

Style Expectations

  • Write code as if it will be read, not just executed
  • Avoid unnecessary cleverness
  • Prefer clarity over brevity

Communication

If you plan a larger change:

  • open an issue first
  • describe the idea clearly
  • wait for feedback before implementing

This avoids wasted effort.


Summary

Contribute if you can:

  • make the model more accurate
  • make the outputs more interpretable
  • make the system more robust

Do not contribute just to add features.

The goal is a tool that produces results you would trust in a paper.