fastdpplot.sparse_convert — Sparse Matrix Conversion

Source: fastdpplot/sparse_convert.py


to_scipy_sparse(counts, values, shape) → scipy.sparse.csr_matrix

Convert the flat output arrays from fastdpplot._rs.bin_points_py to a scipy.sparse.csr_matrix.

Only bins where count > 0 are stored, so the matrix is as sparse as the underlying dot data.
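The filtering described above can be approximated in plain NumPy and SciPy. This is a minimal sketch, not the library's implementation: `sketch_to_sparse` is a hypothetical name, and the row-major bin layout (flat index = row * width + col) is an assumption that should be checked against the actual output of bin_points_py.

```python
import numpy as np
from scipy.sparse import csr_matrix


def sketch_to_sparse(counts, values, shape):
    """Illustrative stand-in, not the library's implementation."""
    counts = np.asarray(counts, dtype=np.uint32)
    values = np.asarray(values, dtype=np.float32)
    height, width = shape
    if counts.size != height * width or values.size != counts.size:
        raise ValueError("counts and values must each have height * width entries")

    # Flat indices of the occupied bins (count > 0).
    occupied = np.flatnonzero(counts > 0)
    # Assumption: row-major layout, i.e. flat index = row * width + col.
    rows, cols = np.divmod(occupied, width)
    return csr_matrix((values[occupied], (rows, cols)), shape=shape)


# 2 x 3 grid with two occupied bins.
demo = sketch_to_sparse(
    counts=[0, 2, 0, 0, 0, 1],
    values=[0.0, 1.5, 0.0, 0.0, 0.0, 4.0],
    shape=(2, 3),
)
print(demo.toarray())
```

Selecting by count rather than by value means an occupied bin stays in the matrix whatever its mean value is.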

Parameters

| Parameter | Type | Description |
|---|---|---|
| counts | Sequence[int] (array-like) | 1-D per-bin count array of length shape[0] * shape[1], as returned by bin_points_py. Passed through np.asarray(..., dtype=np.uint32) internally. |
| values | Sequence[float] (array-like) | 1-D per-bin mean-value array (same length as counts). Passed through np.asarray(..., dtype=np.float32) internally. |
| shape | tuple[int, int] | (height, width) of the bin grid. |

Returns

scipy.sparse.csr_matrix of shape shape containing the mean value of each non-empty bin (count > 0).

Raises

  • ImportError — if the Rust extension is not compiled.

Example

import numpy as np
from fastdpplot._rs import bin_points_py
from fastdpplot.sparse_convert import to_scipy_sparse

# Assume df is a DataFrame with x, y, value columns
xs = df["x"].tolist()
ys = df["y"].tolist()
vs = df["value"].tolist()

width, height = 1024, 1024
counts, values = bin_points_py(
    xs, ys, vs,
    x_min=0.0, x_max=50000.0,
    y_min=0.0, y_max=50000.0,
    width=width, height=height,
)

sparse_mat = to_scipy_sparse(
    np.array(counts, dtype=np.uint32),
    np.array(values, dtype=np.float32),
    shape=(height, width),
)

print(sparse_mat.nnz)  # number of stored (non-empty) bins
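Once built, the result can be used like any other SciPy CSR matrix. The sketch below uses a small hand-made matrix as a stand-in for sparse_mat (the values are invented for illustration) and shows a few common checks: density, listing the occupied bins, and converting to a dense array.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Hypothetical stand-in for sparse_mat: a 3 x 4 grid with three occupied bins.
dense = np.array(
    [[0.0, 2.5, 0.0, 0.0],
     [0.0, 0.0, 0.0, 1.0],
     [7.0, 0.0, 0.0, 0.0]],
    dtype=np.float32,
)
sparse_mat = csr_matrix(dense)

# Fraction of bins that hold a stored value.
density = sparse_mat.nnz / (sparse_mat.shape[0] * sparse_mat.shape[1])

# Enumerate occupied bins as (row, col, mean value) triples.
coo = sparse_mat.tocoo()
occupied = [(int(r), int(c), float(v)) for r, c, v in zip(coo.row, coo.col, coo.data)]

# Densify only when the grid is small enough to fit in memory.
image = sparse_mat.toarray()

print(f"density: {density:.2f}")
print(occupied)
```

For a 1024 x 1024 grid a dense float32 array takes about 4 MB, so toarray() is cheap there, but the cost grows with the square of the grid side.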

COO alternative

If you need COO format instead of CSR, call fastdpplot._rs.to_coo directly:

from fastdpplot._rs import to_coo
rows, cols, data = to_coo(counts, values, width, height)
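The returned triple could then be wrapped in a scipy.sparse.coo_matrix. The sketch below assumes that to_coo yields three equal-length sequences (row indices, column indices, values), which the unpacking above suggests but this page does not confirm. The arrays are invented stand-ins so the snippet runs without the Rust extension.

```python
import numpy as np
from scipy.sparse import coo_matrix

# Made-up stand-ins for the (rows, cols, data) triple; real values would
# come from to_coo, whose exact return types are an assumption here.
rows = [0, 1, 2]
cols = [1, 3, 0]
data = [2.5, 1.0, 7.0]
height, width = 3, 4

coo = coo_matrix(
    (
        np.asarray(data, dtype=np.float32),
        (np.asarray(rows, dtype=np.int64), np.asarray(cols, dtype=np.int64)),
    ),
    shape=(height, width),
)

# Convert to CSR when row slicing or arithmetic is needed.
csr = coo.tocsr()
print(coo.nnz, csr[1, 3])
```

COO suits construction and export of (row, col, value) triples; CSR is usually the better choice for slicing and matrix products.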