Perf: distance computation optimization (ManhattanDistance switches to scipy.cdist, BasicDistance adds a torch backend) #21

Open
SteadfastAsArt wants to merge 1 commit into zjuwss:main from SteadfastAsArt:perf/distance-computation

@SteadfastAsArt SteadfastAsArt commented Feb 11, 2026

Summary

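Optimizes distance matrix computation:

  1. ManhattanDistance: switch to scipy.cdist. This replaces the 3D broadcasting expression (x[:, np.newaxis, :] - y) and cuts intermediate array allocation; roughly 11x faster at N=10k.
  2. BasicDistance: add a torch backend. The backend parameter selects 'scipy' (default), 'torch', or 'auto'; the torch backend uses torch.cdist and supports GPU acceleration.
  3. The torch backend supports chunked computation; the chunk_size parameter caps memory usage.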
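
To illustrate the chunking idea in item 3, here is a minimal sketch that fills the distance matrix one block of rows at a time. It is not the code from this PR: the PR chunks torch.cdist, whereas this sketch swaps in scipy's cdist so it runs without torch. The helper name chunked_cdist and the Euclidean metric are assumptions made for illustration.

```python
import numpy as np
from scipy.spatial import distance


def chunked_cdist(x, y, chunk_size=1024, metric="euclidean"):
    """Hypothetical helper: build an (N, M) float32 matrix block by block."""
    out = np.empty((x.shape[0], y.shape[0]), dtype=np.float32)
    for start in range(0, x.shape[0], chunk_size):
        stop = min(start + chunk_size, x.shape[0])
        # Only one (chunk_size, M) float64 temporary exists at a time.
        out[start:stop] = distance.cdist(x[start:stop], y, metric)
    return out


rng = np.random.default_rng(0)
x = rng.random((2500, 2))
y = rng.random((300, 2))

full = np.float32(distance.cdist(x, y, "euclidean"))
blocked = chunked_cdist(x, y, chunk_size=1000)  # 3 blocks: 1000, 1000, 500 rows
```

A smaller chunk_size lowers the peak size of the temporary at the cost of more calls; the final (N, M) output still has to fit in memory.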

Change details

ManhattanDistance (datasets.py)

# Before
np.float32(np.sum(np.abs(x[:, np.newaxis, :] - y), axis=2))

# After: SciPy's C implementation
np.float32(distance.cdist(x, y, 'cityblock'))
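
A small self-contained check of the equivalence between the two expressions above, on random 2D coordinates. The array sizes and seed are arbitrary choices for illustration; the tolerance mirrors the atol=1e-4 used in the test plan.

```python
import numpy as np
from scipy.spatial import distance

rng = np.random.default_rng(42)
x = rng.random((200, 2))
y = rng.random((150, 2))

# Broadcasting materializes an (N, M, D) array before reducing over D.
d_numpy = np.float32(np.sum(np.abs(x[:, np.newaxis, :] - y), axis=2))

# cdist computes the (N, M) result without the 3D intermediate.
d_scipy = np.float32(distance.cdist(x, y, "cityblock"))

max_abs_diff = float(np.max(np.abs(d_numpy - d_scipy)))
same = np.allclose(d_numpy, d_scipy, atol=1e-4)
```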

BasicDistance (datasets.py)

New parameters: backend, device, chunk_size, return_tensor

# Default behavior is unchanged
BasicDistance(x, y)  # scipy, backward compatible

# torch GPU acceleration
BasicDistance(x, y, backend='torch', device='cuda')

# Automatic selection (torch is used when N*M > 1e7)
BasicDistance(x, y, backend='auto')
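
The 'auto' rule can be sketched as a small pure-Python function. This is only an illustration of the N*M > 1e7 threshold stated above, not the PR's code: resolve_backend and its threshold argument are hypothetical names, and the sketch ignores any device or CUDA availability check the real implementation may perform.

```python
def resolve_backend(n, m, backend="scipy", threshold=1e7):
    """Hypothetical helper: map a backend request to 'scipy' or 'torch'."""
    if backend not in ("scipy", "torch", "auto"):
        raise ValueError(f"unknown backend: {backend!r}")
    if backend != "auto":
        # An explicit choice is always respected.
        return backend
    # 'auto': use torch only when the output matrix is large.
    return "torch" if n * m > threshold else "scipy"


small = resolve_backend(1_000, 1_000, backend="auto")    # 1e6 entries
large = resolve_backend(10_000, 10_000, backend="auto")  # 1e8 entries
```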

init_dataset / init_dataset_split

Adds a distance_backend parameter, which is passed through to BasicDistance.

datasets.init_dataset(..., distance_backend='auto')

When distance_backend != 'scipy' and spatial_fun is BasicDistance, functools.partial is used to bind the backend parameter.
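
A rough sketch of that binding step, using stand-ins rather than the real functions: basic_distance and bind_backend below are hypothetical names, and the stand-in only reports which backend it received instead of computing distances.

```python
from functools import partial


def basic_distance(x, y, backend="scipy"):
    """Hypothetical stand-in for BasicDistance; reports the backend it was given."""
    return {"backend": backend, "shape": (len(x), len(y))}


def bind_backend(spatial_fun, distance_backend="scipy"):
    """Hypothetical helper mirroring the condition described above."""
    if distance_backend != "scipy" and spatial_fun is basic_distance:
        # Pre-fill the backend so callers can keep calling spatial_fun(x, y).
        return partial(spatial_fun, backend=distance_backend)
    return spatial_fun


bound = bind_backend(basic_distance, distance_backend="auto")
result = bound([0, 1, 2], [3, 4])
```

Because the identity check is against the default distance function, a user-supplied spatial_fun is returned untouched and never receives an unexpected backend keyword.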

Known Limitations

  • init_dataset_cv does not yet support the distance_backend parameter
  • return_tensor=True is untested on the GTNNWR path (the current code never triggers this combination)

Benchmark

Reproducible with benchmarks/bench_distance.py:

python benchmarks/bench_distance.py

ManhattanDistance (2D coords, mean of 3 runs):

N     numpy (s)   scipy (s)   speedup
1k    0.01        0.001       ~6x
5k    0.30        0.03        ~10x
10k   1.44        0.13        ~11x
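
For readers who want a quick local comparison without the full script, a timing harness along these lines should be enough. It is a sketch and not the contents of benchmarks/bench_distance.py: mean_runtime is a hypothetical helper, N is fixed at 1000 to keep memory small, and absolute timings depend on the machine.

```python
import time

import numpy as np
from scipy.spatial import distance


def mean_runtime(fn, runs=3):
    """Hypothetical helper: average wall-clock time of fn over several runs."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return sum(timings) / len(timings)


rng = np.random.default_rng(0)
coords = rng.random((1000, 2))  # 2D coordinates, as in the table above

t_numpy = mean_runtime(
    lambda: np.sum(np.abs(coords[:, np.newaxis, :] - coords), axis=2)
)
t_scipy = mean_runtime(lambda: distance.cdist(coords, coords, "cityblock"))

print(f"N=1000  numpy {t_numpy:.4f}s  scipy {t_scipy:.4f}s")
```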

Test Plan

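  • pytest tests/test_model_e2e.py: end-to-end training results are unchanged
  • The benchmark script verifies numerical correctness (np.allclose, atol=1e-4)
  • Default parameters are unchanged, so the change is fully backward compatible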

…ch backend

- ManhattanDistance: replace numpy broadcasting with scipy.cdist (7-15x speedup)
- BasicDistance: add 'torch' backend with GPU support and chunked computation
- Add distance_backend parameter to init_dataset() and init_dataset_split()
  to connect torch backend to the call chain (default: 'scipy', 'auto' selects
  torch+GPU when CUDA is available)
- Add benchmarks/bench_distance.py with correctness, timing, and memory comparison

Benchmark results (N=10K, 2D):
  ManhattanDistance: scipy 7.3x faster than numpy broadcasting
  BasicDistance: torch_cpu 5.8x faster than scipy
@SteadfastAsArt SteadfastAsArt changed the title from "Perf: distance computation optimization (ManhattanDistance uses scipy.cdist, BasicDistance supports a torch backend)" to "Perf: distance computation optimization (ManhattanDistance switches to scipy.cdist, BasicDistance adds a torch backend)" on Feb 11, 2026