Perf: distance computation optimization: ManhattanDistance switches to scipy.cdist, BasicDistance gains a torch backend #21
Open
SteadfastAsArt wants to merge 1 commit into zjuwss:main from …
…ch backend
- ManhattanDistance: replace numpy broadcasting with scipy.cdist (7-15x speedup)
- BasicDistance: add 'torch' backend with GPU support and chunked computation
- Add distance_backend parameter to init_dataset() and init_dataset_split() to connect torch backend to the call chain (default: 'scipy', 'auto' selects torch+GPU when CUDA is available)
- Add benchmarks/bench_distance.py with correctness, timing, and memory comparison

Benchmark results (N=10K, 2D):
  ManhattanDistance: scipy 7.3x faster than numpy broadcasting
  BasicDistance: torch_cpu 5.8x faster than scipy
Summary
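Optimize distance-matrix computation performance:
- … x[:, np.newaxis, :] - y), reducing intermediate array allocations; roughly 11x speedup at N=10k
- backend parameter selects 'scipy' (default) / 'torch' / 'auto'; the torch backend uses torch.cdist and supports GPU acceleration
- chunk_size parameter caps memory usage
Change Details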
ManhattanDistance (datasets.py)
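A minimal sketch of the kind of change summarized for ManhattanDistance, not the code from datasets.py. The function names manhattan_broadcast and manhattan_cdist are hypothetical, and the sketch assumes the replacement call is scipy.spatial.distance.cdist with metric="cityblock". The broadcasting version materializes an (N, M, D) intermediate array, which the cdist version avoids.

```python
import numpy as np
from scipy.spatial.distance import cdist


def manhattan_broadcast(x, y):
    """Broadcasting approach: builds an (N, M, D) array before reducing."""
    # x[:, np.newaxis, :] has shape (N, 1, D); subtracting y of shape (M, D)
    # broadcasts to (N, M, D), which is then summed over the last axis.
    return np.abs(x[:, np.newaxis, :] - y).sum(axis=-1)


def manhattan_cdist(x, y):
    """cdist approach: returns the (N, M) matrix without the 3-D intermediate."""
    return cdist(x, y, metric="cityblock")


# Small usage example with random 2D coordinates.
rng = np.random.default_rng(0)
x = rng.random((200, 2))
y = rng.random((150, 2))
d_old = manhattan_broadcast(x, y)
d_new = manhattan_cdist(x, y)
```

Both functions return an (N, M) matrix, so callers that only consume the distance matrix should not need to change.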
BasicDistance (datasets.py)
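One way a torch backend with chunking might look, as a sketch under assumptions rather than the PR's implementation. pairwise_euclidean is a hypothetical stand-in for BasicDistance, Euclidean distance is assumed (it is the default for torch.cdist), and 'auto' is assumed to fall back to scipy when torch or CUDA is unavailable. Only the parameter names backend, device, and chunk_size come from this PR.

```python
import numpy as np
from scipy.spatial.distance import cdist

try:
    import torch
except ImportError:  # torch is optional in this sketch
    torch = None


def pairwise_euclidean(x, y, backend="scipy", device=None, chunk_size=None):
    """Hypothetical stand-in for BasicDistance; returns an (N, M) numpy array."""
    if backend == "auto":
        # Assumption: 'auto' picks torch only when CUDA is usable.
        use_torch = torch is not None and torch.cuda.is_available()
        backend = "torch" if use_torch else "scipy"
    if backend == "scipy":
        return cdist(x, y, metric="euclidean")
    if backend != "torch":
        raise ValueError(f"unknown backend: {backend!r}")
    if torch is None:
        raise ImportError("backend='torch' requires PyTorch")

    if device is None:
        device = "cuda" if torch.cuda.is_available() else "cpu"
    xt = torch.as_tensor(np.asarray(x), dtype=torch.float64, device=device)
    yt = torch.as_tensor(np.asarray(y), dtype=torch.float64, device=device)

    # Process rows of x in blocks so only a (chunk, M) slab is on the device at once.
    step = chunk_size or max(1, xt.shape[0])
    blocks = [
        torch.cdist(xt[start:start + step], yt).cpu()
        for start in range(0, xt.shape[0], step)
    ]
    return torch.cat(blocks, dim=0).numpy()
```

Chunking over rows bounds peak device memory by roughly chunk_size × M entries per block, at the cost of more kernel launches and host transfers.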
New parameters: backend, device, chunk_size, return_tensor
init_dataset / init_dataset_split
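The backend binding for init_dataset / init_dataset_split can be illustrated with a small sketch. basic_distance and resolve_spatial_fun are hypothetical stand-ins, since the real BasicDistance and init_dataset signatures are not shown in this PR text. Only the condition and the use of functools.partial follow the PR description.

```python
from functools import partial


def basic_distance(x, y, backend="scipy", device=None, chunk_size=None):
    """Hypothetical stand-in for BasicDistance; just reports the chosen options."""
    return {"backend": backend, "device": device, "chunk_size": chunk_size}


def resolve_spatial_fun(spatial_fun, distance_backend="scipy"):
    """Bind the backend only for a non-default backend and the built-in distance."""
    if distance_backend != "scipy" and spatial_fun is basic_distance:
        # partial fixes the keyword so later calls keep the plain (x, y) signature.
        return partial(spatial_fun, backend=distance_backend)
    return spatial_fun


bound = resolve_spatial_fun(basic_distance, distance_backend="torch")
result = bound([[0.0, 0.0]], [[1.0, 1.0]])
```

Because the identity check uses is, a user-supplied spatial_fun is passed through untouched and never receives a backend keyword it may not accept.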
Adds a distance_backend parameter that is passed through to BasicDistance: when distance_backend != 'scipy' and spatial_fun is BasicDistance, functools.partial is used to bind the backend arguments.
Known Limitations
- init_dataset_cv does not yet support the distance_backend parameter
- return_tensor=True is untested on the GTNNWR path (the current code never triggers this combination)
Benchmark
benchmarks/bench_distance.py is included so the results can be reproduced:
ManhattanDistance (2D coords, 3 runs mean):
Test Plan
- pytest tests/test_model_e2e.py: end-to-end training results unchanged
- … np.allclose atol=1e-4)
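A correctness check with np.allclose at atol=1e-4, combined with a three-run mean timing, could be wired up roughly as follows. This is an illustrative harness and not the contents of benchmarks/bench_distance.py. mean_runtime and compare are hypothetical helpers, and the two lambdas are assumed stand-ins for the old and new ManhattanDistance.

```python
import time

import numpy as np
from scipy.spatial.distance import cdist


def mean_runtime(fn, x, y, runs=3):
    """Average wall-clock time of fn(x, y) over a fixed number of runs."""
    elapsed = []
    for _ in range(runs):
        start = time.perf_counter()
        fn(x, y)
        elapsed.append(time.perf_counter() - start)
    return sum(elapsed) / runs


def compare(reference, candidate, x, y, runs=3, atol=1e-4):
    """Check numerical agreement first, then report the relative speedup."""
    matches = bool(np.allclose(reference(x, y), candidate(x, y), atol=atol))
    t_ref = mean_runtime(reference, x, y, runs)
    t_new = mean_runtime(candidate, x, y, runs)
    # Guard against a zero denominator on very coarse timers.
    return {"matches": matches, "speedup": t_ref / max(t_new, 1e-12)}


rng = np.random.default_rng(0)
x = rng.random((500, 2))
y = rng.random((500, 2))
report = compare(
    lambda a, b: np.abs(a[:, np.newaxis, :] - b).sum(axis=-1),  # broadcasting
    lambda a, b: cdist(a, b, metric="cityblock"),                # cdist
    x,
    y,
)
```

The measured speedup depends on N, the hardware, and the library builds, so figures from a run like this are not directly comparable with the numbers quoted in this PR.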