Commit 5e7dce7

Migrate to uv, fix Cython for ARM, support Python 3.10-3.14 (#597)
* Migrate to uv, fix Cython extensions for ARM and modern Python

- Replace pip/requirements.txt/setup.cfg with uv and modern pyproject.toml
- Fix Cython 3.x compatibility: move extern declarations from _dice.pxd into _dice.pyx with C name aliasing to resolve name collisions
- Add language_level=3 directive to all .pyx files
- Replace deprecated pkg_resources with importlib.metadata
- Fix zero-length filter divide-by-zero edge case in _dice_x86.py
- Update CI: use uv, test Python 3.10-3.14, add ARM Linux runner (ubuntu-24.04-arm), add QEMU aarch64 wheel builds
- Native C extensions now build and load on ARM (NEON) and x86

* Fix char signedness mismatch on ARM Linux

Use int8_t (signed char) for Cython memoryview types instead of char, since Python's array('b') is always signed char but plain char is unsigned on ARM Linux. Cast to const char* at C function call sites.

* Remove Azure Pipeline configuration

The Azure DevOps account is no longer active. CI has been migrated to GitHub Actions.

* Add mypy typecheck job to GitHub Actions

Adds mypy and types-setuptools as dev dependencies and a typecheck job to the unit testing workflow, replacing the Azure Pipeline static checks stage.

* Bump version to 0.16.0, update docs and changelog

- Version bump to 0.16.0 reflecting breaking changes (dropped Python 3.8/3.9, Cython 3 requirement, build system migration)
- Update README: remove Azure badge, refresh install instructions for uv, update benchmark numbers with ARM results, update test output
- Add changelog entry for 0.16.0
- Clean up MANIFEST.in: remove stale _entitymatcher reference, add _multiparty_solving_inner.cpp
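The signedness fix described in the commit message can be illustrated from the Python side. The following is a minimal sketch using only the standard library; the helper names `buffer_format` and `reinterpret_byte` are hypothetical and are not part of anonlink. It shows that `array('b')` exports signed-char items through the buffer protocol, and that the same raw byte reads differently as a signed and as an unsigned char, which is the kind of mismatch a plain `char` memoryview can hit on platforms where `char` is unsigned.

```python
import struct
from array import array


def buffer_format(values):
    """Return the buffer-protocol format code of an array('b') holding `values`."""
    return memoryview(array("b", values)).format


def reinterpret_byte(raw):
    """Interpret one raw byte (0..255) as a signed char and as an unsigned char."""
    packed = bytes([raw])
    signed = struct.unpack("b", packed)[0]    # 'b' = signed char
    unsigned = struct.unpack("B", packed)[0]  # 'B' = unsigned char
    return signed, unsigned


print(buffer_format([-1, 0, 1]))   # format code exported by array('b')
print(reinterpret_byte(0xFF))      # same byte, two interpretations
```

A typed memoryview declared with a fixed-signedness element type such as `int8_t` matches the `'b'` format on every platform, whereas plain `char` depends on the compiler's default.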
1 parent dc93947 commit 5e7dce7

18 files changed: +2114 -494 lines changed

.azurePipeline/cibuildwheel_steps.yml

Lines changed: 0 additions & 16 deletions
This file was deleted.

.azurePipeline/unittest_wheel_steps.yml

Lines changed: 0 additions & 43 deletions
This file was deleted.

.github/workflows/build_wheels.yml

Lines changed: 26 additions & 13 deletions
@@ -13,35 +13,42 @@ jobs:
     runs-on: ${{ matrix.os }}
     strategy:
       matrix:
-        os: [ubuntu-20.04, windows-2019, macos-11]
+        os: [ubuntu-latest, windows-latest, macos-latest]
 
     steps:
-      - uses: actions/checkout@v2
+      - uses: actions/checkout@v4
+
+      # QEMU needed for building Linux ARM wheels on x86 runners
+      - name: Set up QEMU
+        if: runner.os == 'Linux'
+        uses: docker/setup-qemu-action@v3
+        with:
+          platforms: arm64
 
       - name: Build wheels
-        uses: pypa/cibuildwheel@v2.11.1
-        # to supply options, put them in 'env', like:
+        uses: pypa/cibuildwheel@v2.23
         env:
-          CIBW_SKIP: '*-win32 *i686'
+          CIBW_SKIP: '*-win32 *i686 *-musllinux_*'
+          CIBW_ARCHS_LINUX: auto aarch64
           CIBW_ARCHS_MACOS: x86_64 arm64
 
-      - uses: actions/upload-artifact@v3
+      - uses: actions/upload-artifact@v4
         with:
-          name: binary-wheels
+          name: wheels-${{ matrix.os }}
           path: ./wheelhouse/*.whl
 
   build_sdist:
     name: Build source distribution
     runs-on: ubuntu-latest
     steps:
-      - uses: actions/checkout@v2
+      - uses: actions/checkout@v4
 
       - name: Build sdist
         run: pipx run build --sdist
 
-      - uses: actions/upload-artifact@v3
+      - uses: actions/upload-artifact@v4
         with:
-          name: binary-wheels
+          name: sdist
           path: dist/*.tar.gz
 
   upload_pypi:
@@ -51,12 +58,18 @@ jobs:
     # upload to PyPI on every release
     if: github.event.release && github.event.action == 'published'
     steps:
-      - uses: actions/download-artifact@v3
+      - uses: actions/download-artifact@v4
+        with:
+          pattern: wheels-*
+          merge-multiple: true
+          path: dist
+
+      - uses: actions/download-artifact@v4
         with:
-          name: binary-wheels
+          name: sdist
           path: dist
 
-      - uses: pypa/gh-action-pypi-publish@v1.4.2
+      - uses: pypa/gh-action-pypi-publish@v1.12
         with:
           user: __token__
           password: ${{ secrets.PYPI_API_TOKEN }}

.github/workflows/unittests.yml

Lines changed: 39 additions & 25 deletions
@@ -14,35 +14,49 @@ jobs:
     name: Unittest Anonlink ${{ matrix.python }} ${{ matrix.os }}
     runs-on: ${{ matrix.os }}
     strategy:
+      fail-fast: false
       matrix:
-        os: [macos-latest, windows-latest, ubuntu-20.04]
-        python: ["3.8", "3.9", "3.10", "3.11"]
-
+        os: [ubuntu-latest, macos-latest, windows-latest]
+        python: ["3.10", "3.11", "3.12", "3.13", "3.14"]
+        include:
+          # Native ARM Linux runner
+          - os: ubuntu-24.04-arm
+            python: "3.13"
+
     steps:
-      - uses: actions/checkout@v2
-      - name: Set up Python ${{ matrix.python }}
-        uses: actions/setup-python@v2
+      - uses: actions/checkout@v4
+      - uses: astral-sh/setup-uv@v4
         with:
-          python-version: ${{ matrix.python }}
-      - name: Get full Python version
-        id: full-python-version
-        shell: bash
-        run: echo ::set-output name=version::$(python -c "import sys; print('-'.join(str(v) for v in sys.version_info))")
-      - name: Install dependencies
-        shell: bash
-        run: |
-          python -m pip install --upgrade pip
-          python -m pip install -U -r requirements.txt
-      - name: Build and install anonlink
-        shell: bash
+          enable-cache: true
+      - name: Install dependencies and build
+        run: uv sync --python ${{ matrix.python }} --extra test
+      - name: Verify native extensions loaded
         run: |
-          python -m pip install -e .
+          uv run python -c "
+          from anonlink.similarities import dice_coefficient
+          from anonlink.solving import greedy_solve
+          print(f'Similarity: {dice_coefficient.__module__}.{dice_coefficient.__name__}')
+          print(f'Solver: {greedy_solve.__module__}.{greedy_solve.__name__}')
+          assert 'accelerated' in dice_coefficient.__name__, 'Native dice extension not loaded'
+          assert 'native' in greedy_solve.__name__, 'Native solver extension not loaded'
+          "
       - name: Test with pytest
-        run: |
-          pytest -q
+        run: uv run pytest -q
       - name: Test with extended size 100k
-        if: ${{ matrix.python == 3.10}}
-        env:
+        if: matrix.python == '3.13' && matrix.os == 'ubuntu-latest'
+        env:
           INCLUDE_100K: 1
-        run: |
-          pytest -q
+        run: uv run pytest -q
+
+  typecheck:
+    name: Typecheck
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+      - uses: astral-sh/setup-uv@v4
+        with:
+          enable-cache: true
+      - name: Install dependencies
+        run: uv sync --python 3.13
+      - name: Run mypy
+        run: uv run mypy anonlink --ignore-missing-imports
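The "Verify native extensions loaded" step in the workflow above relies on the public callables carrying a telling `__name__`. The following sketch restates that check as reusable helpers. The helper names `qualified_name` and `is_native` are hypothetical, and the two stand-in functions exist only so the sketch runs without anonlink installed; `dice_coefficient_python` in particular is an assumed fallback name, not one confirmed by this commit.

```python
def qualified_name(func):
    """Return 'module.name' for a callable, in the form the CI step prints."""
    return f"{func.__module__}.{func.__name__}"


def is_native(func, marker):
    """Return True when the callable's __name__ contains the expected marker."""
    return marker in func.__name__


# Hypothetical stand-ins for the real anonlink callables.
def dice_coefficient_accelerated():
    pass


def dice_coefficient_python():
    pass


for candidate in (dice_coefficient_accelerated, dice_coefficient_python):
    status = "native" if is_native(candidate, "accelerated") else "fallback"
    print(f"{qualified_name(candidate)}: {status}")
```

Checking the name rather than only importing the module means a silent fallback to a pure-Python implementation fails the job instead of passing unnoticed.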

CHANGELOG.rst

Lines changed: 13 additions & 0 deletions
@@ -1,3 +1,16 @@
+0.16.0
+======
+
+- Migrate build system to uv with modern pyproject.toml. Remove requirements.txt and setup.cfg.
+- Fix Cython extensions for Cython 3.x compatibility. Native C++ extensions now build on all platforms.
+- Add ARM (aarch64/Apple Silicon) support for native C++ extensions using NEON intrinsics.
+- Fix char signedness mismatch on ARM Linux (``const char`` vs ``signed char``).
+- Fix divide-by-zero with zero-length filters in accelerated Dice comparison.
+- Replace deprecated ``pkg_resources`` with ``importlib.metadata``.
+- Add Python 3.12, 3.13, 3.14 support. Drop Python 3.8, 3.9.
+- Migrate CI from Azure Pipelines to GitHub Actions with ARM Linux runner.
+- Add mypy typecheck job to CI.
+
 0.15.3
 ======
 
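The divide-by-zero entry in the changelog above concerns the Dice coefficient, 2|A AND B| / (|A| + |B|), whose denominator is zero when both filters have no set bits (including filters of length zero). The following is a minimal pure-Python sketch of such a guard, not the project's accelerated implementation; the function names are hypothetical, and returning 0.0 for the degenerate case is an assumption about the intended behaviour.

```python
def popcount(data):
    """Count the set bits in a bytes object."""
    return sum(bin(byte).count("1") for byte in data)


def dice_coefficient(a, b):
    """Dice coefficient of two equal-length bit filters given as bytes.

    Returns 0.0 when neither filter has any set bits (assumed convention),
    instead of dividing by zero.
    """
    if len(a) != len(b):
        raise ValueError("filters must have the same length")
    denominator = popcount(a) + popcount(b)
    if denominator == 0:
        return 0.0
    intersection = sum(bin(x & y).count("1") for x, y in zip(a, b))
    return 2 * intersection / denominator


print(dice_coefficient(b"", b""))          # zero-length filters, guarded
print(dice_coefficient(b"\x0f", b"\x03"))  # 2 * 2 / (4 + 2)
```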

MANIFEST.in

Lines changed: 1 addition & 1 deletion
@@ -5,6 +5,6 @@ global-exclude __pycache__/*
 global-include *.pyx
 global-include *.pxd
 include anonlink/solving/_multiparty_solving_inner.h
+include anonlink/solving/_multiparty_solving_inner.cpp
 include anonlink/similarities/libpopcount.h
-include anonlink/_entitymatcher*
 include anonlink/similarities/dice.cpp

README.rst

Lines changed: 46 additions & 61 deletions
@@ -1,7 +1,4 @@
 
-.. image:: https://dev.azure.com/data61/Anonlink/_apis/build/status/data61.anonlink?branchName=main
-    :target: https://dev.azure.com/data61/Anonlink/_build/latest?definitionId=3&branchName=main
-
 .. image:: https://github.com/data61/anonlink/actions/workflows/unittests.yml/badge.svg?branch=main
     :target: https://github.com/data61/anonlink/actions/workflows/unittests.yml
 

@@ -30,79 +27,81 @@ Install a precompiled wheel from PyPi::
 
     pip install anonlink
 
-Or (if your system has a C++ compiler) you can locally install from source::
+Or install from source using `uv <https://docs.astral.sh/uv/>`__::
 
-    pip install -r requirements.txt
-    pip install -e .
+    uv sync
 
 
 Benchmark
 ---------
 
-You can run the benchmark with:
+You can run the benchmark with ``python -m anonlink.benchmark`` (or ``uv run python -m anonlink.benchmark``).
+
+The following results were obtained on an Apple M1 (ARM):
 
 ::
 
     $ python -m anonlink.benchmark
     Anonlink benchmark -- see README for explanation
     ------------------------------------------------
+    using 'greedy_solve_native' as solver and 'dice_coefficient_accelerated' as similarity metric
 
     Threshold: 0.5, All results returned
     Size 1 | Size 2 | Comparisons | Total Time (s) | Throughput
     | | (match %) | (comparisons / matching)| (1e6 cmp/s)
     -------+--------+------------------+-------------------------+-------------
-    1000 | 1000 | 1e6 (50.73%) | 0.762 (49.2% / 50.8%) | 2.669
-    2000 | 2000 | 4e6 (51.04%) | 3.696 (42.6% / 57.4%) | 2.540
-    3000 | 3000 | 9e6 (50.25%) | 8.121 (43.5% / 56.5%) | 2.548
-    4000 | 4000 | 16e6 (50.71%) | 15.560 (41.1% / 58.9%) | 2.504
+    1000 | 1000 | 1e6 (48.94%) | 0.201 (59.2% / 40.8%) | 8.426
+    2000 | 2000 | 4e6 (49.95%) | 1.344 (37.1% / 62.9%) | 8.025
+    3000 | 3000 | 9e6 (50.11%) | 3.204 (36.0% / 64.0%) | 7.799
+    4000 | 4000 | 16e6 (49.86%) | 5.873 (35.3% / 64.7%) | 7.725
 
     Threshold: 0.5, Top 100 matches per record returned
     Size 1 | Size 2 | Comparisons | Total Time (s) | Throughput
     | | (match %) | (comparisons / matching)| (1e6 cmp/s)
     -------+--------+------------------+-------------------------+-------------
-    1000 | 1000 | 1e6 ( 6.86%) | 0.170 (85.9% / 14.1%) | 6.846
-    2000 | 2000 | 4e6 ( 3.22%) | 0.384 (82.9% / 17.1%) | 12.561
-    3000 | 3000 | 9e6 ( 2.09%) | 0.612 (81.6% / 18.4%) | 18.016
-    4000 | 4000 | 16e6 ( 1.52%) | 0.919 (78.7% / 21.3%) | 22.135
-    5000 | 5000 | 25e6 ( 1.18%) | 1.163 (80.8% / 19.2%) | 26.592
-    6000 | 6000 | 36e6 ( 0.97%) | 1.535 (75.4% / 24.6%) | 31.113
-    7000 | 7000 | 49e6 ( 0.82%) | 1.791 (80.6% / 19.4%) | 33.951
-    8000 | 8000 | 64e6 ( 0.71%) | 2.095 (81.5% / 18.5%) | 37.466
-    9000 | 9000 | 81e6 ( 0.63%) | 2.766 (72.5% / 27.5%) | 40.389
-    10000 | 10000 | 100e6 ( 0.56%) | 2.765 (81.7% / 18.3%) | 44.277
-    20000 | 20000 | 400e6 ( 0.27%) | 7.062 (86.2% / 13.8%) | 65.711
+    1000 | 1000 | 1e6 ( 6.79%) | 0.064 (84.6% / 15.4%) | 18.503
+    2000 | 2000 | 4e6 ( 3.23%) | 0.134 (85.9% / 14.1%) | 34.651
+    3000 | 3000 | 9e6 ( 2.07%) | 0.220 (86.7% / 13.3%) | 47.213
+    4000 | 4000 | 16e6 ( 1.53%) | 0.310 (86.2% / 13.8%) | 59.837
+    5000 | 5000 | 25e6 ( 1.18%) | 0.414 (85.7% / 14.3%) | 70.435
+    6000 | 6000 | 36e6 ( 0.98%) | 0.524 (86.7% / 13.3%) | 79.239
+    7000 | 7000 | 49e6 ( 0.83%) | 0.636 (86.3% / 13.7%) | 89.303
+    8000 | 8000 | 64e6 ( 0.71%) | 0.794 (82.8% / 17.2%) | 97.306
+    9000 | 9000 | 81e6 ( 0.64%) | 0.894 (86.1% / 13.9%) | 105.184
+    10000 | 10000 | 100e6 ( 0.56%) | 1.034 (86.8% / 13.2%) | 111.325
+    20000 | 20000 | 400e6 ( 0.27%) | 2.679 (87.3% / 12.7%) | 170.965
 
     Threshold: 0.7, All results returned
     Size 1 | Size 2 | Comparisons | Total Time (s) | Throughput
     | | (match %) | (comparisons / matching)| (1e6 cmp/s)
     -------+--------+------------------+-------------------------+-------------
-    1000 | 1000 | 1e6 ( 0.01%) | 0.009 (99.0% / 1.0%) | 113.109
-    2000 | 2000 | 4e6 ( 0.01%) | 0.033 (98.7% / 1.3%) | 124.076
-    3000 | 3000 | 9e6 ( 0.01%) | 0.071 (99.1% / 0.9%) | 128.515
-    4000 | 4000 | 16e6 ( 0.01%) | 0.123 (99.0% / 1.0%) | 131.654
-    5000 | 5000 | 25e6 ( 0.01%) | 0.202 (99.1% / 0.9%) | 124.999
-    6000 | 6000 | 36e6 ( 0.01%) | 0.277 (99.0% / 1.0%) | 131.403
-    7000 | 7000 | 49e6 ( 0.01%) | 0.368 (98.9% / 1.1%) | 134.428
-    8000 | 8000 | 64e6 ( 0.01%) | 0.490 (99.0% / 1.0%) | 131.891
-    9000 | 9000 | 81e6 ( 0.01%) | 0.608 (99.0% / 1.0%) | 134.564
-    10000 | 10000 | 100e6 ( 0.01%) | 0.753 (99.0% / 1.0%) | 134.105
-    20000 | 20000 | 400e6 ( 0.01%) | 2.905 (98.8% / 1.2%) | 139.294
+    1000 | 1000 | 1e6 ( 0.01%) | 0.003 (99.0% / 1.0%) | 312.191
+    2000 | 2000 | 4e6 ( 0.01%) | 0.011 (98.1% / 1.9%) | 356.224
+    3000 | 3000 | 9e6 ( 0.01%) | 0.026 (99.0% / 1.0%) | 347.011
+    4000 | 4000 | 16e6 ( 0.01%) | 0.045 (99.0% / 1.0%) | 358.368
+    5000 | 5000 | 25e6 ( 0.01%) | 0.071 (98.9% / 1.1%) | 356.423
+    6000 | 6000 | 36e6 ( 0.01%) | 0.098 (98.9% / 1.1%) | 370.163
+    7000 | 7000 | 49e6 ( 0.01%) | 0.133 (98.9% / 1.1%) | 373.096
+    8000 | 8000 | 64e6 ( 0.01%) | 0.172 (98.9% / 1.1%) | 377.015
+    9000 | 9000 | 81e6 ( 0.01%) | 0.218 (98.9% / 1.1%) | 374.817
+    10000 | 10000 | 100e6 ( 0.01%) | 0.272 (99.0% / 1.0%) | 371.551
+    20000 | 20000 | 400e6 ( 0.01%) | 1.053 (99.0% / 1.0%) | 383.731
 
     Threshold: 0.7, Top 100 matches per record returned
     Size 1 | Size 2 | Comparisons | Total Time (s) | Throughput
     | | (match %) | (comparisons / matching)| (1e6 cmp/s)
     -------+--------+------------------+-------------------------+-------------
-    1000 | 1000 | 1e6 ( 0.01%) | 0.009 (99.0% / 1.0%) | 111.640
-    2000 | 2000 | 4e6 ( 0.01%) | 0.033 (98.6% / 1.4%) | 122.060
-    3000 | 3000 | 9e6 ( 0.01%) | 0.074 (99.1% / 0.9%) | 123.237
-    4000 | 4000 | 16e6 ( 0.01%) | 0.124 (99.0% / 1.0%) | 130.204
-    5000 | 5000 | 25e6 ( 0.01%) | 0.208 (99.1% / 0.9%) | 121.351
-    6000 | 6000 | 36e6 ( 0.01%) | 0.275 (99.0% / 1.0%) | 132.186
-    7000 | 7000 | 49e6 ( 0.01%) | 0.373 (99.0% / 1.0%) | 132.650
-    8000 | 8000 | 64e6 ( 0.01%) | 0.496 (99.1% / 0.9%) | 130.125
-    9000 | 9000 | 81e6 ( 0.01%) | 0.614 (99.0% / 1.0%) | 133.216
-    10000 | 10000 | 100e6 ( 0.01%) | 0.775 (99.1% / 0.9%) | 130.230
-    20000 | 20000 | 400e6 ( 0.01%) | 2.939 (98.9% / 1.1%) | 137.574
+    1000 | 1000 | 1e6 ( 0.01%) | 0.003 (98.9% / 1.1%) | 314.762
+    2000 | 2000 | 4e6 ( 0.01%) | 0.011 (98.7% / 1.3%) | 357.730
+    3000 | 3000 | 9e6 ( 0.01%) | 0.024 (98.9% / 1.1%) | 372.850
+    4000 | 4000 | 16e6 ( 0.01%) | 0.044 (98.9% / 1.1%) | 363.783
+    5000 | 5000 | 25e6 ( 0.01%) | 0.066 (98.9% / 1.1%) | 382.863
+    6000 | 6000 | 36e6 ( 0.01%) | 0.095 (98.9% / 1.1%) | 383.880
+    7000 | 7000 | 49e6 ( 0.01%) | 0.128 (98.9% / 1.1%) | 385.778
+    8000 | 8000 | 64e6 ( 0.01%) | 0.171 (98.9% / 1.1%) | 377.762
+    9000 | 9000 | 81e6 ( 0.01%) | 0.210 (99.0% / 1.0%) | 389.182
+    10000 | 10000 | 100e6 ( 0.01%) | 0.275 (99.0% / 1.0%) | 367.465
+    20000 | 20000 | 400e6 ( 0.01%) | 1.040 (99.0% / 1.0%) | 388.491
 
 
 The tables are interpreted as follows. Each table measures the throughput
@@ -135,24 +134,10 @@ matrix, which will be approximately `#comparisons * match% / 100`.
 Tests
 =====
 
-Run unit tests with `pytest`:
-
-::
-
-    $ pytest
-    ====================================== test session starts ======================================
-    platform linux -- Python 3.6.4, pytest-3.2.5, py-1.4.34, pluggy-0.4.0
-    rootdir: /home/hlaw/src/n1-anonlink, inifile:
-    collected 71 items
-
-    tests/test_benchmark.py ...
-    tests/test_bloommatcher.py ..............
-    tests/test_e2e.py .............ss....
-    tests/test_matcher.py ..x.....x......x....x..
-    tests/test_similarity.py .........
-    tests/test_util.py ...
+Run unit tests with ``pytest``::
 
-    ======================== 65 passed, 2 skipped, 4 xfailed in 4.01 seconds ========================
+    $ uv run pytest -q
+    9051 passed, 2 skipped in 12.87s
 
 To enable slightly larger tests add the following environment variables:
 
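For the benchmark tables in the README diff above, the Throughput column appears to be the number of comparisons divided by the time spent in the comparison phase only (total time multiplied by the comparison percentage), rather than by total time. This reading is inferred from the table values, not stated in the diff, and the function name below is hypothetical. A short sketch that reproduces two of the new rows to within rounding:

```python
def comparison_throughput(comparisons, total_seconds, comparison_percent):
    """Millions of comparisons per second of comparison-phase time."""
    comparison_seconds = total_seconds * comparison_percent / 100.0
    return comparisons / comparison_seconds / 1e6


# Threshold 0.5, all results, 1000 x 1000: table reports 8.426
print(round(comparison_throughput(1e6, 0.201, 59.2), 3))

# Threshold 0.5, top 100, 20000 x 20000: table reports 170.965
print(round(comparison_throughput(400e6, 2.679, 87.3), 3))
```

The small differences from the published figures are expected, since the table rounds both the total time and the percentage split.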
