Skip to content

Concatenate and shift functions#3165

Merged
jeromekelleher merged 3 commits intotskit-dev:mainfrom
hyanwong:concatenate
May 21, 2025
Merged

Concatenate and shift functions#3165
jeromekelleher merged 3 commits intotskit-dev:mainfrom
hyanwong:concatenate

Conversation

@hyanwong
Copy link
Copy Markdown
Member

@hyanwong hyanwong commented May 15, 2025

Description

Fixes #3164

PR Checklist:

  • Tests that fully cover new/changed functionality.
  • Documentation including tutorial content if appropriate.
  • Changelogs, if there are API changes.

@codecov
Copy link
Copy Markdown

codecov bot commented May 15, 2025

Codecov Report

Attention: Patch coverage is 97.72727% with 1 line in your changes missing coverage. Please review.

Project coverage is 89.61%. Comparing base (4187575) to head (a39e503).
Report is 4 commits behind head on main.

Files with missing lines Patch % Lines
python/tskit/trees.py 96.77% 0 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3165      +/-   ##
==========================================
+ Coverage   89.60%   89.61%   +0.01%     
==========================================
  Files          28       28              
  Lines       31895    31939      +44     
  Branches     5857     5866       +9     
==========================================
+ Hits        28579    28622      +43     
  Misses       1886     1886              
- Partials     1430     1431       +1     
Flag Coverage Δ
c-tests 86.66% <ø> (ø)
lwt-tests 80.38% <ø> (ø)
python-c-tests 88.18% <ø> (ø)
python-tests 98.84% <97.72%> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
python/tskit/tables.py 99.34% <100.00%> (+<0.01%) ⬆️
python/tskit/trees.py 98.81% <96.77%> (-0.03%) ⬇️
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@hyanwong hyanwong force-pushed the concatenate branch 2 times, most recently from 429de0d to 8341a07 Compare May 15, 2025 20:43
@hyanwong
Copy link
Copy Markdown
Member Author

Thanks for the review @petrelharp . I tacked on another quick commit to add that ref to subset in the union docs.

@hyanwong hyanwong force-pushed the concatenate branch 2 times, most recently from 3330365 to 1923865 Compare May 16, 2025 07:45
Copy link
Copy Markdown
Member

@jeromekelleher jeromekelleher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a few tricky subtleties not handled here.

  1. We should definitely not alter a parameter table collection - so I don't think we need TableCollection.concatenate and can just have TreeSequence.concatenate(other1, other2,....) (I realise that I asked you to focus on the pairwise approach, but I hadn't realised that we were updating all the tablecollections in order to implement)
  2. I'm not clear how TableCollection.shift() is interacting with the indexes. This is important.

@hyanwong
Copy link
Copy Markdown
Member Author

OK, refactored so we only have a single concatenate tree sequence function. I changed my mind on the *args call, and went with what Jerome suggested (but still need a list of node_mappings)

@petrelharp
Copy link
Copy Markdown
Contributor

We should definitely not alter a parameter table collection

Sorry what did you mean there, @jeromekelleher?

@petrelharp
Copy link
Copy Markdown
Contributor

petrelharp commented May 17, 2025

And - I don't see why shifting should change the indexes?

@jeromekelleher
Copy link
Copy Markdown
Member

We should definitely not alter a parameter table collection

Sorry what did you mean there, @jeromekelleher?

The previous version called .shift on the parameter TableCollection, which modifies it. So, calling .concat twice on the same input would give different answers, e.g., which would be pretty nasty

@jeromekelleher
Copy link
Copy Markdown
Member

And - I don't see why shifting should change the indexes?

I guess it shouldn't alter the relative order of the edges all right - but I'm not clear whether the current code will automatically invalidate the indexes, or if they are just left as is and that's OK?

Copy link
Copy Markdown
Member

@jeromekelleher jeromekelleher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good - missing an edge case on migrations though.

@hyanwong
Copy link
Copy Markdown
Member Author

Addressed comments and squashed. Should be mergeable.

Copy link
Copy Markdown
Member

@jeromekelleher jeromekelleher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to merge, once we've covered the remaining missing bit

@jeromekelleher jeromekelleher added this pull request to the merge queue May 21, 2025
@jeromekelleher jeromekelleher mentioned this pull request May 21, 2025
Merged via the queue into tskit-dev:main with commit 0663aaa May 21, 2025
19 checks passed
@petrelharp petrelharp mentioned this pull request May 29, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

New concatenate function to join tree sequences

3 participants