IMP: Swap rarefy random_seed to CaptureHolder#352
IMP: Swap rarefy random_seed to CaptureHolder#352Oddant1 wants to merge 10 commits intoqiime2:devfrom
Conversation
ebolyen
left a comment
There was a problem hiding this comment.
I think we're missing a piece to make this scale across plugins effectively. I've got a few ideas below. Let me know what you think.
| ['O1', 'O2'], | ||
| ['S1', 'S2', 'S3']) | ||
| a = rarefy(t, 2) | ||
| a = rarefy(t, 2, random_seed=FakeCaptureHolder()) |
There was a problem hiding this comment.
I think the number of these needed might suggest something is off.
Should we consider that we either need to test the Plugin side of things, so that random_seed is initialized, or should we alter rarefy such that None is also fine (which is what the type technically says...). I'm leaning towards the latter, since Rachis would be obeying the signature by always providing a capture-holder, and it makes it easier to work with for the tests.
We could also imagine a test utility/decorator which injects any Rachis objects into the function, but I don't think that makes the tests any easier to write or think about.
There was a problem hiding this comment.
Yes, I found it annoying to change them all. No, I don't think we should alter rarefy such that None is actually a valid argument to it. We want a CaptureHolder with None not an actual None. I don't think nuisances arising from calling the functions behind actions without the framework should be resolved by modifying the actions because we kind of fundamentally don't respect calling the function in this way in a real world context.
q2_feature_table/_normalize.py
Outdated
| if random_seed.value is None: | ||
| random_int = random.randrange(RNG_MAX_SIZE) | ||
| random_seed.set_value(random_int) |
There was a problem hiding this comment.
Based on the above comment about the number of instances we need for a test, what if we imagined something like this:
| if random_seed.value is None: | |
| random_int = random.randrange(RNG_MAX_SIZE) | |
| random_seed.set_value(random_int) | |
| if random_seed is None or random_seed.value is None: | |
| random_int = random.randrange(RNG_MAX_SIZE) | |
| else: | |
| random_int = random_seed.value | |
| if random_seed and random_seed.value is None: | |
| random_seed.set_value(random_int) |
This also ends up being kind of bad...
So what about this:
| if random_seed.value is None: | |
| random_int = random.randrange(RNG_MAX_SIZE) | |
| random_seed.set_value(random_int) | |
| random_seed = CaptureHolder.getorset(random_seed, random.randrange(RNG_MAX_SIZE)) |
Where if random_seed is None, it will construct one, but if it was provided an CaptureHolder, it inspects the state to determine if it was set already by the outer _bind, and if not, will instantiate itself (untethered to a provenance chain) with the secondary argument as the value.
Alternatively the result could just be the value instead of an instance, which might be easier on everybody.
| if random_seed.value is None: | |
| random_int = random.randrange(RNG_MAX_SIZE) | |
| random_seed.set_value(random_int) | |
| random_int = CaptureHolder.getorset(random_seed, random.randrange(RNG_MAX_SIZE)) |
| table = table.subsample(sampling_depth, axis='sample', by_id=False, | ||
| with_replacement=with_replacement, | ||
| seed=random_seed) | ||
| seed=random_seed.value) |
There was a problem hiding this comment.
| seed=random_seed.value) | |
| seed=random_int) |
only if paired with the above
| # ---------------------------------------------------------------------------- | ||
|
|
||
|
|
||
| class FakeCaptureHolder: |
There was a problem hiding this comment.
I expect this would be needed in a lot of places in practice, so it would probably make sense to put it in rachis.plugin.testing as a Mock, but I think we might also be able to avoid it if we are a little clever.
|
blocked by rachis-org/rachis#913 |
Use
CaptureHolderto track value ofrandom_seedin provenance.