Write tests for/add NaN handling of dataframe input case of metapool.format_pooling_echo_pick_list() #340

@AmandaBirmingham

This entire new branch was added and, according to the comments, is now the preferred functionality (the old functionality is labelled "legacy"), yet there are no tests for it. When fixing this, also note that this branch does NOT sanitize NaNs (unlike the "legacy" case).

if isinstance(main_input, pd.DataFrame):
    required_columns = ['Compressed Plate Name',
                        'Library Well',
                        pooling_vol_column]
    if not all(column in main_input.columns for
               column in required_columns):
        raise ValueError(
            "Your input dataframe does not have the "
            "required columns ['Compressed Plate Name'",
            "'Library Well','%s']. Perhaps you are running "
            "this module out of sequential order."
            % pooling_vol_column
        )
    formatted_df = main_input[['Compressed Plate Name',
                               'Library Well',
                               pooling_vol_column,
                               ]]
    # Writing picklist headers
    contents = [
        "Source Plate Name,Source Plate Type,Source Well,"
        "Concentration,Transfer Volume,Destination Plate Name,"
        "Destination Well"
    ]
    # Destination well cycling logic
    running_tot = 0
    d = 1
    if dest_plate_shape is None:
        dest_plate_shape = (16, 24)
    for i, pool_row in formatted_df[[pooling_vol_column]].iterrows():
        pool_vol = pool_row[pooling_vol_column]
        # test to see if we will exceed total vol per well
        if running_tot + pool_vol > max_vol_per_well:
            d += 1
            running_tot = pool_vol
        else:
            running_tot += pool_vol
        dest = "%s%d" % (
            chr(ord("A") + int(np.floor(d / dest_plate_shape[0]))),
            (d % dest_plate_shape[1]))
        # writing picklist from row iterations
        contents.append(",".join([formatted_df.loc[i, 'Compressed' +
                                                   ' Plate Name'],
                                  "384LDV_AQ_B2",
                                  formatted_df.loc[i, 'Library Well'],
                                  "",
                                  "%.2f" % pool_vol,
                                  "NormalizedDNA",
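
To make the NaN concern concrete: in the loop quoted above, a NaN volume makes `running_tot + pool_vol > max_vol_per_well` evaluate to False, so `running_tot` itself becomes NaN, the overflow check can never fire for later rows, and the volume is written out as the string `nan`. Below is a minimal sketch of one possible fix, assuming NaN should be treated as a zero transfer volume. The helper name `sanitize_pooling_volumes` and the example data are hypothetical and not part of metapool; how the "legacy" case actually sanitizes NaNs should be checked before copying this.

```python
import numpy as np
import pandas as pd


def sanitize_pooling_volumes(df, pooling_vol_column):
    """Return a copy of df with NaN pooling volumes replaced by 0.0.

    Hypothetical helper (not part of metapool). Treating NaN as zero
    volume is an assumption about the desired behaviour.
    """
    cleaned = df.copy()
    cleaned[pooling_vol_column] = cleaned[pooling_vol_column].fillna(0.0)
    return cleaned


# Illustrative input with one missing volume.
example = pd.DataFrame({
    'Compressed Plate Name': ['Plate1', 'Plate1', 'Plate1'],
    'Library Well': ['A1', 'A2', 'A3'],
    'iSeq normpool volume': [250.0, np.nan, 500.0],
})

sanitized = sanitize_pooling_volumes(example, 'iSeq normpool volume')

# Same formatting the picklist code applies to each volume.
formatted = ["%.2f" % v for v in sanitized['iSeq normpool volume']]
print(formatted)  # ['250.00', '0.00', '500.00']
```

If zero-volume rows should not appear in the picklist at all, dropping them instead of zero-filling would be the alternative; that choice is left open here.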

AFAICT this new case is only used for the very last pooling (i.e. the iseqnormed pooling) in the metatranscriptomics notebook:

picklist = format_pooling_echo_pick_list(plate_df_normalized,
                                         pooling_vol_column='iSeq normpool volume',
                                         max_vol_per_well=30000)
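
A test for the dataframe input case could check the generated picklist for NaN transfer volumes. The sketch below assumes the picklist is returned as newline-joined CSV text using the header quoted earlier in this issue. The helper `nan_volume_rows` and the sample rows (including the destination wells) are hypothetical and only illustrate the check; a real test would call `format_pooling_echo_pick_list` on a dataframe containing NaNs.

```python
import csv
import io
import math

# Header taken from the code quoted in this issue.
HEADER = ("Source Plate Name,Source Plate Type,Source Well,"
          "Concentration,Transfer Volume,Destination Plate Name,"
          "Destination Well")


def nan_volume_rows(picklist_text):
    """Return indices of data rows whose 'Transfer Volume' is NaN.

    Hypothetical test helper; assumes the picklist is CSV text.
    """
    reader = csv.DictReader(io.StringIO(picklist_text))
    return [idx for idx, row in enumerate(reader)
            if math.isnan(float(row["Transfer Volume"]))]


# Hand-written sample picklists (illustrative values only).
good_picklist = "\n".join([
    HEADER,
    "Plate1,384LDV_AQ_B2,A1,,250.00,NormalizedDNA,A1",
    "Plate1,384LDV_AQ_B2,A2,,500.00,NormalizedDNA,A1",
])
bad_picklist = "\n".join([
    HEADER,
    "Plate1,384LDV_AQ_B2,A1,,250.00,NormalizedDNA,A1",
    "Plate1,384LDV_AQ_B2,A2,,nan,NormalizedDNA,A1",
])

print(nan_volume_rows(good_picklist))  # []
print(nan_volume_rows(bad_picklist))   # [1]
```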

I found it because the dataframe merge upstream of this apparently produces a different row order on different platforms (local vs. GitHub CI), so the picklist created by this new case comes out in a different order on each platform (whereas the legacy case order is stable).
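
One way to make the picklist order independent of the upstream merge would be to sort the rows explicitly before formatting. This is a sketch under the assumption that ordering by plate name and then well is acceptable for the picklist; `stable_picklist_order` is a hypothetical helper, not an existing metapool function.

```python
import pandas as pd


def stable_picklist_order(df):
    """Return df sorted by plate name, then well, with a fresh index.

    Hypothetical helper. The sort is lexicographic on the well string,
    so 'A10' sorts before 'A2'; the point is only that the result does
    not depend on the incoming row order.
    """
    return df.sort_values(
        by=['Compressed Plate Name', 'Library Well'],
        kind='mergesort',
    ).reset_index(drop=True)


# The same rows arriving in two different orders.
rows = pd.DataFrame({
    'Compressed Plate Name': ['Plate1', 'Plate1', 'Plate2'],
    'Library Well': ['A2', 'A1', 'A1'],
    'iSeq normpool volume': [500.0, 250.0, 100.0],
})
shuffled = rows.iloc[::-1]

ordered_a = stable_picklist_order(rows)
ordered_b = stable_picklist_order(shuffled)
print(ordered_a.equals(ordered_b))  # True
```

A test comparing picklists across platforms could apply the same sort to the input first, so that only genuine content differences cause a failure.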

Labels: bug