Predictive proteins folder empty #14

@zinfin

@Noonanav, when running protein-family-workflow with the regression task type and a phenotype matrix that has more than two columns, the modeling_results/model_performance/predictive_proteins folder is left empty. In my case, the phenotype matrix had the following columns:

[Image: screenshot of the phenotype matrix columns]

Here's the command that was used:

genophi protein-family-workflow --input_path_strain /data/od_strains --phenotype_matrix /data/GenoPHI/data/interation_matrices/260212_clean_relative_ods.csv --output ec_m9_no_outliers_v2/ --threads 24 --sample_column strain --phenotype_column relative_od --task_type regression --max_ram 24 --method rfe --num_runs_fs 25 --num_runs_modeling 50 --use_shap

And output from the log:

2026-03-05 01:35:49,903 - INFO - Step 5: Selecting top-performing cutoff and running predictive proteins workflow...
2026-03-05 01:35:49,904 - INFO - Step 1: Extracting predictive features.
2026-03-05 01:35:49,908 - INFO - Predictive feature groups detected: ['Inoculation_datec', 'sc', 'undiluted_ODc']
2026-03-05 01:35:49,908 - WARNING - More than two feature groups detected. Please ensure the correct feature groups are selected.
2026-03-05 01:35:49,908 - ERROR - An error occurred: local variable 'strain_features' referenced before assignment
2026-03-05 01:35:49,908 - INFO - Report saved to: ec_m9_no_outliers_v2/workflow_report.txt
2026-03-05 01:35:49,909 - INFO - CSV log saved to ec_m9_no_outliers_v2/workflow_report.csv.
2026-03-05 01:35:49,909 - ERROR - An error occurred: local variable 'strain_features' referenced before assignment
Traceback (most recent call last):
  File "/data/GenoPHI/genophi/cli.py", line 1402, in main
    protein_family_workflow_command(args)
  File "/data/GenoPHI/genophi/cli.py", line 1290, in protein_family_workflow_command    
    run_protein_family_workflow(
  File "/data/GenoPHI/genophi/workflows/protein_family_workflow.py", line 435, in run_protein_family_workflow
    run_predictive_proteins_workflow( 
  File "/data/GenoPHI/genophi/workflows/feature_annotations_workflow.py", line 60, in run_predictive_proteins_workflow
    select_features = get_predictive_features(feature_file_path, feature_type=feature_type, sample_column=strain_column, phenotype_column=phenotype_column)
  File "/data/GenoPHI/genophi/feature_annotations.py", line 102, in get_predictive_features
    predictive_features = strain_features
UnboundLocalError: local variable 'strain_features' referenced before assignment
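
The traceback is consistent with a common Python pattern: a local variable that is only assigned inside some conditional branches, then read unconditionally afterwards. The sketch below is a minimal, hypothetical reproduction of that pattern and one way to guard against it. It is not GenoPHI's actual code: the function names, the `wanted` parameter, and the selection logic are assumptions made for illustration. Only the group names are taken from the log above.

```python
def select_group_unguarded(feature_groups):
    """Hypothetical stand-in for the failing pattern; not GenoPHI's code."""
    if len(feature_groups) <= 2:
        strain_features = feature_groups[0]
    else:
        # Only a warning would be logged here; nothing is assigned.
        pass
    # Raises UnboundLocalError whenever the else branch was taken.
    return strain_features


def select_group_guarded(feature_groups, wanted="sc"):
    """Hypothetical guard: bind the name up front and fail with a clear message."""
    strain_features = None
    if wanted in feature_groups:
        strain_features = wanted
    if strain_features is None:
        raise ValueError(
            f"Could not find feature group {wanted!r} among {feature_groups}"
        )
    return strain_features


# Group names copied from the log output in this issue.
groups = ["Inoculation_datec", "sc", "undiluted_ODc"]

try:
    select_group_unguarded(groups)
except UnboundLocalError as exc:
    print(f"unguarded: {type(exc).__name__}")

print(f"guarded: {select_group_guarded(groups)}")
```

Whether selecting one group explicitly is the right behaviour for this workflow is an open question. The point of the sketch is only that binding the variable before the branches, and raising a descriptive error when no group is chosen, would surface the problem at the warning instead of leaving the output folder silently empty.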
