Conversation
- ready for hPC
- processed all segs
- rerun organoid segs on HPC
- fixed HPC script
- fixed HPC script
- update run list
- update run list
- update run list
- update run list
- update run list
- segmentations re-completed
- Update 2.segment_images/scripts/0.nuclei_segmentation.py (Co-authored-by: Dave Bunten <ekgto445@gmail.com>)
- addressing comments

Co-authored-by: Dave Bunten <ekgto445@gmail.com>
Pull request overview
This pull request implements z-wise (slice-by-slice) analysis of 3D microscopy images to compare two microscope systems (Echo and CQ1) imaging the same plate of organoids. The analysis computes three image quality metrics (signal-to-noise ratio, Michelson contrast, and RMS contrast) at two levels: across entire 3D volumes and for individual z-slices within each volume. The implementation supports parallel processing and handles both raw images and basicpy-corrected images.
Changes:
- Replaced single 3D analysis script with new 2D/3D analysis that computes metrics both volumetrically and per z-slice
- Added parallel processing capabilities with multiprocessing support
- Created separate visualization outputs for 2D (per-slice) and 3D (whole volume) metrics
- Cleaned up unused imports across multiple analysis scripts
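The parallel layout described above can be sketched roughly as follows. Note that `compute_metrics_for_image` and the example file names are hypothetical stand-ins, not the script's actual function names; this is only a minimal sketch of the `--n_processes` / `--run_parallel` pattern, assuming a per-file worker fanned out over a process pool:

```python
from multiprocessing import Pool


def compute_metrics_for_image(path):
    # hypothetical stand-in: load one 3D image, compute whole-volume (3D)
    # metrics, then loop over z-slices for the per-slice (2D) metrics
    return {"file": path, "snr_3d": 0.0}


def run(paths, n_processes=18, run_parallel=False):
    # mirrors the script's --n_processes / --run_parallel flags:
    # without run_parallel the files are processed serially
    if run_parallel:
        with Pool(processes=n_processes) as pool:
            return pool.map(compute_metrics_for_image, paths)
    return [compute_metrics_for_image(p) for p in paths]


# serial demo; pass run_parallel=True (under an `if __name__ == "__main__"`
# guard on spawn-based platforms) to fan out across processes
results = run(["a.tif", "b.tif"], n_processes=2)
```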
Reviewed changes
Copilot reviewed 13 out of 27 changed files in this pull request and generated 9 comments.
| File | Description |
|---|---|
| 7.technical_analysis/scripts/raw_image_tech_analysis_2D_3D.py | New Python script implementing both 2D (slice-by-slice) and 3D (whole volume) metric calculations with parallel processing support |
| 7.technical_analysis/scripts/raw_image_tech_analysis.py | Removed old script that only computed 3D metrics |
| 7.technical_analysis/scripts/visualize_raw_image_tech_analysis.r | Updated R visualization script to generate separate plots for 2D and 3D metrics with z-slice normalization |
| 7.technical_analysis/notebooks/raw_image_tech_analysis_2D_3D.ipynb | New notebook version of the 2D/3D analysis script |
| 7.technical_analysis/notebooks/raw_image_tech_analysis.ipynb | Removed old notebook |
| 7.technical_analysis/call_2D_3D_analysis_HPC.sh | New shell script for running analysis on HPC with 64 processes |
| 7.technical_analysis/call_2D_3D_analysis.sh | New shell script for running analysis locally with 18 processes |
| 7.technical_analysis/scripts/preprocess_zslice_experiments.py | Cleaned up unused imports |
| 7.technical_analysis/scripts/merge_convolution_profiles.py | Removed unused seaborn import |
| 7.technical_analysis/scripts/compute_mAP_on_convolved_profiles.py | Cleaned up unused imports |
| 7.technical_analysis/scripts/compare_z_slice_number_segmentations.py | Cleaned up unused imports and replaced tqdm imports with pass statements |
| 7.technical_analysis/scripts/0.preprocess_temporary_dirs.py | Cleaned up unused imports |
| .pre-commit-config.yaml | Updated ruff version from v0.15.0 to v0.15.2 |
| .gitignore | Added entries for technical_analysis results and processed_data directories |
```python
argparse = argparse.ArgumentParser(
    description="Process 2D and 3D image metrics in parallel"
)
argparse.add_argument(
    "--n_processes",
    type=int,
    default=18,
    help="Number of processes to use",
)
argparse.add_argument(
    "--run_parallel",
    action="store_true",
    help="Run processing in parallel",
)
n_processes = argparse.parse_args().n_processes
run_parallel = argparse.parse_args().run_parallel
```
The argparse module is being overwritten with an ArgumentParser instance. This will cause the subsequent calls to argparse.add_argument and argparse.parse_args to fail. The variable should be named differently, such as 'parser' or 'arg_parser'.
Suggested change:
```diff
-argparse = argparse.ArgumentParser(
-    description="Process 2D and 3D image metrics in parallel"
-)
-argparse.add_argument(
-    "--n_processes",
-    type=int,
-    default=18,
-    help="Number of processes to use",
-)
-argparse.add_argument(
-    "--run_parallel",
-    action="store_true",
-    help="Run processing in parallel",
-)
-n_processes = argparse.parse_args().n_processes
-run_parallel = argparse.parse_args().run_parallel
+parser = argparse.ArgumentParser(
+    description="Process 2D and 3D image metrics in parallel"
+)
+parser.add_argument(
+    "--n_processes",
+    type=int,
+    default=18,
+    help="Number of processes to use",
+)
+parser.add_argument(
+    "--run_parallel",
+    action="store_true",
+    help="Run processing in parallel",
+)
+args = parser.parse_args()
+n_processes = args.n_processes
+run_parallel = args.run_parallel
```
```python
n_processes = argparse.parse_args().n_processes
run_parallel = argparse.parse_args().run_parallel
```
The parse_args method is called twice, which re-parses the command line arguments and duplicates work. Call parse_args once, store the result, and then access the n_processes and run_parallel attributes from that result.
Suggested change:
```diff
-n_processes = argparse.parse_args().n_processes
-run_parallel = argparse.parse_args().run_parallel
+args = argparse.parse_args()
+n_processes = args.n_processes
+run_parallel = args.run_parallel
```
```r
raw_image_2D_quality_metrics <- raw_image_2D_quality_metrics %>%
    group_by(microscope, channel, well_fov) %>%
    mutate(
        z_slice_normalized = (z_slice - min(z_slice)) / (max(z_slice) - min(z_slice))
```
The z-slice normalization formula will produce NaN values when max and min z_slice are equal (i.e., when there's only one z-slice in a group). Consider adding a check to handle this edge case, such as setting z_slice_normalized to 0 or 0.5 when the denominator is zero.
Suggested change:
```diff
-z_slice_normalized = (z_slice - min(z_slice)) / (max(z_slice) - min(z_slice))
+z_slice_normalized = ifelse(
+    max(z_slice) == min(z_slice),
+    0.5,
+    (z_slice - min(z_slice)) / (max(z_slice) - min(z_slice))
+)
```
```r
+ theme(
    axis.text = element_text(size = 14),
    axis.title = element_text(size = 16),
    legend.text = element_text(size = 14),
    legend.title = element_text(size = 16),
    strip.text = element_text(size = 14)
)
```
The plot has both an inline theme() call (lines 248-254) and the plot_theme applied afterwards (line 255). This is redundant, as plot_theme already defines the same styling properties and overrides the inline call. Consider removing the inline theme() call.
Suggested change:
```diff
-+ theme(
-    axis.text = element_text(size = 14),
-    axis.title = element_text(size = 16),
-    legend.text = element_text(size = 14),
-    legend.title = element_text(size = 16),
-    strip.text = element_text(size = 14)
-)
```
```r
    size = 0.2
)
+ labs(
    x = "Channel",
```
The x-axis label "Channel" is inconsistent with the actual x-axis data being z_slice_normalized. The label should be "Z Slice Normalized" or similar to match the actual data being plotted.
Suggested change:
```diff
-    x = "Channel",
+    x = "Z Slice Normalized",
```
|
|
```shell
conda activate GFF_segmentation

python raw_image_tech_analysis_2D_3D.py --n_processes 64
```
The script calls the Python file without the --run_parallel flag. Based on the Python script's argument parser, parallel processing is only enabled when this flag is passed. Consider adding --run_parallel to explicitly enable parallel processing, especially since the number of processes is being specified.
Suggested change:
```diff
-python raw_image_tech_analysis_2D_3D.py --n_processes 64
+python raw_image_tech_analysis_2D_3D.py --run_parallel --n_processes 64
```
|
|
```shell
conda activate GFF_segmentation

python raw_image_tech_analysis_2D_3D.py --n_processes 18
```
The script calls the Python file without the --run_parallel flag. Based on the Python script's argument parser, parallel processing is only enabled when this flag is passed. Consider adding --run_parallel to explicitly enable parallel processing.
Suggested change:
```diff
-python raw_image_tech_analysis_2D_3D.py --n_processes 18
+python raw_image_tech_analysis_2D_3D.py --n_processes 18 --run_parallel
```
```python
    return instance_mask


def retreive_foreground_background_masks(
```
The function name 'retreive_foreground_background_masks' has a spelling error. It should be 'retrieve_foreground_background_masks' (with 'ie' not 'ei').
```python
df_2D = pd.concat([pd.read_parquet(f) for f in result_2D_files], ignore_index=True)
df_3D = pd.concat([pd.read_parquet(f) for f in result_3D_files], ignore_index=True)
```
The basicpy_status information (indicating whether images are raw or basicpy-corrected) is embedded in the filenames but not stored as a column in the results dataframes. When individual parquet files are concatenated, this metadata is lost. Consider extracting basicpy_status from the filename and adding it as a column to preserve this important metadata for downstream analysis.
Suggested change:
```diff
-df_2D = pd.concat([pd.read_parquet(f) for f in result_2D_files], ignore_index=True)
-df_3D = pd.concat([pd.read_parquet(f) for f in result_3D_files], ignore_index=True)
+# when concatenating, preserve the basicpy/raw status encoded in filenames
+dfs_2D = []
+for f in result_2D_files:
+    df = pd.read_parquet(f)
+    fname_lower = f.name.lower()
+    if "basicpy" in fname_lower:
+        basicpy_status = "basicpy"
+    elif "raw" in fname_lower:
+        basicpy_status = "raw"
+    else:
+        basicpy_status = None
+    df["basicpy_status"] = basicpy_status
+    dfs_2D.append(df)
+df_2D = pd.concat(dfs_2D, ignore_index=True)
+dfs_3D = []
+for f in result_3D_files:
+    df = pd.read_parquet(f)
+    fname_lower = f.name.lower()
+    if "basicpy" in fname_lower:
+        basicpy_status = "basicpy"
+    elif "raw" in fname_lower:
+        basicpy_status = "raw"
+    else:
+        basicpy_status = None
+    df["basicpy_status"] = basicpy_status
+    dfs_3D.append(df)
+df_3D = pd.concat(dfs_3D, ignore_index=True)
```
This PR compares two microscope datasets (Echo and CQ1) that are derived from the same plate of 3D images of organoids.
We compute three metrics:
- signal-to-noise ratio (SNR)
- Michelson contrast
- RMS contrast
These are calculated for:
- entire 3D volumes
- individual z-slices within each volume
For both:
- raw images
- basicpy-corrected images
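The exact formulas used in the analysis script may differ slightly, but under common textbook definitions, a minimal sketch of the three metrics computed both over the whole volume (3D) and per z-slice (2D) might look like:

```python
import numpy as np


def snr(img):
    # one common definition: mean intensity over intensity standard deviation
    return float(img.mean() / img.std())


def michelson_contrast(img):
    # (I_max - I_min) / (I_max + I_min)
    mx, mn = float(img.max()), float(img.min())
    return (mx - mn) / (mx + mn)


def rms_contrast(img):
    # standard deviation of pixel intensities
    return float(img.std())


# synthetic (z, y, x) stack standing in for one microscopy volume
volume = np.random.default_rng(0).random((5, 64, 64))

# 3D: metrics over the whole volume
whole_volume = {
    "snr": snr(volume),
    "michelson": michelson_contrast(volume),
    "rms": rms_contrast(volume),
}

# 2D: the same metrics for each z-slice individually
per_slice = [
    {
        "z_slice": z,
        "snr": snr(sl),
        "michelson": michelson_contrast(sl),
        "rms": rms_contrast(sl),
    }
    for z, sl in enumerate(volume)
]
```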