113 kubernetes integration #130
- Introduced a new backend option for Kubernetes in the Backends enum. - Updated segmentation algorithms to accept optional keyword arguments for Kubernetes backend, enhancing flexibility for users.
…thm execution - Added a new section outlining how to configure and use the Kubernetes backend for running algorithms. - Provided examples for setting the KUBECONFIG environment variable and specifying backend options in the inference method.
- Integrated Kubernetes backend option into the BraTSAlgorithm class. - Updated methods to accept optional keyword arguments for Kubernetes, allowing for enhanced configuration during inference. - Added error handling to ensure Kubernetes kwargs are only used with the Kubernetes backend.
- Updated pyproject.toml to include the Kubernetes package with a minimum version of 34.1.0, enabling support for Kubernetes features in the project.
…Kubernetes configuration - Changed the parameter name from `mount_path` to `data_mount_path` in the documentation to accurately describe its function in the `infer_single` method for Kubernetes backend usage.
- Implemented functions for creating and managing Kubernetes jobs, including PVC creation, job execution, and output handling. - Added methods for downloading additional files from Zenodo and checking file presence in pods. - Enhanced logging for better traceability during job execution and output verification. - Integrated command execution within pods to facilitate file uploads and downloads, ensuring smooth operation of the Kubernetes backend for algorithm inference.
…_from_pod function in kubernetes.py
…rnetes.py`. Tests cover command argument building, file handling, job creation, and PVC management for different algorithm configurations.
…`local_base_dir` parameter
…hanced Kubernetes backend configuration
/format

🤖 I will now format your code with black.
Pull Request Overview
This PR adds Kubernetes backend support to the BraTS orchestrator, enabling remote algorithm execution via Kubernetes Jobs as an alternative to local Docker/Singularity containers. Key changes include:
- New Kubernetes backend implementation with job orchestration, file transfer, and PVC management
- Integration of Kubernetes backend into the existing algorithm inference pipeline
- Support for configurable Kubernetes resources via the `kubernetes_kwargs` parameter
Reviewed Changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 14 comments.
| File | Description |
|---|---|
| brats/constants.py | Added KUBERNETES enum value to Backends |
| brats/core/kubernetes.py | New module implementing Kubernetes job execution with PVC management, file transfers, and pod lifecycle management |
| brats/core/brats_algorithm.py | Updated _infer_single and _infer_batch to support kubernetes_kwargs parameter and dispatch to Kubernetes backend |
| brats/core/segmentation_algorithms.py | Added kubernetes_kwargs parameter to infer_single and infer_batch methods for both Adult and Pediatric classes |
| tests/core/test_kubernetes.py | Comprehensive test suite for Kubernetes backend functionality |
| README.md | Added documentation for Kubernetes backend usage with configuration examples |
…new packages: `cachetools`, `durationpy`, `google-auth`, `kubernetes`, `oauthlib`, `pyasn1`, `pyasn1-modules`, `requests-oauthlib`, `rsa`, and `websocket-client`. Adjust version constraints and add optional dependencies for improved functionality.
…Lesion/BraTS into 113-kubernetes-integration
… path handling based on algorithm year, and streamline command execution logging.
Pull Request Overview
Copilot reviewed 7 out of 8 changed files in this pull request and generated 7 comments.
… and modify type hints in `kubernetes.py` for clarity. Enhance logging in `_download_folder_from_pod` and add TODO comments for future security context implementation. Remove commented-out code in `run_job` and adjust test cases in `test_kubernetes.py` to reflect changes.
Currently waiting for @SimoneBendazzoli93. Also, we have a student (Mehrnaz) who will look into this.
… with customizable timeout and polling intervals
…rable resource retention
…py and kubernetes.py
…ython compatibility adjustments
…nd update test to reflect command output change
Hi @neuronflow and @MarcelRosier 😄 I have pushed some fixes as suggested in the review.

welcome back @SimoneBendazzoli93 :) please address the above comments.
…s in pyproject.toml
…ditional block in brats_algorithm.py
…, enhancing error handling, and ensuring safe extraction of tar files
…ror handling for job completion failures
…ependencies and specify Kubernetes version constraint
    core_v1_api = client.CoreV1Api()
    logger.info(f"Waiting for Pod '{pod_name}' to be running...")
    poll_attempts = int(timeout_seconds / poll_interval)
    for _ in range(poll_attempts):
        pod = core_v1_api.read_namespaced_pod(name=pod_name, namespace=namespace)
        pod_phase = pod.status.phase
        if pod.status.init_container_statuses:
            exit_loop = False
            for init_status in pod.status.init_container_statuses:
                state = init_status.state
                if state and state.running:
                    logger.info(
                        f"Pod '{pod_name}' initContainer '{init_status.name}' is running."
                    )
                    exit_loop = True
                    break
            if exit_loop:
                break
            if _check_pod_terminal_or_running(pod_phase, pod_name):
                break
        elif _check_pod_terminal_or_running(pod_phase, pod_name):
            break
        time.sleep(poll_interval)
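
The loop above derives a fixed number of attempts from `timeout_seconds / poll_interval`, so time spent inside each API call is not counted against the timeout. One possible alternative, sketched here as a suggestion only, is to poll against a monotonic deadline. `wait_until` and its `predicate`, `sleep`, and `clock` parameters are hypothetical names, not part of this PR; the defaults mirror the documented `timeout_seconds=600` and `poll_interval=2.0`.

```python
import time
from typing import Callable


def wait_until(
    predicate: Callable[[], bool],
    timeout_seconds: float = 600.0,
    poll_interval: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll `predicate` until it returns True or the deadline passes.

    Returns True if the predicate succeeded, False on timeout.
    `predicate` is a hypothetical stand-in for "the pod is running or terminal".
    """
    deadline = clock() + timeout_seconds
    while True:
        if predicate():
            return True
        remaining = deadline - clock()
        if remaining <= 0:
            return False
        # Never sleep past the deadline.
        sleep(min(poll_interval, remaining))
```

In the snippet under review, the predicate would wrap the `read_namespaced_pod` call and the phase check; that wiring is left out here because it depends on helpers not shown in this hunk.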
    base_folder_name = Path(folder_name).name
    tarfile_path = local_base_dir / f"{base_folder_name}"
    with open(tarfile_path, "wb") as tarfile_obj:
        tarfile_obj.write(tar_data)
@SimoneBendazzoli93 consider using Python's `tempfile` module here; it also takes care of cleaning up the temporary file automatically.
https://docs.python.org/3/library/tempfile.html
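
A minimal sketch of what that could look like, assuming `tar_data` is the raw bytes of a tar archive as in the snippet above. `extract_tar_bytes` is a hypothetical helper name, not a function from this PR. The path check only guards against members that resolve outside the destination; it is not a complete defence (for example, it does not inspect symlink targets).

```python
import tarfile
import tempfile
from pathlib import Path
from typing import List


def extract_tar_bytes(tar_data: bytes, dest_dir: Path) -> List[str]:
    """Spool tar bytes to a temporary file and extract them under dest_dir.

    The temporary file is removed automatically when the `with` block exits.
    Returns the member names found in the archive.
    """
    dest_dir = dest_dir.resolve()
    dest_dir.mkdir(parents=True, exist_ok=True)
    with tempfile.TemporaryFile(suffix=".tar") as tmp:
        tmp.write(tar_data)
        tmp.seek(0)
        with tarfile.open(fileobj=tmp, mode="r:*") as archive:
            for member in archive.getmembers():
                target = (dest_dir / member.name).resolve()
                # Path.is_relative_to requires Python 3.9+.
                if not target.is_relative_to(dest_dir):
                    raise ValueError(f"Unsafe path in archive: {member.name}")
            archive.extractall(dest_dir)
            return archive.getnames()
```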
    unknown_kubernetes_kwargs = set(kubernetes_kwargs) - VALID_KUBERNETES_KWARGS
    if unknown_kubernetes_kwargs:
        raise ValueError(
            f"Unknown kubernetes_kwargs keys: {unknown_kubernetes_kwargs}"
        )
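
For context, a self-contained sketch of how this check could sit next to the "kwargs only with the Kubernetes backend" rule mentioned in the commit list. The contents of `VALID_KUBERNETES_KWARGS` are not visible in this hunk: only `data_mount_path` is named in the PR, and the other keys below are placeholders guessed from the README's list of customizable settings. `validate_kubernetes_kwargs` and the string backend value are likewise hypothetical. Sorting the unknown keys is a small suggestion to keep the error message deterministic.

```python
from typing import Any, Dict, Optional

# Hypothetical key set; only `data_mount_path` is confirmed by the PR text.
VALID_KUBERNETES_KWARGS = frozenset(
    {
        "namespace",
        "pvc_name",
        "pvc_storage_size",
        "pvc_storage_class",
        "job_name",
        "data_mount_path",
    }
)


def validate_kubernetes_kwargs(
    backend: str, kubernetes_kwargs: Optional[Dict[str, Any]]
) -> Dict[str, Any]:
    """Return a validated copy of kubernetes_kwargs, or raise ValueError."""
    if kubernetes_kwargs is None:
        return {}
    if backend != "kubernetes":
        raise ValueError(
            "kubernetes_kwargs can only be used with the Kubernetes backend."
        )
    unknown = set(kubernetes_kwargs) - VALID_KUBERNETES_KWARGS
    if unknown:
        # sorted() gives a stable message regardless of set ordering.
        raise ValueError(f"Unknown kubernetes_kwargs keys: {sorted(unknown)}")
    return dict(kubernetes_kwargs)
```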
## Kubernetes Support

BraTS orchestrator also supports Kubernetes to run the algorithms remotely, as an alternative to local execution with Docker or Singularity.

To use Kubernetes for execution, the orchestrator will automatically use your kubeconfig file from the default location (`~/.kube/config`). If your kubeconfig file is not in the default location, set the `KUBECONFIG` environment variable to the location of your kubeconfig file:

```bash
export KUBECONFIG=/path/to/kubeconfig
```
By default, as shown above, the algorithm runs in the default Kubernetes namespace. It uses the default StorageClass and automatically creates a 1Gi PersistentVolumeClaim (PVC) to manage input and output data. If needed, you can customize settings such as the namespace, PVC name, storage size, storage class, job name, and mount path by providing related keyword arguments to the `infer_single` method. The `data_mount_path` parameter determines where the PVC will be mounted inside the Pod.

When using Kubernetes, the algorithm is executed inside a Kubernetes Job. Input data is first uploaded to a PersistentVolume, which is mounted into the Pod running the job. After the algorithm finishes running in the Pod, the output data is transferred back from the cluster to your local machine.
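
To make the Job and PVC relationship described above concrete, here is a rough sketch of the two manifests as plain dictionaries. This is not the code in `brats/core/kubernetes.py`; the function names, the `/data` default, the `algorithm` container name, and the `ReadWriteOnce` access mode are assumptions for illustration. Only the `default` namespace, the 1Gi size, and `data_mount_path` come from the text above. The upload and download steps are not shown.

```python
from typing import Any, Dict, List, Optional


def build_pvc_manifest(
    name: str,
    namespace: str = "default",
    storage_size: str = "1Gi",
    storage_class: Optional[str] = None,
) -> Dict[str, Any]:
    """PVC manifest; omitting storageClassName uses the cluster default."""
    spec: Dict[str, Any] = {
        "accessModes": ["ReadWriteOnce"],  # assumed access mode
        "resources": {"requests": {"storage": storage_size}},
    }
    if storage_class is not None:
        spec["storageClassName"] = storage_class
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name, "namespace": namespace},
        "spec": spec,
    }


def build_job_manifest(
    job_name: str,
    image: str,
    command: List[str],
    pvc_name: str,
    data_mount_path: str = "/data",  # assumed default
    namespace: str = "default",
) -> Dict[str, Any]:
    """Job manifest whose single container mounts the PVC at data_mount_path."""
    container = {
        "name": "algorithm",
        "image": image,
        "command": command,
        "volumeMounts": [{"name": "data", "mountPath": data_mount_path}],
    }
    pod_spec = {
        "restartPolicy": "Never",
        "containers": [container],
        "volumes": [
            {"name": "data", "persistentVolumeClaim": {"claimName": pvc_name}}
        ],
    }
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": job_name, "namespace": namespace},
        "spec": {"backoffLimit": 0, "template": {"spec": pod_spec}},
    }
```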
    timeout_seconds (int): The timeout in seconds. Defaults to 600.
    poll_interval (float): The poll interval in seconds. Defaults to 2.0.
    keep_resources (bool): When False (default), delete created jobs and PVCs after the run completes or fails.
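
One way the `keep_resources` behaviour ("delete ... after the run completes or fails") could be structured is a context manager that runs cleanup in a `finally` block. This is a sketch under that assumption, not the PR's implementation; `cleanup_on_exit` and `cleanup_steps` are hypothetical names, and the callables stand in for the actual Job and PVC deletion calls.

```python
import logging
from contextlib import contextmanager
from typing import Callable, Iterator, List

logger = logging.getLogger(__name__)


@contextmanager
def cleanup_on_exit(
    cleanup_steps: List[Callable[[], None]], keep_resources: bool = False
) -> Iterator[None]:
    """Run cleanup steps in reverse order on exit, whether or not the body raised."""
    try:
        yield
    finally:
        if keep_resources:
            logger.info("keep_resources=True, leaving created resources in place.")
        else:
            for step in reversed(cleanup_steps):
                try:
                    step()
                except Exception:
                    # A failed deletion should not hide the original error
                    # or prevent the remaining steps from running.
                    logger.exception("Cleanup step failed; continuing.")
```

Registering the PVC deletion first and the Job deletion second means the Job is removed before the PVC it mounts.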
    [project.optional-dependencies]
    preprocessing = ["brainles_preprocessing>=0.6.7; python_version >= '3.10'"]
    kubernetes = ["kubernetes>=34.1.0,<35.0.0"]
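
Since `kubernetes` is an optional extra here, code that uses it may want to fail with an actionable message when the extra is not installed. A small sketch of such a guard follows; `require_optional` is a hypothetical helper, and the `brats` distribution name in the install hint is an assumption.

```python
import importlib.util


def require_optional(module_name: str, extra: str, package: str = "brats") -> None:
    """Raise ImportError with an install hint if a top-level module is missing."""
    if importlib.util.find_spec(module_name) is None:
        raise ImportError(
            f"The '{module_name}' package is required for this feature. "
            f'Install it with: pip install "{package}[{extra}]"'
        )
```

A Kubernetes code path could call `require_optional("kubernetes", "kubernetes")` before importing the client.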
fixes #113