This repository documents an investigation into MLOps and MLOps tools and platforms, conducted as part of SURF's MultiCloud Capabilities project. The goal of the project was to create an inventory of essential MLOps components and to gain hands-on experience with an MLOps platform, in our case Kubeflow.
MLOps is a set of practices for collaboration and communication between data scientists and operations teams. It follows the DevOps paradigm, which provides tools and practices for bridging the gap between development and operations.
An end-to-end MLOps platform should provide the following functionality:
- Data management and preprocessing
- Experimentation and model development
- Model deployment and serving
- Model monitoring and performance tracking
- Collaboration and version control
- Automated pipelining and workflow orchestration
- Model governance and compliance
- Integration with ML tools and libraries
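To make a few of these components concrete, the following is a minimal sketch in plain Python of how preprocessing, training, evaluation and a deployment gate can be chained into an automated pipeline. It is illustrative only and does not use the Kubeflow API: the names `Step`, `run_pipeline`, `min_accuracy` and the toy threshold "model" are all hypothetical and chosen for this example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Step:
    """One pipeline stage (hypothetical helper, not a Kubeflow type)."""
    name: str
    run: Callable[[Dict], Dict]  # reads the shared context, returns updates


def preprocess(ctx: Dict) -> Dict:
    # Data management and preprocessing: scale raw values to [0, 1].
    raw = ctx["raw"]
    lo, hi = min(raw), max(raw)
    return {"features": [(x - lo) / (hi - lo) for x in raw]}


def train(ctx: Dict) -> Dict:
    # Model development: a toy "model" that is just a mean threshold.
    feats = ctx["features"]
    return {"model": {"threshold": sum(feats) / len(feats)}}


def evaluate(ctx: Dict) -> Dict:
    # Performance tracking: accuracy of the threshold model on the labels.
    threshold = ctx["model"]["threshold"]
    preds = [int(x > threshold) for x in ctx["features"]]
    correct = sum(p == y for p, y in zip(preds, ctx["labels"]))
    return {"accuracy": correct / len(preds)}


def deploy(ctx: Dict) -> Dict:
    # Deployment gate: only "deploy" if the model meets the assumed minimum.
    return {"deployed": ctx["accuracy"] >= ctx["min_accuracy"]}


def run_pipeline(steps: List[Step], ctx: Dict) -> Dict:
    # Workflow orchestration: run steps in order and record what ran.
    ctx = dict(ctx)
    for step in steps:
        ctx.update(step.run(ctx))
        ctx.setdefault("log", []).append(step.name)
    return ctx


PIPELINE = [
    Step("preprocess", preprocess),
    Step("train", train),
    Step("evaluate", evaluate),
    Step("deploy", deploy),
]

result = run_pipeline(
    PIPELINE,
    {"raw": [1.0, 2.0, 8.0, 9.0], "labels": [0, 0, 1, 1], "min_accuracy": 0.9},
)
print(result["accuracy"], result["deployed"], result["log"])
```

On a real platform each step would typically run as a separate, containerized task with versioned inputs and outputs; the sketch only shows the ordering and hand-off between stages.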
Various MLOps tools and platforms are available, including Kubeflow, MLflow, and ClearML. Each has its own strengths and weaknesses, and the choice depends on the specific needs of the project.
MLOps is a critical component for professionalizing and operationalizing ML practices. An MLOps platform should provide a range of functionality to support the entire machine learning lifecycle. The required functionality depends on the MLOps maturity level being targeted, which can differ between a commercial context and a research context.