Cover Inference Scalability #2

@DavidSolan0


Before we deploy the model to production, we need to make sure its inference scales. This issue addresses two problems that can hinder inference scalability: computational complexity and memory management. We propose tackling both by migrating the data preparation process from pandas to Spark, with the aim of saving time and computational resources.

Computational Complexity:

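  • pandas runs data preparation in a single process on one machine, so runtime grows with the size of the input. Spark splits the data into partitions and processes them in parallel across executors, which lets the same preparation logic handle larger workloads more efficiently.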
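
To make the migration concrete, here is a minimal sketch of what a data preparation step could look like once it is expressed with the PySpark DataFrame API. The column names (`customer_id`, `amount`), the derived features, and the function names are hypothetical, since the issue does not describe the real schema or transformations. Spark plans the `groupBy` and aggregation per partition and runs them in parallel.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:  # only needed for type hints
    from pyspark.sql import DataFrame

# Hypothetical column names; the issue does not describe the real schema.
ID_COL = "customer_id"
AMOUNT_COL = "amount"


def prepare_features(raw_df: "DataFrame") -> "DataFrame":
    """Build one feature row per entity using Spark DataFrame operations."""
    # Imported lazily so this module still loads where PySpark is not installed.
    from pyspark.sql import functions as F

    return (
        raw_df
        # Drop rows with missing keys or values (pandas: df.dropna(subset=...)).
        .dropna(subset=[ID_COL, AMOUNT_COL])
        # Column-wise transform (pandas: np.log1p(df["amount"])).
        .withColumn("log_amount", F.log1p(F.col(AMOUNT_COL)))
        # Distributed group-by aggregation (pandas: df.groupby(...).agg(...)).
        .groupBy(ID_COL)
        .agg(
            F.count("*").alias("n_rows"),
            F.avg("log_amount").alias("mean_log_amount"),
        )
    )


def demo() -> None:
    """Run the preparation step on a tiny in-memory dataset (requires PySpark)."""
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("inference-data-prep").getOrCreate()
    raw_df = spark.createDataFrame(
        [(1, 10.0), (1, 20.0), (2, 5.0), (2, None)],
        [ID_COL, AMOUNT_COL],
    )
    prepare_features(raw_df).show()
    spark.stop()
```

Calling `demo()` on a machine with PySpark installed prints the aggregated features. The transformation is lazy: nothing is computed until an action such as `show()` or a write is triggered.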

Memory Management:

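  • pandas generally keeps the working dataset in the memory of a single machine, which can lead to out-of-memory failures as the data grows. Spark processes data partition by partition, can cache reused DataFrames in memory and spill to disk when memory runs short, and works well with columnar storage formats such as Parquet. Together these can help mitigate memory overflow issues.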
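
The memory-related points above can be sketched as follows. This is an illustration under assumptions, not the project's actual pipeline: the function names, paths, and column list are hypothetical. It reads Parquet, selects only the columns the model needs so Spark can skip the rest on disk, and persists the result with `StorageLevel.MEMORY_AND_DISK` so partitions that do not fit in memory spill to disk instead of failing.

```python
from typing import TYPE_CHECKING, Sequence

if TYPE_CHECKING:  # only needed for type hints
    from pyspark.sql import DataFrame, SparkSession


def load_and_cache(
    spark: "SparkSession", path: str, columns: Sequence[str]
) -> "DataFrame":
    """Read a Parquet dataset, keep only the needed columns, and cache it."""
    # Imported lazily so this module still loads where PySpark is not installed.
    from pyspark import StorageLevel

    # Parquet is columnar, so selecting columns limits what is read from disk.
    df = spark.read.parquet(path).select(*columns)
    # MEMORY_AND_DISK keeps partitions in memory when possible and spills
    # the remainder to disk rather than raising an out-of-memory error.
    return df.persist(StorageLevel.MEMORY_AND_DISK)


def write_prepared(df: "DataFrame", out_path: str) -> None:
    """Write prepared data as Parquet, then release the cached partitions."""
    df.write.mode("overwrite").parquet(out_path)
    # Free executor memory once the cached data is no longer needed.
    df.unpersist()
```

Caching only pays off when the same DataFrame is reused by several downstream steps. For a single pass over the data, skipping `persist` avoids the extra memory pressure.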
