Job memory usage calculation -- (PSS calculation issue for new kernel) #11687

@z4027163

A large number of Run 3 ReReco workflows are failing at sites that run newer kernel versions. Details: Failures in Run 3 data reprocessing

The detailed reason is in this ticket as well. In short, Linux v6.0+ adds a new field, Pss_Dirty, to smaps. Its value ends up being added into the PSS total, so PSS is overestimated and jobs get killed earlier for exceeding the PSS threshold.
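To illustrate the effect, here is a minimal sketch. It assumes the monitoring sums every smaps line that starts with `Pss`, which is a guess about the current behaviour, not something confirmed here. The function name `sum_smaps_fields` and the sample values are hypothetical and not taken from the existing code. Matching the exact field name `Pss:` keeps `Pss_Dirty:` out of the total.

```python
import re

# Match only the exact "Pss:" and "Rss:" fields; "Pss_Dirty:" does not match.
_FIELD_RE = re.compile(r"^(Pss|Rss):\s+(\d+)\s+kB$", re.MULTILINE)


def sum_smaps_fields(smaps_text):
    """Return total PSS and RSS in kB from the text of /proc/<pid>/smaps."""
    totals = {"Pss": 0, "Rss": 0}
    for name, value in _FIELD_RE.findall(smaps_text):
        totals[name] += int(value)
    return totals


# Made-up excerpt in the Linux v6.0+ smaps layout (two mappings).
SAMPLE = """\
Rss:                 100 kB
Pss:                  60 kB
Pss_Dirty:            40 kB
Rss:                  20 kB
Pss:                  10 kB
Pss_Dirty:             0 kB
"""

# Prefix match (assumed current behaviour): also counts Pss_Dirty.
naive_pss = sum(
    int(line.split()[1])
    for line in SAMPLE.splitlines()
    if line.startswith("Pss")
)

print(naive_pss)                 # 110
print(sum_smaps_fields(SAMPLE))  # {'Pss': 70, 'Rss': 120}
```

On a real node the input would be the contents of `/proc/<pid>/smaps` for the job process. On kernels older than v6.0 both approaches give the same PSS, since the Pss_Dirty field is absent.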

We would like to find a solution, for example using RSS instead of PSS as the metric for killing jobs on memory usage. Otherwise, sites with new kernels will keep overestimating memory usage and ending jobs early.
