For MVS Systems, move to malloc instead of MMAP for Constant Files #3400
JacobEngelbrecht wants to merge 8 commits into onnx:main
|
Can one of the admins verify this patch? |
|
The code logic of mmap/munmap is such that when there are multiple threads calling […] In the proposed malloc code logic, however, all threads will malloc and then read in the entire file. They then compete to update the shared […] Also, sticking some malloc/read/free code in a function called […] Lastly, our Jenkins CI doesn't have a z/OS test machine, so the code cannot be checked there. It will have to be tested somewhere else before it can be merged. |
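To make the concern about threads competing to update shared state concrete, here is a minimal sketch in C11. It assumes the shared state is a single global pointer; `g_constants`, `get_or_load_constants`, and `example_loader` are hypothetical names for illustration and are not taken from the PR. Each thread loads its own copy, one thread publishes its buffer with a compare-and-swap, and the others free theirs.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical shared slot for the loaded constants; NULL until published. */
static _Atomic(void *) g_constants = NULL;

/* Hypothetical loader standing in for "malloc and read the entire file". */
static void *example_loader(void) {
  return malloc(16);
}

/* Returns the published buffer, loading it first if no thread has yet. */
static void *get_or_load_constants(void *(*loader)(void)) {
  void *current = atomic_load(&g_constants);
  if (current != NULL)
    return current; /* Another thread already published a buffer. */

  void *mine = loader();
  if (mine == NULL)
    return NULL;

  void *expected = NULL;
  if (atomic_compare_exchange_strong(&g_constants, &expected, mine))
    return mine; /* This thread won the race and owns the published copy. */

  /* Lost the race: drop the private copy and use the winner's buffer. */
  free(mine);
  return expected;
}
```

This avoids leaking the losing buffers, but every racing thread still allocates and reads the whole file once, which is the cost raised in the comment above. Serializing the load behind a lock would avoid the redundant reads.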
Large file MMAP support on MVS systems can require system-level alterations for memory allocations, sometimes making it easier to instead allocate memory in your address space and simply page in information at load time.
Signed-off-by: Jacob Engelbrecht <jacob@engelbrecht.works>
48d843b to d70f9bb
|
Thank you for the feedback. We are working on verifying this, and the code will be cleaned up and improved before the PR is officially opened. The requested changes will be implemented. |
|
Still working on this, but the approach has been verified. |
|
@JacobEngelbrecht another thing to consider: who will free the allocated buffer? Since there might be multiple threads calling the same .so file, we don't know when all the threads finish. So we should provide a function like […] |
@tungld I haven't looked at the compiler code for a while, so my understanding might be wrong. My understanding is that […] |
Yes, it is called from the entry point function inside the .so, so we can free the buffer at the end of the entry point function.
|
Support large constant files (>1GB) on z/OS via malloc fallback
Problem
z/OS has a 1GB limit (in reality around 1.2GB) on mmap() operations, which causes failures when loading large constant weight files for neural network models. System-level configuration changes would be required to increase this limit, which is not always feasible in production environments.
Solution
Implemented a system for z/OS that uses malloc() + chunked read() instead of mmap() for constant files larger than 1GB:
- […] read() behavior where large reads may return fewer bytes than requested
Technical Details
- […] mmap() with __MAP_MEGA flag on z/OS
- […] malloc() + read() with proper cleanup on errors on z/OS
- […] mmap() for all file sizes
- […] omMallocAndReadFile() to encapsulate the malloc + read logic
Testing
Tested with GPT-2XL model (6.5GB constant file) on z/OS - successfully loads and runs inference without system configuration changes. Tested with BERT uncased for <1GB constant file testing.
Notes
- […] omMMapBinaryFile to avoid additional file changes and testing changes.
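The malloc() + chunked read() approach described under Solution can be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's actual omMallocAndReadFile(): the function name `malloc_and_read_file_sketch`, its signature, and the 256MB chunk size are all hypothetical. What it shows is the loop needed because read() may return fewer bytes than requested, plus cleanup on every error path.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical upper bound on a single read() request (an assumption). */
#define READ_CHUNK_BYTES ((size_t)256 * 1024 * 1024)

/* Reads the whole file at `path` into a malloc'd buffer.
 * On success returns the buffer and stores its size in *out_size;
 * on failure returns NULL. The caller must free() the buffer. */
static void *malloc_and_read_file_sketch(const char *path, size_t *out_size) {
  int fd = open(path, O_RDONLY);
  if (fd < 0)
    return NULL;

  struct stat st;
  if (fstat(fd, &st) != 0 || st.st_size < 0) {
    close(fd);
    return NULL;
  }
  size_t size = (size_t)st.st_size;

  /* malloc(0) may legally return NULL, so request at least one byte. */
  char *buf = malloc(size > 0 ? size : 1);
  if (buf == NULL) {
    close(fd);
    return NULL;
  }

  size_t done = 0;
  while (done < size) {
    size_t want = size - done;
    if (want > READ_CHUNK_BYTES)
      want = READ_CHUNK_BYTES;

    /* A short read is normal: advance by however many bytes arrived. */
    ssize_t got = read(fd, buf + done, want);
    if (got < 0 && errno == EINTR)
      continue; /* Interrupted by a signal: retry the same request. */
    if (got <= 0) {
      /* Read error, or end of file before the expected size was reached. */
      free(buf);
      close(fd);
      return NULL;
    }
    done += (size_t)got;
  }

  close(fd);
  *out_size = size;
  return buf;
}
```

Unlike a mapping, the buffer returned here is owned by the caller, so some code path has to call free() on it once no thread needs the constants any more.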