Description
We have a feature where we use wasm-opt to forcibly remove some functions in the .wasm file, to optimize down code size.
In our unit tests, we have tests that remove one or more functions, and assert that the size of those files is reduced.
Now after updating from Emscripten 3.1.38 to 4.0.19, we find these assertions failing. Investigating, it looks like the mere act of re-saving a .wasm file with wasm-opt results in the size of the .wasm file growing.
This is something that didn't happen with Emscripten 3.1.38.
1. We have an emcc -Os --profiling-funcs -o foo.wasm compiled file -> foo.wasm.zip
2. Then we run wasm-opt foo.wasm --all-features -g -o out.wasm on it. -> out.wasm.zip
3. We observe that out.wasm has grown in size compared to the original foo.wasm. A diff can be seen here -> https://oummg.com/dump/diff_foo_out.html (be patient opening the URL; this is a very large .html file and is slow to load on a slow network connection)
In our actual usage, in step 2, we also add --strip-function arg, but for purposes of reproing this behavior, this can be skipped.
foo.wasm: 41,439,952 bytes
out.wasm: 41,566,424 bytes
out.wasm has grown by +126,472 bytes.
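To narrow down where the +126,472 bytes land, a per-section size comparison can help. The following is a minimal sketch, not part of any existing tool: it walks the top-level section headers of a WebAssembly binary (one id byte, a LEB128 payload size, then the payload) and reports which sections differ in size between two files. The helper names section_sizes and size_deltas are made up for this illustration.

```python
import sys

# Standard section ids from the WebAssembly binary format.
SECTION_NAMES = {
    0: "custom", 1: "type", 2: "import", 3: "function", 4: "table",
    5: "memory", 6: "global", 7: "export", 8: "start", 9: "element",
    10: "code", 11: "data", 12: "datacount", 13: "tag",
}

def read_uleb128(data, pos):
    """Decode an unsigned LEB128 integer at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if byte & 0x80 == 0:
            return result, pos
        shift += 7

def section_sizes(data):
    """Map section label -> payload size in bytes (headers not counted)."""
    if data[:4] != b"\0asm":
        raise ValueError("not a wasm binary")
    sizes = {}
    pos = 8  # skip the 4-byte magic and 4-byte version
    while pos < len(data):
        section_id = data[pos]
        size, payload_start = read_uleb128(data, pos + 1)
        label = SECTION_NAMES.get(section_id, f"unknown({section_id})")
        if section_id == 0:
            # Custom sections start with their own name, e.g. "name".
            name_len, name_start = read_uleb128(data, payload_start)
            name = data[name_start:name_start + name_len]
            label = "custom:" + name.decode("utf-8", "replace")
        sizes[label] = sizes.get(label, 0) + size
        pos = payload_start + size
    return sizes

def size_deltas(before, after):
    """Return {label: after - before} for every section whose size changed."""
    a, b = section_sizes(before), section_sizes(after)
    labels = sorted(set(a) | set(b))
    return {k: b.get(k, 0) - a.get(k, 0) for k in labels if a.get(k, 0) != b.get(k, 0)}

def main(argv):
    with open(argv[1], "rb") as f:
        before = f.read()
    with open(argv[2], "rb") as f:
        after = f.read()
    for label, delta in size_deltas(before, after).items():
        print(f"{label}: {delta:+,} bytes")

if __name__ == "__main__" and len(sys.argv) == 3:
    main(sys.argv)
```

Run as python section_sizes.py foo.wasm out.wasm (the script name is arbitrary). If most of the delta shows up under "code" rather than "custom:name", the growth comes from the function bodies themselves rather than from debug names.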
This growth on re-serializing the file is a little bit tricky, since it makes it more difficult to unit test the behavior of stripping away functions.
At first we thought that we would just re-serialize the file once before setting the baseline, and only then run wasm-opt a second time to strip the functions we are interested in. However, I notice that wasm-opt in.wasm --all-features -g -o out.wasm is not stable under iteration. E.g. when I repeatedly re-save the file, the size slowly creeps up for a bit:
though it does seem to stabilize after a while.
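One way to set up a test baseline under these conditions is to re-save until the output stops changing. The sketch below is only an illustration of that idea: resave_until_stable and wasm_opt_resave are hypothetical helper names, the wasm-opt command line is the one quoted above, and the max_iters cap is an arbitrary assumption.

```python
import filecmp
import os
import shutil
import subprocess
import tempfile

def wasm_opt_resave(src, dst):
    """Re-serialize src into dst using the command line from this report."""
    subprocess.run(["wasm-opt", src, "--all-features", "-g", "-o", dst], check=True)

def resave_until_stable(path, resave=wasm_opt_resave, max_iters=20):
    """Re-save `path` repeatedly and return the list of observed file sizes.

    Stops once a re-save produces output byte-identical to its input,
    or after max_iters re-saves, whichever comes first.
    """
    sizes = [os.path.getsize(path)]
    with tempfile.TemporaryDirectory() as tmp:
        current = os.path.join(tmp, "iter0.wasm")
        shutil.copyfile(path, current)
        for i in range(1, max_iters + 1):
            nxt = os.path.join(tmp, f"iter{i}.wasm")
            resave(current, nxt)
            sizes.append(os.path.getsize(nxt))
            # Compare contents, not just sizes: equal size need not mean equal bytes.
            stable = filecmp.cmp(current, nxt, shallow=False)
            os.remove(current)
            current = nxt
            if stable:
                break
    return sizes
```

Calling resave_until_stable("foo.wasm") returns the size after each re-save, so the last two entries being equal indicates a fixed point was reached within the cap.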
Well, that is only a bit of a sidenote.
Coming back to the initial foo.wasm -> out.wasm code size increase, the diff page shows that the majority of the increase is located in two functions:
ParticleSystem::UpdateModulesPreSimulationIncremental(ParticleSystemUpdateData const&, ParticleSystemParticles&, unsigned long, unsigned long, float vector[4] const&, bool)
which grows by +16,535 bytes, and
ParticleSystemGeometryJob::RenderJobCommon(ParticleSystemGeometryJob&, ParticleSystemParticlesTempData&, void*, void*)
which grows by +3,328 bytes.
Other functions grow by smaller amounts:
When I disassemble the before/after .wasm files, curiously the WAST text of ParticleSystem::UpdateModulesPreSimulationIncremental is about 400 lines shorter after the re-serialization.
I.e. the function is about 400 lines shorter in WAST, but +16,535 bytes larger in binary.
One thing I can see is that there are named locals $scratch_xxx that appear in the re-serialized file, that are not present in the original:
I suppose using -g makes wasm-opt generate those.
However, there are only 172 scratch_* locals, and the string "scratch_" is only 8 characters, so that would account for only about +1,376 bytes (172 × 8) of added debug names? That still leaves most of the +16,535 bytes unexplained.
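The 172 × 8 figure ignores the numeric suffix and the per-entry encoding overhead, so a slightly more generous estimate is sketched here. It assumes each local name is stored in the name section as a LEB128 local index, a LEB128 name length, and the UTF-8 name bytes. The names scratch_0 to scratch_171 and the local indices starting at 1000 are made up for illustration, not taken from the actual file.

```python
def leb128_len(n):
    """Number of bytes needed to encode n as an unsigned LEB128 integer."""
    length = 1
    while n >= 0x80:
        n >>= 7
        length += 1
    return length

def local_name_bytes(names_by_index):
    """Bytes used by (local index, name) entries in a local-name map."""
    total = 0
    for index, name in names_by_index.items():
        encoded = name.encode("utf-8")
        total += leb128_len(index) + leb128_len(len(encoded)) + len(encoded)
    return total

# Hypothetical names and indices: 172 scratch locals at indices 1000..1171.
names = {1000 + i: f"scratch_{i}" for i in range(172)}
estimate = local_name_bytes(names)

print("prefix-only estimate:", 172 * len("scratch_"))  # 1376
print("with suffix and overhead:", estimate)           # 2298
print("still unexplained:", 16535 - estimate)          # 14237
```

Even under these assumptions the names come to roughly 2.3 KB, well short of +16,535 bytes, which is consistent with the suspicion that the names alone do not explain the growth.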
Apart from that, the disassembled functions are quite similar. In many places they differ only in the indexing of local variables. In some places the organization of the implementation differs. Still, the re-serialized WAST is 400 lines shorter.
Here are the baseline and re-saved .wast dumps of the single function with the largest increase:
I wonder if there might be any clues here as to why re-saving a .wasm file could grow in size like this? This is behavior that we did not experience with Emscripten 3.1.38, which was a very convenient property for setting up our unit tests for code stripping.