Problem
The current --shrink flag uses a heuristic approach — it removes known-unnecessary files by pattern (.git, LICENSE, SCM metadata, etc.). This is safe but leaves significant optimization potential on the table. Many resources inside uberjars are never accessed at runtime but cannot be identified by simple pattern matching.
Inspiration
GraalVM's native-image performs reachability analysis on resources — only resources that are actually referenced by reachable code are included in the final binary. Their build reports show that erroneous resource inclusion (e.g., overly broad regex patterns in resource configuration) is one of the most common causes of bloated binaries.
Their size optimization guide demonstrates that removing unreachable resources and code can reduce binary size by 40%+ in some cases.
Expected Outcome
An enhanced --shrink that goes beyond pattern matching to perform actual analysis of resource usage:
Analysis Layers
- Class reference analysis: scan bytecode for getResource(), getResourceAsStream(), and similar calls to identify which resources are actually loaded
- Namespace dependency analysis: for Clojure AOT classes, trace namespace require chains to identify orphan namespaces (dev/test utilities bundled in production)
- Duplicate detection: find identical files included multiple times from different dependencies (common with LICENSE, NOTICE, META-INF/services)
- Native library analysis: detect platform-specific native libs for other platforms (e.g., Linux binary bundling .dll files from a Windows dependency)
- Dev-dependency detection: identify resources from dependencies that are typically dev-only (test frameworks, REPL utilities, documentation generators)
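As a rough illustration of the class reference layer, the sketch below reads the constant pool of every .class entry in a jar and treats any resource whose path never appears as a string literal as a removal candidate. This is a coarse stand-in for real bytecode analysis, not how jbundle works today: it does not check that a literal actually flows into getResource(), it misses paths built at runtime, and the function names are hypothetical. META-INF/ is excluded on the assumption that the JVM and ServiceLoader read it without any literal in user code.

```python
import struct
import zipfile

# Payload size in bytes (after the tag byte) of each fixed-size
# constant pool entry, per the JVM class file format.
_FIXED = {3: 4, 4: 4, 5: 8, 6: 8, 7: 2, 8: 2, 9: 4, 10: 4, 11: 4,
          12: 4, 15: 3, 16: 2, 17: 4, 18: 4, 19: 2, 20: 2}


def string_literals(class_bytes):
    """Return the set of string literals in one class file's constant pool."""
    if class_bytes[:4] != b"\xca\xfe\xba\xbe":
        return set()  # not a class file
    (count,) = struct.unpack_from(">H", class_bytes, 8)
    utf8, string_refs = {}, []
    pos, index = 10, 1  # pool starts after magic, minor, major, count
    while index < count:
        tag = class_bytes[pos]
        pos += 1
        if tag == 1:  # CONSTANT_Utf8: u2 length followed by the bytes
            (length,) = struct.unpack_from(">H", class_bytes, pos)
            raw = class_bytes[pos + 2:pos + 2 + length]
            # Modified UTF-8 equals UTF-8 for typical resource paths.
            utf8[index] = raw.decode("utf-8", errors="replace")
            pos += 2 + length
        elif tag in _FIXED:
            if tag == 8:  # CONSTANT_String: index of a Utf8 entry
                string_refs.append(struct.unpack_from(">H", class_bytes, pos)[0])
            pos += _FIXED[tag]
        else:
            raise ValueError(f"unknown constant pool tag {tag}")
        index += 2 if tag in (5, 6) else 1  # long/double take two slots
    return {utf8[i] for i in string_refs if i in utf8}


def unreferenced_resources(jar):
    """List non-class entries whose path is never a string literal."""
    literals, resources = set(), []
    with zipfile.ZipFile(jar) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue
            if info.filename.endswith(".class"):
                # Class.getResource("/x") and ClassLoader.getResource("x")
                # differ only by the leading slash for absolute paths.
                literals |= {s.lstrip("/") for s in string_literals(zf.read(info))}
            else:
                resources.append(info.filename)
    return sorted(r for r in resources
                  if r not in literals and not r.startswith("META-INF/"))
```

Because a missing literal does not prove a resource is unused, output like this would only be suitable as input to the aggressive or report levels, never to the default safe level.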
Shrink Levels
jbundle build --shrink # current behavior (safe patterns only)
jbundle build --shrink aggressive # add reachability analysis
jbundle build --shrink report # show what would be removed without removing
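The three invocations above imply a flag whose value is optional. A minimal sketch of that parsing behaviour, using Python's argparse purely for illustration (jbundle's real argument parser is not shown in this issue, and the level name "safe" for the bare flag is an assumption):

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(prog="jbundle")
    sub = parser.add_subparsers(dest="command", required=True)
    build = sub.add_parser("build")
    build.add_argument(
        "--shrink",
        nargs="?",        # the level after the flag is optional
        const="safe",     # bare --shrink keeps the current behavior
        default=None,     # flag absent: no shrinking at all
        choices=["safe", "aggressive", "report"],
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.shrink)
```

One design point worth settling early: with an optional value, a parser like this consumes the next non-option token as the level, so the syntax can become ambiguous if build ever accepts positional arguments directly after --shrink.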
Categories of Removable Content
| Category | Example | Risk |
|---|---|---|
| SCM metadata | .git/, .svn/, pom.properties | None |
| Duplicate licenses | Multiple LICENSE.txt from deps | None |
| Wrong-platform natives | .dll in Linux build, .dylib in Linux | None |
| Unreferenced resources | XML configs for unused features | Low |
| Dev/test namespaces | *-test.class, dev/*.class | Low |
| Unused service providers | META-INF/services/ for unused interfaces | Medium |
| Unused class files | Classes from deps never referenced | Medium |
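Two of the categories above, duplicate files and wrong-platform natives, need no code analysis at all. The sketch below shows one way they might be detected: duplicates by hashing entry contents, foreign natives by file suffix. The function names and the suffix table are assumptions made for illustration, and the code only reports candidates without rewriting the jar.

```python
import hashlib
import zipfile
from collections import defaultdict
from pathlib import PurePosixPath

# Assumed mapping from target OS to the native-library suffixes it loads.
NATIVE_SUFFIXES = {
    "linux": {".so"},
    "macos": {".dylib", ".jnilib"},
    "windows": {".dll"},
}


def duplicate_groups(jar):
    """Group entry names by SHA-256 of content; keep groups of 2 or more."""
    by_digest = defaultdict(list)
    with zipfile.ZipFile(jar) as zf:
        for info in zf.infolist():
            if not info.is_dir():
                digest = hashlib.sha256(zf.read(info)).hexdigest()
                by_digest[digest].append(info.filename)
    return [sorted(names) for names in by_digest.values() if len(names) > 1]


def foreign_natives(jar, target_os):
    """List native libraries whose suffix belongs to another platform."""
    all_suffixes = set().union(*NATIVE_SUFFIXES.values())
    foreign = all_suffixes - NATIVE_SUFFIXES[target_os]
    with zipfile.ZipFile(jar) as zf:
        return sorted(name for name in zf.namelist()
                      if PurePosixPath(name).suffix.lower() in foreign)
```

Suffix matching is deliberately simple here; versioned names such as libfoo.so.1 would need extra handling before this could back a real removal pass.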
Safety Mechanism
- --shrink (no argument) remains safe and conservative (current behavior)
- --shrink aggressive performs deeper analysis but may break apps with highly dynamic resource loading
- --shrink report (or integration with analyze) shows potential savings without removing anything
- Allow --shrink-keep <pattern> to whitelist resources that analysis marks as removable but the user knows are needed
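The interaction between the report level and --shrink-keep could look roughly like the sketch below. It assumes the analysis yields a mapping of entry path to size in bytes and that keep patterns are shell-style globs; the issue only says `<pattern>`, so the glob syntax, the function names, and the report wording are all hypothetical.

```python
from fnmatch import fnmatchcase


def apply_keep(candidates, keep_patterns):
    """Split removal candidates into (to_remove, kept) dicts.

    candidates maps entry path -> size in bytes. Note that fnmatch's
    "*" also matches "/", so "templates/*" covers nested paths.
    """
    to_remove, kept = {}, {}
    for path, size in candidates.items():
        matched = any(fnmatchcase(path, pattern) for pattern in keep_patterns)
        (kept if matched else to_remove)[path] = size
    return to_remove, kept


def report(candidates, keep_patterns, jar_size):
    """Describe what would be removed without touching the jar."""
    to_remove, kept = apply_keep(candidates, keep_patterns)
    saved = sum(to_remove.values())
    lines = [f"would remove {len(to_remove)} entries, {saved} bytes "
             f"({100 * saved / jar_size:.1f}% of jar)"]
    lines += [f"  kept by --shrink-keep: {path}" for path in sorted(kept)]
    return "\n".join(lines)
```

Listing the kept entries in the report gives users a way to confirm that a pattern matched what they intended before running the aggressive level.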
Impact Estimate
For a typical Clojure web application uberjar:
- Current --shrink: removes ~5-15% (metadata, licenses, SCM)
- With reachability analysis: could remove ~20-40% (unused deps, wrong-platform natives, dev resources)