Blazegraph performance: Add JVM heap memory configuration support #33

@WolfgangFahl

Description

Problem

Blazegraph containers started via pyomnigraph show poor SPARQL query performance, most likely because the JVM heap is too small. The default Docker container probably runs with only a 512 MB to 1 GB heap, which is inadequate for complex queries on large datasets such as the GOV genealogy data.

Complex SPARQL queries (especially those with property paths like (gp:isPartOf/gp:ref){1,10}) time out or run very slowly on local dockerized Blazegraph instances.
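The heap estimate above is a guess, so it may be worth measuring before changing anything. A minimal sketch, assuming `docker` is on the host PATH and a `java` binary is on the PATH inside the container (neither is confirmed for the Blazegraph image); the helper names are hypothetical. Note that this reports the default a fresh JVM would pick inside the container, which is not necessarily what the running Blazegraph process was started with.

```python
import re
import subprocess
from typing import Optional


def parse_max_heap_bytes(flags_output: str) -> Optional[int]:
    """Extract MaxHeapSize (bytes) from `java -XX:+PrintFlagsFinal` output."""
    # JDK 8 prints "uintx MaxHeapSize := 1073741824 {product}";
    # newer JDKs print "size_t MaxHeapSize = 1073741824 {product} {ergonomic}".
    match = re.search(r"\bMaxHeapSize\s*:?=\s*(\d+)", flags_output)
    return int(match.group(1)) if match else None


def container_default_max_heap(container_name: str) -> Optional[int]:
    """Ask a fresh JVM inside the container for its default max heap.

    Assumption: `java` is available inside the container.
    """
    result = subprocess.run(
        ["docker", "exec", container_name,
         "java", "-XX:+PrintFlagsFinal", "-version"],
        capture_output=True,
        text=True,
        check=True,
    )
    # Flags normally go to stdout; stderr is included to be safe.
    return parse_max_heap_bytes(result.stdout + result.stderr)
```

Calling `container_default_max_heap("<container name>")` and dividing the result by `1024 ** 3` gives the default heap limit in GiB.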

Root Cause

The BlazegraphConfig.get_docker_run_command() method in omnigraph/servers/blazegraph.py (lines 37-44) does not set JVM heap parameters via the JAVA_OPTS environment variable.

Proposed Solution

Add JVM memory configuration support to the Blazegraph docker run command:

docker_run_command = (
    f"docker run -d --name {self.container_name} "
    f"-e BLAZEGRAPH_UID={os.getuid()} "
    f"-e BLAZEGRAPH_GID={os.getgid()} "
    f"-e JAVA_OPTS='-Xmx4g -Xms4g' "  # Add JVM heap settings
    f"-p {self.port}:8080 "
    f"-v {data_dir}/RWStore.properties:/RWStore.properties "
    f"-v {data_dir}:/data "
    f"{self.image}"
)
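One detail to watch: the single quotes around the JAVA_OPTS value only work if the string is passed through a shell. A minimal sketch of the same command as an argument list, which needs no quoting; `build_docker_run_args` and its parameters are hypothetical and not part of pyomnigraph, and whether the Blazegraph image actually honors JAVA_OPTS still has to be verified.

```python
import os
from typing import List


def build_docker_run_args(
    container_name: str,
    port: int,
    data_dir: str,
    image: str,
    java_opts: str = "-Xmx4g -Xms4g",  # heap settings proposed in this issue
) -> List[str]:
    """Build the docker run command as an argv list (no shell quoting needed)."""
    return [
        "docker", "run", "-d",
        "--name", container_name,
        "-e", f"BLAZEGRAPH_UID={os.getuid()}",
        "-e", f"BLAZEGRAPH_GID={os.getgid()}",
        # The whole value is one argv element, so the space is preserved.
        "-e", f"JAVA_OPTS={java_opts}",
        "-p", f"{port}:8080",
        "-v", f"{data_dir}/RWStore.properties:/RWStore.properties",
        "-v", f"{data_dir}:/data",
        image,
    ]
```

The list can be handed to `subprocess.run(args, check=True)` directly, or turned back into a shell string with `shlex.join(args)` if the existing code path expects one.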

Recommendations

  1. Make it configurable: Add optional java_opts or heap_size field to ServerConfig
  2. Sensible defaults: Use at least 4GB heap by default for production use
  3. Consider additional optimizations:
    • Raise the B+Tree write retention queue capacity: -Dcom.bigdata.btree.writeRetentionQueue.capacity=4000
    • Set the B+Tree branching factor: -Dcom.bigdata.btree.BTree.branchingFactor=128

Configuration example

Allow users to specify in servers.yaml:

blazegraph:
  server: "blazegraph"
  heap_size: "6g"  # simple form; alternatively, give the full JVM options:
  java_opts: "-Xmx6g -Xms6g -Dcom.bigdata.btree.writeRetentionQueue.capacity=4000"
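How the two fields could be resolved into one JAVA_OPTS value, as a rough sketch. The function name, the rule that java_opts takes precedence over heap_size, and the 4g fallback are assumptions for illustration, not existing pyomnigraph behavior. The accepted suffixes (k, m, g) are a deliberately conservative subset of what the JVM allows.

```python
import re
from typing import Optional

# Digits followed by a k/m/g suffix, e.g. "512m" or "6g".
_HEAP_SIZE = re.compile(r"\d+[kKmMgG]")


def resolve_java_opts(
    heap_size: Optional[str] = None,
    java_opts: Optional[str] = None,
    default_heap: str = "4g",
) -> str:
    """Return the JAVA_OPTS string for a server entry.

    Assumed precedence: an explicit java_opts wins, then heap_size,
    then the default heap.
    """
    if java_opts:
        return java_opts
    size = heap_size or default_heap
    if not _HEAP_SIZE.fullmatch(size):
        raise ValueError(f"invalid heap size {size!r}, expected e.g. '6g'")
    # Setting -Xms equal to -Xmx mirrors the proposal in this issue.
    return f"-Xmx{size} -Xms{size}"
```

With the YAML above, `resolve_java_opts(heap_size="6g")` yields `-Xmx6g -Xms6g`, and passing `java_opts` returns that string unchanged.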

Impact

A larger, configurable heap should noticeably improve query performance for users working with large RDF datasets and complex SPARQL queries, and should reduce the timeouts described above.

Labels

enhancement (New feature or request)
