JVM Predictive Auto-Scaling
Thoras can predictively tune JVM heap settings (-Xmx and -Xms) alongside
container CPU and memory requests for Java workloads. Instead of hard-coding
heap sizes or relying on JVM ergonomics, Thoras forecasts heap and non-heap
memory usage and derives optimal values on every pod creation.
How It Works
Overview
Traditional JVM memory configuration is static: you set -Xmx (maximum heap)
and -Xms (initial heap) at deployment time and hope they’re sufficient. If
heap usage grows, you get OOM kills. If it shrinks, you waste resources.
Thoras solves this by:
- Collecting JVM metrics (jvm_memory_used_bytes for heap and non-heap areas) directly from your application
- Forecasting future heap and non-heap memory needs using historical patterns
- Computing optimal -Xmx, -Xms, and container memory request/limit values
- Injecting these as environment variables into pods at creation time via a mutating admission webhook
The Calculation
Thoras uses three configurable buffers to compute JVM settings from its forecasts:

| Parameter | Default | Purpose |
|---|---|---|
| xmx_buffer | 10% | Headroom above observed/predicted heap usage for -Xmx |
| xms_ratio | 80% | Ratio of -Xms to -Xmx (initial heap as a fraction of max heap) |
| memory_buffer | 10% | Headroom above total JVM memory (heap + non-heap) for the container memory request |
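The maximum heap size is the observed/predicted heap usage plus the xmx_buffer headroom: Xmx = heap usage * (1 + xmx_buffer). If xmx_bounds are configured, the value is clamped within those bounds.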
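The Xmx step can be sketched in a few lines of Python. This is an illustrative sketch, not Thoras's implementation: the function name `compute_xmx`, the integer-percent arithmetic, and the floor rounding are all assumptions made for the example.

```python
from typing import Optional

MIB = 1024 * 1024


def compute_xmx(heap_bytes: int,
                xmx_buffer_pct: int = 10,
                lower: Optional[int] = None,
                upper: Optional[int] = None) -> int:
    """Add xmx_buffer headroom to the heap usage, then clamp to xmx_bounds."""
    # Integer math avoids float rounding noise (an assumption, not Thoras's rule).
    xmx = heap_bytes * (100 + xmx_buffer_pct) // 100
    if lower is not None:
        xmx = max(xmx, lower)  # xmx_bounds.lowerbound acts as a floor
    if upper is not None:
        xmx = min(xmx, upper)  # xmx_bounds.upperbound acts as a ceiling
    return xmx


unbounded = compute_xmx(200 * MIB)                   # no bounds configured
floored = compute_xmx(200 * MIB, lower=256 * MIB)    # raised to the lower bound
capped = compute_xmx(5000 * MIB, upper=4096 * MIB)   # capped at the upper bound
print(unbounded // MIB, floored // MIB, capped // MIB)
```

With a 200 MiB heap value and the default buffer, the unbounded result is 220 MiB; the other two calls show each bound taking effect.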
Step 3: Calculate Xms (initial heap)
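The initial heap size is a ratio of the maximum: Xms = Xmx * xms_ratio. The container memory request is then derived from total JVM memory: (Xmx + non-heap forecast) * (1 + memory_buffer).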
Worked Example
Given:
- Heap forecast: 200 MiB
- Non-heap forecast: 50 MiB
- All buffers at defaults (10% xmx_buffer, 80% xms_ratio, 10% memory_buffer)
| Step | Calculation | Result |
|---|---|---|
| Xmx | 200 MiB * 1.10 | 220 MiB |
| Xms | 220 MiB * 0.80 | 176 MiB |
| Container memory | (220 MiB + 50 MiB) * 1.10 | 297 MiB |
The pod is created with:
- THORAS_JVM_HEAP_MEMORY_XMX=-Xmx230686720 (220 MiB in bytes)
- THORAS_JVM_HEAP_MEMORY_XMS=-Xms184549376 (176 MiB in bytes)
- Container memory request: 297 MiB
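The worked example can be reproduced with a short script. The function name `jvm_settings` and the floor rounding are assumptions for illustration; only the formulas, the default percentages, and the variable names come from this page.

```python
MIB = 1024 * 1024


def jvm_settings(heap_bytes, nonheap_bytes,
                 xmx_buffer_pct=10, xms_ratio_pct=80, memory_buffer_pct=10):
    """Return (xmx, xms, container_request) in bytes from forecast values."""
    xmx = heap_bytes * (100 + xmx_buffer_pct) // 100
    xms = xmx * xms_ratio_pct // 100
    request = (xmx + nonheap_bytes) * (100 + memory_buffer_pct) // 100
    return xmx, xms, request


xmx, xms, request = jvm_settings(200 * MIB, 50 * MIB)

# The injected values are complete JVM flags with the size in bytes.
env = {
    "THORAS_JVM_HEAP_MEMORY_XMX": f"-Xmx{xmx}",
    "THORAS_JVM_HEAP_MEMORY_XMS": f"-Xms{xms}",
}
print(env)
print(request // MIB, "MiB")
```

Running it yields `-Xmx230686720`, `-Xms184549376`, and a 297 MiB request, matching the table.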
What Gets Injected
Thoras injects two environment variables into the JVM container on pod creation:

| Environment Variable | Format | Description |
|---|---|---|
| THORAS_JVM_HEAP_MEMORY_XMX | -Xmx<bytes> | Maximum heap size flag (e.g., -Xmx230686720) |
| THORAS_JVM_HEAP_MEMORY_XMS | -Xms<bytes> | Initial/minimum heap size flag (e.g., -Xms184549376) |
Each value is a complete JVM flag, so it can be referenced directly in JAVA_OPTS or
your application’s JVM argument configuration.
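If your container starts the JVM through a wrapper script instead of JAVA_OPTS, the variables can be read directly. The sketch below is a hypothetical Python launcher: `build_java_command` and the fallback flags `-Xmx512m` / `-Xms256m` are assumptions for illustration, not something Thoras provides. A fallback may be worth having because the variables are absent whenever injection is skipped.

```python
import os


def build_java_command(jar_path, environ=None,
                       default_xmx="-Xmx512m", default_xms="-Xms256m"):
    """Build a java argv list from the Thoras-injected heap flags."""
    env = os.environ if environ is None else environ
    # Each variable already holds a complete flag such as "-Xmx230686720",
    # so it is passed through unchanged. Defaults are hypothetical fallbacks.
    xmx = env.get("THORAS_JVM_HEAP_MEMORY_XMX", default_xmx)
    xms = env.get("THORAS_JVM_HEAP_MEMORY_XMS", default_xms)
    return ["java", xms, xmx, "-jar", jar_path]


cmd = build_java_command(
    "app.jar",
    environ={
        "THORAS_JVM_HEAP_MEMORY_XMX": "-Xmx230686720",
        "THORAS_JVM_HEAP_MEMORY_XMS": "-Xms184549376",
    },
)
fallback_cmd = build_java_command("app.jar", environ={})
print(" ".join(cmd))
```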
Configuration Guide
Prerequisites
- Prometheus JVM metric endpoint: Your Java application must expose jvm_memory_used_bytes metrics with area="heap" and area="nonheap" labels. Most JVM metric exporters (Micrometer, JMX Exporter, etc.) produce these by default.
- Vertical scaling enabled on the AIScaleTarget in auto (autonomous) mode.
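Before enabling JVM sizing, it can help to confirm that both label values are present in your application's metrics output. The following sketch parses Prometheus text exposition and sums samples per `area` label. The sample text is made up for illustration, the simple regex does not handle every legal label value, and summing memory pools per area is an assumption about how the series would be aggregated.

```python
import re

# Matches lines like: jvm_memory_used_bytes{area="heap",id="G1 Eden Space"} 1.2E7
SAMPLE_RE = re.compile(r'^jvm_memory_used_bytes\{([^}]*)\}\s+(\S+)')
AREA_RE = re.compile(r'area="([^"]+)"')


def used_bytes_by_area(exposition_text):
    """Sum jvm_memory_used_bytes samples per `area` label value."""
    totals = {}
    for line in exposition_text.splitlines():
        match = SAMPLE_RE.match(line.strip())
        if not match:
            continue  # comments (# HELP / # TYPE) and other metrics
        area = AREA_RE.search(match.group(1))
        if area:
            name = area.group(1)
            totals[name] = totals.get(name, 0.0) + float(match.group(2))
    return totals


sample = """\
# TYPE jvm_memory_used_bytes gauge
jvm_memory_used_bytes{area="heap",id="G1 Eden Space"} 1.048576E7
jvm_memory_used_bytes{area="heap",id="G1 Old Gen"} 2.097152E7
jvm_memory_used_bytes{area="nonheap",id="Metaspace"} 5242880.0
"""
totals = used_bytes_by_area(sample)
missing = {"heap", "nonheap"} - set(totals)
print(totals, "missing:", sorted(missing))
```

An empty `missing` set means both required areas are being exported.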
Step 1: Configure Your Application to Use the Environment Variables
Your application must reference the Thoras-injected environment variables for its JVM heap settings. The simplest approach is to reference them in JAVA_OPTS
or JAVA_TOOL_OPTIONS using the Kubernetes $(VAR_NAME) syntax. When JAVA_OPTS is defined this way,
Kubernetes resolves the $(...) references at container startup using the
environment variables that Thoras injects.
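To make the substitution concrete, the sketch below models how a $(NAME) reference is replaced by the value of a defined variable and left unchanged when the variable is not defined. It is a simplified model of Kubernetes behavior (it ignores `$$` escaping, for instance); `expand_refs` and the sample JAVA_OPTS string are assumptions for illustration.

```python
import re

# A $(NAME) reference, where NAME looks like an environment variable name.
REF_RE = re.compile(r"\$\(([A-Za-z_][A-Za-z0-9_.-]*)\)")


def expand_refs(value, defined):
    """Replace $(NAME) with defined[NAME]; leave unknown references untouched."""
    return REF_RE.sub(lambda m: defined.get(m.group(1), m.group(0)), value)


injected = {
    "THORAS_JVM_HEAP_MEMORY_XMX": "-Xmx230686720",
    "THORAS_JVM_HEAP_MEMORY_XMS": "-Xms184549376",
}
java_opts = "$(THORAS_JVM_HEAP_MEMORY_XMX) $(THORAS_JVM_HEAP_MEMORY_XMS) -XX:+UseG1GC"

resolved = expand_refs(java_opts, injected)
unresolved = expand_refs(java_opts, {})  # nothing injected: text is unchanged
print(resolved)
```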
Step 2: Create the AIScaleTarget
Configure an AIScaleTarget with JVM options enabled on the target container. The key requirements are:
- Vertical mode must be set to auto
- The container must have memory.limit.ratio set (required when JVM auto-sizing is enabled)
- jvm_options.enable_auto_heap_size must be true
AIScaleTarget JVM Fields Reference
All JVM configuration lives under spec.vertical.containers[].jvm_options:

| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| enable_auto_heap_size | bool | Yes | false | Enables automatic JVM heap sizing for this container. |
| xmx_buffer | percentage | No | 10% | Buffer added above observed/predicted heap usage when computing -Xmx. Provides headroom to absorb heap growth between forecast cycles. |
| xms_ratio | percentage | No | 80% | Ratio of -Xms to -Xmx. Controls how much heap the JVM pre-allocates at startup. Higher values reduce GC pressure from heap expansion. |
| memory_buffer | percentage | No | 10% | Buffer added above total JVM memory (Xmx + non-heap) when computing the container memory request. Accounts for native memory, thread stacks, metaspace growth, etc. |
| xmx_bounds.lowerbound | resource quantity | No | none | Minimum allowed Xmx value (e.g., 256Mi). Prevents the heap from being sized too small regardless of forecast. |
| xmx_bounds.upperbound | resource quantity | No | none | Maximum allowed Xmx value (e.g., 4Gi). Caps the heap to prevent runaway growth. |

The following container-level memory fields also affect JVM sizing:

| Field | Relevance |
|---|---|
| memory.limit.ratio | Required when enable_auto_heap_size is true. Defines the ratio of memory limit to request (e.g., "1.0" sets limit equal to request). |
| memory.lowerbound | Minimum container memory request. Acts as a floor. |
| memory.upperbound | (Optional) Maximum container memory request. Acts as a ceiling. |
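How the memory fields interact can be sketched as follows. This is an illustration under stated assumptions: `parse_quantity` handles only integer quantities with binary suffixes (Kubernetes accepts more forms), the helper names are hypothetical, and applying the bounds before the limit ratio is an assumed order.

```python
BINARY_SUFFIXES = {"Ki": 1024, "Mi": 1024**2, "Gi": 1024**3, "Ti": 1024**4}


def parse_quantity(quantity):
    """Convert a quantity such as '256Mi' or '4Gi' to bytes (simplified)."""
    for suffix, factor in BINARY_SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)  # plain byte count


def clamp_request(request_bytes, lowerbound=None, upperbound=None):
    """Apply memory.lowerbound (floor) and memory.upperbound (ceiling)."""
    if lowerbound is not None:
        request_bytes = max(request_bytes, parse_quantity(lowerbound))
    if upperbound is not None:
        request_bytes = min(request_bytes, parse_quantity(upperbound))
    return request_bytes


def memory_limit(request_bytes, limit_ratio):
    """memory.limit.ratio is the ratio of memory limit to request."""
    return int(request_bytes * float(limit_ratio))


request = clamp_request(parse_quantity("100Mi"), lowerbound="256Mi", upperbound="8Gi")
equal_limit = memory_limit(request, "1.0")    # limit equals request
double_limit = memory_limit(request, "2.0")   # limit is twice the request
print(request, equal_limit, double_limit)
```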
Full Example with Custom Buffers
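With xmx_buffer at 15%, xms_ratio at 75%, memory_buffer at 20%, xmx_bounds of 512Mi to 4Gi, and container memory bounds of 256Mi to 8Gi, Thoras:
- Adds 15% headroom on heap forecasts when computing Xmx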
- Sets Xms to 75% of Xmx
- Adds 20% headroom for the container memory request
- Clamps Xmx between 512 MiB and 4 GiB regardless of forecast
- Keeps the container memory request between 256 MiB and 8 GiB
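The effect of these custom settings on two hypothetical forecasts can be sketched as below. The forecasts, the helper names, and the floor rounding are assumptions; computing Xms and the container request from the already clamped Xmx follows the step order described earlier on this page.

```python
MIB = 1024 * 1024
GIB = 1024 * MIB


def clamp(value, low, high):
    """Keep value within [low, high]."""
    return max(low, min(value, high))


def size_jvm(heap_bytes, nonheap_bytes):
    """Apply the custom settings listed above to a pair of forecasts."""
    xmx = clamp(heap_bytes * 115 // 100, 512 * MIB, 4 * GIB)  # 15% buffer, xmx_bounds
    xms = xmx * 75 // 100                                      # xms_ratio of 75%
    request = clamp((xmx + nonheap_bytes) * 120 // 100,        # 20% memory_buffer
                    256 * MIB, 8 * GIB)                        # memory bounds
    return xmx, xms, request


small = size_jvm(300 * MIB, 88 * MIB)   # 345 MiB is raised to the 512 MiB floor
large = size_jvm(6 * GIB, 1 * GIB)      # 6.9 GiB is capped at 4 GiB
print([v // MIB for v in small], [v // MIB for v in large])
```

For the small forecast the lower Xmx bound wins (512 MiB heap, 384 MiB initial heap, 720 MiB request); for the large one the upper bound caps the heap at 4 GiB.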
Multi-Container Pods
JVM auto-scaling supports one JVM container per pod. In multi-container pods, only the container with jvm_options.enable_auto_heap_size: true will
have heap settings managed. Other containers (sidecars, init containers, etc.)
are unaffected. For example, in a pod with a JVM container named my-jvm-app and a second container named sidecar,
my-jvm-app gets Thoras-managed heap settings while sidecar
gets standard vertical scaling (or no scaling, depending on configuration).
Constraints and Considerations
- One JVM container per pod: Only one container in the pod can have enable_auto_heap_size: true.
- Requires memory.limit: The container must have a memory.limit.ratio configured. This ensures the JVM has a bounded memory limit. A ratio of "2.0" is typical for JVM workloads since the JVM manages its own memory within the heap boundary.
- Never scales below current usage: Thoras always takes the maximum of the forecast and current observed usage to prevent sizing below what the JVM is actively using.
- Metrics must be available: The application must expose jvm_memory_used_bytes with area="heap" and area="nonheap" labels. If these metrics are missing, JVM mutation is skipped for that pod.
- Applied at pod creation: JVM settings are injected via the admission webhook when pods are created. Existing running pods are not modified in place; a rollout is required for new settings to take effect.
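The "never scales below current usage" rule amounts to taking a maximum before sizing. A minimal sketch, assuming the maximum is taken first and the xmx_buffer is applied afterwards (the page does not state the exact order, and the function names are hypothetical):

```python
MIB = 1024 * 1024


def heap_basis(forecast_bytes, current_usage_bytes):
    """Use whichever is larger so sizing never drops below live usage."""
    return max(forecast_bytes, current_usage_bytes)


def xmx_from_basis(basis_bytes, xmx_buffer_pct=10):
    """Add xmx_buffer headroom on top of the chosen basis."""
    return basis_bytes * (100 + xmx_buffer_pct) // 100


# Forecast says 200 MiB, but the JVM is currently using 240 MiB.
basis = heap_basis(200 * MIB, 240 * MIB)
xmx = xmx_from_basis(basis)
print(basis // MIB, xmx // MIB)
```

Here the observed 240 MiB overrides the 200 MiB forecast, so the heap is sized from 240 MiB.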