How It Works
When OOM kills are detected on a workload, Thoras:
- Raises memory requests and limits for affected containers by a 1.2× multiplier. The new request and the new limit are computed independently, each from the average of its current value across all running pods. Both are raised because the kernel OOM-kills based on the limit, while the request guarantees node memory.
- Applies the adjustment to running pods and any newly created pods. For running pods, Thoras uses your update_policy: it attempts an in-place resize first (if enabled), and falls back to a rolling restart (if enabled) when the resize fails or is not supported. Any pods created or rescheduled in the meantime (e.g., evictions, scale-out) automatically receive the adjusted memory regardless of whether the resize or restart succeeds.
- Repeats every 2 minutes as long as OOM kills continue, compounding each time (1.2×, then 1.44×, then 1.73×, and so on) until OOMs stop.
- Holds the memory floor for a configurable stabilization window after the last OOM. During this window, forecasts can raise memory above the floor but cannot lower it. This prevents a new forecast, which may not yet reflect the OOM episode, from reversing the adjustment down to a point which would be prone to OOMing.
- Returns full control to the forecaster once the stabilization window expires. If OOMs recur, the cycle begins again.
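The averaging and compounding described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not Thoras's implementation: the function names and the MiB units are assumptions made for the example, and only the 1.2× multiplier comes from the text.

```python
from statistics import mean

OOM_MULTIPLIER = 1.2  # multiplier stated in the text


def next_target(current_values_mib: list[float], multiplier: float = OOM_MULTIPLIER) -> float:
    """Hypothetical helper: average the current per-pod values, then apply the multiplier.

    The same calculation would run once for requests and once for limits.
    """
    return mean(current_values_mib) * multiplier


def compounded_multiplier(cycles: int, multiplier: float = OOM_MULTIPLIER) -> float:
    """Cumulative multiplier after `cycles` consecutive adjustments."""
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    return multiplier ** cycles


if __name__ == "__main__":
    # Three pods, all at 512 MiB: the first adjustment targets about 614 MiB.
    print(round(next_target([512.0, 512.0, 512.0]), 1))
    # Cumulative multipliers for the first three cycles.
    print([round(compounded_multiplier(n), 2) for n in (1, 2, 3)])
```

Because each cycle starts from the average across running pods, the effective growth matches the compounded figures only when every pod actually received the previous adjustment.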
Enabling OOM Remediation
Add oom_remediation to your spec.vertical configuration:
oom_remediation.enabled must be true and spec.vertical.mode must be
autonomous. Workloads in recommendation mode are not affected.
Stabilization Window
After Thoras applies an OOM memory adjustment, it holds a memory floor for the duration of the stabilization window. During this window:
- Incoming forecasts cannot lower memory below the adjusted value.
- Forecasts can raise memory above the adjusted value if the forecast recommends it.
- Each new OOM resets the stabilization window, extending it from the time of the most recent OOM. The workload must be stable for the configured interval before memory may be scaled down from the multiplied value.
The default stabilization window is max(forecast_interval, 1h). The window is at minimum one
full forecast cycle, ensuring the forecaster has had at least one opportunity to
observe the workload at the adjusted memory level before the floor is removed.
The stabilization window should be long enough for the forecaster to observe
memory usage at the adjusted level, so future recommendations account for the
higher usage that triggered the OOMs.
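As a rough sketch of the default and of how each new OOM extends the window, the logic might look like the following. The function names are hypothetical, and treating the window as expired exactly at the boundary is an assumption; only the max(forecast_interval, 1h) rule and the reset-on-OOM behavior come from the text.

```python
from datetime import datetime, timedelta


def default_stabilization_window(forecast_interval: timedelta) -> timedelta:
    """Default from the text: max(forecast_interval, 1h)."""
    return max(forecast_interval, timedelta(hours=1))


def floor_is_active(last_oom: datetime, now: datetime, window: timedelta) -> bool:
    """The floor holds until `window` has elapsed since the most recent OOM.

    Each new OOM moves `last_oom` forward, which extends the window.
    Assumption: the floor is released exactly when the window has elapsed.
    """
    return now - last_oom < window


if __name__ == "__main__":
    window = default_stabilization_window(timedelta(minutes=15))
    last_oom = datetime(2025, 1, 1, 12, 0)
    print(window)
    print(floor_is_active(last_oom, datetime(2025, 1, 1, 12, 30), window))
    print(floor_is_active(last_oom, datetime(2025, 1, 1, 13, 30), window))
```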
To override the default, set stabilization_window explicitly:
Upper Bounds
memory.upperbound is a steady-state policy and is intentionally bypassed
during OOM remediation. OOM is an emergency response: clamping the adjustment to
upperbound would leave the workload exposed to the same kill threshold that
triggered the cycle in the first place. Adjustments may therefore land above the
configured upper bound while the stabilization window is active.
Once the stabilization window expires, the forecaster regains full control and
steady-state recommendations are clamped to upperbound as usual.
Memory Limits
OOM remediation raises both the memory request and the memory limit. How the limit is determined depends on whether Thoras manages it:
- Limit managed by Thoras (memory.limit.ratio is set): during the stabilization window, the limit is max(adjusted_request × ratio, adjusted_limit). The ratio is preserved on the way up, and the OOM-adjusted limit acts as a floor so a non-OOM reconcile cannot snap the limit back below it mid-episode. After the window expires, the limit follows the ratio normally as the forecaster regains control.
- Limit not managed by Thoras (no memory.limit.ratio): the limit is raised by the same multiplier as the request during the stabilization window to give the workload room to recover. When the window expires, Thoras reverts the limit to the value in your workload’s pod template (deployment, statefulset, etc.). If your workload’s true steady-state memory usage permanently exceeds the limit set in your pod template, OOMs will recur each time the limit reverts, visible as repeated OOMKilled events. The resolution is to raise the limit in the workload’s pod template, or to configure memory.limit.ratio so Thoras can manage limits directly.
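The two cases can be summarized in a short Python sketch. The function and parameter names are illustrative assumptions, and "follows the ratio" after the window is read here as limit = request × ratio; the max(adjusted_request × ratio, adjusted_limit) rule and the revert to the pod template value are taken from the text.

```python
from typing import Optional


def limit_during_window(adjusted_request: float,
                        adjusted_limit: float,
                        ratio: Optional[float]) -> float:
    """Limit while the stabilization window is active.

    `ratio` stands in for memory.limit.ratio; None means Thoras does not manage the limit.
    """
    if ratio is not None:
        # Managed: keep the ratio on the way up, never drop below the OOM-adjusted limit.
        return max(adjusted_request * ratio, adjusted_limit)
    # Unmanaged: the limit was raised by the same multiplier as the request.
    return adjusted_limit


def limit_after_window(request: float,
                       template_limit: float,
                       ratio: Optional[float]) -> float:
    """Limit once the window has expired."""
    if ratio is not None:
        return request * ratio  # assumption: the limit tracks request × ratio
    return template_limit       # reverts to the pod template value


if __name__ == "__main__":
    print(limit_during_window(1000.0, 1200.0, 1.5))
    print(limit_after_window(500.0, 1024.0, None))
```

The second call shows the situation the text warns about: if steady-state usage exceeds the template limit, reverting to it makes OOMs likely to recur.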
Interaction with Forecasts
During an active stabilization window, Thoras applies max(forecast, floor) to
memory when scaling. CPU is always sourced from the forecast unchanged.
When a new forecast arrives after the window has expired, the floor is cleared
and the forecast value is applied directly. If the workload OOMs again, the
remediation cycle restarts from the new forecast baseline.
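A minimal sketch of this rule, with hypothetical names and MiB units chosen only for illustration:

```python
from typing import Optional


def memory_to_apply(forecast: float, floor: Optional[float], window_active: bool) -> float:
    """Memory value used when scaling.

    While the window is active the floor wins whenever the forecast is lower;
    once it expires (or no floor exists) the forecast is applied directly.
    """
    if window_active and floor is not None:
        return max(forecast, floor)
    return forecast


if __name__ == "__main__":
    print(memory_to_apply(500.0, 614.4, window_active=True))   # floor holds
    print(memory_to_apply(700.0, 614.4, window_active=True))   # forecast may raise
    print(memory_to_apply(500.0, 614.4, window_active=False))  # floor cleared
```

CPU is left out on purpose, since the text states it always comes from the forecast unchanged.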
Considerations
- Memory may be held above the forecast during the stabilization window. If an OOM was caused by a one-off spike and the workload returns to lower memory usage afterward, requests will remain elevated until the window expires. Thoras favors avoiding another OOM over reclaiming memory immediately, and allows configuring the duration of the stabilization window.
- New pods after the window expires start at the forecast value. Once the floor clears, pods created from that point forward use the forecaster’s recommendation. If that recommendation is insufficient, another OOM may occur and trigger a new remediation cycle.
- Partial in-place resize failures. If some pods cannot be resized in place (e.g., the node lacks capacity), those pods continue at their previous memory until the node has room or they are rescheduled. The next adjustment cycle reads a blended average and targets a slightly higher value.
Full Example
- OOM remediation is enabled with a 2-hour stabilization window
- Steady-state memory recommendations are clamped to 4 GiB (upperbound); OOM adjustments may exceed this temporarily during stabilization
- Pods are resized in place where possible, with recreation as fallback

