There are three ways to stop Thoras from autonomously scaling workloads, ordered from broadest to narrowest scope. Choose the option that matches the blast radius you need. In the comparison that follows, AST is short for AIScaleTarget.
| Approach | Scope | Restarts pods | Workload requests | Model & data retention |
| --- | --- | --- | --- | --- |
| System-wide pause | All autonomous ASTs | No | Running pods keep the last applied suggestion; restarted pods come up with the controller spec | Thoras keeps forecasting and producing suggestions |
| Recommendation mode | One AST | Yes | Reverts to the controller spec | Thoras keeps forecasting and producing suggestions |
| Unenroll (delete AST) | One AST | Yes | Reverts to the controller spec | History and model state for the AST are discarded; workload is no longer managed or observed by Thoras |

1. System-wide pause

Halts all autonomous scaling actions across the entire cluster. Thoras continues collecting metrics and producing recommendations; only the apply step is suspended. Workloads in recommendation mode are unaffected.
Pausing does not restart workload pods. Running pods keep the resource requests Thoras last applied to them. When new pods are created they will use the requests defined in the controller spec.
When to use: planned maintenance windows, incident response, or baseline validation where you want to freeze allocations cluster-wide without changing any AIScaleTarget definitions.
How to do it:
  1. Open the Manage Cluster dropdown in the dashboard header.
  2. Select Pause autonomous scaling.
  3. Confirm in the flyout, which shows the count of autonomous targets that will be affected.
To resume, open the same dropdown and select Resume autonomous scaling.
The pause state persists across pod restarts and upgrades via the thoras-operator-system-config ConfigMap. See Pausing Autonomous Scaling for pod behavior during pause, visual indicators, and the ConfigMap-based advanced workflow.
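To check the persisted pause state from the command line, one option is to print the ConfigMap named above. This is a minimal sketch, not the documented workflow: the function name show_pause_config is hypothetical, the namespace is an assumption you must supply (this page does not state where Thoras is installed), and the keys inside the ConfigMap are not described here, so the sketch only displays it.

```shell
# Print the Thoras system ConfigMap so you can inspect the stored pause state.
# $1: namespace where Thoras is installed (assumption: supply your own value)
show_pause_config() {
  kubectl get configmap thoras-operator-system-config -n "$1" -o yaml
}

# Usage (the namespace is a placeholder, not a documented default):
# show_pause_config {{THORAS_NAMESPACE}}
```

The command is read-only. To change the pause state, use the dashboard steps above or the workflow described in Pausing Autonomous Scaling.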

2. Switch the AST from autonomous to recommendation mode

Stops autonomous scaling for a single workload while keeping the AIScaleTarget enrolled. Thoras continues to forecast and surface suggestions in the dashboard, but no scaling actions are applied.
Unlike a system-wide pause, switching to recommendation mode restarts the workload’s pods and reverts it to the requests defined in the controller spec.
When to use: you want to keep the workload monitored and the model warm (for example, to compare suggestions against actual usage), but you don’t want Thoras applying changes for that specific workload.
How to do it: Edit the AIScaleTarget and set the active scaling direction’s mode to recommendation:
kubectl edit ast {{YOUR_AST_NAME}} -n {{YOUR_NAMESPACE}}
For a vertically autonomous workload:
spec:
  vertical:
    mode: recommendation
For a horizontally autonomous workload:
spec:
  horizontal:
    mode: recommendation
Recommendations continue to appear in the dashboard. To re-enable autonomous scaling later, set mode back to autonomous. See Understanding Vertical and Horizontal Scaling Modes for how the two directions interact, and the AIScaleTarget reference for the full mode specification.
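For scripted changes, the same field can likely be set without opening an editor by using kubectl patch with a JSON merge patch. This is a sketch under assumptions: the helper names ast_mode_patch and set_ast_mode are hypothetical, and the validation accepts only the two directions and two modes named on this page (the AIScaleTarget reference may define more).

```shell
# Build a JSON merge patch that sets the mode for one scaling direction.
# $1: direction ("vertical" or "horizontal")
# $2: mode ("recommendation" or "autonomous", the two modes named on this page)
ast_mode_patch() {
  case "$1" in
    vertical|horizontal) ;;
    *) echo "direction must be vertical or horizontal" >&2; return 1 ;;
  esac
  case "$2" in
    recommendation|autonomous) ;;
    *) echo "mode must be recommendation or autonomous" >&2; return 1 ;;
  esac
  printf '{"spec":{"%s":{"mode":"%s"}}}' "$1" "$2"
}

# Apply the patch to an AST.
# $1: AST name, $2: namespace, $3: direction, $4: mode
set_ast_mode() {
  patch=$(ast_mode_patch "$3" "$4") || return 1
  # Custom resources do not support strategic merge, so request a merge patch.
  kubectl patch ast "$1" -n "$2" --type merge -p "$patch"
}

# Usage (replace the placeholders with your own values):
# set_ast_mode {{YOUR_AST_NAME}} {{YOUR_NAMESPACE}} vertical recommendation
# set_ast_mode {{YOUR_AST_NAME}} {{YOUR_NAMESPACE}} vertical autonomous
```

A merge patch touches only the mode field it names, so the rest of the AIScaleTarget spec stays as it is.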

3. Unenroll the workload (delete the AST)

Removes the workload from Thoras entirely. The workload is restarted and its pods come up with the resource requests defined in the controller spec.
Deleting an AST is permanent. Historical suggestions, utilization data, and model state for that AST will be lost. Re-creating an AST with the same name later produces a brand-new AST. If you want to stop scaling actions while keeping history intact, use option 1 or 2 above.
When to use: the workload should no longer be managed or observed by Thoras at all, for example because it is being decommissioned or moved to a different scaling system, or because you want a clean re-enrollment later.
How to do it:
kubectl delete ast {{YOUR_AST_NAME}} -n {{YOUR_NAMESPACE}}
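Because deletion is permanent, it may be worth saving the AST definition first so the workload can be re-enrolled with the same settings. This is a minimal sketch with a hypothetical helper name, backup_ast; it saves only the manifest, and the history and model state described above are still lost when the AST is deleted.

```shell
# Save an AST manifest to a file before deleting the AST.
# $1: AST name, $2: namespace, $3: output file path
backup_ast() {
  # Remove the partial file if kubectl fails, so a failed export is not mistaken for a backup.
  kubectl get ast "$1" -n "$2" -o yaml > "$3" || { rm -f "$3"; return 1; }
}

# Usage (replace the placeholders with your own values):
# backup_ast {{YOUR_AST_NAME}} {{YOUR_NAMESPACE}} ast-backup.yaml
# kubectl delete ast {{YOUR_AST_NAME}} -n {{YOUR_NAMESPACE}}
```

The exported YAML includes server-populated fields such as status and metadata.resourceVersion, which usually need to be removed before the file is applied again.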