Autonomous Mode Checklist

Following this quick checklist helps ensure your workloads are ready for predictive autoscaling.

  1. To scale autonomously, Thoras needs to persist historical metrics, which requires a Kubernetes persistent volume. See the Configure Persistent Volume page for a walkthrough
  2. If you manage more than one hundred workloads, you’ll want to allocate more forecast workers. See Configure Worker Capacity
  3. Deploying Thoras with ArgoCD? Check out the ArgoCD config considerations
  4. [Optional] kubectl port-forwarding is a great way to experiment, but you may eventually want to expose the dashboard service internally to enable broader, more stable access. You can do this by enabling Ingress via $.Values.thorasDashboard.ingress or GatewayAPI via $.Values.thorasDashboard.gatewayAPI. See GitHub for further documentation
  5. Check predictions in the dashboard - confirm that metric suggestions for your workload are accurate. The Thoras reasoning engine typically needs to see a pattern twice to learn it, so reaching maximum accuracy can take anywhere from a few minutes to a few days.
(Dashboard screenshot: predictions vs. usage)
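The persistent volume requirement above is covered in detail in the Configure Persistent Volume walkthrough; for orientation, a Kubernetes PersistentVolumeClaim generally looks like the sketch below. All names, namespaces, and sizes here are illustrative, not the values Thoras actually uses - follow the walkthrough for the real chart configuration.

```yaml
# Illustrative only - see the Configure Persistent Volume page for the
# chart values Thoras actually uses.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: thoras-metrics-storage   # hypothetical name
  namespace: thoras              # hypothetical namespace
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi              # size depends on your metric retention needs
```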
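To move beyond port-forwarding, the dashboard can be exposed via the `$.Values.thorasDashboard.ingress` or `$.Values.thorasDashboard.gatewayAPI` values paths mentioned above. A minimal values.yaml sketch follows; the `enabled` field nested under those paths is an assumption based on common Helm chart conventions, so confirm the exact fields against the chart's GitHub documentation.

```yaml
# values.yaml sketch - the key paths come from the docs above; the nested
# `enabled` toggle is an assumption, so verify it in the chart's GitHub docs.
thorasDashboard:
  ingress:
    enabled: true
    # host and TLS settings go here; see the chart documentation
  # Alternatively, expose via Gateway API instead of Ingress:
  # gatewayAPI:
  #   enabled: true
```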
  6. Understand application behavior - for vertical scaling, ensure the workload and its dependencies can handle periodic restarts. See the vertical pod rightsizing and AIScaleTarget pages for an overview of scaling and restart configuration options.
  7. Choose a scaling direction - only one scaling direction (horizontal or vertical) can be in autonomous mode at a time. Choose the direction that best fits your workload:
    • Horizontal for workloads with variable traffic patterns that need to scale the number of replicas to handle demand spikes.
    • Vertical for workloads with more predictable resource patterns that benefit from optimized CPU and memory requests to improve cluster utilization and reduce costs. Note: before enabling Autonomous Mode, verify no competing vertical autoscaling controllers are modifying pod resource requests for the same workloads. Running multiple controllers that attempt to adjust pod resource allocation can lead to unstable behavior, oscillation, or conflicting decisions.
See Understanding Vertical and Horizontal Scaling Modes for more details.
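Because vertical rightsizing applies new resource requests through the periodic restarts noted above, a PodDisruptionBudget can cap how many replicas are unavailable at once while pods are recreated. A generic sketch - the name and label selector below are hypothetical placeholders for your own workload:

```yaml
# Generic Kubernetes PodDisruptionBudget - names and labels are illustrative.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-app-pdb              # hypothetical name
spec:
  minAvailable: 1               # keep at least one replica serving during restarts
  selector:
    matchLabels:
      app: my-app               # hypothetical label; match your workload's pods
```

A budget like this lets the workload absorb rightsizing restarts without dropping below the availability floor you choose.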