Thoras forecasts future demand for your workload and recommends resource levels accordingly. You can choose an optimization strategy that reflects your operational priorities. All strategies forecast demand and remain peak-aware. The difference is how closely recommendations track usage versus how much additional margin they maintain.

Thoras modes span a spectrum from cost-optimized to assurance-focused:

undershoot, cost_optimized, balanced, assurance_optimized, overshoot

Because Thoras is forecasting future demand, no mode eliminates forecast misses entirely. More aggressive modes are more likely to run closer to real demand, while more conservative modes maintain more room above it.

Quick Reference

Strategy | Approach | Recommended For
undershoot | Allows recommendations to run below usage more often | Non-critical or resilient jobs
cost_optimized | Tracks tightly with minimal headroom | Async, retry-safe workloads
balanced (default) | Tracks usage with balanced efficiency and assurance | General production workloads
assurance_optimized | Tracks above usage with moderate headroom | User-facing or consistency-sensitive workloads
overshoot | Maintains elevated headroom based on historical usage | Critical workloads requiring additional consistency

Mode Details

balanced (default)

  • Balances efficiency and assurance with comfortable headroom
  • Provides strong coverage against historically observed demand volatility
  • Choose this for production workloads where you want both optimization and consistency

cost_optimized

  • Uses tighter recommendations to reduce unused capacity
  • May allow occasional brief resource pressure during rare demand spikes
  • Best for workloads that tolerate moderate variability: background jobs, async services, retry-safe systems

undershoot

  • Most aggressive low-headroom strategy
  • Intentionally allows recommendations to run below observed demand more often
  • Best for non-critical workloads, batch processing, or systems with strong retry logic and fallback behavior

assurance_optimized

  • Tracks above typical usage patterns while remaining responsive to changes
  • Stays close to usage for steady workloads, increasing headroom when variability rises
  • Best for workloads where consistency is important without significantly increasing resource usage

overshoot

  • Maintains consistent headroom based on historical usage levels
  • Less responsive to short-term changes in usage patterns
  • Best for workloads where maintaining additional headroom is preferred over tighter recommendations

If no mode is specified, the default is mode: balanced.
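
The ordering of the modes and the fallback to balanced can be captured in a few lines of Python. This is only an illustrative sketch: `MODE_SPECTRUM`, `resolve_mode`, and `is_more_conservative` are hypothetical names for this example and are not part of Thoras. The ordering is taken from the spectrum listed at the top of this page.

```python
# Modes in the order this page lists them, from least to most headroom.
MODE_SPECTRUM = [
    "undershoot",
    "cost_optimized",
    "balanced",
    "assurance_optimized",
    "overshoot",
]
DEFAULT_MODE = "balanced"  # documented default when no mode is set


def resolve_mode(configured=None):
    """Return the effective mode, falling back to the documented default."""
    mode = DEFAULT_MODE if configured is None else configured
    if mode not in MODE_SPECTRUM:
        raise ValueError(f"unknown mode: {mode!r}")
    return mode


def is_more_conservative(a, b):
    """True if mode `a` sits further toward the overshoot end than mode `b`."""
    return MODE_SPECTRUM.index(resolve_mode(a)) > MODE_SPECTRUM.index(resolve_mode(b))
```

For example, `is_more_conservative("assurance_optimized", None)` compares against the default and returns `True`, because assurance_optimized sits above balanced on the spectrum.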

Configuring Forecast Mode

Set the optimization strategy in your AIScaleTarget:

spec:
  model:
    mode: assurance_optimized

You can also configure different modes per metric. See the model.metrics reference for details.

Fine-Tuning Forecast Behavior

After choosing a forecast optimization strategy, you can further tune forecast behavior using additional settings. These are best used for workload-specific adjustments after selecting your primary mode.

forecast_buffer_percentage

Adds a fixed percentage on top of the forecasted recommendation. Use this when:
  • The forecast is generally reasonable, but you want it to run slightly higher or lower overall
Reference: model.forecast_buffer_percentage
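
To make the arithmetic concrete, the sketch below models a percentage buffer as a simple multiplicative adjustment. It is an assumption-laden illustration, not Thoras's implementation: the function name `apply_buffer` is hypothetical, the exact formula Thoras uses is not stated on this page, and whether negative values are accepted should be confirmed in the model.forecast_buffer_percentage reference.

```python
def apply_buffer(forecast, buffer_percentage):
    """Scale a forecasted value by a fixed percentage (assumed formula).

    A positive percentage raises the value and a negative one lowers it.
    """
    return forecast * (1 + buffer_percentage / 100)


# Compare a hypothetical 2.0-core forecast under a few buffer settings.
for pct in (0, 10, 25):
    print(f"buffer {pct}% -> {apply_buffer(2.0, pct):.2f} cores")
```

Because the adjustment is proportional, it shifts the whole forecast up or down without changing its shape, which matches the use case described above.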

forecast_interval

Controls how often Thoras generates a new forecast. Use this when:
  • Workload behavior changes quickly and recommendations need to refresh more often
Reference: model.forecast_interval

forecast_blocks

Controls how far into the future each forecast covers. Use this when:
  • You want Thoras to account for demand farther ahead in time
Note: forecast_blocks should be greater than or equal to forecast_interval to ensure each forecast covers the next update window.
Reference: model.forecast_blocks
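
The relationship in the note can be checked with a small helper. This is a sketch under stated assumptions: both settings are represented here as plain minute counts, which may not match the value format Thoras actually accepts, and the function names are hypothetical.

```python
def covers_next_update(forecast_blocks_minutes, forecast_interval_minutes):
    """True if each forecast reaches at least as far as the next refresh."""
    return forecast_blocks_minutes >= forecast_interval_minutes


def uncovered_minutes(forecast_blocks_minutes, forecast_interval_minutes):
    """Minutes before the next refresh that no forecast would cover."""
    return max(0, forecast_interval_minutes - forecast_blocks_minutes)
```

For instance, a 15-minute horizon with a 30-minute refresh would leave 15 minutes uncovered before the next forecast arrives, which is the situation the note warns against.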

Quick Guidance

If you’re deciding what to adjust:
  • Change mode when you want forecasts to behave more aggressively or conservatively
  • Change forecast_buffer_percentage when the forecast shape looks right, but you want to raise or lower the average prediction
  • Change forecast_interval when you want forecasts to update more or less frequently
  • Change forecast_blocks when you want to adjust how far ahead forecasts look (longer horizons are more conservative but may increase overprovisioning)
In most cases, model.mode should be your first tuning lever. The settings above are for smaller workload-specific adjustments.