undershoot → cost_optimized → balanced → assurance_optimized → overshoot
Because Thoras is forecasting future demand, no mode eliminates forecast misses
entirely. More aggressive modes are more likely to run closer to real demand,
while more conservative modes maintain more room above it.
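The ordering above can be written down in a few lines of Python. This is an illustrative sketch only: the `MODES` tuple and the `is_more_conservative` helper are hypothetical names introduced here, not part of Thoras.

```python
# Forecast modes ordered from most aggressive (least headroom)
# to most conservative (most headroom), following the spectrum above.
MODES = (
    "undershoot",
    "cost_optimized",
    "balanced",
    "assurance_optimized",
    "overshoot",
)


def is_more_conservative(a: str, b: str) -> bool:
    """Return True if mode `a` sits closer to the conservative end than mode `b`."""
    return MODES.index(a) > MODES.index(b)
```

For example, `is_more_conservative("overshoot", "balanced")` is true, while `is_more_conservative("undershoot", "cost_optimized")` is false.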
Quick Reference
| Strategy | Approach | Recommended For |
|---|---|---|
| undershoot | Allows recommendations to run below usage more often | Non-critical or resilient jobs |
| cost_optimized | Tracks tightly with minimal headroom | Async, retry-safe workloads |
| balanced (default) | Tracks usage with balanced efficiency and assurance | General production workloads |
| assurance_optimized | Tracks above usage with moderate headroom | User-facing or consistency-sensitive workloads |
| overshoot | Maintains elevated headroom based on historical usage | Critical workloads requiring additional consistency |
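One way to judge whether a strategy fits a workload is to compare past recommendations against observed usage. The helper below is a hypothetical sketch, not a Thoras API, and the sample numbers are made up for illustration. It assumes both series are sampled at the same times and that usage values are positive.

```python
def coverage_stats(usage, recommended):
    """Compare observed usage with recommendations sampled at the same times.

    Returns (miss_rate, mean_headroom_pct):
      miss_rate         - fraction of samples where usage exceeded the recommendation
      mean_headroom_pct - average of (recommended - usage) / usage, as a percentage
    """
    if len(usage) != len(recommended) or not usage:
        raise ValueError("series must be non-empty and the same length")
    misses = sum(1 for u, r in zip(usage, recommended) if u > r)
    headroom = [(r - u) / u * 100 for u, r in zip(usage, recommended)]
    return misses / len(usage), sum(headroom) / len(headroom)


# Illustrative CPU samples in millicores (made-up data).
usage = [400, 500, 800, 500]
recommended = [500, 500, 600, 750]
miss_rate, mean_headroom = coverage_stats(usage, recommended)
```

A higher miss rate with low headroom points toward the aggressive end of the table; a near-zero miss rate with large headroom points toward the conservative end.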
Mode Details
balanced (default)
- Balances efficiency and assurance with comfortable headroom
- Provides strong coverage against historically observed demand volatility
- Choose this for production workloads where you want both optimization and consistency
cost_optimized
- Uses tighter recommendations to reduce unused capacity
- May allow occasional brief resource pressure during rare demand spikes
- Best for workloads that tolerate moderate variability: background jobs, async services, retry-safe systems
undershoot
- Most aggressive low-headroom strategy
- Intentionally allows recommendations to run below observed demand more often
- Best for non-critical workloads, batch processing, or systems with strong retry logic and fallback behavior
assurance_optimized
- Tracks above typical usage patterns while remaining responsive to changes
- Stays close to usage for steady workloads, increasing headroom when variability rises
- Best for workloads where consistency is important without significantly increasing resource usage
overshoot
- Maintains consistent headroom based on historical usage levels
- Less responsive to short-term changes in usage patterns
- Best for workloads where maintaining additional headroom is preferred over tighter recommendations
When no mode is specified, the default applies: mode: balanced.
Configuring Forecast Mode
Set the optimization strategy in your AIScaleTarget through the model.mode setting.
Fine-Tuning Forecast Behavior
After choosing a forecast optimization strategy, you can further tune forecast behavior using additional settings. These are best used for workload-specific adjustments after selecting your primary mode.
forecast_buffer_percentage
Adds a fixed percentage on top of the forecasted recommendation.
Use this when:
- The forecast is generally reasonable, but you want it to run slightly higher or lower overall
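The effect of a percentage buffer can be sketched as simple arithmetic. This is an assumption about how the buffer is applied (multiplicatively, with negative values lowering the result), and `apply_buffer` is a hypothetical helper rather than anything Thoras exposes.

```python
def apply_buffer(forecast: float, buffer_percentage: float) -> float:
    """Scale a forecasted value by a fixed percentage.

    A positive percentage raises the value; a negative one lowers it.
    """
    return forecast * (1 + buffer_percentage / 100)


# Illustrative values only: a forecast of 800 millicores.
raised = apply_buffer(800, 25)    # 1000.0
lowered = apply_buffer(800, -25)  # 600.0
```

Because the same percentage is applied to every forecasted value, the forecast keeps its shape and only its overall level shifts.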
forecast_interval
Controls how often Thoras generates a new forecast.
Use this when:
- Workload behavior changes quickly and recommendations need to refresh more often
forecast_blocks
Controls how far into the future each forecast is trying to cover.
Use this when:
- You want Thoras to account for demand farther ahead in time
forecast_blocks should be greater than or equal to forecast_interval to ensure each forecast covers the next update window.
Reference: model.forecast_blocks
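The constraint between the two settings can be expressed as a small pre-flight check. This sketch assumes both values are given in the same unit, which the text above does not specify; `check_forecast_window` is a hypothetical helper, not a Thoras function.

```python
def check_forecast_window(forecast_blocks: int, forecast_interval: int) -> None:
    """Raise ValueError if the forecast horizon is shorter than the refresh interval.

    Both arguments are assumed to be expressed in the same unit.
    """
    if forecast_blocks < forecast_interval:
        raise ValueError(
            f"forecast_blocks ({forecast_blocks}) must be >= "
            f"forecast_interval ({forecast_interval}) so each forecast "
            "covers the next update window"
        )


check_forecast_window(forecast_blocks=4, forecast_interval=4)  # passes: equal is allowed
```

If the horizon were shorter than the interval, part of the time before the next forecast would not be covered by any forecast.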
Quick Guidance
If you’re deciding what to adjust:
- Change mode when you want forecasts to behave more aggressively or conservatively
- Change forecast_buffer_percentage when the forecast shape looks right, but you want to raise or lower the average prediction
- Change forecast_interval when you want forecasts to update more or less frequently
- Change forecast_blocks when you want to adjust how far ahead forecasts look (longer horizons are more conservative but may increase overprovisioning)
model.mode should be your first tuning lever. The settings
above are for smaller workload-specific adjustments.