Thoras forecasts future resource usage and scaling needs based on historical patterns. This reference documents how forecasting works and the key concepts involved.

How Forecasting Works

Thoras continuously collects workload metrics and uses these time series to generate forecasts. Each forecast predicts future resource usage within a forecast block. When a target is in autonomous mode and the delta between forecasted and current usage exceeds the configured thresholds, Thoras proactively scales the target to the predicted maximum within the forecast block, plus a small buffer. Thoras generates forecasts for two scaling dimensions.

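To make this concrete, here is a minimal sketch (Python, not Thoras source code) of how a forecast block could drive a proactive scaling decision. The names, the 10% delta threshold, and the 5% buffer are illustrative assumptions; the real thresholds come from your configuration.

from dataclasses import dataclass

@dataclass
class ForecastBlock:
    # Predicted usage samples covering the upcoming forecast block.
    predicted_usage: list[float]

def proposed_target(current_usage: float,
                    block: ForecastBlock,
                    delta_threshold: float = 0.10,  # assumed 10% trigger, illustrative
                    buffer: float = 0.05) -> float | None:  # assumed 5% buffer, illustrative
    """Return a new scaling target, or None if the forecast does not warrant action."""
    predicted_max = max(block.predicted_usage)
    delta = abs(predicted_max - current_usage) / max(current_usage, 1e-9)
    if delta <= delta_threshold:
        return None  # forecast is close to current usage; leave the target alone
    # Scale to the predicted maximum within the forecast block, plus a small buffer.
    return predicted_max * (1 + buffer)

# Example: the forecast peaks well above current usage, so a scale-up is proposed.
print(proposed_target(current_usage=0.4, block=ForecastBlock([0.45, 0.7, 0.65])))  # roughly 0.735
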
Forecast Configuration

Forecasting behavior is configured in the model section of an AIScaleTarget. See the AIScaleTarget Definition for the available configuration options.

Model Training

Models continuously adapt to a workload’s usage patterns, becoming better tuned to that workload over time.

Training Period

  • Newly Enrolled AIScaleTarget: Models begin learning a workload’s usage patterns as soon as initial metrics are collected. Early forecasts are based on a small sample of usage data and improve as the sample grows. An AIScaleTarget requires 3 hours of metrics before it is eligible for auto-scaling.
  • Minimum: Allow at least 24 hours in recommendation mode before switching to autonomous mode.
  • Optimal: 2 weeks of data provides the best accuracy, especially for workloads with weekly patterns. (These milestones are summarized in the sketch after this list.)
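
The sketch below (illustrative only, not Thoras source code) maps how much metric history a target has collected onto these milestones; the function name and messages are assumptions.

from datetime import timedelta

def training_status(metrics_history: timedelta) -> str:
    """Map the amount of collected metric history to the training milestones above."""
    if metrics_history < timedelta(hours=3):
        return "collecting: not yet eligible for auto-scaling"
    if metrics_history < timedelta(hours=24):
        return "eligible: keep the target in recommendation mode"
    if metrics_history < timedelta(weeks=2):
        return "ready: autonomous mode possible, accuracy still improving"
    return "optimal: enough history to capture weekly patterns"

print(training_status(timedelta(hours=30)))  # ready: autonomous mode possible, ...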

Data Requirements

For accurate forecasts, Thoras needs:
  • Consistent metric collection from workloads
  • Representative traffic patterns (including peak and off-peak periods)

Container Startup Resource Spikes

Software often experiences temporary spikes in CPU and memory usage during initialization. These spikes can significantly exceed the container’s steady-state operating resource consumption.

How Forecasting Handles This

Thoras automatically detects newly started containers and, by default, excludes their metrics from forecasts, giving each container time to reach steady-state operation (see the sketch after the list below). The forecaster then trains on adjusted metrics that represent your workload’s actual operational resource usage, resulting in:
  • More accurate resource predictions and reduced thrashing
  • Right-sized resource allocations
  • Reduced infrastructure costs
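
As a minimal sketch of the idea (assuming a fixed warm-up window; the actual detection of newly started containers is handled by Thoras and not shown here), samples collected shortly after a container starts are dropped before model training:

from datetime import datetime, timedelta
from typing import NamedTuple

class Sample(NamedTuple):
    container_started_at: datetime
    collected_at: datetime
    cpu_usage: float

def adjusted_samples(samples: list[Sample],
                     warmup: timedelta = timedelta(minutes=5)) -> list[Sample]:
    """Keep only samples collected after the container's assumed warm-up window."""
    return [s for s in samples if s.collected_at - s.container_started_at >= warmup]

start = datetime(2024, 1, 1, 12, 0)
raw = [
    Sample(start, start + timedelta(minutes=1), 0.9),   # startup spike, excluded
    Sample(start, start + timedelta(minutes=10), 0.2),  # steady state, kept
]
print(adjusted_samples(raw))  # only the steady-state sample remains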

Viewing Startup Spikes

In the dashboard, you can toggle “Show restarts” on the Actual vs Predicted Usage chart to see the difference between:
  • Adjusted usage (default) - Resource usage with startup spikes filtered out. This adjusted data is what the forecaster uses to make predictions.
  • Actual usage - Raw resource usage including all startup spikes.
This helps visualize how forecasting excludes startup spikes.

Opting Out Of Excluding Startup Metrics From Forecasts

You may opt out of excluding metrics for newly started containers from forecasting. Note that opting out may increase the volatility of both usage and forecasts, especially for low-usage workloads. To include startup metrics in forecast model training, set ignoreNewPods to false in your Helm values:
forecaster:
  # true (default): exclude metrics for newly started containers from model training.
  # false: include metrics for newly started containers in model training.
  ignoreNewPods: false