Overview

Action Metrics

See Thoras in action. These metrics provide insight into how Thoras actively manages your application resources across your environment. They track when Thoras takes autonomous action to scale horizontally or vertically, when replicas or resource allocations differ from Thoras’ desired values, and how each AIScaleTarget is currently operating. Together, these metrics give teams the visibility they need to understand scaling behavior, make informed decisions, and maintain efficient, well-performing applications.
| Metric Name | Type | Description | Labels |
| --- | --- | --- | --- |
| thoras_horizontal_scale_total | Counter | Counts horizontal scaling actions in autonomous mode | ai_scale_target, scale_metric, namespace |
| thoras_vertical_scale_total | Counter | Counts vertical scaling adjustments in autonomous mode | ai_scale_target, namespace |
| thoras_recommendation | Gauge | Current scaling recommendation per AIScaleTarget | ai_scale_target, resource, namespace, container, unit |
| thoras_scale_targets | Gauge | The count of AIScaleTargets in various operational modes (autonomous or recommendation) | vertical, horizontal |
| thoras_provisioning_ratio | Gauge | Ratio of current resource allocation to forecasted/recommended value. Values > 1.0 indicate over-provisioning, values < 1.0 indicate under-provisioning. | ai_scale_target, namespace, scaling_mode, resource, container, mode |
| thoras_provisioning_delta | Gauge | Absolute difference between current and recommended/forecasted values. Units depend on resource type (bytes for memory, cores for CPU) | ai_scale_target, namespace, scaling_mode, resource, container, mode |

System Health Metrics

Observe the state of Thoras’ system health. These metrics offer a window into the health and performance of the Thoras platform. They can also be integrated into popular observability tools such as Datadog and Grafana, so your team can monitor Thoras wherever you already track system performance.
| Metric Name | Type | Description | Labels |
| --- | --- | --- | --- |
| thoras_api_http_response_total | Counter | Tracks total HTTP responses from internal API; spikes may indicate issues | path, method, code |
| thoras_api_http_request_duration_seconds | Gauge | Captures average internal API response time; highlights potential slowdowns | path, method |
| thoras_forecast_queue_pending_count | Gauge | Number of pending forecast jobs currently in the queue | |
| thoras_forecast_queue_pending_oldest_duration_seconds | Gauge | Duration in seconds that the oldest forecast job has been pending in the queue | |
| thoras_forecast_queue_status_count | Counter | Number of forecast jobs by status | status |
| thoras_job_completions_total | Counter | Total number of completed jobs by status and kind | status, kind |
| thoras_job_duration_seconds | Gauge | Duration of jobs in seconds by status and kind | status, kind |

Advanced Usage

Provisioning Ratio

Metric: thoras_provisioning_ratio
The provisioning ratio compares your current resource allocation to Thoras’ forecasted or recommended values.
Interpretation:
  • ratio > 1.0: the current value exceeds the recommendation (over-provisioning)
  • ratio < 1.0: the recommendation exceeds the current value (under-provisioning)
Note: Recommendations are based on the predicted maximum usage over the forecast window. The utilization ratio decreases when preemptive up-scaling occurs.
Calculation:
  • Vertical scaling: current_avg_request / recommended_request
  • Horizontal scaling: current_total_usage / forecasted_value
Example Queries & Dashboards:
# Find horizontally scaled workloads over-provisioned by more than 20%
thoras_provisioning_ratio{scaling_mode="horizontal"} > 1.2

# Find vertically scaled workloads under-provisioned by more than 10%
thoras_provisioning_ratio{scaling_mode="vertical"} < 0.9

# View provisioning ratio for a specific AIScaleTarget
thoras_provisioning_ratio{ai_scale_target="cartservice"}

# Count of AIScaleTargets by provisioning state
count (thoras_provisioning_ratio > 1.2)  # Over-provisioned
count (thoras_provisioning_ratio < 0.9)  # Under-provisioned
count (thoras_provisioning_ratio >= 0.9 and thoras_provisioning_ratio <= 1.2)  # Within the 0.9 to 1.2 range
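
As a further sketch, the ratio can be turned into a percentage and ranked. This assumes the ratio is defined as in the Calculation above; the cutoff of 5 is an arbitrary illustration, not a Thoras default.

```promql
# Percentage by which each series is over-provisioned
# (negative values mean under-provisioned)
(thoras_provisioning_ratio - 1) * 100

# The five most over-provisioned vertically scaled series (5 is an arbitrary cutoff)
topk(5, thoras_provisioning_ratio{scaling_mode="vertical"})
```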

Provisioning Delta

Metric: thoras_provisioning_delta
The provisioning delta shows the absolute difference between your current allocation and Thoras recommendations.
Units:
  • Memory: bytes
  • CPU: cores
Calculation:
  • abs(current_value - recommended_value)
Example Queries & Dashboards:
# Find memory deltas exceeding 1 GiB (the metrics table lists no unit label on this metric)
thoras_provisioning_delta{resource="memory"} > 1073741824

# View CPU delta for a specific workload
thoras_provisioning_delta{ai_scale_target="cartservice", resource="cpu"}

# Memory waste from over-provisioning (in GiB) - combines ratio + delta
# `and` keeps only the delta series whose ratio is above 1, without multiplying the values
(thoras_provisioning_delta{resource="memory"}
  and
  (thoras_provisioning_ratio{resource="memory"} > 1)) / 1073741824
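
Because the delta is an absolute value, it does not say on its own whether a workload is over- or under-provisioned. One possible way to total the over-provisioned memory per namespace is sketched below. It assumes that thoras_provisioning_delta and thoras_provisioning_ratio carry identical label sets, as the metrics table suggests, so that `and` can pair the series one to one.

```promql
# Total over-provisioned memory per namespace, in GiB
# (assumes delta and ratio series share the same labels)
sum by (namespace) (
  thoras_provisioning_delta{resource="memory"}
  and
  (thoras_provisioning_ratio{resource="memory"} > 1)
) / 1073741824
```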

Horizontal Scale Total

Metric: thoras_horizontal_scale_total
Tracks the total number of horizontal scaling actions Thoras has performed in autonomous mode. Each increment represents a scaling event where Thoras adjusted replica counts.
Example Queries & Dashboards:
# Total horizontal scaling actions across all targets
sum(thoras_horizontal_scale_total)

# Scaling actions per AIScaleTarget
sum by (ai_scale_target) (thoras_horizontal_scale_total)

# Scaling actions per namespace
sum by (namespace) (thoras_horizontal_scale_total)

# Per-second rate of scaling actions, averaged over the last hour
rate(thoras_horizontal_scale_total[1h])
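
rate() returns a per-second average, which is usually a small fraction for infrequent events such as scaling actions. The sketch below uses increase() to get an approximate count of actions per window instead. The threshold of 6 is a hypothetical value for spotting frequent rescaling, not a Thoras recommendation.

```promql
# Approximate number of horizontal scaling actions per AIScaleTarget in the last hour
sum by (ai_scale_target, namespace) (increase(thoras_horizontal_scale_total[1h]))

# Targets that scaled more than 6 times in the last hour (hypothetical threshold)
sum by (ai_scale_target, namespace) (increase(thoras_horizontal_scale_total[1h])) > 6
```

increase() extrapolates across the window, so the result can be fractional even though the underlying counter only moves in whole steps.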

Vertical Scale Total

Metric: thoras_vertical_scale_total
Tracks the total number of vertical scaling actions Thoras has performed in autonomous mode. Each increment represents a scaling event where Thoras adjusted resource requests or limits.
Example Queries & Dashboards:
# Total vertical scaling actions across all targets
sum(thoras_vertical_scale_total)

# Scaling actions per AIScaleTarget
sum by (ai_scale_target) (thoras_vertical_scale_total)

# Scaling actions per namespace
sum by (namespace) (thoras_vertical_scale_total)

# Per-second rate of scaling actions, averaged over the last hour
rate(thoras_vertical_scale_total[1h])

# Compare horizontal vs vertical scaling activity
sum(thoras_horizontal_scale_total) / sum(thoras_vertical_scale_total)
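
The comparison above uses lifetime counter values, which restart from zero whenever the exporting process restarts. A windowed variant, shown here as a sketch with an arbitrary 24-hour window, may be more stable:

```promql
# Horizontal vs vertical scaling activity over the last 24 hours
sum(increase(thoras_horizontal_scale_total[24h]))
  / sum(increase(thoras_vertical_scale_total[24h]))
```

The result is empty or non-finite when no vertical scaling actions are recorded in the window.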

Recommendation

Metric: thoras_recommendation
Shows Thoras’ current scaling recommendation for each AIScaleTarget. The value and unit depend on the resource type being recommended.
Units:
  • Memory: bytes
  • CPU: cores
  • Replicas: count
Example Queries & Dashboards:
# Current memory recommendations (in GiB)
thoras_recommendation{resource="memory", unit="bytes"} / 1073741824

# Current CPU recommendations
thoras_recommendation{resource="cpu"}

# Recommendations for a specific AIScaleTarget
thoras_recommendation{ai_scale_target="cartservice"}

# Compare recommendations across containers in a workload
# (`by` needs an aggregation operator; max is one reasonable choice)
max by (container, resource) (thoras_recommendation{ai_scale_target="frontend"})
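
Since the recommendation is a gauge, its movement over time can be inspected with standard PromQL gauge functions. A sketch, with the 6-hour and 24-hour windows chosen arbitrarily:

```promql
# How many times each CPU recommendation changed value in the last 6 hours
changes(thoras_recommendation{resource="cpu"}[6h])

# Net change in each memory recommendation over the last 24 hours (in GiB)
delta(thoras_recommendation{resource="memory", unit="bytes"}[24h]) / 1073741824
```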

Scale Targets

Metric: thoras_scale_targets
Reports the count of AIScaleTargets currently managed by Thoras, broken down by operational mode (autonomous or recommendation) and scaling dimension (vertical or horizontal).
Example Queries & Dashboards:
# Total AIScaleTargets managed by Thoras
sum(thoras_scale_targets)

# Count of vertical scaling targets
thoras_scale_targets{vertical="true"}

# Count of horizontal scaling targets
thoras_scale_targets{horizontal="true"}

# Targets with both vertical and horizontal scaling enabled
thoras_scale_targets{vertical="true", horizontal="true"}

API HTTP Response Total

Metric: thoras_api_http_response_total
Tracks the total number of HTTP responses from Thoras’ internal API. This counter increments for every API response and can help identify unusual traffic patterns or issues.
Example Queries & Dashboards:
# Total API responses
sum(thoras_api_http_response_total)

# Response rate over the last 5 minutes
rate(thoras_api_http_response_total[5m])

# Responses by status code
sum by (code) (thoras_api_http_response_total)

# Error rate (4xx and 5xx responses)
sum(rate(thoras_api_http_response_total{code=~"4..|5.."}[5m]))

# Requests by endpoint
sum by (path, method) (thoras_api_http_response_total)

# Success rate percentage
(sum(rate(thoras_api_http_response_total{code=~"2.."}[5m]))
  / sum(rate(thoras_api_http_response_total[5m]))) * 100
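
To see which endpoints contribute to errors, the same kind of ratio can be grouped by path. The sketch below treats only 5xx codes as server errors, which is an assumption about how you want to classify failures:

```promql
# Percentage of responses per endpoint that were 5xx over the last 5 minutes
(sum by (path) (rate(thoras_api_http_response_total{code=~"5.."}[5m]))
  / sum by (path) (rate(thoras_api_http_response_total[5m]))) * 100
```

Endpoints that returned no 5xx responses in the window do not appear in the result, because the numerator has no series for them.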

API HTTP Request Duration

Metric: thoras_api_http_request_duration_seconds
Measures the average response time for Thoras’ internal API requests in seconds. Rising values may indicate performance degradation or capacity issues.
Example Queries & Dashboards:
# Response time by endpoint
avg by (path) (thoras_api_http_request_duration_seconds)

# Average response time over the last hour
avg_over_time(thoras_api_http_request_duration_seconds[1h])

Forecast Queue Pending Count

Metric: thoras_forecast_queue_pending_count
Shows the current number of forecast jobs waiting in the queue. High values indicate that forecast requests are backing up, which may signal worker capacity issues or processing bottlenecks.
Example Queries & Dashboards:
# Current pending job count
thoras_forecast_queue_pending_count

# Maximum pending count over the last hour
max_over_time(thoras_forecast_queue_pending_count[1h])

# Average pending count over the last 15 minutes
avg_over_time(thoras_forecast_queue_pending_count[15m])

# Alert on high queue depth (> 10 jobs)
thoras_forecast_queue_pending_count > 10
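
A single sample above the threshold may just be a short burst. Two sketches for detecting a sustained or growing backlog follow. The 10-job threshold mirrors the example above, and the 10- and 15-minute windows are arbitrary:

```promql
# Queue depth stayed above 10 for the entire last 10 minutes
min_over_time(thoras_forecast_queue_pending_count[10m]) > 10

# Queue depth is trending upward over the last 15 minutes
deriv(thoras_forecast_queue_pending_count[15m]) > 0
```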

Forecast Queue Pending Oldest

Metric: thoras_forecast_queue_pending_oldest_duration_seconds
Tracks how long the oldest forecast job has been waiting in the queue. Rising values indicate jobs are not being processed quickly enough.
Example Queries & Dashboards:
# Current oldest job wait time
thoras_forecast_queue_pending_oldest_duration_seconds

# Maximum wait time over the last day
max_over_time(thoras_forecast_queue_pending_oldest_duration_seconds[24h])

# Average wait time over the last day
avg_over_time(thoras_forecast_queue_pending_oldest_duration_seconds[24h])

Forecast Queue Status Count

Metric: thoras_forecast_queue_status_count
Tracks the total number of forecast jobs processed, categorized by status. This counter helps monitor forecast job lifecycle and identify issues with job processing.
Example Queries & Dashboards:
# Total forecast jobs by status
sum by (status) (thoras_forecast_queue_status_count)

# Rate of forecast jobs by status over the last day
rate(thoras_forecast_queue_status_count[24h])

# Failed forecast jobs
sum(thoras_forecast_queue_status_count{status="failed"})

# Success rate percentage
(sum(rate(thoras_forecast_queue_status_count{status="completed"}[24h]))
  / sum(rate(thoras_forecast_queue_status_count[24h]))) * 100

# Total forecast jobs processed
sum(thoras_forecast_queue_status_count)

Job Completions Total

Metric: thoras_job_completions_total
Tracks the total number of completed jobs, categorized by status (success, failure, etc.) and job kind. This counter helps identify job failure rates and patterns across different job types.
Example Queries & Dashboards:
# Total job completions
sum(thoras_job_completions_total)

# Job completions by status
sum by (status) (thoras_job_completions_total)

# Job completions by kind
sum by (kind) (thoras_job_completions_total)

# Failed jobs only
sum(thoras_job_completions_total{status="failure"})

# Per-second job completion rate, averaged over the last hour
rate(thoras_job_completions_total[1h])

# Failure rate percentage
(sum(rate(thoras_job_completions_total{status="failure"}[24h]))
  / sum(rate(thoras_job_completions_total[24h]))) * 100
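
An overall failure rate can hide one job kind that fails often. A sketch that breaks the same calculation down by kind, assuming the status="failure" label value used in the examples above:

```promql
# Failure rate percentage per job kind over the last 24 hours
(sum by (kind) (rate(thoras_job_completions_total{status="failure"}[24h]))
  / sum by (kind) (rate(thoras_job_completions_total[24h]))) * 100
```

Kinds with no recorded failures in the window are absent from the result rather than shown as 0.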

Job Duration Seconds

Metric: thoras_job_duration_seconds
Measures the duration of completed jobs in seconds, broken down by status and job kind. Use this to identify slow jobs or performance regressions.
Example Queries & Dashboards:

# Average duration by job kind
avg by (kind) (thoras_job_duration_seconds)

# Jobs taking longer than 30 seconds
thoras_job_duration_seconds > 30
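
A gauge exposes only its latest sample at query time, so a slow job can be missed between evaluations. A sketch that looks back over a window instead (the 1-hour window is arbitrary, and the 30-second threshold mirrors the example above):

```promql
# Longest observed duration per job kind in the last hour
max by (kind) (max_over_time(thoras_job_duration_seconds[1h]))

# Kinds whose longest job in the last hour exceeded 30 seconds
max by (kind) (max_over_time(thoras_job_duration_seconds[1h])) > 30
```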