Metrics and Monitoring
Thoras exposes native Prometheus metrics that offer real-time visibility into the actions and health of the Thoras system. This empowers your teams with deeper insights, transparency, and observability.
Action Metrics
See Thoras in action. These metrics provide insight into how Thoras actively manages your application resources across your environment. They track when Thoras takes autonomous action to scale horizontally or vertically, when replicas or resource allocations differ from Thoras’ desired values, and how each AIScaleTarget is currently operating.
Together, these metrics give teams the visibility they need to understand scaling behavior, make informed decisions, and maintain efficient, well-performing applications.
Metric Name | Type | Description | Labels |
---|---|---|---|
thoras_horizontal_scale_total | Counter | Counts horizontal scaling actions in autonomous mode | ai_scale_target , scale_metric , namespace |
thoras_vertical_scale_total | Counter | Counts vertical scaling adjustments in autonomous mode | ai_scale_target , namespace |
thoras_recommendation | Gauge | Current scaling recommendation per AIScaleTarget | ai_scale_target , resource , namespace , container , unit |
thoras_scale_targets | Gauge | Number of AIScaleTargets in each operational mode (autonomous or recommendation) | vertical , horizontal |
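As a rough illustration of how the counters above might be consumed, the following Python sketch parses a small sample in the Prometheus text exposition format and sums `thoras_horizontal_scale_total` per namespace and AIScaleTarget. The label values and counts in the sample are invented for the example, and the parser is deliberately minimal: it only handles labelled sample lines and is not a full implementation of the exposition format.

```python
import re
from collections import defaultdict

# Illustrative sample in the Prometheus text exposition format.
# Label values and counts are made up for this sketch.
SAMPLE = '''\
# TYPE thoras_horizontal_scale_total counter
thoras_horizontal_scale_total{ai_scale_target="checkout",scale_metric="cpu",namespace="shop"} 4
thoras_horizontal_scale_total{ai_scale_target="checkout",scale_metric="memory",namespace="shop"} 1
thoras_horizontal_scale_total{ai_scale_target="search",scale_metric="cpu",namespace="shop"} 2
thoras_vertical_scale_total{ai_scale_target="search",namespace="shop"} 3
'''

LINE_RE = re.compile(r'^(\w+)\{(.*)\}\s+(\S+)$')
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def parse_samples(text):
    """Yield (metric_name, labels, value) for each labelled sample line."""
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith('#'):
            continue  # skip blank lines and HELP/TYPE comments
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip lines this simple parser does not understand
        name, raw_labels, value = match.groups()
        yield name, dict(LABEL_RE.findall(raw_labels)), float(value)


def scale_actions_by_target(text, metric):
    """Sum a counter across all other labels, keyed by (namespace, target)."""
    totals = defaultdict(float)
    for name, labels, value in parse_samples(text):
        if name == metric:
            key = (labels.get("namespace"), labels.get("ai_scale_target"))
            totals[key] += value
    return dict(totals)


horizontal = scale_actions_by_target(SAMPLE, "thoras_horizontal_scale_total")
for (namespace, target), count in sorted(horizontal.items()):
    print(f"{namespace}/{target}: {count:.0f} horizontal scaling actions")
```

Because these are counters, the totals only ever increase between restarts. In practice you would usually compare two scrapes, or let your monitoring tool compute a rate over a time window, rather than read the raw value.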
System Health Metrics
Observe the state of Thoras’ system health. These metrics offer a window into the health and performance of the Thoras platform. They can also be integrated into popular observability tools such as Datadog and Grafana, giving your team the flexibility to monitor Thoras wherever you already track system performance.
Metric Name | Type | Description | Labels |
---|---|---|---|
thoras_api_http_response_total | Counter | Tracks total HTTP responses from internal API; spikes may indicate issues | path , method , code |
thoras_api_http_request_duration_seconds | Gauge | Captures average internal API response time; highlights potential slowdowns | path , method |
thoras_forecast_queue_seconds | Gauge | Reflects current max forecast queue wait time; signals worker or capacity issues |
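One way to turn the health metrics above into simple checks is sketched below in Python. It assumes the `code` label holds the HTTP status code as a string, and it uses a hypothetical path, hypothetical sample counts, and a hypothetical 300-second queue threshold. None of these values come from Thoras itself, so adjust them to your environment.

```python
# Hypothetical threshold for this sketch; pick a value that suits your workload.
QUEUE_WAIT_LIMIT_SECONDS = 300.0


def error_ratio(responses):
    """Return the share of 5xx responses.

    `responses` is a list of (labels, count) pairs taken from
    thoras_api_http_response_total, where labels["code"] is assumed
    to be the HTTP status code as a string.
    """
    total = sum(count for _, count in responses)
    errors = sum(
        count for labels, count in responses if labels["code"].startswith("5")
    )
    return errors / total if total else 0.0


def queue_is_backed_up(queue_seconds, limit=QUEUE_WAIT_LIMIT_SECONDS):
    """Return True when thoras_forecast_queue_seconds exceeds the limit."""
    return queue_seconds > limit


# Illustrative values only; the path and counts are made up.
responses = [
    ({"path": "/example", "method": "GET", "code": "200"}, 90),
    ({"path": "/example", "method": "GET", "code": "500"}, 10),
]

ratio = error_ratio(responses)
print(f"5xx ratio: {ratio:.1%}")
print(f"queue backed up: {queue_is_backed_up(42.0)}")
```

Since `thoras_api_http_response_total` is cumulative, a production alert would normally apply the same ratio to the increase over a recent window rather than to lifetime totals.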