How Thoras finds your HPA
The AIScaleTarget references a Deployment directly. It does not reference an HPA or KEDA ScaledObject. Thoras automatically discovers the HPA that targets the same Deployment by looking up HPAs in the same namespace and matching on scaleTargetRef.
This means you do not need to change your existing HPA or KEDA configuration
when adding Thoras. Thoras finds it, reads its configured metrics to know what
to collect, and incorporates its desired replica count into scaling decisions,
all without modifying the HPA itself.
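
To make the discovery concrete, here is a minimal sketch of the two objects side by side. The HorizontalPodAutoscaler uses the standard autoscaling/v2 API; the AIScaleTarget’s apiVersion and field names below are illustrative guesses rather than Thoras’s actual CRD schema, and the workload names are invented.

```yaml
# Existing HPA, left unchanged when Thoras is added.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: checkout-hpa
  namespace: shop
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: checkout              # targets the Deployment
  minReplicas: 2
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
---
# AIScaleTarget references the Deployment directly, never the HPA.
# apiVersion and field names are hypothetical; consult the Thoras CRD for the real schema.
apiVersion: thoras.ai/v1
kind: AIScaleTarget
metadata:
  name: checkout
  namespace: shop               # same namespace, so the HPA above is discoverable
spec:
  target:
    kind: Deployment
    name: checkout              # matches the HPA's scaleTargetRef
```

Thoras matches the two because both point at Deployment checkout in namespace shop; neither object needs to name the other.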
KEDA
KEDA (Kubernetes Event-Driven Autoscaling) scales workloads based on external event sources: queue depth, Kafka lag, HTTP request rate, and more.

How they work together

KEDA ScaledObjects create and manage standard Kubernetes HPA objects behind the scenes. Thoras treats KEDA-managed HPAs identically to manually created HPAs: it reads KEDA’s configured metrics and incorporates KEDA’s desired replica count into its own scaling decisions. Two things happen at runtime:

- Thoras scales proactively. When a new forecast is ready, Thoras writes its suggested replica count to the AIScaleTarget. The Thoras operator reads both Thoras’s suggestion and the KEDA-managed HPA’s current desired replicas, then scales the workload to max(thoras_suggested, keda_desired).
- Thoras acts as a floor for KEDA’s reactive scaling. When KEDA’s HPA submits a scale request, a Thoras admission webhook intercepts it and ensures the final replica count is at least max(keda_proposed, thoras_suggested). This prevents KEDA from scaling below Thoras’s predictive floor during quiet periods.
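
As an illustration, here is a KEDA ScaledObject for the same hypothetical Deployment (the trigger type and values are invented). KEDA turns this into a standard HPA behind the scenes, and that HPA is what Thoras discovers and reads, exactly as it would a hand-written one.

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: checkout-scaler
  namespace: shop
spec:
  scaleTargetRef:
    name: checkout              # KEDA creates and manages an HPA targeting this Deployment
  minReplicaCount: 2
  maxReplicaCount: 20
  triggers:
    - type: kafka
      metadata:
        bootstrapServers: kafka.shop.svc:9092
        consumerGroup: checkout
        topic: orders
        lagThreshold: "100"
# At runtime the workload lands on max(thoras_suggested, keda_desired):
# KEDA's lag-driven scaling can push replicas above Thoras's forecast,
# but the admission webhook keeps them from dropping below it.
```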
Cluster Autoscaler
Cluster Autoscaler adds and removes nodes based on whether pods can be scheduled.

How they work together

Thoras and Cluster Autoscaler operate at different layers and do not communicate directly. Thoras works at the pod level; Cluster Autoscaler works at the node level. The integration is through pod resource requests. Cluster Autoscaler decides how many nodes to provision based on the resource requests (resources.requests) of pending pods. Thoras keeps those requests accurately sized to actual workload usage through vertical rightsizing. This gives Cluster Autoscaler a more accurate signal, as the sketch after this list illustrates:
- Under-specified requests cause Cluster Autoscaler to underestimate how many nodes are needed, leading to scheduling pressure and delayed scale-out.
- Over-specified requests waste node capacity and slow down Cluster Autoscaler’s scale-in decisions (nodes appear utilized even when workloads are idle).
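
As a hypothetical illustration (the workload, image, and numbers are invented), here is what a Deployment might look like after Thoras’s vertical rightsizing has trimmed its requests to match observed usage; resources.requests is the only signal Cluster Autoscaler sees when it sizes the cluster.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: checkout
  namespace: shop
spec:
  replicas: 10
  selector:
    matchLabels:
      app: checkout
  template:
    metadata:
      labels:
        app: checkout
    spec:
      containers:
        - name: app
          image: registry.example.com/checkout:1.4.2   # placeholder image
          resources:
            requests:
              cpu: "250m"       # previously 1000m; observed usage around 200m
              memory: "256Mi"   # previously 1Gi; observed usage around 200Mi
```

At the old requests, the 10 replicas reserve 10 CPU cores and 10 GiB of memory regardless of what they actually use, so Cluster Autoscaler keeps nodes around for capacity that is never consumed. At the rightsized values they reserve 2.5 cores and 2.5 GiB, so both scale-out estimates and scale-in decisions track reality.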
Karpenter
Karpenter is a node provisioning tool that bin-packs pods onto nodes and selects optimal instance types automatically.

How they work together

Like Cluster Autoscaler, Karpenter uses pod resource requests to make provisioning decisions. The same principle applies: accurately sized requests from Thoras’s vertical rightsizing give Karpenter better bin-packing signals, enabling tighter consolidation and smaller, more cost-effective node selections. Karpenter’s consolidation feature (moving pods to fewer nodes) also benefits from right-sized requests. When pods request only what they actually use, Karpenter can consolidate more aggressively without risking resource pressure. Thoras does not configure or modify Karpenter NodePools or EC2NodeClasses.
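
For context, here is a minimal NodePool with consolidation enabled, loosely following Karpenter’s v1 API (the name, limits, and requirements are invented, and exact fields vary by Karpenter version). Thoras never creates or edits an object like this; its right-sized pod requests are simply better input to the bin-packing and consolidation the NodePool permits.

```yaml
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: default
spec:
  template:
    spec:
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default           # managed outside Thoras
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["on-demand"]
  disruption:
    # With right-sized requests, "underutilized" reflects real usage,
    # so pods can be repacked onto fewer and smaller nodes sooner.
    consolidationPolicy: WhenEmptyOrUnderutilized
    consolidateAfter: 1m
  limits:
    cpu: "100"
```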
Summary

| Tool | Integration type | Thoras role |
|---|---|---|
| HPA | Direct | Reads HPA metrics; intercepts scale requests; never modifies HPA |
| KEDA | Via HPA | Same as HPA; KEDA-managed HPAs treated identically |
| Cluster Autoscaler | Indirect | Right-sized pod requests improve node provisioning signals |
| Karpenter | Indirect | Right-sized pod requests improve bin-packing and consolidation |