Thoras is designed to complement your existing Kubernetes autoscaling infrastructure, not replace it. This guide explains how Thoras integrates with KEDA, Cluster Autoscaler, and Karpenter. For HPA-specific setup and configuration, see the Predictive HPA guide.

How Thoras finds your HPA

The AIScaleTarget references a Deployment directly. It does not reference an HPA or KEDA ScaledObject. Thoras automatically discovers the HPA that targets the same Deployment by looking up HPAs in the same namespace and matching on scaleTargetRef. This means you do not need to change your existing HPA or KEDA configuration when adding Thoras. Thoras finds it, reads its configured metrics to know what to collect, and incorporates its desired replica count into scaling decisions, all without modifying the HPA itself.
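
The matching rule described above can be sketched in a few lines. This is a minimal illustration, not Thoras's actual implementation: it works on plain dictionaries shaped like HorizontalPodAutoscaler manifests rather than calling the Kubernetes API, and the function name `find_hpa_for_deployment` and the sample objects are hypothetical.

```python
from typing import Optional


def find_hpa_for_deployment(hpas: list[dict], namespace: str, deployment: str) -> Optional[dict]:
    """Return the first HPA in `namespace` whose scaleTargetRef points at `deployment`."""
    for hpa in hpas:
        # Only HPAs in the same namespace as the Deployment are candidates.
        if hpa.get("metadata", {}).get("namespace") != namespace:
            continue
        ref = hpa.get("spec", {}).get("scaleTargetRef", {})
        if ref.get("kind") == "Deployment" and ref.get("name") == deployment:
            return hpa
    return None


# Hypothetical sample data: one hand-written HPA and one KEDA-managed HPA.
hpas = [
    {"metadata": {"name": "web-hpa", "namespace": "prod"},
     "spec": {"scaleTargetRef": {"apiVersion": "apps/v1", "kind": "Deployment", "name": "web"}}},
    {"metadata": {"name": "keda-hpa-worker", "namespace": "prod"},
     "spec": {"scaleTargetRef": {"apiVersion": "apps/v1", "kind": "Deployment", "name": "worker"}}},
]

match = find_hpa_for_deployment(hpas, "prod", "worker")
print(match["metadata"]["name"] if match else None)
```

Because the lookup keys only on namespace and `scaleTargetRef`, a KEDA-managed HPA is found the same way as a manually created one.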

KEDA

KEDA (Kubernetes Event-Driven Autoscaling) scales workloads based on external event sources: queue depth, Kafka lag, HTTP request rate, and more.

How they work together

KEDA ScaledObjects create and manage standard Kubernetes HPA objects behind the scenes. Thoras treats KEDA-managed HPAs identically to manually created HPAs. It reads KEDA’s configured metrics and incorporates KEDA’s desired replica count into its own scaling decisions. Two things happen at runtime:
  1. Thoras scales proactively. When a new forecast is ready, Thoras writes its suggested replica count to the AIScaleTarget. The Thoras operator reads both Thoras’s suggestion and the KEDA-managed HPA’s current desired replicas, then scales the workload to max(thoras_suggested, keda_desired).
  2. Thoras acts as a floor for KEDA’s reactive scaling. When KEDA’s HPA submits a scale request, a Thoras admission webhook intercepts it and ensures the final replica count is at least max(keda_proposed, thoras_suggested). This prevents KEDA from scaling below Thoras’s predictive floor during quiet periods.
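
Both runtime paths above reduce to the same rule: take the larger of the two replica counts. The sketch below illustrates only that arithmetic. The function name `target_replicas` and the sample numbers are hypothetical, and details the text does not cover (such as how the HPA's own minReplicas and maxReplicas bounds are applied) are deliberately left out.

```python
def target_replicas(thoras_suggested: int, hpa_desired: int) -> int:
    """Combine the predictive suggestion with the HPA's (or KEDA-managed HPA's) desired count."""
    return max(thoras_suggested, hpa_desired)


# Quiet period: KEDA sees little backlog, but the forecast expects load soon.
quiet = target_replicas(thoras_suggested=6, hpa_desired=2)

# Unexpected spike: KEDA reacts to a backlog the forecast did not anticipate.
spike = target_replicas(thoras_suggested=6, hpa_desired=10)

print(quiet, spike)  # 6 10
```

In the quiet case the Thoras suggestion acts as the floor; in the spike case KEDA's reactive count wins, so neither signal can scale the workload below what the other requires.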

What Thoras never does

Thoras never modifies KEDA ScaledObjects or the HPA objects KEDA manages. KEDA retains full control over its HPA configuration. Thoras only influences the replica count at the moment a scale decision is executed.

Three complementary layers

With KEDA and Thoras together, you get three scaling layers working in parallel: KEDA scales based on current event backlog, HPA reacts to live resource pressure, and Thoras scales ahead of both based on predicted future demand. Each layer handles a different class of signal, and none replaces the others.

Cluster Autoscaler

Cluster Autoscaler adds and removes nodes based on whether pods can be scheduled.

How they work together

Thoras and Cluster Autoscaler operate at different layers and do not communicate directly. Thoras works at the pod level; Cluster Autoscaler works at the node level. The integration is through pod resource requests.

Cluster Autoscaler decides how many nodes to provision based on the resource requests (resources.requests) of pending pods. Thoras keeps those requests accurately sized to actual workload usage through vertical rightsizing. This gives Cluster Autoscaler a more accurate signal:
  • Under-specified requests cause Cluster Autoscaler to underestimate how many nodes are needed, leading to scheduling pressure and delayed scale-out.
  • Over-specified requests waste node capacity and slow down Cluster Autoscaler’s scale-in decisions (nodes appear utilized even when workloads are idle).
Right-sized requests let Cluster Autoscaler provision fewer, better-utilized nodes, reducing infrastructure cost without sacrificing reliability.
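
A rough back-of-the-envelope calculation shows why request sizing matters at the node level. The sketch below is a deliberate simplification, not how Cluster Autoscaler works internally: it considers CPU only, assumes identical pods and identical nodes, and ignores memory, daemonsets, affinity rules, and the scheduling simulation the real autoscaler performs. The function name and all numbers are hypothetical.

```python
import math


def nodes_needed(replicas: int, cpu_request_m: int, node_allocatable_m: int) -> int:
    """Estimate node count when every pod requests `cpu_request_m` millicores."""
    pods_per_node = node_allocatable_m // cpu_request_m
    if pods_per_node == 0:
        raise ValueError("a single pod does not fit on a node")
    return math.ceil(replicas / pods_per_node)


# 40 replicas on nodes with 4000m of allocatable CPU (hypothetical values).
over_specified = nodes_needed(replicas=40, cpu_request_m=1000, node_allocatable_m=4000)
right_sized = nodes_needed(replicas=40, cpu_request_m=400, node_allocatable_m=4000)

print(over_specified, right_sized)  # 10 4
```

With these illustrative numbers, pods that request 1000m but could run with a 400m request would hold ten nodes where four would do.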

Karpenter

Karpenter is a node provisioning tool that bin-packs pods onto nodes and selects optimal instance types automatically.

How they work together

Like Cluster Autoscaler, Karpenter uses pod resource requests to make provisioning decisions. The same principle applies: accurately sized requests from Thoras’s vertical rightsizing give Karpenter better bin-packing signals, enabling tighter consolidation and smaller, more cost-effective node selections.

Karpenter’s consolidation feature (moving pods to fewer nodes) also benefits from right-sized requests. When pods request only what they actually use, Karpenter can consolidate more aggressively without risking resource pressure.

Thoras does not configure or modify Karpenter NodePools or EC2NodeClasses.
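
The effect of request size on bin-packing can be illustrated with a textbook first-fit-decreasing heuristic. This is a stand-in chosen for brevity, not Karpenter's actual scheduling or consolidation algorithm, which also weighs instance types, pricing, and many scheduling constraints. The function name, the single fixed node size, and the request values are all hypothetical.

```python
def first_fit_decreasing(requests_m: list[int], node_capacity_m: int) -> int:
    """Pack CPU requests (millicores) onto equal-sized nodes; return nodes used."""
    free = []  # remaining capacity of each node opened so far
    for req in sorted(requests_m, reverse=True):
        if req > node_capacity_m:
            raise ValueError("request larger than node capacity")
        for i, remaining in enumerate(free):
            if req <= remaining:
                free[i] = remaining - req  # place the pod on the first node it fits
                break
        else:
            free.append(node_capacity_m - req)  # no existing node fits; open a new one
    return len(free)


# The same six workloads, before and after rightsizing (hypothetical numbers).
over_specified = [2000, 2000, 1500, 1500, 1000, 1000]
right_sized = [900, 800, 700, 600, 500, 400]

print(first_fit_decreasing(over_specified, 4000))  # 3
print(first_fit_decreasing(right_sized, 4000))     # 1
```

In this toy example the right-sized requests fit on a single 4000m node, while the inflated requests for the same workloads need three.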

Summary

| Tool | Integration type | Thoras role |
| --- | --- | --- |
| HPA | Direct | Reads HPA metrics; intercepts scale requests; never modifies HPA |
| KEDA | Via HPA | Same as HPA; KEDA-managed HPAs treated identically |
| Cluster Autoscaler | Indirect | Right-sized pod requests improve node provisioning signals |
| Karpenter | Indirect | Right-sized pod requests improve bin-packing and consolidation |