
Scaling

Adapter pods are stateless. They don't store any session data locally. All configuration lives in the Orchestrator. This means scaling is simple: add more pods, and the Orchestrator hands out more work.

Adding More Adapter Pods

Set the adapter replica count in your values file:

adapter:
  replicaCount: 5

Apply the change:

helm upgrade conncentric ./deployment/charts/conncentric -f my-values.yaml -n conncentric

New pods start up, register with the Orchestrator, and immediately begin picking up sessions that aren't currently running. No manual assignment is needed.

How Work Gets Distributed

When a pod starts, it asks the Orchestrator for sessions to run. The Orchestrator gives out sessions on a first-come, first-served basis.

  • Scale up: New pods immediately start claiming and running sessions
  • Scale down: When a pod is removed, its sessions expire after the lease timeout, then other pods pick them up

There's no concept of "sticky" sessions. Any pod can run any adapter.
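The claim-and-lease cycle above can be sketched as a small simulation. This is illustrative only; the class and method names are hypothetical, not the Conncentric API:

```python
class Orchestrator:
    """Hands out sessions first-come, first-served and tracks leases."""

    def __init__(self, sessions, lease_timeout=5.0):
        self.lease_timeout = lease_timeout
        # session -> (pod, lease expiry time), or None if unclaimed
        self.leases = {s: None for s in sessions}

    def claim(self, pod, now):
        """Give the requesting pod every session whose lease is absent or expired."""
        claimed = []
        for session, lease in self.leases.items():
            if lease is None or lease[1] <= now:
                self.leases[session] = (pod, now + self.lease_timeout)
                claimed.append(session)
        return claimed


orch = Orchestrator(["s1", "s2", "s3"], lease_timeout=5.0)
print(orch.claim("pod-a", now=0))  # first pod claims everything: ['s1', 's2', 's3']
print(orch.claim("pod-b", now=1))  # nothing free yet: []
# pod-a disappears; its leases expire after the timeout
print(orch.claim("pod-b", now=6))  # pod-b picks them up: ['s1', 's2', 's3']
```

The key property the sketch captures: no coordination between pods is needed, because a crashed pod's sessions simply become claimable again once their leases lapse.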

How Many Pods Do You Need?

It depends on your deployment modes:

Deployment Mode    Minimum Pods    Reasoning
Single             1               No redundancy needed
Active / Passive   2+              One pod runs the session; others are standbys
Active / Active    2+              Multiple pods share the same workload

As a rule: run at least as many pods as the number of sessions you want running simultaneously, plus at least one extra for failover capacity.
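That rule can be written as a one-line helper (a hypothetical function for illustration, not part of the chart):

```python
def minimum_adapter_pods(concurrent_sessions: int, spare_pods: int = 1) -> int:
    """Pods needed: one per concurrently running session, plus failover spares."""
    return concurrent_sessions + spare_pods


# e.g. 4 sessions running at once, with one spare for failover:
print(minimum_adapter_pods(4))  # 5
```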

Resource Sizing

Resource usage varies by protocol and message throughput. Start with the defaults and tune based on what you observe in your environment.

The chart defaults (1000m request / 2000m limit CPU, 1Gi request / 2Gi limit memory) are sized for standard workloads. For high-throughput adapters processing large message volumes, consider increasing the limits:

adapter:
  resources:
    requests:
      cpu: "2000m"
      memory: "2Gi"
    limits:
      cpu: "4000m"
      memory: "4Gi"

Watch actual CPU and memory consumption under production load, then adjust requests and limits to match.

Autoscaling

Conncentric exposes platform metrics that can drive Kubernetes Horizontal Pod Autoscaler (HPA) scaling. The platform provides a metric representing the number of enabled adapters, so the cluster scales proactively based on demand rather than reactively based on CPU.

This requires a custom metrics pipeline (e.g., Prometheus Adapter) to be installed in your cluster. See Metrics & Monitoring for details on available metrics.

CPU-based autoscaling also works. The Orchestrator distributes sessions as pods come and go regardless of how scaling is triggered:

metrics:
- type: Resource
  resource:
    name: cpu
    target:
      type: Utilization
      averageUtilization: 70

When using autoscaling, set adapter.replicaCount to your minimum and let the HPA manage scale-up.
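Putting it together, a complete HPA manifest might look like the following. The resource name, Deployment name, and replica bounds here are assumptions based on this page's examples; check the name your chart actually gives the adapter Deployment:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: conncentric-adapter      # hypothetical name
  namespace: conncentric
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: conncentric-adapter    # must match your adapter Deployment
  minReplicas: 2                 # keep in sync with adapter.replicaCount
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
```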