
High Availability and Disaster Recovery

How High Availability Works

Conncentric uses three layers of protection to keep your integrations running.

Layer 1: Automatic Session Failover

Connectivity sessions and consumers are assigned to an active adapter pod. If that pod fails, another pod takes over the session automatically.

Here's what happens:

  1. The active pod sends a heartbeat to the Orchestrator every few seconds.
  2. If the heartbeats stop, the Orchestrator waits for a configurable lease timeout to expire.
  3. After the timeout, another pod claims the session and starts it up.
  4. The previous pod, if it recovers, detects the conflict and shuts itself down immediately (fencing).
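The lease-and-fencing protocol above can be sketched as follows. This is a simplified illustration, not the platform's actual API: the `SessionLease` class, the in-memory ownership record, and the `LEASE_TIMEOUT` constant are all hypothetical stand-ins for the Orchestrator's internal state.

```python
import time

LEASE_TIMEOUT = 10.0  # hypothetical: the configurable lease timeout, in seconds


class SessionLease:
    """Tracks which adapter pod owns a session and when it last heartbeated."""

    def __init__(self, owner: str):
        self.owner = owner
        self.last_heartbeat = time.monotonic()

    def heartbeat(self, pod: str) -> bool:
        """Renew the lease. Returns False when another pod now owns the
        session, which tells the caller to fence (shut itself down)."""
        if pod != self.owner:
            return False  # fencing signal: a standby claimed the session
        self.last_heartbeat = time.monotonic()
        return True

    def try_claim(self, pod: str) -> bool:
        """A standby may claim the session only after the lease expires."""
        if time.monotonic() - self.last_heartbeat > LEASE_TIMEOUT:
            self.owner = pod
            self.last_heartbeat = time.monotonic()
            return True
        return False
```

The key design point is that claiming and fencing use the same ownership record: the moment a standby claims an expired lease, the previous owner's next heartbeat fails, so two pods can never both act as the active owner.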

Time to recover: The lease timeout (configurable) plus the time to re-establish connectivity.

For production environments requiring Exclusive Consumption, run at least two adapter pods so there is always a standby available.

Layer 2: Orchestrator Redundancy

The Orchestrator can run with multiple instances. If one goes down, the others continue serving the Portal and the adapter pods without interruption.

Adapter pods keep their active sessions running during an Orchestrator restart. They do not require the Orchestrator to be available continuously, only often enough to maintain their heartbeats.
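The relationship between Orchestrator downtime and session survival can be shown with a small deterministic simulation. The tick-based model and function names here are illustrative assumptions, not platform code: a session survives an outage as long as heartbeats resume before the lease timeout elapses.

```python
def run_heartbeats(orchestrator_up, lease_timeout: float, interval: float = 1.0) -> int:
    """Simulate adapter heartbeats against an Orchestrator that may be down.

    orchestrator_up: function tick -> bool (is the Orchestrator reachable?)
    Returns the tick at which the session stopped, or 20 if it survived
    the whole bounded simulation. The session stops only when heartbeats
    have failed for longer than the lease timeout, at which point another
    pod would have claimed the session and this pod fences itself.
    """
    missed = 0.0
    for tick in range(20):  # bounded simulation window
        if orchestrator_up(tick):
            missed = 0.0    # successful heartbeat resets the clock
        else:
            missed += interval
            if missed > lease_timeout:
                return tick  # lease expired; session lost
    return 20
```

An outage shorter than the lease timeout is invisible to running sessions; only a longer outage causes a takeover.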

Layer 3: Database High Availability

All platform state lives in PostgreSQL. Use your database provider's high-availability option so the database itself can survive a server failure.

| Provider | Recommended Option |
| --- | --- |
| AWS RDS | Multi-AZ deployment |
| AWS Aurora | Aurora PostgreSQL with read replicas |
| Google Cloud SQL | High availability configuration |
| Azure Database for PostgreSQL | Zone-redundant HA |

Disaster Recovery

What Needs to Be Backed Up

| Data | Where It Lives | Backup Method |
| --- | --- | --- |
| All adapter configurations | PostgreSQL | Database backups |
| Artifacts (data dictionaries, schemas) | PostgreSQL | Database backups |
| Installed plugins | PostgreSQL | Database backups |
| Helm values | Your version control system | Keep in Git |

Everything important lives in the database. If you back up PostgreSQL regularly, you can recover the full platform.
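A scheduled logical backup can be as simple as a `pg_dump` invocation. The sketch below only builds the command; the database URL, output directory, and `conncentric-` file prefix are deployment-specific assumptions, and your provider's managed snapshot feature may replace this entirely.

```python
from datetime import datetime, timezone


def backup_command(db_url: str, out_dir: str) -> list[str]:
    """Build a pg_dump command for a timestamped logical backup.

    Uses pg_dump's custom format, which is compressed and restorable
    with pg_restore. Run the result with subprocess.run(cmd, check=True)
    from a cron job or a Kubernetes CronJob.
    """
    stamp = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    return [
        "pg_dump",
        "--format=custom",
        f"--file={out_dir}/conncentric-{stamp}.dump",
        db_url,
    ]
```

Because all platform state is in this one database, restoring the latest dump is the entire data-recovery step.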

Recovery Behavior

| Scenario | How It Recovers |
| --- | --- |
| Adapter pod crashes | Kubernetes restarts it; another pod claims the session after the lease timeout |
| Entire node fails | Kubernetes reschedules pods on another node; sessions are reclaimed |
| Orchestrator pod crashes | Kubernetes restarts it; adapter pods continue running during the restart |
| Database failover | Provider-managed automatic failover; the platform reconnects when the database is available |

How to Recover From a Full Cluster Loss

  1. Provision a new Kubernetes cluster
  2. Restore your PostgreSQL backup
  3. Deploy Conncentric using your saved Helm values
  4. Adapter pods start up and claim sessions automatically

No manual reconfiguration is needed. All the configuration is in the database.

Planned Maintenance

To shut down cleanly before maintenance:

  1. Disable all adapters in the Portal: go to Adapters, select all, click Disable.
  2. Wait for all sessions to close. Watch the operational status move to INACTIVE.
  3. Perform the maintenance.
  4. Re-enable adapters when done.