Version: 1.19.0 (latest)

Managers

Overview

The Manager Service (kasm_manager) is the coordination layer between the Kasm API and the Agent fleet. Every Agent periodically checks in with a Manager to report available resources and receive instructions. Managers in turn track session state, monitor Agent health, and elect a Primary Manager per zone to handle time-sensitive tasks such as promoting Agents and reconciling session counts.

In a multi-server deployment, multiple Managers can run within the same Deployment Zone. At any given time, exactly one Manager per zone holds the Primary Manager role. If the current Primary Manager becomes unavailable, another Manager in the zone automatically assumes the role.

View Managers

Navigate to Infrastructure -> Managers in the Kasm admin UI to see a list of all registered Managers and their current state.

Each Manager entry shows:

| Column | Description |
| --- | --- |
| Instance ID | Unique identifier for the Manager container instance. |
| Enabled | Whether the Manager is currently active and accepting Agent connections. |
| First Reported Time | Timestamp when the Manager first registered with the deployment. |
| Last Reported Time | Time elapsed since the Manager last sent a heartbeat. |
| Zone | The Deployment Zone this Manager belongs to. |
| Status | Current operational status (running, unknown). |
| Primary Manager | Indicates which Manager holds the primary role in its zone. |

Enable / Disable a Manager

Managers can be disabled without stopping or restarting the container. This is useful for taking a Manager out of rotation for maintenance, debugging, or rolling updates while keeping the Agent fleet operational.

Zone constraint

At least one Manager per zone must remain enabled at all times. The UI will display an error and prevent the action if disabling a Manager would leave a zone with no enabled Managers.
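The constraint can be illustrated with a small sketch. It is not how the UI implements the check; it only restates the rule. The input format (one `id zone enabled` row per Manager on stdin) and the function name `can_disable` are hypothetical.

```shell
# can_disable MANAGER_ID
# Reads rows of "id zone enabled" on stdin (hypothetical format, for illustration only).
# Exit 0: another enabled Manager remains in the same zone, so disabling is allowed.
# Exit 1: disabling would leave the zone with no enabled Managers.
# Exit 2: MANAGER_ID was not found in the input.
can_disable() {
  awk -v target="$1" '
    { zone[$1] = $2; enabled[$1] = $3 }
    END {
      if (!(target in zone)) exit 2
      for (id in zone)
        if (id != target && zone[id] == zone[target] && enabled[id] == "true") exit 0
      exit 1
    }'
}
```

For example, with two enabled Managers in one zone and a single Manager in another, only a Manager in the first zone passes the check.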

How it works

When a Manager's enabled flag is set to false in the database:

  1. Sentinel file created: The Manager process writes a sentinel file at /tmp/kasm_manager_draining inside the container (mapped to /opt/kasm/current/tmp/manager/ on the host).

  2. Nginx stops serving the Manager: The kasm_proxy nginx container checks for the sentinel file on every incoming request. When the file is present, nginx returns 503 Service Unavailable with a Retry-After: 30 header on /health and stops proxying /manager_api/ requests. Each of these behaviors is triggered separately by the same sentinel check.

  3. Work suspended: The Manager's internal guardian loop skips all work when enabled=false, preventing a disabled Manager from promoting itself to Primary or modifying shared resources.

  4. Container stays healthy: Docker's container health reporting is patched to distinguish between "disabled" and "unhealthy", so a disabled Manager continues to report healthy to Docker and will not trigger an automatic restart.

When the Manager is re-enabled:

  • The sentinel file is removed.
  • Nginx resumes proxying requests normally and returns 200 OK.
  • The guardian loop resumes, and the Manager can eventually re-acquire the Primary role if appropriate.
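Because both the disable and re-enable paths hinge on the sentinel file, its presence is a quick way to see which state a Manager is in. The sketch below is a minimal example. The default host path is an assumption built from the directory mapping described above, and the function name `manager_draining_state` is hypothetical.

```shell
# Assumed host-side location of the sentinel file (container path
# /tmp/kasm_manager_draining under the host mapping /opt/kasm/current/tmp/manager/).
SENTINEL="${SENTINEL:-/opt/kasm/current/tmp/manager/kasm_manager_draining}"

# manager_draining_state [PATH]
# Prints "draining" if the sentinel file exists, otherwise "serving".
manager_draining_state() {
  if [ -e "${1:-$SENTINEL}" ]; then
    echo "draining"
  else
    echo "serving"
  fi
}
```

Running `manager_draining_state` on the Manager host should print "draining" while the Manager is disabled and "serving" once it is re-enabled, provided the assumed path matches your installation.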

Primary Manager failover

If you disable the current Primary Manager, another enabled Manager in the same zone automatically promotes itself to Primary. All Agents in the zone re-register with the new Primary Manager. This promotion happens within a few minutes as part of the regular guardian loop cycle.

Re-enabling takes a moment

After re-enabling a Manager, allow up to a minute for it to return to running status. The guardian loop processes state on a fixed interval, so the transition is not instantaneous.
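Since the transition is not instantaneous, a script that re-enables a Manager may want to poll rather than check once. The following is a rough sketch. The helper names `wait_until` and `manager_enabled` are hypothetical, and the probe uses the container's own healthcheck endpoint, which may not change at exactly the same moment as the status shown in the UI.

```shell
# wait_until TIMEOUT INTERVAL CMD...
# Re-runs CMD every INTERVAL seconds until it succeeds (returns 0)
# or at least TIMEOUT seconds have elapsed (returns 1).
wait_until() {
  _wu_timeout="$1"; _wu_interval="$2"; shift 2
  _wu_elapsed=0
  while ! "$@"; do
    [ "$_wu_elapsed" -ge "$_wu_timeout" ] && return 1
    sleep "$_wu_interval"
    _wu_elapsed=$((_wu_elapsed + _wu_interval))
  done
  return 0
}

# Hypothetical probe: succeeds when the Manager's healthcheck returns HTTP 200.
manager_enabled() {
  [ "$(docker exec kasm_manager curl -s -o /dev/null -w '%{http_code}' \
        http://localhost:8181/__healthcheck)" = "200" ]
}

# Usage on the Manager host (not run here): allow up to 60 seconds, polling every 5.
#   wait_until 60 5 manager_enabled && echo "Manager is enabled again"
```

The 60-second budget mirrors the "up to a minute" guidance above. The 5-second interval is an arbitrary choice.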

Verifying Manager State

The following commands can be run on the host where the Manager container is running to confirm expected behavior after a state change.

Check the healthcheck endpoint response:

The Manager exposes two health endpoints serving different purposes:

| Endpoint | Used by | Disabled response |
| --- | --- | --- |
| /__healthcheck | Docker internal healthcheck | 503, remapped to healthy by the healthcheck script to prevent container restart |
| /health | kasm_proxy nginx | 503, nginx stops proxying requests to the Manager |

To inspect the Manager process directly (bypassing nginx):

docker exec -it kasm_manager curl -i http://localhost:8181/__healthcheck
  • 200 OK — Manager is enabled and healthy.
  • 503 with {"ok": false, "status": "disabled", "enabled": false} body — Manager is intentionally disabled. Docker's healthcheck script maps this to healthy so the container is not restarted.
  • 500 — Manager process is unhealthy (missed heartbeat or timeout).
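For scripted checks, the three outcomes above can be mapped to short labels. This is a minimal sketch. The function name `classify_healthcheck` and the label strings are illustrative choices, not part of Kasm.

```shell
# classify_healthcheck HTTP_CODE
# Maps a /__healthcheck status code to the states listed above.
classify_healthcheck() {
  case "$1" in
    200) echo "enabled" ;;     # enabled and healthy
    503) echo "disabled" ;;    # intentionally disabled
    500) echo "unhealthy" ;;   # missed heartbeat or timeout
    *)   echo "unexpected" ;;  # anything else, including an empty result
  esac
}

# Usage on the Manager host (not run here):
#   code="$(docker exec kasm_manager curl -s -o /dev/null -w '%{http_code}' \
#           http://localhost:8181/__healthcheck)"
#   classify_healthcheck "$code"
```

The usage lines drop `-it` from `docker exec` because no TTY is needed when the output is captured in a variable.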

Confirm Docker still considers the container healthy:

docker inspect --format='{{.State.Health.Status}}' kasm_manager

This should report healthy even when the Manager is disabled. An unhealthy result would indicate the container is at risk of being restarted.
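In a maintenance script, the status string can be turned into an exit code so that later steps stop if the container is at risk. The sketch below assumes the standard Docker health states (starting, healthy, unhealthy). The function name `check_container_health` and its output labels are hypothetical.

```shell
# check_container_health STATUS
# Interprets the string printed by docker inspect for .State.Health.Status.
# Returns 0 only for "healthy"; every other value returns 1.
check_container_health() {
  case "$1" in
    healthy)   echo "ok";      return 0 ;;
    starting)  echo "pending"; return 1 ;;  # healthcheck has not completed yet
    unhealthy) echo "at-risk"; return 1 ;;  # container may be restarted
    *)         echo "unknown"; return 1 ;;
  esac
}

# Usage on the Manager host (not run here):
#   check_container_health "$(docker inspect --format='{{.State.Health.Status}}' kasm_manager)"
```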

Verify Agent routing after disabling (Diagnostics logs):

After disabling a Manager, check the Diagnostics section in the admin UI for 503 responses in the Manager request logs. Agents should stop routing to the disabled Manager within one check-in interval.