Version: 1.19.0 (latest)

Troubleshoot OpenZiti egress in Kasm

Common failure modes, ordered by traffic path: workspace → sidecar → tunneler → policy → router → destination. See kziti architecture and the integration deep dive for background.

Where to look first

The four primary signals when troubleshooting are:

  • Sidecar log on the Kasm host: /var/log/kasm-sidecar/network_sidecar.log (network setup, identity injection, tunnel start).
  • Per-session Ziti tunnel log: /var/run/kasm-sidecar/$container_namespace/ziti.log (tunnel start, dial errors, policy failures).
  • Overall health, run on the Ziti host: kziti status (controller and router health).
  • OpenZiti controller and router logs: docker compose -f /opt/kziti/docker-compose.yml logs ziti-controller and similarly for the router container.

A useful first step is to find the namespace for a failing session:

docker inspect -f '{{.NetworkSettings.SandboxKey}}' <container_name> \
| grep -o -E '[a-f0-9]+$'

Then check both the per-session ziti log and the sidecar log for that namespace.
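The lookup and the two log checks can be strung together. The following is a minimal sketch, assuming the SandboxKey ends in the hexadecimal namespace ID (as the grep above implies) and that sidecar log lines mention that ID. The function names and the 40-line tail length are illustrative and not part of Kasm or kziti.

```shell
#!/bin/sh
# Extract the trailing hex namespace ID from a Docker SandboxKey path.
ns_from_sandbox_key() {
    printf '%s\n' "$1" | grep -o -E '[a-f0-9]+$'
}

# Show the end of the per-session tunnel log, then the sidecar log lines
# that mention the namespace (assumption: the sidecar logs the namespace ID).
show_session_logs() {
    container_name="$1"
    key=$(docker inspect -f '{{.NetworkSettings.SandboxKey}}' "$container_name") || return 1
    container_namespace=$(ns_from_sandbox_key "$key") || return 1
    echo "namespace: $container_namespace"
    tail -n 40 "/var/run/kasm-sidecar/$container_namespace/ziti.log"
    grep -F "$container_namespace" /var/log/kasm-sidecar/network_sidecar.log | tail -n 40
}

# Run only when a container name is supplied as the first argument.
if [ -n "${1:-}" ]; then
    show_session_logs "$1"
fi
```

Saved as a script on the Kasm host, it would be invoked with the container name as its only argument.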

A session reports an egress error at launch

The Kasm UI shows an egress error when the workspace launches.

Most common causes:

  1. No managed identity for this user or workspace. Reconciliation has not yet run, or has run but not picked up the new mapping. List identities on the Ziti host: kziti identity list | grep kasm- and confirm one matches the expected kasm-user-... or kasm-workspace-... pattern.
  2. Identity present but Kasm has not received the JSON. Restart the egress reconciliation in Kasm or re-save the mapping to force a refresh.
  3. Controller unreachable from the Kasm host. The sidecar needs outbound TCP to :1280 on the controller. Test from the Kasm host: curl -k https://<ziti-controller>:1280/edge/client/v1/version. A response with version metadata indicates connectivity.

The sidecar log on the Kasm host names the specific failure. Look for failed to enroll, failed to authenticate, or failed to start ziti tunnel.
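A quick way to pull those three messages out of the sidecar log is a single grep. This is a sketch: the SIDECAR_LOG variable and the function name are hypothetical, and the default path is the sidecar log location listed under "Where to look first".

```shell
#!/bin/sh
# Default to the sidecar log path named earlier; override with SIDECAR_LOG.
SIDECAR_LOG="${SIDECAR_LOG:-/var/log/kasm-sidecar/network_sidecar.log}"

# Print every line (prefixed with its line number) that contains one of
# the three launch-failure messages.
find_launch_failures() {
    grep -n -E 'failed to enroll|failed to authenticate|failed to start ziti tunnel' "$1"
}

# Only scan when the log is present and readable on this host.
if [ -r "$SIDECAR_LOG" ]; then
    find_launch_failures "$SIDECAR_LOG" | tail -n 20
fi
```

The line numbers make it easier to open the log at the right spot and read the surrounding context for the failing session.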

A session starts but cannot reach a service

The session is up and the Kasm UI does not show an egress error, but the user cannot reach a service the policies say they should.

Walk the policy axes in this order:

Step 1. Confirm the identity has the right access attribute

kziti access list <identity>

The output should show a role attribute (net-*, svcset-*, or svc-*) that covers the target service. If it is missing, the access grant was not made or the Kasm mapping has not reconciled.
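To pick the role attributes out of a longer listing, a small filter can help. This is a sketch only: the exact output format of kziti access list is not shown here, so the filter assumes attributes appear as whitespace- or comma-separated tokens, and list_access_attrs is a hypothetical helper name.

```shell
#!/bin/sh
# Read text on stdin and print each distinct token that starts with
# net-, svcset-, or svc- (the three role-attribute prefixes).
list_access_attrs() {
    tr -s ' \t,' '\n' | grep -E '^(net|svcset|svc)-' | sort -u
}

# Usage on the Ziti host (the identity name is a placeholder):
#   kziti access list <identity> | list_access_attrs
```

Empty output suggests no matching attribute was found, which points back to a missing grant or an unreconciled mapping.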

Step 2. Confirm the target service exists and the destination is reachable from the private router

kziti service show <service-id>

Verify the destination DNS name or IP is correct. Then test reachability from the relevant private router, either on its host or inside the router container via docker compose exec: curl -k https://<destination>:<port>.

Step 3. Confirm the policies match

kziti ziti audit

The audit command checks that the universal edge router policy (erp-clients-dmz) exists, that every network has its Bind and Dial policies, and that no overly broad policies exist. Any ERROR finding is a hard blocker for connectivity; WARN findings indicate configuration drift. Under normal kziti-managed operations policy misconfigurations are uncommon, but adopting an existing controller or manual edits can introduce them.
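Because ERROR findings block connectivity and WARN findings only flag drift, it can help to count them separately. The sketch below assumes each finding is printed on its own line containing the word ERROR or WARN; the real output format may differ, and summarize_audit is a hypothetical helper.

```shell
#!/bin/sh
# Summarise audit output read from stdin.
# Assumption: one finding per line, tagged with the word ERROR or WARN.
summarize_audit() {
    findings=$(cat)
    errors=$(printf '%s\n' "$findings" | grep -c -w 'ERROR' || true)
    warns=$(printf '%s\n' "$findings" | grep -c -w 'WARN' || true)
    echo "errors=$errors warns=$warns"
    # Exit non-zero when any ERROR is present, since those block connectivity.
    [ "$errors" -eq 0 ]
}

# Usage on the Ziti host:
#   kziti ziti audit | summarize_audit
```

The exit status makes the helper usable in a scripted health check: zero means no hard blockers were reported.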

Step 4. Confirm the routers and edge-router policies allow the path

kziti router list

Both the relevant private router and the public routers should be online. If a router is offline, check its container logs.

Step 5. Check the session's tunnel log

Open the tunnel log in the session container's namespace directory on the Kasm host:

/var/run/kasm-sidecar/$container_namespace/ziti.log

A failed dial logs the service name and the reason. Common reasons: not authorized, no terminators (no router is hosting the service), service not found (intercept hostname does not match a service definition).
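The three reasons can be turned into a small lookup that annotates a tunnel log. This is a sketch: the function names are hypothetical, the exact wording of real log lines may differ, and the hint attached to not authorized (revisit Steps 1 and 3) is an inference from the steps above rather than something the log states.

```shell
#!/bin/sh
# Map one log line to a hint for the dial-failure reason it contains.
# Returns non-zero when the line contains none of the known reasons.
explain_dial_failure() {
    case "$1" in
        *"not authorized"*)    echo "not authorized: revisit Steps 1 and 3 (access attribute and policies)" ;;
        *"no terminators"*)    echo "no terminators: no router is hosting the service" ;;
        *"service not found"*) echo "service not found: the intercept hostname does not match a service definition" ;;
        *)                     return 1 ;;
    esac
}

# Scan a tunnel log file and print one hint per matching line.
scan_tunnel_log() {
    while IFS= read -r line; do
        explain_dial_failure "$line" || true
    done < "$1"
}

# Usage on the Kasm host:
#   scan_tunnel_log "/var/run/kasm-sidecar/$container_namespace/ziti.log"
```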

Identity is created but does not appear online in ZAC

The identity exists on the controller (visible via kziti identity list or in ZAC) but never reports online when a session is launched.

Most common causes:

  • The session never launched a Ziti tunnel. Look in the sidecar log for ziti tunnel run or its absence. If the tunnel never started, the egress sidecar code path was not taken; confirm the workspace is mapped to the OpenZiti provider in Kasm.
  • The tunnel started but failed enrollment. Check ziti.log for the session's namespace. A signed JWT that has expired or been used produces a clear error.
  • Network path to the public router is blocked. The session container needs TCP :3022 (or whatever port your public router listens on) reachable. Network policy on the Kasm host or its environment can block this.
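The first of these causes can be ruled in or out with a single search of the sidecar log. A minimal sketch, assuming the tunnel invocation is logged with the literal text ziti tunnel run; tunnel_was_started is a hypothetical helper name.

```shell
#!/bin/sh
# Report whether a sidecar log shows a "ziti tunnel run" invocation.
tunnel_was_started() {
    if grep -q -F 'ziti tunnel run' "$1"; then
        echo "tunnel started: check ziti.log for enrollment errors next"
    else
        echo "tunnel never started: confirm the workspace is mapped to the OpenZiti provider"
        return 1
    fi
}

# Usage on the Kasm host:
#   tunnel_was_started /var/log/kasm-sidecar/network_sidecar.log
```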

A private router is offline after recent infrastructure changes

A private router that was previously healthy reports as offline.

Most common causes:

  • The router host lost outbound TCP to the controller (:6262) or the public router (:10080 for fabric link). Confirm with nc -zv from the router host.
  • The controller IP changed without a corresponding --existing-controller-ip update on the router host. If the router was provisioned when DNS resolution was different, you may need to update the router's compose override on its host.
  • The router certificate is rotating. OpenZiti routers occasionally re-enroll. The router log will say so.
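The connectivity check in the first cause can be scripted so both ports are probed in one pass. This is a sketch, assuming nc is installed on the router host and supports -z and -w; CONTROLLER_HOST and PUBLIC_ROUTER_HOST are placeholder variables you set yourself, and the ports are the ones named above.

```shell
#!/bin/sh
# Probe one TCP endpoint with nc and print a one-line result.
check_port() {
    label="$1"; host="$2"; port="$3"
    if nc -z -w 5 "$host" "$port" >/dev/null 2>&1; then
        echo "ok   $label $host:$port"
    else
        echo "FAIL $label $host:$port"
        return 1
    fi
}

# Run the probes only when both placeholder variables are set.
if [ -n "${CONTROLLER_HOST:-}" ] && [ -n "${PUBLIC_ROUTER_HOST:-}" ]; then
    check_port "controller"  "$CONTROLLER_HOST"    6262  || true
    check_port "fabric link" "$PUBLIC_ROUTER_HOST" 10080 || true
fi
```

A FAIL line for either endpoint points at the network path rather than at the router itself; adjust the ports if your deployment listens elsewhere.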

HA cluster commands time out

Mutating commands fail with quorum or timeout errors but read-only commands still work.

This is the quorum-loss signature. See Recover from quorum loss.

Do not run recovery if the missing controller is only temporarily unreachable; wait for it to return. Recovery is for nodes that are permanently gone.

When to escalate

For raw policy bodies, Raft state, fabric link health, or PKI internals: OpenZiti docs and issue tracker. If you're on the Kasm-pinned v2 prerelease and suspect an upstream bug, report it to Kasm too.