Troubleshoot OpenZiti egress in Kasm
Common failure modes, ordered by traffic path: workspace → sidecar → tunneler → policy → router → destination. See kziti architecture and the integration deep dive for background.
Where to look first
The four primary signals when troubleshooting are:
- Sidecar log on the Kasm host: /var/log/kasm-sidecar/network_sidecar.log (network setup, identity injection, tunnel start).
- Per-session Ziti tunnel log: /var/run/kasm-sidecar/$container_namespace/ziti.log (tunnel start, dial errors, policy failures).
- kziti status (on the Ziti host): kziti status (overall controller and router health).
- OpenZiti controller and router logs: docker compose -f /opt/kziti/docker-compose.yml logs ziti-controller and similarly for the router container.
A useful first step is to find the namespace for a failing session:
docker inspect -f '{{.NetworkSettings.SandboxKey}}' <container_name> \
| grep -o -E '[a-f0-9]+$'
Then check both the per-session ziti log and the sidecar log for that namespace.
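The lookup above can be wrapped in a few small helpers. This is a minimal sketch: the function names are hypothetical and not part of Kasm or kziti, and the log path is the one listed earlier on this page.

```shell
# Hypothetical helpers; the names are illustrative only.

# Extract the namespace ID (the trailing hex run) from a Docker SandboxKey path.
namespace_from_sandbox_key() {
  printf '%s\n' "$1" | grep -o -E '[a-f0-9]+$'
}

# Print the namespace ID for a running container (requires docker on the Kasm host).
session_namespace() {
  namespace_from_sandbox_key "$(docker inspect -f '{{.NetworkSettings.SandboxKey}}' "$1")"
}

# Print the per-session tunnel log path for a namespace ID.
session_ziti_log() {
  printf '/var/run/kasm-sidecar/%s/ziti.log\n' "$1"
}

# Usage on the Kasm host:
#   ns=$(session_namespace <container_name>)
#   tail -n 50 "$(session_ziti_log "$ns")"
```

Splitting the string handling from the docker call keeps the parsing step easy to check on a machine that has no running sessions.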
A session reports an egress error at launch
The Kasm UI shows an egress error when the workspace launches.
Most common causes:
- No managed identity for this user or workspace. Reconciliation has not yet run, or has run but not picked up the new mapping. List identities on the Ziti host: kziti identity list | grep kasm- and confirm one matches the expected kasm-user-... or kasm-workspace-... pattern.
- Identity present but Kasm has not received the JSON. Restart the egress reconciliation in Kasm or re-save the mapping to force a refresh.
- Controller unreachable from the Kasm host. The sidecar needs outbound TCP to :1280 on the controller. Test from the Kasm host: curl -k https://<ziti-controller>:1280/edge/client/v1/version. A response with version metadata indicates connectivity.
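The controller check in the last bullet can be scripted so that it yields a pass or fail result. This is a sketch under assumptions: both function names are hypothetical, and it assumes the version metadata arrives as JSON with a "version" key, which this page does not spell out.

```shell
# Hypothetical helpers; the names are illustrative only.

# Succeed when a response body looks like version metadata.
# Assumption: the body is JSON containing a "version" key.
looks_like_version_response() {
  printf '%s' "$1" | grep -q '"version"'
}

# Succeed when the controller answers on :1280 with version metadata.
# -k skips certificate verification, as in the manual check above.
controller_reachable() {
  body=$(curl -k -s --max-time 5 "https://$1:1280/edge/client/v1/version") || return 1
  looks_like_version_response "$body"
}

# Usage on the Kasm host:
#   controller_reachable <ziti-controller> && echo reachable || echo unreachable
```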
The session's network-sidecar.log will name the specific failure. Look for failed to enroll, failed to authenticate, or failed to start ziti tunnel.
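To pull just those three messages out of a busy log, a filter along these lines may help. The function name is hypothetical, and the sketch assumes each failure is logged on a single line.

```shell
# Hypothetical helper: print only lines carrying one of the three
# launch-failure messages. Reads the named files, or stdin if none are given.
launch_failures() {
  grep -E 'failed to (enroll|authenticate|start ziti tunnel)' "$@"
}

# Usage on the Kasm host:
#   launch_failures <path-to-sidecar-log> | tail -n 20
```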
A session starts but cannot reach a service
The session is up and the Kasm UI does not show an egress error, but the user cannot reach a service the policies say they should.
Walk the policy axes in this order:
Step 1. Confirm the identity has the right access attribute
kziti access list <identity>
The output should show a role attribute (net-*, svcset-*, or svc-*) that covers the target service. If it is missing, the access grant was not made or the Kasm mapping has not reconciled.
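When the output is long, it can help to reduce it to just the role attributes. The sketch below is an assumption-heavy illustration: the function name is hypothetical, and it assumes attributes appear in the output as separate tokens delimited by whitespace, commas, or a leading #. The output format of kziti access list is not documented on this page.

```shell
# Hypothetical helper: list the distinct tokens that look like access
# role attributes (net-*, svcset-*, svc-*) in text read from stdin.
access_attributes() {
  tr -s ' \t,#' '\n' | grep -E '^(net|svcset|svc)-' | sort -u
}

# Usage on the Ziti host:
#   kziti access list <identity> | access_attributes
```

An empty result suggests that the access grant is missing or that the mapping has not reconciled yet.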
Step 2. Confirm the target service exists and the destination is reachable from the private router
kziti service show <service-id>
Verify that the destination DNS name or IP is correct. Then test reachability from the host running the relevant private router, either directly on that host or via docker compose exec into the router container: curl -k https://<destination>:<port>.
Step 3. Confirm the policies match
kziti ziti audit
The audit command checks that the universal edge router policy (erp-clients-dmz) exists, that every network has its Bind and Dial policies, and that no overly broad policies exist. Any ERROR finding is a hard blocker for connectivity; WARN findings indicate configuration drift. Under normal kziti-managed operations, policy misconfigurations are uncommon, but adopting an existing controller or making manual edits can introduce them.
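The ERROR and WARN distinction lends itself to a quick summary that can gate a script. This is a sketch only: the function name is hypothetical, and it assumes the audit prints one finding per line with the literal words ERROR or WARN, which this page does not confirm.

```shell
# Hypothetical helper: count ERROR and WARN findings read from stdin.
# Exits non-zero when at least one ERROR is present, since ERROR findings
# block connectivity while WARN findings only indicate drift.
audit_summary() {
  awk '
    /ERROR/ { errors++ }
    /WARN/  { warns++ }
    END {
      printf "errors=%d warnings=%d\n", errors, warns
      exit (errors > 0)
    }'
}

# Usage on the Ziti host:
#   kziti ziti audit | audit_summary || echo "fix ERROR findings first"
```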
Step 4. Confirm the routers and edge-router policies allow the path
kziti router list
Both the relevant private router and the public routers should be online. If a router is offline, check its container logs.
Step 5. Check the session's tunnel log
Look in the session container's namespace directory on the Kasm host:
/var/run/kasm-sidecar/$container_namespace/ziti.log
A failed dial logs the service name and the reason. Common reasons: not authorized, no terminators (no router is hosting the service), service not found (intercept hostname does not match a service definition).
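The three reasons map onto different parts of the checklist, which a small lookup can make explicit. The function name is hypothetical, and the pointer back to Steps 1 and 3 for not authorized is one reading of this page rather than documented tunneler behaviour.

```shell
# Hypothetical helper: turn one ziti.log line into a next-step hint.
explain_dial_failure() {
  case "$1" in
    *"not authorized"*)
      echo "authorization: recheck access attribute and policies (Steps 1 and 3)" ;;
    *"no terminators"*)
      echo "hosting: no router is hosting the service" ;;
    *"service not found"*)
      echo "naming: intercept hostname does not match a service definition" ;;
    *)
      echo "other: read the full log line" ;;
  esac
}

# Usage on the Kasm host:
#   grep -i dial /var/run/kasm-sidecar/$container_namespace/ziti.log |
#     while IFS= read -r line; do explain_dial_failure "$line"; done
```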
Identity is created but does not appear online in ZAC
The identity exists on the controller (visible via kziti identity list or in ZAC) but never reports online when a session is launched.
Most common causes:
- The session never launched a Ziti tunnel. Look in network-sidecar.log for ziti tunnel run or its absence. If the tunnel never started, the egress sidecar code path was not taken; confirm the workspace is mapped to the OpenZiti provider in Kasm.
- The tunnel started but failed enrollment. Check ziti.log for the session's namespace. A signed JWT that has expired or already been used produces a clear error.
- Network path to the public router is blocked. The session container needs TCP :3022 (or whatever port your public router listens on) reachable. Network policy on the Kasm host or its environment can block this.
A private router is offline after recent infrastructure changes
A private router that was previously healthy reports as offline.
Most common causes:
- The router host lost outbound TCP to the controller (:6262) or the public router (:10080 for fabric link). Confirm with nc -zv from the router host.
- The controller IP changed without a corresponding --existing-controller-ip update on the router host. If the router was provisioned when DNS resolution was different, you may need to update the router's compose override on its host.
- The router certificate is rotating. OpenZiti routers occasionally re-enroll. The router log will say so.
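The two port checks from the first bullet can be run together. This is a minimal sketch: the function name and the PROBE override are hypothetical, and the ports are the ones named above, so adjust them if your deployment listens elsewhere.

```shell
# Hypothetical helper: probe the controller (:6262) and the public router
# fabric link (:10080) from the router host. PROBE can be overridden for
# hosts whose nc takes different flags.
check_router_egress() {
  probe=${PROBE:-nc -zv -w 5}
  rc=0
  for target in "$1 6262" "$2 10080"; do
    # $probe and $target are left unquoted on purpose so they split into words.
    if $probe $target; then
      echo "ok: $target"
    else
      echo "blocked: $target"
      rc=1
    fi
  done
  return $rc
}

# Usage on the router host:
#   check_router_egress <ziti-controller> <public-router>
```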
HA cluster commands time out
Mutating commands fail with quorum or timeout errors but read-only commands still work.
This is the quorum-loss signature. See Recover from quorum loss.
Do not run recovery if the missing controller is just temporarily unreachable — wait for it to return. Recovery is for nodes that are permanently gone.
When to escalate
For raw policy bodies, Raft state, fabric link health, or PKI internals, consult the OpenZiti docs and issue tracker. If you're on the Kasm-pinned v2 prerelease and suspect an upstream bug, report it to Kasm too.
Related
- Operations with kziti — day-2 admin tasks.
- Integration deep dive — Kasm-side internals.
- OpenZiti egress provider — the user-facing concept page.