Troubleshooting
This page lists common issues and fast fixes for Kairos deployments.
Startup and Access
Application does not start
Symptoms:
- Process exits immediately
- Spring Boot startup fails with datasource or migration errors
Checks:
- Confirm the Java version is 17 or later: `java -version`
- Confirm these configuration values are present and valid:
  - `SPRING_DATASOURCE_URL`
  - `SPRING_DATASOURCE_USERNAME`
  - `SPRING_DATASOURCE_PASSWORD`
- Review the startup logs for `Flyway`, `datasource`, or `bind` errors.
Typical fixes:
- Use a valid datasource URL for your database type.
- Ensure the DB user can connect and has schema permissions.
- Verify that no other process is already using port `8080`.
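The checks above can be scripted before starting the application. What follows is a minimal shell sketch, not something shipped with Kairos: the helper names `java_major` and `missing_vars` are hypothetical, and it assumes the three datasource values are supplied as environment variables.

```shell
# Hypothetical pre-flight helpers (illustrative names, not part of Kairos).

# Print the major version found in `java -version` output,
# e.g. 'openjdk version "17.0.9" ...' -> 17, 'java version "1.8.0_392"' -> 8.
java_major() {
  ver=$(printf '%s\n' "$1" | sed -n 's/.*version "\([^"]*\)".*/\1/p')
  case "$ver" in
    1.*) ver=${ver#1.} ;;   # legacy scheme: 1.8.x means Java 8
  esac
  printf '%s\n' "${ver%%[.-]*}"
}

# Print the name of each listed environment variable that is unset or empty.
missing_vars() {
  for name in "$@"; do
    [ -n "$(printenv "$name")" ] || printf '%s\n' "$name"
  done
}

# Usage (run manually; `java -version` writes to stderr, hence 2>&1):
#   java_major "$(java -version 2>&1)"
#   missing_vars SPRING_DATASOURCE_URL SPRING_DATASOURCE_USERNAME SPRING_DATASOURCE_PASSWORD
```

A printed major version below 17, or any variable name printed by `missing_vars`, points at the first two checks in this section.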
Web UI is unreachable
Symptoms:
- Browser cannot connect to `http://localhost:8080`
- Connection timeout from outside the container/cluster
Checks:
- Verify Kairos is running and listening on 8080.
- For Docker, confirm the port mapping: `docker ps`
- For Kubernetes, confirm service and pod health: `kubectl get pods -n kairos` and `kubectl get svc -n kairos`
Typical fixes:
- Add or correct the Docker port mapping (`-p 8080:8080`).
- Use `kubectl port-forward -n kairos svc/kairos 8080:8080` for local access.
- Check ingress host/path rules if using ingress.
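To tell a refused connection apart from a timeout or a DNS problem, a probe with `curl` can help. The sketch below is illustrative only: `probe` and `explain_curl_exit` are hypothetical helper names, and the hints attached to each exit code are general suggestions rather than Kairos-specific diagnostics.

```shell
# Map a curl exit code to a likely cause (codes are standard curl exit codes).
explain_curl_exit() {
  case "$1" in
    0)  echo "connected: the server answered" ;;
    6)  echo "DNS: host name could not be resolved" ;;
    7)  echo "connect failed: nothing is listening, or the port is not mapped" ;;
    28) echo "timeout: check firewall, service, or ingress rules" ;;
    *)  echo "curl exit code $1: see the curl manual" ;;
  esac
}

# Probe a URL with a 5-second limit and explain the outcome.
probe() {
  rc=0
  curl -sS -o /dev/null --max-time 5 "$1" || rc=$?
  explain_curl_exit "$rc"
}

# Usage (run manually):
#   probe http://localhost:8080
```

Running the probe both on the host and from outside the container or cluster helps narrow down whether the port mapping, the service, or the ingress is at fault.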
Authentication and Login
Cannot log in as admin
Symptoms:
- Invalid credentials message on login page
Checks:
- Verify you are using the expected account and password.
- If this is first startup, try default admin credentials from quickstart.
- Confirm OIDC settings are correct if OIDC is enabled.
Typical fixes:
- Correct misconfigured OIDC environment variables.
- Disable OIDC temporarily to validate local login behavior.
- Recreate environment with known credentials if this is a disposable setup.
API requests return 401/403
Symptoms:
- `401 Unauthorized` or `403 Forbidden` for protected endpoints
Checks:
- Verify endpoint access requirements in api.md.
- If using an API key JWT, ensure the header is correct: `Authorization: Bearer <token>`
- For session auth, include the CSRF token for write operations.
Typical fixes:
- Regenerate API key in Admin -> API Keys.
- Use an admin account/key for admin-only endpoints.
- Include the CSRF token for `POST`, `PUT`, and `DELETE` with session auth.
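A quick way to separate a bad token from a missing permission is to send one request with `curl` and look only at the status code. This is a rough sketch under stated assumptions: the helper names, the `KAIROS_URL` and `KAIROS_TOKEN` variables, and the example path `/api/resources` are all hypothetical; the real endpoints are listed in api.md.

```shell
# Build the bearer header for an API key JWT.
auth_header() {
  printf 'Authorization: Bearer %s\n' "$1"
}

# Interpret the status code in standard HTTP terms.
explain_status() {
  case "$1" in
    401) echo "not authenticated: token missing, malformed, or rejected" ;;
    403) echo "authenticated but not allowed: check permissions or CSRF token" ;;
    *)   echo "status $1" ;;
  esac
}

# Print only the HTTP status code of a GET request.
# KAIROS_URL, KAIROS_TOKEN, and the path in $1 are assumptions for illustration.
api_status() {
  curl -s -o /dev/null -w '%{http_code}' \
    -H "$(auth_header "$KAIROS_TOKEN")" "$KAIROS_URL$1"
}

# Usage (run manually; /api/resources is a hypothetical path):
#   KAIROS_URL=http://localhost:8080
#   KAIROS_TOKEN=<token>
#   explain_status "$(api_status /api/resources)"
```

A `401` usually calls for regenerating or correcting the key, while a `403` suggests the key or account lacks the required role for that endpoint.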
Monitoring and Check Results
Resource always shows unknown or stale status
Symptoms:
- Resource remains unknown (`-1`) or does not refresh
Checks:
- Verify resource type interval and parallelism in Admin -> Resource Types.
- Trigger a manual check from the resource detail page.
- Confirm target endpoint or registry is reachable from Kairos runtime network.
Typical fixes:
- Lower interval or increase parallelism for busy environments.
- Fix DNS, firewall, or proxy restrictions.
- Correct invalid target URL/image references.
HTTP resource fails with TLS errors
Symptoms:
- Certificate or hostname validation errors
Checks:
- Validate certificate chain and hostname externally.
- Confirm target URL uses the expected certificate.
Typical fixes:
- Use valid certificates in production.
- For internal/self-signed setups, enable `skipTLS` only if acceptable for your risk profile.
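Validating the certificate "externally" can be done with `openssl`, independent of Kairos. The following is a sketch, assuming `openssl` is installed on the machine running the check; `host_port` and `show_cert` are hypothetical helper names, and the URL parsing is deliberately simplified.

```shell
# Reduce a URL to host:port for openssl; defaults to port 443.
# Simplified: does not handle userinfo or IPv6 literals.
host_port() {
  hp=${1#*://}     # drop the scheme, if any
  hp=${hp%%/*}     # drop the path
  case "$hp" in
    *:*) printf '%s\n' "$hp" ;;
    *)   printf '%s:443\n' "$hp" ;;
  esac
}

# Print subject, issuer, and validity dates of the certificate the server presents.
show_cert() {
  hp=$(host_port "$1")
  openssl s_client -connect "$hp" -servername "${hp%%:*}" </dev/null 2>/dev/null |
    openssl x509 -noout -subject -issuer -dates
}

# Usage (run manually, with the resource's target URL):
#   show_cert https://example.com/health
```

Comparing the printed subject and dates with the target host name and the current date usually reveals an expired certificate or a hostname mismatch.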
Docker resource fails although image exists
Symptoms:
- Docker resource marked unavailable
- Pullability-related errors
Checks:
- Confirm image reference format and tag/digest.
- Review credential matching rules in authentication.md.
- Review registry pullability behavior in docker-pullability.md.
Typical fixes:
- Add or correct Docker credentials for registry scope.
- Ensure the token has `pull` permission.
- Disable `skipTLS` unless required; if required, verify registry cert setup.
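It can help to confirm which registry an image reference points at, and whether the image is reachable with the local Docker CLI. The sketch below rests on assumptions: `image_registry` is a hypothetical helper that follows the Docker CLI's usual convention for detecting a registry host, and `docker manifest inspect` uses the local CLI's credentials, so its result may differ from the pullability check Kairos performs (see docker-pullability.md).

```shell
# Print the registry host of an image reference, following the Docker CLI
# convention: the first path component is a registry only if it contains
# '.' or ':' or equals 'localhost'; otherwise Docker Hub is assumed.
image_registry() {
  case "$1" in
    */*) first=${1%%/*} ;;
    *)   first="" ;;
  esac
  case "$first" in
    *.*|*:*|localhost) printf '%s\n' "$first" ;;
    *)                 printf 'docker.io\n' ;;
  esac
}

# Usage (run manually; image names are examples only):
#   image_registry ghcr.io/example/app:1.0
#   docker manifest inspect ghcr.io/example/app:1.0   # queries the registry without pulling
```

If the printed registry is not the one the configured credentials are scoped to, the credential matching rules in authentication.md are the next place to look.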
Data and Import/Export
YAML import skips resources
Symptoms:
- Import summary shows skipped entries
Checks:
- Validate YAML structure and required fields.
- Ensure `resourceType` and `target` are valid.
- Review format details in importexport.md.
Typical fixes:
- Export first from a working instance and use that file as template.
- Correct unknown resource types or malformed entries.
Metrics and Observability
Prometheus cannot scrape metrics
Symptoms:
- Scrape target down
- Missing `kairos_resource_status` series
Checks:
- Verify the endpoint is reachable: `/actuator/prometheus`
- Verify the scrape config `metrics_path` is `/actuator/prometheus`.
- Confirm network policy/firewall allows access.
Typical fixes:
- Correct Prometheus target/port/path.
- Expose actuator endpoint through service/ingress as needed.
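Before changing the Prometheus configuration, it is worth checking by hand that the endpoint serves the expected series. A minimal sketch, assuming Kairos is reachable on `localhost:8080`; `has_metric` is a hypothetical helper that scans Prometheus text exposition output on standard input.

```shell
# Succeed if stdin contains at least one sample line for the named metric.
# Comment lines (# HELP / # TYPE) do not count as samples.
has_metric() {
  grep -Eq "^$1(\{| )"
}

# Usage (run manually; host and port are assumptions):
#   curl -fsS http://localhost:8080/actuator/prometheus | has_metric kairos_resource_status \
#     && echo "series present" || echo "series missing or endpoint unreachable"
```

If the series is present locally but the scrape target is still down, the problem is more likely the Prometheus target, port, path, or the network in between.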
Upgrade Issues
Problems after upgrading Kairos
Checks:
- Read startup logs for Flyway migration errors.
- Confirm database backup exists.
- Validate all runtime env vars after deployment update.
Typical fixes:
- Roll back to previous image/version if startup fails.
- Resolve migration prerequisites, then redeploy.
- Re-apply working configuration values.
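Startup logs can be long, so filtering them for the relevant keywords speeds up the first check. The following sketch is illustrative: `startup_errors` is a hypothetical helper, the keyword list is a coarse filter, and the container and deployment names in the usage comments are assumptions.

```shell
# Keep only log lines that mention Flyway, the datasource, or bind problems.
startup_errors() {
  grep -iE 'flyway|datasource|bind'
}

# Usage (run manually; "kairos" as container/deployment name is an assumption):
#   docker logs kairos 2>&1 | startup_errors
#   kubectl logs -n kairos deploy/kairos | startup_errors
```

The filter is case-insensitive and will also match harmless informational lines, so read the surrounding log context before acting on a match.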
Collect Useful Debug Information
Before opening an issue, collect:
- Kairos version/tag and deployment method (source, Docker, Helm)
- Relevant startup/runtime log excerpts
- Database type (H2/PostgreSQL)
- Sanitized configuration values
- Exact failing endpoint/resource target and error message
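Sanitizing configuration values by hand is error-prone, so a small filter can help. This is a sketch under assumptions: `sanitize_env` is a hypothetical helper that masks values whose key contains `PASSWORD`, `SECRET`, `TOKEN`, or `KEY`; it does not detect credentials embedded in other values, such as a datasource URL.

```shell
# Mask the value of every KEY=value line whose key looks sensitive.
sanitize_env() {
  sed -E 's/^([^=]*(PASSWORD|SECRET|TOKEN|KEY)[^=]*)=.*/\1=<redacted>/'
}

# Usage (run manually):
#   env | grep '^SPRING_' | sanitize_env
```

Review the output once more before posting it, since the keyword list cannot cover every secret.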