GraalVM Native Image Evaluation
Kairos continues to ship the JVM image by default. Native images are available as an experimental parallel path and are published under separate native-distroless-* tags.
The supported native-container story for this repository is Dockerfile-native. The earlier buildpack-native path is no longer the recommended direction because its default /workspace runtime layout conflicts with Kairos's default embedded H2 persistence path.
Native Build Path
The Maven build now has an explicit native profile that:
- runs Spring AOT processing
- runs native-maven-plugin compile-no-fork during package
- keeps the required --initialize-at-run-time=sun.security.util.Password$ConsoleHolder build argument
Build a native executable directly only when your local JDK includes GraalVM native-image:
mvn -Pnative -DskipTests package
If your host JDK is a regular Temurin build, use the Docker-native path instead. That is the expected flow for Kairos development and CI.
Build the experimental native container image:
docker build -f Dockerfile-native -t kairos-native-local:test .
Validate the built image with the shared endpoint check:
bash scripts/check-container-endpoints.sh \
kairos-native-local:test \
18081 \
native-local-endpoint-check.md \
"Native Local Endpoint Check"
The native runtime image uses:
- WORKDIR /app
- a distroless non-root runtime base
- only the native executable plus emitted .so sidecar libraries
- SPRING_DATASOURCE_URL=jdbc:h2:file:./data/kairos;AUTO_SERVER=TRUE
That layout keeps Kairos's default embedded H2 files under /app/data/kairos.* without any manual runtime override.
CI Flow
Native CI is intentionally parallel and non-blocking.
- .github/workflows/_docker-native.yml builds Dockerfile-native for linux/amd64
- the workflow runs scripts/check-container-endpoints.sh against the native image
- Trivy is run against the native image and its report is uploaded
- both endpoint and Trivy reports are uploaded as artifacts and posted to pull requests
- native images are pushed only on main and only under separate tags
Published native tags are:
- native-distroless-<version>
- native-distroless-main
- native-distroless-latest
- native-distroless-sha-<shortsha>
The existing JVM image and release path remain unchanged. Native publication does not gate the main release workflow.
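As an illustration of the tag scheme listed above, the following sketch prints the four native tags for a given version and commit. The function name native_tags and the 7-character short SHA are assumptions made for this example; the actual workflow may derive the tags differently.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of the Kairos repository.
# Prints the four native tag names for a release version and a commit SHA.
native_tags() {
  local version="$1" sha="$2"
  # Assumption: <shortsha> is the first 7 characters of the commit SHA.
  # printf repeats the format once per argument, giving one tag per line.
  printf 'native-distroless-%s\n' "$version" main latest "sha-${sha:0:7}"
}
```

Calling native_tags with a version and a full commit SHA prints one tag per line, which makes it easy to compare against what the registry actually contains.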
Local Validation Notes
The Docker-native path has been locally validated with:
- docker build -f Dockerfile-native -t kairos-native-local:test .
- bash scripts/check-container-endpoints.sh kairos-native-local:test 18081 native-local-endpoint-check.md "Native Local Endpoint Check"
- default H2 persistence confirmed under /app/data/kairos.mv.db and /app/data/kairos.lock.db
- mvn test on the standard JVM path
Native runtime validation also required targeted runtime hints for Thymeleaf expression helper classes used by the UI templates.
Thymeleaf Guidance
The main native-runtime regression encountered so far was not application startup, but server-side template rendering.
Four concrete issues showed up:
- direct SpEL method calls on model objects such as someList.isEmpty() triggered missing reflection registration in the native image
- Thymeleaf expression-object helpers such as #lists, #strings, #numbers, and #temporals also need to be available for reflective invocation in native mode
- model object properties used only by server-rendered templates are not guaranteed to be discovered by Spring AOT
- private view helper records and Spring Data PageImpl pagination objects also need explicit template reflection hints when templates access their properties
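To see which expression helpers the templates currently rely on, a quick inventory can help. The sketch below is a heuristic, not a Kairos script: the function name list_template_helpers is hypothetical, and it assumes templates are *.html files under a directory you pass in (for a typical Spring Boot layout that would be src/main/resources/templates, which is itself an assumption here).

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of the Kairos repository.
# Lists the distinct Thymeleaf expression helpers (#lists, #strings, ...)
# referenced by *.html templates under the directory given as $1.
list_template_helpers() {
  local template_dir="$1"
  # -r recurse, -h omit file names, -o print only the matched text.
  # Heuristic: a helper reference looks like "#name." inside an expression.
  grep -rhoE --include='*.html' '#[A-Za-z]+\.' "$template_dir" \
    | sed 's/\.$//' \
    | sort -u
}
```

The output is one helper name per line. Comparing it with the helpers registered in NativeRuntimeHintsConfig.java stays a manual step, since the layout of that class is not shown in this document.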
To keep future UI work native-safe:
- Prefer standard Thymeleaf expression objects over ad hoc Java method calls inside templates.
  Example: prefer #lists.isEmpty(items) over items.isEmpty(). Also prefer #strings.contains(value, 'needle') and #lists.contains(values, item) over calling value.contains(...) or values.contains(...) directly.
- Prefer property-style expressions for records and enums.
  Example: prefer entry.kind and resource.resourceType.name over entry.kind() and resource.resourceType.name().
- Keep every Thymeleaf expression helper used by templates registered in NativeRuntimeHintsConfig.java.
  If a future template introduces helpers such as #maps, #sets, or similar, extend the runtime hints class in the same change.
- Keep server-rendered model types registered in NativeRuntimeHintsConfig.java.
  This includes DTOs, JPA entities, enums, view-only helper records, and third-party model objects such as PageImpl when templates access their properties.
- Treat template changes as native-impacting changes.
  Any new page, fragment, or significant th:* expression change should be validated with the native Docker build, not only with JVM tests.
- Re-run endpoint checks after UI changes and manually exercise the affected pages in the native container.
  The minimum smoke check is /, /api/resources, and /actuator/health; for UI-heavy work, also open the changed pages directly.
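The first guideline above can be partly checked before a native build. The following is a minimal sketch, assuming templates are *.html files under a directory passed as the first argument; the function name find_direct_calls is hypothetical, and the pattern only covers the isEmpty and contains calls named in this section.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of the Kairos repository.
# Prints template lines that call .isEmpty( or .contains( directly on an
# object instead of going through a "#helper" such as #lists or #strings.
find_direct_calls() {
  local template_dir="$1"
  # The leading group rejects a '#' (or more identifier characters) right
  # before the receiver, so "#lists.isEmpty(" is skipped but "items.isEmpty("
  # is reported. Exit status follows grep: 0 when at least one match is printed.
  grep -rnE --include='*.html' \
    '(^|[^#A-Za-z0-9_.])[A-Za-z_][A-Za-z0-9_.]*\.(isEmpty|contains)\(' \
    "$template_dir"
}
```

This is a text heuristic, so it can miss chained calls or flag text in comments. It complements the native Docker build and endpoint checks rather than replacing them.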
Concrete findings from native rollout testing:
- the public dashboard failed on DashboardGroupShell property access until the DTO was registered for template reflection
- the resource detail page uses ResourceViewModel, TimelineBlockDTO, CheckResult, Outage, PageImpl, and private HomeController summary records
- admin pages use many entity-backed models directly, including announcements, users, API keys, resources, groups, discovery config, notification providers, notification policies, proxy settings, and custom header settings
- the admin sidebar used direct String.contains(...); this was replaced with #strings.contains(...)
- resource group multi-select templates used projected-list .contains(...); this was replaced with #lists.contains(...)
- admin check history uses CheckAuditEntry; record accessors should be used as properties and the record must stay registered for reflection
Recommended workflow after Thymeleaf-related changes:
docker build -f Dockerfile-native -t kairos-native-local:test .
bash scripts/check-container-endpoints.sh kairos-native-local:test 18081 native-local-endpoint-check.md "Native Local Endpoint Check"
If the changed work touches templates beyond the public dashboard, start the native container and verify the concrete page paths you changed as well.
Flyway And Persistence Guidance
Another native-specific issue showed up only when starting against an existing persisted database, which is the normal Helm/PVC case.
The failure mode was:
- the native image opened the H2 database successfully under /app/data
- Flyway then failed validation because Java-based migrations already recorded in flyway_schema_history were not being discovered from native classpath scanning
- startup aborted even though the same database worked in the JVM image
In Kairos, several migrations are implemented as Java migrations under src/main/java/db/migration. Those must not rely on native classpath scanning alone.
To keep future migrations native-safe:
- If you add a new Java Flyway migration, also register it in FlywayMigrationConfig.java.
- Treat migration changes as persisted-state changes, not only first-boot changes. A native image that works against an empty database can still fail against an existing PVC with prior migration history.
- Validate both cases after Flyway changes:
- clean database startup
- startup against a database first initialized by the JVM image or a previous release
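The registration rule above is easy to forget, so a small pre-build check may help. This sketch assumes that FlywayMigrationConfig.java mentions each Java migration by its simple class name; that is an assumption about the file, not something this document confirms. The function name check_migrations_registered is hypothetical, and the config file path is passed as an argument because its location is not stated here.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of the Kairos repository.
# $1: directory with Java migrations (for example src/main/java/db/migration)
# $2: path to FlywayMigrationConfig.java
# Prints each migration class that the config file never mentions and
# returns 1 if any are missing, 0 otherwise.
check_migrations_registered() {
  local migration_dir="$1" config_file="$2" missing=0 file class
  for file in "$migration_dir"/*.java; do
    [ -e "$file" ] || continue          # directory had no .java files
    class="$(basename "$file" .java)"
    # -w matches whole words only, -F treats the class name literally.
    if ! grep -qwF -- "$class" "$config_file"; then
      echo "not registered: $class"
      missing=1
    fi
  done
  return "$missing"
}
```

A non-zero exit status makes the function usable as a guard in a local script or CI step before the native image is built.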
Recommended validation flow after migration changes:
mvn -B -DskipTests package
docker build -f Dockerfile-native -t kairos-native-local:test .
Then verify:
- Native startup on a clean database.
- Native startup against an existing H2 database directory populated by the JVM application.
For Helm deployments this matters because the PVC preserves flyway_schema_history, so native rollout must remain compatible with migration metadata produced by earlier JVM releases.
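One way to approximate the PVC case locally is to start the native image with a host directory mounted at /app/data. The sketch below is an assumption-laden example rather than a documented Kairos workflow: the function name start_native_with_data and the DOCKER override variable are hypothetical, and a bind mount is used as a stand-in for the Helm PVC. The image's non-root user must be able to read and write the mounted directory.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of the Kairos repository.
# DOCKER can be overridden (for example DOCKER=echo) to preview the command.
DOCKER="${DOCKER:-docker}"

# Starts the native image detached with a host directory mounted at /app/data,
# so an H2 database created by the JVM image is what the native image opens.
start_native_with_data() {
  local image="$1" data_dir="$2"
  # Assumption: a bind mount is an acceptable stand-in for the Helm PVC, and
  # the image's non-root user can read and write the host directory.
  "$DOCKER" run -d -v "$data_dir:/app/data" "$image"
}
```

With real Docker the function prints the container ID; docker logs with that ID then shows whether Flyway validation and startup completed against the existing database.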
JPA Lazy-Loading Guidance
Another native-specific startup failure came from Hibernate lazy loading, not from SQL or schema compatibility.
The failure mode was:
- Kairos started bootstrapping normally against the persisted H2 database
- MetricsService loaded the latest CheckResult for each resource during startup
- the startup path then touched the lazy CheckResult.resource association
- Hibernate tried to generate a runtime proxy and native startup aborted with
  Generation of HibernateProxy instances at runtime is not allowed when the configured BytecodeProvider is 'none'
In practice this means native-safe code must not assume that startup-time service logic can freely traverse lazy JPA relations the way the JVM build often tolerates.
To keep future persistence-related work native-safe:
- Avoid dereferencing lazy associations in startup hooks such as @PostConstruct, application-ready listeners, bootstrap caches, and metric initialization.
- When startup code already has the owning entity or identifier, pass that state through explicitly instead of re-reading it from a lazily loaded relation later.
  The fix for this regression was to initialize latest-check gauges from the already-known MonitoredResource state and avoid calling result.getResource() in the startup path.
- If startup logic truly needs related data, load it explicitly with a query shape that is native-safe. Prefer repository methods with fetch joins or projections over incidental lazy traversal.
- Treat service-layer refactors around metrics, caches, dashboard bootstrapping, and initial synchronization as native-impacting even if they do not change templates or migrations.
Recommended validation flow after JPA/service bootstrap changes:
mvn -B -Dtest=MetricsServiceTest test
docker build -f Dockerfile-native -t kairos-native-local:test .
Then verify both:
- clean native startup
- native startup against an existing /app/data database directory or Helm PVC-style persisted data
The second case matters because startup bootstrap code often only touches historical entities when real prior data exists.
Runtime Validation Areas
Validate these areas before treating the native image as production-ready:
| Area | Scenarios |
|---|---|
| Persistence | H2 file mode, PostgreSQL mode, Flyway SQL migrations, and Flyway Java migrations. |
| Web UI | Dashboard, admin pages, Thymeleaf templates, WebJars assets, and static resources. |
| Security | Local login, API key authentication, and OIDC login with a real or test issuer. |
| API | /actuator/health, /api/resources, /api, /h2-console, /sse, and /mcp/message. |
| Checks | HTTP, TCP, Docker image, Docker repository discovery, and OpenShift route discovery. |
| Integrations | Import/export, email, Discord, generic webhook, GitLab notifications, and MCP tools. |
Native images use closed-world analysis. If a runtime path fails because reflection, resource loading, serialization, or proxy use was not discovered at build time, add focused runtime hints and rebuild the image.