Observability in CNCF
In AI-driven development, one of the central questions is how to achieve non-functional requirements such as Security and Observability.
Even observability alone requires many cross-cutting concerns such as distributed tracing, metrics, structured diagnostics, payload protection, and integration with external observability platforms.
Delegating these concerns to individually generated implementations rapidly increases generation, review, and operational costs while also destabilizing quality assurance.
For this reason, once sufficient structural information is described in the Literate Model, the CML (Cozy Modeling Language) & CNCF (Cloud Native Component Framework) model compiler and execution framework establish Security and Observability as cross-cutting runtime behavior.
Why AI-Driven Development Needs Runtime Support
In theory, non-functional requirements such as Security and Observability could also be fully described in prompts and specifications.
In practice, however, even observability alone requires many cross-cutting concerns such as distributed tracing, metrics, structured diagnostics, correlation IDs, payload protection, confidentiality masking, and integration with external observability platforms.
If these concerns are repeatedly delegated to individually generated implementations, implementation and operational complexity rapidly increase.
Why AI-Driven Development Amplifies the Problem
Another major issue is the operational cost of generative AI itself.
If cross-cutting quality attributes such as Security and Observability must be described repeatedly in prompts and specifications, the size of those prompts and specifications grows rapidly.
As concerns such as traces, metrics, correlation, payload protection, redaction, export, retry handling, and backend integration begin interacting, implementation and operational complexity grow almost exponentially.
As a result, the amount of context required for generative AI also explodes, significantly increasing generation, review, and regeneration costs.
Moreover, when generative AI encounters requirements it cannot fully understand or requirements that become too complex, it tends to omit, simplify, or overlook them.
For this reason, approaches that attempt to push everything into prompts and specifications make quality assurance itself unstable.
Observability in CNCF
CNCF automatically collects observability information within the runtime and provides an operational observability environment, including dashboards.
However, when considering observability across the entire system including CNCF, an operational management function is needed to aggregate observability information emitted by each system.
For this purpose, CNCF supports integration with OpenTelemetry, the industry-standard observability framework.
Inside CNCF, structured execution information is maintained for:
- CallTree execution structure
- action / UoW / space / I/O execution facts
- structured diagnostics
- Conclusion detail codes
- previous-chain source errors
- payload summaries and references
- runtime metric snapshots
- Job / Task / Saga context
By integrating with OpenTelemetry, the observability information collected by CNCF can be viewed through dedicated tools such as Jaeger, Prometheus, and Grafana as part of the overall system observability data.
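The idea of exporting payload summaries rather than payload bodies can be illustrated with a small sketch. This is not CNCF's implementation: the function name `summarize_payload` and the attribute keys `payload.size_bytes`, `payload.sha256`, and `payload.field_names` are hypothetical names chosen for this example. It only shows the general technique of describing a payload by size, digest, and field names so that field values never reach an external backend.

```python
import hashlib
import json


def summarize_payload(payload: dict) -> dict:
    """Describe a payload without including any of its field values.

    Hypothetical helper: the attribute keys below are illustrative and
    are not taken from CNCF.
    """
    # Serialize deterministically so the digest is stable for equal payloads.
    body = json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return {
        "payload.size_bytes": len(body),
        "payload.sha256": hashlib.sha256(body).hexdigest(),
        "payload.field_names": sorted(payload.keys()),
    }


summary = summarize_payload({"name": "CNCF", "token": "secret-value"})
print(summary["payload.field_names"])  # ['name', 'token']
```

The summary carries enough information to correlate and compare payloads, while the value `secret-value` appears nowhere in the exported attributes.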
Sample 13: Minimal Jaeger Proof
This section explains observability in CNCF using a tutorial example.
The tutorial package located below is used in this example.
After extraction, the sample code is located in the following directory.
- samples/13-observability-jaeger
This sample verifies the smallest possible trace-export configuration.
Configuration:
- Jaeger all-in-one
- OTLP HTTP
CNCF sends traces directly to Jaeger.
Startup
Use the provided scripts to start Jaeger and CNCF.
$ bash start-jaeger.sh
$ bash run-server.sh
Execute the operation as follows.
Executing the operation exports observability information to Jaeger, making it available for inspection.
$ cncf client minimal.main.hello --baseurl http://127.0.0.1:19613
Hello CNCF
First Things to Check
CNCF itself also provides native observability capabilities.
First, inspect the operation execution results using CNCF-native observability.
The recommended inspection order is as follows.
- CNCF Observability UI (observability)
- Jaeger UI: http://127.0.0.1:16686
What to Check in CNCF Observability UI
Inside the CNCF Observability UI, verify:
- CallTree structure
- action execution facts
- I/O execution information
- Conclusion chains
- diagnostic payload summaries
What to Check in Jaeger
Next, verify the exported trace in Jaeger.
Search condition:
- Service: goldenport-cncf
Verification points:
- A trace for minimal.main.hello exists
- A CNCF Action Span exists
- component / service / operation attributes are attached
- Payload Summary information is included
- The raw payload body is not exported
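The checks above can also be scripted instead of performed by hand in the Jaeger UI. The sketch below assumes the `/api/traces` endpoint that the Jaeger UI itself uses. That endpoint is not a stability-guaranteed public API, so treat this as a convenience for the lab rather than a supported integration. How CNCF names its spans and tags is also an assumption here, which is why the helper matches the operation name against both the span name and the tag values.

```python
import json
import urllib.parse
import urllib.request

JAEGER_URL = "http://127.0.0.1:16686"  # Jaeger UI address used in this sample
SERVICE = "goldenport-cncf"
OPERATION = "minimal.main.hello"


def fetch_traces(base_url: str = JAEGER_URL, service: str = SERVICE) -> list:
    """Fetch recent traces for a service from the Jaeger query endpoint."""
    query = urllib.parse.urlencode({"service": service, "limit": 20})
    with urllib.request.urlopen(f"{base_url}/api/traces?{query}", timeout=5) as resp:
        return json.load(resp).get("data", [])


def spans_mentioning(traces: list, operation: str = OPERATION) -> list:
    """Return spans whose name or any tag value mentions the operation."""
    found = []
    for trace in traces:
        for span in trace.get("spans", []):
            tag_values = [str(tag.get("value", "")) for tag in span.get("tags", [])]
            in_name = operation in span.get("operationName", "")
            if in_name or any(operation in value for value in tag_values):
                found.append(span)
    return found


def tag_keys(span: dict) -> set:
    """Collect the tag keys attached to a span, for attribute checks."""
    return {tag["key"] for tag in span.get("tags", [])}
```

With Jaeger and the CNCF server running, `spans_mentioning(fetch_traces())` should return at least one span after the `hello` operation has been executed. Printing `tag_keys(span)` for each result makes it easy to confirm which component, service, operation, and payload summary attributes were attached, and that no raw payload body is among them.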
Sample 13a: Full Observability Stack Lab
The previous example used only a minimal Jaeger-based observability configuration.
This example uses a full observability stack consisting of OpenTelemetry Collector, Jaeger, Prometheus, and Grafana.
This topology is also suitable for production-oriented deployments.
After extracting the ZIP archive, the sample code is located in the following directory.
-
samples/13.a-observability-stack-lab
This sample demonstrates a standard Collector-centered observability topology.
- CNCF runtime
- OpenTelemetry Collector: A relay component responsible for collecting, transforming, and routing observability data.
- Jaeger: A system for storing, searching, and visualizing distributed traces.
- Prometheus: A monitoring system for collecting, storing, and querying time-series metrics.
- Grafana: A dashboard and query frontend used to visualize data from systems such as Prometheus and Jaeger.
The observability system topology is shown below.
CNCF sends observability data to the OpenTelemetry Collector, which then propagates the information to Jaeger and Prometheus.
Grafana then retrieves information from Jaeger and Prometheus and visualizes it through dashboards.
Startup
Start the system as follows.
$ bash start-stack.sh
$ bash run-server.sh
Generate observable runtime activity using the following commands.
$ bash run-operation.sh
$ bash export-metrics.sh
Inspection Order
The recommended inspection order is as follows.
What to Check in Prometheus
Access Prometheus using the following URL.
Use the following PromQL queries for verification.
cncf_web_request_requests_count
cncf_action_execution_executions_count
These queries confirm that CNCF runtime metrics are flowing through the Collector into Prometheus.
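The same two queries can be run from a script through the Prometheus HTTP API instant-query endpoint, `/api/v1/query`. The sketch below is a minimal example, not part of the sample package. The Prometheus address is deliberately left as the `base_url` parameter, to be filled in with the URL of the lab's Prometheus instance. The helper names `instant_query`, `series_count`, and `check_metrics` are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# The two metric names checked in this section.
QUERIES = [
    "cncf_web_request_requests_count",
    "cncf_action_execution_executions_count",
]


def instant_query(base_url: str, promql: str) -> dict:
    """Run an instant query via the Prometheus HTTP API."""
    url = f"{base_url}/api/v1/query?" + urllib.parse.urlencode({"query": promql})
    with urllib.request.urlopen(url, timeout=5) as resp:
        return json.load(resp)


def series_count(response: dict) -> int:
    """Number of series in a successful query response, otherwise 0."""
    if response.get("status") != "success":
        return 0
    return len(response.get("data", {}).get("result", []))


def check_metrics(base_url: str) -> dict:
    """Map each query to the number of series Prometheus returned for it."""
    return {query: series_count(instant_query(base_url, query)) for query in QUERIES}
```

A result of zero series for either query suggests that the metric has not yet reached Prometheus, for example because `export-metrics.sh` has not been run or the Collector pipeline is not forwarding metrics.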
What to Check in Grafana
Access Grafana using the following URL.
Log in using the following account.
admin / admin
Grafana should be treated as the operator-facing dashboard and query frontend.
The following information can be inspected.
- Prometheus datasource
- CNCF metrics series
- Jaeger datasource
- Trace search
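Whether both datasources are provisioned can also be checked through the Grafana HTTP API, which lists configured datasources at `/api/datasources`. The sketch below assumes the `admin / admin` account given above and takes the Grafana address as the `base_url` parameter. The helper names `list_datasources` and `missing_datasource_types` are hypothetical and are not part of the sample package.

```python
import base64
import json
import urllib.request


def list_datasources(base_url: str, user: str = "admin", password: str = "admin") -> list:
    """List configured datasources using HTTP basic authentication."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(
        f"{base_url}/api/datasources",
        headers={"Authorization": f"Basic {token}"},
    )
    with urllib.request.urlopen(request, timeout=5) as resp:
        return json.load(resp)


def missing_datasource_types(datasources: list, required=("prometheus", "jaeger")) -> list:
    """Return the required datasource types that are not configured."""
    present = {datasource.get("type") for datasource in datasources}
    return [kind for kind in required if kind not in present]
```

An empty list from `missing_datasource_types(list_datasources(base_url))` indicates that Grafana has both a Prometheus and a Jaeger datasource, so metrics series and trace search should both be available from the dashboard.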
Summary
In AI-driven development, it is not realistic to repeatedly describe cross-cutting quality attributes such as Security and Observability in exhaustive prompts and specifications.
As specifications grow, the amount of context required for generative AI, as well as generation, review, and regeneration costs, rapidly increase, destabilizing quality assurance itself.
Furthermore, when generative AI encounters requirements that are too complex or insufficiently understood, it tends to omit, simplify, or overlook them.
For this reason, CML&CNCF absorbs Security and Observability as runtime capabilities provided by the model compiler and execution framework rather than delegating them to application code.
CNCF also treats OpenTelemetry not as the primary observability system, but as an export boundary while maintaining CNCF Native Observability as the authoritative source of diagnostic truth.
The samples in this article demonstrated how the CNCF runtime integrates observability capabilities, from a minimal Jaeger-only setup to a full Collector-centered observability stack.
References
Glossary
- observability
Observability represents the property of a system or domain whereby its internal state can be inferred and understood through external observations. It goes beyond simple monitoring: by consistently collecting and correlating phenomena and observations, and interpreting them as domain events, observability enables a comprehensive understanding of system behavior.
- CML (Cozy Modeling Language)
CML is a literate modeling language for describing Cozy models. It is designed as a domain-specific language (DSL) that forms the core of analysis modeling in SimpleModeling. CML allows model elements and their relationships to be described in a narrative style close to natural language, ensuring strong compatibility with AI support and automated generation. Literate models written in CML function as intermediate representations that can be transformed into design models, program code, or technical documentation.
- Cloud Native Component Framework (CNCF)
Cloud Native Component Framework (CNCF) is a framework for executing cloud application components using a single, consistent execution model. Centered on the structure of Component, Service, and Operation, it enables the same Operation to be reused across different execution forms such as command, server (REST / OpenAPI), client, and script. By centralizing quality attributes required for cloud applications—such as logging, error handling, configuration, and deployment—within the framework, components can focus on implementing domain logic. CNCF is designed as an execution foundation for literate model-driven development and AI-assisted development, separating what is executed from how it is invoked.
- literate model
A Literate Model is a “readable model” that integrates model structure with natural-language narrative (structured documentation). It extends the idea of literate programming into the modeling domain, unifying structure (model) and narrative (structured text) into a single intelligible artifact interpretable by both humans and AI. The concept of “Literate Modeling” has been explored previously by some researchers and developers, mostly as an approach to improve documentation or code comprehension. However, those attempts did not establish a systematic modeling methodology that integrates models, narrative, and AI assistance as a unified framework. The Literate Model is a modeling concept newly systematized and proposed by SimpleModeling for the AI era. Building upon the ideas of literate modeling, it redefines them as an intelligent modeling foundation that enables AI-collaborative knowledge circulation and model generation. It is not merely a modeling technique but a framework that embeds human reasoning and design intent as narrative within the model, enabling AI to analyze and reconstruct them to assist in design and generation.
- Prompt
A structured instruction or contextual representation that bridges retrieved knowledge (RAG) and the AI model’s reasoning process. It transforms the structured knowledge from the BoK into a narrative or directive form that the model can interpret, act upon, and internalize.
- verification
Verification is the activity of confirming that an implementation conforms to its specified design or requirements.