diff --git a/docs/observability.md b/docs/observability.md
index bf33f082e5f..e303af485e8 100644
--- a/docs/observability.md
+++ b/docs/observability.md
@@ -76,9 +76,55 @@ For begin/stop events we need to define an appropriate hysteresis to avoid gener
 
 The service should collect host resource metrics in addition to service's own process metrics. This may help to understand that the problem that we observe in the service is induced by a different process on the same host.
 
-## How We Expose Metrics/Traces
-
-Collector configuration must allow specifying the target for own metrics/traces (which can be different from the target of collected data). The metrics and traces must be clearly tagged to indicate that they are service’s own metrics (to avoid conflating with collected data in the backend).
+## How We Expose Telemetry
+
+By default, the Collector currently exposes service telemetry in two ways:
+
+- internal metrics are exposed via a Prometheus interface which defaults to port `8888`
+- logs are emitted to stdout
+
+Traces are not exposed by default. There is an effort underway to [change this][issue7532]. The work includes supporting
+configuration of the OpenTelemetry SDK used to produce the Collector's internal telemetry. This feature is
+currently behind two feature gates:
+
+```bash
+  --feature-gates=telemetry.useOtelForInternalMetrics
+  --feature-gates=telemetry.useOtelWithSDKConfigurationForInternalTelemetry
+```
+
+The `useOtelForInternalMetrics` feature gate changes the internal telemetry to use OpenTelemetry rather
+than OpenCensus. This will become the default at some point [in the future][issue7454]. The second gate,
+`useOtelWithSDKConfigurationForInternalTelemetry`, enables the Collector to parse configuration
+that aligns with the [OpenTelemetry Configuration] schema. The support for this schema is still
+experimental, but it does allow telemetry to be exported via OTLP.
+
+The following configuration can be used in combination with the aforementioned feature gates
+to emit internal metrics and traces from the Collector to an OTLP backend:
+
+```yaml
+service:
+  telemetry:
+    metrics:
+      readers:
+        - periodic:
+            interval: 5000
+            exporter:
+              otlp:
+                protocol: grpc/protobuf
+                endpoint: https://backend:4317
+    traces:
+      processors:
+        - batch:
+            exporter:
+              otlp:
+                protocol: grpc/protobuf
+                endpoint: https://backend2:4317
+```
+
+See the configuration's [example][kitchen-sink] for additional configuration options.
+
+Note that this configuration does not support emitting logs as there is no support for [logs] in
+the OpenTelemetry Go SDK at this time.
 
 ### Impact
 
@@ -89,3 +135,9 @@ We need to be able to assess the impact of these observability improvements on t
 Some of the metrics/traces can be high volume and may not be desirable to always observe. We should consider adding an observability verboseness “level” that allows configuring the Collector to send more or less observability data (or even finer granularity to allow turning on/off specific metrics).
 
 The default level of observability must be defined in a way that has insignificant performance impact on the service.
+
+[issue7532]: https://github.com/open-telemetry/opentelemetry-collector/issues/7532
+[issue7454]: https://github.com/open-telemetry/opentelemetry-collector/issues/7454
+[logs]: https://github.com/open-telemetry/opentelemetry-go/issues/3827
+[OpenTelemetry Configuration]: https://github.com/open-telemetry/opentelemetry-configuration
+[kitchen-sink]: https://github.com/open-telemetry/opentelemetry-configuration/blob/main/examples/kitchen-sink.yaml
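
Review note on the default path described in the new section: the Prometheus endpoint and the volume of internal metrics can both be adjusted under `service::telemetry::metrics`, which may be worth showing next to the mention of port `8888`. The fragment below is a minimal sketch, assuming the `level` and `address` settings as found in Collector releases from around the time of this change. The values are illustrative only, and the setting names should be checked against the Collector version in use.

```yaml
service:
  telemetry:
    metrics:
      # Assumed setting: verbosity of internal metrics
      # (none, basic, normal, detailed).
      level: detailed
      # Assumed setting: bind address of the Prometheus endpoint;
      # the port defaults to 8888 when this is left unset.
      address: 0.0.0.0:8888
```

With a fragment like this in place, the internal metrics should be scrapable at `http://<host>:8888/metrics`. The `level` setting also relates to the verboseness "level" discussed in the hunk context above.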