-
Notifications
You must be signed in to change notification settings - Fork 620
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Add RFC 0008 - Custom Event Metadata from Annotations
Signed-off-by: Matheus Pimenta <[email protected]>
- Loading branch information
1 parent
b45a8b5
commit 820476b
Showing
1 changed file
with
246 additions
and
0 deletions.
There are no files selected for viewing
246 changes: 246 additions & 0 deletions
246
rfcs/0008-custom-event-metadata-from-annotations/README.md
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,246 @@ | ||
# RFC-0008 Custom Event Metadata from Annotations | ||
|
||
**Status:** implementable | ||
|
||
<!-- | ||
Status represents the current state of the RFC. | ||
Must be one of `provisional`, `implementable`, `implemented`, `deferred`, `rejected`, `withdrawn`, or `replaced`. | ||
--> | ||
|
||
**Creation date:** 2024-05-23 | ||
|
||
**Last update:** 2024-12-17 | ||
|
||
## Summary | ||
|
||
Flux users often run into situations where they wish to send custom, static metadata fields defined | ||
in Flux objects on the events dispatched by the respective Flux controller to Kubernetes and | ||
notification-controller. This proposal offers a solution for supporting those use cases uniformly | ||
across all Flux controllers by sending the annotation keys in Flux objects that are prefixed with | ||
the API Group `event.toolkit.fluxcd.io` followed by a slash, i.e. `event.toolkit.fluxcd.io/`. | ||
|
||
## Motivation | ||
|
||
This RFC comes as a response to the need for adding custom metadata to events about Flux objects | ||
sent to notification providers. See specific user stories in the [User Stories](#user-stories) section. | ||
|
||
### Goals | ||
|
||
Provide a method for Flux users to embed custom/static metadata in their Flux objects | ||
and have that metadata propagated to the notification providers. | ||
|
||
### Non-Goals | ||
|
||
In this proposal we **do not** aim to provide a method for Flux users to send etcd-indexed custom metadata | ||
fields from Flux objects in events to notification-controller, most specifically labels. By design an event | ||
already contains enough identification information to locate the associated Flux object inside the cluster, | ||
which covers the use case of labels. Flux does not wish to incentivize practices that are impactful to clusters | ||
without a strong reason or benefit. | ||
|
||
## Proposal | ||
|
||
When sending events about Flux objects, we propose sending annotation keys prefixed with the well-defined | ||
API Group `event.toolkit.fluxcd.io` followed by a slash, i.e. prefixed with `event.toolkit.fluxcd.io/`, in | ||
addition to all the metadata that is already sent in the event. | ||
|
||
### User Stories | ||
|
||
#### Story 1 | ||
|
||
> As a user, I want to embed Flux into my GitHub Workflow in a way that it only succeeds if | ||
> the deployment made by Flux is successful. | ||
For example, embedding a Deployment ID from the GitHub API in a `HelmRelease` object like the one below: | ||
|
||
```yaml | ||
apiVersion: helm.toolkit.fluxcd.io/v2 | ||
kind: HelmRelease | ||
metadata: | ||
name: podinfo | ||
namespace: flux-system | ||
annotations: | ||
event.toolkit.fluxcd.io/deploymentID: e076e315-5a48-41c3-81c8-8d8bdee7d74d | ||
spec: | ||
chart: | ||
spec: | ||
chart: podinfo | ||
version: 6.5.* | ||
sourceRef: | ||
kind: HelmRepository | ||
name: podinfo | ||
``` | ||
Should cause notification-controller to propagate an event like the one below (most fields omitted for brevity): | ||
```json | ||
{ | ||
"involvedObject": { | ||
"apiVersion": "helm.toolkit.fluxcd.io/v2", | ||
"kind": "HelmRelease", | ||
"name": "podinfo", | ||
"namespace": "flux-system", | ||
"uid": "7d0cdc51-ddcf-4743-b223-83ca5c699632" | ||
}, | ||
"metadata": { | ||
"deploymentID": "e076e315-5a48-41c3-81c8-8d8bdee7d74d" | ||
} | ||
} | ||
``` | ||
|
||
#### Story 2 | ||
|
||
> As a user, I want to embed the new image tag in a `HelmRelease` object when the image is updated by an `ImageUpdateAutomation` | ||
> and have that information propagated to the notification providers. | ||
For example: | ||
|
||
```yaml | ||
apiVersion: helm.toolkit.fluxcd.io/v2 | ||
kind: HelmRelease | ||
metadata: | ||
name: podinfo | ||
namespace: flux-system | ||
annotations: | ||
event.toolkit.fluxcd.io/image: ghcr.io/stefanprodan/podinfo:latest # {"$imagepolicy": "flux-system:podinfo"} | ||
spec: | ||
chart: | ||
spec: | ||
chart: podinfo | ||
sourceRef: | ||
kind: HelmRepository | ||
name: podinfo | ||
values: | ||
image: | ||
tag: latest # {"$imagepolicy": "flux-system:podinfo:tag"} | ||
``` | ||
In this example image-automation-controller would update the image and tag near the markers. If, for example, it | ||
updates the image to `ghcr.io/stefanprodan/podinfo:6.5.0`, then it would cause notification-controller to start | ||
propagating events like the one below (most fields omitted for brevity): | ||
|
||
```json | ||
{ | ||
"involvedObject": { | ||
"apiVersion": "helm.toolkit.fluxcd.io/v2", | ||
"kind": "HelmRelease", | ||
"name": "podinfo", | ||
"namespace": "flux-system", | ||
"uid": "7d0cdc51-ddcf-4743-b223-83ca5c699632" | ||
}, | ||
"metadata": { | ||
"image": "ghcr.io/stefanprodan/podinfo:6.5.0" | ||
} | ||
} | ||
``` | ||
|
||
### Alternatives | ||
|
||
#### Alternative 1 | ||
|
||
An alternative for specifying custom metadata fields in Flux objects for sending on events | ||
is defining `.spec` APIs for such, like `.spec.eventMetadata` available in the Alert API. | ||
This alternative is not great because: | ||
|
||
* Such APIs would be fairly redundant with the well-known Kubernetes annotations. | ||
* Technically speaking, it is much easier to implement an alternative where the | ||
field storing the custom metadata is the same and is already available across all the | ||
Flux objects rather than introducing a new API. | ||
|
||
In the specific case of the Alert API this field was introduced because the Alert API | ||
is obviously a special one in the context of events and alerting. In particular, the | ||
Alert objects do not generate events themselves, but rather serve as an aggregation | ||
configuration for matching and propagating events from other Flux objects. | ||
|
||
#### Alternative 2 | ||
|
||
Instead of introducing a new API Group, i.e. `event.toolkit.fluxcd.io`, we could use the API | ||
Group `notification.toolkit.fluxcd.io` for the same purpose. This alternative is not great | ||
because it emphasizes an exclusive relationship with notification-controller, which is not the | ||
case. The events here are also Kubernetes Events, and an API Group that is more general and | ||
closer to Kubernetes Events is more appropriate. | ||
|
||
## Design Details | ||
|
||
All the Flux controllers use our implementation of the `EventRecorder` interface from the Go | ||
package `k8s.io/client-go/tools/record`: [`(*github.com/fluxcd/pkg/runtime/event.Recorder).AnnotatedEventf()`](https://github.com/fluxcd/pkg/blob/6f2619522699f1a78e8c7b41583ad9f7b7c9544e/runtime/events/recorder.go#L119). | ||
This implementation sends the events to notification-controller and also calls the same method | ||
from an injected `EventRecorder`, which sends the events to Kubernetes. To support the use cases | ||
discussed here we would modify this implementation to look for annotations prefixed with | ||
`event.toolkit.fluxcd.io/` in the Flux objects and send them alongside the other metadata of | ||
the event. Here we are talking specifically about the object annotations retrieved from the | ||
Flux object itself, i.e. the first argument of the `AnnotatedEventf()` method: `object runtime.Object`. | ||
This implementation would not change the interface `EventRecorder` used by the controllers, | ||
so all we need to do is bump the Go package `github.com/fluxcd/pkg/runtime` across all controllers. | ||
|
||
On the notification-controller side we would start accepting metadata keys starting with this | ||
prefix and remove it before sending the metadata key-value pair to the notification providers. | ||
This is an important aspect of the implementation because notification-controller only | ||
accepts metadata keys that are prefixed with the Group of the respective API the involved | ||
Flux object belongs to, so we need to add an exception for the new prefix. | ||
|
||
The API Group `event.toolkit.fluxcd.io` would be introduced as a constant in the package | ||
`github.com/fluxcd/pkg/apis/event` with the name `Group`. This constant would be used in the package | ||
`github.com/fluxcd/pkg/runtime/event` and notification-controller for the implementation described above. | ||
|
||
### Precedence Order | ||
|
||
After this change there would be four sources of metadata being sent on notifications. | ||
They are listed below with the proposed order of precedence, from lowest to highest: | ||
|
||
1. User-defined summary on the Alert object, set with `.spec.summary`. | ||
2. User-defined metadata on Flux objects, set with the `event.toolkit.fluxcd.io/` prefix in the keys of the object's `.metadata.annotations`. | ||
3. User-defined metadata on the Alert object, set with `.spec.eventMetadata`. | ||
4. Controller-defined metadata, set with the `<controller group>.toolkit.fluxcd.io/` prefix in the metadata keys of the event payload. | ||
|
||
Upon any key conflicts when combining all these metadata, notification-controller would | ||
resolve them according to the precedence order specified above, print an `info` log and emit a | ||
Kubernetes Event containing all the key conflicts to warn the user and prompt them to change | ||
their configuration to remove those conflicts. | ||
|
||
#### Reasoning | ||
|
||
Controller-defined metadata has the highest precedence because it integrates with external systems, | ||
e.g. commit SHAs, digests, chart versions, etc. | ||
|
||
Alert-level metadata (`.spec.eventMetadata`) is usually also cluster-level, e.g. the cluster name, | ||
region, environment, etc. We don't want tenants overriding cluster-level metadata. | ||
|
||
User-defined metadata on Flux objects, whose use cases are described in the [User Stories](#user-stories) | ||
section, would usually be defined by cluster tenants. Hence it should not override cluster-level metadata. | ||
|
||
And last comes `.spec.summary` from the Alert API. For now it can serve as a default summary for alerts | ||
in the cluster, e.g.: | ||
|
||
`Apps are failing in the cluster. Playbooks: (<URL to all incident response playbooks>)` | ||
|
||
An object-level override could be: | ||
|
||
`App X is failing. Playbook: (<URL to incident response playbook for App X>)` | ||
|
||
The `.spec.summary` field from the Alert API was the first to support user-defined metadata. | ||
Later `.spec.eventMetadata` was introduced to support structured user-defined metadata. The | ||
Flux team considers the `.spec.eventMetadata` a generalization of `.spec.summary`, and the | ||
`.spec.summary` field may be deprecated in the future. | ||
|
||
### How can this feature be enabled / disabled? | ||
|
||
To enable the feature, use the `event.toolkit.fluxcd.io/` prefix in Flux object annotations, | ||
for example: | ||
|
||
* `event.toolkit.fluxcd.io/image: ghcr.io/stefanprodan/podinfo` | ||
* `event.toolkit.fluxcd.io/deploymentID: e076e315-5a48-41c3-81c8-8d8bdee7d74d` | ||
|
||
It's important to notice that not all Flux objects emit events, e.g. Alert and Provider objects. | ||
For a list of the Flux objects that emit events, see the kinds allowed on the | ||
`.spec.eventSources[].kind` field of the Alert API. | ||
|
||
To disable the feature, do not use `event.toolkit.fluxcd.io/` as a prefix in Flux object annotations. | ||
|
||
## Implementation History | ||
|
||
<!-- | ||
Major milestones in the lifecycle of the RFC such as: | ||
- The first Flux release where an initial version of the RFC was available. | ||
- The version of Flux where the RFC graduated to general availability. | ||
- The version of Flux where the RFC was retired or superseded. | ||
--> |