From 91c3a814ef94641159c4d0536fd2d663e7cb0dbd Mon Sep 17 00:00:00 2001
From: Bailin He <15058035+bailinhe@users.noreply.github.com>
Date: Mon, 20 Nov 2023 15:12:56 -0500
Subject: [PATCH] Add Extension Docs (#91)
* init docs
---
docs/extensions.md | 608 +++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 608 insertions(+)
create mode 100644 docs/extensions.md
diff --git a/docs/extensions.md b/docs/extensions.md
new file mode 100644
index 0000000..20857a8
--- /dev/null
+++ b/docs/extensions.md
@@ -0,0 +1,608 @@
+# Extend Governor APIs
+
+**Governor Extensions** allows the Governor API itself to be
+extended, in a design that is heavily influenced by Kubernetes' CRD and operator
+patten.
+
+See [Governor Extension Example](https://github.com/equinixmetal/governor-extension-example)
+for a complete implementation of a Governor Extension.
+
+## Data Schema
+
+```mermaid
+erDiagram
+ extension {
+ uuid id
+ string url
+ string name
+ string slug
+ string description
+ bool enabled
+ enum status "offline, online"
+ }
+
+ extension_resource_definition {
+ uuid id
+ string name
+ string slug_singular
+ string slug_plural
+ enum scope "system or user"
+
+ bool enabled
+ string version
+ jsonb schema "JSON schema"
+ }
+
+ user_extension_resource {
+ uuid id
+ uuid resource_version
+ jsonb resource
+ }
+
+ system_extension_resource {
+ uuid id
+ uuid resource_version
+ jsonb resource
+ }
+
+ extension ||--o{ extension_resource_definition: has
+
+ user ||--o{ user_extension_resource: has
+ extension_resource_definition ||--o{ user_extension_resource: validates
+
+ extension_resource_definition ||--o{ system_extension_resource: validates
+```
+
+## Extension Life Cycles
+
+### Extension Registration and Bootstrapping
+
+- [Example Bootstrapping](https://github.com/equinixmetal/governor-extension-example/blob/main/internal/server/bootstrap.go)
+- [Example Deployment](https://github.com/equinixmetal/k8s-governor-extension-example/blob/main/values.yaml#L15)
+
+After the development of the extension itself, the admin will initiate the
+registration process prior to the deployment of the extension,
+the Governor API will then create an extension entity with its UUID.
+This UUID will be deployed with the extension (e.g. env var), and the extension,
+in turn, will use this UUID to gather all the necessary information to facilitate
+the process.
+
+A future improvement is shown below and can be added when governor API has the
+capability of provisioning NATS credentials.
+
+```mermaid
+sequenceDiagram
+ actor a as admin
+ participant g as Governor API
+ participant e as Extension
+
+
+ a->>g: initiate registration
+ note over a,g: extension name, description, URL,
etc.
+ g->>a: 201 created
+
+ a->>e: deploy with UUID
+
+ opt bootstrap
+ e->>g: GET /extensions/:uuid
+ g->>e: resp
+
+ alt status == 404 or enabled == true
+ note over e,g: extension not registered or disabled
+ e->>e: do nothing
+ else
+ e->>g: PATCH /extensions/:uuid
+ note over e,g: basic info, e.g., health check endpoint
+ g->>e: 200 ok
+
+ loop for each ERDs not already exists
+ e->>g: POST /extensions/:uuid/erds
+ g->>e: 201 created
+ end
+ end
+
+ opt NATS credentials
+ g->>g: provision NATS credential
+ note over g,e: this credential can access core events and
resource events that are own by this extension
+ g->>e: send credentials
+ end
+
+ e->>g: PATCH /extensions/:uuid
+ note over e,g: { status: "online" }
+ g->>e: 200 ok
+ end
+```
+
+### Serving an Extension
+
+- [Example: Event Subscription](https://github.com/equinixmetal/governor-extension-example/blob/main/internal/server/events.go)
+- [Example: Event Processing](https://github.com/equinixmetal/governor-extension-example/blob/main/pkg/greetings/process.go)
+
+```mermaid
+sequenceDiagram
+ actor u as User
+ participant g as Governor API
+ participant d as DB
+ participant m as Message Bus
+ participant e as Extension
+
+ e->>m: subscribe
+
+ critical initialization
+ g->>d: extension info?
+ d->>g: extension info
+ g->>g: add and serve resources routes
+ end
+
+ u->>g: [Create/Delete/Update]
/resources
+ activate g
+ g->>g: permissions check
+ g->>g: validation
+ g->>d: store in DB
+ g->>m: publish event: User did a thing
+ g->>u: response
+ deactivate g
+
+ m->>e: receive event:
User did a thing
+ activate e
+ opt this thing concerns me
+ e->>g: fetch relevant resources
+ g->>e:
+ note over e,g: user info, group info, custom resources, etc.
+ e->>e: do other relative things
+ end
+ deactivate e
+```
+
+### Update
+
+JSON schema for extension resource definition should be immutable, new versions
+of the resource definition can be created. Otherwise the bootstrap process
+is the same as shown above.
+
+### Disabling
+
+```mermaid
+sequenceDiagram
+ actor a as admin
+ participant g as Governor API
+ participant db as DB
+ participant m as Message Bus
+ participant e as Extension
+
+ a->>g: initiate disabling
+
+ g->>g: revoke NATS credentials
+ g->>g: withdraw routes
+
+ g->>m: extension update event
+ g->>a: 200: ok
+
+ opt termination logics
+ m->>e: receive event
+ note over e,m: { enabled: false }
+ e->>e: do termination things
+ note over e: reload, ignore NATS error, etc
+ end
+```
+
+### Removal
+
+```mermaid
+sequenceDiagram
+ actor a as admin
+ participant g as Governor API
+ participant db as DB
+ participant m as Message Bus
+ participant e as Extension
+
+ a->>g: initiate disabling
+
+ g->>g: revoke NATS credentials
+ g->>g: (soft) delete all resources
+ g->>g: withdraw routes
+
+ g->>m: extension update event
+ g->>a: 200: ok
+
+ opt termination logics
+ m->>e: receive event
+ note over e,m: { enabled: false }
+ e->>e: do termination things
+ note over e: reload, ignore NATS error, etc
+ end
+```
+
+## Endpoints
+
+### Extension Management
+
+| Operation | Method | URI |
+|---|---|---|
+| **list** | `GET` | /api/v1alpha1/extensions |
+| **create** | `POST` | /api/v1alpha1/extensions |
+| **get** | `GET` | /api/v1alpha1/extensions/:slug-or-id |
+| **update** | `PATCH` | /api/v1alpha1/extensions/:slug-or-id |
+| **delete** | `DELETE` | /api/v1alpha1/extensions/:slug-or-id |
+
+### Extension Resource Definitions Management
+
+| Operation | Method | URI |
+|---|---|---|
+| **list** | `GET` | /api/v1alpha1/extensions/:extension-slug-or-id/erds |
+| **create** | `POST` | /api/v1alpha1/extensions/:extension-slug-or-id/erds |
+| **get by ID** | `GET` | /api/v1alpha1/extensions/:extension-slug-or-id/erds/:uuid |
+| **get by slug** | `GET` | /api/v1alpha1/extensions/:extension-slug-or-id/erds/:slug-singular/:version |
+| **update by ID** | `PATCH` | /api/v1alpha1/extensions/:extension-slug-or-id/erds/:uuid |
+| **update by slug** | `PATCH` | /api/v1alpha1/extensions/:extension-slug-or-id/erds/:slug-singular/:version |
+| **delete by ID** | `DELETE` | /api/v1alpha1/extensions/:extension-slug-or-id/erds/:uuid |
+| **delete by slug** | `DELETE` | /api/v1alpha1/extensions/:extension-slug-or-id/erds/:slug-singular/:version |
+
+### User Resources
+
+#### Prefixes
+
+- authenticated user resources: `/api/v1alpha1/user/extension-resources`
+- admin managing user resources: `/api/v1alpha1/users/:user-id/extension-resources`
+
+#### URIs
+
+| Operation | Method | URI |
+|---|---|---|
+| **list** | `GET` | /\/\/\/\ |
+| **create** | `POST` | /\/\/\/\ |
+| **get** | `GET` | /\/\/\/\/\ |
+| **update** | `PATCH` | /\/\/\/\/\ |
+| **delete** | `DELETE` | /\/\/\/\/\ |
+
+### System Resources
+
+#### URI Prefixes
+
+- use existing `/v1alpha1/` api group: `/api/v1alpha1/extension-resources`
+
+#### Resource URIs
+
+| Operation | Method | URI |
+|---|---|---|
+| **list** | `GET` | /\/\/\/\ |
+| **create** | `POST` | /\/\/\/\ |
+| **get** | `GET` | /\/\/\/\/\ |
+| **update** | `PATCH` | /\/\/\/\/\ |
+| **delete** | `DELETE` | /\/\/\/\/\ |
+
+### Examples
+
+`user-1` is an admin, `user-2` is a regular user. URI prefixes approach-2 is chosen,
+all extensions are hypothetical
+
+1. admin registers notification extension
+
+ ```HTTP
+ POST /api/v1alpha1/extensions
+ {
+ "url": "notifications.governor.svc"
+ "name": "notifications"
+ "description": "notifications"
+ }
+ ```
+
+1. users checks their own `notification-preferences` provided by the notification extension
+
+ ```HTTP
+ GET /api/v1alpha1/user/extension-resources/notifications/notification-preferences/v1
+ ```
+
+1. admin checks `user-2`'s `pager-duty-groups` provided by the pager duty extension
+
+ ```HTTP
+ GET /api/v1alpha1/users/user-2/extension-resources/pager-duty/pager-duty-groups/v1
+ ```
+
+1. admin updates system resource `notification-targets` provided by the notification extension
+
+ ```HTTP
+ PATCH /api/v1alpha1/extension-resources/notifications/notification-targets/v1/slack
+ ```
+
+## Events
+
+Example Event:
+
+```json
+{
+ "subject": "",
+ "version": "v1alpha1",
+ "action": "create",
+
+ "extension-resource-id": "some-id"
+
+ // ... optional metadata
+ // group ID
+ // user ID
+ // application ID
+}
+```
+
+## Trace
+
+Governor API populates `TraceContext` to each event it emits to the event bus,
+extensions can inherit the parent parent trace context in `event.TraceContext`.
+
+[Example](https://github.com/equinixmetal/governor-extension-example/blob/main/pkg/greetings/process.go#L83-L86)
+
+## VIPs
+
+Very interesting problems.
+
+In some cases, an extension or multiple extensions might need to update the
+resources upon reconciliation. This could lead to potential issues like the
+extensions reacting to its own updates.
+
+### Extensions update loop
+
+```mermaid
+sequenceDiagram
+ actor u as User
+ participant g as Governor API
+ participant m as Message Bus
+ participant e as Extension
+
+
+ u->>g: create /things
+ activate g
+ g->>m: publish event: a thing was created
+ activate m
+ g->>u: response
+ deactivate g
+
+ m->>e: receive event:
a thing was created
+ deactivate m
+ activate e
+ e->>g: fetch thing with ID from event
+ g->>e:
+ e->>e: reconciliation
+ e->>g: update /things/:id
+
+ activate g
+ g->>m: publish event: a thing was updated
+ activate m
+ g->>e: response
+ deactivate g
+
+ deactivate e
+
+ note over g,e: here the extension is reacting to its own update
+
+ m->>e: receive event:
a thing was updated
+ deactivate m
+ activate e
+ e->>g: fetch thing with ID from event
+ g->>e:
+ e->>e: reconciliation
+ e->>g: update /things/:id
+
+ activate g
+ g->>m: publish event: a thing was updated
+ activate m
+ g->>e: response
+ deactivate g
+
+ deactivate e
+
+ note over g,e: so on and so forth an infinite loop is formed
+
+ m->>e: receive event:
a thing was updated
+ deactivate m
+
+ activate e
+ note over e: .
.
.
+ deactivate e
+```
+
+The above scenario can be mitigated by allowing the extensions to keep a
+local copy (cache) of the resource and only react to changes when the resources
+in Governor is actually changed from their own cache.
+
+However, local cache will not solve the issue where there are multiple extensions
+reacting to updates on the same resource. More precisely, the extensions will
+likely react to the same update $N$ times per extension,
+where $N$ is the number of extensions that react to same resource.
+(It will eventually stop because eventually the updates will not change any
+values in the resource, **unless** the extension is updating the resource
+with a unique value that changes every time, e.g. timestamps).
+
+### Annotations
+
+To mitigate this problem, **Annotations** can be introduced
+to the extension resources, and having the extension to mark a resource
+as `processed`, then ignore any subsequent updates on resources with this
+annotation.
+
+Annotations in an extension resource should be a map of `string => any`,
+the key of this map should follow this format:
+
+```txt
+proccessed.extension.governor/
+```
+
+The type of the annotation value is up to the extension developer's design,
+here are some example annotation values:
+
+- boolean value: extension simply discard the event if `true`
+- unix time stamp: indicates the last time the resource was processed by
+ the extension, and the extension can apply an "expiry" to the annotation
+
+### Multiple Extension Race Conditions
+
+```mermaid
+sequenceDiagram
+ actor u as User
+ participant g as Governor API
+ participant m as Message Bus
+ participant e1 as Extension 1
+ participant e2 as Extension 2
+
+
+ u->>g: create /account
+ activate g
+ g->>m: publish event: an account was created
+ activate m
+ g->>u: response
+ deactivate g
+
+ m->>e1: receive event:
an account was created
+ activate e1
+
+ m->>e2: receive event:
an account was created
+ deactivate m
+
+ activate e2
+
+ e1->>g: fetch resource with ID from event
+ g->>e1:
+
+ e2->>g: fetch resource with ID from event
+ g->>e2:
+
+ note over g, e2: resp: { "name": "Alice", balance: 0 }
+
+ e1->>e1: deposite $100
+ e2->>e2: deposite $50
+
+ e1->>g: update /accounts/:id
+ note over g, e1: payload: { "balance": 100 }
+ g->>e1:
+ deactivate e1
+
+ e2->>g: update /accounts/:id
+ note over g, e2: payload: { "balance": 50 }
+ g->>e2:
+ deactivate e2
+
+
+ note over g, e2: here Extension 1's update to the resource is discarded,
and Alice's account balance ended up $100 less than it was supposed to be
+```
+
+One of the solutions is to avoid having multiple extensions react on the same
+extension resource, in doing so, not only will it increase the complexity of the system,
+but also introduce a dependency from one extension to another. Nonetheless,
+this is still a potential issue to be addressed.
+
+### Resource Versioning
+
+To mitigate this problem, **Resource versioning** can be introduced
+to the extension resources. The governor API can generate a new UUID as the
+resource version each time the resource is created or updated.
+To update a resource, the extension must provide the resource version along
+with the patch to the governor API, the the governor API will reject any
+updates with resource version mismatches.
+
+### Complete Solution Examples
+
+Here are two examples showing how resource versioning and annotations will address the above
+issues.
+
+```mermaid
+sequenceDiagram
+ actor u as User
+ participant g as Governor API
+ participant m as Message Bus
+ participant e1 as Extension 1
+ participant e2 as Extension 2
+
+ note over u, e2: create account example: prevent race condition
+
+ u->>g: create /account
+ activate g
+ g->>m: publish event: an account was created
+ activate m
+ g->>u: response
+ deactivate g
+
+ m->>e1: receive event:
an account was created
+ activate e1
+
+ m->>e2: receive event:
an account was created
+ deactivate m
+
+ activate e2
+
+ e1->>g: fetch resource with ID from event
+ g->>e1:
+
+ e2->>g: fetch resource with ID from event
+ g->>e2:
+
+ note over g, e2: resp: { "name": "Alice", balance: 0, "resource_version": "1" }
+
+ e1->>e1: deposite $100
+ e2->>e2: deposite $50
+
+ e1->>g: update /accounts/:id
+ activate g
+ note over g, e1: payload:
{ "balance": 100, "resource_version": "1", "annotations": { "proccessed.extension.governor/ext_1": true }}
+
+ g->>g: update resource version
to "2"
+ g->>m: publish event: an account was updated
+ activate m
+
+ g->>e1: 🚀 200 OK
+ deactivate e1
+ deactivate g
+
+ e2->>g: update /accounts/:id
+ activate g
+ note over g, e2: payload:
{ "balance": 50, "resource_version": "1",
"annotations": { "proccessed.extension.governor/ext_2": true }}
+ g->>e2: ❌ 400 Bad Request: resource version mismatch
+ deactivate g
+
+ loop retry until success or backoff
+ e2->>g: fetch resource with ID from event
+ g->>e2:
+
+ note over g, e2: resp:
{ "balance": 50, "resource_version": "1",
"annotations": { "proccessed.extension.governor/ext_2": true }}
+
+ e2->>e2: deposite $50
+ e2->>g: update /accounts/:id
+ activate g
+ note over g, e2: payload:
{ "balance": 150, "resource_version": "2",
"annotations": { "proccessed.extension.governor/ext_1": true,
"proccessed.extension.governor/ext_2": true }}
+ g->>g: update resource version
to "3"
+ g->>m: publish event: an account was updated
+ g->>e2: 🚀 200 OK
+ deactivate g
+ end
+
+ deactivate e2
+
+ note over u, e2: end of create account example
+
+ note over u, e2: begin of prevent update loop example
+
+ m->>e1: receive event:
an account was updated
+ activate e1
+
+ m->>e2: receive event:
an account was updated
+ deactivate m
+
+ activate e2
+
+ e1->>g: fetch resource with ID from event
+ g->>e1:
+
+ e2->>g: fetch resource with ID from event
+ g->>e2:
+
+ note over g, e2: resp:
{ "balance": 150, "resource_version": "2",
"annotations": { "proccessed.extension.governor/ext_1": true,
"proccessed.extension.governor/ext_2": true }}
+
+ e1->>e1: verify annotation
"proccessed.extension.governor/ext_1" already exists
+ e2->>e2: verify annotation
"proccessed.extension.governor/ext_2" already exists
+
+ e1->>e1: discard event
+ deactivate e1
+
+ e2->>e2: discard event
+ deactivate e2
+
+ note over u, e2: end of prevent update loop example
+```